Fundamentals of Parallel Processing (English edition)

Lowest price: ¥39.00

List price: ¥56.00

Authors: Harry F. Jordan, Gita Alaghband (USA)

Publisher: Tsinghua University Press

Publication date: October 2003

ISBN: 7302073821

  • Fundamentals of Parallel Processing
  • Home delivery
  • Price
    ¥39.00
  • Fundamentals of Parallel Processing
  • Home delivery
  • Price
    ¥39.20
    Price
    ¥50.40

    Product details

    Editor's recommendation

    Summary

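    Drawing on many years of teaching and research in parallel processing, the authors give a comprehensive introduction to the main topics of parallel computing from the combined perspective of parallel architectures, algorithms, and languages. Topics include the evolution of parallel processing, vector processing, centralized and distributed multiprocessing, interconnection networks, synchronization and communication, performance analysis, the temporal behavior of parallel programs, parallel I/O, and parallel libraries. The presentation moves step by step from problem statement through theoretical analysis to algorithm and program implementation, and it is rigorous while balancing theory and practice. The writing is plain and easy to follow, and the exercises and references at the end of each chapter give readers leads for further study.
    The book is suitable as a textbook for courses on parallel processing for graduate students and senior undergraduates in computer science or computer engineering, and it can also serve as a reference for researchers in the field.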

    About the authors

    Table of contents

    preface
    chapter 1: parallel machines and computations
    1.1 the evolution of parallel architectures
    1.1.1 parallelism in sequential computers
    1.1.2 vector or simd computers
    1.1.3 multiprocessors or mimd computers
    1.2 interconnection networks
    1.3 application of architectural parallelism
    1.4 getting started in simd and mimd programming
    1.5 parallelism in algorithms
    1.6 conclusion
    1.7 bibliographic notes
    chapter 2: potential for parallel computations
    2.1 parameters characterizing algorithm parallelism
    2.2 prefix problem
    2.3 parallel prefix algorithms
    2.3.1 upper/lower parallel prefix algorithm
    2.3.2 odd/even parallel prefix algorithm
    2.3.3 ladner and fischer's parallel prefix
    2.4 characterizing algorithm behavior for large problem size
    2.5 programming parallel prefix
    2.6 speedup and efficiency of parallel algorithms
    2.7 the performance perspective
    2.7.1 factors that influence performance
    2.7.2 a simple performance model--amdahl's law
    2.7.3 average execution rate
    2.8 conclusion
    2.9 bibliographic notes
    chapter 3: vector algorithms and architectures
    3.1 vector and matrix algorithms
    3.2 a vector architecture---single instruction multiple data
    3.3 an simd instruction set
    3.3.1 registers and memories of an simd computer
    3.3.2 vector, control unit, and cooperative instructions
    3.3.3 data-dependent conditional operations
    3.3.4 vector length and strip mining
    3.3.5 routing data among the pes
    3.4 the prime memory system
    3.5 use of the pe index to solve storage layout problems
    3.6 simd language constructs--fortran 90
    3.6.1 arrays and array sections
    3.6.2 array assignment and array expressions
    3.6.3 fortran 90 array intrinsic functions
    3.6.4 examples of simd operations in fortran 90
    3.7 pipelined simd vector computers
    3.7.1 pipelined simd processor structure
    processor/memory interaction
    number and types of pipelines
    implementation of arithmetic
    3.7.2 the memory interface of a pipelined simd computer
    3.7.3 performance of pipelined simd computers
    3.8 vector architecture summary
    3.9 bibliographic notes
    chapter 4: mimd computers or multiprocessors
    4.1 shared memory and message-passing architectures
    4.1.1 mixed-type multiprocessor architectures
    4.1.2 characteristics of shared memory and message passing
    4.1.3 switching topologies for message passing architectures
    4.1.4 direct and indirect networks
    4.1.5 classification of real systems
    4.2 overview of shared memory multiprocessor programming
    4.2.1 data sharing and process management
    4.2.2 synchronization
    4.2.3 atomicity and synchronization
    4.2.4 work distribution
    4.2.5 many processes executing one program
    4.3 shared memory programming alternatives and scope
    4.3.1 process management--starting, stopping, and hierarchy
    4.3.2 data access by parallel processes
    4.3.3 work distribution
    4.3.4 multiprocessor synchronization
    atomicity
    hardware and software synchronization mechanisms
    fairness and mutual exclusion
    4.4 a shared memory multiprocessor programming language
    4.4.1 the openmp language extension
    execution model
    process control
    parallel scope of variables
    work distribution
    synchronization and memory consistency
    4.4.2 the openmp fortran applications program interface (api)
    constructs of the openmp fortran api
    4.4.3 openmp fortran examples and discussion
    4.5 pipelined mimd--multithreading
    4.6 summary and conclusions
    4.7 bibliographic notes
    chapter 5: distributed memory multiprocessors
    5.1 distributing data and operations among processor/memory pairs
    5.2 programming with message passing
    5.2.1 the communicating sequential processes (csp) language
    5.2.2 a distributed memory programming example: matrix multiply
    5.3 characterization of communication
    5.3.1 point-to-point communications
    5.3.2 variable classes in a distributed memory program
    5.3.3 high-level communication operations
    5.3.4 distributed gauss elimination with high-level communications
    5.3.5 process topology versus processor topology
    5.4 the message passing interface, mpi
    5.4.1 basic concepts in mpi
    communicator structure
    the envelope
    the data
    point-to-point communication concepts
    collective communications concepts
    5.4.2 an example mpi program--matrix multiplication
    5.5 hardware managed communication--distributed cache
    5.5.1 cache coherence
    5.5.2 shared memory consistency
    5.6 conclusion---shared versus distributed memory multiprocessors
    5.7 bibliographic notes
    chapter 6: interconnection networks
    6.1 network characteristics
    6.2 permutations
    6.3 static networks
    6.3.1 mesh
    6.3.2 ring
    6.3.3 tree
    6.3.4 cube networks
    6.3.5 performance
    6.4 dynamic networks
    6.4.1 bus
    6.4.2 crossbar
    6.4.3 multistage interconnection networks (mins)
    benes network
    butterfly network
    omega network
    6.4.4 combining networks--mutual exclusion free synchronization
    6.4.5 performance
    6.5 conclusion
    6.6 bibliographic notes
    chapter 7: data dependence and parallelism
    7.1 discovering parallel operations in (sequential) code
    7.2 variables with complex names
    7.2.1 nested loops
    7.2.2 variations on the array reference disambiguation problem
    7.3 sample compiler techniques
    7.3.1 loop transformations
    7.3.2 loop restructuring
    7.3.3 loop replacement transformations
    7.3.4 anti- and output dependence removal transformations
    7.4 data flow principles
    7.4.1 data flow concepts
    7.4.2 graphical representation of data flow computations
    7.4.3 data flow conditionals
    7.4.4 data flow iteration
    7.4.5 data flow function application and recursion
    7.4.6 structured values in data flow--arrays
    7.5 data flow architectures
    7.5.1 the mit static data flow architecture
    7.5.2 dynamic data flow computers
    manchester data flow computer
    the mit tagged-token data flow machine
    7.5.3 issues to be addressed by data flow machines
    7.6 systolic arrays
    7.7 conclusion
    7.8 bibliographic notes
    chapter 8: implementing synchronization and data sharing
    8.1 the character of information conveyed by synchronization
    8.2 synchronizing different kinds of cooperative computations
    8.2.1 one producer with one or more consumers
    8.2.2 global reduction
    8.2.3 global prefix
    8.2.4 cooperative update of a partitioned structure
    8.2.5 managing a shared task set
    8.2.6 cooperative list manipulation
    8.2.7 parallel access queue using fetch&add
    8.2.8 histogram--fine granularity, data-dependent synchronization
    8.3 waiting mechanisms
    8.3.1 hardware waiting
    8.3.2 software waiting
    8.3.3 multilevel waiting
    8.4 mutual exclusion using atomic read and write
    8.5 proving a synchronization implementation correct
    8.5.1 implementing produce/consume using locks
    8.5.2 temporal logic
    8.5.3 proof of correctness
    8.6 alternative implementations of synchronization--barrier
    8.6.1 features of barrier synchronization
    8.6.2 characterization of barrier implementations
    8.7 conclusion
    8.8 bibliographic notes
    chapter 9: parallel processor performance
    9.1 amdahl's law revisited
    9.1.1 the effect of work granularity on amdahl's law
    9.1.2 least squares estimation of amdahl's law parameters
    9.2 parameterized execution time
    9.2.1 pipelined vector machine performance
    9.2.2 pipelined multiprocessor performance
    program sections with restricted parallelism
    programs requiring critical sections for parallel execution
    granularity correction to the critical section model
    9.2.3 multiple pipelined multiprocessors
    9.3 performance of barrier synchronization
    9.3.1 accounting for barrier performance
    9.3.2 instrumentation for barrier measurement
    9.3.3 examples of barrier performance measurement
    9.4 statistical models for static and dynamic parallel loops
    9.4.1 dynamic scheduling model
    dynamically scheduled iterations of equal length
    dynamically scheduled iterations of different lengths
    9.4.2 static scheduling model
    prescheduled iterations of equal length
    prescheduled iterations of different lengths
    9.4.3 comparison with experimental results
    9.5 conclusions
    9.6 bibliographic notes
    chapter 10: temporal behavior of parallel programs
    10.1 temporal characterization of cache behavior
    10.1.1 a temporal locality metric for cache behavior
    10.1.2 example application of the locality metric to bubble sort
    10.2 read sharing in multiprocessors with distributed caches
    10.2.1 a simple example of read sharing
    10.2.2 the ksr-1 architecture
    10.2.3 read multiplicity metric
    10.2.4 experiments
    10.2.5 programmed poststore and prefetch
    10.3 message waiting in message passing multiprocessors
    10.4 conclusion
    10.5 bibliographic notes
    chapter 11: parallel i/o
    11.1 the parallel i/o problem
    11.1.1 data dependence and i/o
    11.1.2 i/o format conversion
    11.1.3 numerical examples of i/o latency and bandwidth requirements
    11.2 hardware for parallel i/o
    11.2.1 control of the primary memory side of the transfer
    11.2.2 i/o channel concurrency
    11.2.3 peripheral device parallelism
    11.3 parallel access disk arrays--raid
    11.4 parallel formatted i/o in shared memory multiprocessors
    11.4.1 parallel input using c i/o routines fread() and sscanf()
    application independent input operations
    application dependent input operations
    11.4.2 parallel output using c i/o routines sprintf() and fwrite()
    application dependent output code
    application independent part of the output code
    11.5 collective i/o in multiprocessors--mpi-io
    11.5.1 mpi-2 i/o concepts
    11.5.2 mpi-io examples
    cyclically mapped read from a striped disk array
    mpi-io for distributed matrix multiply
    11.6 conclusions
    11.7 bibliographic notes
    appendix a: routines of the mpi message passing library
    a.1 point-to-point communications routines
    a.2 collective communications routines
    a.3 mpi data types and constructors
    a.4 communicators, process groups, and topologies
    a.5 mpi environment and error handling
    a.6 summary and mpi-2 extensions
    appendix b: synchronization mechanisms
    b.1 hardware level synchronization
    b.2 language level synchronization
    b.3 waiting mechanism
    bibliography
    index
