
Lowest price: ¥61.60
Contents

Foreword
Preface

1 Introduction
  1.1 Why Parallel Architecture
  1.2 Convergence of Parallel Architectures
  1.3 Fundamental Design Issues
  1.4 Concluding Remarks
  1.5 Historical References
  1.6 Exercises

2 Parallel Programs
  2.1 Parallel Application Case Studies
  2.2 The Parallelization Process
  2.3 Parallelization of an Example Program
  2.4 Concluding Remarks
  2.5 Exercises

3 Programming for Performance
  3.1 Partitioning for Performance
  3.2 Data Access and Communication in a Multimemory System
  3.3 Orchestration for Performance
  3.4 Performance Factors from the Processor's Perspective
  3.5 The Parallel Application Case Studies: An In-Depth Look
  3.6 Implications for Programming Models
  3.7 Concluding Remarks
  3.8 Exercises

4 Workload-Driven Evaluation
  4.1 Scaling Workloads and Machines
  4.2 Evaluating a Real Machine
  4.3 Evaluating an Architectural Idea or Trade-off
  4.4 Illustrating Workload Characterization
  4.5 Concluding Remarks
  4.6 Exercises

5 Shared Memory Multiprocessors
  5.1 Cache Coherence
  5.2 Memory Consistency
  5.3 Design Space for Snooping Protocols
  5.4 Assessing Protocol Design Trade-offs
  5.5 Synchronization
  5.6 Implications for Software
  5.7 Concluding Remarks
  5.8 Exercises

6 Snoop-Based Multiprocessor Design
  6.1 Correctness Requirements
  6.2 Base Design: Single-Level Caches with an Atomic Bus
  6.3 Multilevel Cache Hierarchies
  6.4 Split-Transaction Bus
  6.5 Case Studies: SGI Challenge and Sun Enterprise
  6.6 Extending Cache Coherence
  6.7 Concluding Remarks
  6.8 Exercises

7 Scalable Multiprocessors
  7.1 Scalability
  7.2 Realizing Programming Models
  7.3 Physical DMA
  7.4 User-Level Access
  7.5 Dedicated Message Processing
  7.6 Shared Physical Address Space
  7.7 Clusters and Networks of Workstations
  7.8 Implications for Parallel Software
  7.9 Synchronization
  7.10 Concluding Remarks
  7.11 Exercises

8 Directory-Based Cache Coherence
  8.1 Scalable Cache Coherence
  8.2 Overview of Directory-Based Approaches
  8.3 Assessing Directory Protocols and Trade-Offs
  8.4 Design Challenges for Directory Protocols
  8.5 Memory-Based Directory Protocols: The SGI Origin System
  8.6 Cache-Based Directory Protocols: The Sequent NUMA-Q
  8.7 Performance Parameters and Protocol Performance
  8.8 Synchronization
  8.9 Implications for Parallel Software
  8.10 Advanced Topics
  8.11 Concluding Remarks
  8.12 Exercises

9 Hardware/Software Trade-Offs
  9.1 Relaxed Memory Consistency Models
  9.2 Overcoming Capacity Limitations
  9.3 Reducing Hardware Cost
  9.4 Putting It All Together: A Taxonomy and Simple COMA
  9.5 Implications for Parallel Software
  9.6 Advanced Topics
  9.7 Concluding Remarks
  9.8 Exercises

10 Interconnection Network Design
  10.1 Basic Definitions
  10.2 Basic Communication Performance
  10.3 Organizational Structure
  10.4 Interconnection Topologies
  10.5 Evaluating Design Trade-Offs in Network Topology
  10.6 Routing
  10.7 Switch Design
  10.8 Flow Control
  10.9 Case Studies
  10.10 Concluding Remarks
  10.11 Exercises

11 Latency Tolerance
  11.1 Overview of Latency Tolerance
  11.2 Latency Tolerance in Explicit Message Passing
  11.3 Latency Tolerance in a Shared Address Space
  11.4 Block Data Transfer in a Shared Address Space
  11.5 Proceeding Past Long-Latency Events
  11.6 Precommunication in a Shared Address Space
  11.7 Multithreading in a Shared Address Space
  11.8 Lockup-Free Cache Design
  11.9 Concluding Remarks
  11.10 Exercises

12 Future Directions
  12.1 Technology and Architecture
  12.2 Applications and System Software

Appendix: Parallel Benchmark Suites