
Numerical Linear Algebra for High-Performance Computers (English reprint edition)


Lowest price: ¥38.70

List price: ¥49.00

Author: Jack J. Dongarra

Publisher: Tsinghua University Press

Publication date: February 2011

ISBN: 9787302244998

Product details

Description

The purpose of this book is to unify and document in one place many of the techniques and much of the current understanding about solving systems of linear equations on vector and parallel computers. This book is not a textbook, but it is meant to provide a fast entrance to the world of vector and parallel processing for these linear algebra applications. We intend this book to be used by three groups of readers: graduate students, researchers working in computational science, and numerical analysts. As such, we hope this book can serve both as a reference and as a supplement to a teaching text on aspects of scientific computation.
The book is divided into five major parts: (1) introduction to terms and concepts, including an overview of the state of the art for high-performance computers and a discussion of performance evaluation (chapters 1-4); (2) direct solution of dense matrix problems (chapter 5); (3) direct solution of sparse systems of equations (chapter 6); (4) iterative solution of sparse systems of equations (chapters 7-9); and (5) iterative solution of sparse eigenvalue problems (chapters 10-11). Any book that attempts to cover these topics must necessarily be somewhat out of date before it appears, because the area is in a state of flux. We have purposely avoided highly detailed descriptions of popular machines and have tried instead to focus on concepts as much as possible; nevertheless, to make the description more concrete, we do point to specific computers.

About the authors

Jack J. Dongarra is a Distinguished Professor of Computer Science at the University of Tennessee and a Distinguished Scientist at Oak Ridge National Laboratory.
Iain S. Duff is Group Leader of Numerical Analysis at the CCLRC Rutherford Appleton Laboratory, the Project Leader for the Parallel Algorithms Group at CERFACS in Toulouse, and a Visiting Professor of Mathematics at the University of Strathclyde.
Danny C. Sorensen is a Professor..

Table of contents

Numerical Linear Algebra for High-Performance Computers (English reprint edition)
about the authors
preface
introduction
1 high-performance computing
1.1 trends in computer design
1.2 traditional computers and their limitations
1.3 parallelism within a single processor
1.3.1 multiple functional units
1.3.2 pipelining
1.3.3 overlapping
1.3.4 risc
1.3.5 vliw
1.3.6 vector instructions
1.3.7 chaining
1.3.8 memory-to-memory and register-to-register organizations
1.3.9 register set
1.3.10 stripmining
1.3.11 reconfigurable vector registers
1.3.12 memory organization
1.4 data organization
1.4.1 main memory
1.4.2 cache
1.4.3 local memory
1.5 memory management
1.6 parallelism through multiple pipes or multiple processors
1.7 message passing
1.8 virtual shared memory
1.8.1 routing
1.9 interconnection topology
1.9.1 crossbar switch
1.9.2 timeshared bus
1.9.3 ring connection
1.9.4 mesh connection
1.9.5 hypercube
1.9.6 multi-staged network
1.10 programming techniques
1.11 trends: network-based computing
2 overview of current high-performance computers
2.1 supercomputers
2.2 risc-based processors
2.3 parallel processors
3 implementation details and overhead
3.1 parallel decomposition and data dependency graphs
3.2 synchronization
3.3 load balancing
3.4 recurrence
3.5 indirect addressing
3.6 message passing
3.6.1 performance prediction
3.6.2 message-passing standards
3.6.3 routing
4 performance: analysis, modeling, and measurements
4.1 amdahl's law
4.1.1 simple case of amdahl's law
4.1.2 general form of amdahl's law
4.2 vector speed and vector length
4.3 amdahl's law--parallel processing
4.3.1 a simple model
4.3.2 gustafson's model
4.4 examples of (r_∞, n_{1/2})-values for various computers
4.4.1 cray j90 and cray t90 (one processor)
4.4.2 general observations
4.5 linpack benchmark
4.5.1 description of the benchmark
4.5.2 calls to the blas
4.5.3 asymptotic performance
5 building blocks in linear algebra
5.1 basic linear algebra subprograms
5.1.1 level 1 blas
5.1.2 level 2 blas
5.1.3 level 3 blas
5.2 levels of parallelism
5.2.1 vector computers
5.2.2 parallel processors with shared memory
5.2.3 parallel-vector computers
5.2.4 clusters computing
5.3 basic factorizations of linear algebra
5.3.1 point algorithm: gaussian elimination with partial pivoting
5.3.2 special matrices
5.4 blocked algorithms: matrix-vector and matrix-matrix versions
5.4.1 right-looking algorithm
5.4.2 left-looking algorithm
5.4.3 crout algorithm
5.4.4 typical performance of blocked lu decomposition
5.4.5 blocked symmetric indefinite factorization
5.4.6 typical performance of blocked symmetric indefinite factorization
5.5 linear least squares
5.5.1 householder method
5.5.2 blocked householder method
5.5.3 typical performance of the blocked householder factorization
5.6 organization of the modules
5.6.1 matrix-vector product
5.6.2 matrix-matrix product
5.6.3 typical performance for parallel processing
5.6.4 benefits
5.7 lapack
5.8 scalapack
5.8.1 the basic linear algebra communication subprograms (blacs)
5.8.2 pblas
5.8.3 scalapack sample code
6 direct solution of sparse linear systems
6.1 introduction to direct methods for sparse linear systems
6.1.1 four approaches
6.1.2 description of sparse data structure
6.1.3 manipulation of sparse data structures
6.2 general sparse matrix methods
6.2.1 fill-in and sparsity ordering
6.2.2 indirect addressing--its effect and how to avoid it
6.2.3 comparison with dense codes
6.2.4 other approaches
6.3 methods for symmetric matrices and band systems
6.3.1 the clique concept in gaussian elimination
6.3.2 further comments on ordering schemes
6.4 frontal methods
6.4.1 frontal methods--link to band methods and numerical pivoting
6.4.2 vector performance
6.4.3 parallel implementation of frontal schemes
6.5 multifrontal methods
6.5.1 performance on vector machines
6.5.2 performance on risc machines
6.5.3 performance on parallel machines
6.5.4 exploitation of structure
6.5.5 unsymmetric multifrontal methods
6.6 other approaches for exploitation of parallelism
6.7 software
6.8 brief summary
7 krylov subspaces: projection
7.1 notation
7.2 basic iteration methods: richardson iteration, power method
7.3 orthogonal basis (arnoldi, lanczos)
8 iterative methods for linear systems
8.1 krylov subspace solution methods: basic principles
8.1.1 the ritz-galerkin approach: fom and cg
8.1.2 the minimum residual approach: gmres and minres
8.1.3 the petrov-galerkin approach: bi-cg and qmr
8.1.4 the minimum error approach: symmlq and gmerr
8.2 iterative methods in more detail
8.2.1 the cg method
8.2.2 parallelism in the cg method: general aspects
8.2.3 parallelism in the cg method: communication overhead
8.2.4 minres
8.2.5 least squares cg
8.2.6 gmres and gmres(m)
8.2.7 gmres with variable preconditioning
8.2.8 bi-cg and qmr
8.2.9 cgs
8.2.10 bi-cgstab
8.2.11 bi-cgstab(l) and variants
8.3 other issues
8.4 how to test iterative methods
9 preconditioning and parallel preconditioning
9.1 preconditioning and parallel preconditioning
9.2 the purpose of preconditioning
9.3 incomplete lu decompositions
9.3.1 efficient implementations of ilu(0) preconditioning
9.3.2 general incomplete decompositions
9.3.3 variants of ilu preconditioners
9.3.4 some general comments on ilu
9.4 some other forms of preconditioning
9.4.1 sparse approximate inverse (spai)
9.4.2 polynomial preconditioning
9.4.3 preconditioning by blocks or domains
9.4.4 element by element preconditioners
9.5 vector and parallel implementation of preconditioners
9.5.1 partial vectorization
9.5.2 reordering the unknowns
9.5.3 changing the order of computation
9.5.4 some other vectorizable preconditioners
9.5.5 parallel aspects of reorderings
9.5.6 experiences with parallelism
10 linear eigenvalue problems Ax = λx
10.1 theoretical background and notation
10.2 single-vector methods
10.3 the qr algorithm
10.4 subspace projection methods
10.5 the arnoldi factorization
10.6 restarting the arnoldi process
10.6.1 explicit restarting
10.7 implicit restarting
10.8 lanczos' method
10.9 harmonic ritz values and vectors
10.10 other subspace iteration methods
10.11 davidson's method
10.12 the jacobi-davidson iteration method
10.12.1 jdqr
10.13 eigenvalue software: arpack, p_arpack
10.13.1 reverse communication interface
10.13.2 parallelizing arpack
10.13.3 data distribution of the arnoldi factorization
10.14 message passing
10.15 parallel performance
10.16 availability
10.17 summary
11 the generalized eigenproblem
11.1 arnoldi/lanczos with shift-invert
11.2 alternatives to arnoldi/lanczos with shift-invert
11.3 the jacobi-davidson qz algorithm
11.4 the jacobi-davidson qz method: restart and deflation
11.5 parallel aspects
a acquiring mathematical software
a.1 netlib
a.1.1 mathematical software
a.2 mathematical software libraries
b glossary
c level 1, 2, and 3 blas quick reference
d operation counts for various blas and decompositions
bibliography
index
