Up: Designing and Building Parallel Programs
- *Lisp
- Chapter Notes
- Abstract processors in HPF
- 7.3.1 Processors
- Actor model
- Chapter Notes
- Agglomeration
- 2.1 Methodical Design
- and granularity
- 2.4.1 Increasing Granularity, Surface-to-Volume Effects.
- design checklist
- 2.4.4 Agglomeration Design Checklist
- for atmosphere model
- Agglomeration.
- for floorplan optimization
- Agglomeration.
- for Fock matrix problem
- Communication and Agglomeration.
- in data-parallel model
- 7.1.3 Design
- AIMS performance tool
- 9.4.7 AIMS, Chapter Notes
- Amdahl's law
- application to HPF
- 7.7.2 Sequential Bottlenecks
- definition
- 3.2.1 Amdahl's Law, Chapter Notes
- Applied Parallel Research
- Chapter Notes
- ARPANET
- Chapter Notes
- Asymptotic analysis
- limitations of
- 3.2.3 Asymptotic Analysis, 3.2.3 Asymptotic Analysis
- reference
- Chapter Notes
- Asynchronous communication
- 2.3.4 Asynchronous Communication
- in CC++
- 5.6 Asynchronous Communication
- in FM
- 6.5 Asynchronous Communication
- in MPI
- 8.4 Asynchronous Communication
- Asynchronous Transfer Mode
- 1.2.2 Other Machine Models
- Atmosphere model
- basic equations
- 2.6.1 Atmosphere Model Background
- description
- (, )
- parallel algorithms
- (, )
- references
- Chapter Notes
- BBN Butterfly
- Chapter Notes
- Bisection bandwidth
- Exercises
- Bisection width
- Exercises, Chapter Notes
- Bitonic mergesort
- Chapter Notes
- Bottlenecks in HPF
- 7.7.2 Sequential Bottlenecks
- Branch-and-bound search
- description
- 2.7.1 Floorplan Background, Chapter Notes
- in MPI
- 8.1 The MPI Programming
- Breadth-first search
- Partition.
- Bridge construction problem
- definition
- 1.3.1 Tasks and Channels
- determinism
- 1.3.1 Tasks and Channels
- in CC++
- 5.2 CC++ Introduction
- in Fortran M
- 6.1 FM Introduction, 6.1 FM Introduction, 6.4.3 Dynamic Channel Structures
- in MPI
- 8.2 MPI Basics
- Bubblesort
- Exercises
- Bucketsort
- Chapter Notes
- Bus-based networks
- Bus-based Networks.
- Busy waiting strategy
- 6.5 Asynchronous Communication
- Butterfly
- bandwidth competition on
- Multistage Interconnection Networks.
- description
- Replicating Computation.
- hypercube formulation
- Hypercube Network.
- C*
- Chapter Notes, 7 High Performance Fortran, Chapter Notes
- C++
- Chapter Notes
- classes
- 5.1.2 Classes
- constructor functions
- 5.1.2 Classes
- default constructors
- 5.1.2 Classes
- inheritance
- 5.1.3 Inheritance, 5.1.3 Inheritance
- member functions
- 5.1.2 Classes
- overloading
- 5.1.1 Strong Typing and
- protection
- 5.1.2 Classes
- virtual functions
- 5.1.3 Inheritance
- Cache effect
- 3.6.2 Speedup Anomalies
- Cache memory
- 1.2.2 Other Machine Models, Bus-based Networks.
- CC++
- Part II: Tools
- asynchronous communication
- 5.6 Asynchronous Communication
- basic abstractions
- 5.2 CC++ Introduction
- channel communication
- 5.5.2 Synchronization
- communication costs
- 5.10 Performance Issues, 5.10 Performance Issues
- communication structures
- 5.5 Communication
- compiler optimization
- 5.10 Performance Issues
- concurrency
- 5.3 Concurrency
- library building
- 5.11 Case Study: Channel
- locality
- 5.4 Locality
- mapping
- (, )
- modularity
- 5.9 Modularity
- nondeterministic interactions
- 5.7 Determinism
- sequential composition
- 5.9 Modularity, 5.9 Modularity
- synchronization mechanisms
- 5.5.2 Synchronization, 5.5.2 Synchronization, 5.5.2 Synchronization, 5.5.3 Mutual Exclusion
- threads
- 5.2 CC++ Introduction
- tutorial
- Chapter Notes
- unstructured parallelism
- 5.3 Concurrency
- CHAMMP climate modeling program
- Chapter Notes
- Channels
- 1.3.1 Tasks and Channels
- and data dependencies
- 1.3.1 Tasks and Channels
- connecting outport/inport pairs
- 1.3.1 Tasks and Channels
- creation in Fortran M
- 6.3.1 Creating Channels
- dynamic in Fortran M
- 6.4.3 Dynamic Channel Structures
- for argument passing in Fortran M
- 6.7 Argument Passing
- in communication
- 2.3 Communication
- in CSP
- Chapter Notes
- Checkpointing
- 3.8 Input/Output, Chapter Notes
- CHIMP
- Chapter Notes
- Classes in C++
- 5.1.2 Classes
- Climate modeling
- 1.1.1 Trends in Applications, 2.2.2 Functional Decomposition, 2.6 Case Study: Atmosphere , 9.4.1 Paragraph
- in CC++
- 5.8.2 Mapping Threads to
- in Fortran M
- 6.8.3 Submachines
- in MPI
- 8.8 Case Study: Earth
- Clock synchronization
- 9.3.2 Traces, Chapter Notes
- CM Fortran
- Chapter Notes
- Collaborative work environments
- 1.1.1 Trends in Applications
- Collective communication
- 8.3 Global Operations, 9.4.2 Upshot
- Collocation of arrays
- 7.3.2 Alignment
- Combining scatter
- 7.6.3 HPF Features Not
- Communicating Sequential Processes
- Chapter Notes
- Communication
- (, )
- and channels
- 2.3 Communication
- collective
- 8.1 The MPI Programming , 8.3 Global Operations
- design checklist
- 2.3.5 Communication Design Checklist
- disadvantages of local
- 2.3.2 Global Communication
- for atmosphere model
- Communication.
- for floorplan optimization
- Communication.
- for Fock matrix problem
- Communication and Agglomeration.
- in CC++
- 5.5 Communication
- in data-parallel model
- 7.1.3 Design
- in Fortran M
- 6.3 Communication
- in MPI
- 8.1 The MPI Programming
- synchronous
- 6.4.3 Dynamic Channel Structures, 8.6.2 MPI Features Not
- Communication costs
- Communication Time.
- bandwidth competition
- 3.7 A Refined Communication
- in CC++
- 5.10 Performance Issues
- in HPF
- 7.7.3 Communication Costs
- in MPI
- 8.7 Performance Issues, 8.7 Performance Issues
- of unaligned array mapping
- 7.7.3 Communication Costs
- with cyclic distribution
- 7.7.3 Communication Costs
- Communication patterns
- 2.3 Communication
- asynchronous
- 2.3 Communication, 2.3.4 Asynchronous Communication, 6.5 Asynchronous Communication, 7.6.2 Storage and Sequence , 8.4 Asynchronous Communication
- dynamic
- 2.3 Communication, 2.3.3 Unstructured and Dynamic
- local
- 2.3 Communication
- many-to-many
- 6.4.2 Many-to-Many Communication
- many-to-one
- 1.4.4 Parameter Study, 6.4.1 Many-to-One Communication
- point-to-point
- 8.1 The MPI Programming
- static
- 2.3 Communication
- structured
- 2.3 Communication
- synchronous
- 2.3 Communication, 6.4.3 Dynamic Channel Structures
- unstructured
- 2.3.3 Unstructured and Dynamic , 6.4 Unstructured Communication
- Communication time
- Communication Time.
- Communication/computation ratio
- Surface-to-Volume Effects.
- Communicators
- see MPI
- Competition for bandwidth
- examples
- Multistage Interconnection Networks., Multistage Interconnection Networks.
- idealized model of
- 3.7.1 Competition for Bandwidth
- impact
- 3.7 A Refined Communication
- Compilers
- data-parallel
- 7.1.3 Design, 7.7.1 HPF Compilation
- for CC++
- 5.10 Performance Issues
- for Fortran M
- 6.10 Performance Issues
- for HPF
- 7.7.1 HPF Compilation, Chapter Notes
- Composition
- concurrent
- 4.2 Modularity and Parallel , 4.2.4 Concurrent Composition
- definition
- 4 Putting Components Together
- parallel
- 4.2 Modularity and Parallel
- sequential
- 4.2 Modularity and Parallel , 4.2.2 Sequential Composition
- Compositional C++
- see CC++
- Computation time
- Computation Time.
- Computational chemistry
- 2.8 Case Study: Computational , Chapter Notes
- Computational geometry
- 12 Further Reading
- Computer architecture
- 1.2.2 Other Machine Models, 3.7.2 Interconnection Networks
- references
- Chapter Notes, Chapter Notes, 12 Further Reading
- trends
- 1.1.4 Summary of Trends
- Computer performance improvement
- 1.1.2 Trends in Computer , 1.1.2 Trends in Computer
- Computer trends
- 1.1.4 Summary of Trends
- Computer vision
- Chapter Notes, 12 Further Reading
- Computer-aided diagnosis
- 1.1.1 Trends in Applications
- Concert C
- Chapter Notes
- Concurrency
- explicit vs. implicit
- 7.1.1 Concurrency
- in CC++
- 5.3 Concurrency
- in data-parallel programs
- 7.1.1 Concurrency
- in Fortran M
- 6.2 Concurrency
- parallel software requirement
- 1.1.2 Trends in Computer
- Concurrent C
- Chapter Notes
- Concurrent composition
- 4.2 Modularity and Parallel , 4.2.4 Concurrent Composition
- benefits
- 4.2.4 Concurrent Composition, 4.2.4 Concurrent Composition
- cost
- 4.2.4 Concurrent Composition
- example
- 4.2.4 Concurrent Composition
- in CC++
- 5.8.2 Mapping Threads to
- in Fortran M
- 6.8.3 Submachines
- tuple space example
- 4.5 Case Study: Tuple
- Concurrent Computation Project
- 12 Further Reading
- Concurrent data structures
- 12 Further Reading
- Concurrent logic programming
- Chapter Notes
- Conferences in parallel computing
- 12 Further Reading
- Conformality
- definition
- 7.1.1 Concurrency
- in Fortran M
- 6.3.1 Creating Channels
- of array sections
- 7.2.1 Array Assignment Statement
- Constructor functions in C++
- 5.1.2 Classes
- Convolution algorithm
- application in image processing
- 4.4 Case Study: Convolution
- components
- 4.4.1 Components
- parallel 2-D FFTs
- 4.4.1 Components
- parallel composition
- 4.4.2 Composing Components
- sequential composition
- 4.4.2 Composing Components
- COOL
- Chapter Notes
- Cosmic Cube
- Chapter Notes, Chapter Notes, Chapter Notes
- Counters
- 9.1 Performance Analysis, 9.2.2 Counters
- Cray T3D
- 1.2.2 Other Machine Models, Chapter Notes
- Crossbar switching network
- Crossbar Switching Network.
- Cycle time trends
- 1.1.2 Trends in Computer
- Cyclic mapping
- Cyclic Mappings., Mapping., Mapping., Chapter Notes
- in HPF
- 7.3.3 Distribution, 7.7.3 Communication Costs, 7.8 Case Study: Gaussian
- Data collection
- (, )
- basic techniques
- 9.1 Performance Analysis
- counters
- 9.2.2 Counters
- process
- 9.2.4 Summary of Data
- traces
- 9.2.3 Traces
- Data decomposition
- see Domain decomposition
- Data dependency
- 1.3.1 Tasks and Channels
- Data distribution
- at module boundaries
- 4.2.1 Data Distribution
- dynamic
- 7.6.3 HPF Features Not
- in data-parallel languages
- 7.1.2 Locality
- in HPF
- (, )
- Data distribution neutrality
- benefits
- 4.2.1 Data Distribution
- example
- (, )
- in ScaLAPACK
- 4.2.2 Sequential Composition
- in SPMD libraries
- Chapter Notes
- Data fitting
- 3.5.3 Fitting Data to
- Data parallelism
- 1.3.2 Other Programming Models, 7 High Performance Fortran
- and Fortran 90
- 7.1.4 Data-Parallel Languages, 7.2.2 Array Intrinsic Functions
- and HPF
- 7.1.4 Data-Parallel Languages
- and modular design
- 7.1.3 Design
- and task parallelism
- Chapter Notes
- for irregular problems
- Chapter Notes
- languages
- 7.1.4 Data-Parallel Languages, 9.3.3 Data-Parallel Languages
- Data reduction
- 9.3.1 Profile and Counts, 9.3.2 Traces
- Data replication
- 3.9.3 Shortest-Path Algorithms Summary
- Data transformation
- 9.1 Performance Analysis
- Data visualization
- 9.1 Performance Analysis, 9.3.2 Traces
- Data-parallel C
- Chapter Notes, Chapter Notes
- Data-parallel languages
- 7.1.4 Data-Parallel Languages, 9.3.3 Data-Parallel Languages
- Data-parallel model
- 1.3.2 Other Programming Models, 7.1.3 Design, 7.1.3 Design, 7.1.3 Design
- Databases
- Chapter Notes, Chapter Notes, 4.5.1 Application
- Deadlock detection
- 12 Further Reading
- Decision support
- 1.1.1 Trends in Applications
- Dense matrix algorithms
- 12 Further Reading
- Depth-first search
- Agglomeration.
- Design checklists
- agglomeration
- 2.4.4 Agglomeration Design Checklist
- communication
- 2.3.5 Communication Design Checklist
- mapping
- 2.5.3 Mapping Design Checklist
- modular design
- Design checklist.
- partitioning
- 2.2.3 Partitioning Design Checklist
- Determinism
- 1.3.1 Tasks and Channels
- advantages
- 1.3.1 Tasks and Channels, Chapter Notes
- in CC++
- 5.7 Determinism
- in Fortran M
- 6.6 Determinism
- in MPI
- 8.2.2 Determinism
- Diagonalization
- Exercises, 9.4.2 Upshot
- Diameter of network
- 3.7.1 Competition for Bandwidth
- Dijkstra's algorithm
- 3.9.2 Dijkstra's Algorithm, 3.9.3 Shortest-Path Algorithms Summary
- DINO
- Chapter Notes
- DISCO
- Communication and Agglomeration.
- Distributed computing
- 1.1.3 Trends in Networking
- Distributed data structures
- Fock matrix
- 2.8 Case Study: Computational
- for load balancing
- Decentralized Schemes.
- implementation
- (, )
- in CC++
- 5.12 Case Study: Fock
- in Fortran M
- 6.11 Case Study: Fock
- in MPI
- 8.4 Asynchronous Communication
- tuple space
- 4.5 Case Study: Tuple
- Divide-and-conquer
- Uncovering Concurrency: Divide
- Domain decomposition
- 2.2 Partitioning, 2.2.1 Domain Decomposition
- communication requirements
- 2.3 Communication
- for atmosphere model
- 2.6 Case Study: Atmosphere
- for Fock matrix problem
- Partition.
- Efficiency
- 3.3.2 Efficiency and Speedup, 3.3.2 Efficiency and Speedup, 3.3.2 Efficiency and Speedup
- Embarrassingly parallel problems
- 1.4.4 Parameter Study
- Entertainment industry
- 1.1.1 Trends in Applications
- Environmental enquiry in MPI
- 8.6.2 MPI Features Not
- Ethernet
- 1.2.2 Other Machine Models, Chapter Notes
- performance
- Communication Time., Ethernet., Multistage Interconnection Networks., Multistage Interconnection Networks.
- Event traces
- 9.1 Performance Analysis, 9.3.2 Traces
- Execution profile
- 3.4.3 Execution Profiles, 3.6 Evaluating Implementations
- Execution time
- (, )
- as performance metric
- 3.3 Developing Models
- limitations of
- 3.3.2 Efficiency and Speedup
- Exhaustive search
- 2.7.1 Floorplan Background
- Experimental calibration
- 3.5.1 Experimental Design, 3.5.1 Experimental Design, 3.5.3 Fitting Data to
- Express
- Part II: Tools, 8 Message Passing Interface, Chapter Notes, Chapter Notes
- Fairness
- in CC++
- 5.10 Performance Issues
- in Fortran M
- 6.10 Performance Issues
- Fast Fourier transform
- 4.4 Case Study: Convolution
- in convolution
- (, )
- in HPF
- 7.4.2 The INDEPENDENT Directive
- performance
- Multistage Interconnection Networks.
- using hypercube
- Chapter Notes
- Fine-grained decomposition
- 2.2 Partitioning
- Finite difference algorithm
- computation cost
- 3.5.3 Fitting Data to
- efficiency
- 3.3.2 Efficiency and Speedup
- execution time
- Idle Time.
- in CC++
- 5.9 Modularity
- in Fortran 90
- 7.2.2 Array Intrinsic Functions
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.3.3 Distribution
- in MPI
- 8.3.3 Reduction Operations
- isoefficiency analysis
- 3.4.2 Scalability with Scaled
- Finite element method
- 2.3.3 Unstructured and Dynamic
- Fixed problem analysis
- 3.4.1 Scalability with Fixed
- Floorplan optimization problem
- description
- (, )
- parallel algorithms
- (, )
- Floyd's algorithm
- (, ), (, )
- Fock matrix problem
- algorithms for
- Chapter Notes
- description
- (, )
- in CC++
- 5.12 Case Study: Fock
- in Fortran M
- 6.11 Case Study: Fock
- in MPI
- 8.4 Asynchronous Communication, 8.4 Asynchronous Communication, 8.6.1 Derived Datatypes
- performance
- 9.4.2 Upshot
- Fortran 90
- array assignment
- 7.2.1 Array Assignment Statement, 7.4 Concurrency
- array intrinsics
- 7.2.2 Array Intrinsic Functions
- as basis for HPF
- 7.1.4 Data-Parallel Languages
- conformality
- 7.1.1 Concurrency, 7.2.1 Array Assignment Statement
- CSHIFT function
- 7.2.2 Array Intrinsic Functions
- explicit parallelism in
- 7.1.1 Concurrency
- finite difference problem
- 7.2.2 Array Intrinsic Functions
- implicit parallelism in
- 7.1.1 Concurrency
- inquiry functions
- 7.6.1 System Inquiry Intrinsic
- limitations as data-parallel language
- 7.2.2 Array Intrinsic Functions
- SIZE function
- 7.6.1 System Inquiry Intrinsic
- transformational functions
- 7.2.2 Array Intrinsic Functions
- WHERE
- 7.2.1 Array Assignment Statement
- Fortran D
- Chapter Notes
- Fortran M
- Part II: Tools
- and SPMD computations
- 6.9 Modularity
- argument passing
- 6.7 Argument Passing
- busy waiting strategy
- 6.5 Asynchronous Communication
- communication
- 6.3 Communication
- compiler optimization
- 6.10 Performance Issues
- concurrency
- 6.2 Concurrency
- conformality
- 6.3.1 Creating Channels
- determinism
- 6.6 Determinism, 6.7.1 Copying and Determinism
- distribution of data
- 6.5 Asynchronous Communication
- list of extensions
- 6.1 FM Introduction
- mapping
- (, )
- message passing
- 6.9 Modularity, 6.9 Modularity, 6.9 Modularity
- modularity
- 6.1 FM Introduction
- performance analysis
- 6.10 Performance Issues
- port variables
- 6.2.1 Defining Processes
- process creation
- 6.2.2 Creating Processes
- quick reference
- 6.12 Summary, 6.12 Summary
- sequential composition
- 6.9 Modularity
- tree-structured computation
- 6.3.3 Receiving Messages
- Fujitsu VPP 500
- Crossbar Switching Network.
- Functional decomposition
- (, )
- appropriateness
- 2.2.2 Functional Decomposition
- communication requirements
- 2.3 Communication
- complement to domain decomposition
- 2.2.2 Functional Decomposition
- design complexity reduced by
- 2.2.2 Functional Decomposition
- for climate model
- 2.2.2 Functional Decomposition
- for Fock matrix problem
- Partition.
- Functional programming
- Chapter Notes, 12 Further Reading
- Gantt chart
- 9.3.2 Traces, 9.4.1 Paragraph, 9.4.2 Upshot
- Gauge performance tool
- 9.4.4 Gauge, Chapter Notes
- Gauss-Seidel update
- 2.3.1 Local Communication, 2.3.1 Local Communication
- Gaussian elimination
- 7.8 Case Study: Gaussian , 9.3.3 Data-Parallel Languages
- Genetic sequences
- 4.5.1 Application
- GIGAswitch
- Crossbar Switching Network.
- Global communication
- 2.3.2 Global Communication
- Grand Challenge problems
- Chapter Notes
- Granularity
- 2.2 Partitioning
- agglomeration used to increase
- 2.4 Agglomeration
- flexibility related to
- 2.2 Partitioning
- of modular programs
- 4.3 Performance Analysis
- Handles in MPI
- 8.2.1 Language Bindings
- Hash tables
- 4.5.2 Implementation
- High Performance Fortran
- see HPF
- Histograms
- 9.3.1 Profile and Counts
- HPF
- Part II: Tools
- abstract processors
- 7.3.1 Processors
- advantages
- 7.9 Summary
- collocation of arrays
- 7.3.2 Alignment
- compilation
- 7.7.1 HPF Compilation
- data distribution
- (, )
- extrinsic functions
- 7.6.3 HPF Features Not
- language specification
- Chapter Notes
- mapping inquiry functions
- 7.6.3 HPF Features Not
- modularity
- 7.5 Dummy Arguments and
- pure functions
- 7.6.3 HPF Features Not
- remapping of arguments
- Strategy 1: Remap
- sequence association
- 7.6.2 Storage and Sequence
- storage association
- 7.6.2 Storage and Sequence
- subset (official)
- 7.1.4 Data-Parallel Languages
- system inquiry functions
- 7.6.1 System Inquiry Intrinsic
- Hypercube algorithms
- (, )
- all-to-all communication
- 11 Hypercube Algorithms
- matrix transposition
- 11.3 Matrix Transposition
- parallel mergesort
- 11.4 Mergesort
- template for
- 11 Hypercube Algorithms
- vector broadcast
- 11.2 Vector Reduction
- vector reduction
- 11.2 Vector Reduction, 11.2 Vector Reduction
- Hypercube network
- Hypercube Network.
- I/O, parallel
- applications requiring
- 3.8 Input/Output, Chapter Notes
- performance issues
- 3.8 Input/Output, 3.8 Input/Output
- two-phase strategy
- 3.8 Input/Output, Chapter Notes
- IBM RP3
- Chapter Notes
- IBM SP
- Chapter Notes
- Idle time
- Idle Time., 4.3 Performance Analysis
- Image processing
- Exercises, 4.4 Case Study: Convolution
- Immersive virtual environments
- 9.4.3 Pablo
- Incremental parallelization
- 3.2.1 Amdahl's Law
- Information hiding
- Ensure that modules
- Inheritance in C++
- 5.1.3 Inheritance
- Intel DELTA
- 3.6.2 Speedup Anomalies, Multistage Interconnection Networks., Multistage Interconnection Networks., Chapter Notes
- Intel iPSC
- Chapter Notes, Chapter Notes
- Intel Paragon
- 1.2.2 Other Machine Models, Chapter Notes, 9.4.5 ParAide
- Intent declarations
- 6.7.2 Avoiding Copying
- Interconnection Networks
- see Networks
- IPS-2 performance tool
- Chapter Notes
- Isoefficiency
- 3.4.2 Scalability with Scaled , Chapter Notes
- J machine
- Chapter Notes
- Jacobi update
- 2.3.1 Local Communication
- Journals in parallel computing
- 12 Further Reading
- Kali
- Chapter Notes
- Latency
- 3.1 Defining Performance
- Leapfrog method
- 10.3.2 The Leapfrog Method, 10.3.2 The Leapfrog Method, 10.3.3 Modified Leapfrog
- Least-squares fit
- 3.5.3 Fitting Data to , 3.5.3 Fitting Data to
- scaled
- 3.5.3 Fitting Data to
- simple
- 3.5.3 Fitting Data to
- Linda
- Chapter Notes
- and tuple space
- 4.5 Case Study: Tuple , Chapter Notes
- types of parallelism with
- Chapter Notes
- Load balancing
- cyclic methods
- Cyclic Mappings.
- dynamic methods
- 2.5 Mapping
- local methods
- 2.5 Mapping, Local Algorithms.
- manager/worker method
- Manager/Worker.
- probabilistic methods
- 2.5 Mapping, Probabilistic Methods.
- recursive bisection methods
- Recursive Bisection.
- Local area network
- 1.2.2 Other Machine Models
- Local communication
- definition
- 2.3.1 Local Communication
- finite difference example
- (, )
- Locality
- and task abstraction
- 1.3.1 Tasks and Channels
- definition
- 1.2.1 The Multicomputer
- in CC++
- 5.4 Locality
- in data-parallel programs
- 7.1.2 Locality, 7.8 Case Study: Gaussian
- in multicomputers
- 1.2.1 The Multicomputer
- in PRAM model
- 1.2.2 Other Machine Models
- Locks
- 1.3.2 Other Programming Models
- Machine parameters
- Communication Time.
- Mapping
- 2.1 Methodical Design
- design rules
- 2.5.3 Mapping Design Checklist
- in CC++
- 5.8 Mapping, 5.8.2 Mapping Threads to
- in data-parallel model
- 7.1.3 Design
- in Fortran M
- 6.8 Mapping
- Mapping independence
- 1.3.1 Tasks and Channels
- MasPar MP
- 1.2.2 Other Machine Models
- Matrix multiplication
- (, )
- 1-D decomposition
- 4.6.1 Parallel Matrix-Matrix Multiplication
- 2-D decomposition
- 4.6.1 Parallel Matrix-Matrix Multiplication, 4.6.1 Parallel Matrix-Matrix Multiplication
- and data distribution neutral libraries
- 4.6 Case Study: Matrix
- communication cost
- 4.6.2 Redistribution Costs
- communication structure
- 4.6.1 Parallel Matrix-Matrix Multiplication
- systolic communication
- 4.6.3 A Systolic Algorithm
- Matrix transpose
- see Transpose
- Meiko CS-2
- 1.2.2 Other Machine Models
- Member functions in C++
- 5.1.2 Classes
- Mentat
- Chapter Notes
- Mergesort
- (, )
- parallel
- Compare-Exchange.
- parallel algorithms
- (, )
- performance
- Performance
- references
- Chapter Notes
- sequential algorithm
- 11.4 Mergesort, 11.4 Mergesort
- Mesh networks
- Mesh Networks.
- Message Passing Interface
- see MPI
- Message-passing model
- description
- Chapter Notes
- in HPF
- 7.7 Performance Issues
- task/channel model comparison
- 1.3.2 Other Programming Models
- MIMD computers
- 1.2.2 Other Machine Models
- Modular design
- and parallel computing
- 1.3 A Parallel Programming , (, )
- design checklist
- Design checklist.
- in CC++
- 5.9 Modularity
- in Fortran M
- 6.9 Modularity
- in HPF
- 7.1.3 Design, 7.5 Dummy Arguments and
- in MPI
- 8.5 Modularity
- in task/channel model
- 1.3.1 Tasks and Channels
- performance analysis
- 4.3 Performance Analysis
- principles
- (, )
- Monte Carlo methods
- Chapter Notes
- MPI
- Part II: Tools
- basic functions
- 8.2 MPI Basics
- C binding
- C Language Binding.
- collective communication functions
- (, )
- communicators
- 8.5 Modularity, 8.5.1 Creating Communicators
- derived datatypes
- 8.6.1 Derived Datatypes, 8.6.1 Derived Datatypes
- determinism
- 8.2.2 Determinism, 8.2.2 Determinism
- environmental enquiry
- 8.6.2 MPI Features Not
- Fortran binding
- Fortran Language Binding.
- handles
- 8.2.1 Language Bindings
- message tags
- 8.2.2 Determinism
- modularity
- (, )
- MPMD model
- 8.1 The MPI Programming
- performance issues
- 8.7 Performance Issues
- probe operations
- 8.4 Asynchronous Communication
- starting a computation
- 8.2 MPI Basics
- MPI Forum
- Chapter Notes
- MPMD model
- 8.1 The MPI Programming
- MPP Apprentice
- Chapter Notes
- Multicomputer model
- 1.2.1 The Multicomputer, 3.3 Developing Models
- and locality
- 1.2.1 The Multicomputer
- early examples
- Chapter Notes
- Multicomputer Toolbox
- Chapter Notes
- Multiprocessors
- 1.2.2 Other Machine Models, 1.2.2 Other Machine Models
- Multistage networks
- Multistage Interconnection Networks., Multistage Interconnection Networks.
- nCUBE
- 1.2.2 Other Machine Models, Chapter Notes, Chapter Notes
- NESL
- Chapter Notes
- Networks
- ATM
- 1.2.2 Other Machine Models
- bus-based
- Bus-based Networks.
- crossbar switch
- Crossbar Switching Network.
- Ethernet
- Ethernet.
- hypercube
- Hypercube Network.
- LAN
- 1.2.2 Other Machine Models
- shared memory
- Bus-based Networks.
- torus
- Mesh Networks.
- trends in
- 1.1.3 Trends in Networking
- WAN
- 1.2.2 Other Machine Models
- Nondeterminism
- from random numbers
- 3.5.2 Obtaining and Validating
- in Fortran M
- 6.6 Determinism
- in message-passing model
- 8.2.2 Determinism
- in MPI
- 8.2.2 Determinism
- in parameter study problem
- 1.4.4 Parameter Study, 1.4.4 Parameter Study
- Notation
- Terminology
- Numerical analysis
- 12 Further Reading
- Object-oriented model
- 1.3.1 Tasks and Channels
- Objective C
- Chapter Notes
- Out-of-core computation
- 3.8 Input/Output
- Overhead anomalies
- 3.6.1 Unaccounted-for Overhead
- Overlapping computation and communication
- 2.4.2 Preserving Flexibility, Idle Time.
- Overloading in C++
- 5.1.1 Strong Typing and
- Owner computes rule
- 7 High Performance Fortran, 7.1.1 Concurrency, 7.8 Case Study: Gaussian
- P++ library
- Chapter Notes
- p4
- Part II: Tools, 8 Message Passing Interface, Chapter Notes
- Pablo performance tool
- 9.4.3 Pablo, Chapter Notes
- Pairwise interactions
- (, )
- in Fortran M
- 6.3.3 Receiving Messages
- in HPF
- 7.3.3 Distribution
- in MPI
- Fortran Language Binding., Fortran Language Binding., 8.2.2 Determinism
- Paragraph performance tool
- 9.4.1 Paragraph, Chapter Notes
- ParAide performance tool
- 9.4.5 ParAide, Chapter Notes
- Parallel algorithm design
- bibliography
- 12 Further Reading
- and performance
- 3.10 Summary
- case studies
- (, )
- methodology
- 2.1 Methodical Design, 2.9 Summary
- Parallel algorithms
- branch and bound search
- 2.7.1 Floorplan Background
- convolution
- 4.4 Case Study: Convolution
- fast Fourier transform
- 4.4 Case Study: Convolution
- Gaussian elimination
- 7.8 Case Study: Gaussian , 9.3.3 Data-Parallel Languages
- matrix multiplication
- (, )
- mergesort
- 11.4 Mergesort
- parallel prefix
- 7.6.3 HPF Features Not
- parallel suffix
- 7.6.3 HPF Features Not
- quicksort
- Chapter Notes
- random number generation
- 10 Random Numbers
- reduction