Using Advanced MPI: Modern Features of the Message-Passing Interface [Paperback]

(Argonne National Laboratory), (University of Illinois Urbana-Champaign), (ETH Zürich), (Argonne National Laboratory)
  • Format: Paperback / softback, 392 pages, height x width x thickness: 229x203x17 mm, 140 b&w illus.; 280 illustrations
  • Series: Scientific and Engineering Computation
  • Publication date: 07-Nov-2014
  • Publisher: MIT Press
  • ISBN-10: 0262527634
  • ISBN-13: 9780262527637

This book offers a practical guide to the advanced features of the MPI (Message-Passing Interface) standard library for writing programs for parallel computers. It covers new features added in MPI-3, the latest version of the MPI standard, and updates from MPI-2. Like its companion volume, Using MPI, the book takes an informal, example-driven, tutorial approach. The material in each chapter is organized according to the complexity of the programs used as examples, starting with the simplest example and moving to more complex ones.

Using Advanced MPI covers major changes in MPI-3, including changes to remote memory access and one-sided communication that simplify semantics and enable better performance on modern hardware; new features such as nonblocking and neighborhood collectives for greater scalability on large systems; and minor updates to parallel I/O and dynamic processes. It also covers support for hybrid shared-memory/message-passing programming; MPI_Message, which aids in certain types of multithreaded programming; features that handle very large data; an interface that allows the programmer and the developer to access performance data; and a new binding of MPI to Fortran.
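
To give a flavor of one of the MPI-3 additions mentioned above, the following minimal sketch (our illustration, not an excerpt from the book) overlaps a global reduction with independent local work using the nonblocking collective MPI_Iallreduce:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)rank;   /* each process contributes its rank */
    double global = 0.0;
    MPI_Request req;

    /* Start the reduction; it can make progress in the background. */
    MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    /* ... independent local computation could overlap with the reduction here ... */

    MPI_Wait(&req, MPI_STATUS_IGNORE);
    if (rank == 0)
        printf("sum of ranks = %g\n", global);

    MPI_Finalize();
    return 0;
}

Chapter 2 of the book develops this pattern in more elaborate forms, for example in collective software pipelining and in nonblocking allreduce for Krylov methods.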

Series Foreword xv
Foreword xvii
Preface xix
1 Introduction 1(14)
1.1 MPI-1 and MPI-2 1(1)
1.2 MPI-3 2(1)
1.3 Parallelism and MPI 3(8)
1.3.1 Conway's Game of Life 4(1)
1.3.2 Poisson Solver 5(6)
1.4 Passing Hints to the MPI Implementation with MPI_Info 11(2)
1.4.1 Motivation, Description, and Rationale 12(1)
1.4.2 An Example from Parallel I/O 12(1)
1.5 Organization of This Book 13(2)
2 Working with Large-Scale Systems 15(40)
2.1 Nonblocking Collectives 16(15)
2.1.1 Example: 2-D FFT 16(3)
2.1.2 Example: Five-Point Stencil 19(1)
2.1.3 Matching, Completion, and Progression 20(2)
2.1.4 Restrictions 22(1)
2.1.5 Collective Software Pipelining 23(4)
2.1.6 A Nonblocking Barrier? 27(3)
2.1.7 Nonblocking Allreduce and Krylov Methods 30(1)
2.2 Distributed Graph Topologies 31(9)
2.2.1 Example: The Petersen Graph 37(1)
2.2.2 Edge Weights 37(2)
2.2.3 Graph Topology Info Argument 39(1)
2.2.4 Process Reordering 39(1)
2.3 Collective Operations on Process Topologies 40(8)
2.3.1 Neighborhood Collectives 41(3)
2.3.2 Vector Neighborhood Collectives 44(1)
2.3.3 Nonblocking Neighborhood Collectives 45(3)
2.4 Advanced Communicator Creation 48(7)
2.4.1 Nonblocking Communicator Duplication 48(2)
2.4.2 Noncollective Communicator Creation 50(5)
3 Introduction to Remote Memory Operations 55(46)
3.1 Introduction 57(2)
3.2 Contrast with Message Passing 59(3)
3.3 Memory Windows 62(3)
3.3.1 Hints on Choosing Window Parameters 64(1)
3.3.2 Relationship to Other Approaches 65(1)
3.4 Moving Data 65(6)
3.4.1 Reasons for Using Displacement Units 69(1)
3.4.2 Cautions in Using Displacement Units 70(1)
3.4.3 Displacement Sizes in Fortran 71(1)
3.5 Completing RMA Data Transfers 71(2)
3.6 Examples of RMA Operations 73(15)
3.6.1 Mesh Ghost Cell Communication 74(10)
3.6.2 Combining Communication and Computation 84(4)
3.7 Pitfalls in Accessing Memory 88(7)
3.7.1 Atomicity of Memory Operations 89(1)
3.7.2 Memory Coherency 90(1)
3.7.3 Some Simple Rules for RMA 91(2)
3.7.4 Overlapping Windows 93(1)
3.7.5 Compiler Optimizations 93(2)
3.8 Performance Tuning for RMA Operations 95(6)
3.8.1 Options for MPI_Win_create 95(2)
3.8.2 Options for MPI_Win_fence 97(4)
4 Advanced Remote Memory Access 101(56)
4.1 Passive Target Synchronization 101(1)
4.2 Implementing Blocking, Independent RMA Operations 102(2)
4.3 Allocating Memory for MPI Windows 104(4)
4.3.1 Using MPI_Alloc_mem and MPI_Win_allocate from C 104(1)
4.3.2 Using MPI_Alloc_mem and MPI_Win_allocate from Fortran 2008 105(2)
4.3.3 Using MPI_ALLOC_MEM and MPI_WIN_ALLOCATE from Older Fortran 107(1)
4.4 Another Version of NXTVAL 108(7)
4.4.1 The Nonblocking Lock 110(1)
4.4.2 NXTVAL with MPI_Fetch_and_op 110(2)
4.4.3 Window Attributes 112(3)
4.5 An RMA Mutex 115(5)
4.6 Global Arrays 120(10)
4.6.1 Create and Free 122(2)
4.6.2 Put and Get 124(3)
4.6.3 Accumulate 127(1)
4.6.4 The Rest of Global Arrays 128(2)
4.7 A Better Mutex 130(1)
4.8 Managing a Distributed Data Structure 131(17)
4.8.1 A Shared-Memory Distributed List Implementation 132(3)
4.8.2 An MPI Implementation of a Distributed List 135(5)
4.8.3 Inserting into a Distributed List 140(3)
4.8.4 An MPI Implementation of a Dynamic Distributed List 143(2)
4.8.5 Comments on More Concurrent List Implementations 145(3)
4.9 Compiler Optimization and Passive Targets 148(1)
4.10 MPI RMA Memory Models 149(3)
4.11 Scalable Synchronization 152(4)
4.11.1 Exposure and Access Epochs 152(1)
4.11.2 The Ghost-Point Exchange Revisited 153(2)
4.11.3 Performance Optimizations for Scalable Synchronization 155(1)
4.12 Summary 156(1)
5 Using Shared Memory with MPI 157(12)
5.1 Using MPI Shared Memory 159(4)
5.1.1 Shared On-Node Data Structures 159(1)
5.1.2 Communication through Shared Memory 160(3)
5.1.3 Reducing the Number of Subdomains 163(1)
5.2 Allocating Shared Memory 163(2)
5.3 Address Calculation 165(4)
6 Hybrid Programming 169(18)
6.1 Background 169(1)
6.2 Thread Basics and Issues 170(3)
6.2.1 Thread Safety 171(1)
6.2.2 Performance Issues with Threads 172(1)
6.2.3 Threads and Processes 173(1)
6.3 MPI and Threads 173(3)
6.4 Yet Another Version of NXTVAL 176(2)
6.5 Nonblocking Version of MPI_Comm_accept 178(1)
6.6 Hybrid Programming with MPI 179(3)
6.7 MPI Message and Thread-Safe Probe 182(5)
7 Parallel I/O 187(56)
7.1 Introduction 187(1)
7.2 Using MPI for Simple I/O 187(8)
7.2.1 Using Individual File Pointers 187(4)
7.2.2 Using Explicit Offsets 191(3)
7.2.3 Writing to a File 194(1)
7.3 Noncontiguous Accesses and Collective I/O 195(8)
7.3.1 Noncontiguous Accesses 195(4)
7.3.2 Collective I/O 199(4)
7.4 Accessing Arrays Stored in Files 203(12)
7.4.1 Distributed Arrays 204(2)
7.4.2 A Word of Warning about Darray 206(1)
7.4.3 Subarray Datatype Constructor 207(3)
7.4.4 Local Array with Ghost Area 210(1)
7.4.5 Irregularly Distributed Arrays 211(4)
7.5 Nonblocking I/O and Split Collective I/O 215(1)
7.6 Shared File Pointers 216(3)
7.7 Passing Hints to the Implementation 219(2)
7.8 Consistency Semantics 221(8)
7.8.1 Simple Cases 224(1)
7.8.2 Accessing a Common File Opened with MPI_COMM_WORLD 224(3)
7.8.3 Accessing a Common File Opened with MPI_COMM_SELF 227(1)
7.8.4 General Recommendation 228(1)
7.9 File Interoperability 229(5)
7.9.1 File Structure 229(1)
7.9.2 File Data Representation 230(1)
7.9.3 Use of Datatypes for Portability 231(2)
7.9.4 User-Defined Data Representations 233(1)
7.10 Achieving High I/O Performance with MPI 234(4)
7.10.1 The Four "Levels" of Access 234(3)
7.10.2 Performance Results 237(1)
7.11 An Example Application 238(4)
7.12 Summary 242(1)
8 Coping with Large Data 243(6)
8.1 MPI Support for Large Data 243(1)
8.2 Using Derived Datatypes 243(1)
8.3 Example 244(1)
8.4 Limitations of This Approach 245(4)
8.4.1 Collective Reduction Functions 245(1)
8.4.2 Irregular Collectives 246(3)
9 Support for Performance and Correctness Debugging 249(22)
9.1 The Tools Interface 250(13)
9.1.1 Control Variables 251(6)
9.1.2 Performance Variables 257(6)
9.2 Info, Assertions, and MPI Objects 263(4)
9.3 Debugging and the MPIR Debugger Interface 267(2)
9.4 Summary 269(2)
10 Dynamic Process Management 271(34)
10.1 Intercommunicators 271(1)
10.2 Creating New MPI Processes 271(20)
10.2.1 Parallel cp: A Simple System Utility 272(7)
10.2.2 Matrix-Vector Multiplication Example 279(5)
10.2.3 Intercommunicator Collective Operations 284(1)
10.2.4 Intercommunicator Point-to-Point Communication 285(1)
10.2.5 Finding the Number of Available Processes 285(5)
10.2.6 Passing Command-Line Arguments to Spawned Programs 290(1)
10.3 Connecting MPI Processes 291(11)
10.3.1 Visualizing the Computation in an MPI Program 292(2)
10.3.2 Accepting Connections from Other Programs 294(2)
10.3.3 Comparison with Sockets 296(2)
10.3.4 Moving Data between Groups of Processes 298(1)
10.3.5 Name Publishing 299(3)
10.4 Design of the MPI Dynamic Process Routines 302(3)
10.4.1 Goals for MPI Dynamic Process Management 302(1)
10.4.2 What MPI Did Not Standardize 303(2)
11 Working with Modern Fortran 305(8)
11.1 The mpi_f08 Module 305(1)
11.2 Problems with the Fortran Interface 306(7)
11.2.1 Choice Parameters in Fortran 307(1)
11.2.2 Nonblocking Routines in Fortran 308(2)
11.2.3 Array Sections 310(1)
11.2.4 Trouble with LOGICAL 311(2)
12 Features for Libraries 313(28)
12.1 External Interface Functions 313(11)
12.1.1 Decoding Datatypes 313(2)
12.1.2 Generalized Requests 315(7)
12.1.3 Adding New Error Codes and Classes 322(2)
12.2 Mixed-Language Programming 324(3)
12.3 Attribute Caching 327(4)
12.4 Using Reduction Operations Locally 331(2)
12.5 Error Handling 333(2)
12.5.1 Error Handlers 333(2)
12.5.2 Error Codes and Classes 335(1)
12.6 Topics Not Covered in This Book 335(6)
13 Conclusions 341(2)
13.1 MPI Implementation Status 341(1)
13.2 Future Versions of the MPI Standard 341(1)
13.3 MPI at Exascale 342(1)
A MPI Resources on the World Wide Web 343(2)
References 345(8)
Subject Index 353(6)
Function and Term Index 359