Figure 1.1: Sample decomposition of a 3-D mesh. The upper right corner box has been pulled out to show that the mesh has been subdivided along the x, y, and z axes.
Chapter 2: Node Hardware
Figure 2.1: Block diagram of a motherboard chipset. The chipset consists of the entire diagram excluding the processor and memory.
Chapter 3: Linux
Figure 3.1: A simple program to touch many pages of memory.
Chapter 4: System Area Networks
Figure 4.1: A simple cluster network.
Figure 4.2: A complex cluster network.
Chapter 5: Configuring and Tuning Cluster Networks
Figure 5.1: Layering of network protocols.
Figure 5.2: Diagram showing the configuration of our simple example cluster.
Figure 5.3: Diagram showing compute nodes with multiple interfaces on multiple networks. Notice that the Myrinet network is entirely internal to the cluster, a common design point since the dedicated network is typically much higher-performing than networks outside the cluster.
Figure 5.4: Some possible locations where one may wish to place a firewall, denoted by the curved dotted lines.
Figure 5.5: Some of the interesting points in the Linux kernel where network packets are affected. Letters mark points in kernel space where routing decisions are made; numbers mark some of the netfilter hooks that determine the fate of packets passing through. A) incoming packet routing decision; B) local machine process space; C) postrouting decision; 1) FORWARD netfilter chain; 2) INPUT netfilter chain; 3) OUTPUT netfilter chain.
Chapter 6: Setting up Clusters
Figure 6.1: Cable bundles. Wire ties make 8 power cables into a neat and manageable group.
Figure 6.2: The back of a rack, showing the clean organization of the cables. Note that the fans are unobstructed.
Figure 6.3: Basic Red Hat Kickstart file. The Red Hat installer, Anaconda, interprets the contents of the Kickstart file to build a node.
Figure 6.4: Description (Kickstart) graph. This graph completely describes all of the appliances of a Rocks cluster.
Figure 6.5: Description graph detail. This illustrates how two modules, 'standalone.xml' and 'base.xml', share a base configuration while differing in other specifics.
Figure 6.6: The ssh.xml module includes the ssh packages and configures the service in the Kickstart post section.
Figure 6.7: The 'base.xml' module configures the main section of the Kickstart file.
Chapter 7: An Introduction to Writing Parallel Programs for Clusters
Figure 7.1: Schematic of a general manager-worker system.
Figure 7.2: A simple server in C.
Figure 7.3: A simple client in C.
Figure 7.4: A simple server in Python.
Figure 7.5: A simple client in Python.
Figure 7.6: A simple server in Perl.
Figure 7.7: A simple client in Perl.
Figure 7.8: A Python server that uses select.
Figure 7.9: A Python client.
Figure 7.10: Matrix-matrix multiply program.
Figure 7.11: Manager for parameter study.
Figure 7.12: Two code fragments for parallelizing the Poisson problem with the Jacobi iteration.
Figure 7.13: Two code fragments for parallelizing the Poisson problem with the Jacobi iteration, including the communication of ghost points. Note the changes in the declarations for U and UNEW.
Figure 7.14: LU Factorization code. The factors L and U are computed in-place; that is, they are stored over the input matrix a.
Chapter 8: Parallel Programming with MPI
Figure 8.1: Simple "Hello World" program in MPI.
Figure 8.2: A more interesting version of "Hello World".
Figure 8.3: A more complex "Hello World" program in MPI. Only process 0 writes to stdout; each process sends a message to process 0.
Figure 8.4: Using MPI_Probe to find the size of a message before receiving it.
Figure 8.5: Framework of the matrix-vector multiply program.
Figure 8.6: The matrix-vector multiply program, manager code.
Figure 8.7: The matrix-vector multiply program, worker code.
Figure 8.8: Domain and 9 × 9 computational mesh for approximating the solution to the Poisson problem.
Figure 8.9: A simple version of the neighbor exchange code. See the text for a discussion of the limitations of this routine.
Figure 8.10: A better version of the neighbor exchange code.
Figure 8.11: Computing π using collective operations.
Figure 8.12: Computing π using the Monte Carlo method.
Figure 8.13: A parallel Poisson solver that exploits two libraries written with MPI.
Figure 8.14: The main program in a high-level program to solve a nonlinear partial differential equation using PETSc.
Figure 8.15: Jumpshot displaying message traffic.
Chapter 9: Advanced Topics in MPI Programming
Figure 9.1: Dynamic process matrix-vector multiply program, manager part.
Figure 9.2: Dynamic process matrix-vector multiply program, worker part.
Figure 9.3: Fault-tolerant manager.
Figure 9.4: Nonblocking exchange code for the Jacobi example.
Figure 9.5: A 12 × 12 computational mesh, divided into 4 × 4 domains, for approximating the solution to the Poisson problem using a two-dimensional decomposition.
Figure 9.6: Locations of mesh points in ulocal for a two-dimensional decomposition.
Figure 9.7: Nonblocking exchange code for the Jacobi problem for a two-dimensional decomposition of the mesh.
Figure 9.8: Two possible message-matching patterns when MPI_ANY_SOURCE is used in the MPI_Recv calls (from [48]).
Figure 9.9: Schematic representation of collective data movement in MPI.
Figure 9.10: Using MPI_Allgather and MPI_Allgatherv.
Figure 9.11: Parallel I/O of Jacobi solution. Note that this choice of file view works only for a single output step; if output of multiple steps of the Jacobi method is needed, the arguments to MPI_File_set_view must be modified.
Figure 9.12: C program for writing a distributed array that is also noncontiguous in memory because of a ghost area (derived from an example in [50]).
Figure 9.13: Neighbor exchange using MPI remote memory access.
Figure 9.14: Simple MPI program in C++.
Chapter 10: Parallel Virtual Machine
Figure 10.1: PVM used to create a Grid of clusters.
Figure 10.2: PVM program 'hello.c'.
Figure 10.3: PVM program 'hello_other.c'.
Figure 10.4: Output of fork/join program.
Chapter 14: Cluster Workload Management
Figure 14.1: Activities performed by a workload management system.
Chapter 15: Condor: A Distributed Job Scheduler
Figure 15.1: Examples of ClassAds in Condor.
Figure 15.2: Condor jobmonitor tool.
Figure 15.3: A sample Java submit file.
Figure 15.4: Remote system calls in the Standard Universe.
Figure 15.5: A directed acyclic graph with four nodes.
Figure 15.6: Daemon layout of an idle Condor pool.
Figure 15.7: Daemon layout when a job submitted from Machine 2 is running.
Figure 15.8: CondorView displaying machine usage.
Chapter 18: Scyld Beowulf
Figure 18.1: Evolution of Beowulf System Image.
Figure 18.2: Migration of processes using bproc.
Chapter 19: Parallel I/O and the Parallel Virtual File System
Figure 19.1: Parallel I/O system components.
Figure 19.2: Nested-strided example.
Figure 19.3: Frangipani and Petal file system architecture.
Figure 19.4: GPFS architecture using a storage area network.
Figure 19.5: Galley architecture.
Figure 19.6: PVFS file system architecture.
Figure 19.7: Concurrent writes and NFS.
Figure 19.8: Two-phase write steps.
Figure 19.9: PVFS2 software architecture.
Figure 19.10: Migrating storage objects.
Figure 19.11: Examples of data distributions.
Chapter 20: A Tale of Two Clusters: Chiba City and Jazz