List of Figures

Chapter 1: So You Want to Use a Cluster

Figure 1.1: Sample decomposition of a 3-D mesh. The upper right corner box has been pulled out to show that the mesh has been subdivided along the x, y, and z axes.

Chapter 2: Node Hardware

Figure 2.1: Block diagram of a motherboard chipset. The chipset consists of the entire diagram excluding the processor and memory.

Chapter 3: Linux

Figure 3.1: A simple program to touch many pages of memory.

Chapter 4: System Area Networks

Figure 4.1: A simple cluster network.
Figure 4.2: A complex cluster network.

Chapter 5: Configuring and Tuning Cluster Networks

Figure 5.1: Layering of network protocols.
Figure 5.2: Diagram showing the configuration of our simple example cluster.
Figure 5.3: Diagram showing compute nodes with multiple interfaces on multiple networks. Notice that the Myrinet network is entirely internal to the cluster, a common design point since the dedicated network typically offers much higher performance than networks outside the cluster.
Figure 5.4: Some possible locations where one may wish to place a firewall, denoted by the curved dotted lines.
Figure 5.5: Some of the interesting points in the Linux kernel where network packets are affected. Letters mark points in kernel space where routing decisions are made; numbers mark some of the places where netfilter hooks determine the fate of packets passing through. A) incoming packet routing decision; B) local machine process space; C) postrouting decision; 1) FORWARD netfilter table; 2) INPUT netfilter table; 3) OUTPUT netfilter table.

Chapter 6: Setting up Clusters

Figure 6.1: Cable bundles. Wire ties make 8 power cables into a neat and manageable group.
Figure 6.2: The back of a rack, showing the clean organization of the cables. Note that the fans are unobstructed.
Figure 6.3: Basic RedHat Kickstart file. The RedHat installer, Anaconda, interprets the contents of the Kickstart file to build a node.
Figure 6.4: Description (Kickstart) Graph. This graph completely describes all of the appliances of a Rocks Cluster.
Figure 6.5: Description Graph Detail. This illustrates how the two modules 'standalone.xml' and 'base.xml' share a base configuration while differing in other specifics.
Figure 6.6: The ssh.xml module includes the ssh packages and configures the service in the Kickstart post section.
Figure 6.7: The 'base.xml' module configures the main section of the Kickstart file.

Chapter 7: An Introduction to Writing Parallel Programs for Clusters

Figure 7.1: Schematic of a general manager-worker system.
Figure 7.2: A simple server in C.
Figure 7.3: A simple client in C.
Figure 7.4: A simple server in Python.
Figure 7.5: A simple client in Python.
Figure 7.6: A simple server in Perl.
Figure 7.7: A simple client in Perl.
Figure 7.8: A Python server that uses select.
Figure 7.9: A Python client.
Figure 7.10: Matrix-matrix multiply program.
Figure 7.11: Manager for parameter study.
Figure 7.12: Two code fragments for parallelizing the Poisson problem with the Jacobi iteration.
Figure 7.13: Two code fragments for parallelizing the Poisson problem with the Jacobi iteration, including the communication of ghost points. Note the changes in the declarations for U and UNEW.
Figure 7.14: LU Factorization code. The factors L and U are computed in-place; that is, they are stored over the input matrix a.

Chapter 8: Parallel Programming with MPI

Figure 8.1: Simple "Hello World" program in MPI.
Figure 8.2: A more interesting version of "Hello World".
Figure 8.3: A more complex "Hello World" program in MPI. Only process 0 writes to stdout; each process sends a message to process 0.
Figure 8.4: Using MPI_Probe to find the size of a message before receiving it.
Figure 8.5: Framework of the matrix-vector multiply program.
Figure 8.6: The matrix-vector multiply program, manager code.
Figure 8.7: The matrix-vector multiply program, worker code.
Figure 8.8: Domain and 9 × 9 computational mesh for approximating the solution to the Poisson problem.
Figure 8.9: A simple version of the neighbor exchange code. See the text for a discussion of the limitations of this routine.
Figure 8.10: A better version of the neighbor exchange code.
Figure 8.11: Computing π using collective operations.
Figure 8.12: Computing π using the Monte Carlo method.
Figure 8.13: A parallel Poisson solver that exploits two libraries written with MPI.
Figure 8.14: The main program in a high-level program to solve a nonlinear partial differential equation using PETSc.
Figure 8.15: Jumpshot displaying message traffic.

Chapter 9: Advanced Topics in MPI Programming

Figure 9.1: Dynamic process matrix-vector multiply program, manager part.
Figure 9.2: Dynamic process matrix-vector multiply program, worker part.
Figure 9.3: Fault-tolerant manager.
Figure 9.4: Nonblocking exchange code for the Jacobi example.
Figure 9.5: A 12 × 12 computational mesh, divided into 4 × 4 domains, for approximating the solution to the Poisson problem using a two-dimensional decomposition.
Figure 9.6: Locations of mesh points in ulocal for a two-dimensional decomposition.
Figure 9.7: Nonblocking exchange code for the Jacobi problem for a two-dimensional decomposition of the mesh.
Figure 9.8: Two possible message-matching patterns when MPI_ANY_SOURCE is used in the MPI_Recv calls (from [48]).
Figure 9.9: Schematic representation of collective data movement in MPI.
Figure 9.10: Using MPI_Allgather and MPI_Allgatherv.
Figure 9.11: Parallel I/O of Jacobi solution. Note that this choice of file view works only for a single output step; if output of multiple steps of the Jacobi method are needed, the arguments to MPI_File_set_view must be modified.
Figure 9.12: C program for writing a distributed array that is also noncontiguous in memory because of a ghost area (derived from an example in [50]).
Figure 9.13: Neighbor exchange using MPI remote memory access.
Figure 9.14: Simple MPI program in C++.

Chapter 10: Parallel Virtual Machine

Figure 10.1: PVM used to create a Grid of clusters.
Figure 10.2: PVM program 'hello.c'.
Figure 10.3: PVM program 'hello_other.c'.
Figure 10.4: Output of fork/join program.

Chapter 14: Cluster Workload Management

Figure 14.1: Activities performed by a workload management system.

Chapter 15: Condor: A Distributed Job Scheduler

Figure 15.1: Examples of ClassAds in Condor.
Figure 15.2: Condor jobmonitor tool.
Figure 15.3: A sample Java submit file.
Figure 15.4: Remote System calls in the Standard Universe.
Figure 15.5: A directed acyclic graph with four nodes.
Figure 15.6: Daemon layout of an idle Condor pool.
Figure 15.7: Daemon layout when a job submitted from Machine 2 is running.
Figure 15.8: CondorView displaying machine usage.

Chapter 18: Scyld Beowulf

Figure 18.1: Evolution of Beowulf System Image.
Figure 18.2: Migration of processes using bproc.

Chapter 19: Parallel I/O and the Parallel Virtual File System

Figure 19.1: Parallel I/O System Components
Figure 19.2: Nested-Strided Example
Figure 19.3: Frangipani and Petal File System Architecture
Figure 19.4: GPFS Architecture Using Storage Area Network
Figure 19.5: Galley Architecture
Figure 19.6: PVFS File System Architecture
Figure 19.7: Concurrent Writes and NFS
Figure 19.8: Two-Phase Write Steps
Figure 19.9: PVFS2 Software Architecture
Figure 19.10: Migrating Storage Objects
Figure 19.11: Examples of Data Distributions

Chapter 20: A Tale of Two Clusters: Chiba City and Jazz

Figure 20.1: Chiba City schematic.
Figure 20.2: A Chiba City town.
Figure 20.3: The Chiba City Ethernet.
Figure 20.4: One of two rows of Chiba City.
Figure 20.5: Node image management.
Figure 20.6: OS image management.
Figure 20.7: Serial infrastructure.
Figure 20.8: Power infrastructure.
Figure 20.9: Argonne's Jazz cluster.
