# 8.6 MPI Programming without MPI

## 8.6 MPI Programming without MPI

One of the major strengths of MPI is the support that MPI provides for building libraries of useful software. These libraries often eliminate the need for explicit programming in MPI; in cases where no suitable library exists, MPI's design encourages the use of modern software engineering techniques in creating application-specific support libraries. Some of the available libraries are shown in Table 9.1; Chapter 12 discusses some of the more important libraries in more detail. To illustrate the power of libraries in MPI, this section shows several programs that solve partial differential equations without the explicit use of MPI. These are still MPI programs, however, and must be run using mpiexec just like other MPI programs.

### 8.6.1 A Poisson Solver

Section 8.3 presented an MPI code that implemented the Jacobi method for solving a simple partial differential equation. This example provided a good introduction to MPI but is not meant as an example of how to solve differential equations in parallel with MPI. For that task, one or more parallel libraries should be used. Figure 8.13 shows a short code for solving two-dimentional Poisson problems on a regular mesh. This code makes very heavy use of two libraries:

• PETSc [9, 10, 8] is a library designed to solve in parallel linear and nonlinear equations that arise from PDEs. PETSc uses MPI.

• "regmesh" is an application-specific library written to simplify the use of PETSc for regular mesh discritizations of elliptic partial differential equations. This library makes no explicit MPI calls; instead, all parallelism is handled through PETSc. ```#include <math.h>
#include "petsc.h"
#include "regmesh.h"

/* This function is used to define the right-hand side of the
Poisson equation to be solved */
double func( double x, double y )
{
return sin(x)*sin(y);
}

int main( int argc, char *argv[] )
{
SLES    sles;
RegMesh g;
Mat     m;
Vec     b, x;
Viewer  viewer;
int     its;

PetscInitialize( &argc, &argv, 0, 0 );

g = Create2dDistributedArray( n, n, 0.0, 1.0, 0.0, 1.0 );
m = ApplyStencilTo2dDistributedArray( g, REGMESH_LAPLACIAN );
b = SetVectorFromFunction( g, (RegMeshFunc)func );
VecDuplicate( b, &x );
SLESCreate( PETSC_COMM_WORLD, &sles );
SLESSetOperators( sles, m, m, DIFFERENT_NONZERO_PATTERN);
SLESSetFromOptions( sles );
SLESSolve( sles, b, x, &its );
PetscViewerNetcdfOpen( PETSC_COMM_WORLD, "solution.nc",
PETSC_NETCDF_CREATE, &viewer );
MeshDAView( g, viewer );
RegMeshDestroy( g ); MatDestroy( m ); VecDestroy( b ); VecDestroy( x );
SLESDestroy( sles );
PetscFinalize( );
return 0;
}
``` Figure 8.13: A parallel Poisson solver that exploits two libraries written with MPI.

The routines in this example perform the following operations:

• PetscInitialize Initialize the PETSc library

• Create2dDistributedArray — Create a handle (g) to a structure that defines a two-dimensional mesh of size n?n on the unit square. This mesh is distributed across all processes. This routine is from regmesh.

• ApplyStencilTo2dDistributedArray — Create the sparse matrix (returned as the value m) by applying a disretization stencil to the mesh g. The discretization is predefined and is the same one described in Section 8.3. This routine is from regmesh but returns a handle to a PETSc matrix.

• SetVectorFromFunction — Return the vector representing the right-hand side of the problem by applying a function func to the mesh g. This routine is from regmesh but returns a handle to a PETSc vector.

• SLESCreate — Create a PETSc context for solving a linear system of equations. This is a handle for the internal structure that PETSc uses to hold all of the information, such as the choice of algorithm, used to solve a linear system of equations.

• SLESSetOperators — Define the linear system to solve by specifying the matrices. This routine allows several variations of specification; this example uses the most common.

• SLESSetFromOptions — Set the various parameter choices from the command line and a defaults file. This lets the user choose the iterative method and pre-conditioner at run time by using command-line arguments.

• SLESSolve — Solve the linear system Ax = b, returning the solution in the PETSc vector x. This is a PETSc routine.

• PetscViewerNetcdfOpen — Create a "viewer" by which a PETSc vector can be written to a file (here 'solution.nc') using the community-standard NetCDF format . This is a PETSc routine.

• MeshDAView — Output the solution using the viewer. This makes use of the PETSc "distributed array" structure as well as other data from the regmesh g.

• xxxDestroy — Free the space used by the mesh, vector, and matrix structures, as well as the linear equation solver.

An advantage of this approach to writing parallel programs is that it allows the application programmer to take advantage of the best numerical algorithms and parallel tools. For example, the command-line

```    mpiexec -n 64 poisson -pc_type=ilu -ksp_type=gmres
```

runs this example on 64 processors, using the GMRES iterative method with a block incomplete factorization preconditioner. Changing the choice of iterative method or preconditioner is accomplished by simply changing the command-line arguments.

In addition, this example includes output of the solution, using parallel I/O into a file (when supported by a parallel file system such as PVFS, described in Chapter 19). Further, this file is written in a standard format called NetCDF; a wide variety of tools exist for postprocessing this file, including programs to display the contents graphically.

Regmesh is a specialized library designed to simplify the creation of parallel programs that work with regular meshes. More importantly, Regmesh is an example of structuring an application so that the important operations are organized into logical units.

### 8.6.2 Solving a Nonlinear Partial Differential Equation

To further illustrate the power of MPI libraries, Figure 8.14 shows the main program for solving the problem ```#include "petsc.h"
/* User-defined data describing the problem */
typedef struct {
DA            da;             /* distributed array data structure */
double        param;          /* test problem parameter */
} AppCtx;
extern int FormFunctionLocal(DALocalInfo*,double**,double**,AppCtx*);
extern int FormJacobianLocal(DALocalInfo*,double**,Mat,AppCtx*);
int main(int argc,char *argv[])
{
SNES     snes;     /* nonlinear solver */
Vec      x,r;      /* solution, residual vectors */
Mat      A,J;      /* Jacobian matrix */
AppCtx   user;     /* user-defined work context */
int      its;      /* iterations for convergence */

PetscInitialize(&argc,&argv,(char *)0,help);
user.param = 6.0;
SNESCreate(PETSC_COMM_WORLD,&snes);
DACreate2d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_STAR,
-4,-4,PETSC_DECIDE,PETSC_DECIDE,
1,1,PETSC_NULL,PETSC_NULL,&user.da);
DACreateGlobalVector(user.da,&x);
VecDuplicate(x,&r);
DASetLocalFunction(user.da,(DALocalFunction1)FormFunctionLocal);
DASetLocalJacobian(user.da,(DALocalFunction1)FormJacobianLocal);
SNESSetFunction(snes,r,SNESDAFormFunction,&user);
DAGetMatrix(user.da,MATMPIAIJ,&J);
A    = J;
SNESSetJacobian(snes,A,J,SNESDAComputeJacobian,&user);
SNESSetFromOptions(snes);
FormInitialGuess(&user,x);
SNESSolve(snes,x,&its);
PetscPrintf(PETSC_COMM_WORLD,"Number of Newton iterations = %d\n",its);

MatDestroy(J); VecDestroy(x); VecDestroy(r); SNESDestroy(snes);
PetscFinalize();
return 0;
}
``` Figure 8.14: The main program in a high-level program to solve a nonlinear partial differential equation using PETSc.
 ∇2u = -λeu on Ω = [0, 1] ? [0, 1] u = 0 on the boundary of Ω.

This problem is the Bratu problem. This code uses only PETSc and, as a result, is somewhat longer. Not included in this figure are some of the routines for computing the Jacobian elements, evaluating the function, setting the initial guess, or checking for errors. A complete version of this example is included as 'src/snes/examples/tutorials/ex5.c' in the PETSc distribution. Even this program is only a few hundred lines, including extensive comments.

These two examples show that tools are available that make writing parallel programs using MPI relatively easy, as long as high-quality libraries are available for the operations needed by an application. Fortunately, in many areas of science and engineering, such libraries are available, and more are added all the time.  BackCover  Beowulf Cluster Computing with Linux, Second Edition  Series Foreword  Foreword  Preface to the Second Edition  Preface to the First Edition  Chapter 1: So You Want to Use a Cluster  Part I: Enabling Technologies  Part II: Parallel Programming  Chapter 7: An Introduction to Writing Parallel Programs for Clusters  Chapter 8: Parallel Programming with MPI  8.1 Hello World in MPI  8.2 Manager/Worker Example  8.3 Two-Dimensional Jacobi Example with One-Dimensional Decomposition  8.4 Collective Operations  8.5 Parallel Monte Carlo Computation  8.6 MPI Programming without MPI  8.7 Installing MPICH2 under Linux  8.8 Tools for MPI Programs  8.9 MPI Implementations for Clusters  Chapter 9: Advanced Topics in MPI Programming  Chapter 10: Parallel Virtual Machine  Chapter 11: Fault-Tolerant and Adaptive Programs with PVM  Chapter 12: Numerical and Scientific Software for Clusters  Part III: Managing Clusters  Appendix A: Glossary of Terms  Appendix B: Annotated Reading List  Appendix C: Annotated URLs  References  List of Figures  List of Tables