Preface to the First Edition

Within the past three years, there has been a rapid increase in the deployment and application of computer clusters to expand the range of available system capabilities beyond those of conventional desktop and server platforms. By leveraging the development of hardware and software for these widely marketed and heavily used mainstream computer systems, clusters deliver an order of magnitude or more of scaling in computational performance and storage capacity without incurring significant additional R&D costs. Beowulf-class systems, which exploit mass-market PC hardware and software in conjunction with cost-effective commercial network technology, provide users with the dual advantages of unprecedented price/performance and configuration flexibility for parallel computing. Beowulf-class systems may be implemented by the end users themselves from available components. But as their popularity has grown, so has industry support for commercial Beowulf systems. Today, depending on source and services, Beowulf systems can be installed at a cost of between one and three dollars per peak megaflops and at scales ranging from a few gigaflops to half a teraflops. Equally important is the rapid growth in the diversity of applications. Originally targeted at the scientific and technical community, Beowulf-class systems have expanded in scope to the broad commercial domain for transaction processing and Web services, as well as to the entertainment industry for computer-generated special effects. Right now, the largest computer under development in the United States is a commodity cluster that upon completion will have a peak performance of 30 teraflops. It is quite possible that, by the middle of this decade, commodity clusters in general and Beowulf-class systems in particular will dominate middle and high-end computing for a wide range of technical and business workloads. It also appears that for many students, their first exposure to parallel computing is through hands-on experience with Beowulf clusters.

The publication of How to Build a Beowulf by MIT Press marked an important milestone in commodity computing. For the first time, there was an entry-level comprehensive book showing how to implement and apply a PC cluster. The initial goal of that book, which was released almost two years ago, was to capture the style and content of the highly successful tutorial series that had been presented at a number of conferences by the authors and their colleagues. The timeliness of this book and the almost explosive interest in Beowulf clusters around the world made it the most successful book of the MIT Press Scientific and Engineering Computation series last year. While other books have since emerged on the topic of assembling clusters, it remains the most comprehensive work teaching hardware, software, and programming methods. Nonetheless, in spite of its success, How to Build a Beowulf addressed the needs of only a part of the rapidly growing commodity cluster community. And because of the rapid evolution in hardware and software, aspects of its contents have grown stale in a very short period of time. How to Build a Beowulf is still a very useful introduction to commodity clusters and has been widely praised for its accessibility to first-time users. It has even found its way into a number of high schools across the country. But the community requires a much more extensive treatment of a topic that has changed dramatically since that book was introduced.

In addition to the obvious improvements in hardware, over the past two years there have been significant advances in software tools and middleware for managing cluster resources. The early Beowulf systems were ordinarily employed by one or a few closely associated workers and applied to a small, easily controlled workload, sometimes even dedicated to a single application. This permitted adequate supervision through direct and manual intervention, often by the users themselves. But as the user base has grown and the nature of the responsibilities for the clusters has rapidly diversified, this simple "mom-and-pop" approach to system operations has proven inadequate in many commercial and industrial-grade contexts. As one reviewer somewhat unkindly put it, How to Build a Beowulf did not address the hard problems. This was, to be frank, at least in part true, but it reflected the state of the community at the time of publication. Fortunately, the state of the art has progressed to the point that a new snapshot of the principles and practices is not only justified but sorely needed.

The book you are holding is far more than a second edition of the original How to Build a Beowulf; it marks a major transition from the early modest experimental Beowulf clusters to the current medium- to large-scale, industrial-grade PC-based clusters in wide use today. Instead of describing a single depth-first minimalist path to getting a Beowulf system up and running, this new reference work reflects the range of choices that system users and administrators have in programming and managing what may be a large user base on a large Beowulf cluster. Indeed, to support the needs of a potentially diverse readership, this new book comprises three major parts. The first part, much like the original How to Build a Beowulf, provides the introductory material, underlying hardware technology, and assembly and configuration instructions to implement and initially use a cluster. But even this part extends the utility of this basic-level description to include discussion and tutorial on how to use existing benchmark codes to test and evaluate new clusters. The second part focuses on programming methodology. Here we have given equal treatment to the two most widely used programming frameworks: MPI and PVM. This part stands alone (as do the other two) and provides a detailed presentation of parallel programming principles and practices, including some of the most widely used libraries of parallel algorithms. The third and largest part of the new book describes software infrastructure and tools for managing cluster resources. This includes some of the most popular of the readily available software packages for distributed task scheduling, as well as tools for monitoring and administering system resources and user accounts.

To provide the necessary diversity and depth across a range of concepts, topics, and techniques, I have developed a collaboration among some of the world's experts in cluster computing. I am grateful to the many contributors who have added their expertise to the body of this work to bring you the very best presentation on so many subjects. In many cases, the contributors are the original developers of the software component being described. Many of the contributors have published earlier works on these or other technical subjects and have experience conveying sometimes difficult issues in readable form. All are active participants in the cluster community. As a result, this new book is a direct channel to some of the most influential drivers of this rapidly moving field.

One of the important changes that has taken place is in the area of the node operating system. When Don Becker and I developed the first Beowulf-class systems in 1994, we adopted the then-inchoate Linux kernel because it was consistent with other Unix-like operating systems employed on a wide range of scientific compute platforms from workstations to supercomputers and because it provided a full open source code base that could be modified as necessary, while at the same time providing a vehicle for technology transfer to other potential users. Partly because of these efforts, Linux is the operating system of choice for many users of Beowulf-class systems and the single most widely used operating system for technical computing with clusters. However, during the intervening period, the single widest source of PC operating systems, Microsoft, has provided the basis for many commercial clusters used for data transaction processing and other business-oriented workloads. Microsoft Windows 2000 reflects years of development and has emerged as a mature and robust software environment with the single largest base of targeted independent software vendor products. Important path-finding work at NCSA and more recently at the Cornell Theory Center has demonstrated that scientific and technical application workloads can be performed on Windows-based systems. While heated debate continues as to the relative merit of the two environments, the market has already spoken: both Linux and Windows have their own large respective user bases for Beowulf clusters.

To represent a PC cluster community that clearly embodies two distinct camps with regard to the node operating system, my colleagues and I decided to develop two versions of the same book simultaneously. Beowulf Cluster Computing with Linux and Beowulf Cluster Computing with Windows are essentially the same book except that, as the names imply, the first assumes and discusses the use of Linux as the basis of a PC cluster while the second describes similar clusters using Microsoft Windows. In spite of this marked difference, the two versions are conceptually identical. The hardware technologies do not differ. The programming methodologies vary in certain specific details of the software packages used but are formally the same. Many but not all of the resource management tools run on both classes of system. This convergence is progressing even as the books are being written. But even where this is not yet true, an alternative and complementary package exists and is discussed for the other system type. Approximately 80 percent of the actual text is identical between the two books. Between them, they should cover the vast majority of PC clusters in use today.

On behalf of my colleagues and myself, I welcome you to the world of low-cost Beowulf cluster computing. This book is intended to facilitate, motivate, and drive forward this rapidly emerging field. Our fervent hope is that you are able to benefit from our efforts and this work.

Acknowledgments

I thank first the authors of the chapters contributed to this book:

Peter H. Beckman, Turbolinux

Remy Evard, Argonne National Laboratory

Al Geist, Oak Ridge National Laboratory

William Gropp, Argonne National Laboratory

David B. Jackson, University of Utah

James Patton Jones, Altair Grid Technologies

Jim Kohl, Oak Ridge National Laboratory

Walt Ligon, Clemson University

Miron Livny, University of Wisconsin

Ewing Lusk, Argonne National Laboratory

Karen Miller, University of Wisconsin

Bill Nitzberg, Altair Grid Technologies

Rob Ross, Argonne National Laboratory

Daniel Savarese, University of Maryland

Todd Tannenbaum, University of Wisconsin

Derek Wright, University of Wisconsin

Many other people helped in various ways to put this book together. Thanks are due to Michael Brim, Philip Carns, Anthony Chan, Andreas Dilger, Michele Evard, Tramm Hudson, Andrew Lusk, Richard Lusk, John Mugler, Thomas Naughton, John-Paul Navarro, Daniel Savarese, Rick Stevens, and Edward Thornton.

Jan Lindheim of Caltech provided substantial information related to networking hardware. Narayan Desai of Argonne provided invaluable help with both the node and network hardware chapters. Special thanks go to Rob Ross and Dan Nurmi of Argonne for their advice and help with the cluster setup chapter.

Paul Angelino of Caltech contributed the assembly instructions for the Beowulf nodes. Susan Powell of Caltech performed the initial editing of several chapters of the book.

The authors would like to respectfully acknowledge the important initiative and support provided by George Spix, Svetlana Verthein, and Todd Needham of Microsoft that were critical to the development of this book. Dr. Sterling would like to thank Gordon Bell and Jim Gray for their advice and guidance in its formulation.

Gail Pieper, technical writer in the Mathematics and Computer Science Division at Argonne, was an indispensable guide in matters of style and usage and vastly improved the readability of the prose.

Thomas Sterling
