Foreword

Supercomputers! What computer scientist would not want one? After all, when I was growing up (in the dark ages), everything good was "Super." Superman, Supergirl, Superdog, Supersize ... everyone and everything wanted to be "Super." And so, with the work of a lot of very intelligent people, the Supercomputer was born. People like Seymour Cray expended a great deal of money, time, and effort creating machines that could solve problems very quickly.

Unfortunately these Supercomputers were also Supercostly. Built by hand, or at least produced in relatively small quantities, they cost millions of dollars (and that was when a million dollars was a lot of money) to design and build, and relatively few systems were ever produced, which drove the production costs up further. Finally, each line of supercomputer (whether it be a Cray, a CDC Cyber, an ECL machine, or others) had a different instruction set and ran a different operating system, which forced the people writing software for them to learn that new operating system and write their applications to it. Likewise a lot of the software tools for writing applications (compilers, debuggers, profilers, and so on) had to be created for each line, if not each model, of supercomputer. This made the software tools and operating systems costly to develop and maintain.

As general-purpose computers became more and more prevalent, the ability to manufacture machines of increasing speed and size at lower and lower prices made the useful lifetime of supercomputers shorter and shorter. After all, the purpose of purchasing and using a supercomputer was to be able to run your application in the shortest possible time. When the speed advantage passed from the previously purchased supercomputer to the latest mass-produced "mainframe" or "super-mini," justifying the supercomputer became more difficult.

Because of these and other financial issues, a lot of the supercomputing companies started to go out of business. This was bad for a lot of reasons. First of all, we need supercomputers, or at least we need the ability to solve large problems quickly. Whether it is prospecting for natural resources or trying to protect the environment; whether it is analyzing aerial photographs for weapons of mass destruction or trying to predict the weather precisely for a shuttle launch; whether it is generating real-time computer graphics or analyzing a mammogram to determine whether a woman has cancer, the time needed to analyze the problem can mean the difference between success and failure, life and death. Take too long in the analysis, and you miss the window in which the answer can do you any good. For iterative processes, you may find that your competitor, who is using a faster computer, comes up with a better answer or a better product faster than you do.

A good example of this is the computer industry itself. In designing a CPU, much of the simulation of the new chip is done on existing computers. The faster the simulation can be run, and the faster a checkout of the finished design can be accomplished, the sooner the next iteration of the design can be started. This is why, for years, many chips were designed on supercomputers, even when those supercomputers came from rival chip manufacturers.

As the fortunes of supercomputer companies declined, the need for high-speed computing continued to grow. Two people at NASA, Dr. Thomas Sterling and Dr. Donald Becker, realized that something had to be done. They hypothesized that inexpensive, commodity off-the-shelf (COTS) computer systems hooked together with networking (even at speeds as low as 10 Mbit/sec Ethernet) could duplicate the power of supercomputers, particularly for applications that could be converted into highly parallelized threads of execution. They theorized that the price/performance of these COTS systems would more than make up for the overhead of sending data between the nodes to have the additional computing done. In time this concept became known as "Beowulf clusters," or just "Beowulfs."

At first these systems were built from individual PCs in individual boxes, mounted on commodity racks (and sometimes just stacked on the floor), but as time went on various small companies started to sell pre-packaged, pre-built units in ever-smaller packages with more and more CPUs in them. Boxes kept shrinking so that more of them would fit in each rack, and customers could order systems pre-built and pre-wired. And because these Beowulfs were made with high-volume manufactured chips, the cost was often one-fortieth that of a conventional supercomputer. Over time even the larger manufacturers such as HP and IBM began building rack-mounted Beowulf systems to order.

Of course there were a few other problems to think about, such as the time it took to send data back and forth between nodes (usually called "latency"), sizing the system, and coordinating the flow of data and instructions to the many, many nodes that might be required. And these were just the beginning of the issues. As the number of COTS nodes increased, so did the amount of power needed, the amount of air conditioning, and even the amount of floor space and floor loading needed to support that many individual units.

These systems were built from what we call "commodity architectures." While some of them used relatively low-volume Alpha or SPARC chips, the majority of Beowulfs were built from 32-bit Intel chips. And finally, the bulk of the systems used a newly developed operating system called "Linux." The combination of a commodity architecture with a free, high-volume operating system gave supercomputing a volume binary interface for the first time: applications that worked on a single-CPU Intel system running Linux would also work on a Beowulf.

Linux was royalty free and came with all the source code needed to build it, which allowed people to change the kernel to make it work better on a Beowulf cluster. People wrote new libraries, and contributed changes to existing libraries to make them work better in the new environment. Compilers were made more efficient, and newer interconnects were developed with higher throughput, lower latency, and lower overhead than the original ones. New algorithms allowed applications that could not take advantage of Beowulfs before to use the new technique. At the same time, the open source nature of Linux and of these compilers and libraries allowed a pseudo-standard for Beowulf systems to emerge. For the first time we could think about mass-produced supercomputers ... units that could duplicate the power of a supercomputer for less than one-fortieth of the price.

Still, a lot of people did not foresee how Beowulf systems would change the face of computing. It was only when certain projects came along that people began to realize the excitement that affordable supercomputers could generate.

The first project came from Oak Ridge National Labs, where a "mistake in planning" left a project without budgeted money for computing. By going to their colleagues who had recently had upgrades to their desktop systems, the project managers were able to collect forty-eight cast-off units and make the Beowulf needed to do their calculations. They called it the "Soupercomputer" after the old story of "Stone Soup" and the man who fed a village by making a soup only out of water, fire and a white stone. After all, they had made their soupercomputer at "no cost" to the facility.

The second success was a CD-ROM made by Red Hat Software in conjunction with NASA, more as a marketing gimmick than anything else. With the Beowulf software on it billed as "rocket science," the CD-ROM, which had been expected to be a sleeper, flew off Red Hat's shelves and became one of their largest sellers. Whether the CD-ROM was ever installed made no difference; everyone wanted to have supercomputing software on their bookshelf, particularly for the low, low price of a Linux CD.

Another success story came from high schools and small colleges. These schools had never dreamed of owning a traditional supercomputer, but with Beowulf systems, built either from donated "Stone Soup" computers or from new ones bought through a small grant, they were able to create that computing power. This was important not only to the computer science department, but also to areas such as chemistry, biology, animation, music, physics, and other fields needing high-performance computation for real-time visualization and simulation.

As the use of Beowulf systems spread into other areas such as bioinformatics and genome research, uses for supercomputers emerged that had never been considered before. A major financial company had to maintain a certain monetary reserve as required by the SEC. Because the company was so large, calculating the amount of money in that reserve at any one time took over twelve hours. Since it took so long to come up with a correct answer (which, by the time it arrived, was by definition no longer correct), they had to keep a significant buffer to satisfy a potential audit. By purchasing a Beowulf system, they were able to calculate the reserve accurately in fifteen minutes, and therefore could recalculate it every fifteen minutes, around the clock. This allowed them to reduce their reserve, and by re-investing the reclaimed money they made fifteen million dollars in profit the first year. This (of course) paid for their Beowulf system many times over.

There are other things to note about programming these Beowulf systems. The techniques and considerations involved in programming them (message passing, parallel threads of execution, memory locking, and latency) are the same ones involved in programming what are known as "workstation farms," which these days are simply desktop PCs hooked together with Ethernet. One moment those machines might be serving as a high school or college computing laboratory; a few moments later, with the right operating system software, you could have a "horizontal Beowulf" capable of solving anything that a dedicated, rack-mounted Beowulf could solve.

A hospital, for instance, could use the nurses' and doctors' stations standing idle between accesses to analyze a mammogram, a workload that had been modeled on a Beowulf and whose analysis time dropped from twenty hours on a single SPARCstation 20 to ten minutes on a 160-unit Intel Beowulf. By harnessing the excess cycles of idle PCs throughout the building, the hospital did not even need to buy a dedicated Beowulf system to get this speedup in mammogram analysis; it simply used the idle CPU cycles it already had.
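To make a term like "message passing" a little more concrete, here is a minimal sketch of a message-passing program, written in C against the MPI library, which is one common choice on Beowulf clusters (the program itself is purely illustrative and is not drawn from this book). Every node runs the same program; each worker node sends its rank to node 0, which collects and prints the greetings.

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);               /* start the MPI runtime       */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* which node (process) am I?  */
        MPI_Comm_size(MPI_COMM_WORLD, &size); /* how many nodes are running? */

        if (rank != 0) {
            /* worker nodes: send our rank to node 0 as a one-integer message */
            MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else {
            /* node 0: collect one message from each worker */
            int i, value;
            for (i = 1; i < size; i++) {
                MPI_Recv(&value, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                printf("greetings from node %d\n", value);
            }
        }

        MPI_Finalize();
        return 0;
    }

With a typical MPI installation this would be compiled with mpicc and launched across the cluster with something like mpirun -np 16 ./greetings; the node count and program name here are only examples.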

We are entering a new age of computing. Sixty-four-bit computers made out of commodity chips will allow us to solve problems of almost any size more easily. Pulling together hundreds, if not thousands, of CPUs in various configurations (SMP, Beowulf, and NUMA) will let us tackle problems whose solutions we could not have afforded ten years ago. And the Grid will draw on many, if not all, of the same programming and systems administration techniques that are used in the classic Beowulf system.

Finally, I believe that all of the programming techniques used in Beowulf systems are relevant even to single-CPU desktop machines today. Multi-threaded, distributed programming should be the normal way of thinking about programming, not the exception. Therefore I think that every high school and college computer science student should, at one time or another, learn how to program a Beowulf system, and the sooner the better.

This book is an excellent place to start.

Carpe Diem.

Jon "maddog" Hall, President
Linux International
Amherst, NH, USA
July 4th, 2003



