6.4 The Basic Steps

6.4 The Basic Steps

Before you select a toolkit to build your cluster, one needs to understand the basic high-level steps that are required to install your basic Beowulf. At this point, we assume that the hardware has been physically assembled, cabled, and is ready for power up. The steps are:

  1. Install Head Node

  2. Configure Cluster Services on Head Node

  3. Define Configuration of a Compute Node

  4. For each compute node—repeat

    1. Detect Ethernet Hardware Address of New Node

    2. Install complete OS onto new node

    3. Complete Configuration of new node

  5. Restart Services on head node that are cluster-aware (e.g. PBS, Sun Grid Engine)

Sounds simple enough, and it is. Let's examine the first steps of installation and cluster services on the head node. Some toolkits, like OSCAR, require the user to set up the Linux configuration separately from installing the cluster toolkit. Others, like Rocks, combine these two steps into one.

The next step (define configuration of a cluster node) is perhaps where the differences between disk imaging and description methods are most keenly felt. For disk imaging, a golden node needs to be installed and configured by a savvy administrator. OSCAR's System Installation Suite (SIS), which is a combination of the Linux Utility for Installation and System Imager (originally from VA Linux), uses a package list and an elaborate set of GUI's to create a golden image without actually first installing a node and represents a significant improvement over older methods. (More details on OSCAR Installation will be given in Section 6.6). Rocks uses an automatically generated text-description of a compute node "appliance" that is quite general across a wide variety of harware types. (More details about Rocks installation and design will given in Section 6.5).

Once the basic configuration of compute node has either been created by a golden image or defined through a text description, one must map where nodes are in the cluster. All ethernet interfaces have a unique MAC (Media Access Control) address, 00:50: 8B: D3:47: A5 is an example) and this is used by all tookits to identify particular nodes. When a node boots, it needs network configuration parameters and usually gets this through a DHCP (Dynamic Host Configration Protocol) request. The node presents the DHCP server with its MAC address and the server returns IP, Netmask, routing, node name, and other useful components. (Chapters 2, 18 and 20 give further examples and details) Nearly all toolkits have some function or program to help detect new MAC addresses (and hence new nodes). Rocks, for example, probes the '/var/log/messages' file for the appearance of DHCPDISCOVER requests and checks these against a database. If an unknown address appears, the node is added. OSCAR uses tcpdump to ascertain the same information. Not only do these detect the new addresses, but new node names are automatically assigned. Once the detection is complete for a node, it does not have to be repeated, and the assigned IP address is (almost always) permanent.

Installing the OS onto each node is another place where decription and image-based sytems differ. Image-based systems download the golden image, make some adjustments for differences in disk geometry, IP address and other limited changes, and then install the image on the compute node disk. Description-based methods download a text-based node description (which already contains customization information) and use the native installer to drive the installation automatically. The description will partition the disk, download packages, and perform post configuration of packages. The packages themselves are downloaded from a distribution server instead of being contained inside of disk image.

It is critical to understand that disk image methods put the bulk of configuration information into the creation of the golden image. Description-based methods, on the other hand, put configuration information into the text description, which is then applied at installation time. It is often a matter of style as to which an administrator prefers. But, certain scenarios favor one method over another.

The final step of node installation is to complete node configuration. Until recently, this was something that had to be done explicitly by the system administrator after the base images were installed. Current toolkits completely automate this step.

The last step in complete cluster configuration is simply reconfiguring and restarting cluster-wide services like schedulers and monitors to reflect an updated cluster configuration. Modern toolkits automate this for you.

In the next two sections we describe in some detail the NPACI Rocks Toolkit and the OSCAR Toolkit as two exemplars of description-based and image-based methods. These sections do not take the place of howto's or installation instructions, but rather describe the underlying mechanisms for installation and configuration.




Part III: Managing Clusters