18.4 Features in Upcoming Releases

The Scyld Beowulf OS continues to evolve, and many new features are planned for upcoming releases, primarily focused on scalability and high reliability. Beowulf clusters are being built with ever larger numbers of nodes, and are increasingly used in production environments. Larger clusters often require substantially different techniques than those used to run 8-, 16-, or 64-node clusters, and production environments are less tolerant of the downtime needed to deal with hardware failures or upgrades.

18.4.1 Failover Head Nodes

One of the most important new features will be support for multi-headed clusters. While a current Scyld cluster can continue to function in the event of a compute node failure, the head node remains a single point of failure. In the upcoming release, a new head node will be able to take over when the original head fails.

This is achieved by extending the bproc model so that slave processes can detach from the head node that spawned them and run independently. These tasks can then either continue to run to completion on their own, or use the slave daemon on their node to contact a new master and insert themselves into the process table of the new head node. This will allow a switch from one head to another without disrupting any ongoing jobs.
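The detach-and-reattach sequence described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the classes HeadNode and Task and the function fail_over are hypothetical names invented for illustration, not part of bproc, and the real mechanism works inside the kernel and the slave daemon rather than in Python objects.

```python
# Toy model of head node failover. All names here are hypothetical
# and do not correspond to any real bproc interface.

class HeadNode:
    """A head node holding a process table of the tasks it manages."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.process_table = {}  # task id -> Task


class Task:
    """A slave process running on a compute node."""

    def __init__(self, task_id, node):
        self.task_id = task_id
        self.node = node
        self.head = None  # None means the task is running detached

    def attach(self, head):
        # Insert this task into the process table of a live head node.
        if not head.alive:
            raise RuntimeError("cannot attach to a failed head node")
        head.process_table[self.task_id] = self
        self.head = head

    def detach(self):
        # Drop the link to the current head; the task keeps running.
        self.head = None


def fail_over(tasks, old_head, new_head):
    """Mark old_head as failed and move every task to new_head."""
    old_head.alive = False
    for task in tasks:
        task.detach()           # task now runs independently
        task.attach(new_head)   # task registers with the new master


primary = HeadNode("head0")
standby = HeadNode("head1")
tasks = [Task(task_id, node=task_id % 4) for task_id in range(8)]
for task in tasks:
    task.attach(primary)

fail_over(tasks, primary, standby)
```

After fail_over returns, every task in the model appears in the standby head's process table, which mirrors the goal of switching heads without disrupting running jobs. A task that never reattaches would simply keep its head set to None and run to completion on its own.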

18.4.2 Scalable bproc Job Spawning

The bproc mechanism provides extremely rapid migration of jobs from the head node to the compute nodes. However, as the number of compute nodes grows to hundreds or even thousands, the total time to launch jobs via bproc can become substantial. Future versions of bproc will be able to perform a tree-based spawn. In this scheme, the head node migrates tasks to the nodes at the top of the tree, these nodes then migrate the tasks to additional nodes, and so on. This moves part of the spawning work off the head node and removes a potential bottleneck. Experimental work at Scyld has shown that this approach starts to pay off as clusters with a single head grow past 256 nodes.
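To make the benefit of a tree-based spawn concrete, the following sketch builds a spawn tree and counts how many migrations the head node performs itself. It is an illustration only: the function names and the fan-out of 4 are assumptions chosen for the example, and the actual tree shape used by bproc is not specified here.

```python
from collections import deque


def build_spawn_tree(nodes, fanout):
    """Return {parent: [children]} for a spawn tree rooted at "head".

    Each sender (the head or a compute node) forwards the task to at
    most `fanout` children. Nodes are assigned breadth-first.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    tree = {"head": []}
    senders = deque(["head"])  # senders that can still take children
    for node in nodes:
        parent = senders[0]
        tree[parent].append(node)
        tree[node] = []
        senders.append(node)
        if len(tree[parent]) == fanout:
            senders.popleft()  # this sender is full
    return tree


def tree_depth(tree, root="head"):
    """Number of migration hops from the head to the deepest node."""
    depth = 0
    level = tree[root]
    while level:
        depth += 1
        level = [child for node in level for child in tree[node]]
    return depth


nodes = range(256)

# Flat spawn: the head migrates the task to every node itself.
flat = build_spawn_tree(nodes, fanout=256)

# Tree spawn with an assumed fan-out of 4.
spawn_tree = build_spawn_tree(nodes, fanout=4)

print("flat: head migrations =", len(flat["head"]),
      "depth =", tree_depth(flat))
print("tree: head migrations =", len(spawn_tree["head"]),
      "depth =", tree_depth(spawn_tree))
```

In this model, a flat spawn to 256 nodes costs the head 256 migrations in a single hop, while the tree with fan-out 4 costs the head only 4 migrations at the price of 4 hops. The model ignores network and per-migration costs, so it shows only where the work goes, not how long a real launch would take.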