3.3 Pruning Your Beowulf Node

3.3 Pruning Your Beowulf Node

Even if recompiling your kernel, downloading a new one, or choosing a journaling file system seems too adventuresome at this point, you can some very simple things to your Beowulf node that can increase performance and manageability. Remember that just as the kernel, with its nearly five hundred dynamically loadable modules, provides drivers and capabilities you probably will never need, so too your Linux distribution probably looks more like a kitchen sink than a lean and mean computing machine. While you may now be tired of the Linux Beowulf adage "a smaller operating system is a better operating system," it must be once again applied to the auxiliary programs often run with a conventional Linux distribution. If we look at the issue from another perspective, every single CPU instruction performed by the kernel or operating system daemon not directly contributed to the scientific calculation is a wasted CPU instruction.

The starting point for pruning your Beowulf node will be what the Linux distribution installer set up. Many distributions have options during installation for "workstation" or "server" or "development" configurations. As a general rule of thumb, "server" installations make a good starting point. Workstation configurations often have windowing systems running by default, and a myriad of background tasks to make Linux as user-friendly as possible to the desktop user. Fortunately, with Linux you can understand and modify any daemon or process as you convert your kitchen sink of useful utilities and programs into a designed-for-computation roadster. For a Beowulf, eliminating useless tasks delivers more megaflop per dollar to the end user.

The first step to pruning the operating system daemons and auxiliary programs is to find out what is running on the system. For most Linux systems there are at least two standard ways to start daemons and other processes, which may waste CPU resources as well as memory bandwidth (often the most precious commodity on a cluster).

  • inetd: This is the "Internet superserver". Many Linux distributions use a newer version of the program, which has essentially the same functionality called xinetd. Both programs basic function is to wait for connections on a set of ports and then spawn and hand off the network connection to the appropriate program when an incoming connection is made. The configuration for what ports inetd or xinetd listening to, as well as what will get spawned can been determined by looking at '/etc/inetd.conf' and '/etc/services' or '/etc/xinetd. conf' and '/etc/xinetd.d' respectively.

  • /etc/rc.d/init.d: This special directory represents the scripts that are run during the booting sequence and that often launch daemons that will run until the machine is shut down.

3.3.1 inetd.conf

The file 'inetd.conf' is a simple configuration file. Each line in the file represents a single service, including the port associated with that service and the program to launch when a connection to the port is made. Below are some simple examples:

ftp     stream  tcp     nowait  root    /usr/sbin/tcpd  in.proftpd
finger  stream  tcp     nowait  root    /usr/sbin/tcpd  in.fingerd
talk    dgram   udp     wait    root    /usr/sbin/tcpd  in.talkd

The first column provides the name of the service. The file '/etc/services' maps the port name to the port number, for example,

% grep ^talk /etc/services
talk 517/udp # BSD talkd(8)

To slim down your Beowulf node, get rid of the extra services in 'inetd.conf'; you probably will not require the /usr/bin/talk program on each of the compute nodes. Of course, what is required will depend on the computing environment. In many very secure environments, where ssh is run as a daemon and not launched from 'inetd.conf' for every new connection, 'inetd.conf' has no entries. In such extreme examples, the inetd process that normally reads 'inetd.conf' and listens on ports, ready to launch services, can even be eliminated.

3.3.2 /etc/rc.d/init.d

The next step is to eliminate any daemons or processes that are normally started at boot. While occasionally Linux distributions differ in style, the organization of the files that launch daemons or run scripts during the first phases of booting up a system are very similar. For most distributions, the directory '/etc/rc.d/init.d' contains scripts that are run when entering or leaving a run level. Below is an example:

% cd /etc/rc.d/init.d
% ls
anacron    functions     kdcrotate  nfslock     sendmail   wine
apachectl  gpm           keytable   nscd        single     xfs
apmd       halt          killall    ntpd        snmpd      xinetd
arpwatch   http_sanity   kudzu      portmap     snmptrapd  ypbind
atd        http_sanity~  lpd        radvd       sshd       yppasswdd
autofs     identd        netfs      random      syslog     ypserv
crond      ipchains      network    rawdevices  vncserver  ypxfrd
cups       iptables      nfs        rhnsd       winbind

However, the presence of the script does not indicate it will be run. Other directories and symlinks control which scripts will be run. Most systems now use the convenient chkconfig interface for managing all the scripts and symlinks that control when they get turned on or off. Not every script spawns a daemon. Some scripts just initialize hardware or modify some setting.

A convenient way to see all the scripts that will be run when entering run level 3 is the following:

% chkconfig --list | grep '3:on'
syslog 0:off 1:off 2:on 3:on 4:on 5:on 6:off
xinetd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
lpd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
mysql 0:off 1:off 2:on 3:on 4:on 5:on 6:off
httpd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
sshd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
atd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
named 0:off 1:off 2:off 3:on 4:on 5:on 6:off
dhcpd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
gpm 0:off 1:off 2:on 3:on 4:on 5:on 6:off
inet 0:off 1:off 2:off 3:on 4:on 5:on 6:off
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off
nfsfs 0:off 1:off 2:off 3:on 4:on 5:on 6:off
random 0:off 1:off 2:on 3:on 4:on 5:on 6:off
keytable 0:off 1:off 2:on 3:on 4:on 5:on 6:off
nfs 0:off 1:off 2:off 3:on 4:on 5:on 6:off
nfslock 0:off 1:off 2:off 3:on 4:on 5:on 6:off
ntpd 0:off 1:off 2:off 3:on 4:on 5:on 6:off
portmap 0:off 1:off 2:off 3:on 4:on 5:on 6:off
sendmail 0:off 1:off 2:on 3:on 4:on 5:on 6:off
serial 0:off 1:off 2:on 3:on 4:on 5:on 6:off
squid 0:off 1:off 2:off 3:on 4:on 5:on 6:off
tltime 0:off 1:off 2:off 3:on 4:off 5:on 6:off
crond 0:off 1:off 2:on 3:on 4:on 5:on 6:off

Remember that not all of these spawn cycle-stealing daemons that are not required for Beowulf nodes. The "serial" script, for example, simply initializes the serial ports at boot time; its removal is not likely to reduce overall performance. However, in this example many things could be trimmed. For example, there is probably no need for lpd, mysql, httpd, named, dhcpd, sendmail, or squid on a compute node. It would be a good idea to become familiar with the scripts and use the chkconfig command to turn off unneeded scripts. With only a few exceptions, an X-Windows server should not be run on a compute node. Starting an X session takes ever-increasing amounts of memory and spawns a large set of processes. Except for special circumstances, run level 3 will be the highest run level for a compute node.

3.3.3 Other Processes and Daemons

In addition to 'inetd.conf' and the scripts in '/etc/rc.d/init.d', there are other common ways for a Beowulf node to waste CPU or memory resources. The cron program is often used to execute programs at scheduled times. For example, cron is commonly used to schedule a nightly backup or an hourly cleanup of system files. Many distributions come with some cron scripts scheduled for execution. The program slocate is often run as a nightly cron to create an index permitting the file system to be searched quickly. Beowulf users may be unhappy to learn that their computation and file I/O are being hampered by a system utility that is probably not useful for a Beowulf. A careful examination of cron and other ways that tasks can be started will help trim a Beowulf compute node.

The ps command can be invaluable during your search-and-destroy mission.

% ps -eo pid,pcpu,sz,vsize,user,fname --sort=vsize

This example command demonstrates sorting the processes by virtual memory size.

The small excerpt below illustrates how large server processes can use memory. The example is taken from a Web server, not a well-tuned Beowulf node.

  PID %CPU   SZ   VSZ  USER COMMAND
26593  0.0  804  3216   web httpd
26595  0.0  804  3216   web httpd
 3574  0.0  804  3216   web httpd
  506  0.0  819  3276  root squid
  637  0.0  930  3720  root AgentMon
  552  0.0 1158  4632 dbenl postmast
13207  0.0 1213  4852  root named
13209  0.0 1213  4852  root named
13210  0.0 1213  4852  root named
13211  0.0 1213  4852  root named
13212  0.0 1213  4852  root named
  556  0.0 1275  5100 dbenl postmast
  657  0.0 1280  5120 dbenl postmast
  557  0.0 1347  5388 dbenl postmast
  475  0.0 2814 11256 mysql mysqld
  523  0.0 2814 11256 mysql mysqld
  524  0.0 2814 11256 mysql mysqld
  507  0.0 3375 13500 squid squid

In this example the proxy cache program squid is using a lot of memory (and probably some cache), even though the CPU usage is negligible. Similarly, the ps command can be used to locate CPU hogs. Becoming familiar with ps will help quickly find runaway processes or extra daemons competing for cycles with the scientific applications intended for your Beowulf.




Part III: Managing Clusters