Writing Control Scripts

For a multiuser system, the most important control scripts reside in the /etc/rc2.d and /etc/rc3.d directories, which are responsible for enabling multiuser services and NFS network resource sharing, respectively. A basic script for starting up a web server looks like this:

#!/bin/sh
# Sample webserver startup script
# Should be placed in /etc/rc2.d/S99webserver
case "$1" in
    'start')
         echo "Starting webserver...\c"
         if [ -f /usr/local/sbin/webserver ]; then
              /usr/local/sbin/webserver start
         fi
         echo ""
         ;;
    'stop')
         echo "Stopping webserver...\c"
         if [ -f /usr/local/sbin/webserver ]; then
              /usr/local/sbin/webserver stop
         fi
         echo ""
         ;;
    *)
         echo "Usage: /etc/rc2.d/S99webserver { start | stop }"
         ;;
esac

This file should be created by root (with the group sys), saved as /etc/rc2.d/S99webserver, and given executable permissions:

# chmod 0744 /etc/rc2.d/S99webserver
# chgrp sys /etc/rc2.d/S99webserver

The location of this file is a matter of preference. Many admins treat the web server similarly to an NFS server; in this respect, the system’s run level 3 represents a “shared” state. Since a web server is a shared service, it could also be started from a script in /etc/rc3.d. When called with the argument start (represented in the script by "$1"), the script prints a status message that the web server daemon is starting, and executes the command if the web server binary exists. The script can also act as a kill script, since it has a provision to be called with a stop argument. Of course, a more complete script would provide more elaborate status information if the web server binary did not exist, and might further process any output from the web server by using a pipe (for example, mailing error messages to the superuser).
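The more elaborate status handling mentioned above might look like the following minimal sketch. It assumes the same /usr/local/sbin/webserver path used in the sample script; the WEBSERVER variable and the start_webserver function name are illustrative only, and are not part of any Solaris convention.

```sh
#!/bin/sh
# Sketch only: WEBSERVER and start_webserver are hypothetical names.
WEBSERVER=${WEBSERVER:-/usr/local/sbin/webserver}

start_webserver() {
    if [ ! -x "$WEBSERVER" ]; then
        # Report the problem on standard error and signal failure to the caller
        echo "webserver: $WEBSERVER not found or not executable" >&2
        return 1
    fi
    # Pass the start argument through to the server, as the sample script does
    "$WEBSERVER" start
}
```

The 'start' branch of the case statement could then call start_webserver and test its exit status, rather than silently doing nothing when the binary is missing.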

One of the advantages of the flexible boot system is that these scripts can be executed to start and stop specific daemons without changing the init state. For example, if a web site was going to be updated and the web server needed to be switched off for a few minutes, the command

# /etc/rc2.d/S99webserver stop

would halt the web server process, but would not force the system back into a single-user state. The web server could be restarted after all content was uploaded by typing this command:

# /etc/rc2.d/S99webserver start

In order to conform to System V standards, it is actually more appropriate to create all the run control scripts in the /etc/init.d directory, and create symbolic links back to the appropriate rc2.d and rc3.d directories. This means that all scripts executed by init through different run levels are centrally located and can be easily maintained. With the web server example, a file could be created in /etc/init.d with a descriptive filename:

# vi /etc/init.d/webserver

After adding the appropriate contents, the file could be saved, and the appropriate symbolic link could be created using the ln command:

# ln -s /etc/init.d/webserver /etc/rc2.d/S99webserver

Using this convention, kill and startup scripts for each service can coexist in the same script, which processes a start argument when invoked as a startup script and a stop argument when invoked as a kill script. In this example, you would also need to create a symbolic link named K99webserver, pointing to /etc/init.d/webserver, in the directory for the run level in which the service should be stopped.
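Creating both links can be sketched as a small shell function. The function name link_service is hypothetical, and the choice of rc1.d for the kill link is an assumption made for illustration (the text does not say which run-level directory should hold K99webserver); adjust it to the run level in which the service should stop.

```sh
#!/bin/sh
# Sketch only: link_service is a hypothetical helper, and placing the
# kill link in rc1.d is an assumption, not something the text specifies.
link_service() {
    # $1 = script name in init.d, $2 = two-digit sequence number,
    # $3 = directory that contains init.d, rc1.d and rc2.d (normally /etc)
    ln -s "$3/init.d/$1" "$3/rc2.d/S$2$1" &&
    ln -s "$3/init.d/$1" "$3/rc1.d/K$2$1"
}

# Example (as root):  link_service webserver 99 /etc
```

Because the directory is passed as an argument, the function can be tried against a scratch directory before it is run against /etc.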

Writing Kill Scripts

Under System V, kill scripts follow the same convention as startup scripts: a stop argument is passed to the script to indicate that a kill is required, whereas a start argument indicates that a startup is required. A common approach to killing off processes is to find them by name in the process list. The following script kills the asynchronous PPP daemon, which is the link manager for the asynchronous data link protocol. This daemon is started using aspppd; thus, the script generates a process list that is piped through grep to identify any entries containing "aspppd," and the process number is extracted using awk. This value is assigned to a variable (procid), which is then used by the kill command to terminate the appropriate process. (The pgrep or pkill command could be used instead of this pipeline.)

     procid=`ps -e | grep aspppd | awk '{print $1}'`
     if test -n "$procid"
     then
          kill $procid
     fi

Alternatively, sed could be used instead of awk to extract the process ID from the matching line:

procid=`/usr/bin/ps -e |
    /usr/bin/grep aspppd |
    /usr/bin/sed -e 's/^  *//' -e 's/ .*//'`
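To see what the two sed expressions do, they can be run against a single line in the style of ps -e output. The sample line and its PID below are made up for illustration.

```sh
#!/bin/sh
# A made-up line in the style of `ps -e` output (PID, TTY, TIME, CMD)
line='  213 ?        0:00 aspppd'

# First expression strips the leading spaces; the second deletes
# everything from the first remaining space onward, leaving only the PID.
pid=`echo "$line" | sed -e 's/^  *//' -e 's/ .*//'`

echo "$pid"    # prints 213
```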

When multiple processes are to be terminated using a single script (for example, when the NFS server terminates), a shell function (killproc()) can be written that takes a process name as its argument and searches for it in the process list, terminating the named process if it exists:

killproc() {
     procid=`/usr/bin/ps -e |
          /usr/bin/grep -w "$1" |
          /usr/bin/sed -e 's/^  *//' -e 's/ .*//'`
     [ "$procid" != "" ] && kill $procid
}

A more modern way of finding and killing processes involves using the pgrep and pkill commands, respectively.

Individual processes can then be terminated by calling the shell function defined above once for each daemon:

killproc nfsd
killproc mountd
killproc rpc.boot
killproc in.rarpd
killproc rpld

However, there are two problems with these approaches to process termination. Firstly, there is an ambiguity problem in that different daemons and applications can be identified by the same name. For example, a system may be running the Apache web server, which is identified by the process name httpd, as well as a web server from another vendor (such as NCSA) that is also identified by httpd. If a script was written to kill the Apache web server, but the first process identified actually belonged to the NCSA web server, the NCSA web server process would be terminated. One solution to this problem is to ensure that all applications are launched with a unique name, or from a wrapper script with a unique name. The second problem is that for a system with even a moderately heavy process load (for example, 500 active processes), executing the ps command to kill each process is going to generate a large CPU overhead, leading to excessively slow shutdown times. Alternative solutions to this problem are provided in the previous section.
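As a rough illustration of the pgrep and pkill approach mentioned earlier, the following sketch replaces the ps, grep, and sed pipeline with a single exact-name match. The function name killproc_exact is hypothetical. The -x option asks for an exact match on the process name, which avoids partial matches, although it does not by itself distinguish two different programs that share the same name.

```sh
#!/bin/sh
# Sketch only: killproc_exact is a hypothetical name.
killproc_exact() {
    # pgrep -x succeeds only if a process with exactly this name exists
    if pgrep -x "$1" >/dev/null 2>&1; then
        # pkill -x sends SIGTERM (the default) to every exact match
        pkill -x "$1"
    else
        echo "$1: not running"
    fi
}

# Example (as root):  killproc_exact nfsd
```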

Displaying eeprom Variables

To examine the default values used by your system for booting the kernel, and the default boot devices, simply use the /usr/sbin/eeprom command:

# /usr/sbin/eeprom
tpe-link-test?=true
scsi-initiator-id=7
keyboard-click?=false
keymap: data not available.
ttyb-rts-dtr-off=false
ttyb-ignore-cd=true
ttya-rts-dtr-off=false
ttya-ignore-cd=true
ttyb-mode=9600,8,n,1,-
ttya-mode=9600,8,n,1,-
pcia-probe-list=1,2,3,4
pcib-probe-list=1,2,3
mfg-mode=off
diag-level=max
#power-cycles=50
system-board-serial#: data not available.
system-board-date: data not available.
fcode-debug?=false
output-device=screen
input-device=keyboard
load-base=16384
boot-command=boot
auto-boot?=true
watchdog-reboot?=false
diag-file: data not available.
diag-device=net
boot-file: data not available.
boot-device=disk net
local-mac-address?=false
ansi-terminal?=true
screen-#columns=80
screen-#rows=34
silent-mode?=false
use-nvramrc?=false
nvramrc: data not available.
security-mode=none
security-password: data not available.
security-#badlogins=0
oem-logo: data not available.
oem-logo?=false
oem-banner: data not available.
oem-banner?=false
hardware-revision: data not available.
last-hardware-update: data not available.
diag-switch?=false

You can also change the values of the boot device and boot command from within Solaris by using the eeprom command, rather than having to reboot, jump into the OpenBoot monitor, and set the values directly.
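A single value can be picked out of the listing above with a small filter, and a variable can be set by passing name=value to eeprom. The following is a minimal sketch: the function name eeprom_value is hypothetical, and it assumes an awk that supports the -v option (on older Solaris releases that may mean nawk rather than awk).

```sh
#!/bin/sh
# Sketch only: eeprom_value is a hypothetical helper.
# Reads "name=value" lines on standard input and prints the value of $1.
eeprom_value() {
    awk -v name="$1" '
        index($0, name "=") == 1 {
            print substr($0, length(name) + 2)   # text after "name="
            found = 1
        }
        END { exit found ? 0 : 1 }'
}

# On a live system (as root):
#   /usr/sbin/eeprom | eeprom_value boot-device     # show the current value
#   /usr/sbin/eeprom boot-device=disk               # set a new value
```

Reading from standard input means the function can also be tried on a saved copy of the eeprom output.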

Shutting Down the System

In order to manually change run levels, the desired init state is used as an argument to /sbin/init. For example, to bring the system down to a single-user mode for maintenance, the following command can be used:

#  init s
INIT: New run level: S
The system is coming down for administration. Please wait.
Print services stopped.
syslogd: going down on signal 15
Killing user processes: done.
INIT: SINGLE USER MODE
Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):
Entering System Maintenance Mode ...
#

The system is most easily shut down by using the new /usr/sbin/shutdown command (not the old BSD-style /usr/ucb/shutdown command discussed later). This command is issued with the form

# shutdown -i run-level -g grace-period -y

where run-level is an init state different from the default init state S (that is, one of the run levels 0, 1, 2, 5, or 6). However, most administrators will typically be interested in using the shutdown command with respect to the reboot or power-down run levels. The grace-period is the number of seconds before the shutdown process is initiated. On single-user machines, the superuser will easily know who is logged in and what processes need to be terminated gracefully. However, on a multiuser machine, it is more useful to warn users in advance of a power down or reboot. If the change of init state is to proceed without user intervention, it is useful to include the -y flag at the end of the shutdown command; otherwise, the message

Do you want to continue? (y or n):

will be displayed, and y must be entered in order for the shutdown to proceed. The default grace period on Solaris is 60 seconds, so if an administrator wished to reboot with two minutes’ warning given to all users, and without user intervention, the command would be

# shutdown -i 6 -g 120 -y

The system will then periodically display a message warning all users of the imminent init state change:

Shutdown started. Mon Jan 10 10:22:00 EST 2001
Broadcast Message from root (console) on server Mon Jan 10 10:22:00...
The system server will be shut down in 2 minutes

The system will then reboot without user intervention, and does not enter the OpenBoot monitor. If commands need to be issued using the monitor (that is, an init state of 0 is desired), the following command can be used:

# shutdown -i0 -g180 -y
Shutdown started. Mon Jan 10 11:15:00 EST 2001
Broadcast Message from root (console) on server Mon Jan 10 11:15:00...
The system will be shut down in 3 minutes
.
.
.
INIT: New run level: 0
The system is coming down. Please wait.
.
.
.
The system is down.
syncing file systems... [1] [2] [3] done
Program terminated
Type help for more information
ok
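The relationship between the intended outcome and the run level passed to shutdown can be summarized in a small wrapper. This is a sketch only: the function name shutdown_cmd and the action names halt, poweroff, and reboot are hypothetical, and the function prints the command rather than running it, so that it can be checked safely first.

```sh
#!/bin/sh
# Sketch only: shutdown_cmd and its action names are hypothetical.
# Prints the shutdown command for the requested action; it does not run it.
shutdown_cmd() {
    # $1 = halt | poweroff | reboot, $2 = grace period in seconds (default 60)
    case "$1" in
        halt)     level=0 ;;    # stop at the OpenBoot monitor
        poweroff) level=5 ;;    # power down
        reboot)   level=6 ;;    # halt and reboot
        *)        echo "usage: shutdown_cmd { halt | poweroff | reboot } [seconds]" >&2
                  return 1 ;;
    esac
    echo "/usr/sbin/shutdown -i $level -g ${2:-60} -y"
}

shutdown_cmd reboot 120    # prints /usr/sbin/shutdown -i 6 -g 120 -y
```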

There are many ways to warn users in advance of a shutdown. One way is to edit the “message of the day” file (/etc/motd) to contain a warning that the server will be “down” and/or rebooted for a specific time. This message will be displayed every time a user successfully logs in with an interactive shell. The following message gives the date and time of the shutdown, expected duration, and a contact address for enquiries:

System server will be shutdown at 5 p.m. 2/1/2001.
Expected downtime: 1 hour.
E-mail root@system for further details.

At least 24 hours notice is usually required for users on a large system, as long jobs need to be rescheduled. In practice, many administrators will only shut down or reboot outside business hours to minimize inconvenience; however, power failure and hardware problems can necessitate unexpected downtime.

This method works well in advance, but since many users are continuously logged in from remote terminals, they won’t always read the new “message of the day.” An alternative approach is to use the “write all” command (wall), which sends a message to all terminals of all logged-in users. This command can be sent manually at hourly intervals prior to shutdown, or a cron job could be established to perform this task automatically. An example command would be

# wall
System server will be shutdown at 5 p.m. 1/10/2001.
Expected downtime: 1 hour.
E-mail root@system for further details.
^d
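For the automated variant, the message can be generated by a short script whose output is piped to wall. The following is a minimal sketch: the function name shutdown_notice is hypothetical, and the script path in the crontab comment is an assumption made for illustration.

```sh
#!/bin/sh
# Sketch only: shutdown_notice is a hypothetical helper.
shutdown_notice() {
    # $1 = host name, $2 = shutdown time, $3 = expected downtime
    printf 'System %s will be shut down at %s.\n' "$1" "$2"
    printf 'Expected downtime: %s.\n' "$3"
    printf 'E-mail root@%s for further details.\n' "$1"
}

# Manual use (as root):
#   shutdown_notice server "5 p.m. 1/10/2001" "1 hour" | wall
#
# Hourly from root's crontab, assuming the function and its call are saved
# as an executable script at the hypothetical path /usr/local/bin/notice.sh:
#   0 9-16 10 1 * /usr/local/bin/notice.sh | /usr/sbin/wall
```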

After sending the wall message, a final check of logged-in users prior to shutdown can be performed using the who command:

# who
root     console     Jan 10 10:15
pwatters     pts/0     Jan 10 10:15     (client)

A message can be sent to the user pwatters on pts/0 directly to notify him of the imminent shutdown:

# write pwatters
Dear pwatters,
Please logout immediately as the system server is going down.
If you do not logout now, your unsaved work may be lost.
Yours Sincerely,
System Administrator (root@system)
^d

Depending on the status of the user, it may also be fruitful to request a talk session by using this command:

# talk pwatters

If all these strategies fail to convince the user pwatters to log out, there is nothing left to do but proceed with the shutdown.

Changing init States

In addition to being the process spawner, init can be used to switch run levels at any time. For example, to perform hardware maintenance, the following command would be used:

# init 0

To enter the administrative state, the following command would be used:

# init 1

To enter the first multiuser state, the following command would be used:

# init 2

To enter the second multiuser state, the following command would be used:

# init 3

To enter a user-defined state, the following command would be used:

# init 4

To power down the system, the following command would be used:

# init 5

To halt and reboot the operating system, the following command would be used:

# init 6

To enter the single-user state, in which only some file systems are mounted and accessible, the following command would be used:

# init S

Before using init in this way, it’s often advisable to precede its execution with a call to sync. The sync command updates the disk superblock and schedules all outstanding buffered data to be written to disk, which helps ensure that the file system is stable before shutting down.

/etc/inittab

After the kernel is loaded into memory, the /sbin/init process is initialized, and the system is brought up to the default init state. This state is determined by the initdefault value contained in /etc/inittab, the file that controls the behavior of the init process. Each entry has the form

identifier:runlevel:action:command

where identifier is a unique two-character identifier, runlevel specifies the run level to be entered, action specifies the process characteristics of the command to be executed, and command is the name of the program to be run. The program can be an application or a script file. The run level must be one of the following: s, a, b, c, 0, 1, 2, 3, 4, 5, or 6. Alternatively, if the process is to be executed in all run levels, the run level field should be left empty.
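Because the fields are separated by colons, individual entries are easy to pick apart with awk. As a minimal sketch, the following function prints the default run level by finding the entry whose action field is initdefault. The function name default_runlevel is hypothetical; the file defaults to /etc/inittab but can be overridden for testing.

```sh
#!/bin/sh
# Sketch only: default_runlevel is a hypothetical helper.
default_runlevel() {
    # Field 3 is the action and field 2 is the run level.
    # Comment lines are skipped, and the first initdefault entry wins.
    awk -F: '$1 !~ /^#/ && $3 == "initdefault" { print $2; exit }' \
        "${1:-/etc/inittab}"
}

# Example:  default_runlevel        (reads /etc/inittab)
```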

The following is a standard inittab file:

ap::sysinit:/sbin/autopush -f /etc/iu.ap
ap::sysinit:/sbin/soconfig -f /etc/sock2path
fs::sysinit:/sbin/rcS sysinit           >/dev/msglog 2<>/dev/msglog </dev/console
is:3:initdefault:
p3:s1234:powerfail:/usr/sbin/shutdown -y -i5 -g0 >/dev/msglog 2<>/dev/msglog
sS:s:wait:/sbin/rcS                     >/dev/msglog 2<>/dev/msglog </dev/console
s0:0:wait:/sbin/rc0                     >/dev/msglog 2<>/dev/msglog </dev/console
s1:1:respawn:/sbin/rc1                  >/dev/msglog 2<>/dev/msglog </dev/console
s2:23:wait:/sbin/rc2                    >/dev/msglog 2<>/dev/msglog </dev/console
s3:3:wait:/sbin/rc3                     >/dev/msglog 2<>/dev/msglog </dev/console
s5:5:wait:/sbin/rc5                     >/dev/msglog 2<>/dev/msglog </dev/console
s6:6:wait:/sbin/rc6                     >/dev/msglog 2<>/dev/msglog </dev/console
fw:0:wait:/sbin/uadmin 2 0              >/dev/msglog 2<>/dev/msglog </dev/console
of:5:wait:/sbin/uadmin 2 6              >/dev/msglog 2<>/dev/msglog </dev/console
rb:6:wait:/sbin/uadmin 2 1              >/dev/msglog 2<>/dev/msglog </dev/console
sc:234:respawn:/usr/lib/saf/sac -t 300
co:234:respawn:/usr/lib/saf/ttymon -g -h -p "`uname -n` console login: " -T sun -d /dev/console -l console -m ldterm,ttcompat

This /etc/inittab file only contains entries for the actions sysinit, respawn, initdefault, wait, and powerfail. These are the common actions found on most systems; however, Solaris provides a wide variety of actions that may be useful in special situations (for example, when powerwait is more appropriate than powerfail). Potential actions are identified by any one of the following:

initdefault This is a mandatory entry found on all systems, which is used to configure the default run level for the system. The default is the highest run level listed in the entry’s run-level (rstate) field. If this field is empty, init interprets it as the highest possible run level (run level 6), which will force a continuous reboot of the system. In addition, if the entry is missing, the administrator must supply a run level manually on the console for booting to proceed.
sysinit This entry is executed before init tries to access the console. It is intended for initializing the devices that init needs in order to ask which run level is required at boot time (for example, if the initdefault entry is missing).
boot This entry is only parsed at boot time, and is mainly used for initialization following a full reboot of the system after power down.
off This entry ensures that a process is terminated upon entering a particular run level. A warning signal is sent, followed by a kill signal five seconds later.
once This entry is similar to boot, but more flexible in that the named process runs only once and is not respawned.
ondemand This entry is similar to the respawn action.
powerfail Runs the process associated with the entry when a power fail signal is received.
powerwait Similar to powerfail, except that init waits until the process terminates before further processing entries in /etc/inittab. This is especially useful for enforcing sequential shutdown of services that are prioritized.
bootwait This entry is parsed only on the first occasion that the transition from single-user to multiuser run levels occurs after a system boot.
wait This entry starts a process and waits for its completion on entering the specified run level; however, the entry is ignored if /etc/inittab is reread during the same run level.
respawn This entry ensures that if a process that should be running is not, it is respawned.

The /etc/inittab file follows conventions for text layout used by the Bourne shell: a long entry can be continued on the following line by using a backslash (\), and comments can only be inserted into the process field by using a hash character (#). There is a limitation of 512 characters for each entry imposed on /etc/inittab; however, there is no limit on the number of entries that may be inserted.