Recipe 16.17 Writing a Signal Handler

16.17.1 Problem

You want to write a subroutine that will be called whenever your program receives a signal.

16.17.2 Solution

A signal handler is just a subroutine. With some risk, you can do anything in a signal handler you'd do in any Perl subroutine, but the more you do, the riskier it gets.

Some systems require you to reinstall your signal handler after each signal:

$SIG{INT} = \&got_int;
sub got_int {
    $SIG{INT} = \&got_int;          # but not for SIGCHLD!
    # ...
}

Some systems restart blocking operations, such as reading data. In such cases, you must call die within the handler and trap it with eval:

my $interrupted = 0;

sub got_int {
    $interrupted = 1;
    $SIG{INT} = 'DEFAULT';          # or 'IGNORE'
    die;                            # unwind out of the eval below
}

eval {
    $SIG{INT} = \&got_int;
    # ... long-running code that you don't want to restart
};

if ($interrupted) {
    # deal with the signal
}

16.17.3 Discussion

At the C level, signals can interrupt just about anything. Unfortunately, this means that signals could interrupt Perl while Perl is changing its own internal data structures, leaving those data structures inconsistent and leading to a core dump. As of Perl 5.8, Perl tries very hard to ensure that this doesn't happen: when you install a signal handler, Perl installs a C-level signal handler that says "Perl received this signal." When Perl's data structures are consistent (after each operation it performs), the Perl interpreter checks to see whether a signal was received. If one was, your signal handler is called.

This prevents core dumps, but at the cost of slightly delaying signals in cases where one of Perl's built-in operations takes a long time to finish. For example, building a long list like this:

@a = 1..5_000_000;

might take 10 seconds on a heavily loaded system, but you won't be able to interrupt it because Perl will not check whether a signal was received while the list is being built. There are two operations in this statement, list generation and assignment, and Perl checks for signals only after each operation completes.
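The deferred delivery described above can be watched from the handler's side with a small sketch. It assumes a Unix-like system that has SIGUSR1. The names note_signal and @seen are made up for the illustration. The program signals itself, and Perl calls the handler at its next safe point, passing the signal name as the first argument.

```perl
use strict;
use warnings;

my @seen;                           # signal names recorded by the handler

# Illustrative handler: Perl passes the signal name (without the
# "SIG" prefix) as the first argument.
sub note_signal {
    my ($signame) = @_;
    push @seen, $signame;
}

$SIG{USR1} = \&note_signal;

kill 'USR1', $$;                    # send SIGUSR1 to ourselves

# The C-level handler only recorded the signal; Perl runs note_signal
# once it reaches a safe point, which happens before this statement.
print "handler saw: @seen\n";
```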

Signals have been implemented in many different operating systems, often in slightly different flavors. The two situations where signal implementations vary the most are when a signal occurs while its signal handler is active (reliability), and when a signal interrupts a blocking syscall such as read or accept (restarting).

The initial Unix implementation of signals was unreliable, meaning that while a handler was running, further occurrences of the same signal would cause the default action, likely aborting the program. Later systems addressed this (each in their own subtly different way, of course) by providing a way to block the delivery of further signals of that number until the handler has finished. If Perl detects that your system can use reliable signals, it generates the proper syscalls needed for this saner, safer behavior. You can use POSIX signals to block signal delivery at other times, as described in Recipe 16.20.
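As a rough preview of the blocking that Recipe 16.20 covers, here is a minimal sketch using the POSIX module's sigprocmask. It assumes a POSIX system. The variable names are illustrative only. A SIGINT sent while the signal is blocked stays pending and reaches the handler only after the old mask is restored.

```perl
use strict;
use warnings;
use POSIX qw(:signal_h);

my $got_int = 0;
$SIG{INT} = sub { $got_int++ };

my $block_set = POSIX::SigSet->new(SIGINT);   # the signals to block
my $old_set   = POSIX::SigSet->new;           # receives the previous mask

sigprocmask(SIG_BLOCK, $block_set, $old_set)
    or die "Could not block SIGINT: $!";

kill 'INT', $$;                               # stays pending while blocked
my $seen_while_blocked = $got_int;

sigprocmask(SIG_SETMASK, $old_set)            # restore the previous mask
    or die "Could not restore signal mask: $!";
my $seen_after_unblock = $got_int;

print "while blocked: $seen_while_blocked, after unblock: $seen_after_unblock\n";
```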

For truly portable code, the paranoid programmer will assume the worst case (unreliable signals) and reinstall the signal handler manually, usually as the first statement in a function:

$SIG{INT} = \&catcher;
sub catcher {
    $SIG{INT} = \&catcher;
    # ...
}

In the special case of catching SIGCHLD, see Recipe 16.19. System V has bizarre behavior that can trip you up.

Use the Config module to find out whether you have reliable signals:

use Config;
print "Hurrah!\n" if $Config{d_sigaction};

Just because you have reliable signals doesn't mean you automatically get reliable programs. But without them, you certainly won't.

The first implementation of signals interrupted slow syscalls, functions that require the cooperation of other processes or device drivers. If a signal comes in while those syscalls are still running, they (and their Perl counterparts) return an error value and set the error variable (errno in C, $! in Perl) to EINTR, "Interrupted system call". Checking for this condition made programs so complicated that most didn't check, and therefore misbehaved or died if a signal interrupted a slow syscall. Most modern versions of Unix allow you to change this behavior. Before 5.8, Perl always made syscalls restartable on systems that supported it. With the deferred signals of 5.8 and later, Perl installs its handlers without the restart flag so that signals are delivered promptly, and its default I/O layer retries an interrupted read, write, or close itself. If you have a POSIX system, you can control restarting using the POSIX module (see Recipe 16.20).
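Checking for EINTR from Perl might look like the following sketch. It assumes a system and Perl build where the interrupted call is not restarted; where it is restarted, the program simply reports that the call ran to completion. The four-argument select stands in for any slow call, and the variable names are illustrative.

```perl
use strict;
use warnings;
use Errno qw(EINTR);

my $alarms = 0;
$SIG{ALRM} = sub { $alarms++ };     # a handler that only counts

alarm 1;                            # ask for SIGALRM in about one second
my $nfound = select(undef, undef, undef, 5);    # slow call: wait up to 5 seconds
my $errno  = $! + 0;                # save errno before anything can change it
alarm 0;                            # cancel the alarm if it has not fired

my $was_interrupted = ($nfound == -1 && $errno == EINTR) ? 1 : 0;
print $was_interrupted ? "interrupted (EINTR)\n" : "ran to completion\n";
```

A program that wants the full wait after an interruption would loop, calling the slow operation again whenever it fails with EINTR.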

To determine whether your interrupted syscalls will automatically restart, look at your system's C signal.h include file:

% egrep 'S[AV]_(RESTART|INTERRUPT)' /usr/include/*/signal.h

Two signals are untrappable and unignorable: SIGKILL and SIGSTOP. Full details of the signals available on your system and what they mean can be found in the signal(3) manpage.
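From inside Perl, the Config module also records the signal names and numbers that your build knows about. The sketch below follows the approach shown in perlipc(1); the names %sig_num and @sig_name are only illustrative.

```perl
use strict;
use warnings;
use Config;

die "No signal information in Config\n"
    unless $Config{sig_name} && $Config{sig_num};

my @names = split ' ', $Config{sig_name};

my %sig_num;                                  # name   => number
@sig_num{@names} = split ' ', $Config{sig_num};

my @sig_name;                                 # number => first name listed
$sig_name[ $sig_num{$_} ] ||= $_ for @names;

# Print a few well-known signals; the numbers can differ between systems.
printf "%2d  SIG%s\n", $sig_num{$_}, $_ for qw(INT KILL STOP);
```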

Finally, if you have a hostile operating system, you can still have signal problems. In particular, some operating systems have library calls that themselves intercept signals. For example, gethostbyname(3) on some systems uses SIGALRM signals to manage timeouts and restarts. There can be only one timer running, and so you can't say "stop looking up this hostname after five seconds," because your five-second timer is replaced by gethostbyname's timer as soon as Perl calls the library routine. This means you can't interrupt a wedged hostname lookup on such systems, because the signals don't get through. Fortunately, such situations are rare.
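On systems where library calls leave SIGALRM alone, a timeout of the kind described above is commonly written with alarm plus the die-and-eval technique from the Solution. The following is a sketch only. The helper name with_timeout is hypothetical, and the four-argument select stands in for a slow operation such as a hostname lookup.

```perl
use strict;
use warnings;

# Hypothetical helper: run $code, giving up after $seconds.
# Returns 1 if $code finished in time, 0 if it timed out.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        # The trailing newline keeps "at FILE line N" out of the message.
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $code->();
        alarm 0;                    # finished in time: cancel the alarm
        1;
    };
    alarm 0;                        # make sure no alarm is left pending
    return 1 if $ok;
    die $@ unless $@ eq "timeout\n";    # propagate unrelated errors
    return 0;
}

# Stand-in for a slow operation: wait up to five seconds.
my $slow_finished = with_timeout(1, sub { select(undef, undef, undef, 5) });
my $fast_finished = with_timeout(2, sub { 1 });

print "slow call finished: $slow_finished, fast call finished: $fast_finished\n";
```

On the hostile systems described above, the library's own use of SIGALRM replaces this timer, so the sketch offers no protection there.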

16.17.4 See Also

The "Signals" sections in Chapter 16 of Programming Perl and in perlipc(1); your system's sigaction(2), signal(3), and kill(2) manpages (if you have them); Advanced Programming in the UNIX Environment