Recipe 4.5 Iterating Over an Array

4.5.1 Problem

You want to repeat a procedure for every element in a list.

Often you use an array to collect information you're interested in; for instance, login names of users who have exceeded their disk quota. When you finish collecting the information, you want to process it by doing something with every element in the array. In the disk quota example, you might send each user a stern mail message.

4.5.2 Solution

Use a foreach loop:

foreach $item (LIST) {
    # do something with $item
}

4.5.3 Discussion

Let's say we've used @bad_users to compile a list of users who are over their allotted disk quotas. To call some complain subroutine for each user, we'd use:

foreach $user (@bad_users) {
    complain($user);
}

Rarely is this recipe so simply applied. Instead, we often use functions to generate the list:

foreach $var (sort keys %ENV) {
    print "$var=$ENV{$var}\n";
}

Here we're using sort and keys to build a sorted list of environment variable names. If you use the list more than once, you'll obviously keep it around by saving in an array. But for one-shot processing, it's often tidier to process the list directly.
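When the sorted list is needed more than once, saving it might look like the sketch below. The names @names and $count are illustrative and are not part of the recipe.

```perl
use strict;
use warnings;

# Build the sorted list once and keep it around (@names is an illustrative name).
my @names = sort keys %ENV;

# First use: print each variable with its value.
foreach my $var (@names) {
    print "$var=$ENV{$var}\n";
}

# Second use: report how many variables there were.
my $count = @names;
print "$count environment variables\n";
```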

Not only can we add complexity to this formula by building up the list in the foreach, we can also add complexity by doing more work inside the code block. A common application of foreach is to gather information on every element of a list and then, based on that information, decide whether to do something. For instance, returning to the disk quota example:

foreach $user (@all_users) {
    $disk_space = get_usage($user);     # find out how much disk space in use
    if ($disk_space > $MAX_QUOTA) {     # if it's more than we want ...
        complain($user);                # ... then object vociferously
    }
}

More complicated program flow is possible. The code can call last to jump out of the loop, next to move on to the next element, or redo to jump back to the first statement inside the block. Use these to say "no point continuing with this one, I know it's not what I'm looking for" (next), "I've found what I'm looking for, there's no point in my checking the rest" (last), or "I've changed some things, I'd better run this loop's calculations again" (redo).
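As a rough illustration of next and last (redo is not shown), the sketch below looks for the first user over quota. The %usage table and its figures are made up for the example; a real program would obtain them from something like the get_usage call used earlier.

```perl
use strict;
use warnings;

# Hypothetical usage figures in megabytes, invented for this sketch.
my %usage     = (alice => 120, bob => 0, carol => 950, dave => 2000);
my $MAX_QUOTA = 500;

my $first_offender;
foreach my $user (sort keys %usage) {
    next if $usage{$user} == 0;         # nothing stored: move on to the next user
    if ($usage{$user} > $MAX_QUOTA) {
        $first_offender = $user;        # found one: no point checking the rest
        last;
    }
}
print "first user over quota: $first_offender\n";
```

Because the names are visited in sorted order, the loop stops at carol and never examines dave.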

The variable set to each value in the list is called a loop variable or iterator variable. If no iterator variable is supplied, the global variable $_ is used. $_ is the default variable for many of Perl's string, list, and file functions. In brief code blocks, omitting $_ improves readability. (In long ones, though, too much implicit use hampers readability.) For example:

foreach (`who`) {
    if (/tchrist/) {
        print;
    }
}

or combining with a while loop:

while (<FH>) {              # $_ is set to the line just read
    chomp;                  # $_ has a trailing \n removed, if it had one
    foreach (split) {       # $_ is split on whitespace into a list,
                            # then $_ is set to each chunk in turn
        $_ = reverse;       # the characters in $_ are reversed
        print;              # $_ is printed
    }
}

Perhaps all these uses of $_ are starting to make you nervous. In particular, the foreach and the while both give values to $_. You might fear that at the end of the foreach, the full line as read into $_ with <FH> would be forever gone.

Fortunately, your fears would be unfounded, at least in this case. Perl won't permanently clobber $_'s old value, because the foreach's iterator variable (here, $_) is automatically preserved during the loop. It saves away any old value on entry and restores it upon exit.
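A small sketch of that save-and-restore behavior follows. The variables @seen and $after are illustrative names only.

```perl
use strict;
use warnings;

$_ = "outer value";

my @seen;
foreach (qw(a b c)) {       # $_ is aliased to each element in turn
    push @seen, $_;
}

my $after = $_;             # the value $_ holds once the loop has exited
print "seen: @seen\n";      # seen: a b c
print "after: $after\n";    # after: outer value
```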

However, there is some cause for concern. If the while had been the inner loop and the foreach the outer one, your fears would have been realized. Unlike a foreach loop, the while (<FH>) construct clobbers the value of the global $_ without first localizing it! So any routine (or block, for that matter) that uses this construct with $_ should declare local $_.
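One way this might look in practice is sketched below. The subroutine count_lines and the arrays @labels and @report are hypothetical names, and the sketch reads from an in-memory filehandle so that it is self-contained.

```perl
use strict;
use warnings;

# count_lines is a hypothetical helper that reads a string line by line.
sub count_lines {
    my ($text) = @_;
    local $_;                           # protect the caller's $_
    open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
    my $n = 0;
    while (<$fh>) {                     # assigns each line to the global $_
        $n++;
    }
    close($fh);
    return $n;
}

my @labels = ("first", "second");
my @report;
foreach (@labels) {                     # outer foreach, with while inside the call
    my $lines = count_lines("x\ny\nz\n");
    push @report, "$_: $lines";         # $_ still holds the foreach element
}
print "$_\n" for @report;
```

Without the local $_ line, the while loop inside count_lines would assign to the same $_ that the outer foreach has aliased to each element of @labels, overwriting those elements.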

If a lexical variable (one declared with my) is in scope, the temporary variable will be lexically scoped, private to that loop. Otherwise, it will be a dynamically scoped global variable. To avoid strange magic at a distance, write this more obviously and more clearly as:

foreach my $item (@array) {
    print "i = $item\n";
}

The foreach looping construct has another feature: each time through the loop, the iterator variable becomes not a copy of but rather an alias for the current element. This means that when you change that iterator variable, you really change each element in the list:

@array = (1,2,3);
foreach $item (@array) {
    $item--;
}
print "@array\n";
0 1 2

# multiply everything in @a and @b by seven
@a = ( .5, 3 ); @b = ( 0, 1 );
foreach $item (@a, @b) {
    $item *= 7;
}
print "@a @b\n";
3.5 21 0 7

You can't change a constant, though, so this is illegal:

foreach $n (1, 2, 3) {
    $n **= 2;
}

This aliasing means that using a foreach loop to modify list values is both more readable and faster than the equivalent code using a three-part for loop and explicit indexing would be. This behavior is a feature, not a bug, that was introduced by design. If you didn't know about it, you might accidentally change something. Now you know about it.
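The sketch below contrasts the aliased foreach with the explicit-index loop, and shows one way to avoid changing the original by accident: copy the value out of the alias first. All array names here are illustrative.

```perl
use strict;
use warnings;

my @by_alias = (1, 2, 3);
my @by_index = (1, 2, 3);

# Aliased foreach: $n is the element itself, so this edits @by_alias.
foreach my $n (@by_alias) {
    $n *= 10;
}

# Equivalent three-part loop with explicit indexing.
for (my $i = 0; $i < @by_index; $i++) {
    $by_index[$i] *= 10;
}

# To leave the array alone, copy the value out of the alias first.
my @original = (1, 2, 3);
my @scaled;
foreach my $n (@original) {
    my $copy = $n;
    $copy *= 10;
    push @scaled, $copy;
}

print "@by_alias\n";              # 10 20 30
print "@by_index\n";              # 10 20 30
print "@original | @scaled\n";    # 1 2 3 | 10 20 30
```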

For example, to trim leading and trailing whitespace in a hash, we take advantage of how the values function works: the elements of its return list really are the values of the hash, and changing these changes the original hash. Because we use s/// directly on the list returned by the values function without copying these into a variable, we change the real hash itself.

# trim whitespace in the scalar, the array, and in all
# the values in the hash
foreach ($scalar, @array, values %hash) {
    s/^\s+//;
    s/\s+$//;
}

For reasons hearkening back to the equivalent construct in the Unix Bourne shell, the for and foreach keywords are interchangeable:

for $item (@array) {  # same as foreach $item (@array)
    # do something
}

for (@array)      {   # same as foreach $_ (@array)
    # do something
}

This style often indicates that its author writes or maintains shell scripts, perhaps for Unix system administration. As such, their life is probably hard enough, so don't speak too harshly of them. Remember, TMTOWTDI. This is just one of those ways.

If you aren't fluent in Bourne shell, you might find it clearer to express "for each $thing in this @list" by saying foreach, to make your code look less like the shell and more like English. (But don't try to make your English look like your code!)
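As a quick check that the two spellings behave the same, the following sketch runs the same loop body under each keyword. The names @list, @via_for, and @via_foreach are illustrative.

```perl
use strict;
use warnings;

my @list = qw(alpha beta gamma);

my (@via_for, @via_foreach);
for my $thing (@list)     { push @via_for,     uc $thing }
foreach my $thing (@list) { push @via_foreach, uc $thing }

print "@via_for\n";       # ALPHA BETA GAMMA
print "@via_foreach\n";   # ALPHA BETA GAMMA
```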

4.5.4 See Also

The "For Loops," "Foreach Loops," and "Loop Control" sections of perlsyn(1) and Chapter 4 of Programming Perl; the "Temporary Values via local" section of perlsub(1); the "Scoped Declarations" section of Chapter 4 of Programming Perl; Recipe 10.13, which covers local; Recipe 10.2, which covers my.