Doing Something with Every Element in a List

Problem

You want to repeat a procedure for every element in a list.

Often you use an array to collect information you’re interested in; for instance, login names of users who have exceeded their disk quota. When you finish collecting the information, you want to process it by doing something with every element in the array. In the disk quota example, you might send each user a stern mail message.

Solution

Use a foreach loop:

foreach $item (LIST) {
    # do something with $item
}

Discussion

Let’s say we’ve used @bad_users to compile a list of users over their allotted disk quota. To call some complain() subroutine for each one we’d use:

foreach $user (@bad_users) {
    complain($user);
}

Rarely is this recipe so simply applied. Instead, we often use functions to generate the list:

foreach $var (sort keys %ENV) {
    print "$var=$ENV{$var}\n";
}

Here we’re using sort and keys to build a sorted list of environment variable names. In situations where the list will be used more than once, you’ll obviously keep it around by saving it in an array. But for one-shot processing, it’s often tidier to process the list directly.

Not only can we add complexity to this formula by building up the list in the foreach, we can also add complexity by doing more work inside the code block. A common application of foreach is to gather information on every element of a list, and then decide (based on that information) whether to do something. For instance, returning to the disk quota example:

foreach $user (@all_users) {
    $disk_space = get_usage($user);     # find out how much disk space in use
    if ($disk_space > $MAX_QUOTA) {     # if it's more than we want ...
        complain($user);                # ... then object vociferously
    }
}

More complicated program flow is possible. The code can call last to jump out of the loop, next to move on to the next element, or redo to jump back to the first statement inside the block. Use these to say “no point continuing with this one, I know it’s not what I’m looking for” (next), “I’ve found what I’m looking for, there’s no point in my checking the rest” (last), or “I’ve changed some things, I’d better do my tests and calculations again” (redo).
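As a minimal sketch of next and last, suppose the usage figures live in a hash. The %usage hash, its contents, and the $first_offender variable are made up for illustration and are not part of the recipe's code:

```perl
use strict;
use warnings;

# Hypothetical data: login name => disk usage in megabytes.
my %usage     = (alice => 120, bob => 950, carol => 40, dave => 1200);
my $MAX_QUOTA = 500;    # assumed limit, in megabytes

my $first_offender;
foreach my $user (sort keys %usage) {
    next if $usage{$user} <= $MAX_QUOTA;    # not what we're looking for
    $first_offender = $user;                # found one ...
    last;                                   # ... no point checking the rest
}
print "First user over quota: $first_offender\n";
```

With this data the loop skips alice, stops at bob, and never looks at carol or dave.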

The variable set to each value in the list is called a loop variable or iterator variable. If no iterator variable is supplied, the global variable $_ is used. $_ is the default variable for many of Perl’s string, list, and file functions. In brief code blocks, omitting $_ improves readability. (In long ones, though, too much implicit use hampers readability.) For example:

foreach (`who`) {
    if (/tchrist/) {
        print;
    }
}

or combining with a while loop:

while (<FH>) {              # $_ is set to the line just read
    chomp;                  # $_ has a trailing \n removed, if it had one
    foreach (split) {       # $_ is split on whitespace
                            # then $_ is set to each chunk in turn
        $_ = reverse;       # the characters in $_ are reversed
        print;              # $_ is printed
    }
}

Perhaps all these uses of $_ are starting to make you nervous. In particular, the foreach and the while both give values to $_. You might fear that at the end of the foreach, the full line as read into $_ with <FH> would be forever gone.

Fortunately, your fears would be unfounded, at least in this case. Perl won’t permanently clobber $_’s old value, because the foreach’s iterator variable ($_ in this case) is automatically preserved during the loop: Perl saves away any old value on entry and restores it upon exit.
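One way to convince yourself is to record what $_ holds after the inner loop finishes. This sketch assumes a Perl that supports in-memory filehandles (5.8 or later) so it can run without a real file; $text, @chunks, and @lines are names invented for the example:

```perl
use strict;
use warnings;

my $text = "alpha beta\ngamma delta\n";
open(my $fh, '<', \$text) or die "can't open in-memory file: $!";

my (@chunks, @lines);
while (<$fh>) {             # $_ is set to the line just read
    chomp;
    foreach (split) {       # $_ is temporarily each chunk in turn
        $_ = reverse;       # changes only the current chunk
        push @chunks, $_;
    }
    push @lines, $_;        # $_ is the whole line again
}
close($fh);

print "chunks: @chunks\n";
print "lines:  @lines\n";
```

The chunks come out reversed (ahpla ateb ammag atled), while each saved line is still intact (alpha beta, gamma delta).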

There is cause for some concern though. If the while had been the inner loop and the foreach the outer one, then your fears would have been realized. Unlike a foreach loop, the while <FH> construct clobbers the value of the global $_ without first localizing it! So any routine—or block for that matter—that uses such a construct with $_ should always declare local $_ at its front.
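Here is a sketch of that precaution. The count_lines() subroutine is hypothetical, and it reads from an in-memory filehandle (which assumes Perl 5.8 or later) only so the example is self-contained:

```perl
use strict;
use warnings;

# Hypothetical helper: counts the lines in a string.
sub count_lines {
    my ($text) = @_;
    local $_;                   # protect the caller's $_
    open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
    my $count = 0;
    while (<$fh>) {             # assigns to $_ on every read
        $count++;
    }
    close($fh);
    return $count;
}

my @names = ("alice", "bob");
foreach (@names) {              # $_ is an alias for each element
    my $n = count_lines("x\ny\nz\n");
    print "$_: $n lines\n";     # $_ is still the current name
}
```

Without the local $_, the while loop inside the subroutine would assign through the alias and overwrite the elements of @names. Another way to sidestep the problem is to read into a lexical, as in while (my $line = <$fh>).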

If a lexical variable of the same name (one declared with my) is in scope, the iterator variable will be lexically scoped, private to that loop. Otherwise, it will be a dynamically scoped global variable. To avoid strange magic at a distance, as of release 5.004 you can declare the iterator with my in the loop itself, which states the scoping more obviously and more clearly:

foreach my $item (@array) {
    print "i = $item\n";
}

The foreach looping construct has another feature: each time through the loop, the iterator variable becomes not a copy of but rather an alias for the current element. This means that when you change that iterator variable, you really change each element in the list:

@array = (1,2,3);
foreach $item (@array) {
    $item--;
}
print "@array\n";

                  0 1 2

# multiply everything in @a and @b by seven
@a = ( .5, 3 ); @b = ( 0, 1 );
foreach $item (@a, @b) {
    $item *= 7;
}
print "@a @b\n";

                  3.5 21 0 7

This aliasing means that using a foreach loop to modify list values is both more readable and faster than the equivalent code using a three-part for loop and explicit indexing. This behavior is a feature, not a bug; it was introduced by design. If you didn’t know about it, you might accidentally change something. Now you know about it.
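For comparison, here is a sketch of the same change written both ways, plus the usual way to avoid modifying the original when you don't mean to. The array names are invented for the example:

```perl
use strict;
use warnings;

my @by_alias = (1, 2, 3);
my @by_index = (1, 2, 3);
my @original = (1, 2, 3);

# Aliased iterator: $n is the element itself, so the change sticks.
foreach my $n (@by_alias) {
    $n *= 2;
}

# Three-part for loop: the same effect through explicit subscripts.
for (my $i = 0; $i < @by_index; $i++) {
    $by_index[$i] *= 2;
}

# Loop over a copy when the original must stay untouched.
my @copy = @original;
foreach my $n (@copy) {
    $n *= 2;
}

print "@by_alias\n";    # 2 4 6
print "@by_index\n";    # 2 4 6
print "@original\n";    # 1 2 3
```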

For example, in releases of Perl before 5.6, using s/// on elements of the list returned by the values function changed only copies, not the real hash itself. (From 5.6 on, values returns the actual hash values, so they can be modified in place.) The hash slice (@hash{keys %hash} is a hash slice, explained in Chapter 5) gives us something we can usefully change in any release:

# trim whitespace in the scalar, the array, and all the values
# in the hash
foreach ($scalar, @array, @hash{keys %hash}) {
    s/^\s+//;
    s/\s+$//;
}
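To see the effect, here is the same loop run against some sample data. The values assigned to $scalar, @array, and %hash are made up for illustration:

```perl
use strict;
use warnings;

my $scalar = "  lone value ";
my @array  = (" a", "b ", " c ");
my %hash   = (name => "  tom  ", shell => "/bin/sh ");

# Each element, including every hash value, is aliased in turn to $_.
foreach ($scalar, @array, @hash{keys %hash}) {
    s/^\s+//;
    s/\s+$//;
}

print "[$scalar] [@array] [$hash{name}] [$hash{shell}]\n";
```

This prints [lone value] [a b c] [tom] [/bin/sh], showing that the values stored in the hash were trimmed in place.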

For reasons hearkening back to the equivalent construct in the Unix Bourne shell, the for and foreach keywords are interchangeable:

for $item (@array) {  # same as foreach $item (@array)
    # do something
}

for (@array)      {   # same as foreach $_ (@array)
    # do something
}

This style often indicates that its author writes or maintains shell scripts, perhaps for Unix systems administration. As such, their life is probably hard enough, so don’t speak too harshly of them. Remember, TMTOWTDI. This is just one of those ways.

If you aren’t fluent in Bourne shell, you might find it clearer to express “for each $thing in this @list,” by saying foreach to make your code less like the shell and more like English. (But don’t try to make your English look like your code!)
