# Recipe 6.8 Extracting a Range of Lines

#### 6.8.1 Problem

You want to extract all lines from a starting pattern through an ending pattern or from a starting line number up to an ending line number.

A common example of this is extracting the first 10 lines of a file (line numbers 1 to 10) or just the body of a mail message (everything past the blank line).

#### 6.8.2 Solution

Use the operators .. or ... with patterns or line numbers.

The .. operator will test the right operand on the same iteration that the left operand flips the operator into the true state.

```
while (<>) {
    if (/BEGIN PATTERN/ .. /END PATTERN/) {
        # line falls between BEGIN and END in the
        # text, inclusive.
    }
}

while (<>) {
    if (FIRST_LINE_NUM .. LAST_LINE_NUM) {
        # operate only between first and last line, inclusive.
    }
}
```

But the ... operator waits until the next iteration to check the right operand.

```
while (<>) {
    if (/BEGIN PATTERN/ ... /END PATTERN/) {
        # line is between BEGIN and END, which must
        # match on different lines
    }
}

while (<>) {
    if (FIRST_LINE_NUM ... LAST_LINE_NUM) {
        # operate between first and last line; the end test
        # is not made on the line that starts the range
    }
}
```

#### 6.8.3 Discussion

The range operators, .. and ..., are probably the least understood of Perl's myriad operators. They were designed to allow easy extraction of ranges of lines without forcing the programmer to retain explicit state information. Used in scalar context, such as in the test of if and while statements, these operators return a true or false value that's partially dependent on what they last returned. The expression left_operand .. right_operand returns false until left_operand is true, but once that test has been met, it stops evaluating left_operand and keeps returning true until right_operand becomes true, after which it restarts the cycle. Put another way, the first operand turns on the construct as soon as it returns a true value, whereas the second one turns it off as soon as it returns true.
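To make the hidden state concrete, here is a minimal sketch that extracts the same range twice: once with the .. operator and once with a hand-maintained flag. The sample data and the variable names (@lines, \$inside, and so on) are made up for illustration.

```perl
use strict;
use warnings;

# Sample input, invented for this sketch.
my @lines = ("skip\n", "BEGIN\n", "keep\n", "END\n", "skip again\n");

# Version 1: the range operator remembers the on/off state itself.
my @with_operator;
for (@lines) {
    push @with_operator, $_ if /BEGIN/ .. /END/;
}

# Version 2: the same logic with an explicit state variable.
my @with_flag;
my $inside = 0;
for (@lines) {
    if (!$inside) {
        next unless /BEGIN/;    # left operand: turns the range on
        $inside = 1;
    }
    push @with_flag, $_;        # current line is inside the range
    $inside = 0 if /END/;       # right operand: turns it off afterward
}

print "both versions agree\n"
    if join('', @with_operator) eq join('', @with_flag);
```

Both arrays end up holding the BEGIN, keep, and END lines. The second loop tests the end pattern on the very line that switched the range on, which is the behavior of .. rather than of ... .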

The two operands are completely arbitrary. You could write mytestfunc1( ) .. mytestfunc2( ), although this is rarely seen. Instead, the range operators are usually used with either line numbers as operands (the first example), patterns as operands (the second example), or both.

```
# command-line to print lines 15 through 17 inclusive (see below)
% perl -ne 'print if 15 .. 17' datafile

# print all <XMP> .. </XMP> displays from HTML doc
while (<>) {
    print if m#<XMP>#i .. m#</XMP>#i;
}

# same, but as shell command
% perl -ne 'print if m#<XMP>#i .. m#</XMP>#i' document.html
```
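The arbitrary-operand form is rare, but a small sketch shows that it works like any other range test. The two predicate subroutines and the list of readings below are hypothetical, invented only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical predicates; any expression yielding true or false will do.
sub went_negative { return $_[0] < 0 }
sub recovered     { return $_[0] > 100 }

my @readings = (5, 12, -3, 40, 150, 7, -1, 200, 9);

my @episode;
for my $value (@readings) {
    # On at the first negative reading, off after the next reading over 100.
    push @episode, $value if went_negative($value) .. recovered($value);
}

print "@episode\n";    # prints: -3 40 150 -1 200
```

Because neither operand is a numeric literal, no comparison against the line number takes place; each operand is simply called and its result treated as true or false.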

If either operand is a numeric literal, the range operators implicitly compare against the \$. variable (\$NR or \$INPUT_LINE_NUMBER if you use English). Be careful with implicit line number comparisons here. You must specify literal numbers in your code, not variables containing line numbers. That means you simply say 3 .. 5 in a conditional, but not \$n .. \$m where \$n and \$m are 3 and 5 respectively. For that, be more explicit by testing the \$. variable directly.

```
perl -ne 'BEGIN { $top=3; $bottom=5 } print if $top .. $bottom' /etc/passwd    # WRONG
perl -ne 'BEGIN { $top=3; $bottom=5 }
          print if $. == $top .. $. == $bottom' /etc/passwd    # RIGHT
perl -ne 'print if 3 .. 5' /etc/passwd   # also RIGHT
```
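To see why the variable form misbehaves, the following sketch runs both tests over an eight-line in-memory file and records the line numbers each one selects. The sample text and the array names are assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Eight sample lines; in list context, 1 .. 8 is an ordinary range.
my $text = join '', map { "line $_\n" } 1 .. 8;
my ($top, $bottom) = (3, 5);

my (@wrong, @right);
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
while (<$fh>) {
    # Variables are tested only for truth; 3 and 5 are always true,
    # so the range switches on and off again on every single line.
    push @wrong, $. if $top .. $bottom;

    # Explicit comparisons against $. give the intended window.
    push @right, $. if $. == $top .. $. == $bottom;
}
close $fh;

print "wrong: @wrong\n";    # prints: wrong: 1 2 3 4 5 6 7 8
print "right: @right\n";    # prints: right: 3 4 5
```

The wrong version does not fail loudly. It quietly selects every line, which makes the mistake easy to miss.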

The difference between .. and ... is their behavior when both operands become true on the same iteration. Consider these two cases:

```
print if /begin/ ..  /end/;
print if /begin/ ... /end/;
```

Given the line "You may not end ere you begin", both versions of the previous range operator return true. But the code using .. won't print any further lines. That's because .. tests both conditions on the same line once the first test matches, and the second test tells it that it's reached the end of its region. On the other hand, ... continues until the next line that matches /end/ because it never tries to test both operands on the same line.
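A short sketch makes the difference visible. It runs both operators over the same made-up input, in which the second line matches both patterns, and collects what each one would have printed.

```perl
use strict;
use warnings;

# Invented sample input; line 2 matches both /begin/ and /end/.
my @lines = (
    "nothing here\n",
    "You may not end ere you begin\n",
    "middle\n",
    "the end\n",
    "after\n",
);

my (@two_dot, @three_dot);
for (@lines) {
    push @two_dot,   $_ if /begin/ ..  /end/;
    push @three_dot, $_ if /begin/ ... /end/;
}

printf "..  selected %d line(s)\n", scalar @two_dot;      # 1
printf "... selected %d line(s)\n", scalar @three_dot;    # 3
```

With .. the range opens and closes on the second line alone. With ... it stays open through "the end", the next line that matches /end/.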

You may mix and match conditions of different sorts, as in:

```
while (<>) {
    $in_header = 1    .. /^$/;
    $in_body   = /^$/ .. eof( );
}
```

The first assignment sets \$in_header to be true from the first input line until after the blank line separating the header, such as from a mail message, a USENET news posting, or even an HTTP header. (Technically, an HTTP header should have linefeeds and carriage returns as network line terminators, but in practice, servers are liberal in what they accept.) The second assignment sets \$in_body to true as soon as the first blank line is encountered, up through end-of-file. Because range operators do not retest their initial condition, any further blank lines, like those between paragraphs, won't be noticed.
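As a rough, self-contained illustration, the sketch below applies the same two range tests to a small message held in a string. The message text is invented, and the code reads from a lexical in-memory filehandle, so it calls eof(\$fh) on that handle rather than eof( ), which refers to the files named on the command line.

```perl
use strict;
use warnings;

# A made-up message: two header lines, a blank line, then the body.
my $message = <<'END_MESSAGE';
From: alice@example.com
Subject: lunch

See you at noon.

Bring the report.
END_MESSAGE

open my $fh, '<', \$message or die "cannot open in-memory file: $!";

my (@header, @body);
while (<$fh>) {
    my $in_header = 1    .. /^$/;         # line 1 through first blank line
    my $in_body   = /^$/ .. eof($fh);     # first blank line through EOF
    push @header, $_ if $in_header;
    push @body,   $_ if $in_body;
}
close $fh;

printf "header lines: %d\n", scalar @header;    # 3
printf "body lines:   %d\n", scalar @body;      # 4
```

The separating blank line lands in both lists, since it ends one range and starts the other. The second blank line, inside the body, does not restart anything.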

Here's an example. It reads files containing mail messages and prints addresses it finds in headers. Each address is printed only once. The extent of the header is from a line beginning with a "From:" up through the first blank line. If we're not within that range, go on to the next line. This isn't an RFC-822 notion of an address, but it is easy to write.

```
%seen = ( );
while (<>) {
    next unless /^From:?\s/i .. /^$/;
    while (/([^<>( ),;\s]+\@[^<>( ),;\s]+)/g) {
        print "$1\n" unless $seen{$1}++;
    }
}
```

#### 6.8.4 See Also

The .. and ... operators in the "Range Operator" sections of perlop(1) and Chapter 3 of Programming Perl; the entry for \$NR in perlvar(1) and the "Per-Filehandle Variables" section of Chapter 28 of Programming Perl.
