# Recipe 1.11 Expanding and Compressing Tabs

#### 1.11.1 Problem

You want to convert tabs in a string to the appropriate number of spaces, or vice versa. Converting spaces into tabs can reduce file size when the file has many consecutive spaces. Converting tabs into spaces may be required when producing output for devices that don't understand tabs or that expect them at different positions than you do.

#### 1.11.2 Solution

Either use a rather funny-looking substitution:

```
while ($string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e) {
    # spin in empty loop until substitution finally fails
}
```

or use the standard Text::Tabs module:

```
use Text::Tabs;
@expanded_lines  = expand(@lines_with_tabs);
@tabulated_lines = unexpand(@lines_without_tabs);
```

#### 1.11.3 Discussion

Assuming tab stops are set every N positions (where N is customarily eight), it's easy to convert them into spaces. The standard textbook method does not use the Text::Tabs module, but it is somewhat difficult to understand. Also, it uses the $` variable, whose very mention currently slows down every pattern match in the program. This is explained in Special Variables in Chapter 6. You could use this algorithm to make a filter that expands its input's tabs to tab stops set every eight positions:

```
while (<>) {
    1 while s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
    print;
}
```
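To make the arithmetic visible, here is a minimal sketch that traces the same formula one tab run at a time. The sample string, the variable names, and the printed format are illustrative assumptions, and the sketch uses explicit captures and substr rather than the substitution itself so that each intermediate value can be printed.

```perl
use strict;
use warnings;

# Illustrative input: two tabs after "ab", then one tab after "c".
my $line = "ab\t\tc\tdone";

# Find the first run of tabs, replace it, and repeat until none remain.
while ($line =~ /^([^\t]*)(\t+)/) {
    my $start  = length $1;              # column where the tab run begins
    my $tabs   = length $2;              # number of consecutive tabs
    my $spaces = $tabs * 8 - $start % 8; # same formula as the substitution
    printf "%d tab(s) at column %d become %d space(s)\n",
        $tabs, $start, $spaces;
    substr($line, $start, $tabs, ' ' x $spaces);
}

print "$line\n";
```

The first run starts at column 2 and holds two tabs, so it needs 2 * 8 - 2 = 14 spaces to reach column 16. The next tab then sits at column 17 and needs 8 - 17 % 8 = 7 spaces to reach column 24.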

To avoid $`, you could use a slightly more complicated alternative that uses the numbered variables for explicit capture; this one expands tabstops to four each instead of eight:

`1 while s/^(.*?)(\t+)/$1 . ' ' x (length($2) * 4 - length($1) % 4)/e;`

Another approach is to use the offsets directly from the @+ and @- arrays. This also expands to four-space positions:

`1 while s/\t+/' ' x (($+[0] - $-[0]) * 4 - $-[0] % 4)/e;`
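The hard-coded 8 and 4 in these snippets can be folded into a parameter. The following is a sketch only: expand_tabs is a hypothetical helper name, not part of any module, and it assumes it is handed one line at a time, since the offsets are measured from the start of the string.

```perl
use strict;
use warnings;

# Hypothetical helper: expand tabs in a single line of text.
# $width is the distance between tab stops and defaults to 8.
sub expand_tabs {
    my ($text, $width) = @_;
    $width = 8 unless defined $width;
    # $-[0] and $+[0] are the start and end offsets of the whole match.
    1 while $text =~ s/\t+/' ' x (($+[0] - $-[0]) * $width - $-[0] % $width)/e;
    return $text;
}

print expand_tabs("x\ty", 4), "\n";   # tab stops every four columns
print expand_tabs("ab\t\tc"), "\n";   # default of eight
```

Working on a copy inside the subroutine leaves the caller's string untouched, which is a design choice of this sketch rather than something the in-place one-liners do.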

If you're looking at all of these 1 while loops and wondering why they couldn't have been written as part of a simple s///g instead, it's because you need to recalculate the length from the start of the line again each time rather than merely from where the last match occurred.
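A small experiment shows the difference. This sketch (the sample string and variable names are made up for illustration) applies the same replacement expression once as a single s///g pass and once in a 1 while loop, then reports where the final character lands.

```perl
use strict;
use warnings;

my $original = "a\tbc\td";

# Single global pass: offsets are measured in the original string,
# so the second tab does not see the spaces added for the first one.
(my $global = $original) =~
    s/\t+/' ' x (($+[0] - $-[0]) * 8 - $-[0] % 8)/ge;

# Repeated single substitutions: each pass measures from the start
# of the already partly expanded string.
my $looped = $original;
1 while $looped =~ s/\t+/' ' x (($+[0] - $-[0]) * 8 - $-[0] % 8)/e;

printf "s///g:   'd' at column %d\n", index($global, 'd');
printf "1 while: 'd' at column %d\n", index($looped, 'd');
```

With the loop, the d ends up on the tab stop at column 16. The single pass computes the second tab's padding from its offset in the unexpanded string, so the d stops short at column 14.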

The idiom 1 while CONDITION is equivalent to while (CONDITION) { }, but shorter. It dates from a time when Perl ran the first form much faster than the second. The second is now almost as fast, but the first remains convenient, and the habit has stuck.

The standard Text::Tabs module provides functions that convert in both directions, exports a $tabstop variable to control the number of spaces per tab, and does not incur the performance hit because it uses $1 and $2 rather than $& and $`.

```
use Text::Tabs;
$tabstop = 4;
while (<>) { print expand($_) }
```

We can also use Text::Tabs to "unexpand" the tabs. This example uses the default $tabstop value of 8:

```
use Text::Tabs;
while (<>) { print unexpand($_) }
```
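As a quick check on how the two functions relate, the following sketch runs a string through expand and then unexpand. The sample string and variable names are illustrative assumptions; only expand, unexpand, and $tabstop come from Text::Tabs.

```perl
use strict;
use warnings;
use Text::Tabs;

$tabstop = 4;

my $with_tabs  = "if\t(x)\t{";
my $spaced     = expand($with_tabs);   # tabs replaced by spaces
my $compressed = unexpand($spaced);    # runs of spaces replaced by tabs

print "expanded:   [$spaced]\n";
print "round trip: ", (expand($compressed) eq $spaced ? "same" : "different"),
      " layout\n";
```

The comparison is made on the expanded forms because unexpand is free to choose which spaces become tabs, so its output need not match the original tabbed string byte for byte even though it occupies the same columns.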

#### 1.11.4 See Also

The manpage for the Text::Tabs module; the s/// operator in perlre(1) and perlop(1); the @- and @+ variables (@LAST_MATCH_START and @LAST_MATCH_END) in Chapter 28 of Programming Perl; the section on "When a global substitution just isn't global enough" in Chapter 5 of Programming Perl
