5.2 Viewing Complex Data with Data::Dumper

Another way to visualize a complex data structure rapidly is to dump it. A particularly nice dumping package is included in the Perl core distribution, called Data::Dumper. Let's replace the last half of the byte-counting program with a simple call to Data::Dumper:

use Data::Dumper;

my %total_bytes;
while (<>) {
  my ($source, $destination, $bytes) = split;
  $total_bytes{$source}{$destination} += $bytes;
}

print Dumper(\%total_bytes);

The Data::Dumper module defines the Dumper subroutine. This subroutine is similar to the x command in the debugger. You can give Dumper one or more values, and Dumper turns those values into a printable string. The difference between the debugger's x command and Dumper, however, is that the string generated by Dumper is Perl code:

myhost% perl bytecounts2 <bytecounts-in
$VAR1 = {
          'thurston.howell.hut' => {
                                     'lovey.howell.hut' => 1250
                                   },
          'ginger.girl.hut' => {
                                 'maryann.girl.hut' => 199,
                                 'professor.hut' => 1218
                               },
          'professor.hut' => {
                               'gilligan.crew.hut' => 1250,
                               'lovey.howell.hut' => 1360
                             }
        };
myhost%

The Perl code is fairly understandable; it shows that you have a reference to a hash of three elements, with each value of the hash being a reference to a nested hash. You can evaluate this code and get a hash that's equivalent to the original hash. However, if you're thinking about doing this in order to have a complex data structure persist from one program invocation to the next, please keep reading.

Data::Dumper, like the debugger's x command, handles shared data properly. For example, go back to that "leaking" data from Chapter 4:

use Data::Dumper;
$Data::Dumper::Purity = 1; # declare possibly self-referencing structures
my @data1 = qw(one won);
my @data2 = qw(two too to);
push @data2, \@data1;
push @data1, \@data2;
print Dumper(\@data1, \@data2);

Here's the output from this program:

$VAR1 = [
          'one',
          'won',
          [
            'two',
            'too',
            'to',
            [  ]
          ]
        ];
$VAR1->[2][3] = $VAR1;
$VAR2 = $VAR1->[2];

Notice how you've created two different variables now, since there are two parameters to Dumper. The element $VAR1 corresponds to a reference to @data1, while $VAR2 corresponds to a reference to @data2. The debugger shows the values similarly:

DB<1> x \@data1, \@data2
    0  ARRAY(0xf914)
 0  'one'
 1  'won'
 2  ARRAY(0x3122a8)
    0  'two'
    1  'too'
    2  'to'
    3  ARRAY(0xf914)
       -> REUSED_ADDRESS
    1  ARRAY(0x3122a8)
 -> REUSED_ADDRESS

Note that the phrase REUSED_ADDRESS indicates that some parts of the data are actually references you've already seen.