Recipe 14.6 Storing Complex Data in a DBM File

14.6.1 Problem

You want values in a DBM file to be something other than scalars. For instance, you use a hash of hashes in your program and want to store it in a DBM file for other programs to access, or you want it to persist across process runs.

14.6.2 Solution

Use the CPAN module MLDBM to store more complex values than strings and numbers.

use MLDBM 'DB_File';
tie(%HASH, 'MLDBM', [... other DBM arguments]) or die $!;

Specify a particular serializing module with:

use MLDBM qw(DB_File Storable);

14.6.3 Discussion

MLDBM uses a serializing module like Storable, Data::Dumper, or FreezeThaw (see Recipe 11.13) to convert data structures to and from strings so that they can be stored in a DBM file. It doesn't store references; instead, it stores the data those references refer to:

# %hash is a tied hash
$hash{"Tom Christiansen"} = [ "book author", '' ];          
$hash{"Tom Boutell"} = [ "shareware author", '' ];

# names to compare
$name1 = "Tom Christiansen";
$name2 = "Tom Boutell";

$tom1 = $hash{$name1};      # snag local pointer
$tom2 = $hash{$name2};      # and another           

print "Two Toming: $tom1 $tom2\n";

Two Toming: ARRAY(0x73048) ARRAY(0x73e4c)
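One way to see why the two references differ is to do the string round trip by hand. The following is a minimal sketch that uses Storable directly rather than MLDBM, on the assumption that the freeze-then-thaw step resembles what a serializing back end does for each store and fetch. The variable names ($record, $frozen, $copy1, $copy2) are illustrative only.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

my $record = [ "book author", '' ];

# freeze() turns the structure into a plain string, which is the kind
# of value a DBM file can hold.
my $frozen = freeze($record);

# Each thaw() builds a brand-new structure from that string.
my $copy1 = thaw($frozen);
my $copy2 = thaw($frozen);

# References compare by address, so two thawed copies never match...
my $same_ref = ($copy1 == $copy2) ? "yes" : "no";

# ...even though the data inside them is identical.
my $same_data = ($copy1->[0] eq $copy2->[0] &&
                 $copy1->[1] eq $copy2->[1]) ? "yes" : "no";

print "same reference? $same_ref\n";
print "same contents?  $same_data\n";
```

Under these assumptions the two copies are distinct references holding equal data, which is why comparing the references themselves tells you nothing useful.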

Each time MLDBM retrieves a data structure from the DBM file, it generates a new copy of that data. To compare data that you retrieve from an MLDBM database, you need to compare the values within the structure:

if ($tom1->[0] eq $tom2->[0] &&
    $tom1->[1] eq $tom2->[1]) {
    print "You're having runtime fun with one Tom made two.\n";
} else {
    print "No two Toms are ever alike.\n";
}

This is more efficient than:

if ($hash{$name1}->[0] eq $hash{$name2}->[0] &&     # INEFFICIENT
    $hash{$name1}->[1] eq $hash{$name2}->[1]) {
    print "You're having runtime fun with one Tom made two.\n";
} else {
    print "No two Toms are ever alike.\n";
}

Each time we say $hash{...}, the DBM file is consulted. The inefficient code accesses the database four times, whereas the code using the temporary variables $tom1 and $tom2 only accesses the database twice.

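Current limitations of Perl's tie mechanism prevent you from storing or modifying parts of an MLDBM value directly. An assignment like the following changes only the temporary copy returned by the fetch and never writes it back to the DBM file: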

$hash{"Tom Boutell"}->[0] = "Poet Programmer";      # WRONG

Always get, change, and set pieces of the stored structure through a temporary variable:

$entry = $hash{"Tom Boutell"};                      # RIGHT
$entry->[0] = "Poet Programmer";
$hash{"Tom Boutell"} = $entry;
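The difference between the two approaches can be reproduced without a DBM file. The sketch below uses a hypothetical tie class, ToySerialHash, that freezes values into an ordinary in-memory hash with Storable. It is a stand-in written for illustration, not MLDBM's actual implementation, and it assumes only that values are serialized on STORE and deserialized on FETCH.

```perl
use strict;
use warnings;
use Storable ();

# ToySerialHash is a hypothetical stand-in for MLDBM: values are kept
# as frozen strings in a plain hash rather than in a DBM file.
package ToySerialHash;

sub TIEHASH {
    my ($class) = @_;
    return bless { store => {} }, $class;
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->{store}{$key} = Storable::freeze($value);
}

sub FETCH {
    my ($self, $key) = @_;
    my $frozen = $self->{store}{$key};
    return defined $frozen ? Storable::thaw($frozen) : undef;
}

package main;

tie my %hash, 'ToySerialHash';
$hash{"Tom Boutell"} = [ "shareware author", '' ];

# WRONG: this modifies a freshly thawed copy; STORE is never called.
$hash{"Tom Boutell"}->[0] = "Poet Programmer";
my $after_direct = $hash{"Tom Boutell"}->[0];

# RIGHT: fetch a copy, change it, then store the whole value back.
my $entry = $hash{"Tom Boutell"};
$entry->[0] = "Poet Programmer";
$hash{"Tom Boutell"} = $entry;
my $after_store = $hash{"Tom Boutell"}->[0];

print "after direct assignment:  $after_direct\n";
print "after fetch-modify-store: $after_store\n";
```

With this toy class, the direct assignment leaves the stored value untouched, and only the fetch-modify-store sequence changes what later fetches return.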

If MLDBM uses a database with size limits on values, like SDBM, you'll quickly hit those limits. To get around this, use GDBM_File or DB_File, which don't limit the size of keys or values. DB_File is the better choice because it is byte-order neutral, which lets the database be shared between both big- and little-endian architectures.

14.6.4 See Also

The documentation for the standard Data::Dumper and Storable modules; the documentation for the FreezeThaw and MLDBM modules from CPAN; Recipe 11.13; Recipe 14.7