19.6 Examples

Let's look at a few examples that will demonstrate the theory presented at the beginning of the chapter.

19.6.1 tie( )-ing Once and Forever

If you know that your code accesses the DBM file in read-only mode and you want the maximum data-retrieval speed, you should tie the DBM file only once per process. To do that, register code for the child initialization stage in the server startup file, so that each child process ties the DBM file as soon as it is spawned.

Consider the small test module in Example 19-2.

Example 19-2. Book/DBMCache.pm
package Book::DBMCache;

use DB_File;
use Fcntl qw(O_RDONLY O_CREAT);

use vars qw(%dbm);

sub init {
    my $filename = shift;
    tie %dbm, 'DB_File', $filename, O_RDONLY|O_CREAT,
        0660, $DB_BTREE or die "Can't tie $filename: $!";
}
1;

This module imports two symbols from the Fcntl package that we will use to tie the DBM file. The first one is O_RDONLY, as we want the file to be opened only for reading. It is important to note that in the case of the tie( ) interface, nothing prevents you from updating the DBM file, even if the file was tied with the O_RDONLY flag. The second flag, O_CREAT, is used just in case the DBM file isn't found where it is expected. In that case an empty file is created; without this flag, tie( ) would fail and the code execution would be aborted.

The module specifies a global variable, %dbm, which we need to be global so that we can access it directly from outside of the Book::DBMCache module. Alternatively, we could define this variable as lexically scoped to this module and write an accessor (method), which would make the code cleaner. However, this accessor would be called every time we wanted to read some value.
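The accessor alternative mentioned above could look roughly like the following sketch. The package name Book::DBMCacheLexical and the get( ) function are hypothetical names chosen for illustration, and the throwaway DBM file is created only so that the sketch can run on its own:

```perl
use strict;
use warnings;

{
    package Book::DBMCacheLexical;   # hypothetical name for this sketch

    use DB_File;
    use Fcntl qw(O_RDONLY O_CREAT);

    my %dbm;   # lexical: not reachable as %Book::DBMCacheLexical::dbm

    sub init {
        my $filename = shift;
        tie %dbm, 'DB_File', $filename, O_RDONLY|O_CREAT,
            0660, $DB_BTREE or die "Can't tie $filename: $!";
    }

    # Accessor (hypothetical): costs one extra sub call per lookup
    sub get {
        my $key = shift;
        return exists $dbm{$key} ? $dbm{$key} : undef;
    }
}

package main;

use DB_File;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);

# Populate a throwaway DBM file so the sketch is self-contained
my $dbfile = tempdir(CLEANUP => 1) . "/foo.db";
tie my %seed, 'DB_File', $dbfile, O_RDWR|O_CREAT, 0660, $DB_BTREE
    or die "Can't tie $dbfile: $!";
$seed{foo} = 'bar';
untie %seed;

Book::DBMCacheLexical::init($dbfile);
my $foo     = Book::DBMCacheLexical::get('foo');
my $missing = Book::DBMCacheLexical::get('no_such_key');

print "The value of foo: [$foo]\n";
print "no_such_key is ", (defined $missing ? "defined" : "undefined"), "\n";
```

The trade-off is the one described above: the hash can no longer be touched from outside the package, but every read goes through a function call.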

When Book::DBMCache::init( ) is called with a path to the DBM file as its argument, the global variable %dbm is tied to this file. We want the tie operation to happen before the first request is made, so we do it in a PerlChildInitHandler registered from startup.pl:

use Book::DBMCache;
Apache->push_handlers(PerlChildInitHandler => sub {
                        Book::DBMCache::init("/tmp/foo.db");
                    });

Assuming /tmp/foo.db is already populated with data, we can now write the test script shown in Example 19-3.

Example 19-3. test_dbm.pl
use Book::DBMCache;
use strict;

my $r = shift;
$r->send_http_header("text/plain");

my $foo = exists $Book::DBMCache::dbm{foo} ? $Book::DBMCache::dbm{foo} : '';
print "The value of foo: [$foo]";

When this is executed as an Apache::Registry script (assuming the DBM file was populated with the foo, bar key/value pair), we will see the following output:

The value of foo: [bar]

There's an easy way to guarantee that a tied hash is read-only: use a subclass of the tie module you're using that prevents writing. For example, you can subclass DB_File as follows:

package DB_File::ReadOnly;

use strict;
require DB_File;
@DB_File::ReadOnly::ISA = qw(DB_File);

sub STORE  {  }
sub DELETE {  }
sub CLEAR  {  }

1;

As you can see, the methods of the tie( ) interface that can alter the DBM file are overridden with methods that do nothing. Of course, you may want to use warn( ) or die( ) inside these methods, depending on how you want to flag writes. Any attempt to write should probably be considered a serious problem.

Now you can use DB_File::ReadOnly just like you were using DB_File before, but you can be sure that the DBM file won't be modified through this interface.
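If you prefer write attempts to fail loudly, as suggested above, a variant along these lines is one possibility. This is only a sketch: the class name DB_File::ReadOnlyStrict is hypothetical (it is not a CPAN module), and the temporary DBM file is created just to make the example runnable on its own:

```perl
use strict;
use warnings;

{
    package DB_File::ReadOnlyStrict;   # hypothetical name, not a CPAN module

    use Carp ();
    require DB_File;
    our @ISA = qw(DB_File);

    # Every method that could alter the DBM file raises an exception
    sub STORE  { Carp::croak("attempt to STORE into a read-only DBM file") }
    sub DELETE { Carp::croak("attempt to DELETE from a read-only DBM file") }
    sub CLEAR  { Carp::croak("attempt to CLEAR a read-only DBM file") }
}

package main;

use DB_File;
use Fcntl qw(O_RDONLY O_RDWR O_CREAT);
use File::Temp qw(tempdir);

# Populate a throwaway DBM file with plain DB_File
my $dbfile = tempdir(CLEANUP => 1) . "/foo.db";
tie my %rw, 'DB_File', $dbfile, O_RDWR|O_CREAT, 0600, $DB_HASH
    or die "Can't tie $dbfile: $!";
$rw{foo} = 'bar';
untie %rw;

# Tie it again through the strict subclass
tie my %ro, 'DB_File::ReadOnlyStrict', $dbfile, O_RDONLY, 0600, $DB_HASH
    or die "Can't tie $dbfile: $!";

my $value = $ro{foo};                          # reads still work
my $ok    = eval { $ro{foo} = 'baz'; 1 };      # writes throw
my $error = $ok ? '' : $@;
untie %ro;

print "The value of foo: [$value]\n";
print "Write refused: $error";
```

Wrapping the write in eval is only needed here to show the error; in production code you would normally let the exception propagate so the bug is noticed.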

19.6.2 Read/Write Access

This simple example will show you how to use the DBM file when you want to be able to safely modify it in addition to just reading from it. As mentioned earlier, we are running in a multiprocess environment in which more than one process might attempt to write to the file at the same time. Therefore, we need to have a lock on the DBM file before we can access it, even when doing only a read operation. We want to make sure that the retrieved data is completely valid, which might not be the case if someone is writing to the same record at the time of our read. We are going to use the DB_File::Lock module from CPAN to perform the actual locking.

The simple script shown in Example 19-4 imports the O_RDWR and O_CREAT symbols from the Fcntl module, loads the DB_File::Lock module, and sends the HTTP header as usual.

Example 19-4. read_write_lock.pl
use strict;
use DB_File::Lock;
use Fcntl qw(O_RDWR O_CREAT);

my $r = shift;
$r->send_http_header("text/plain");

my $dbfile = "/tmp/foo.db";
tie my %dbm, 'DB_File::Lock', $dbfile, O_RDWR|O_CREAT,
    0600, $DB_HASH, 'write' or die "Can't tie $dbfile: $!";
# assign a random value
$dbm{foo} = ('a'..'z')[int rand(26)];
untie %dbm;

# read the assigned value
tie %dbm, 'DB_File::Lock', $dbfile, O_RDWR|O_CREAT,
    0600, $DB_HASH, 'read' or die "Can't tie $dbfile: $!";
my $foo = exists $dbm{foo} ? $dbm{foo} : 'undefined';
untie %dbm;

print "The value of foo: [$foo]";

The next step is to tie the existing /tmp/foo.db file, or create a new one if it doesn't already exist. Notice that the last argument for the tie is 'write', which tells DB_File::Lock to obtain an exclusive (write) lock before moving on. Once the exclusive lock is acquired and the DBM file is tied, the code assigns a random letter as a value and saves the change by calling untie( ), which unlocks the DBM file and closes it. The section of code between the calls to tie( ) and untie( ) is called a critical section, because while we are inside it, no other process can read from or write to the DBM file. Therefore, it's important to keep the execution time of this section as short as possible.

The next section is similar to the first one, but this time we ask for a shared (read) lock, as we only want to read the value from the DBM file. Once the value is read, it's printed. Since the letter was picked randomly, you will see something like this:

The value of foo: [d]

then this (when reloading again):

The value of foo: [z]

and so on.

Based on this example you can build more elaborate code, and of course you may choose to use other locking wrapper modules, as discussed earlier.
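To illustrate the advice about short critical sections, here is a minimal sketch that does not use DB_File::Lock at all. Instead it takes a plain flock( ) on a separate lock file, so that it depends only on modules shipped with Perl. The helper with_locked_dbm( ) and the .lock file name are assumptions made for this example, not part of any module's API:

```perl
use strict;
use warnings;
use DB_File;
use Fcntl qw(:flock O_RDWR O_CREAT);
use File::Temp qw(tempdir);

my $dbfile   = tempdir(CLEANUP => 1) . "/foo.db";
my $lockfile = "$dbfile.lock";   # hypothetical naming convention

# Hypothetical helper: run $code with the DBM file tied, holding the
# requested lock only for the duration of the callback.
sub with_locked_dbm {
    my ($lock_type, $code) = @_;

    open my $lock_fh, '>>', $lockfile or die "Can't open $lockfile: $!";
    flock $lock_fh, $lock_type        or die "Can't lock $lockfile: $!";

    # --- critical section starts ---
    tie my %dbm, 'DB_File', $dbfile, O_RDWR|O_CREAT, 0600, $DB_HASH
        or die "Can't tie $dbfile: $!";
    my $result = $code->(\%dbm);
    untie %dbm;       # flush and close the DBM file before unlocking
    # --- critical section ends ---

    close $lock_fh;   # closing the handle releases the lock
    return $result;
}

# Do the (potentially slow) work before asking for the lock ...
my $value = join '', map { ('a'..'z')[int rand 26] } 1..8;

# ... so that the critical sections contain a single hash operation each.
with_locked_dbm(LOCK_EX, sub { $_[0]{foo} = $value });
my $read = with_locked_dbm(LOCK_SH, sub { $_[0]{foo} });

print "The value of foo: [$read]\n";
```

The point of the design is that anything expensive (here, building the random string) happens outside the callback, so other processes wait only for the tie, the single hash access, and the untie.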

19.6.3 Storing Complex Data Structures

As mentioned earlier, you can use the MLDBM module to store complex data structures in the DBM file (which by itself accepts only a scalar as a value). Example 19-5 shows how to do this.

Example 19-5. mldbm.pl
use strict;
use MLDBM qw(DB_File);
use DB_File;
use Data::Dumper ( );
use Fcntl qw(O_RDWR O_CREAT);

my $r = shift;
$r->send_http_header("text/plain");

my $rh = {
          bar => ['a'..'c'],
          tar => { map {$_ => $_**2 } 1..4 },
         };

my $dbfile = "/tmp/foo.db";
tie my %dbm, 'MLDBM', $dbfile, O_RDWR|O_CREAT, 
    0600, $DB_HASH or die $!;
# assign a reference to a Perl datastructure
$dbm{foo} = $rh;
untie %dbm;

# read the assigned value
tie %dbm, 'MLDBM', $dbfile, O_RDWR|O_CREAT, 
    0600, $DB_HASH or die $!;
my $foo = exists $dbm{foo} ? $dbm{foo} : 'undefined';
untie %dbm;

print Data::Dumper::Dumper($foo);

As you can see, this example is very similar to the normal use of DB_File; we just use MLDBM instead, and tell it to use DB_File as an underlying DBM implementation. You can choose any other available implementation instead. If you don't specify one, SDBM_File is used.

The script creates a complicated nested data structure and stores it in the $rh scalar. Then we open the database and store this value as usual.

When we want to retrieve the stored value, we do pretty much the same thing as before. The script uses the Data::Dumper::Dumper function to print out the nested data structure. Here is what it prints:

$VAR1 = {
          'bar' => [
                     'a',
                     'b',
                     'c'
                   ],
          'tar' => {
                     '1' => '1',
                     '2' => '4',
                     '3' => '9',
                     '4' => '16'
                   }
        };

That's exactly what we inserted into the DBM file.

There is one important note, though. If you want to modify a value that is a reference to a data structure, you cannot modify it directly. You have to retrieve the value, modify it, and store it back.

For instance, in the example above you cannot do:

tie my %dbm, 'MLDBM', $dbfile, O_RDWR|O_CREAT, 
    0600, $DB_HASH or die $!;
# update the existing key
$dbm{foo}->{bar} = ['a'..'z']; # this doesn't work
untie %dbm;

if the key bar existed before. Instead, you should do the following:

tie my %dbm, 'MLDBM', $dbfile, O_RDWR|O_CREAT, 
    0600, $DB_HASH or die $!;
# update the existing key
my $tmp     = $dbm{foo};
$tmp->{bar} = ['a'..'z'];
$dbm{foo}   = $tmp;       # this works
untie %dbm;

This limitation exists because the Perl TIEHASH interface currently has no support for multidimensional ties.
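The effect can be reproduced without a DBM file at all. The following toy class is only a stand-in written for this illustration (it is not how MLDBM is actually implemented, and the name Toy::SerializedHash is hypothetical): it serializes each top-level value on STORE and rebuilds a fresh copy on FETCH, which is enough to show why a nested assignment never reaches the underlying storage:

```perl
use strict;
use warnings;

{
    package Toy::SerializedHash;   # hypothetical class for illustration only

    use Storable qw(freeze thaw);
    require Tie::Hash;
    our @ISA = qw(Tie::StdHash);

    # Each top-level value is flattened to a string on the way in ...
    sub STORE {
        my ($self, $key, $value) = @_;
        $self->{$key} = freeze($value);
    }

    # ... and rebuilt as a brand-new structure on the way out.
    sub FETCH {
        my ($self, $key) = @_;
        my $frozen = $self->{$key};
        return defined $frozen ? thaw($frozen) : undef;
    }
}

package main;

tie my %dbm, 'Toy::SerializedHash';
$dbm{foo} = { bar => ['a'..'c'] };

# This changes only the temporary copy returned by FETCH;
# STORE is never called, so the stored value is untouched.
$dbm{foo}->{bar} = ['a'..'z'];
my $after_direct = scalar @{ $dbm{foo}{bar} };   # still 3

# Fetch, modify, store back: STORE sees the whole new value.
my $tmp     = $dbm{foo};
$tmp->{bar} = ['a'..'z'];
$dbm{foo}   = $tmp;
my $after_copy = scalar @{ $dbm{foo}{bar} };     # now 26

print "after direct update:   $after_direct elements\n";
print "after fetch/store-back: $after_copy elements\n";
```

Only the first level of the hash is tied, so Perl has no hook through which a change made deeper in the structure could trigger a write.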

By default, MLDBM uses Data::Dumper to serialize the nested data structures. You may want to use the FreezeThaw or Storable serializer instead. Storable is usually the preferred choice, mainly because it is considerably faster than Data::Dumper. To use Storable in our example, you should do:

use MLDBM qw(DB_File Storable);

at the beginning of the script.

Refer to the MLDBM manpage for more information.


