3.11 Gene.pm: A Fourth Example of a Perl Class

We've now come to the fourth and final version of the Gene class, Gene.pm. This final version adds a few more bells and whistles to make the code more reliable and useful. You'll see how to define the class attributes in such a way as to specify the operations that are permitted on them, thus enforcing more discipline in how the class can be used. You'll also see how to initialize an object with class defaults or clone an already existing object. You'll see the standard and simple way in which the documentation for a class can be incorporated into the .pm file. This will conclude my introduction to OO Perl programming (but check out the exercises at the end of the chapter and see later chapters of this book for more ideas).

3.11.1 Building Gene.pm

Here then is the code for Gene.pm. Again, I recommend that you take the time to read this code and compare it to the previous version, Gene3.pm, before continuing with the discussion that follows:

package Gene;

#
# A fourth and final version of the Gene.pm class
#

use strict;
use warnings;
our $AUTOLOAD; # before Perl 5.6.0 say "use vars '$AUTOLOAD';"
use Carp;

# Class data and methods
{
    # A list of all attributes with default values and read/write/required properties
    my %_attribute_properties = (
        _name        => [ '????',        'read.required'],
        _organism    => [ '????',        'read.required'],
        _chromosome  => [ '????',        'read.write'],
        _pdbref      => [ '????',        'read.write'],
        _author      => [ '????',        'read.write'],
        _date        => [ '????',        'read.write'],
    );
        
    # Global variable to keep count of existing objects
    my $_count = 0;

    # Return a list of all attributes
    sub _all_attributes {
            keys %_attribute_properties;
    }

    # Check if a given property is set for a given attribute
    sub _permissions {
        my($self, $attribute, $permissions) = @_;
        $_attribute_properties{$attribute}[1] =~ /$permissions/;
    }

    # Return the default value for a given attribute
    sub _attribute_default {
            my($self, $attribute) = @_;
        $_attribute_properties{$attribute}[0];
    }

    # Manage the count of existing objects
    sub get_count {
        $_count;
    }
    sub _incr_count {
        ++$_count;
    }
    sub _decr_count {
        --$_count;
    }
}

# The constructor method
# Called from class, e.g. $obj = Gene->new(  );
sub new {
    my ($class, %arg) = @_;
    # Create a new object
    my $self = bless {  }, $class;

    foreach my $attribute ($self->_all_attributes(  )) {
        # E.g. attribute = "_name",  argument = "name"
        my($argument) = ($attribute =~ /^_(.*)/);
        # If explicitly given
        if (exists $arg{$argument}) {
            $self->{$attribute} = $arg{$argument};
        # If not given, but required
        }elsif($self->_permissions($attribute, 'required')) {
            croak("No $argument attribute as required");
        # Set to the default
        }else{
            $self->{$attribute} = $self->_attribute_default($attribute);
        }
    }
    $class->_incr_count(  );
    return $self;
}

# The clone method
# All attributes will be copied from the calling object, unless
# specifically overridden
# Called from an exisiting object, e.g. $cloned_obj = $obj1->clone(  );
sub clone {
    my ($caller, %arg) = @_;
    # Extract the class name from the calling object
    my $class = ref($caller);
    # Create a new object
    my $self = bless {  }, $class;

    foreach my $attribute ($self->_all_attributes(  )) {
        # E.g. attribute = "_name",  argument = "name"
        my($argument) = ($attribute =~ /^_(.*)/);
        # If explicitly given
        if (exists $arg{$argument}) {
            $self->{$attribute} = $arg{$argument};
        # Otherwise copy attribute of new object from the calling object
        }else{
            $self->{$attribute} = $caller->{$attribute};
        }
    }
    $self->_incr_count(  );
    return $self;
}

# This takes the place of such accessor definitions as:
#  sub get_attribute { ... }
# and of such mutator definitions as:
#  sub set_attribute { ... }
sub AUTOLOAD {
    my ($self, $newvalue) = @_;

    my ($operation, $attribute) = ($AUTOLOAD =~ /(get|set)(_\w+)$/);
    
    # Is this a legal method name?
    unless($operation && $attribute) {
        croak "Method name $AUTOLOAD is not in the recognized form (get|set)_
attribute\n";
    }
    unless(exists $self->{$attribute}) {
        croak "No such attribute $attribute exists in the class ", ref($self);
    }

    # Turn off strict references to enable "magic" AUTOLOAD speedup
    no strict 'refs';

    # AUTOLOAD accessors
    if($operation eq 'get') {

        # Complain if you can't get the attribute
        unless($self->_permissions($attribute, 'read')) {
            croak "$attribute does not have read permission";
        }

        # Install this accessor definition in the symbol table
        *{$AUTOLOAD} = sub {
            my ($self) = @_;
            unless($self->_permissions($attribute, 'read')) {
                croak "$attribute does not have read permission";
            }
            $self->{$attribute};
        };

    # AUTOLOAD mutators
    }elsif($operation eq 'set') {

        # Complain if you can't set the attribute
        unless($self->_permissions($attribute, 'write')) {
            croak "$attribute does not have write permission";
        }

        # Set the attribute value
        $self->{$attribute} = $newvalue;

        # Install this mutator definition in the symbol table
        *{$AUTOLOAD} = sub {
               my ($self, $newvalue) = @_;
            unless($self->_permissions($attribute, 'write')) {
                croak "$attribute does not have write permission";
            }
            $self->{$attribute} = $newvalue;
        };
    }

    # Turn strict references back on
    use strict 'refs';

    # Return the attribute value
    return $self->{$attribute};
}

# When an object is no longer being used, this will be automatically called
# and will adjust the count of existing objects
sub DESTROY {
    my($self) = @_;
    $self->_decr_count(  );
}

# Other methods.  They do not fall into the same form as the majority handled by 
AUTOLOAD
sub citation {
    my ($self, $author, $date) = @_;
    $self->{_author} = set_author($author) if $author;
    $self->{_date} = set_date($date) if $date;
    return ($self->{_author}, $self->{_date})
}

1;

=head1 Gene

Gene: objects for Genes with a minimum set of attributes

=head1 Synopsis

    use Gene;

    my $gene1 = Gene->new(
        name       => 'biggene',
        organism   => 'Mus musculus',
        chromosome => '2p',
        pdbref     => 'pdb5775.ent',
        author     => 'L.G.Jeho',
        date       => 'August 23, 1989',
    );

    print "Gene name is ", $gene1->get_name(  );
    print "Gene organism is ", $gene1->get_organism(  );
    print "Gene chromosome is ", $gene1->get_chromosome(  );
    print "Gene pdbref is ", $gene1->get_pdbref(  );
    print "Gene author is ", $gene1->get_author(  );
    print "Gene date is ", $gene1->get_date(  );

    $clone = $gene1->clone(name => 'biggeneclone');

    $gene1-> set_chromosome('2q');
    $gene1-> set_pdbref('pdb7557.ent');
    $gene1-> set_author('G.Mendel');
    $gene1-> set_date('May 25, 1865');

    $clone->citation('T.Morgan', 'October 3, 1912');

    print "Clone citation is ", $clone->citation;

=head1 AUTHOR

A kind reader

=head1 COPYRIGHT

Copyright (c) 2003, We Own Gene, Inc.

=cut

3.11.2 Defining Attributes and Their Behaviors

This fourth version of Gene.pm does some additional things with the available attributes:

  • It collects them in their own hash, %_attribute_properties. This makes it easier to modify the class; you only have to add or delete attributes to this one hash, and the rest of the code will behave accordingly.

  • It enables you to specify default values for each attribute. In the Gene.pm class, I just specify the string ???? as the default for each attribute, but any values could be specified.

  • This attribute hash specifies, for each attribute, whether it is permitted to read or write it, and if it is required to have a nondefault value provided.

Here is the hash that supports all this:

# A list of all attributes with default values and read/write/required properties
    my %_attribute_properties = (
        _name        => [ '????',        'read.required'],
        _organism    => [ '????',        'read.required'],
        _chromosome  => [ '????',        'read.write'],
        _pdbref      => [ '????',        'read.write'],
        _author      => [ '????',        'read.write'],
        _date        => [ '????',        'read.write'],
    );

Why have the read/write/required properties been specified? It's because sometimes overwriting an attribute may get you into deep water; for instance, if you have a unique ID number assigned to each object you create, it may be a bad idea to allow the user of the class to overwrite that ID number. Restricting the access to read-only forces the user of the class to destroy an unwanted object and create a new one with a new ID. It depends on the application you're writing, but in general, the ability to enforce read/write discipline on your attributes can help you create safer code.

The required property ensures that the user gives an attribute a value when the object is created. I've already discussed why that is useful in earlier versions of the class; here, I'm just implementing it in a slightly different way.

This way of specifying properties can easily be expanded. For instance, if you want to add a property no_overwrite that prevents overwriting a previously set (nondefault) value, just add such a string to this hash and alter the code of the mutator method accordingly.

Now that we've got a fair amount of information about the attributes collected in a separate data structure, we need a few helper methods to access that information.

First, you need a method that simply returns a list of all the attributes:

# Return a list of all attributes
sub _all_attributes {
        keys %_attribute_properties;
}

Next, you'll want a way to check, for any given attribute and property, if that property is set for that attribute. The return value is the value of the last statement in the subroutine, which is true or false depending on whether or not the property $permissions is set for the given attribute:

# Check if a given property is set for a given attribute
sub _permissions {
    my($self, $attribute, $permissions) = @_;
    $_attribute_properties{$attribute}[1] =~ /$permissions/;
}

Finally, to set attribute values, you'll want to report on the default value for any given attribute. This returns the value of the last statement in the subroutine, which is the default value for the given attribute (this is a hash of arrays, and the code is returning the first element of the array stored for that attribute, which contains the default value):

# Return the default value for a given attribute
sub _attribute_default {
        my($self, $attribute) = @_;
    $_attribute_properties{$attribute}[0];
}

3.11.3 Initializing the Attributes of a New Object

This fourth and final version of Gene.pm has some alterations to the new constructor method. These alterations incorporate tests and actions relating to the new information being specified about the attributes, namely, their default values and their various properties.

I've also added an entirely new constructor method, clone. Recall that the new constructor method is called as a class method (e.g., Gene->new( )) and uses default values for every attribute not specified when called. It is often useful to create a new object by copying an old object and just changing some of its values. clone gives this capability. It is called as an object method (e.g., $geneobject->clone( )).

Let's examine the changes that were made to the new constructor; then we'll look at the clone constructor.

3.11.3.1 The newer new constructor

Here is the new version of the code for the new constructor:

# The constructor method
# Called from class, e.g. $obj = Gene->new(  );
sub new {
    my ($class, %arg) = @_;
    # Create a new object
    my $self = bless {  }, $class;

    foreach my $attribute ($self->_all_attributes(  )) {
        # E.g. attribute = "_name",  argument = "name"
        my($argument) = ($attribute =~ /^_(.*)/);
        # If explicitly given
        if (exists $arg{$argument}) {
            $self->{$attribute} = $arg{$argument};
        # If not given, but required
        }elsif($self->_permissions($attribute, 'required')) {
            croak("No $argument attribute as required");
        # Set to the default
        }else{
            $self->{$attribute} = $self->_attribute_default($attribute);
        }
    }
    $class->_incr_count(  );
    return $self;
}

Notice that we start by blessing an empty anonymous hash: bless { }, and then setting the values of the attributes.

These attribute values are set one by one, looping over their list given by the new helper method _all_attributes. Recall that the attribute names start with an underscore, which indicates they are private to the class code and not available to the user of the class. Each attribute is associated with an argument that has the same name without the leading underscore.

The logic of attribute initialization is three part. If an argument and value for an attribute is given, the attribute is set to that value. If no argument/value is given, but a value is required according to the properties specified for that attribute, the program croaks. Finally, if no argument is given and the attribute isn't required, the attribute is set to the default value specified for that attribute.

As before, at the end of the new constructor, the count of objects is increased, and the new object is returned.

3.11.3.2 The clone constructor

The clone constructor is very similar to the new constructor. In fact, the two subroutines could be combined into one without much trouble. (See the chapter exercises.) However, it makes sense to separate them, especially since it makes it clearer what's happening in the code that uses these subroutines. Besides, you just have to figure that the special ability to clone objects will come in handy in bioinformatics!

Here is the code for the clone constructor:

# The clone method
# All attributes will be copied from the calling object, unless
# specifically overridden
# Called from an exisiting object, e.g. $cloned_obj = $obj1->clone(  );
sub clone {
    my ($caller, %arg) = @_;
    # Extract the class name from the calling object
    my $class = ref($caller);
    # Create a new object
    my $self = bless {  }, $class;

    foreach my $attribute ($self->_all_attributes(  )) {
        # E.g. attribute = "_name",  argument = "name"
        my($argument) = ($attribute =~ /^_(.*)/);
        # If explicitly given
        if (exists $arg{$argument}) {
            $self->{$attribute} = $arg{$argument};
        # Otherwise copy attribute of new object from the calling object
        }else{
            $self->{$attribute} = $caller->{$attribute};
        }
    }
    $self->_incr_count(  );
    return $self;
}

Notice, first of all, that this method is called from an object, in contrast to the new constructor, which is called from the class. That is, to create a new object, you say something like:

$newobject = Myclass->new(  );

As usual, the class Myclass is named explicitly when calling the new constructor.

On the other hand, to clone an existing object, you say something like:

$clonedobject = $newobject->clone(  );

in which the clone constructor is called from an already existing object, in this case, the object $newobject.

Now, in the code for the clone method, the class name must be extracted from the caller by the ref($caller) code because the caller is an object, not a class.

Next, as in the new constructor, an empty anonymous hash is blessed as an object in the class, and then each attribute is considered in turn in a foreach loop.

Now, the argument name associated with the attribute name is extracted. Here, a simpler two-stage test is made. As before, if the argument is specified, the attribute is set as requested. If not, the attribute is set to the value it had in the calling object. Finally, the count of objects is incremented, and the new object is returned.

These two constructors give you some flexibility in how new objects are created and initialized in the Gene class. This flexibility may prove convenient and useful for you.

3.11.4 Permissions

The code to AUTOLOAD has been augmented with checks for appropriate permissions for the various attributes. The part of the code that handles the get_ accessor methods now checks to see if the read flag is set in the attribute hash via the _permissions class method. Notice the code that installs the definition of an accessor into the symbol table has also been modified to accommodate this additional test:

# AUTOLOAD accessors
if($AUTOLOAD =~ /.*::get_\w+/) {
    # Install this accessor definition in the symbol table
    *{$AUTOLOAD} = sub {
        my ($self) = @_;
        unless($self->_permissions($attribute, 'read')) {
            croak "$attribute does not have read permission";
        }
        $self->{$attribute};
    };
    # Return the attribute value
    unless($self->_permissions($attribute, 'read')) {
        croak "$attribute does not have read permission";
    }
    return $self->{$attribute};
}

Similarly, the part of AUTOLOAD that defines mutator methods for setting attribute values now checks for write permissions in a similar fashion.

3.11.5 Gene.pm Test Program and Output

Here is a test program testGene that exercises some of the new features of Gene.pm, followed by its output. It's worthwhile to take the time to read the testGene program, looking back at the class module Gene.pm for the definitions of the objects and methods and seeing what kind of output the test program creates. Also, see the exercises for suggestions on how to further modify and extend the capabilities of Gene.pm.

#!/usr/bin/perl

#
# Test the fourth and final version of the Gene module
#

use strict;
use warnings;

# Change this line to show the folder where you store Gene.pm
use lib "/home/tisdall/MasteringPerlBio/development/lib";
use Gene;

print "Object 1:\n\n";

# Create first object
my $obj1 = Gene->new(
        name            => "Aging",
        organism        => "Homo sapiens",
        chromosome      => "23",
        pdbref          => "pdb9999.ent"
); 

# Print the attributes of the first object
print $obj1->get_name, "\n";
print $obj1->get_organism, "\n";
print $obj1->get_chromosome, "\n";
print $obj1->get_pdbref, "\n";
# Test AUTOLOAD failure: try uncommenting one or both of these lines
#print $obj1->get_exon, "\n";
#print $obj1->getexon, "\n";

print "\n\nObject 2:\n\n";

# Create second object
my $obj2 = Gene->new(
        organism        => "Homo sapiens",
        name            => "Aging",
); 

# Print the attributes of the second object ... some will be unset
print $obj2->get_name, "\n";
print $obj2->get_organism, "\n";
print $obj2->get_chromosome, "\n";
print $obj2->get_pdbref, "\n";

# Reset some of the attributes of the second object
# set_name will cause an error
#$obj2->set_name("RapidAging");
$obj2->set_chromosome("22q");
$obj2->set_pdbref("pdf9876.ref");
$obj2->set_author("D. Enay");
$obj2->set_date("February 9, 1952");

print "\n\n";

# Print the reset attributes of the second object
print $obj2->get_name, "\n";
print $obj2->get_organism, "\n";
print $obj2->get_chromosome, "\n";
print $obj2->get_pdbref, "\n";
print $obj2->citation, "\n";

# Use a class method to report on a statistic about all existing objects
print "\nCount is ", Gene->get_count, "\n\n";

print "Object 3: a clone of object 2\n\n";

# Clone an object
my $obj3 = $obj2->clone(
        name            => "screw",
        organism        => "C.elegans",
        author          => "I.Turn",
);

# Print the attributes of the cloned object
print $obj3->get_name, "\n";
print $obj3->get_organism, "\n";
print $obj3->get_chromosome, "\n";
print $obj3->get_pdbref, "\n";
print $obj3->citation, "\n";

print "\nCount is ", Gene->get_count, "\n\n";

print "\n\nObject 4:\n\n";

# Create a fourth object: but this fails
#  because the "name" value is required (see Gene.pm)
my $obj4 = Gene->new(
        organism        => "Homo sapiens",
        chromosome      => "23",
        pdbref          => "pdb9999.ent"
); 

# This line is not reached due to the fatal failure to
#  create the fourth object
print "\nCount is ", Gene->get_count, "\n\n";

Here is the output from running the preceding program:

Object 1:

Aging
Homo sapiens
23
pdb9999.ent

Object 2:

Aging
Homo sapiens
????
????

Aging
Homo sapiens
22q
pdf9876.ref
D. EnayFebruary 9, 1952

Count is 2

Object 3: a clone of object 2

screw
C.elegans
22q
pdf9876.ref
I.TurnFebruary 9, 1952

Count is 3


Object 4:

No name attribute as required at testGene line 89