1.2 Why Perl Modules?

Building a medium- to large-sized program usually requires you to divide the work into several smaller, more manageable, interacting pieces. (A rule of thumb is that each "piece" should be about one or two printed pages in length, but this is just a general guideline.) An analogy can be made to building a microarray machine, which requires that you construct separate interacting pieces such as housing, temperature sensors and controls, robot arms to position the pipettes, hydraulic injection devices, and computer guidance for all these systems.

1.2.1 Subroutines and Software Engineering

Subroutines divide a large programming job into more manageable pieces. Modern programming languages all provide subroutines, which are also called functions, procedures, or methods in other programming languages.

A subroutine lets you write a piece of code that performs some part of a desired computation (e.g., determining the length of a DNA sequence). This code is written once and can then be called as often as needed throughout the main program. Using subroutines shortens the time it takes to write the main program, makes it more reliable by avoiding duplicated sections (which can get out of sync and make the program longer), and makes the entire program easier to test. A useful subroutine can be used by other programs as well, saving you development time in the future. As long as the inputs and outputs to the subroutine remain the same, its internal workings can be altered and improved without worrying about how the changes will affect the rest of the program. This is known as encapsulation.
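To make encapsulation concrete, here is a minimal sketch. The subroutine name dna_length and the sample sequence are made up for illustration; the point is only that the body of the subroutine could be rewritten without any caller noticing, as long as it still takes a string and returns its length.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of bases in a DNA string.
# Callers depend only on the input (a string) and the output (a number).
sub dna_length {
    my ($dna) = @_;

    # Current implementation: Perl's built-in length function.
    # An equivalent body, such as counting the elements returned by
    # split(//, $dna), could replace this line without changing any
    # of the code that calls dna_length.
    return length($dna);
}

my $dna = 'ACCGGAGTTGACTCTCCGAATA';    # example sequence

print "Length: ", dna_length($dna), "\n";
```

The main program calls dna_length wherever it needs a length and never looks inside it.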

The benefits of subroutines that I've just outlined also apply to other approaches in software engineering. Perl modules are one of a larger family of techniques known as software encapsulation and reuse. Software encapsulation and reuse are fundamental to object-oriented programming.

A related design principle is abstraction, which involves writing code that is usable in many different situations. Let's say you write a subroutine that adds the fragment TTTTT to the end of a string of DNA. If you then want to add the fragment AAAAA to the end of a string of DNA, you have to write another subroutine. To avoid writing two subroutines, you can write one that's more abstract and adds to the end of a string of DNA whatever fragment you give it as an argument. Using the principle of abstraction, you've saved yourself half the work.

Here is an example of a Perl subroutine that takes two strings of DNA as inputs and returns the second one appended to the end of the first:

sub DNAappend {
    my ($dna, $tail) = @_;

    return($dna . $tail);
}

This subroutine can be used as follows:

my $dna = 'ACCGGAGTTGACTCTCCGAATA';    # any DNA string will do
my $polyT = 'TTTTTTTT';

print DNAappend($dna, $polyT);

If you wish, you can also define subroutines polyT and polyA like so:

sub polyT {
    my ($dna) = @_;

    return DNAappend($dna, 'TTTTTTTT');
}

sub polyA {
    my ($dna) = @_;

    return DNAappend($dna, 'AAAAAAAA');
}

At this point, you should think about how to divide a problem into interacting parts; that is, an optimal (or at least good) way to define a set of subroutines that can cooperate to solve a particular problem.
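As one possible illustration of dividing a problem into cooperating parts, the sketch below computes the reverse complement of a DNA string by combining two smaller subroutines. All three subroutine names are hypothetical and chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helpers; the names are illustrative only.

# Return the sequence read backwards.
sub reverse_dna {
    my ($dna) = @_;
    return scalar reverse($dna);
}

# Return the base-by-base complement (A<->T, C<->G).
sub complement_dna {
    my ($dna) = @_;
    (my $complement = $dna) =~ tr/ACGTacgt/TGCAtgca/;
    return $complement;
}

# Combine the two smaller pieces to solve the larger problem.
sub reverse_complement {
    my ($dna) = @_;
    return complement_dna(reverse_dna($dna));
}

print reverse_complement('ACCGT'), "\n";
```

Each piece can be tested on its own, and either helper can be reused by other subroutines that need only a reversal or only a complement.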

1.2.2 Modules and Libraries

In my projects, I gather subroutine definitions into separate files called libraries,[1] or modules, which let me collect subroutine definitions for use in other programs. Then, instead of copying the subroutine definitions into the new program (and introducing the potential for inaccurate copies or for alternate versions proliferating), I can just insert the name of the library or module into a program, and all the subroutines are available in their original unaltered form. This is an example of software reuse in action.

[1] Perl libraries were traditionally put in files ending with .pl, which stands for perl library; the term library is also used to refer to a collection of Perl modules. The common denominator is that a library is a collection of reusable subroutines.

To fully understand and use modules, you need to understand the simple concepts of namespaces and packages. From here on, think of a Perl module as any Perl library file that uses package declarations to create its own namespace. These simple concepts are examined in the next sections.
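As a rough preview, the following sketch shows a package declaration creating a separate namespace. The package name DNAtools is hypothetical, and both packages are placed in one file only to keep the example self-contained; in practice the DNAtools part would normally live in its own file.

```perl
use strict;
use warnings;

# Hypothetical package name, declared in the same file for illustration.
package DNAtools;

sub append {
    my ($dna, $tail) = @_;
    return $dna . $tail;
}

# Switch back to the default namespace of the main program.
package main;

# A different subroutine with the same short name. There is no clash,
# because this one lives in the main namespace, not in DNAtools.
sub append {
    my ($text) = @_;
    return $text . "\n";
}

# The fully qualified name selects the subroutine inside DNAtools.
print append(DNAtools::append('ACGT', 'TTTT'));
```

The two subroutines named append coexist because each belongs to a different package.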