What You Need to Know to Use This Book

This book assumes that you have some experience with Perl, including a working knowledge of writing, saving, and running programs; basic Perl syntax; control structures such as loops and conditional tests; the most common operators, such as addition, subtraction, and string concatenation; input from and output to the user, files, and other programs; subroutines; the basic data types of scalar, array, and hash; and regular expressions for searching and altering strings. In other words, you should be able to program in Perl well enough to extract data from sources such as GenBank and the Protein Data Bank using pattern matching and regular expressions.

If you are new to Perl but feel you can forge ahead using a language summary and examples of programs, Appendix A provides a summary of the important parts of the Perl language. Previous programming experience in a high-level language such as C, Java, or FORTRAN; some experience using subroutines to break a large problem into smaller, appropriately interrelated parts; and a tinkerer's delight in taking things apart and seeing what makes them tick may be all the computer science prerequisites you need.

This book is written primarily for biologists, so it assumes you know the elementary facts about DNA, proteins, and restriction enzymes; how to represent DNA and protein data in a Perl program; how to search for motifs; and the structure and use of the databases GenBank, PDB, and Rebase. Because the book assumes you are a biologist, it does not explain biology concepts in detail, which lets it concentrate on programming skills.
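As a quick self-check of the level assumed here, the following minimal sketch stores a DNA fragment in a scalar string and reports every position at which a motif occurs. The sequence is made up for illustration, and the subroutine name find_motif is hypothetical; the motif GAATTC is the recognition site of the restriction enzyme EcoRI.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the 1-based start positions of every occurrence of a motif
# in a DNA string. The name find_motif is hypothetical, chosen only
# for this sketch.
sub find_motif {
    my ($dna, $motif) = @_;
    my @positions;

    # \Q...\E treats the motif as literal text rather than as a pattern.
    while ($dna =~ /\Q$motif\E/g) {
        # $-[0] holds the 0-based offset where the last match began.
        push @positions, $-[0] + 1;
    }
    return @positions;
}

# Made-up DNA fragment, for illustration only.
my $dna   = 'ACGGGAATTCTTAGCGAATTCAA';
my $motif = 'GAATTC';    # EcoRI recognition site

my @found = find_motif($dna, $motif);
print "Motif $motif found at positions: @found\n";
```

If this program reads easily, including the scalar, the array, the subroutine, and the regular expression inside the loop, you probably have the Perl background this book expects.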

Biological data appears in many forms. The most important sources of biological data include the repository of public genetic data called GenBank (Genetic Data Bank) and the repository of public protein structure data called PDB (Protein Data Bank). Many other similar sources of biological data, such as Rebase (Restriction Enzyme Database), are in wide use. All the databases just mentioned are most commonly distributed as text files, which makes Perl a good tool for finding and extracting information from them.
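To illustrate why text files suit Perl so well, here is a minimal sketch that pulls a few fields out of a record with regular expressions. The record is a shortened, made-up example laid out in the general style of a GenBank flat file, not real data, and the subroutine name parse_record is hypothetical. Real records contain many more fields and need more careful handling than this.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortened, made-up record in the general style of a GenBank flat file.
# The accession number and sequence are invented for illustration.
my $record = <<'END';
LOCUS       EXAMPLE01 24 bp DNA linear
DEFINITION  Hypothetical example sequence.
ACCESSION   XX000000
ORIGIN
        1 acgggaattc ttagcgaatt caag
//
END

# Extract the accession, definition, and sequence from one record.
# The name parse_record is hypothetical, chosen only for this sketch.
sub parse_record {
    my ($text) = @_;
    my %field;

    # /m lets ^ and $ match at the start and end of each line.
    ($field{accession})  = $text =~ /^ACCESSION\s+(\S+)/m;
    ($field{definition}) = $text =~ /^DEFINITION\s+(.*)$/m;

    # Capture everything between the ORIGIN line and the closing "//".
    if ($text =~ /^ORIGIN.*?\n(.*?)^\/\//ms) {
        my $sequence = $1;
        $sequence =~ s/[\s\d]//g;    # drop whitespace and position numbers
        $field{sequence} = uc $sequence;
    }
    return %field;
}

my %field = parse_record($record);
print "Accession:  $field{accession}\n";
print "Definition: $field{definition}\n";
print "Sequence:   $field{sequence}\n";
```

Because the whole record is plain text, three pattern matches and one substitution are enough to recover the fields in this small case.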