Structure of This Book

This book is divided into six parts: An Introduction to BLAST, Theory, Practice, Industrial-Strength BLAST, Reference, and the Appendixes. The quick start guide in Chapter 1 is the best place to begin if you've never run BLAST before. You won't need sophisticated hardware or software, just a web browser connected to the Internet. In Part II, we begin by exploring the molecular biology, computer science, and statistics that form the foundation of BLAST searches. We then describe the BLAST algorithm in detail. You will find that a sound theoretical understanding is essential when you put BLAST into practice. In Part III, we present practical advice to help you design and interpret BLAST experiments intelligently and efficiently. Whether you're a complete novice or a seasoned pro, you'll find the tutorials and protocols a valuable resource. Part IV discusses using BLAST in a high-throughput setting where the goal is to get as much BLAST as possible for your buck. Here, we integrate the information usually found scattered among systems administrators, database administrators, and advanced BLAST users into a few sensible chapters. Part V contains reference chapters for NCBI-BLAST and WU-BLAST with detailed descriptions of each parameter.

Here's a chapter-by-chapter breakdown:

Part I

Chapter 1 gives a quick introduction to BLAST by exploring Internet search pages.

Part II

Chapter 2 gives some background in molecular and evolutionary biology to help you understand why biological sequences are similar to one another.

Chapter 3 explains how global and local sequence alignment works and describes common algorithms for aligning sequences of letters.

Chapter 4 explains how scores are used to determine the best alignment and discusses the statistical significance of sequence similarity in a database search.

Part III

Chapter 5 discusses BLAST itself. Understanding the theoretical framework of the BLAST suite of programs will help you design and interpret BLAST experiments and give you a foundation for troubleshooting when your search produces unexpected results.

Chapter 6 explores the standard format of the BLAST report.

Chapter 7 shows how to calculate the numbers in a BLAST report and use this knowledge to better understand the results of a BLAST search.

Chapter 8 distills the previous seven chapters and the authors' own experience into advice designed to help you get the most from your BLAST searches.

Chapter 9 contains "recipes" for the most common BLAST searches; it describes what to do and why to do it.

Part IV

Chapter 10 shows how to install NCBI-BLAST and WU-BLAST software on your own computer. This is necessary if you want to use BLAST in a high-throughput setting or develop specialized applications.

Chapter 11 shows how to create and maintain BLAST databases, one of the most neglected yet important aspects of using BLAST.

Chapter 12 explores how to optimize BLAST searches for maximum throughput, helping you get the most out of your current and future hardware and software.

Part V

Chapter 13 describes the parameters and options for the NCBI suite of BLAST programs.

Chapter 14 describes the parameters and options for the WU-BLAST programs.

Part VI

Appendix A gives a brief description of each NCBI-BLAST sequence alignment display option, followed by a detailed explanation and example.

Appendix B shows the target frequencies and simple gap costs for pairs of sequences of length 100, 500, and 1,000.

Appendix C shows the default values for several combinations of NCBI-BLAST matrices and gap costs.

Appendix D is a Perl script that creates a graphical summary of a BLAST report using Thomas Boutell's GD graphics library, which Lincoln Stein has ported to Perl.

Appendix E is a Perl script that converts standard WU-BLAST or NCBI-BLAST output to the NCBI tabular format (-m 8) described in Appendix A.

There is also a Glossary of BLAST terms.