Chapter 7. A BLAST Statistics Tutorial

BLAST statistics are everywhere in biology today. In fact, it's hard to find a molecular-biology paper, grant proposal, patent application, or biotech business plan that doesn't refer to the Expect or P-value of a BLAST result. The BLAST Expect has so permeated biological thinking in recent years that for many scientists it has become synonymous with biological truth. Tell a colleague that you've just cloned a gene that's homologous to something trendy, and odds are that they will ask what the Expect of its alignment was in a BLAST search.

Of course, what some see as a sweeping change, others label a fad. While some researchers consider BLAST statistics a welcome injection of mathematical rigor into the biological world, others lament the abandonment of biological insight for faith in a number. No matter where you stand on this issue, there is no avoiding the reality of BLAST statistics in today's bioinformatics workplace. Understanding what the numbers in a BLAST report mean and how they are derived isn't just for mathematicians; it's a real-world survival skill for biologists and bioinformatics professionals in academia and industry alike.

The material covered in this chapter is practical rather than theoretical in nature. Chapter 4 summarized some of the theory behind local alignment statistics. Read that chapter to learn more about the basic parameters of BLAST: λ, k, and H. This chapter shows how to calculate the numbers in a BLAST report and use this knowledge to better understand the results of a BLAST search.