8.6 When Troubleshooting, Read the Footer First

Novices usually focus on the one-line summaries, regular users concentrate on the alignments and their statistics, and professionals first read the footer. When it comes to solving the two most common problems, no hits and too many hits, the one-line summaries aren't much help. Regular users can often look at alignments and diagnose compositional biases and unidentified repeats, but determining the cause of no hits isn't easy. The best way to find out what happened is to examine the footer and see what the search was actually looking for. Always answer the following questions first:

  • What are the values for the seeding parameters W, T, and two-hit distance? If the seeding parameters are too stringent, divergent alignments may not be seeded. In NCBI-BLAST, W is unfortunately not displayed in the footer. The values for T and the two-hit distance are given as T: and A:, respectively.

  • What is the scoring scheme expecting to find (i.e., target frequency)? If the scoring matrix expects nearly identical sequences, highly divergent sequences may be missed.

  • What is the alignment threshold? If the alignment threshold is too high, low-scoring alignments will be thrown away. The gapped and ungapped values are given after S1: and S2: in NCBI-BLAST. In WU-BLAST, they are on the rows beneath S2.

  • What are B and V set to? If they are set too low, the number of one-line summaries and database hits may be truncated.

  • What are the score and expected length of a significant alignment? Use the Karlin-Altschul equation to solve for the normalized score and then divide by H to calculate the length.

  • Was complexity filtering employed, and if so, was it hard or soft? Complexity filtering is generally a good idea, but may prevent some sequences from generating significant alignments. NCBI-BLAST doesn't currently report which filters were employed.
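
The score and length calculation from the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of either BLAST package: the function name significant_alignment is hypothetical, and it assumes the Karlin-Altschul equation in the form E = Kmn·e^(-λS), with the normalized score λS expressed in nats and H given in nats per aligned position. K, λ, and H should be read from the footer of your own report, and m and n should be the effective query and database lengths.

```python
import math

def significant_alignment(e_value, k, lam, h, m, n):
    """Estimate the score and length of an alignment at a given E-value.

    Assumes E = K * m * n * exp(-lambda * S), so the normalized score
    (lambda * S, in nats) is ln(K * m * n / E). Dividing by H (relative
    entropy, nats per aligned position) gives the expected length.
    """
    normalized = math.log(k * m * n / e_value)  # lambda * S, in nats
    raw = normalized / lam                      # raw score S
    length = normalized / h                     # expected aligned positions
    return normalized, raw, length

# Illustrative inputs only: substitute the K, lambda, and H values from
# your own footer and the effective lengths of your query and database.
normalized, raw, length = significant_alignment(
    e_value=10.0, k=0.134, lam=0.318, h=0.401, m=300, n=1e8
)
print(f"normalized score: {normalized:.1f} nats")
print(f"raw score:        {raw:.1f}")
print(f"expected length:  {length:.1f} positions")
```

If the expected length is longer than the region of similarity you hope to find, the search parameters are probably too stringent for that target.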