12.6 Optimized NCBI-BLAST

The source code for NCBI-BLAST is in the public domain, and anyone can modify it without restriction (ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools). It's therefore not surprising that there are a number of variants. The rest of this chapter discusses three of them.

12.6.1 Apple/Genentech BLAST

Macintosh G4 computers have an additional vector processing unit, called the Velocity Engine or AltiVec, that can apply a single instruction to several data elements in parallel. Apple Computer and Genentech collaborated to rewrite portions of NCBI-BLAST to take advantage of the AltiVec unit. These modifications affect the seeding phase of BLASTN. The result, AG-BLAST, significantly outperforms NCBI-BLAST under certain conditions.

Table 12-5 shows an experiment in which a Caenorhabditis elegans transcript (F44B9.10) was searched against the Caenorhabditis briggsae genome using various word sizes but otherwise default parameters (the hardware is a 550-MHz PowerBook). For cross-species work, it's generally a good idea to employ word sizes slightly smaller than the default 11 to minimize the chance of missing meaningful similarities. Here, AG-BLAST has a significant speed advantage over NCBI-BLAST. AG-BLAST also runs faster at very large word sizes, which is useful if you are matching sequences that are expected to be identical or nearly identical (e.g., mapping ESTs to their own genome).

Table 12-5. Apple/Genentech BLAST

AG-BLAST (sec)    Speed increase
(missing)         1.5x
(missing)         5.3x
(missing)         8.5x
(missing)         1.0x
(missing)         1.0x
(missing)         1.4x
(missing)         2.3x
(missing)         2.8x
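A word-size comparison of this kind is easy to script. The following Python sketch builds a blastall-style BLASTN command for a given word size, times it, and reports the speed increase as the NCBI-BLAST time divided by the AG-BLAST time. The options -p, -d, -i, -W, and -o are standard blastall options; the executable paths, database name, query filename, and the list of word sizes are hypothetical placeholders, and the sketch assumes that both builds are invoked the same way and that the speed increase is computed as this ratio.

```python
import subprocess
import time


def blastn_command(executable, database, query, word_size, output="/dev/null"):
    """Build a blastall-style BLASTN command line for one word size."""
    return [
        executable,
        "-p", "blastn",        # program name
        "-d", database,        # formatted nucleotide database
        "-i", query,           # query FASTA file
        "-W", str(word_size),  # word size used in the seeding phase
        "-o", output,          # discard the report; only the timing matters
    ]


def time_command(command):
    """Run a command and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def speed_increase(ncbi_seconds, ag_seconds):
    """Speed increase of AG-BLAST, assumed to be NCBI time / AG time."""
    return ncbi_seconds / ag_seconds


def compare_word_sizes(ncbi_exe, ag_exe, database, query, word_sizes):
    """Time both executables at each word size and print the ratio."""
    for w in word_sizes:
        t_ncbi = time_command(blastn_command(ncbi_exe, database, query, w))
        t_ag = time_command(blastn_command(ag_exe, database, query, w))
        print(f"W={w}: {t_ncbi:.1f}s vs {t_ag:.1f}s, "
              f"{speed_increase(t_ncbi, t_ag):.1f}x")


# Hypothetical usage (paths and names are placeholders):
# compare_word_sizes("/usr/local/ncbi/blastall", "/usr/local/agblast/blastall",
#                    "briggsae", "F44B9.10.fa", [8, 9, 10, 11])
```

Wall-clock timings of single runs are noisy, so repeating each measurement and taking the median gives a more trustworthy ratio.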

AG-BLAST does have a few disadvantages. First, the version may be slightly out of date with respect to NCBI-BLAST. The current version of AG-BLAST is based on 2.2.2, while NCBI-BLAST is up to Version 2.2.6. Not all changes are backward-compatible; for example, the latest preformatted databases require Version 2.2.5. Second, AG-BLAST doesn't work with multiple CPUs. You can execute more than one job at a time, but you can't use the -a option to increase the number of CPUs used by a single process. Finally, the minimum word size on AG-BLAST is 8, or one greater than the NCBI-BLAST minimum. See http://developer.apple.com/hardware/ve/acgresearch.html for more information.
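Because the -a option isn't available, the usual workaround is to split the queries and run one independent job per CPU. The Python sketch below is one way to do that, not a feature of AG-BLAST itself: it divides a multi-FASTA file round-robin into several query files and launches one blastall-style process per file. The executable name, database name, and file prefix are hypothetical placeholders.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def read_records(text):
    """Split multi-FASTA text into a list of records (header plus sequence)."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">"):
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif current:
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def split_round_robin(records, n_parts):
    """Deal records into at most n_parts non-empty groups."""
    parts = [[] for _ in range(n_parts)]
    for i, record in enumerate(records):
        parts[i % n_parts].append(record)
    return [part for part in parts if part]


def run_parallel(executable, database, query_text, n_jobs, prefix="part"):
    """Write one query file per job and run the jobs concurrently."""
    commands = []
    parts = split_round_robin(read_records(query_text), n_jobs)
    for i, part in enumerate(parts):
        query_file = f"{prefix}{i}.fa"
        with open(query_file, "w") as handle:
            handle.writelines(part)
        commands.append([executable, "-p", "blastn", "-d", database,
                         "-i", query_file, "-o", f"{prefix}{i}.out"])
    # Threads only wait on the child processes; the searches themselves
    # run as separate operating-system processes.
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        list(pool.map(lambda c: subprocess.run(c, check=True), commands))
```

Round-robin assignment balances the number of sequences per job, not the number of letters, so a file with a few very long queries may still leave one CPU working long after the others finish.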

12.6.2 Paracel-BLAST and BlastMachine

Paracel makes an NCBI-BLAST derivative called Paracel-BLAST and sells it with a prepackaged computer cluster called a BlastMachine. This product takes the high-performance hardware and software tricks and puts them into a single, easy-to-use package. The hardware is a rack of Linux-Intel machines, and the distributed resource management (DRM) software is Platform LSF. Large query sequences are chopped, small ones are packed, and data is distributed so that the search comes back as fast as possible. This is convenient because it lets users concentrate on what they want to do rather than on how they have to do it. In the end, more science and less frustration is a good thing.

See http://www.paracel.com for more information.
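The idea of chopping large queries and packing small ones can be illustrated with a short Python sketch. This is not Paracel's implementation, which isn't described here; it is a minimal illustration under the assumption that long queries are cut into overlapping windows and short queries are grouped up to a total size limit. The function names and the window, overlap, and size parameters are all hypothetical.

```python
def chop(sequence, window, overlap):
    """Cut a long sequence into overlapping windows.

    Returns (offset, piece) pairs; the offset lets hit coordinates be
    mapped back to the original sequence.
    """
    if overlap >= window:
        raise ValueError("overlap must be smaller than window")
    step = window - overlap
    pieces = []
    start = 0
    while True:
        pieces.append((start, sequence[start:start + window]))
        if start + window >= len(sequence):
            break
        start += step
    return pieces


def pack(sequences, max_letters):
    """Group (name, sequence) pairs into batches of at most max_letters.

    A sequence longer than max_letters ends up in a batch of its own.
    """
    batches, current, size = [], [], 0
    for name, seq in sequences:
        if current and size + len(seq) > max_letters:
            batches.append(current)
            current, size = [], 0
        current.append((name, seq))
        size += len(seq)
    if current:
        batches.append(current)
    return batches
```

The overlap between windows matters: without it, an alignment that spans a cut point would be split in two or missed entirely.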

12.6.3 TimeLogic Tera-BLAST

TimeLogic uses an entirely different approach to optimizing BLAST. The BLAST algorithm is soft-wired into a special kind of chip called a field-programmable gate array (FPGA). Each FPGA executes the search very quickly, and multiple FPGA boards reside in a single computer called a DeCypher accelerator. The end result is a specialized computer that is limited in what it can do, but what it does, it does astonishingly well. A single DeCypher accelerator running Tera-BLAST (the name for their NCBI-BLAST-derived algorithm) is the equivalent of about 100 general-purpose computers. Shockingly, it all fits in a standard server case. Such technology doesn't come cheap. However, if you do a lot of BLAST searches (or use some of the other algorithms they provide), it may be far cheaper than a huge cluster, especially when you consider power consumption and maintenance.

One hidden cost in specialized systems such as a DeCypher accelerator is the time and effort required to integrate them with more general systems you may already have. If you have a stepwise sequence-analysis pipeline already worked out, it may be difficult to adapt it to Tera-BLAST. Tera-BLAST works most efficiently with big jobs, and taking advantage of this means giving it many sequences at once. Thus, you might have to restructure your pipeline in much the same way as discussed earlier with respect to caching.
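The restructuring amounts to turning a loop that searches and analyzes one sequence at a time into one that searches a whole batch and then hands each query its own hits. The Python sketch below shows the general shape. The search_batch and analyze callables are hypothetical stand-ins: search_batch represents whatever mechanism submits a batch to the accelerator, and nothing here reflects the actual Tera-BLAST interface.

```python
from collections import defaultdict


def run_batched(queries, search_batch, analyze, batch_size):
    """Search queries in batches, then analyze each query's hits.

    queries      -- list of (query_id, sequence) pairs
    search_batch -- hypothetical callable: takes a list of queries and
                    returns an iterable of (query_id, hit) pairs
    analyze      -- callable applied to (query_id, list_of_hits)
    """
    results = {}
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        # One submission per batch instead of one per sequence.
        hits_by_query = defaultdict(list)
        for query_id, hit in search_batch(batch):
            hits_by_query[query_id].append(hit)
        # Queries without hits still reach the analysis step.
        for query_id, _ in batch:
            results[query_id] = analyze(query_id, hits_by_query[query_id])
    return results
```

The per-sequence analysis code stays the same; only the point at which the search is issued moves, which is why the change resembles the caching strategy.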

TimeLogic also offers a completely new variant of BLAST called Gene-BLAST. This algorithm strings together HSPs by dynamic programming (an affine Smith-Waterman with two levels of gap scoring schemes) to achieve a better model of exon-intron structure. Gene-BLAST works with both nucleotide- and protein-level alignments and appears to be a welcome new addition to the BLAST family. Unfortunately, the only way to run Gene-BLAST is on TimeLogic hardware. See http://www.timelogic.com for more details.
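To give a feel for what stringing HSPs together by dynamic programming means, here is a generic Python sketch of collinear HSP chaining. It is not TimeLogic's algorithm, whose details aren't given here. It assumes one plausible reading of "two levels of gap scoring": short gaps pay an affine penalty, while long gaps in the subject (such as introns) pay a flat penalty. All penalty values and the intron length threshold are made up for illustration.

```python
def affine_cost(gap, open_cost=5, extend_cost=1):
    """Affine penalty for a gap of the given length (0 if no gap)."""
    return 0 if gap <= 0 else open_cost + extend_cost * gap


def subject_gap_cost(gap, intron_min=20, intron_cost=15):
    """Second level: gaps of intron size or more pay a flat penalty."""
    return intron_cost if gap >= intron_min else affine_cost(gap)


def chain_hsps(hsps):
    """Return (best_score, chain) over collinear HSPs.

    Each HSP is (q_start, q_end, s_start, s_end, score), with inclusive
    coordinates and all HSPs on the same strand.
    """
    if not hsps:
        return 0, []
    hsps = sorted(hsps)                      # order by query start
    best = [h[4] for h in hsps]              # best chain score ending at i
    prev = [None] * len(hsps)                # back-pointers
    for i, (qs, qe, ss, se, score) in enumerate(hsps):
        for j in range(i):
            pqs, pqe, pss, pse, _ = hsps[j]
            if pqe < qs and pse < ss:        # j lies strictly before i
                cost = (affine_cost(qs - pqe - 1)
                        + subject_gap_cost(ss - pse - 1))
                candidate = best[j] + score - cost
                if candidate > best[i]:
                    best[i], prev[i] = candidate, j
    end = max(range(len(hsps)), key=best.__getitem__)
    top = best[end]
    chain = []
    while end is not None:                   # follow back-pointers
        chain.append(hsps[end])
        end = prev[end]
    return top, chain[::-1]
```

With these illustrative penalties, two exon-like HSPs that are adjacent in the query but separated by 1,000 letters in the subject are joined at a cost of 15, whereas an overlapping or out-of-order HSP is left out of the chain.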