FASTA [Pearson et al. 1988] is another commonly used family of
programs for sequence database searches. FASTA stands for
FAST-All, reflecting the fact that it can be used for protein
or nucleotide comparisons. The program achieves a high level of
sensitivity for similarity searching at high speed. The speed
comes from using the observed pattern of word hits to identify
potential matches before attempting the more time-consuming
optimized search. The trade-off between speed and sensitivity is
controlled by the ktup parameter, which specifies the size of the
word. Increasing ktup decreases the number of background hits,
which speeds up the search but lowers sensitivity. Not every word
hit is investigated; instead, the program initially looks for
segments containing several nearby hits.
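The word-hit idea described above can be illustrated with a minimal Python sketch. It is not the FASTA implementation: it only indexes the words of a query, scans a subject sequence for exact word matches, and counts hits per diagonal (subject position minus query position), so that several nearby hits show up as a high count on one diagonal. The function names and the two toy sequences are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def word_positions(seq, ktup):
    """Map every ktup-length word in seq to the positions where it starts."""
    table = defaultdict(list)
    for i in range(len(seq) - ktup + 1):
        table[seq[i:i + ktup]].append(i)
    return table


def diagonal_hits(query, subject, ktup):
    """Count exact word hits per diagonal (subject start minus query start)."""
    table = word_positions(query, ktup)
    hits = Counter()
    for j in range(len(subject) - ktup + 1):
        # Look up the subject word in the query index; no hit means no work.
        for i in table.get(subject[j:j + ktup], ()):
            hits[j - i] += 1
    return hits


if __name__ == "__main__":
    # Toy sequences (hypothetical): the query is embedded in the subject.
    query = "ACGTACGTGA"
    subject = "TTACGTACGTGATT"
    for ktup in (1, 2, 4):
        hits = diagonal_hits(query, subject, ktup)
        best_diagonal, best_count = hits.most_common(1)[0]
        print(f"ktup={ktup}: total hits={sum(hits.values())}, "
              f"best diagonal={best_diagonal} ({best_count} hits)")
```

Running the loop with different ktup values shows the trade-off in miniature: a larger word size yields fewer total hits to examine, while the diagonal that holds the true match still stands out.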
Like BLAST, FASTA offers a variety of
programs for different searches. Here is an overview:
Program | Description |
fasta3 | Compare a protein sequence to a protein sequence database or a DNA sequence to a DNA sequence database using the FASTA algorithm (Pearson and Lipman, 1988; Pearson, 1996). Search speed and selectivity are controlled with the ktup (word size) parameter. For protein comparisons, ktup=2 by default; ktup=1 is more sensitive but slower. For DNA comparisons, ktup=6 by default; ktup=3 or ktup=4 provides higher sensitivity; ktup=1 should be used for oligonucleotides (DNA query lengths < 20). |
ssearch3 | Compare a protein sequence to a protein sequence database or a DNA sequence to a DNA sequence database using the Smith-Waterman algorithm (Smith and Waterman, 1981). ssearch3 is about 10 times slower than fasta3, but is more sensitive for full-length protein sequence comparison. |
fastx3/fasty3 | Compare a DNA sequence to a protein sequence database by comparing the translated DNA sequence in three frames and allowing gaps and frameshifts. fastx3 uses a simpler, faster algorithm for alignments that allows frameshifts only between codons; fasty3 is slower but produces better alignments with poor-quality sequences because frameshifts are allowed within codons. |
tfastx3/tfasty3 | Compare a protein sequence to a DNA sequence database, calculating similarities with frameshifts to the forward and reverse orientations. |
tfasta3 | Compare a protein sequence to a DNA sequence database, calculating similarities (without frameshifts) to the three forward and three reverse reading frames. tfastx3 and tfasty3 are preferred because they calculate similarity over frameshifts. |
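
The table above can be read as a small decision rule: the program depends on whether the query and the database are protein or DNA, and the default ktup depends on the sequence type and query length. The following Python sketch encodes only what the table states. The function names, the argument names, and the `rigorous` and `frameshifts` switches are hypothetical helpers for illustration and are not part of the FASTA package.

```python
def choose_program(query, library, rigorous=False, frameshifts=True):
    """Pick a program name from the overview table.

    query and library are each "protein" or "dna" (hypothetical labels).
    """
    kinds = {"protein", "dna"}
    if query not in kinds or library not in kinds:
        raise ValueError("query and library must be 'protein' or 'dna'")
    if query == library:
        # Same sequence type: heuristic FASTA search or Smith-Waterman.
        return "ssearch3" if rigorous else "fasta3"
    if query == "dna":
        # DNA query against a protein database (fasty3 is the slower
        # alternative for poor-quality sequences).
        return "fastx3"
    # Protein query against a DNA database.
    return "tfastx3" if frameshifts else "tfasta3"


def default_ktup(query, query_length):
    """Return the ktup value the table suggests for a fasta3 search."""
    if query == "protein":
        return 2
    if query == "dna":
        # The table recommends ktup=1 for oligonucleotides (length < 20).
        return 1 if query_length < 20 else 6
    raise ValueError("query must be 'protein' or 'dna'")


if __name__ == "__main__":
    print(choose_program("protein", "dna"))
    print(default_ktup("dna", 15))
```

A lower ktup than the default (1 for proteins, 3 or 4 for DNA) can still be passed by hand when sensitivity matters more than speed, as the fasta3 row notes.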