AltAVisT: Online description

AltAVisT (Alternative Alignment Visualization Tool) is a WWW-based software program that compares two alternative multiple alignments of a given sequence set to each other. Regions where both alignments coincide are color-coded to visualize the local agreement between the two alignments and to identify those regions of the alignments that can be considered most reliable.
Sequence alignment is the most fundamental tool for sequence data analysis in molecular biology. Practically all methods of computational sequence analysis rely in one way or another on sequence comparison, so their results depend on the quality of the underlying alignments. Pairwise and multiple alignment therefore continue to be among the most active areas of research in bioinformatics. There are two major challenges in the context of sequence alignment: (a) it can be hard to distinguish weak local homologies from random similarities and (b) alignment programs can only detect those homologies that appear in the same relative order in the input sequences. The latter problem is inherent in sequence alignment and means that, for many data sets, correct alignment of one homologous region necessarily prevents other homologies from being correctly aligned.

No single alignment procedure can be expected to construct biologically correct alignments in all possible situations. The reason is that every alignment program tries, explicitly or implicitly, to find optimal alignments according to some relatively simple mathematical scoring function. Yet it cannot be expected that any given scoring function will, under all conditions, be in accordance with biology, giving the mathematically highest score to the biologically correct alignments. Consequently, human intervention is often necessary to check the results of automated alignment procedures and to obtain biologically reasonable alignments. A popular way of testing the (local) reliability of pairwise or multiple alignments is to construct alternative alignments of the same sequence family using different alignment methods. Notredame et al. (2000) used this idea systematically and proposed a software tool that integrates results from different multi-alignment methods into one single output alignment.
For multiple alignment, a variety of programs are now available that rely on very different objective functions and optimization techniques. The results of these methods can therefore be quite diverse; see Notredame (2002) for an excellent review of state-of-the-art multi-alignment algorithms and Thompson et al. (1999b) for a systematic evaluation of the most widely used software tools. If two alignments have been constructed by different methods, those regions where both alignments coincide are generally considered to be more reliable than regions where they disagree. However, manually comparing different multiple alignments is a tedious task.
AltAVisT compares two different multiple alignments of a given data set and highlights regions where both alignments coincide. Two input options are available:
Our tool can be used not only to determine reliable regions in alignments but also to evaluate alignment programs by comparing the alignments they produce to reference alignments that are considered a standard of truth. There is now a high-quality database called BAliBASE that has been designed as a benchmark database for the evaluation of multiple alignment methods (Thompson et al., 1999a). The authors of BAliBASE also provide software that automatically compares arbitrary alignments of their test data to the reference alignments and determines the overall degree of agreement between the two alignments. However, for the development of alignment methods, it can be interesting to know not only the overall quality of the produced alignments but also exactly where these alignments agree with the given reference alignment and where they do not. Our method can be used for this purpose and should therefore also be useful for the further development and improvement of pairwise and multiple alignment methods.

Below is the result of AltAVisT applied to a small test sequence set. The first alignment has been produced by DIALIGN, the second one by CLUSTAL. For each column in the first alignment, those residue pairs are colored that also appear in one column in the second alignment. Different colors are used to distinguish groups of residues where the alignment coincides within groups but not between different groups. For example, the two Ms in column 4 of the DIALIGN alignment also appear in the same column in the CLUSTAL alignment, namely in column 21; they are therefore colored. The same holds true for the two Cs in the same column of the DIALIGN alignment. These residues also appear in a common column in the CLUSTAL alignment, namely in column 4. However, the Ms and Cs belong to different columns in the CLUSTAL alignment, so different colors are used. All lower-case residues in the DIALIGN alignment are printed in black because they are not considered aligned by DIALIGN, regardless of which column they are in.

In the second alignment, all residues have the same color as in the first alignment so that the two alignments can be easily compared. This may imply, however, that residues in the second alignment appear in the same color even though they are not aligned together in the first alignment; see, for example, column 21 in the second alignment.

This is what the AltAVisT output looks like:

THE RESULT OF DIALIGN IS:

prtp_mouse  YQSMNS-----------------QYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLn

NOTE: lower case letters are not considered to be aligned

THE RESULT OF CLUSTALW IS:

prtp_mouse  -----------------YQSMNSQYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLN
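
The column-wise comparison described above can be illustrated with a minimal Python sketch. It is not the AltAVisT implementation. It assumes that both alignments are given as lists of equally long gapped strings, with the same sequences in the same order, that "-" is the gap character, and that lower-case residues in the first alignment are to be treated as unaligned (the DIALIGN convention mentioned above). The function names residue_columns and agreement_groups are hypothetical. For every column of the first alignment, the sketch groups those residues that also share a column in the second alignment; each group with at least two members would receive its own color.

```python
from collections import defaultdict

GAP = "-"  # assumed gap character


def residue_columns(alignment):
    """Map (sequence index, residue index) to the column holding that residue."""
    mapping = {}
    for seq_idx, row in enumerate(alignment):
        residue_idx = 0
        for col, ch in enumerate(row):
            if ch != GAP:
                mapping[(seq_idx, residue_idx)] = col
                residue_idx += 1
    return mapping


def agreement_groups(first, second):
    """For each column of `first`, group residues that share a column in `second`.

    Returns a dict: column index in `first` -> list of groups, where each
    group is a list of (sequence index, residue index) with >= 2 members.
    Lower-case residues in `first` are skipped (assumed to be unaligned).
    """
    column_in_second = residue_columns(second)
    next_residue = [0] * len(first)  # running residue index per sequence
    result = {}
    for col in range(len(first[0])):
        by_second_column = defaultdict(list)
        for seq_idx, row in enumerate(first):
            ch = row[col]
            if ch == GAP:
                continue
            residue_idx = next_residue[seq_idx]
            next_residue[seq_idx] += 1
            if ch.islower():
                continue  # not considered aligned in the first alignment
            target = column_in_second[(seq_idx, residue_idx)]
            by_second_column[target].append((seq_idx, residue_idx))
        groups = [g for g in by_second_column.values() if len(g) >= 2]
        if groups:
            result[col] = groups
    return result


# Toy data (invented for illustration, not taken from the example above):
# the Ms and the Cs share column 1 in `first` but sit in different
# columns in `second`, so they form two separate groups.
first = ["AM", "AM", "AC", "AC"]
second = ["AM-", "AM-", "A-C", "A-C"]
groups = agreement_groups(first, second)
```

In this toy case, column 1 of the first alignment yields two groups (the two Ms and the two Cs), which corresponds to the situation in which two different colors are used within one column.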