BiBiServ - Bielefeld University Bioinformatic Service

AltAVisT: Online description:


AltAVisT (Alternative Alignment Visualization Tool) is a WWW-based program that compares two alternative multiple alignments of a given sequence set to each other. Regions where both alignments coincide are color-coded to visualize the local agreement between the two alignments and to identify those regions that can be considered most reliable.


Why compare alignments?

Sequence alignment is the most fundamental tool for sequence data analysis in molecular biology. Practically all methods of computational sequence analysis rely in one way or another on sequence comparison, so their results depend on the quality of the underlying alignments. Pairwise and multiple alignment therefore continue to be among the most active areas of research in bioinformatics. There are two major challenges in the context of sequence alignment: (a) it can be hard to distinguish weak local homologies from random similarities, and (b) alignment programs can only detect those homologies that appear in the same relative order in the input sequences. The latter problem is inherent in sequence alignment and means that, for many data sets, correct alignment of one homologous region necessarily prevents other homologies from being correctly aligned.

No single alignment procedure can be expected to construct biologically correct alignments in all possible situations. The reason is that every alignment program tries, explicitly or implicitly, to find optimal alignments according to some relatively simple mathematical scoring function. Yet no given scoring function can be expected to agree with biology under all conditions, that is, to give the mathematically highest score to the biologically correct alignment. Consequently, human intervention is often necessary to check the results of automated alignment procedures and to obtain biologically reasonable alignments. A popular way of testing the (local) reliability of pairwise or multiple alignments is to construct alternative alignments of the same sequence family using different alignment methods. Notredame et al. (2000) used this idea systematically and proposed a software tool that integrates results from different multi-alignment methods into one single output alignment.

For multiple alignment, a variety of programs are now available that rely on very different objective functions and optimization techniques. The results of these methods can therefore be quite diverse; see Notredame (2002) for an excellent review of state-of-the-art multi-alignment algorithms and Thompson et al. (1999b) for a systematic evaluation of the most widely used software tools. If two alignments have been constructed by different methods, those regions where both alignments coincide are generally considered to be more reliable than regions where they disagree. However, manually comparing different multiple alignments is a tedious task.


Input options:

AltAVisT compares two different multiple alignments of a given data set and highlights regions where both alignments coincide. Two input options are available:

  • It is possible to enter a family of sequences. In this case, our program will run the programs DIALIGN (Morgenstern, 1999) and CLUSTAL W (Thompson et al., 1994) on the input sequences and compare the resulting alignments to each other. These two programs are currently among the most popular multi-alignment methods. Since they rely on fundamentally different algorithmic approaches, those parts of the alignments where both programs agree can be considered to be reliable.
  • It is possible to enter two different pre-calculated alignments of a sequence family that may have been produced by any method; this way the user can compare the results of arbitrary alignment methods to each other.
With either option, those residue pairs that are aligned to each other in both alignments are colored. Different colors are used to distinguish groups of residues for which the alignment coincides within groups but not between different groups. In other words, considering alignments as consistent equivalence relations as outlined in Morgenstern et al. (1996), residue pairs that are in the same column and have the same color belong to the set-theoretical intersection of the equivalence relations corresponding to the two alignments.
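
The coloring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AltAVisT implementation: the function names residue_columns and shared_groups are hypothetical, alignments are assumed to be given as equal-length gapped strings with '-' as the gap character, and columns are numbered from 1. Each residue is identified by its sequence and its position within that sequence; residues that share a column in both alignments end up in the same group, and each group with at least two members would receive its own color.

```python
from collections import defaultdict


def residue_columns(rows, lowercase_unaligned=False):
    """Map (sequence index, residue index) -> 1-based column for one alignment.

    If lowercase_unaligned is True, lower-case residues are treated as
    not aligned (as in DIALIGN output) and are left out of the map.
    """
    cols = {}
    for s, row in enumerate(rows):
        pos = 0  # index of the residue within its ungapped sequence
        for c, ch in enumerate(row, start=1):
            if ch == "-":
                continue
            if not (lowercase_unaligned and ch.islower()):
                cols[(s, pos)] = c
            pos += 1
    return cols


def shared_groups(first, second, lowercase_unaligned=True):
    """Group residues that share a column in BOTH alignments.

    Returns {(column in first, column in second): [(seq, residue), ...]}
    restricted to groups with at least two residues. Two groups in the
    same column of `first` but with different columns in `second` are
    the ones that would be shown in different colors.
    """
    a = residue_columns(first, lowercase_unaligned)
    b = residue_columns(second)
    groups = defaultdict(list)
    for residue, col_a in a.items():
        if residue in b:
            groups[(col_a, b[residue])].append(residue)
    return {k: v for k, v in groups.items() if len(v) >= 2}


# Toy data (invented for illustration): the same four sequences aligned twice.
first = ["YQSM", "---M", "YESC", "FVSC"]
second = ["---YQSM", "------M", "YESC---", "FVSC---"]
for (col_a, col_b), members in sorted(shared_groups(first, second).items()):
    print(col_a, col_b, members)
```

In the toy data, the two Ms and the two Cs all sit in column 4 of the first alignment but fall into two different columns of the second one, so they form two separate groups within the same column.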

Our tool can be used not only to determine reliable regions in alignments but also to evaluate alignment programs by comparing the alignments they produce to reference alignments that are regarded as a standard of truth. There is now a high-quality database called BAliBASE that has been designed as a benchmark for the evaluation of multiple alignment methods (Thompson et al., 1999a). The authors of BAliBASE also provide software that automatically compares arbitrary alignments of their test data to the reference alignments and determines the overall degree of agreement between the two. However, for the development of alignment methods, it can be interesting to know not only the overall quality of the produced alignments but also exactly where these alignments agree with the given reference alignment and where they do not. Our method can be used for this purpose and should therefore also be useful for further development and improvement of pairwise and multiple alignment methods.
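
To make the notion of an overall degree of agreement concrete, the following Python sketch computes the fraction of residue pairs aligned in a reference alignment that are also aligned in a test alignment. This is one common pair-based measure and is given here as an assumption for illustration; it is not claimed to be the exact procedure used by the BAliBASE software or by AltAVisT. The names aligned_pairs and pair_agreement are hypothetical, and alignments are again assumed to be equal-length gapped strings with '-' as the gap character.

```python
from itertools import combinations


def aligned_pairs(rows):
    """Return the set of residue pairs that share a column.

    A residue is identified as (sequence index, residue index); each pair
    is ordered by sequence index, so pairs from two alignments of the
    same sequences can be compared directly.
    """
    columns = {}
    for s, row in enumerate(rows):
        pos = 0  # index of the residue within its ungapped sequence
        for c, ch in enumerate(row):
            if ch != "-":
                columns.setdefault(c, []).append((s, pos))
                pos += 1
    pairs = set()
    for members in columns.values():
        pairs.update(combinations(members, 2))
    return pairs


def pair_agreement(test, reference):
    """Fraction of reference pairs that are also aligned in the test alignment."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(aligned_pairs(test) & ref) / len(ref)


# Toy data (invented for illustration): two alignments of ACGT and ACGGT.
reference = ["AC-GT", "ACGGT"]
test = ["ACG-T", "ACGGT"]
print(pair_agreement(test, reference))
```

A single number like this summarizes overall quality, whereas the set intersection computed inside pair_agreement is what tells a developer where the two alignments agree.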


Program output:

Below is the result of AltAVisT applied to a small test sequence set. The first alignment has been produced by DIALIGN, the second one by CLUSTAL W. For each column in the first alignment, those residue pairs are colored that also appear in one column in the second alignment. Different colors are used to distinguish groups of residues where the alignment coincides within groups but not between different groups. For example, the two Ms in column 4 of the DIALIGN alignment also appear in a common column in the CLUSTAL W alignment, namely in column 21; they are therefore colored. The same holds true for the two Cs in the same column of the DIALIGN alignment. These residues also appear in a common column in the CLUSTAL W alignment, namely in column 4. However, the Ms and Cs belong to different columns in the CLUSTAL W alignment, so different colors are used. All lower-case residues in the DIALIGN alignment are printed in black because they are not considered aligned by DIALIGN, regardless of which column they are in.

In the second alignment, all residues have the same color as in the first alignment so that the two alignments can be easily compared. This may imply, however, that residues in the second alignment appear in the same color even though they are not aligned together in the first alignment; see, for example, column 21 in the second alignment.

This is what the AltAVisT output looks like:

THE RESULT OF DIALIGN IS :

prtp_mouse YQSMNS-----------------QYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLn
yua6_caeel ---MTS-----------------RVLNAVNNNNLKMMLYNGDVDLACNALMGQRFTDKLg
cbpy_yeast -------INRNFLFAGDWMKPYHTAVTDLLNQDLPILVYAGDKDFICNWLGNKAWTDVLP
yby9_yeast ----DNDVFTGFLFTGDGSKPFQQYIAELLNHNIPVLIYAGDKDYICNWLGNHAWSNELE
cbpy_picpa YESCNFEINRNFLFAGDWMKPYHEHVSSLLNKGLPVLIYAGDKDFICNWLGNRAWTDVLP
cbpx_arath FVSCSTSVYQAMLv--DWMRNLEVGIPTLLEDGISLLVYAGEYDLICNWLGNSRWVNAME

prtp_mouse --QKMEVQR---RPWLVDygesgEQVAGFVKEC--SHITFLTIKGAG---PY-
yua6_caeel ltl-----SKKKTHFTVK-----GQIGGYVTQYkgSQVTFATVRGAGHaf---
cbpy_yeast WKYDEEFASQKVRNWTASIT---DEVAGEVKSY--KHFTYLRVFNGGHM----
yby9_yeast WINKRRYQRRMLRPWVSKET---GEELGQVKNY--GPFTFLRIYDAGHMVP--
cbpy_picpa WVDADGFEKAEVQDWLVN-----GRKAGEFKNY--SNFTYLRVYDAGHMAPY-
cbpx_arath WSGKTNFGAAKEVPFIVD-----GKEAGLLKTY--EQLSFLKVRDAGHMVPmd




NOTE: lower case letters are not considered to be aligned 

THE RESULT OF CLUSTALW IS :



prtp_mouse -----------------YQSMNSQYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLN
yua6_caeel --------------------MTSRVLNAVNNNNLKMMLYNGDVDLACNALMGQRFTDKLG
cbpy_yeast -------INRNFLFAGDWMKPYHTAVTDLLNQDLPILVYAGDKDFICNWLGNKAWTDVLP
yby9_yeast ----DNDVFTGFLFTGDGSKPFQQYIAELLNHNIPVLIYAGDKDYICNWLGNHAWSNELE
cbpy_picpa YESCNFEINRNFLFAGDWMKPYHEHVSSLLNKGLPVLIYAGDKDFICNWLGNRAWTDVLP
cbpx_arath FVSCSTSVYQAMLVD--WMRNLEVGIPTLLEDGISLLVYAGEYDLICNWLGNSRWVNAME

prtp_mouse QKME-----VQRRPW-LVDYGESGEQVAGFVKECSHITFLTIKGAG-PY---
yua6_caeel LTLS-----KKKTHF-TVK-GQIGGYVTQYK--GSQVTFATVRGAGHAF---
cbpy_yeast WKYDEEFASQKVRNWTASITDEVAGEVKSYK----HFTYLRVFNGGHM----
yby9_yeast WINKRRYQRRMLRPWVSKETGEELGQVKNYG----PFTFLRIYDAGHMVP--
cbpy_picpa WVDADGFEKAEVQDW--LVNGRKAGEFKNYS----NFTYLRVYDAGHMAPY-
cbpx_arath WSGKTNFGAAKEVPF--IVDGKEAGLLKTYE----QLSFLKVRDAGHMVPMD




Literature:

  • S. Abdeddaim and B. Morgenstern (2001).
    Speeding up the DIALIGN multiple alignment program by using the 'greedy alignment of biological sequences library' (GABIOS-LIB).
    Lecture Notes in Computer Science, 2066:1-11.

  • B. Morgenstern (1999).
    DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment.
    Bioinformatics, 15:211-218.

  • B. Morgenstern, A.W.M. Dress, and T. Werner (1996).
    Multiple DNA and protein sequence alignment based on segment-to-segment comparison.
    Proc. Natl. Acad. Sci. USA, 93:12098-12103.

  • C. Notredame (2002).
    Recent progress in multiple sequence alignment: a survey.
    Pharmacogenomics, 3:131-144.

  • C. Notredame, D. Higgins, and J. Heringa (2000).
    T-Coffee: a novel algorithm for multiple sequence alignment.
    J. Mol. Biol., 302:205-217.

  • J.D. Thompson, D.G. Higgins, and T.J. Gibson (1994).
    CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.
    Nucleic Acids Research, 22:4673-4680.

  • J.D. Thompson, F. Plewniak, and O. Poch (1999a).
    BAliBASE: A benchmark alignment database for the evaluation of multiple sequence alignment programs.
    Bioinformatics, 15:87-88.

  • J.D. Thompson, F. Plewniak, and O. Poch (1999b).
    A comprehensive comparison of protein sequence alignment programs.
    Nucleic Acids Research, 27:2682-2690.


Tue May 23 08:41:34 2006