A comprehensive comparison of multiple sequence alignment programs.

J D Thompson, F Plewniak, O Poch.

Abstract

In recent years improvements to existing programs and the introduction of new iterative algorithms have changed the state-of-the-art in protein sequence alignment. This paper presents the first systematic study of the most commonly used alignment programs using BAliBASE benchmark alignments as test cases. Even below the 'twilight zone' at 10-20% residue identity, the best programs were capable of correctly aligning on average 47% of the residues. We show that iterative algorithms often offer improved alignment accuracy though at the expense of computation time. A notable exception was the effect of introducing a single divergent sequence into a set of closely related sequences, causing the iteration to diverge away from the best alignment. Global alignment programs generally performed better than local methods, except in the presence of large N/C-terminal extensions and internal insertions. In these cases, a local algorithm was more successful in identifying the most conserved motifs. This study enables us to propose appropriate alignment strategies, depending on the nature of a particular set of sequences. The employment of more than one program based on different alignment techniques should significantly improve the quality of automatic protein sequence alignment methods. The results also indicate guidelines for improvement of alignment algorithms.
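The abstract reports accuracy as the share of residues aligned correctly against a benchmark reference. One common way to express such a figure is a sum-of-pairs score: the fraction of residue pairs aligned in the reference that the test alignment also places in the same column. The sketch below is a minimal illustration under that assumption; the abstract does not state which scoring function was used, and the function names `aligned_pairs` and `sum_of_pairs_score` are hypothetical, not part of BAliBASE or any alignment program.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Collect every pair of residues that share a column.

    `alignment` is a list of equal-length gapped strings, one per sequence.
    Each residue is identified by (sequence index, ungapped residue index),
    so the same residue is comparable across different alignments.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")

    counters = [0] * len(alignment)  # next ungapped index per sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != gap:
                present.append((s, counters[s]))
                counters[s] += 1
        # Residues in the same column are pairwise aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs reproduced by the test alignment."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy data (invented for illustration, not taken from the benchmark).
reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
score = sum_of_pairs_score(test, reference)
```

In this toy case the test alignment misplaces one residue of the first sequence, so it reproduces three of the four reference pairs. Both alignments must list the same sequences in the same order for the comparison to be meaningful.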

Year:  1999        PMID: 10373585      PMCID: PMC148477          DOI: 10.1093/nar/27.13.2682

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  171 in total

1.  BAliBASE (Benchmark Alignment dataBASE): enhancements for repeats, transmembrane sequences and circular permutations.

Authors:  A Bahr; J D Thompson; J C Thierry; O Poch
Journal:  Nucleic Acids Res       Date:  2001-01-01       Impact factor: 16.971

2.  Toward a comprehensive phylogeny for mammalian and avian herpesviruses.

Authors:  D J McGeoch; A Dolan; A C Ralph
Journal:  J Virol       Date:  2000-11       Impact factor: 5.103

3.  DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.

Authors:  J D Thompson; F Plewniak; J Thierry; O Poch
Journal:  Nucleic Acids Res       Date:  2000-08-01       Impact factor: 16.971

4.  Molecular identification of enterovirus by analyzing a partial VP1 genomic region with different methods.

Authors:  G Palacios; I Casas; A Tenorio; C Freire
Journal:  J Clin Microbiol       Date:  2002-01       Impact factor: 5.948

5.  A comparison of position-specific score matrices based on sequence and structure alignments.

Authors:  Anna R Panchenko; Stephen H Bryant
Journal:  Protein Sci       Date:  2002-02       Impact factor: 6.725

6.  A method for prediction of the locations of linker regions within large multifunctional proteins, and application to a type I polyketide synthase.

Authors:  Daniel W Udwary; Matthew Merski; Craig A Townsend
Journal:  J Mol Biol       Date:  2002-10-25       Impact factor: 5.469

7.  MTRAP: pairwise sequence alignment algorithm by a new measure based on transition probability between two consecutive pairs of residues.

Authors:  Toshihide Hara; Keiko Sato; Masanori Ohya
Journal:  BMC Bioinformatics       Date:  2010-05-08       Impact factor: 3.169

8.  MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts.

Authors:  Xin Deng; Jianlin Cheng
Journal:  BMC Bioinformatics       Date:  2011-12-14       Impact factor: 3.169

9.  MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.

Authors:  Kazutaka Katoh; Kazuharu Misawa; Kei-ichi Kuma; Takashi Miyata
Journal:  Nucleic Acids Res       Date:  2002-07-15       Impact factor: 16.971

10.  Finding weak similarities between proteins by sequence profile comparison.

Authors:  Anna R Panchenko
Journal:  Nucleic Acids Res       Date:  2003-01-15       Impact factor: 16.971
