
Characterization of pairwise and multiple sequence alignment errors.

Giddy Landan, Dan Graur.

Abstract

We characterize pairwise and multiple sequence alignment (MSA) errors by comparing true alignments from simulations of sequence evolution with reconstructed alignments. The vast majority of reconstructed alignments contain many errors. Error rates increase rapidly with sequence divergence; thus, even at intermediate degrees of sequence divergence, more than half of the columns of a reconstructed alignment may be expected to be erroneous. In closely related sequences, most errors consist of the mispositioning of a single indel event, and their effect is local. As sequences diverge, errors become more complex as a result of the simultaneous mis-reconstruction of many indel events, and the lengths of the affected MSA segments increase dramatically. We found a systematic bias towards underestimation of the number of gaps, which leads to the reconstructed MSA being, on average, shorter than the true one. Alignment errors are unavoidable even when the evolutionary parameters are known in advance. Correct reconstruction can only be guaranteed when the likelihood of the true alignment is uniquely optimal. However, true alignment features are very frequently sub-optimal or co-optimal, with the result that optimal, albeit erroneous, features are incorporated into the reconstructed MSA. Progressive MSA utilizes a guide-tree in the reconstruction of MSAs. The quality of the guide-tree was found to affect MSA error levels only marginally.
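The comparison of a true alignment with a reconstructed one can be illustrated with a small sketch. It is not taken from the paper: the function names `column_signatures` and `column_error_rate` are hypothetical, and the column-level definition of an error (a reconstructed column counts as erroneous when it does not appear in the true alignment) is an assumption, one common way to score such a comparison and not necessarily the measure the authors used.

```python
from typing import List, Set, Tuple


def column_signatures(msa: List[str], gap: str = "-") -> Set[Tuple[int, ...]]:
    """Describe each column by the residue index it holds in every sequence.

    A gap is recorded as -1. Two alignments of the same sequences share a
    column exactly when the column's signature occurs in both.
    """
    if len({len(row) for row in msa}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    next_residue = [0] * len(msa)  # next unaligned residue index per sequence
    signatures = set()
    for col in range(len(msa[0])):
        signature = []
        for row, seq in enumerate(msa):
            if seq[col] == gap:
                signature.append(-1)
            else:
                signature.append(next_residue[row])
                next_residue[row] += 1
        signatures.add(tuple(signature))
    return signatures


def column_error_rate(true_msa: List[str], recon_msa: List[str]) -> float:
    """Fraction of reconstructed columns that are absent from the true MSA.

    Assumption (not from the paper): an "erroneous column" is one whose
    signature does not occur in the true alignment.
    """
    true_cols = column_signatures(true_msa)
    recon_cols = column_signatures(recon_msa)
    wrong = len(recon_cols - true_cols)
    return wrong / len(recon_cols)


# Toy example: a single indel placed one position too early.
true_alignment = ["ACG-T", "ACGGT"]
reconstructed = ["AC-GT", "ACGGT"]
print(column_error_rate(true_alignment, reconstructed))
```

In this toy case the misplaced gap makes two of the five reconstructed columns differ from the true ones, which mirrors the local effect of mispositioning a single indel that the abstract describes for closely related sequences.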


Year:  2008        PMID: 18614299     DOI: 10.1016/j.gene.2008.05.016

Source DB:  PubMed          Journal:  Gene        ISSN: 0378-1119            Impact factor:   3.688


  18 in total

1.  Correlated Selection on Amino Acid Deletion and Replacement in Mammalian Protein Sequences.

Authors:  Yichen Zheng; Dan Graur; Ricardo B R Azevedo
Journal:  J Mol Evol       Date:  2018-06-28       Impact factor: 2.395

Review 2.  Getting a better picture of microbial evolution en route to a network of genomes.

Authors:  Tal Dagan; William Martin
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2009-08-12       Impact factor: 6.237

3.  An alignment confidence score capturing robustness to guide tree uncertainty.

Authors:  Osnat Penn; Eyal Privman; Giddy Landan; Dan Graur; Tal Pupko
Journal:  Mol Biol Evol       Date:  2010-03-05       Impact factor: 16.240

Review 4.  Visualization of multiple alignments, phylogenies and gene family evolution.

Authors:  James B Procter; Julie Thompson; Ivica Letunic; Chris Creevey; Fabrice Jossinet; Geoffrey J Barton
Journal:  Nat Methods       Date:  2010-03       Impact factor: 28.547

5.  A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function.

Authors:  Jason Lai; Jing Jin; Jan Kubelka; David A Liberles
Journal:  J Mol Biol       Date:  2012-05-28       Impact factor: 5.469

6.  Phylogenetic assessment of alignments reveals neglected tree signal in gaps.

Authors:  Christophe Dessimoz; Manuel Gil
Journal:  Genome Biol       Date:  2010-04-06       Impact factor: 13.583

7.  The tree and net components of prokaryote evolution.

Authors:  Pere Puigbò; Yuri I Wolf; Eugene V Koonin
Journal:  Genome Biol Evol       Date:  2010-10-01       Impact factor: 3.416

8.  PSAR: measuring multiple sequence alignment reliability by probabilistic sampling.

Authors:  Jaebum Kim; Jian Ma
Journal:  Nucleic Acids Res       Date:  2011-05-16       Impact factor: 16.971

9.  Reproducing the manual annotation of multiple sequence alignments using a SVM classifier.

Authors:  Christian Blouin; Scott Perry; Allan Lavell; Edward Susko; Andrew J Roger
Journal:  Bioinformatics       Date:  2009-09-21       Impact factor: 6.937

10.  Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects.

Authors:  Rhys A Farrer; Daniel A Henk; Dan MacLean; David J Studholme; Matthew C Fisher
Journal:  Sci Rep       Date:  2013       Impact factor: 4.379

