Multiple alignment by aligning alignments.

Travis J Wheeler, John D Kececioglu.

Abstract

MOTIVATION: Multiple sequence alignment is a fundamental task in bioinformatics. Current tools typically form an initial alignment by merging subalignments, and then polish this alignment by repeated splitting and merging of subalignments to obtain an improved final alignment. In general this form-and-polish strategy consists of several stages, and a profusion of methods have been tried at every stage. We carefully investigate: (1) how to utilize a new algorithm for aligning alignments that optimally solves the common subproblem of merging subalignments, and (2) what is the best choice of method for each stage to obtain the highest quality alignment.
RESULTS: We study six stages in the form-and-polish strategy for multiple alignment: parameter choice, distance estimation, merge-tree construction, sequence-pair weighting, alignment merging, and polishing. For each stage, we consider novel approaches as well as standard ones. Interestingly, the greatest gains in alignment quality come from (i) estimating distances by a new approach using normalized alignment costs, and (ii) polishing by a new approach using 3-cuts. Experiments with a parameter-value oracle suggest large gains in quality may be possible through an input-dependent choice of alignment parameters, and we present a promising approach for building such an oracle. Combining the best approaches to each stage yields a new tool we call Opal that on benchmark alignments matches the quality of the top tools, without employing alignment consistency or hydrophobic gap penalties.
AVAILABILITY: Opal, a multiple alignment tool that implements the best methods in our study, is freely available at http://opal.cs.arizona.edu.
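
As a rough illustration of the distance-estimation stage named in RESULTS, the sketch below computes a pairwise global alignment cost and normalizes it by sequence length. It is a minimal sketch under stated assumptions, not Opal's method: the abstract does not give the normalization or the cost model, so the unit mismatch cost, the linear gap cost, the division by mean sequence length, and the function names `alignment_cost`, `normalized_distance`, and `distance_matrix` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def alignment_cost(a, b, mismatch=1.0, gap=1.0):
    """Minimum global alignment cost by dynamic programming.

    Assumed cost model (not from the abstract): identical letters cost 0,
    substitutions cost `mismatch`, and each gap position costs `gap`.
    """
    # prev[j] holds the cost of aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            substitute = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            delete = prev[j] + gap      # a[i-1] aligned to a gap
            insert = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def normalized_distance(a, b, mismatch=1.0, gap=1.0):
    """Alignment cost divided by mean sequence length (assumed normalization)."""
    mean_len = (len(a) + len(b)) / 2
    if mean_len == 0:
        return 0.0
    return alignment_cost(a, b, mismatch, gap) / mean_len


def distance_matrix(seqs):
    """Symmetric matrix of normalized distances for a list of sequences."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = normalized_distance(seqs[i], seqs[j])
    return dist
```

A matrix like this could then feed the merge-tree construction stage; normalizing by length is one plausible way to keep long sequence pairs from looking distant merely because their raw costs are larger.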

Year: 2007  PMID: 17646343  DOI: 10.1093/bioinformatics/btm226

Source DB: PubMed  Journal: Bioinformatics  ISSN: 1367-4803  Impact factor: 6.937

