Combining many multiple alignments in one improved alignment.

K Bucka-Lassen, O Caprani, J Hein.

Abstract

MOTIVATION: The high complexity of the multiple sequence alignment problem has led to many different heuristic algorithms that attempt to find a solution in a reasonable amount of computation time and space. Very few of these heuristics produce results that are guaranteed always to lie within a certain distance of an optimal solution (given a measure of quality, e.g. parsimony). Most practical heuristics cannot guarantee this, but nevertheless perform well in certain cases. An alignment obtained with one of these heuristics and with a bad overall score is not unusable, though: it might contain important information on how substrings should be aligned. This paper presents a method that extracts qualitatively good sub-alignments from a set of multiple alignments and combines these into a new, often improved alignment. The algorithm is implemented as a variant of the traditional dynamic programming technique.
RESULTS: An implementation of ComAlign (the algorithm that combines multiple alignments) has been run on several sets of artificially generated sequences and a set of 5S RNA sequences. To assess the quality of the alignments obtained, the results have been compared with the output of MSA 2.1 (Gupta et al., Proceedings of the Sixth Annual Symposium on Combinatorial Pattern Matching, 1995; Kececioglu et al., http://www.techfak.uni-bielefeld.de/bcd/Lectures/kececioglu.html, 1995). In all cases, ComAlign was able to produce a solution with a score comparable to the solution obtained by MSA. The results also show that ComAlign actually does combine parts from different alignments and not just select the best of them.
AVAILABILITY: The C source code (a Smalltalk version is being worked on) of ComAlign and the other programs that have been implemented in this context are free and available on WWW (http://www.daimi.au.dk/~ocaprani).
CONTACT: klaus@bucka-lassen.dk; jotun@pop.bio.au.dk; ocaprani@daimi.au.dk
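
The combination step described under MOTIVATION can be illustrated with a small sketch. This is not the ComAlign implementation; it is a simplified illustration under stated assumptions. Each column of an input alignment is treated as a step between "cut points" (vectors counting how many residues of each sequence have been consumed), and dynamic programming picks the best-scoring path through the union of all steps. The function names, the sum-of-pairs scoring, and the values match=1, mismatch=-1, gap=-2 are hypothetical choices made for this example. The sketch also assumes that all input alignments contain the same sequences in the same row order.

```python
from itertools import combinations


def sp_column_score(col, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score of one column (scoring values are illustrative)."""
    score = 0
    for a, b in combinations(col, 2):
        if a == "-" and b == "-":
            continue  # gap-gap pairs contribute nothing
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


def combine_alignments(alignments):
    """Combine alignments of the same sequences by DP over shared cut points.

    Each alignment is a list of gapped strings (one row per sequence).
    Returns (rows of the combined alignment, its total score).
    """
    k = len(alignments[0])
    start = (0,) * k
    end = None
    edges = {}  # state -> list of (next_state, column, column_score)

    for rows in alignments:
        state = start
        for col in zip(*rows):
            nxt = tuple(p + (c != "-") for p, c in zip(state, col))
            if nxt == state:
                continue  # skip all-gap columns
            edges.setdefault(state, []).append((nxt, col, sp_column_score(col)))
            state = nxt
        # All inputs must align the same sequences, so they end at one state.
        assert end is None or end == state, "alignments differ in sequences"
        end = state

    # Every step consumes at least one residue, so ordering states by the
    # sum of their coordinates is a valid topological order.
    best = {start: (0, None, None)}
    for state in sorted(edges, key=sum):
        base = best[state][0]
        for nxt, col, s in edges[state]:
            if nxt not in best or base + s > best[nxt][0]:
                best[nxt] = (base + s, state, col)

    # Trace back from the end state to recover the chosen columns.
    cols = []
    state = end
    while state != start:
        _, prev, col = best[state]
        cols.append(col)
        state = prev
    cols.reverse()
    return ["".join(r) for r in zip(*cols)], best[end][0]


# Alignment A is good on the left half, alignment B on the right half.
aln_a = ["AACC--", "AACC--", "AA--CC"]
aln_b = ["AA--CC", "AA--CC", "--AACC"]
combined, score = combine_alignments([aln_a, aln_b])
print(combined, score)
```

In this toy input each alignment alone scores -8 under the assumed scoring, while the combined path takes the left half of A and the right half of B. The sketch can only switch between alignments at cut points they share exactly; whether and how ComAlign relaxes this is not stated in the abstract.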

Year:  1999        PMID: 10089197     DOI: 10.1093/bioinformatics/15.2.122

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  DbClustal: rapid and reliable global multiple alignments of protein sequences detected by database searches.

Authors:  J D Thompson; F Plewniak; J Thierry; O Poch
Journal:  Nucleic Acids Res       Date:  2000-08-01       Impact factor: 16.971

2.  Discovering common stem-loop motifs in unaligned RNA sequences.

Authors:  J Gorodkin; S L Stricklin; G D Stormo
Journal:  Nucleic Acids Res       Date:  2001-05-15       Impact factor: 16.971

3.  MergeAlign: improving multiple sequence alignment performance by dynamic reconstruction of consensus multiple sequence alignments.

Authors:  Peter W Collingridge; Steven Kelly
Journal:  BMC Bioinformatics       Date:  2012-05-30       Impact factor: 3.169

4.  Accounting for alignment uncertainty in phylogenomics.

Authors:  Martin Wu; Sourav Chatterji; Jonathan A Eisen
Journal:  PLoS One       Date:  2012-01-17       Impact factor: 3.240

5.  Automatic assessment of alignment quality.

Authors:  Timo Lassmann; Erik L L Sonnhammer
Journal:  Nucleic Acids Res       Date:  2005-12-16       Impact factor: 16.971

6.  Efficient representation of uncertainty in multiple sequence alignments using directed acyclic graphs.

Authors:  Joseph L Herman; Ádám Novák; Rune Lyngsø; Adrienn Szabó; István Miklós; Jotun Hein
Journal:  BMC Bioinformatics       Date:  2015-04-01       Impact factor: 3.169

7.  ADLD: a novel graphical representation of protein sequences and its application.

Authors:  Lei Wang; Hui Peng; Jinhua Zheng
Journal:  Comput Math Methods Med       Date:  2014-10-30       Impact factor: 2.238

8.  M-Coffee: combining multiple sequence alignment methods with T-Coffee.

Authors:  Iain M Wallace; Orla O'Sullivan; Desmond G Higgins; Cedric Notredame
Journal:  Nucleic Acids Res       Date:  2006-03-23       Impact factor: 16.971

9.  The M-Coffee web server: a meta-method for computing multiple sequence alignments by combining alternative alignment methods.

Authors:  Sebastien Moretti; Fabrice Armougom; Iain M Wallace; Desmond G Higgins; Cornelius V Jongeneel; Cedric Notredame
Journal:  Nucleic Acids Res       Date:  2007-05-25       Impact factor: 16.971
