Making automated multiple alignments of very large numbers of protein sequences.

Fabian Sievers, David Dineen, Andreas Wilm, Desmond G Higgins.

Abstract

MOTIVATION: Recent developments in sequence alignment software have made possible multiple sequence alignments (MSAs) of >100 000 sequences in reasonable times. At present, there are no systematic analyses concerning the scalability of the alignment quality as the number of aligned sequences is increased.
RESULTS: We benchmarked a wide range of widely used MSA packages using a selection of protein families with some known structures and found that the accuracy of such alignments decreases markedly as the number of sequences grows. This is more or less true of all packages and protein families. The phenomenon is mostly due to the accumulation of alignment errors, rather than problems in guide-tree construction. This is partly alleviated by using iterative refinement or selectively adding sequences. The average accuracy of progressive methods by comparison with structure-based benchmarks can be improved by incorporating information derived from high-quality structural alignments of sequences with solved structures. This suggests that the availability of high-quality curated alignments will have to complement algorithmic and/or software developments in the long term.
AVAILABILITY AND IMPLEMENTATION: Benchmark data used in this study are available at http://www.clustal.org/omega/homfam-20110613-25.tar.gz and http://www.clustal.org/omega/bali3fam-26.tar.gz.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2013        PMID: 23428640     DOI: 10.1093/bioinformatics/btt093

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  19 in total

1.  PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

Authors:  Siavash Mirarab; Nam Nguyen; Sheng Guo; Li-San Wang; Junhyong Kim; Tandy Warnow
Journal:  J Comput Biol       Date:  2014-12-30       Impact factor: 1.479

2.  Simple chained guide trees give high-quality protein multiple sequence alignments.

Authors:  Kieran Boyce; Fabian Sievers; Desmond G Higgins
Journal:  Proc Natl Acad Sci U S A       Date:  2014-07-07       Impact factor: 11.205

3.  Reticulate evolution is favored in influenza niche switching.

Authors:  Eric J Ma; Nichola J Hill; Justin Zabilansky; Kyle Yuan; Jonathan A Runstadler
Journal:  Proc Natl Acad Sci U S A       Date:  2016-04-25       Impact factor: 11.205

4.  A survey for gregarines (Protozoa: Apicomplexa) in arthropods in Spain.

Authors:  A Criado-Fornelio; C Verdú-Expósito; T Martin-Pérez; I Heredero-Bermejo; J Pérez-Serrano; L Guàrdia-Valle; M Panisello-Panisello
Journal:  Parasitol Res       Date:  2016-09-30       Impact factor: 2.289

5.  TreeCluster: Clustering biological sequences using phylogenetic trees.

Authors:  Metin Balaban; Niema Moshiri; Uyen Mai; Xingfan Jia; Siavash Mirarab
Journal:  PLoS One       Date:  2019-08-22       Impact factor: 3.240

6.  Instability in progressive multiple sequence alignment algorithms.

Authors:  Kieran Boyce; Fabian Sievers; Desmond G Higgins
Journal:  Algorithms Mol Biol       Date:  2015-10-09       Impact factor: 1.405

7.  SDT: a virus classification tool based on pairwise sequence alignment and identity calculation.

Authors:  Brejnev Muhizi Muhire; Arvind Varsani; Darren Patrick Martin
Journal:  PLoS One       Date:  2014-09-26       Impact factor: 3.240

8.  DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment.

Authors:  Erik S Wright
Journal:  BMC Bioinformatics       Date:  2015-10-06       Impact factor: 3.169

9.  Systematic exploration of guide-tree topology effects for small protein alignments.

Authors:  Fabian Sievers; Graham M Hughes; Desmond G Higgins
Journal:  BMC Bioinformatics       Date:  2014-10-04       Impact factor: 3.169

10.  High-Resolution Identification of Specificity Determining Positions in the LacI Protein Family Using Ensembles of Sub-Sampled Alignments.

Authors:  Roman Sloutsky; Kristen M Naegle
Journal:  PLoS One       Date:  2016-09-28       Impact factor: 3.240
