
Multiple Sequence Alignment Averaging Improves Phylogeny Reconstruction.

Haim Ashkenazy, Itamar Sela, Eli Levy Karin, Giddy Landan, Tal Pupko.

Abstract

The classic methodology for inferring a phylogenetic tree from sequence data consists of two steps. First, a multiple sequence alignment (MSA) is computed. Then, a tree is reconstructed assuming the MSA is correct. Yet inferred MSAs have been shown to be inaccurate, and alignment errors reduce tree inference accuracy. It was previously proposed that filtering unreliable alignment regions can increase the accuracy of tree inference. However, it was also demonstrated that the benefit of this filtering is often obscured by the resulting loss of phylogenetic signal. In this work, we explore an approach in which, instead of relying on a single MSA, we generate a large set of alternative MSAs and concatenate them into a single SuperMSA. By doing so, we account for phylogenetic signals contained in columns that are not present in the single MSA computed by alignment algorithms. Using simulations, we demonstrate that this approach results, on average, in more accurate trees compared to 1) using an unfiltered MSA and 2) using a single MSA with weights assigned to columns according to their reliability. Next, we explore in which regions of the MSA space our approach is expected to be beneficial. Finally, we provide a simple criterion for deciding whether the extra effort of computing a SuperMSA and inferring a tree from it is worthwhile. Based on these assessments, we expect our methodology to be useful in many cases in which diverged sequences are analyzed. The option to generate such a SuperMSA is available at http://guidance.tau.ac.il.
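
The concatenation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the GUIDANCE server's code; the function name `concatenate_msas`, the dict-based alignment representation, and the toy sequences are all assumptions made for illustration. The sketch only shows how alternative alignments of the same sequences could be joined column-wise into one long alignment; how the alternative MSAs are generated is outside its scope.

```python
def concatenate_msas(msas):
    """Join alternative MSAs of the same sequences into one SuperMSA.

    Each MSA is a dict mapping a taxon name to its gapped row ('-' = gap).
    This representation is a hypothetical choice for this sketch.
    """
    if not msas:
        raise ValueError("at least one MSA is required")

    taxa = sorted(msas[0])
    # Ungapped sequences from the first MSA, used to check that every
    # alternative MSA aligns exactly the same underlying sequences.
    reference = {t: msas[0][t].replace("-", "") for t in taxa}

    for index, msa in enumerate(msas):
        if sorted(msa) != taxa:
            raise ValueError(f"MSA {index} has a different set of taxa")
        if len({len(row) for row in msa.values()}) != 1:
            raise ValueError(f"MSA {index} has rows of unequal length")
        for t in taxa:
            if msa[t].replace("-", "") != reference[t]:
                raise ValueError(f"MSA {index} changes the sequence of {t}")

    # Column-wise concatenation: each taxon's rows are joined in order.
    return {t: "".join(msa[t] for msa in msas) for t in taxa}


# Two toy alternative alignments of the same three sequences.
msa_a = {"sp1": "AC-GT", "sp2": "ACGGT", "sp3": "A--GT"}
msa_b = {"sp1": "ACG-T", "sp2": "ACGGT", "sp3": "A-G-T"}

super_msa = concatenate_msas([msa_a, msa_b])
print(super_msa["sp1"])  # AC-GTACG-T
```

The resulting alignment has as many columns as all input MSAs combined, so a tree inferred from it draws on columns from every alternative alignment rather than from a single one.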


Year: 2019; PMID: 29771363; PMCID: PMC6657586; DOI: 10.1093/sysbio/syy036

Source DB: PubMed; Journal: Syst Biol; ISSN: 1063-5157; Impact factor: 15.683


