
Efficient genome-scale phylogenetic analysis under the duplication-loss and deep coalescence cost models.

Mukul S Bansal, J Gordon Burleigh, Oliver Eulenstein.

Abstract

BACKGROUND: Genomic data provide a wealth of new information for phylogenetic analysis. Yet making use of these data requires phylogenetic methods that can efficiently analyze extremely large data sets and account for processes of gene evolution, such as gene duplication and loss, incomplete lineage sorting (deep coalescence), or horizontal gene transfer, that cause incongruence among gene trees. One such approach is gene tree parsimony, which, given a set of gene trees, seeks a species tree that requires the smallest number of evolutionary events to explain the incongruence of the gene trees. However, the only existing algorithms for gene tree parsimony under the duplication-loss or deep coalescence reconciliation cost are prohibitively slow for large data sets.
RESULTS: We describe novel algorithms for local search heuristics based on subtree pruning and regrafting (SPR) and tree bisection and reconnection (TBR) under the duplication-loss cost, and we show how they can be adapted for the deep coalescence cost. These algorithms improve upon the best existing algorithms for these problems by a factor of n, where n is the number of species in the collection of gene trees. We implemented our new SPR-based local search algorithm for the duplication-loss cost and demonstrate the tremendous improvement in runtime and scalability it provides compared to existing implementations. We also evaluate the performance of our algorithm on three large-scale genomic data sets.
CONCLUSION: Our new algorithms enable, for the first time, gene tree parsimony analyses of thousands of genes from hundreds of taxa using the duplication-loss and deep coalescence reconciliation costs. Thus, this work expands both the size of data sets and the range of evolutionary models that can be incorporated into genome-scale phylogenetic analyses.
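
The gene tree parsimony objective described in the Background can be illustrated with a minimal sketch. It is not the authors' algorithm and is not drawn from their implementation: it only scores a fixed candidate species tree by counting duplications with the standard least-common-ancestor mapping, and it omits losses, deep coalescence, and the SPR/TBR search itself. The tree encoding (nested pairs of species names) and all function names are assumptions made for illustration, and every species in a gene tree is assumed to occur in the species tree.

```python
# Illustrative sketch only: duplication counting by LCA mapping.
# A leaf is a species name (str); an internal node is a pair (left, right).

def clade(tree):
    """Return the set of species names found at the leaves of `tree`."""
    if isinstance(tree, str):
        return frozenset([tree])
    return clade(tree[0]) | clade(tree[1])

def species_clusters(species_tree):
    """List the leaf set (cluster) of every node of the species tree."""
    clusters = [clade(species_tree)]
    if not isinstance(species_tree, str):
        clusters += species_clusters(species_tree[0])
        clusters += species_clusters(species_tree[1])
    return clusters

def lca_map(gene_node, clusters):
    """Map a gene tree node to the smallest species cluster containing it.

    Clusters of a tree are nested or disjoint, so the smallest cluster that
    contains the node's species is the least common ancestor in the species tree.
    """
    species = clade(gene_node)
    return min((c for c in clusters if species <= c), key=len)

def duplications(gene_tree, species_tree):
    """Count gene tree nodes that map to the same species node as a child."""
    clusters = species_clusters(species_tree)

    def count(node):
        if isinstance(node, str):
            return 0
        here = lca_map(node, clusters)
        is_dup = any(lca_map(child, clusters) == here for child in node)
        return int(is_dup) + count(node[0]) + count(node[1])

    return count(gene_tree)

def parsimony_score(gene_trees, species_tree):
    """Sum the duplication counts over all gene trees (lower is better)."""
    return sum(duplications(g, species_tree) for g in gene_trees)

# Hypothetical toy data: two candidate species trees and three gene trees.
candidate_1 = (("A", "B"), "C")
candidate_2 = (("A", "C"), "B")
gene_trees = [(("A", "B"), "C"), (("A", "B"), "C"), (("A", "C"), "B")]
best = min([candidate_1, candidate_2], key=lambda s: parsimony_score(gene_trees, s))
```

A local search heuristic of the kind the abstract refers to would repeat this comparison over the SPR or TBR neighbours of the current species tree; recomputing the score from scratch for each neighbour, as this sketch does, is the naive baseline rather than the speed-up the paper reports.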

Year:  2010        PMID: 20122216      PMCID: PMC3009515          DOI: 10.1186/1471-2105-11-S1-S42

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


