Scaling of accuracy in extremely large phylogenetic trees.

O R Bininda-Emonds, S G Brady, J Kim, M J Sanderson.

Abstract

The accuracy of phylogenetic inference was examined in simulated data sets up to nearly 10,000 taxa, the size of the largest set of homologous genes in existing molecular sequence databases. Even with a simple search algorithm (maximum parsimony without branch swapping), the number of characters needed to estimate 80% of a tree correctly can scale remarkably well at optimal substitution rates (on the order of log N, where N is the number of taxa). In other words, the number of taxa in an analysis can be doubled and only an arithmetic increase in the number of characters is required to maintain the same level of accuracy. Even substitution rates that are much higher than normally used in phylogenetic studies did not affect the scaling too adversely. However, scaling is usually worse than log N for more stringent levels of accuracy. Moreover, errors are not distributed randomly throughout the tree. Shallow nodes are remarkably easy to reconstruct and display favourable log-linear scaling. The deepest nodes are extremely difficult to reconstruct accurately, even with branch swapping, and the scaling is poor. Therefore, the strategy of sequencing large numbers of homologous genes may not always provide global solutions to extreme phylogenetic problems and alternative strategies may be required.
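The log N scaling described in the abstract can be illustrated with a small numerical sketch. It assumes a hypothetical model in which the number of characters required is a + b ln N; the function name `chars_needed` and the constants `a` and `b` are illustrative placeholders, not values reported in the paper.

```python
import math

def chars_needed(n_taxa, a=0.0, b=100.0):
    """Hypothetical model: characters required = a + b * ln(n_taxa).

    a and b are made-up constants for illustration only; they are
    not estimates from the study.
    """
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    return a + b * math.log(n_taxa)

# Under log scaling, each doubling of N adds the same fixed amount,
# b * ln(2), regardless of how large N already is.
for n in (1250, 2500, 5000, 10000):
    extra = chars_needed(2 * n) - chars_needed(n)
    print(f"N={n:>6}: ~{chars_needed(n):.0f} characters; doubling N adds ~{extra:.0f}")
```

Under this assumed model the increment per doubling is b ln 2 (about 69 characters when b = 100), which is the "arithmetic increase" the abstract refers to: the number of taxa grows geometrically while the character requirement grows by a constant step.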

Year:  2001        PMID: 11262972     DOI: 10.1142/9789814447362_0053

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  8 in total

1.  Using confidence set heuristics during topology search improves the robustness of phylogenetic inference.

Authors:  Shirley L Pepke; Davin Butt; Isabelle Nadeau; Andrew J Roger; Christian Blouin
Journal:  J Mol Evol       Date:  2006-12-09       Impact factor: 2.395

2.  Maximum Likelihood Analyses of 3,490 rbcL Sequences: Scalability of Comprehensive Inference versus Group-Specific Taxon Sampling.

Authors:  Alexandros Stamatakis; Markus Göker; Guido W Grimm
Journal:  Evol Bioinform Online       Date:  2010-05-24       Impact factor: 1.625

3.  PhyloMap: an algorithm for visualizing relationships of large sequence data sets and its application to the influenza A virus genome.

Authors:  Jiajie Zhang; Amir Madany Mamlouk; Thomas Martinetz; Suhua Chang; Jing Wang; Rolf Hilgenfeld
Journal:  BMC Bioinformatics       Date:  2011-06-20       Impact factor: 3.169

4.  Short-wavelength sensitive opsin (SWS1) as a new marker for vertebrate phylogenetics.

Authors:  Ilke van Hazel; Francesco Santini; Johannes Müller; Belinda S W Chang
Journal:  BMC Evol Biol       Date:  2006-11-15       Impact factor: 3.260

5.  Evaluating multi-locus phylogenies for species boundaries determination in the genus Diaporthe.

Authors:  Liliana Santos; Artur Alves; Rui Alves
Journal:  PeerJ       Date:  2017-03-28       Impact factor: 2.984

6.  A census-based estimate of Earth's bacterial and archaeal diversity.

Authors:  Stilianos Louca; Florent Mazel; Michael Doebeli; Laura Wegener Parfrey
Journal:  PLoS Biol       Date:  2019-02-04       Impact factor: 8.029

7.  FastTree: computing large minimum evolution trees with profiles instead of a distance matrix.

Authors:  Morgan N Price; Paramvir S Dehal; Adam P Arkin
Journal:  Mol Biol Evol       Date:  2009-04-17       Impact factor: 16.240

8.  Broadly sampled multigene trees of eukaryotes.

Authors:  Hwan Su Yoon; Jessica Grant; Yonas I Tekle; Min Wu; Benjamin C Chaon; Jeffrey C Cole; John M Logsdon; David J Patterson; Debashish Bhattacharya; Laura A Katz
Journal:  BMC Evol Biol       Date:  2008-01-18       Impact factor: 3.260
