Uncovering hidden phylogenetic consensus in large data sets.

Nicholas D Pattengale, Andre J Aberer, Krister M Swenson, Alexandros Stamatakis, Bernard M E Moret.

Abstract

Many of the steps in phylogenetic reconstruction can be confounded by “rogue” taxa—taxa that cannot be placed with assurance anywhere within the tree, indeed, whose location within the tree varies with almost any choice of algorithm or parameters. Phylogenetic consensus methods, in particular, are known to suffer from this problem. In this paper, we provide a novel framework to define and identify rogue taxa. In this framework, we formulate a bicriterion optimization problem, the relative information criterion, that models the net increase in useful information present in the consensus tree when certain taxa are removed from the input data. We also provide an effective greedy heuristic to identify a subset of rogue taxa and use this heuristic in a series of experiments, with both pathological examples from the literature and a collection of large biological data sets. As the presence of rogue taxa in a set of bootstrap replicates can lead to deceivingly poor support values, we propose a procedure to recompute support values in light of the rogue taxa identified by our algorithm; applying this procedure to our biological data sets caused a large number of edges to move from “unsupported” to “supported” status, indicating that many existing phylogenies should be recomputed and reevaluated to reduce any inaccuracies introduced by rogue taxa. We also discuss the implementation issues encountered while integrating our algorithm into RAxML v7.2.7, particularly those dealing with scaling up the analyses. This integration enables practitioners to benefit from our algorithm in the analysis of very large data sets (up to 2,500 taxa and 10,000 trees, although we present the results of even larger analyses).
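To make the idea of greedily pruning rogue taxa concrete, here is a minimal sketch in Python. It is not the relative information criterion defined in the paper, nor the RAxML implementation. As simplifying assumptions, trees are rooted and represented as sets of clades, the score is simply the number of nontrivial clades in the majority-rule consensus, and a taxon is dropped only when doing so strictly raises that count. The names `restrict`, `consensus_size`, and `greedy_rogues`, and the toy data, are hypothetical.

```python
from collections import Counter


def restrict(tree, dropped, n_kept):
    """Remove dropped taxa from each clade; keep only nontrivial clades."""
    pruned = {clade - dropped for clade in tree}
    return {c for c in pruned if 1 < len(c) < n_kept}


def consensus_size(trees, taxa, dropped, threshold=0.5):
    """Count clades found in more than `threshold` of the pruned trees."""
    n_kept = len(taxa) - len(dropped)
    counts = Counter()
    for tree in trees:
        counts.update(restrict(tree, dropped, n_kept))
    return sum(1 for c in counts.values() if c > threshold * len(trees))


def greedy_rogues(trees, taxa, threshold=0.5):
    """Toy greedy search: repeatedly drop the taxon that most increases
    the number of consensus clades; stop when no single drop helps.
    (Assumed score, not the paper's relative information criterion.)"""
    dropped = frozenset()
    best = consensus_size(trees, taxa, dropped, threshold)
    while len(taxa) - len(dropped) > 3:
        score, taxon = max(
            (consensus_size(trees, taxa, dropped | {t}, threshold), t)
            for t in sorted(taxa - dropped)
        )
        if score <= best:
            break
        dropped, best = dropped | {taxon}, score
    return dropped, best


def make_tree(*clades):
    """Build a tree from strings, one string of taxon letters per clade."""
    return {frozenset(c) for c in clades}


# Toy data: backbone ((((A,B),C),D),E) with taxon R attached in a
# different place in each of four trees.
taxa = frozenset("ABCDER")
trees = [
    make_tree("AR", "ABR", "ABCR", "ABCDR"),  # R sister to A
    make_tree("AB", "CR", "ABCR", "ABCDR"),   # R sister to C
    make_tree("AB", "ABC", "DR", "ABCDR"),    # R sister to D
    make_tree("AB", "ABC", "ABCD", "ER"),     # R sister to E
]

before = consensus_size(trees, taxa, frozenset())
dropped, after = greedy_rogues(trees, taxa)
print(before, sorted(dropped), after)
```

On this toy input the majority-rule consensus has two nontrivial clades with all taxa present and three once R is removed, so the sketch reports R as the only taxon worth dropping. Real data sets use unrooted trees (bipartitions rather than clades) and a score that also accounts for the information lost with each removed taxon.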

Year:  2011        PMID: 21301032     DOI: 10.1109/TCBB.2011.28

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  9 in total

1.  The evolution of Dscam genes across the arthropods.

Authors:  Sophie A O Armitage; Rebecca Y Freiburg; Joachim Kurtz; Ignacio G Bravo
Journal:  BMC Evol Biol       Date:  2012-04-13       Impact factor: 3.260

2.  RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937

3.  Enumerating all maximal frequent subtrees in collections of phylogenetic trees.

Authors:  Akshay Deepak; David Fernández-Baca
Journal:  Algorithms Mol Biol       Date:  2014-06-18       Impact factor: 1.405

4.  Concatabominations: identifying unstable taxa in morphological phylogenetics using a heuristic extension to safe taxonomic reduction.

Authors:  Karen Siu-Ting; Davide Pisani; Christopher J Creevey; Mark Wilkinson
Journal:  Syst Biol       Date:  2014-09-02       Impact factor: 15.683

5.  The ropAe gene encodes a porin-like protein involved in copper transit in Rhizobium etli CFN42.

Authors:  Antonio González-Sánchez; Ciro A Cubillas; Fabiola Miranda; Araceli Dávalos; Alejandro García-de Los Santos
Journal:  Microbiologyopen       Date:  2017-12-27       Impact factor: 3.139

6.  Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice.

Authors:  Andre J Aberer; Denis Krompass; Alexandros Stamatakis
Journal:  Syst Biol       Date:  2012-09-06       Impact factor: 15.683

7.  A scalable method for identifying frequent subtrees in sets of large phylogenetic trees.

Authors:  Avinash Ramu; Tamer Kahveci; J Gordon Burleigh
Journal:  BMC Bioinformatics       Date:  2012-10-03       Impact factor: 3.169

8.  The Evolution of the Secreted Regulatory Protein Progranulin.

Authors:  Roger G E Palfree; Hugh P J Bennett; Andrew Bateman
Journal:  PLoS One       Date:  2015-08-06       Impact factor: 3.240

9.  HPV16 variants distribution in invasive cancers of the cervix, vulva, vagina, penis, and anus.

Authors:  Sara Nicolás-Párraga; Carolina Gandini; Ville N Pimenoff; Laia Alemany; Silvia de Sanjosé; F Xavier Bosch; Ignacio G Bravo
Journal:  Cancer Med       Date:  2016-09-21       Impact factor: 4.452

