
How many bootstrap replicates are necessary?

Nicholas D Pattengale, Masoud Alipour, Olaf R P Bininda-Emonds, Bernard M E Moret, Alexandros Stamatakis.

Abstract

Phylogenetic bootstrapping (BS) is a standard technique for inferring confidence values on phylogenetic trees that is based on reconstructing many trees from minor variations of the input data, trees called replicates. BS is used with all phylogenetic reconstruction approaches, but we focus here on one of the most popular, maximum likelihood (ML). Because ML inference is so computationally demanding, it has proved too expensive to date to assess the impact of the number of replicates used in BS on the relative accuracy of the support values. For the same reason, a rather small number (typically 100) of BS replicates are computed in real-world studies. Stamatakis et al. recently introduced a BS algorithm that is 1 to 2 orders of magnitude faster than previous techniques, while yielding qualitatively comparable support values, making an experimental study possible. In this article, we propose stopping criteria (that is, thresholds computed at runtime to determine when enough replicates have been generated), and we report on the first large-scale experimental study to assess the effect of the number of replicates on the quality of support values, including the performance of our proposed criteria. We run our tests on 17 diverse real-world DNA datasets (single-gene as well as multi-gene), which include 125-2,554 taxa. We find that our stopping criteria typically stop computations after 100-500 replicates (although the most conservative criterion may continue for several thousand replicates) while producing support values that correlate at better than 99.5% with the reference values on the best ML trees. Significantly, we also find that the stopping criteria can recommend very different numbers of replicates for different datasets of comparable sizes.
Our results are thus twofold: (i) they give the first experimental assessment of the effect of the number of BS replicates on the quality of support values returned through BS, and (ii) they validate our proposals for stopping criteria. Practitioners will no longer have to enter a guess nor worry about the quality of support values; moreover, with most counts of replicates in the 100-500 range, robust BS under ML inference becomes computationally practical for most datasets. The complete test suite is available at http://lcbb.epfl.ch/BS.tar.bz2, and BS with our stopping criteria is included in the latest release of RAxML v7.2.5, available at http://wwwkramer.in.tum.de/exelixis/software.html.
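The abstract describes stopping criteria only as thresholds computed at runtime, so the following is a minimal sketch of one plausible test of that kind, not the authors' implementation or the RAxML code. It assumes each replicate tree is given as a set of hashable bipartition labels, splits the replicates into two random halves, and checks whether the bipartition frequencies of the two halves agree. The function names, the 0.99 correlation cutoff, the 0.99 pass fraction, and the 100 permutations are all illustrative assumptions.

```python
import random
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # Correlation is undefined for a constant vector; this sketch
        # treats identical vectors as agreeing and anything else as not.
        return 1.0 if xs == ys else 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def split_half_correlation(replicates, rng):
    """Correlate bipartition frequencies between two random halves."""
    order = list(replicates)
    rng.shuffle(order)
    half = len(order) // 2
    first, second = order[:half], order[half:2 * half]
    count_a = Counter(split for tree in first for split in tree)
    count_b = Counter(split for tree in second for split in tree)
    splits = list(set(count_a) | set(count_b))
    freq_a = [count_a[s] / half for s in splits]
    freq_b = [count_b[s] / half for s in splits]
    return pearson(freq_a, freq_b)


def enough_replicates(replicates, n_permutations=100,
                      min_corr=0.99, min_pass=0.99, seed=0):
    """Return True if the two halves agree in enough random splits.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    if len(replicates) < 4:
        return False  # too few trees to form two meaningful halves
    rng = random.Random(seed)
    passes = sum(
        split_half_correlation(replicates, rng) >= min_corr
        for _ in range(n_permutations)
    )
    return passes / n_permutations >= min_pass
```

In a bootstrap loop, such a check would be called after every batch of new replicates (for example, every 50), and the run would stop the first time it returns True. Under these assumptions a dataset with stable support values stops early, while one with noisy support keeps generating replicates, which is consistent with the observation that datasets of comparable size can need very different numbers of replicates.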

Year:  2010        PMID: 20377449     DOI: 10.1089/cmb.2009.0179

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  282 in total

1.  Multiple rod-cone and cone-rod photoreceptor transmutations in snakes: evidence from visual opsin gene expression.

Authors:  Bruno F Simões; Filipa L Sampaio; Ellis R Loew; Kate L Sanders; Robert N Fisher; Nathan S Hart; David M Hunt; Julian C Partridge; David J Gower
Journal:  Proc Biol Sci       Date:  2016-01-27       Impact factor: 5.349

2.  Novel Reassortant Human-Like H3N2 and H3N1 Influenza A Viruses Detected in Pigs Are Virulent and Antigenically Distinct from Swine Viruses Endemic to the United States.

Authors:  Daniela S Rajão; Phillip C Gauger; Tavis K Anderson; Nicola S Lewis; Eugenio J Abente; Mary Lea Killian; Daniel R Perez; Troy C Sutton; Jianqiang Zhang; Amy L Vincent
Journal:  J Virol       Date:  2015-08-26       Impact factor: 5.103

3.  Species Diversity With Comprehensive Annotations of Wood-Inhabiting Poroid and Corticioid Fungi in Uzbekistan.

Authors:  Yusufjon Gafforov; Alexander Ordynets; Ewald Langer; Manzura Yarasheva; Adriana de Mello Gugliotta; Dmitry Schigel; Lorenzo Pecoraro; Yu Zhou; Lei Cai; Li-Wei Zhou
Journal:  Front Microbiol       Date:  2020-12-09       Impact factor: 5.640

4.  An AlgU-Regulated Antisense Transcript Encoded within the Pseudomonas syringae fleQ Gene Has a Positive Effect on Motility.

Authors:  Eric Markel; Hollie Dalenberg; Caroline L Monteil; Boris A Vinatzer; Bryan Swingle
Journal:  J Bacteriol       Date:  2018-03-12       Impact factor: 3.490

5.  Description of Geodermatophilus amargosae sp. nov., to accommodate the not validly named Geodermatophilus obscurus subsp. amargosae (Luedemann, 1968).

Authors:  Maria del Carmen Montero-Calasanz; Markus Göker; Manfred Rohde; Cathrin Spröer; Peter Schumann; Shanmugam Mayilraj; Michael Goodfellow; Hans-Peter Klenk
Journal:  Curr Microbiol       Date:  2013-11-08       Impact factor: 2.188

6.  Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes.

Authors:  Luca Pozzi; Jason A Hodgson; Andrew S Burrell; Kirstin N Sterner; Ryan L Raaum; Todd R Disotell
Journal:  Mol Phylogenet Evol       Date:  2014-02-28       Impact factor: 4.286

7.  Different from tracheophytes, liverworts commonly have mixed 35S and 5S arrays.

Authors:  Aretuza Sousa; Julia Bechteler; Eva M Temsch; Susanne S Renner
Journal:  Ann Bot       Date:  2020-06-01       Impact factor: 4.357

8.  Putting pleiotropy and selection into context defines a new paradigm for interpreting genetic data.

Authors:  Irene M Predazzi; Antonis Rokas; Amos Deinard; Nathalie Schnetz-Boutaud; Nicholas D Williams; William S Bush; Alessandra Tacconelli; Klaus Friedrich; Sergio Fazio; Giuseppe Novelli; Jonathan L Haines; Giorgio Sirugo; Scott M Williams
Journal:  Circ Cardiovasc Genet       Date:  2013-04-24

9.  Fast evolving 18S rRNA sequences from Solenogastres (Mollusca) resist standard PCR amplification and give new insights into mollusk substitution rate heterogeneity.

Authors:  Achim Meyer; Christiane Todt; Nina T Mikkelsen; Bernhard Lieb
Journal:  BMC Evol Biol       Date:  2010-03-09       Impact factor: 3.260

10.  Complete genome sequence of Thermocrinis albus type strain (HI 11/12).

Authors:  Reinhard Wirth; Johannes Sikorski; Evelyne Brambilla; Monica Misra; Alla Lapidus; Alex Copeland; Matt Nolan; Susan Lucas; Feng Chen; Hope Tice; Jan-Fang Cheng; Cliff Han; John C Detter; Roxane Tapia; David Bruce; Lynne Goodwin; Sam Pitluck; Amrita Pati; Iain Anderson; Natalia Ivanova; Konstantinos Mavromatis; Natalia Mikhailova; Amy Chen; Krishna Palaniappan; Yvonne Bilek; Thomas Hader; Miriam Land; Loren Hauser; Yun-Juan Chang; Cynthia D Jeffries; Brian J Tindall; Manfred Rohde; Markus Göker; James Bristow; Jonathan A Eisen; Victor Markowitz; Philip Hugenholtz; Nikos C Kyrpides; Hans-Peter Klenk
Journal:  Stand Genomic Sci       Date:  2010-03-30