Bayesian selection of misspecified models is overconfident and may cause spurious posterior probabilities for phylogenetic trees.

Ziheng Yang, Tianqi Zhu.

Abstract

The Bayesian method is noted to produce spuriously high posterior probabilities for phylogenetic trees in analysis of large datasets, but the precise reasons for this overconfidence are unknown. In general, the performance of Bayesian selection of misspecified models is poorly understood, even though this is of great scientific interest since models are never true in real data analysis. Here we characterize the asymptotic behavior of Bayesian model selection and show that when the competing models are equally wrong, Bayesian model selection exhibits surprising and polarized behaviors in large datasets, supporting one model with full force while rejecting the others. If one model is slightly less wrong than the other, the less wrong model will eventually win when the amount of data increases, but the method may become overconfident before it becomes reliable. We suggest that this extreme behavior may be a major factor for the spuriously high posterior probabilities for evolutionary trees. The philosophical implications of our results to the application of Bayesian model selection to evaluate opposing scientific hypotheses are yet to be explored, as are the behaviors of non-Bayesian methods in similar situations.
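The polarized behavior in the "equally wrong" case can be illustrated with a toy version of the fair-coin setting named in the keywords. The specifics below are assumptions chosen for illustration and are not taken from the abstract: the data are tosses of a fair coin (true p = 0.5), the two competing point models are p = 0.4 and p = 0.6, each with prior probability 1/2, and the 0.99 cutoff for calling a posterior "polarized" is arbitrary. The function names are hypothetical.

```python
import math
import random


def posterior_h1(n, k, p1=0.4, p2=0.6):
    """Posterior probability of model H1 (heads probability p1) against
    H2 (heads probability p2), given k heads in n tosses and prior 1/2 each."""
    log_l1 = k * math.log(p1) + (n - k) * math.log(1 - p1)
    log_l2 = k * math.log(p2) + (n - k) * math.log(1 - p2)
    d = log_l2 - log_l1  # log-likelihood ratio in favor of H2
    # Numerically stable evaluation of 1 / (1 + exp(d)).
    if d >= 0:
        e = math.exp(-d)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(d))


def fraction_polarized(n, reps=200, cutoff=0.99, seed=1):
    """Fraction of simulated fair-coin datasets of size n in which one of the
    two equally wrong models receives posterior probability above `cutoff`."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        k = sum(rng.random() < 0.5 for _ in range(n))  # heads from a fair coin
        post = posterior_h1(n, k)
        if post > cutoff or post < 1.0 - cutoff:
            extreme += 1
    return extreme / reps


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, fraction_polarized(n))
```

In this sketch the log posterior odds equal (n - 2k) log(1.5), and n - 2k fluctuates on the order of sqrt(n) under a fair coin, so the share of datasets that strongly favor one of the two wrong models grows with n even though neither model is closer to the truth.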

Keywords:  Bayesian inference; fair-coin paradox; model selection; posterior probability; star-tree paradox

Year: 2018   PMID: 29432193   PMCID: PMC5828583   DOI: 10.1073/pnas.1712673115

Source DB: PubMed   Journal: Proc Natl Acad Sci U S A   ISSN: 0027-8424   Impact factor: 11.205


