Assessment of substitution model adequacy using frequentist and Bayesian methods.

Jennifer Ripplinger, Jack Sullivan

Abstract

To have confidence in model-based phylogenetic methods, such as maximum likelihood (ML) and Bayesian analyses, one must use an appropriate model of molecular evolution identified using statistically rigorous criteria. Although model selection methods such as the likelihood ratio test and Akaike information criterion are widely used in the phylogenetic literature, they cannot reject all candidate models when every one of them fits the data inadequately. Two methods, however, assess absolute model adequacy: the frequentist Goldman-Cox (GC) test and Bayesian posterior predictive simulations (PPSs), both of which are commonly used in conjunction with the multinomial log likelihood test statistic. In this study, we use empirical and simulated data to evaluate the adequacy of common substitution models using both frequentist and Bayesian methods and compare the results with those obtained with model selection methods. In addition, we investigate the relationship between model adequacy and performance in ML and Bayesian analyses in terms of topology, branch lengths, and bipartition support. We show that tests of model adequacy based on the multinomial likelihood often fail to reject simple substitution models, especially when the models incorporate among-site rate variation (ASRV), and normally fail to reject models less complex than those chosen by model selection methods. In addition, we find that PPSs often fail to reject simpler models than the GC test does. Use of the simplest substitution models not rejected based on fit normally results in similar but not identical estimates of tree topology and branch lengths. In addition, use of the simplest adequate substitution models can affect estimates of bipartition support, although these differences are often small, with the largest differences confined to poorly supported nodes. We also find that alternative assumptions about ASRV can affect tree topology, tree length, and bipartition support.
Our results suggest that using the simplest substitution models not rejected based on fit may be a valid alternative to implementing more complex models identified by model selection methods. However, all common substitution models may fail to recover the correct topology and assign appropriate bipartition support if the true tree shape is difficult to estimate regardless of model adequacy.
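
As an illustration of the multinomial log likelihood test statistic mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the standard unconstrained form T(X) = sum_i n_i ln(n_i) - N ln(N), where n_i is the number of times the i-th distinct site pattern (alignment column) occurs and N is the number of sites. The function names, the toy alignment, and the choice of the upper tail for the predictive p-value are assumptions made for this example; in practice the simulated statistics would come from data sets simulated under the model being tested.

```python
from collections import Counter
from math import log


def multinomial_log_likelihood(alignment):
    """Unconstrained (multinomial) log likelihood of an alignment.

    alignment: list of equal-length sequences given as strings.
    Returns sum_i n_i * ln(n_i) - N * ln(N), where n_i counts each
    distinct column (site pattern) and N is the number of columns.
    """
    if not alignment or not alignment[0]:
        raise ValueError("alignment must contain at least one site")
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) yields one tuple per column, i.e. one site pattern.
    pattern_counts = Counter(zip(*alignment))
    return (sum(n * log(n) for n in pattern_counts.values())
            - n_sites * log(n_sites))


def upper_tail_fraction(observed, simulated):
    """Fraction of simulated statistics >= the observed statistic.

    Assumption: the upper tail is the one of interest; which tail
    applies depends on how the statistic is defined for a given test.
    """
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    return sum(1 for t in simulated if t >= observed) / len(simulated)


# Toy alignment (hypothetical data): two taxa, three sites.
toy_alignment = ["AAC", "AAG"]
observed_stat = multinomial_log_likelihood(toy_alignment)
```

A small predictive p-value under this convention would indicate that the observed statistic is unusual relative to data simulated under the model, which is the sense in which a model is "rejected" as inadequate.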

Year:  2010        PMID: 20616145      PMCID: PMC2981515          DOI: 10.1093/molbev/msq168

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


