Statistical binning leads to profound model violation due to gene tree error incurred by trying to avoid gene tree error.

Richard H Adams, Todd A Castoe.

Abstract

Fundamental to all phylogenomic studies is the notion that increasing the amount of data - to entire genomes when possible - will increase the accuracy of phylogenetic inference. Simply adding more data does not, however, guarantee phylogenomic inferences will be more accurate. Even genome-scale reconstructions of species histories can suffer the effects of both incomplete lineage sorting (ILS) and gene tree estimation error (GTEE). Weighted statistical binning was originally proposed as a technique to assist the avian phylogenomics project in solving the bird tree of life, which has long eluded resolution as a result of both ILS and GTEE. These so-called "statistical binning procedures" seek to overcome GTEE by concatenating loci into longer multi-locus "supergenes" that are used to reconstruct a species tree under the assumption that the supergene tree set is an accurate estimate of the true underlying gene tree distribution. Here we evaluate the performance of the method using the original avian phylogenomics dataset. Our results suggest that statistical binning constructs false supergenes that concatenate loci with different coalescent histories more often than not: >92% of supergenes comprise discordant loci. Our results underscore a major logical inconsistency: GTEE - the sole justification for using statistical binning instead of standard concatenation - also makes these methods unreliable. These findings underscore the need for developing new robust frameworks for phylogenomic inference that more appropriately accommodate GTEE and ILS at a genome-wide scale.
Copyright © 2019 Elsevier Inc. All rights reserved.
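The headline statistic in the abstract (the share of supergenes that combine loci with different histories) can be illustrated with a small toy calculation. This is only a sketch under simplifying assumptions, not the authors' protocol: it assumes each locus already has a known topology label (as it would in a simulation), and the names `discordant_fraction`, `bins`, and `topology` are hypothetical, as are the toy loci and trees.

```python
from typing import Dict, List


def discordant_fraction(bins: Dict[str, List[str]], topology: Dict[str, str]) -> float:
    """Return the fraction of multi-locus bins whose loci do not share one topology.

    bins     -- hypothetical mapping: supergene name -> list of locus names
    topology -- hypothetical mapping: locus name -> topology label (assumed known)
    """
    # Only bins with two or more loci can be internally discordant.
    multi = [loci for loci in bins.values() if len(loci) > 1]
    if not multi:
        return 0.0
    # A bin is discordant when its loci carry more than one distinct topology.
    discordant = sum(
        1 for loci in multi if len({topology[locus] for locus in loci}) > 1
    )
    return discordant / len(multi)


# Toy data (invented for illustration, not taken from the avian dataset).
topology = {
    "locus1": "((A,B),C)",
    "locus2": "((A,B),C)",
    "locus3": "((A,C),B)",
    "locus4": "((B,C),A)",
}
bins = {
    "supergene1": ["locus1", "locus2"],  # concordant
    "supergene2": ["locus3", "locus4"],  # discordant
    "supergene3": ["locus1", "locus3"],  # discordant
}

fraction = discordant_fraction(bins, topology)
```

In this toy case two of the three supergenes mix loci with different topologies. With real data the true gene trees are unknown, so a label-matching check like this would itself be affected by GTEE.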

Keywords:  Coalescent; Concatenation; Phylogenetic inference; Phylogenomics; Species trees; Supergene

Year:  2019        PMID: 30790674     DOI: 10.1016/j.ympev.2019.02.012

Source DB:  PubMed          Journal:  Mol Phylogenet Evol        ISSN: 1055-7903            Impact factor:   4.286


  6 in total

1.  PhyloWGA: chromosome-aware phylogenetic interrogation of whole genome alignments.

Authors:  Richard H Adams; Todd A Castoe; Michael DeGiorgio
Journal:  Bioinformatics       Date:  2021-07-27       Impact factor: 6.937

2.  Of Traits and Trees: Probabilistic Distances under Continuous Trait Models for Dissecting the Interplay among Phylogeny, Model, and Data.

Authors:  Richard H Adams; Heath Blackmon; Michael DeGiorgio
Journal:  Syst Biol       Date:  2021-06-16       Impact factor: 15.683

3.  Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods.

Authors:  Richard H Adams; Todd A Castoe
Journal:  MethodsX       Date:  2019-09-24

4.  Comprehensive phylogenomic analyses re-write the evolution of parasitism within cynipoid wasps.

Authors:  Bonnie B Blaimer; Dietrich Gotzek; Seán G Brady; Matthew L Buffington
Journal:  BMC Evol Biol       Date:  2020-11-23       Impact factor: 3.260

5.  An empirical assessment of a single family-wide hybrid capture locus set at multiple evolutionary timescales in Asteraceae.

Authors:  Katy E Jones; Tomáš Fér; Roswitha E Schmickl; Rebecca B Dikow; Vicki A Funk; Sonia Herrando-Moraira; Paul R Johnston; Norbert Kilian; Carolina M Siniscalchi; Alfonso Susanna; Marek Slovák; Ramhari Thapa; Linda E Watson; Jennifer R Mandel
Journal:  Appl Plant Sci       Date:  2019-10-25       Impact factor: 1.936

6.  Phylogenetic informativeness analyses to clarify past diversification processes in Cucurbitaceae.

Authors:  Sidonie Bellot; Thomas C Mitchell; Hanno Schaefer
Journal:  Sci Rep       Date:  2020-01-16       Impact factor: 4.379
