A confounding effect of missing data on character conflict in maximum likelihood and Bayesian MCMC phylogenetic analyses.

Mark P Simmons

Abstract

Contrived and simulated examples were used to quantify the range of conditions in which maximum likelihood and Bayesian MCMC methods are biased in favor of phylogenetic signal present in globally sampled characters over that present in conflicting locally sampled characters (those with missing data). The bias occurs in both the optimal tree identified and the branch supports, even when there are more locally sampled characters supporting the conflicting topology. The bias can lead to high bootstrap support, SH-like aLRT support (up to 100%), and posterior probabilities for the conflicting clades. The bias can occur even when only a single terminal has missing data. The bias is not limited to likelihood methods that only ever present a single, fully resolved optimal tree (as in PhyML and RAxML); it can also occur in branch-and-bound PAUP* searches. The bias persists despite sampling numerous characters, and it is consistently unidirectional. The bias may occur in the context of incongruence between gene trees as well as within a single gene wherein terminals have different sequence lengths caused by DNA-amplification differences or gaps caused by indels. This bias is another example wherein commonly implemented parametric phylogenetic methods interpret ambiguity as support. In contrast, parsimony is robust to the bias.
Copyright © 2014 Elsevier Inc. All rights reserved.

Keywords:  Bootstrap; Parsimony; Phylogenetic bias; SH-like aLRT; Supermatrix; Unlinked branch lengths

Year: 2014   PMID: 25173567   DOI: 10.1016/j.ympev.2014.08.021

Source DB: PubMed   Journal: Mol Phylogenet Evol   ISSN: 1055-7903   Impact factor: 4.286


  8 in total

1.  Which morphological characters are influential in a Bayesian phylogenetic analysis? Examples from the earliest osteichthyans.

Authors:  Benedict King
Journal:  Biol Lett       Date:  2019-07-17       Impact factor: 3.703

2.  Phylogeny Estimation Given Sequence Length Heterogeneity.

Authors:  Vladimir Smirnov; Tandy Warnow
Journal:  Syst Biol       Date:  2021-02-10       Impact factor: 15.683

3.  RADseq dataset with 90% missing data fully resolves recent radiation of Petalidium (Acanthaceae) in the ultra-arid deserts of Namibia.

Authors:  Erin A Tripp; Yi-Hsin Erica Tsai; Yongbin Zhuang; Kyle G Dexter
Journal:  Ecol Evol       Date:  2017-08-30       Impact factor: 2.912

4.  Uneven Missing Data Skew Phylogenomic Relationships within the Lories and Lorikeets.

Authors:  Brian Tilston Smith; William M Mauck; Brett W Benz; Michael J Andersen
Journal:  Genome Biol Evol       Date:  2020-07-01       Impact factor: 3.416

5.  Straight From the Plastome: Molecular Phylogeny and Morphological Evolution of Fargesia (Bambusoideae: Poaceae).

Authors:  Yun Zhou; Yu-Qu Zhang; Xiao-Cheng Xing; Jian-Qiang Zhang; Yi Ren
Journal:  Front Plant Sci       Date:  2019-08-06       Impact factor: 5.753

6.  X-chromosomal STR based genetic polymorphisms and demographic history of Sri Lankan ethnicities and their relationship with global populations.

Authors:  Nandika Perera; Gayani Galhena; Gaya Ranawaka
Journal:  Sci Rep       Date:  2021-06-17       Impact factor: 4.379

7.  Population genetic study of 34 X-Chromosome markers in 5 main ethnic groups of China.

Authors:  Suhua Zhang; Yingnan Bian; Li Li; Kuan Sun; Zheng Wang; Qi Zhao; Lagabaiyila Zha; Jifeng Cai; Yuzhen Gao; Chaoneng Ji; Chengtao Li
Journal:  Sci Rep       Date:  2015-12-04       Impact factor: 4.379

8.  Split-inducing indels in phylogenomic analysis.

Authors:  Alexander Donath; Peter F Stadler
Journal:  Algorithms Mol Biol       Date:  2018-07-16       Impact factor: 1.405
