
Modeling Site Heterogeneity with Posterior Mean Site Frequency Profiles Accelerates Accurate Phylogenomic Estimation.

Huai-Chun Wang, Bui Quang Minh, Edward Susko, Andrew J Roger.

Abstract

Proteins have distinct structural and functional constraints at different sites that lead to site-specific preferences for particular amino acid residues as the sequences evolve. Heterogeneity in the amino acid substitution process between sites is not modeled by commonly used empirical amino acid exchange matrices. Such model misspecification can lead to artefacts in phylogenetic estimation such as long-branch attraction. Although sophisticated site-heterogeneous mixture models have been developed to address this problem in both Bayesian and maximum likelihood (ML) frameworks, their formidable computational time and memory usage severely limit their use in large phylogenomic analyses. Here we propose a posterior mean site frequency (PMSF) method as a rapid and efficient approximation to full empirical profile mixture models for ML analysis. The PMSF approach assigns to each site a conditional mean amino acid frequency profile, calculated from a mixture model fitted to the data using a preliminary guide tree. These PMSF profiles can then be used for in-depth tree-searching in place of the full mixture model. Compared with widely used empirical mixture models with $k$ classes, our implementation of PMSF in IQ-TREE (http://www.iqtree.org) speeds up the computation by approximately $k$/1.5-fold and requires a small fraction of the RAM. Furthermore, this speedup allows, for the first time, full nonparametric bootstrap analyses to be conducted under complex site-heterogeneous models on large concatenated data matrices. Our simulations and empirical data analyses demonstrate that PMSF can effectively ameliorate long-branch attraction artefacts. In some empirical and simulation settings PMSF provided more accurate estimates of phylogenies than the mixture models from which they derive.
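
The core PMSF step described in the abstract, a conditional mean frequency profile per site, can be illustrated with a minimal Python sketch. This is not the IQ-TREE implementation. It assumes the per-site likelihoods under each mixture class (given the guide tree) have already been computed elsewhere and are simply passed in. The function name `pmsf_profiles`, its argument names, and the toy numbers (2 classes, 3 states instead of 20 amino acids) are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pmsf_profiles(class_weights, class_freqs, site_class_lik):
    """Posterior-mean frequency profile per site (illustrative sketch).

    class_weights  : shape (k,)        mixture weights of the k classes
    class_freqs    : shape (k, s)      frequency profile of each class
    site_class_lik : shape (n, k)      likelihood of each site under each
                                       class, assumed precomputed on a guide tree
    returns        : shape (n, s)      one mean profile per site
    """
    class_weights = np.asarray(class_weights, dtype=float)
    class_freqs = np.asarray(class_freqs, dtype=float)
    site_class_lik = np.asarray(site_class_lik, dtype=float)

    # Joint weight of (site, class): prior class weight times site likelihood.
    joint = site_class_lik * class_weights
    # Normalise each row to get the posterior probability of each class per site.
    posterior = joint / joint.sum(axis=1, keepdims=True)
    # Average the class profiles using the posterior class probabilities.
    return posterior @ class_freqs

# Toy input (assumed values, not from the paper).
weights = [0.5, 0.5]
freqs = [[0.8, 0.1, 0.1],
         [0.1, 0.1, 0.8]]
lik = [[0.9, 0.1],   # site 0 strongly favours class 0
       [0.5, 0.5]]   # site 1 is uninformative about the class

profiles = pmsf_profiles(weights, freqs, lik)
# Site 0 gets a profile close to class 0: (0.73, 0.10, 0.17).
# Site 1 gets the plain average of both classes: (0.45, 0.10, 0.45).
print(profiles)
```

In this sketch each site ends up with a single fixed profile, so a later tree search would evaluate one profile per site rather than all $k$ classes, which is the intuition behind the reported speedup. A practical implementation would likely work with log-likelihoods to avoid numerical underflow.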

Year:  2018        PMID: 28950365     DOI: 10.1093/sysbio/syx068

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   15.683


  85 in total

1.  Diversification of giant and large eukaryotic dsDNA viruses predated the origin of modern eukaryotes.

Authors:  Julien Guglielmini; Anthony C Woo; Mart Krupovic; Patrick Forterre; Morgan Gaia
Journal:  Proc Natl Acad Sci U S A       Date:  2019-09-10       Impact factor: 11.205

2.  Dinoflagellates with relic endosymbiont nuclei as models for elucidating organellogenesis.

Authors:  Chihiro Sarai; Goro Tanifuji; Takuro Nakayama; Ryoma Kamikawa; Kazuya Takahashi; Euki Yazaki; Eriko Matsuo; Hideaki Miyashita; Ken-Ichiro Ishida; Mitsunori Iwataki; Yuji Inagaki
Journal:  Proc Natl Acad Sci U S A       Date:  2020-02-24       Impact factor: 11.205

3.  Secondary Plastids of Euglenids and Chlorarachniophytes Function with a Mix of Genes of Red and Green Algal Ancestry.

Authors:  Rafael I Ponce-Toledo; David Moreira; Purificación López-García; Philippe Deschamps
Journal:  Mol Biol Evol       Date:  2018-09-01       Impact factor: 16.240

4.  An estimate of the deepest branches of the tree of life from ancient vertically evolving genes.

Authors:  Edmund R R Moody; Tara A Mahendrarajah; Nina Dombrowski; James W Clark; Celine Petitjean; Pierre Offre; Gergely J Szöllősi; Anja Spang; Tom A Williams
Journal:  Elife       Date:  2022-02-22       Impact factor: 8.140

5.  Phylogenomic resolution of the root of Panpulmonata, a hyperdiverse radiation of gastropods: new insight into the evolution of air breathing.

Authors:  Patrick J Krug; Serena A Caplins; Krisha Algoso; Kanique Thomas; Ángel A Valdés; Rachael Wade; Nur Leena W S Wong; Douglas J Eernisse; Kevin M Kocot
Journal:  Proc Biol Sci       Date:  2022-04-06       Impact factor: 5.349

6.  Proposal of the reverse flow model for the origin of the eukaryotic cell based on comparative analyses of Asgard archaeal metabolism.

Authors:  Anja Spang; Courtney W Stairs; Nina Dombrowski; Laura Eme; Jonathan Lombard; Eva F Caceres; Chris Greening; Brett J Baker; Thijs J G Ettema
Journal:  Nat Microbiol       Date:  2019-04-01       Impact factor: 17.745

7.  A standardized archaeal taxonomy for the Genome Taxonomy Database.

Authors:  Christian Rinke; Maria Chuvochina; Aaron J Mussig; Pierre-Alain Chaumeil; Adrián A Davín; David W Waite; William B Whitman; Donovan H Parks; Philip Hugenholtz
Journal:  Nat Microbiol       Date:  2021-06-21       Impact factor: 17.745

8.  Lack of support for Deuterostomia prompts reinterpretation of the first Bilateria.

Authors:  Paschalia Kapli; Paschalis Natsidis; Daniel J Leite; Maximilian Fursman; Nadia Jeffrie; Imran A Rahman; Hervé Philippe; Richard R Copley; Maximilian J Telford
Journal:  Sci Adv       Date:  2021-03-19       Impact factor: 14.136

9.  A Comprehensive Evolutionary Scenario of Cell Division and Associated Processes in the Firmicutes.

Authors:  Pierre S Garcia; Wandrille Duchemin; Jean-Pierre Flandrois; Simonetta Gribaldo; Christophe Grangeasse; Céline Brochier-Armanet
Journal:  Mol Biol Evol       Date:  2021-05-19       Impact factor: 16.240

10.  St. Louis Encephalitis Virus in the Southwestern United States: A Phylogeographic Case for a Multi-Variant Introduction Event.

Authors:  Chase L Ridenour; Jill Cocking; Samuel Poidmore; Daryn Erickson; Breezy Brock; Michael Valentine; Chandler C Roe; Steven J Young; Jennifer A Henke; Kim Y Hung; Jeremy Wittie; Elene Stefanakos; Chris Sumner; Martha Ruedas; Vivek Raman; Nicole Seaton; William Bendik; Heidie M Hornstra O'Neill; Krystal Sheridan; Heather Centner; Darrin Lemmer; Viacheslav Fofanov; Kirk Smith; James Will; John Townsend; Jeffrey T Foster; Paul S Keim; David M Engelthaler; Crystal M Hepp
Journal:  Front Genet       Date:  2021-06-08       Impact factor: 4.772
