nQMaker: Estimating Time Nonreversible Amino Acid Substitution Models.

Cuong Cao Dang, Bui Quang Minh, Hanon McShea, Joanna Masel, Jennifer Eleanor James, Le Sy Vinh, Robert Lanfear.

Abstract

Amino acid substitution models are a key component in phylogenetic analyses of protein sequences. All commonly used amino acid models available to date are time-reversible, an assumption adopted for computational convenience rather than biological realism. Another significant downside of time-reversible models is that they do not allow inference of rooted trees without outgroups. In this article, we introduce nQMaker, a maximum likelihood approach that extends the recently published QMaker method and allows the estimation of time nonreversible amino acid substitution models and rooted phylogenetic trees from a set of protein sequence alignments. We show that the nonreversible models estimated with nQMaker fit empirical alignments much better than pre-existing reversible models across a wide range of data sets, including mammals, birds, plants, fungi, and other taxa, and that the improvements in model fit scale with the size of the data set. Notably, for the recently published plant and bird trees, these nonreversible models correctly recovered the commonly estimated root placements with very high statistical support, without the need for an outgroup. We provide nQMaker as an easy-to-use feature in the IQ-TREE software (http://www.iqtree.org), allowing users to estimate nonreversible models and rooted phylogenies from their own protein data sets. The data sets and scripts used in this article are available at https://doi.org/10.5061/dryad.3tx95x6hx. [amino acid sequence analyses; amino acid substitution models; maximum likelihood model estimation; nonreversible models; phylogenetic inference; reversible models.]
© The Author(s) 2022. Published by Oxford University Press on behalf of the Society of Systematic Biologists.
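
The distinction between reversible and nonreversible models that the abstract relies on can be illustrated with a small sketch. A rate matrix Q with stationary frequencies pi is time-reversible when pi_i * q_ij = pi_j * q_ji for every pair of states (detailed balance). The Python below is a toy 3-state illustration under that standard definition, not the nQMaker implementation; all function names and example values are hypothetical and chosen only for this sketch.

```python
from itertools import product


def reversible_q(exch, freqs):
    """Build a GTR-style rate matrix with q_ij = s_ij * pi_j for i != j.

    `exch` maps (i, j) with i < j to a symmetric exchangeability s_ij.
    """
    n = len(freqs)
    q = [[0.0] * n for _ in range(n)]
    for i, j in product(range(n), repeat=2):
        if i != j:
            q[i][j] = exch[(min(i, j), max(i, j))] * freqs[j]
    for i in range(n):
        q[i][i] = -sum(q[i])  # each row of a rate matrix sums to zero
    return q


def is_stationary(q, freqs, tol=1e-12):
    """Check that freqs * Q = 0, i.e. freqs is a stationary distribution."""
    n = len(freqs)
    return all(
        abs(sum(freqs[i] * q[i][j] for i in range(n))) < tol for j in range(n)
    )


def satisfies_detailed_balance(q, freqs, tol=1e-12):
    """Check pi_i * q_ij == pi_j * q_ji for all i < j (time reversibility)."""
    n = len(freqs)
    return all(
        abs(freqs[i] * q[i][j] - freqs[j] * q[j][i]) < tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def free_rate_parameters(n_states, reversible):
    """Count free parameters after fixing the overall rate scale."""
    if reversible:
        # symmetric exchangeabilities (minus one for scale) + free frequencies
        return n_states * (n_states - 1) // 2 - 1 + (n_states - 1)
    # every off-diagonal rate is free, minus one for scale
    return n_states * (n_states - 1) - 1


# Toy reversible model: symmetric exchangeabilities and unequal frequencies.
rev_freqs = [0.5, 0.3, 0.2]
rev_q = reversible_q({(0, 1): 1.0, (0, 2): 2.0, (1, 2): 0.5}, rev_freqs)

# Toy nonreversible model: a cyclic chain that favours 0 -> 1 -> 2 -> 0.
# Its columns also sum to zero, so the uniform distribution is stationary.
cyc_freqs = [1 / 3, 1 / 3, 1 / 3]
cyc_q = [[-3.0, 2.0, 1.0], [1.0, -3.0, 2.0], [2.0, 1.0, -3.0]]

rev_ok = satisfies_detailed_balance(rev_q, rev_freqs)
cyc_ok = satisfies_detailed_balance(cyc_q, cyc_freqs)
```

Both toy matrices have a valid stationary distribution, but only the first satisfies detailed balance. For 20 amino acids, `free_rate_parameters` gives 208 free parameters for a general reversible model and 379 for a general nonreversible one, which is why larger data sets help when estimating nonreversible models.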

Year:  2022        PMID: 35139203      PMCID: PMC9366462          DOI: 10.1093/sysbio/syac007

Source DB:  PubMed          Journal:  Syst Biol        ISSN: 1063-5157            Impact factor:   9.160


