
An amino acid substitution-selection model adjusts residue fitness to improve phylogenetic estimation.

Huai-Chun Wang, Edward Susko, Andrew J Roger.

Abstract

Standard protein phylogenetic models use fixed rate matrices of amino acid interchange derived from analyses of large databases. Differences between the stationary amino acid frequencies of these rate matrices and those of a data set of interest are typically adjusted for by matrix multiplication: the empirical rate matrix is converted to an exchangeability matrix, which is then postmultiplied by the amino acid frequencies in the alignment. The result is a time-reversible rate matrix with stationary amino acid frequencies equal to the data set frequencies. On the basis of population genetics principles, we develop an amino acid substitution-selection model that parameterizes the fitness of an amino acid as the logarithm of the ratio of the frequency of the amino acid to the frequency of the same amino acid under no selection. The model gives rise to a different sequence of matrix multiplications to convert an empirical rate matrix to one that has stationary amino acid frequencies equal to the data set frequencies. We combined the substitution-selection model with an improved amino acid class frequency mixture (cF) model to partially take into account site-specific amino acid frequencies in the phylogenetic models. We show that 1) the selection models fit data significantly better than corresponding models without selection for most of the 21 test data sets; 2) both cF and cF selection models favored the phylogenetic trees that were inferred under current sophisticated models and methods for three difficult phylogenetic problems (the positions of microsporidia and breviates in eukaryote phylogeny and the position of the root of the angiosperm tree); and 3) for data simulated under site-specific residue frequencies, the cF selection models estimated trees closer to the generating trees than a standard Γ model or cF without selection. We also explored several ways of estimating the amino acid frequencies under neutral evolution that these selection models require. By better modeling the amino acid substitution process, the cF selection models will be valuable for phylogenetic inference and evolutionary studies.
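The standard frequency adjustment and the fitness definition described in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it uses a hypothetical 3-state alphabet instead of 20 amino acids, all numeric values and function names are invented for the example, and it covers only the conventional adjustment (exchangeabilities postmultiplied by data set frequencies) plus the log-ratio fitness. The abstract does not spell out the selection model's own sequence of matrix multiplications, so that step is not reproduced here.

```python
import math

def exchangeabilities(q_emp, pi_emp):
    """Divide each off-diagonal rate by the target state's stationary
    frequency: S[i][j] = Q_emp[i][j] / pi_emp[j]."""
    n = len(pi_emp)
    return [[q_emp[i][j] / pi_emp[j] if i != j else 0.0 for j in range(n)]
            for i in range(n)]

def rate_matrix(s, pi):
    """Postmultiply exchangeabilities by frequencies, Q[i][j] = S[i][j] * pi[j],
    then set each diagonal entry so that the row sums to zero."""
    n = len(pi)
    q = [[s[i][j] * pi[j] if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        q[i][i] = -sum(q[i])  # diagonal is still 0.0 here, so this sums off-diagonals
    return q

def fitness(pi, pi_neutral):
    """Fitness as the log ratio of observed to neutral frequency."""
    return [math.log(p / p0) for p, p0 in zip(pi, pi_neutral)]

# Hypothetical toy values (assumptions for illustration only).
pi_emp = [0.5, 0.3, 0.2]       # stationary frequencies of the "empirical" matrix
s0 = [[0.0, 1.0, 2.0],
      [1.0, 0.0, 3.0],
      [2.0, 3.0, 0.0]]         # symmetric exchangeabilities
q_emp = rate_matrix(s0, pi_emp)

pi_data = [0.2, 0.3, 0.5]      # frequencies observed in the data set of interest
s = exchangeabilities(q_emp, pi_emp)
q = rate_matrix(s, pi_data)    # reversible matrix with stationary vector pi_data

pi_neutral = [0.5, 0.3, 0.2]   # assumed frequencies under no selection
f = fitness(pi_data, pi_neutral)
```

Because the exchangeabilities are symmetric, `q` satisfies detailed balance with respect to `pi_data`, which is what makes it time-reversible with the data set frequencies as its stationary distribution. In the fitness vector, a state whose observed frequency equals its neutral frequency gets fitness zero, and states enriched relative to neutrality get positive values.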

Keywords:  amino acid substitution; maximum likelihood; mixture model; molecular phylogenetics; selection; site-specific frequencies

Year:  2014        PMID: 24441033     DOI: 10.1093/molbev/msu044

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240

