Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution.

Tomotaka Matsumoto, Hiroshi Akashi, Ziheng Yang.

Abstract

Inference of gene sequences in ancestral species has been widely used to test hypotheses concerning the process of molecular sequence evolution. However, the approach may produce spurious results, mainly because using the single best reconstruction while ignoring the suboptimal ones creates systematic biases. Here we implement methods to correct for such biases and use computer simulation to evaluate their performance when the substitution process is nonstationary. The methods we evaluated include parsimony and likelihood using the single best reconstruction (SBR), averaging over reconstructions weighted by the posterior probabilities (AWP), and a new method called expected Markov counting (EMC) that produces maximum-likelihood estimates of substitution counts for any branch under a nonstationary Markov model. We simulated base composition evolution on a phylogeny for six species, with different selective pressures on G+C content among lineages, and compared the counts of nucleotide substitutions recorded during simulation with the inference by different methods. We found that large systematic biases resulted from (i) the use of parsimony or likelihood with SBR, (ii) the use of a stationary model when the substitution process is nonstationary, and (iii) the use of the Hasegawa-Kishino-Yano (HKY) model, which is too simple to adequately describe the substitution process. The nonstationary general time reversible (GTR) model, used with AWP or EMC, accurately recovered the substitution counts, even in cases of complex parameter fluctuations. We discuss model complexity and the compromise between bias and variance and suggest that the new methods may be useful for studying complex patterns of nucleotide substitution in large genomic data sets.
Copyright © 2015 by the Genetics Society of America.
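The contrast the abstract draws between SBR and AWP can be illustrated with a toy calculation. The following is a minimal sketch, not the authors' implementation: it assumes a single branch and a hypothetical per-site joint posterior over (ancestral base, descendant base) pairs. The posterior values and the function names `sbr_counts` and `awp_counts` are invented for illustration only.

```python
from collections import Counter

# Hypothetical joint posteriors for one branch (illustrative numbers only):
# each site maps (ancestral base, descendant base) -> posterior probability.
site_posteriors = [
    {("A", "A"): 0.7, ("G", "A"): 0.3},
    {("A", "A"): 0.6, ("G", "A"): 0.4},
    {("C", "T"): 0.9, ("T", "T"): 0.1},
    {("G", "G"): 0.8, ("A", "G"): 0.2},
]

def sbr_counts(sites):
    """Count substitutions using only the single best reconstruction per site."""
    counts = Counter()
    for post in sites:
        anc, desc = max(post, key=post.get)  # most probable reconstruction
        if anc != desc:
            counts[(anc, desc)] += 1.0
    return counts

def awp_counts(sites):
    """Count substitutions by averaging over reconstructions,
    weighting each by its posterior probability."""
    counts = Counter()
    for post in sites:
        for (anc, desc), p in post.items():
            if anc != desc:
                counts[(anc, desc)] += p
    return counts

sbr = sbr_counts(site_posteriors)
awp = awp_counts(site_posteriors)
print("SBR:", dict(sbr))
print("AWP:", dict(awp))
```

With these made-up posteriors, SBR records only the C→T change, because every G→A and A→G reconstruction is suboptimal at its site and is discarded. AWP keeps their posterior weight and assigns G→A an expected count of about 0.7. This shows the kind of systematic undercounting of less probable changes that the abstract attributes to ignoring suboptimal reconstructions.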

Keywords:  GC content; ancestral reconstruction; codon usage; nonstationary models; nucleotide substitution; stochastic mapping

Year: 2015; PMID: 25948563; PMCID: PMC4512549; DOI: 10.1534/genetics.115.177386

Source DB: PubMed; Journal: Genetics; ISSN: 0016-6731; Impact factor: 4.562


