Evolutionary model selection with a genetic algorithm: a case study using stem RNA.

Sergei L Kosakovsky Pond, Frank V Mannino, Michael B Gravenor, Spencer V Muse, Simon D W Frost.

Abstract

The choice of a probabilistic model to describe sequence evolution can and should be justified. Underfitting the data through the use of overly simplistic models may miss out on interesting phenomena and lead to incorrect inferences. Overfitting the data with models that are too complex may ascribe biological meaning to statistical artifacts and result in falsely significant findings. We describe a likelihood-based approach for evolutionary model selection. The procedure employs a genetic algorithm (GA) to quickly explore a combinatorially large set of all possible time-reversible Markov models with a fixed number of substitution rates. When applied to stem RNA data subject to well-understood evolutionary forces, the models found by the GA 1) capture the expected overall rate patterns a priori; 2) fit the data better than the best available models based on a priori assumptions, suggesting subtle substitution patterns not previously recognized; 3) cannot be rejected in favor of the general reversible model, implying that the evolution of stem RNA sequences can be explained well with only a few substitution rate parameters; and 4) perform well on simulated data, both in terms of goodness of fit and the ability to estimate evolutionary rates. We also investigate the utility of several distance measures for comparing and contrasting inferred evolutionary models. Using widely available small computer clusters, our approach makes it possible, for the first time, to evaluate the performance of existing RNA evolutionary models by comparing them with a large pool of candidate models and to validate common modeling assumptions. In addition, the new method provides the foundation for rigorous selection and comparison of substitution models for other types of sequence data.
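
To give a sense of why the model space is described as combinatorially large, the short sketch below counts candidate models under one simplifying assumption that is not spelled out in the abstract: a model with k substitution rates is treated as a partition of the n exchangeability entries of a reversible rate matrix into exactly k non-empty classes. That count is the Stirling number of the second kind. The function names are illustrative, and n = 120 (the number of unordered state pairs in a 16-state paired-nucleotide model) is used only as an example size, not as the paper's actual search space.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Ways to split n labelled rate entries into exactly k non-empty classes."""
    if n == k:
        return 1  # includes the empty case n == k == 0
    if k == 0 or k > n:
        return 0
    # The last entry either joins one of the k existing classes
    # or starts a new class of its own.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def count_all_models(n: int) -> int:
    """Total number of partitions of n rate entries (the Bell number)."""
    return sum(stirling2(n, k) for k in range(n + 1))


if __name__ == "__main__":
    # Nucleotide case: 6 exchangeability entries in a 4-state reversible model.
    print("4-state models with 2 rate classes:", stirling2(6, 2))
    print("all 4-state reversible models:", count_all_models(6))
    # Illustrative 16-state case: 16 * 15 / 2 = 120 unordered state pairs.
    print("120 entries, 5 rate classes:", stirling2(120, 5))
```

For the familiar nucleotide case the total is 203 models, which is small enough to enumerate. Under the same assumption, a 16-state model with only a handful of rate classes already yields a count far too large for exhaustive search, which is the situation in which a heuristic such as a GA becomes attractive.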

Year:  2006        PMID: 17038448     DOI: 10.1093/molbev/msl144

Source DB:  PubMed          Journal:  Mol Biol Evol        ISSN: 0737-4038            Impact factor:   16.240


  12 in total

1.  Models of coding sequence evolution (Review).

Authors:  Wayne Delport; Konrad Scheffler; Cathal Seoighe
Journal:  Brief Bioinform       Date:  2008-10-29       Impact factor: 11.622

2.  Evolutionary fingerprinting of genes.

Authors:  Sergei L Kosakovsky Pond; Konrad Scheffler; Michael B Gravenor; Art F Y Poon; Simon D W Frost
Journal:  Mol Biol Evol       Date:  2009-10-28       Impact factor: 16.240

3.  Modeling invasive species spread in Lake Champlain via evolutionary computations.

Authors:  B M Osei; C D Ellingwood; J P Hoffmann; D E Bentil
Journal:  Theory Biosci       Date:  2011-02-04       Impact factor: 1.919

4.  Estimating selection pressures on HIV-1 using phylogenetic likelihood models.

Authors:  S L Kosakovsky Pond; A F Y Poon; S Zárate; D M Smith; S J Little; S K Pillai; R J Ellis; J K Wong; A J Leigh Brown; D D Richman; S D W Frost
Journal:  Stat Med       Date:  2008-10-15       Impact factor: 2.373

5.  CodonTest: modeling amino acid substitution preferences in coding sequences.

Authors:  Wayne Delport; Konrad Scheffler; Gordon Botha; Mike B Gravenor; Spencer V Muse; Sergei L Kosakovsky Pond
Journal:  PLoS Comput Biol       Date:  2010-08-19       Impact factor: 4.475

6.  Benchmarking multi-rate codon models.

Authors:  Wayne Delport; Konrad Scheffler; Mike B Gravenor; Spencer V Muse; Sergei Kosakovsky Pond
Journal:  PLoS One       Date:  2010-07-21       Impact factor: 3.240

7.  Adaptive molecular evolution of MC1R gene reveals the evidence for positive diversifying selection in indigenous goat populations.

Authors:  Hafiz Ishfaq Ahmad; Guiqiong Liu; Xunping Jiang; Chenhui Liu; Yuqing Chong; Huang Huarong
Journal:  Ecol Evol       Date:  2017-06-07       Impact factor: 2.912

8.  The Effect of RNA Substitution Models on Viroid and RNA Virus Phylogenies.

Authors:  Juan Ángel Patiño-Galindo; Fernando González-Candelas; Oliver G Pybus
Journal:  Genome Biol Evol       Date:  2018-02-01       Impact factor: 3.416

9.  Dynamic Molecular Evolution of Mammalian Homeobox Genes: Duplication, Loss, Divergence and Gene Conversion Sculpt PRD Class Repertoires.

Authors:  Thomas D Lewin; Amy H Royall; Peter W H Holland
Journal:  J Mol Evol       Date:  2021-06-07       Impact factor: 2.395

10.  HIV-specific probabilistic models of protein evolution.

Authors:  David C Nickle; Laura Heath; Mark A Jensen; Peter B Gilbert; James I Mullins; Sergei L Kosakovsky Pond
Journal:  PLoS One       Date:  2007-06-06       Impact factor: 3.240
