Site-specific evolutionary rates in proteins are better modeled as non-independent and strictly relative.

Andrew D Fernandes, William R Atchley.

Abstract

MOTIVATION: In a nucleotide or amino acid sequence, not all sites evolve at the same rate, due to differing selective constraints at each site. Currently in computational molecular evolution, models incorporating rate heterogeneity always share two assumptions. First, the rate of evolution at each site is assumed to be independent of every other site. Second, the values of these rates are assumed to be drawn from a known prior distribution. Although often assumed to be small, the actual effect of these assumptions has not been previously quantified in the literature.
RESULTS: Herein we describe an algorithm to simultaneously infer the set of n-1 relative rates that parameterize the likelihood of an n-site alignment. Unlike in previous work, (a) these relative rates are completely identifiable and distinct from the branch-length parameters, and (b) a far more general class of rate priors can be used and their effects quantified. Although the algorithm is described in a Bayesian framework, we also discuss a future maximum likelihood extension.
CONCLUSIONS: Using both synthetic data and alignments from the Myc, Max and p53 protein families, we find that inferring relative rather than absolute rates has several advantages. First, both empirical likelihoods and Bayes factors show strong preference for the relative-rate model, with a mean Δ ln P = -0.458 per alignment site. Second, the computed likelihoods and Bayes factors were essentially independent of the relative-rate prior, indicating that good estimates of the posterior rate distribution are not required a priori. Third, a novel finding is that rates can be accurately inferred even when up to approximately 4 substitutions per site have occurred. Thus biologically relevant putative hypervariable sites can be identified as easily as conserved sites. Lastly, our model treats rates and tree branch-lengths as completely identifiable, allowing for the first time coherent simultaneous inference of branch-lengths and site-specific evolutionary rates.
AVAILABILITY: Source code for the utility described is available under a BSD-style license at http://www.fernandes.org/txp/article/9/site-specific-relative-evolutionary-rates.
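
The identifiability point made in RESULTS can be illustrated with a toy calculation. The sketch below is not the authors' algorithm: it assumes a two-sequence alignment, swaps in the Jukes-Cantor (JC69) nucleotide model in place of a protein model, and assumes a "mean rate equals 1" normalization as one possible way of leaving n-1 free rates. All function names, site patterns, and rate values are hypothetical. It shows that absolute rates and a branch length enter the likelihood only through their product, so scaling one and inversely scaling the other leaves the likelihood unchanged, whereas fixing the rates to be relative removes that ambiguity.

```python
import math

def jc69_site_loglik(same: bool, rate: float, t: float) -> float:
    """Log-likelihood of one site of a two-sequence alignment under JC69.

    `same` says whether the two sequences agree at the site. The site's
    expected number of substitutions is rate * t, so rate and branch
    length appear only as a product.
    """
    d = rate * t
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e   # probability the nucleotide is unchanged
    p_diff = 0.25 - 0.25 * e   # probability of one specific different nucleotide
    return math.log(0.25 * (p_same if same else p_diff))

def alignment_loglik(matches, rates, t):
    """Sum of per-site log-likelihoods (sites treated as independent given rates)."""
    return sum(jc69_site_loglik(m, r, t) for m, r in zip(matches, rates))

def to_relative(rates, t):
    """Rescale rates to average 1 and absorb the scale into the branch length.

    Assumed normalization for illustration only: with the mean fixed,
    only n-1 of the n rates remain free.
    """
    mean = sum(rates) / len(rates)
    return [r / mean for r in rates], t * mean

# Hypothetical data: five sites, illustrative absolute rates and branch length.
matches = [True, True, False, True, False]
rates = [0.2, 0.5, 3.0, 1.0, 2.5]
t = 0.4

base = alignment_loglik(matches, rates, t)

# Non-identifiability of absolute rates: multiply rates by c, divide t by c.
c = 7.0
scaled = alignment_loglik(matches, [r * c for r in rates], t / c)

# Relative rates: same likelihood, but the overall scale now lives in t alone.
rel_rates, rel_t = to_relative(rates, t)
relative = alignment_loglik(matches, rel_rates, rel_t)

print(base, scaled, relative)
```

The three printed values should agree up to floating-point rounding, which is the sense in which only relative rates, together with a separately scaled branch length, are determined by the data in this toy setting.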

Year: 2008; PMID: 18662926; PMCID: PMC2553437; DOI: 10.1093/bioinformatics/btn395

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


