Francesca Rizzato, Alex Rodriguez, Alessandro Laio.
Abstract
BACKGROUND: Many models of protein sequence evolution, in particular those based on Point Accepted Mutation (PAM) matrices, assume that its dynamics is Markovian. Nevertheless, evolution has been observed to proceed differently at different time scales, which calls this assumption into question. In 2011 Kosiol and Goldman proved that, if evolution is Markovian at the codon level, it cannot be Markovian at the amino acid level. However, it remains unclear to what extent the Markov assumption holds at the codon level.
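The aggregation effect behind the Kosiol and Goldman result can be illustrated with a toy calculation. The sketch below is only an illustration under stated assumptions: the 3-state chain, its unit rates, and the grouping of states B and C into one observed class are hypothetical and are not the codon model. It checks the Chapman-Kolmogorov relation P(2t) = P(t)·P(t), which a Markov process must satisfy, for the full chain and for the aggregated one.

```python
import numpy as np

# Hypothetical 3-state chain A <-> B <-> C with unit rates (illustration only,
# not the codon model discussed in the abstract).
Q = np.array([[-1.0, 1.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 1.0, -1.0]])
groups = [[0], [1, 2]]  # observed classes: X = {A}, Y = {B, C}


def transition_matrix(Q, t):
    """P(t) = exp(Q t) for a symmetric rate matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(Q)
    return (V * np.exp(w * t)) @ V.T


def lumped_transition_matrix(Q, t, groups):
    """Class-to-class probabilities, starting from the stationary distribution.

    Q is symmetric here, so the stationary distribution is uniform and the
    states inside each class carry equal weight.
    """
    P = transition_matrix(Q, t)
    return np.array([[P[np.ix_(gi, gj)].sum() / len(gi) for gj in groups]
                     for gi in groups])


t = 0.5
P_t = transition_matrix(Q, t)
L_t = lumped_transition_matrix(Q, t, groups)
L_2t = lumped_transition_matrix(Q, 2 * t, groups)

# Chapman-Kolmogorov gaps: essentially zero for the full chain, clearly
# nonzero for the aggregated description.
ck_gap_full = np.abs(transition_matrix(Q, 2 * t) - P_t @ P_t).max()
ck_gap_lumped = np.abs(L_2t - L_t @ L_t).max()
print("full chain gap:  ", ck_gap_full)
print("lumped chain gap:", ck_gap_lumped)
```

In this toy chain the aggregation fails to be Markovian because B and C leave their class at different rates, so the history of the process carries information about which hidden state is occupied.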
Keywords: Amino acid substitution matrices; Evolutionary distances; Non-Markovian evolution; Protein sequence evolution; Substitution rate variability
Year: 2016 PMID: 27342318 PMCID: PMC4921000 DOI: 10.1186/s12859-016-1135-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. Transition probabilities in a simplified world. Comparison between the transition probabilities in a sequence with constant substitution rate over sites and in a sequence with two equiprobable classes of rates for a simplified system described in Results. Black: P (t) (solid line) and (points); Blue: P (t) (solid line) and (points); Red: P (t) (solid line) and (points)
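The comparison in this caption can be mimicked numerically. The following is a minimal sketch, not the simplified system from the paper's Results: the 3-state rate matrix and the two rate values (0.2 and 1.8, chosen so that the mean rate is 1) are assumptions made for illustration. It averages the transition matrix over two equiprobable rate classes and tests whether the average still satisfies P(2t) = P(t)·P(t).

```python
import numpy as np

# Hypothetical 3-state chain A <-> B <-> C with unit rates (an assumption,
# not the simplified system used in the paper).
Q = np.array([[-1.0, 1.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 1.0, -1.0]])


def transition_matrix(Q, t):
    """P(t) = exp(Q t) for a symmetric rate matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(Q)
    return (V * np.exp(w * t)) @ V.T


def mixed_transition_matrix(Q, t, rates=(0.2, 1.8)):
    """Average of exp(Q r t) over equiprobable site rate classes r."""
    return sum(transition_matrix(Q, r * t) for r in rates) / len(rates)


t = 0.5
P = transition_matrix(Q, t)            # constant rate over sites
P_mix = mixed_transition_matrix(Q, t)  # two equiprobable rate classes

# Chapman-Kolmogorov check for both descriptions.
ck_gap_markov = np.abs(transition_matrix(Q, 2 * t) - P @ P).max()
ck_gap_mixed = np.abs(mixed_transition_matrix(Q, 2 * t) - P_mix @ P_mix).max()
print("constant rate gap:", ck_gap_markov)
print("two-class gap:    ", ck_gap_mixed)
```

With a single rate the gap vanishes up to rounding error, while the two-class average leaves a finite gap: a mixture of Markov processes with different speeds is, in general, not itself Markovian.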
Fig. 2. Comparison between Markovian and non-Markovian substitution probabilities in the framework of codons and of amino acids. (a) Points: entry-by-entry comparison of P (t) and in log-log scale, with t=0.235. Each point corresponds to a pair i,j of codons and its x-value is given by , while its y-value is . The black squares (zoomed in the yellow inset) are the entries with i=j, while red, green and blue points are respectively the entries where codon i and codon j differ by one, two or three nucleotides. Solid line: line y=x. (b) Points: entry-by-entry comparison of P (t) and in log-log scale, with t=0.23. Coordinates and lines have the same meaning as in panel (a) and colors are such that the entries where i=j are black (zoomed in the yellow inset), while red, green and blue identify the entries with i≠j where the most similar pair of codons coding for amino acids i and j differ respectively by one, two or three nucleotides
Examples of the variation of the transition probabilities at the sequence identity of 80 % between Markovian (P) and non-Markovian () dynamics
| Initial state | Final state | Markovian (P) | Non-Markovian | Ratio (Markovian / non-Markovian) |
|---|---|---|---|---|
| ATC | TGG | 5.21·10⁻⁸ | 8.23·10⁻⁶ | 0.006 |
| TTC | ATG | 3.39·10⁻⁵ | 3.09·10⁻⁴ | 0.110 |
| GTC | GTT | 0.1507 | 0.0951 | 1.58 |
| Ile | Val | 0.104 | 0.076 | 1.4 |
| Arg | Lys | 0.064 | 0.049 | 1.3 |
| Gly | Ile | 0.0006 | 0.0022 | 0.3 |
The first three rows involve substitutions in the framework of codons, while the last three are in the framework of amino acids
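The last numeric column of the table is consistent with being the ratio of the two probability columns before it. The short check below uses only the values printed in the table; the variable names and the per-row number of decimals are assumptions introduced for this check.

```python
# Values copied from the table:
# (initial, final, first probability, second probability,
#  reported last column, decimals shown in the last column)
rows = [
    ("ATC", "TGG", 5.21e-8, 8.23e-6, 0.006, 3),
    ("TTC", "ATG", 3.39e-5, 3.09e-4, 0.110, 3),
    ("GTC", "GTT", 0.1507, 0.0951, 1.58, 2),
    ("Ile", "Val", 0.104, 0.076, 1.4, 1),
    ("Arg", "Lys", 0.064, 0.049, 1.3, 1),
    ("Gly", "Ile", 0.0006, 0.0022, 0.3, 1),
]

# A row passes when the recomputed ratio agrees with the reported value
# to within half a unit of the last printed decimal.
ratios = [p_first / p_second for _, _, p_first, p_second, _, _ in rows]
checks = [
    abs(ratio - reported) <= 0.5 * 10 ** (-decimals)
    for ratio, (_, _, _, _, reported, decimals) in zip(ratios, rows)
]

for (initial, final, *_), ratio, ok in zip(rows, ratios, checks):
    print(f"{initial} -> {final}: ratio = {ratio:.4g}, matches table: {ok}")
```

Read this way, the entries that require several nucleotide changes (ATC to TGG, TTC to ATG, Gly to Ile) have ratios well below 1, while the single-step substitutions have ratios above 1.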
Fig. 3. Impact of the Markovian assumption on the estimation of Q. Points: entry-by-entry comparison between Q and estimated by Eq. 8. Solid line: y=x
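Eq. 8 is not reproduced in this excerpt, so the following sketch does not implement it. It assumes, purely for illustration, that a Markovian estimate of the rate matrix is obtained as log(P(t))/t, and it reuses a hypothetical 3-state chain with two equiprobable rate classes (0.2 and 1.8) to show how such an estimate can be distorted when the observed probabilities come from sites with different rates.

```python
import numpy as np

# Hypothetical 3-state chain A <-> B <-> C with unit rates (illustration only).
Q = np.array([[-1.0, 1.0, 0.0],
              [1.0, -2.0, 1.0],
              [0.0, 1.0, -1.0]])


def transition_matrix(Q, t):
    """P(t) = exp(Q t) for a symmetric rate matrix, via eigendecomposition."""
    w, V = np.linalg.eigh(Q)
    return (V * np.exp(w * t)) @ V.T


def markovian_rate_estimate(P, t):
    """Assumed estimator: log(P) / t for a symmetric positive-definite P."""
    w, V = np.linalg.eigh(P)
    return (V * (np.log(w) / t)) @ V.T


t = 0.5
rates = (0.2, 1.8)  # assumed equiprobable rate classes with mean 1

P_single = transition_matrix(Q, t)
P_mixed = sum(transition_matrix(Q, r * t) for r in rates) / len(rates)

Q_from_single = markovian_rate_estimate(P_single, t)  # should recover Q
Q_from_mixed = markovian_rate_estimate(P_mixed, t)    # distorted estimate

print("max error, single rate:", np.abs(Q_from_single - Q).max())
print("max error, mixed rates:", np.abs(Q_from_mixed - Q).max())
print("estimated A->B rate:", Q_from_mixed[0, 1], "(true value 1)")
print("estimated A->C rate:", Q_from_mixed[0, 2], "(true value 0)")
```

In this toy setting the estimate from the mixed probabilities underestimates the single-step rate and assigns a spurious direct rate to the two-step A to C transition, which is the kind of entry-by-entry deviation from y=x that the caption describes.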