Onuralp Soylemez, Fyodor A Kondrashov.
Abstract
Whether or not evolutionary change is inherently irreversible remains a controversial topic. Some examples of evolutionary irreversibility are known; however, this question has not been comprehensively addressed at the molecular level. Here, we use data from 221 human genes with known pathogenic mutations to estimate the rate of irreversibility in protein evolution. For these genes, we reconstruct ancestral amino acid sequences along the mammalian phylogeny and identify ancestral amino acid states that match known pathogenic mutations. Such cases represent inherent evolutionary irreversibility because, at the present moment, reversals to these ancestral amino acid states are impossible for the human lineage. We estimate that approximately 10% of all amino acid substitutions along the mammalian phylogeny are irreversible, such that a return to the ancestral amino acid state would lead to a pathogenic phenotype. For a subset of 51 genes with high rates of irreversibility, as much as 40% of all amino acid evolution was estimated to be irreversible. Because pathogenic phenotypes do not resemble ancestral phenotypes, the molecular nature of the high rate of irreversibility in proteins is best explained by evolution with a high prevalence of compensatory, epistatic interactions between amino acid sites. Under such a mode of protein evolution, once an amino acid substitution is fixed, the probability of its reversal declines as the protein sequence accumulates changes that affect the phenotypic manifestation of the ancestral state. The prevalence of epistasis in evolution indicates that the observed high rate of irreversibility in protein evolution is an inherent property of protein structure and function.
Year: 2012 PMID: 23132897 PMCID: PMC3542581 DOI: 10.1093/gbe/evs096
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Estimating the Total Fraction of Irreversible States Out of Phenotypically Relevant Substitutions on the Placental Phylogeny
| | Fraction of Irreversible States Out of All Substitutions | Estimated Total Fraction of Irreversible States Out of All Substitutions | Corrected Total Fraction of Irreversible States among Beneficial Substitutions |
|---|---|---|---|
| HGMD data for all genes in the data set | 1.2% (221) | 10.5% (221) | 18.4% (221) |
| HGMD data for genes with at least one irreversible state in the placental phylogeny | 4.8% (51) | 43% (51) | 81.1% (51) |
| SwissVar data for all genes in the data set | 0.8% (221) | 5.3% (221) | 9.3% (221) |
| SwissVar data for genes with at least one irreversible state in the placental phylogeny | 7.2% (23) | 45.4% (23) | 85.6% (23) |
Note.—The estimated total fraction of irreversible states was obtained by correcting for the sparseness of the mutational data, as described in the Materials and Methods section. The fraction of irreversible states among beneficial substitutions is estimated by dividing the value in the second column by α (0.57 for 221 genes and 0.53 for 51 genes), the fraction of beneficial substitutions in evolution from the McDonald–Kreitman test (supplementary table S4, Supplementary Material online).
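The α correction described in the note can be sketched in a few lines of Python. This is only an illustration of the stated division, not the authors' code: the function name `corrected_fraction` and the `ROWS` list are hypothetical, and reusing α = 0.53 for the 23-gene SwissVar subset is an assumption, since the note gives α only for the 221-gene and 51-gene sets.

```python
def corrected_fraction(estimated_total_pct, alpha):
    """Return the estimated total fraction (in percent) divided by alpha."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    return estimated_total_pct / alpha


# (row label, estimated total fraction in percent, alpha from the note)
ROWS = [
    ("HGMD, all 221 genes", 10.5, 0.57),
    ("HGMD, 51 genes with an irreversible state", 43.0, 0.53),
    ("SwissVar, all 221 genes", 5.3, 0.57),
    # Assumption: alpha = 0.53 is reused for the 23-gene SwissVar subset.
    ("SwissVar, 23 genes with an irreversible state", 45.4, 0.53),
]

for label, pct, alpha in ROWS:
    print(f"{label}: {corrected_fraction(pct, alpha):.1f}%")
```

Under these assumptions the computed values agree with the third column of the table to within about 0.1 percentage points; small differences are expected because the tabulated inputs are already rounded.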
Figure. Distribution of pathogenic mutations in our data set. Two histograms show the distribution of missense and nonsense mutations listed in HGMD for 221 genes included in our study.
Number and Fraction of Irreversible States that Represent Cases of a Pathogenic Mutation Matching an Ancestral State that Has a >0.95 Posterior Probability in the Mammalian Phylogeny
| | HGMD | SwissVar | HGMD with Severe Phenotypes | SwissVar with Severe Phenotypes |
|---|---|---|---|---|
| Total number of pathogenic mutations (genes) | 29,143 (221) | 14,481 (221) | 6,892 (57) | 4,490 (57) |
| Number of irreversible states (genes) | 98 (51) | 49 (23) | 22 (11) | 16 (5) |
Figure. Amino acid states that match known human pathogenic mutations. States that match known human pathogenic mutations were found in closely related species without the drastic phenotypic manifestations observed in humans. Such amino acid states may be confined to a sister clade (a) and therefore, while indicative of compensatory evolution (Kondrashov et al. 2002), do not represent cases of evolutionary irreversibility. Alternatively, some pathogenic mutations may match the ancestral state at some point along the phylogenetic lineage leading to the human branch (b). Such cases necessarily represent evolutionary irreversibility at the genotype level, because such mutations cannot currently be fixed in the human population. For both examples, H1380Y in the ATM gene, leading to breast cancer susceptibility (a), and Q841K in the ABCA4 gene, leading to Stargardt disease (b), the multiple alignment with seven amino acids on each side of the site in question is shown. Human wild-type and pathogenic states are shown in black and red, respectively. Posterior probabilities associated with irreversible states are shown in parentheses.
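The matching rule for case (b) can be illustrated with a minimal Python sketch: a reconstructed ancestral state counts as irreversible when its residue equals a known pathogenic mutant residue at the same site and its posterior probability exceeds 0.95. The types `Mutation` and `AncestralState`, the function `find_irreversible_states`, the node names, and the posterior values below are all hypothetical. The sketch also assumes the input has already been restricted to ancestral nodes on the lineage leading to human, so that sister-clade matches of type (a) are excluded upstream.

```python
from typing import NamedTuple


class Mutation(NamedTuple):
    position: int   # 1-based site in the human protein
    wild_type: str  # human wild-type residue
    mutant: str     # pathogenic residue


class AncestralState(NamedTuple):
    node: str         # ancestral node on the lineage leading to human
    position: int     # 1-based site, in human protein coordinates
    residue: str      # reconstructed residue at this node
    posterior: float  # posterior probability of the reconstruction


def find_irreversible_states(mutations, ancestral_states, min_posterior=0.95):
    """Return ancestral states that match a pathogenic mutant residue."""
    mutant_by_site = {}
    for m in mutations:
        mutant_by_site.setdefault(m.position, set()).add(m.mutant)
    return [
        s for s in ancestral_states
        if s.posterior > min_posterior
        and s.residue in mutant_by_site.get(s.position, set())
    ]


# Q841K is taken from the caption; nodes and posteriors are made up.
mutations = [Mutation(841, "Q", "K")]
states = [
    AncestralState("hypothetical_node_A", 841, "K", 0.99),  # match
    AncestralState("hypothetical_node_B", 841, "K", 0.80),  # low posterior
    AncestralState("hypothetical_node_A", 100, "L", 0.99),  # no mutation here
]
hits = find_irreversible_states(mutations, states)
```

With this toy input only the first state is returned, because the second falls below the posterior threshold and the third has no pathogenic mutation at its site.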
Pathogenic Mutations Matching Known Human SNPs from 1000 Genomes Data and dbSNP
| | HGMD Mutations Matching a Known SNP (%) | Irreversible States from HGMD Matching a Known SNP (%) | SwissVar Mutations Matching a Known SNP (%) | Irreversible States from SwissVar Matching a Known SNP (%) |
|---|---|---|---|---|
| 1000 genomes data | 368/29,143 (1.26) | 15/98 (15.3) | 181/14,481 (1.25) | 4/49 (8.1) |
| dbSNP | 85/29,143 (0.29) | 6/98 (6.6) | 50/14,481 (0.33) | 4/49 (8.1) |
*All comparisons were statistically significant (Fisher's exact test, P < 0.0001).
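A two-sided Fisher's exact test on counts like those in the table can be sketched in pure Python as follows. The source does not describe how the contingency tables were built, so the layout used here is an assumption: the irreversible states are treated as a subset of the pathogenic mutations and compared against the remaining mutations. The function names `log_comb` and `fisher_exact_two_sided` are hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is the common convention, not necessarily the one the authors used.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x):
        # Hypergeometric log-probability of x in the top-left cell.
        return (log_comb(row1, x) + log_comb(row2, col1 - x)
                - log_comb(total, col1))

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = log_prob(a)
    # Sum every table that is no more probable than the observed one.
    p = sum(exp(log_prob(x)) for x in range(lowest, highest + 1)
            if log_prob(x) <= observed + 1e-7)
    return min(1.0, p)


# HGMD versus 1000 Genomes, under the assumed subset layout:
# rows = irreversible / other mutations, columns = SNP match / no match.
irreversible_match, irreversible_total = 15, 98
all_match, all_total = 368, 29143
p_hgmd_1000g = fisher_exact_two_sided(
    irreversible_match,
    irreversible_total - irreversible_match,
    all_match - irreversible_match,
    (all_total - irreversible_total) - (all_match - irreversible_match),
)
```

Under this assumed layout the P value for the HGMD and 1000 Genomes comparison falls well below the 0.0001 threshold quoted in the footnote.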
Pathogenic Mutation Data and Estimated Fraction of Missense Mutations that Are Pathogenic
| | Missense Mutations | Nonsense Mutations |
|---|---|---|
| Average number of mutations | 131.9 (123.4) | 26.5 (38.6) |
| Average number of total possible mutations | 5,498.8 (5,050.1) | 332.3 (330.5) |
| Estimated average number of pathogenic mutations | 2,133 (2,704.1) | 332.3 (330.5) |
| Average fraction out of all possible mutations | 0.037 (0.042) | 0.099 (0.102) |
| Average fraction of described mutations out of estimated pathogenic mutations | 0.107 (0.99) | 0.099 (0.102) |
| Estimated average fraction of pathogenic mutations among all possible mutations | 0.431 (0.253) | 1.00 (N.A.) |
Note.—The table reports averages across 221 genes in our data sets and standard deviations in parentheses.
Figure. Estimating the overall rate of amino acid irreversibility along the placental mammal phylogeny. We estimated the fraction of irreversible states between the amino acid sequence of humans and that of their common ancestor in six clades of placental mammals. Protein distance was calculated as the fraction of amino acid states that differ between the human sequence and the sequence at the ancestral node. Vertical and horizontal error bars represent the standard error of the fraction of irreversible states and of the protein distance, respectively.
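The protein distance defined in the caption is the fraction of aligned sites at which the human residue differs from the reconstructed ancestral residue. A minimal Python sketch is given below. The function name `protein_distance` and the toy sequences are hypothetical, and skipping alignment gaps is an assumption, since the caption does not say how gaps were handled.

```python
def protein_distance(human, ancestor, gap="-"):
    """Fraction of compared sites at which two aligned sequences differ."""
    if len(human) != len(ancestor):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for h, a in zip(human, ancestor):
        if h == gap or a == gap:
            continue  # Assumption: sites with a gap in either sequence are skipped.
        compared += 1
        if h != a:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned sequences (not real data): two of eight sites differ.
distance = protein_distance("MKTAYIAK", "MKSAYLAK")
```

For the toy pair above the distance is 2 differing sites out of 8 compared.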