Liat Rockah-Shmuel, Ágnes Tóth-Petróczy, Dan S Tawfik.
Abstract
Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII's natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
Year: 2015 PMID: 26274323 PMCID: PMC4537296 DOI: 10.1371/journal.pcbi.1004421
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
The theoretically possible vs. observed mutational space of M.HaeIII.
| | Missense, 1 nt | Missense, 2 nt | Missense, 3 nt | Nonsense (stop codons), 1 nt | Nonsense (stop codons), 2 nt | Nonsense (stop codons), 3 nt | Synonymous, 1 nt | Synonymous, 2 nt | Synonymous, 3 nt |
|---|---|---|---|---|---|---|---|---|---|
| All possible mutations | | 3,190 | 1,104 | 125 | 145 | 59 | 321 | 55 | 18 |
| Observed G0 | 1,880* | 8 | 0 | 125 | 0 | 0 | 321 | 0 | 0 |
| Observed G3 | 1,401 | 26 | 0 | 33 | 0 | 0 | 321 | 1 | 0 |
| Observed G7 | 1,302 | 41 | 0 | 24 | 0 | 0 | 320 | 5 | 0 |
| Observed G17 | 1,374 | 228 | 1 | 11 | 1 | 0 | 320 | 14 | 0 |
| | 1,541 | 275 | 1 | 36 | 1 | 0 | 321 | 16 | 0 |
| Total observed | 1,915 | 281 | 1 | 125 | 1 | 0 | 321 | 16 | 0 |
| Coverage | 100% | 8.8% | 0.1% | 100% | 0.7% | 0.0% | 100% | 29.1% | 0.0% |
‘All possible mutations’ is the number of mutations that can be derived from the DNA sequence of wild-type M.HaeIII (329 codons), either nonsynonymous (missense or nonsense) or synonymous. ‘Observed’ is the number of mutations identified at above-background frequencies in each library. ‘Coverage’ is the percentage of all possible mutations that were observed in total.
* 1,880 mutations were observed at G0 with ‘net’ frequencies ≥0, and 77 mutations were observed at below-background frequencies. Of these 77, 35 were detected in G3, G7 and/or G17. The remaining 42 were also below background in G3, G7 and G17, and were assigned a ‘net’ frequency of 0 (i.e., treated as eliminated by selection).
‘1/2/3 nt’: all mutations accessible through single/double/triple nucleotide substitutions within a given codon.
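The coverage percentages can be checked by dividing the total observed count by the number of all possible mutations. The sketch below is illustrative only: the helper name `coverage_percent` is hypothetical, the counts are copied from the columns of the table above where both numbers are legible, and taking the last row of counts before the percentages as the "total observed" is an assumption.

```python
def coverage_percent(observed: int, possible: int) -> float:
    """Percentage of all possible mutations that were observed (hypothetical helper)."""
    if possible <= 0:
        raise ValueError("possible must be a positive count")
    return 100.0 * observed / possible


# (total observed, all possible) pairs copied from the table above.
counts = {
    "missense, 2 nt": (281, 3190),
    "missense, 3 nt": (1, 1104),
    "nonsense, 1 nt": (125, 125),
    "nonsense, 2 nt": (1, 145),
    "synonymous, 1 nt": (321, 321),
    "synonymous, 2 nt": (16, 55),
}

# Round to one decimal place, matching the precision used in the table.
coverage = {
    name: round(coverage_percent(obs, poss), 1)
    for name, (obs, poss) in counts.items()
}

for name, pct in coverage.items():
    print(f"{name}: {pct}%")
```

The recomputed values agree with the percentages listed in the table for these columns.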
Distributions of the relative fitness effect values (W) for all possible single nucleotide mutations along the M.HaeIII gene.
| | G3, nonSense | G3, Syn | G3, nonSyn | G7, nonSense | G7, Syn | G7, nonSyn | G17, nonSense | G17, Syn | G17, nonSyn |
|---|---|---|---|---|---|---|---|---|---|
| counts (n) | | | | | | | | | |
| Average W | 0.042 | 0.82 | 0.40 | 0.028 | 0.84 | 0.36 | 0.020 | 0.91 | 0.41 |
| Standard Deviation | 0.15 | 0.12 | 0.38 | 0.12 | 0.15 | 0.36 | 0.11 | 0.10 | 0.38 |
| Average - 2 SD | | 0.58 | | | 0.55 | | | 0.72 | |
| Average - SD | | 0.70 | | | 0.70 | | | 0.82 | |
| Average + 2 SD | 0.34 | 1.06 | | 0.27 | 1.13 | | 0.24 | 1.11 | |
| Fraction of mutations | | | | | | | | | |
| Deleterious (W ≤ 0.6) | 96.8% | 2.5% | 67.5% | 98.4% | 4.0% | 70.3% | 98.4% | 0.3% | 63.1% |
| Neutral (0.6 < W ≤ 1.1) | 3.2% | 95.6% | 29.8% | 1.6% | 93.1% | 27.6% | 1.6% | 98.4% | 35.3% |
| Beneficial (W > 1.1) | 0.0% | 1.9% | 2.7% | 0.0% | 2.8% | 2.1% | 0.0% | 1.2% | 1.7% |
| Fraction of the mutations by their frequencies | | | | | | | | | |
| Deleterious (W ≤ 0.6) | 0.10% | 0.2% | 15.0% | 0.05% | 0.3% | 8.7% | 0.01% | 0.0% | 2.9% |
| Neutral (0.6 < W ≤ 1.1) | 0.13% | 41.7% | 40.1% | 0.04% | 46.4% | 37.5% | 0.03% | 48.0% | 36.3% |
| Beneficial (W > 1.1) | 0.00% | 0.8% | 2.0% | 0.00% | 2.6% | 4.4% | 0.00% | 2.5% | 10.3% |
| Total fraction | 0.23% | 42.7% | 57.0% | 0.09% | 49.3% | 50.6% | 0.04% | 50.5% | 49.4% |
| N per position | 0.003% | 0.53% | 0.71% | 0.002% | 1.11% | 1.14% | 0.002% | 2.97% | 2.91% |
‘nonSense’ refers to all possible stop codons that can be derived by single nucleotide mutations from the reference gene.
‘Syn’ refers to all possible synonymous mutations that can be derived by single nucleotide mutations and encode the same amino acid as the reference gene. Note that the 8 positions in the reference gene occupied by Met or Trp, each of which is encoded by a single codon, were excluded.
‘nonSyn’ refers to all possible nonsynonymous, missense mutations that can be derived by single nucleotide mutations from the reference gene.
‘Average W’ refers to the average of the relative fitness effect values (W) across all possible single nucleotide mutations.
The average minus two standard deviations of W for the synonymous mutations (W ≈ 0.6, on average) was set as the upper threshold for ‘deleterious’ mutations.
The average minus one standard deviation of W for the synonymous mutations was used to define a sub-category of ‘neutral’ mutations, categorizing mutations with W values in the range of 0.6–0.8 as ‘nearly-neutral’.
The average plus two standard deviations of W for the synonymous mutations (~1.1 on average) was set as the upper threshold of neutral mutations, thus categorizing mutations with W > 1.1 as ‘beneficial’.
The average plus two standard deviations of W for the nonsense mutations (W ≈ 0.3, on average) was set as the upper threshold defining ‘highly-deleterious’ mutations.
‘Neutral’, ‘Deleterious’ and ‘Beneficial’ show the fractions of mutations found within the defined thresholds of W values for each category.
‘N per position’ is the average mutational frequency per position observed in each library for the cited type of mutation (nonsense, synonymous or nonsynonymous).
The ‘Fraction’ is the fraction of the cited type of mutation out of all mutations observed in a given round.
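The category thresholds described in these notes can be sketched in code. This is a minimal illustration, not the authors' analysis pipeline: the function names `sd_thresholds` and `classify_w` are hypothetical, the cutoffs 0.3, 0.6, 0.8 and 1.1 are the rounded averages quoted in the notes, and assigning a value that falls exactly on a cutoff to the lower category is an assumption. Reading the threshold rows of the table as the average plus or minus one or two standard deviations is an inference from the numbers, checked here against the first synonymous column (average 0.82, standard deviation 0.12).

```python
def sd_thresholds(mean_w: float, sd_w: float) -> tuple[float, float, float]:
    """Return (mean - 2*SD, mean - SD, mean + 2*SD), rounded to two decimals."""
    return (
        round(mean_w - 2 * sd_w, 2),
        round(mean_w - sd_w, 2),
        round(mean_w + 2 * sd_w, 2),
    )


# Rounded cutoffs quoted in the table notes (averages over the libraries).
HIGHLY_DELETERIOUS_MAX = 0.3  # nonsense mutations: average + 2 SD
DELETERIOUS_MAX = 0.6         # synonymous mutations: average - 2 SD
NEARLY_NEUTRAL_MAX = 0.8      # synonymous mutations: average - SD
NEUTRAL_MAX = 1.1             # synonymous mutations: average + 2 SD


def classify_w(w: float) -> str:
    """Assign a relative fitness effect value W to a category (boundaries assumed inclusive)."""
    if w < 0:
        raise ValueError("W is expected to be non-negative")
    if w <= HIGHLY_DELETERIOUS_MAX:
        return "highly deleterious"  # sub-category of deleterious
    if w <= DELETERIOUS_MAX:
        return "deleterious"
    if w <= NEARLY_NEUTRAL_MAX:
        return "nearly neutral"      # sub-category of neutral
    if w <= NEUTRAL_MAX:
        return "neutral"
    return "beneficial"


# First synonymous column of the table: average 0.82, standard deviation 0.12.
print(sd_thresholds(0.82, 0.12))
print([classify_w(w) for w in (0.1, 0.5, 0.7, 1.0, 1.2)])
```

Using the per-library thresholds from `sd_thresholds` instead of the rounded constants would shift some boundary cases, so the two should not be mixed within one analysis.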