Elena Fimmel, Markus Gumbel, Martin Starman, Lutz Strüngmann.
Abstract
It is believed that the codon-amino acid assignments of the standard genetic code (SGC) help to minimize the negative effects caused by point mutations. All possible point mutations of the genetic code can be represented as a weighted graph with weights that correspond to the probabilities of these mutations. The robustness of a code against point mutations can then be described by means of the so-called conductance measure. This paper quantifies the wobble effect, which was investigated previously by applying the weighted graph approach, and seeks optimal weights using an evolutionary optimization algorithm to maximize the code's robustness. One result of our study is that the robustness of the genetic code is least influenced by mutations in the third position, as with the wobble effect. Moreover, the results clearly demonstrate that point mutations in the first, and even more importantly, in the second base of a codon have a very large influence on the robustness of the genetic code. These results were compared to single nucleotide variants (SNV) in coding sequences, which support our findings. Additionally, it was analyzed which structure of a genetic code evolves from random code tables when the robustness is maximized. Our calculations show that the resulting code tables are very close to the standard genetic code. In conclusion, the results illustrate that the robustness against point mutations seems to be an important factor in the evolution of the standard genetic code.
Keywords: evolutionary algorithm; genetic code; point mutations; wobble effect
Year: 2021 PMID: 34947869 PMCID: PMC8707135 DOI: 10.3390/life11121338
Source DB: PubMed Journal: Life (Basel) ISSN: 2075-1729
Figure 1. The graph G for the alphabet {A, C, G, U}, and all weights . For simplicity, tuples of length 2 are used instead of codons, which leads to only 16 instead of 64 nodes.
Figure 2. The graph of the set {ACU, ACC, ACA, ACG} inside the graph G for . The set-conductance is ≃.
Table 1. Weight distribution as introduced in [12]. Each table shows the weights according to base position 1 to 3 within a codon (see number in upper left corner). In accordance with Definition 1, the weights are symmetric and the diagonal entries are zero.
| 1 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 1 | 1 | 1 |
| C | 1 | 0 | 1 | 1 |
| A | 1 | 1 | 0 | 1 |
| G | 1 | 1 | 1 | 0 |

| 2 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 1 | 1 | 1 |
| C | 1 | 0 | 1 | 1 |
| A | 1 | 1 | 0 | 1 |
| G | 1 | 1 | 1 | 0 |

| 3 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 4 | 2 | 2 |
| C | 4 | 0 | 2 | 2 |
| A | 2 | 2 | 0 | 4 |
| G | 2 | 2 | 4 | 0 |
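To make the role of these weights concrete, the following is a minimal sketch of a set-conductance computation for the codon set shown in Figure 2. The record above does not reproduce the paper's formula, so the definition used here is an assumption: the weight of all edges leaving the set, divided by the total weight of all edges incident to the set's codons (edges inside the set are counted once from each endpoint). The function names are hypothetical.

```python
BASES = "UCAG"

def unit_weight(pos, a, b):
    """All point mutations weighted equally."""
    return 1

def table1_weight(pos, a, b):
    """Weights of the table above; pos is 0-based (0, 1, 2)."""
    if pos < 2:                       # codon positions 1 and 2
        return 1
    # codon position 3: U<->C and A<->G get weight 4, all others weight 2
    return 4 if {a, b} in ({"U", "C"}, {"A", "G"}) else 2

def neighbors(codon):
    """Yield (neighbor, pos, old_base, new_base) for the 9 point mutations."""
    for pos, old in enumerate(codon):
        for new in BASES:
            if new != old:
                yield codon[:pos] + new + codon[pos + 1:], pos, old, new

def set_conductance(codons, weight):
    """ASSUMED definition: cut weight / total incident edge weight."""
    members = set(codons)
    cut = total = 0
    for codon in members:
        for other, pos, old, new in neighbors(codon):
            w = weight(pos, old, new)
            total += w
            if other not in members:
                cut += w
    return cut / total

block = ["ACU", "ACC", "ACA", "ACG"]
print(set_conductance(block, unit_weight))
print(set_conductance(block, table1_weight))
```

Under this assumed definition, raising the third-position weights lowers the conductance of a set whose codons differ only in the third base, because more of the total edge weight stays inside the set.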
Figure 3. Simplified flow chart of an evolutionary algorithm (EA).
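The loop that such a flow chart typically describes (initialize, evaluate, select, mutate, repeat) can be sketched as follows. This is a generic (μ + λ) scheme and not the authors' implementation: the population sizes, the Gaussian mutation, and in particular `toy_fitness` are hypothetical placeholders. Individuals are vectors of 18 non-negative weights rescaled to sum to 36, mirroring the normalization used for the weight tables.

```python
import random

N_WEIGHTS = 18   # one weight per base position and unordered base pair
TOTAL = 36.0     # normalization constant used for the weight tables

def normalize(weights):
    """Rescale non-negative weights so that they sum to TOTAL."""
    s = sum(weights)
    return [w * TOTAL / s for w in weights]

def toy_fitness(weights):
    """HYPOTHETICAL placeholder objective, not the paper's robustness measure."""
    return -sum(weights[:12])

def mutate(weights, rng, sigma=0.2):
    """Gaussian perturbation, clipped to stay positive, then renormalized."""
    child = [max(1e-9, w + rng.gauss(0.0, sigma)) for w in weights]
    return normalize(child)

def evolve(fitness, generations=200, mu=10, lam=40, seed=0):
    rng = random.Random(seed)
    population = [
        normalize([rng.random() + 1e-9 for _ in range(N_WEIGHTS)])
        for _ in range(mu)
    ]
    history = []
    for _ in range(generations):
        offspring = [mutate(rng.choice(population), rng) for _ in range(lam)]
        # (mu + lambda) selection: parents compete with their offspring
        population = sorted(population + offspring, key=fitness, reverse=True)[:mu]
        history.append(fitness(population[0]))
    return population[0], history

best, history = evolve(toy_fitness)
print(round(sum(best), 6), history[0], history[-1])
```

Because parents survive into the selection step, the best fitness never decreases from one generation to the next; replacing `toy_fitness` with a robustness measure is what would turn the sketch into an actual weight optimization.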
Table 2. The index transformation from the row-column system to a linear indexing system. (a) All non-zero columns are mapped to a linear index ranging from 1 to 18. (b) List with values of Table 1.
(a) Indices

| Base position | U ↔ C | U ↔ A | U ↔ G | C ↔ A | C ↔ G | A ↔ G |
|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 3 | 4 | 5 | 6 |
| 2 | 7 | 8 | 9 | 10 | 11 | 12 |
| 3 | 13 | 14 | 15 | 16 | 17 | 18 |

(b) Values

| Base position | U ↔ C | U ↔ A | U ↔ G | C ↔ A | C ↔ G | A ↔ G |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 4 | 2 | 2 | 2 | 2 | 4 |
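The index transformation can be sketched in a few lines. The helper names below (`linear_index`, `flatten`, `table1`) are hypothetical and only illustrate the mapping; the pair order follows the column order of the table above, and the symmetric matrices are rebuilt from the Table 1 values.

```python
BASES = "UCAG"
# Column order of the index table: unordered base pairs
PAIRS = [("U", "C"), ("U", "A"), ("U", "G"), ("C", "A"), ("C", "G"), ("A", "G")]

def linear_index(position, a, b):
    """Map a base position (1-3) and an unordered base pair to 1..18."""
    pair = (a, b) if (a, b) in PAIRS else (b, a)
    return (position - 1) * len(PAIRS) + PAIRS.index(pair) + 1

def table1():
    """Symmetric 4 x 4 weight matrices per base position, as nested dicts."""
    matrices = {}
    for pos in (1, 2, 3):
        matrices[pos] = {a: {b: 0 for b in BASES} for a in BASES}
        for a, b in PAIRS:
            if pos < 3:
                w = 1
            else:
                w = 4 if (a, b) in (("U", "C"), ("A", "G")) else 2
            matrices[pos][a][b] = matrices[pos][b][a] = w
    return matrices

def flatten(matrices):
    """Read the 18 independent weights in linear-index order."""
    return [matrices[pos][a][b] for pos in (1, 2, 3) for a, b in PAIRS]

print(linear_index(3, "G", "A"))
print(flatten(table1()))
```

Working on the flat list of 18 values is convenient for an optimizer, since symmetry and the zero diagonal are then guaranteed by construction instead of having to be enforced.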
Table 3. Optimized weights of the conductance graphs. The values are normalized such that the sum equals 36. (a) Weights from Table 1 in normalized form. These weights were used in [12]. (b) The optimal weights as found by the evolutionary algorithm. Important figures which are discussed in the text are shown in bold.
(a)

| 1 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.6 | 0.6 | 0.6 |
| C | 0.6 | 0 | 0.6 | 0.6 |
| A | 0.6 | 0.6 | 0 | 0.6 |
| G | 0.6 | 0.6 | 0.6 | 0 |

| 2 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.6 | 0.6 | 0.6 |
| C | 0.6 | 0 | 0.6 | 0.6 |
| A | 0.6 | 0.6 | 0 | 0.6 |
| G | 0.6 | 0.6 | 0.6 | 0 |

| 3 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | **2.6** | 1.3 | 1.3 |
| C | **2.6** | 0 | 1.3 | 1.3 |
| A | 1.3 | 1.3 | 0 | **2.6** |
| G | 1.3 | 1.3 | **2.6** | 0 |
(b)

| 1 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.006 | 0.002 | 0.003 |
| C | 0.006 | 0 | 0.003 | 0.003 |
| A | 0.002 | 0.003 | 0 | 0.005 |
| G | 0.003 | 0.003 | 0.005 | 0 |

| 2 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.002 | 0.002 | 0.003 |
| C | 0.002 | 0 | 0.003 | 0.004 |
| A | 0.002 | 0.003 | 0 | 0.005 |
| G | 0.003 | 0.004 | 0.005 | 0 |

| 3 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | | 0.018 | 0.007 |
| C | | 0 | 0.021 | 0.012 |
| A | 0.018 | 0.021 | 0 | |
| G | 0.007 | 0.012 | | 0 |
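The normalization mentioned in the caption can be checked with a short calculation. The sketch below assumes that the sum of 36 is taken over all 36 off-diagonal cells of the three matrices, so that each of the 18 independent weights is counted twice; this assumption reproduces the 0.6 and 1.3 entries of panel (a). The function name `normalize_to` is hypothetical.

```python
def normalize_to(weights, total=36.0):
    """Rescale weights by a common factor so that they sum to `total`."""
    scale = total / sum(weights)
    return [w * scale for w in weights]

# The 18 independent Table 1 weights in linear-index order
# (positions 1 and 2: all 1; position 3: 4, 2, 2, 2, 2, 4).
independent = [1] * 12 + [4, 2, 2, 2, 2, 4]

# Each symmetric matrix holds every weight twice (cell and mirrored cell).
all_cells = independent * 2

normalized = normalize_to(all_cells)
print(sorted({round(w, 1) for w in normalized}))  # [0.6, 1.3, 2.6]
```

The raw weights sum to 56, so the scale factor is 36/56, which maps the raw values 1, 2 and 4 to about 0.64, 1.29 and 2.57 before rounding.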
Table 4. Point mutation frequencies. The values are normalized such that the sum equals 36. (a) Normalized point mutations per base position of SNVs measured in mouse chromosomes 1 and 2. The transitions were normalized, which makes the comparison with the weight matrices simpler. (b) The normalized mutation matrices of (a) were made symmetric, because the conductance weights are symmetric too: each pair of corresponding cells is replaced by its average. Important figures which are discussed in the text are shown in bold.
(a)

| 1 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.9 | 0.2 | 0.3 |
| C | 1.5 | 0 | 0.4 | 0.3 |
| A | 0.2 | 0.3 | 0 | 0.9 |
| G | 0.4 | 0.3 | 1.4 | 0 |

| 2 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.7 | 0.2 | 0.2 |
| C | 1.1 | 0 | 0.2 | 0.2 |
| A | 0.2 | 0.2 | 0 | 0.7 |
| G | 0.2 | 0.2 | 1.1 | 0 |

| 3 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | | 0.5 | 0.6 |
| C | | 0 | 0.8 | 0.7 |
| A | 0.5 | 0.6 | 0 | |
| G | 0.8 | 0.6 | | 0 |
(b)

| 1 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 1.2 | 0.2 | 0.3 |
| C | 1.2 | 0 | 0.4 | 0.3 |
| A | 0.2 | 0.4 | 0 | 1.1 |
| G | 0.3 | 0.3 | 1.1 | 0 |

| 2 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | 0.9 | 0.2 | 0.2 |
| C | 0.9 | 0 | 0.2 | 0.2 |
| A | 0.2 | 0.2 | 0 | 0.9 |
| G | 0.2 | 0.2 | 0.9 | 0 |

| 3 | U | C | A | G |
|---|---|---|---|---|
| U | 0 | | 0.5 | 0.7 |
| C | | 0 | 0.7 | 0.7 |
| A | 0.5 | 0.7 | 0 | |
| G | 0.7 | 0.7 | | 0 |
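The symmetrization step described in the caption (averaging each cell with its mirrored cell) can be sketched as follows, using the position-1 matrix of panel (a) as input. The function name `symmetrize` is hypothetical. Since the input values are already rounded to one decimal, the averages can differ in the last digit from panel (b), which was presumably computed from unrounded frequencies.

```python
def symmetrize(matrix):
    """Replace each cell by the mean of itself and its mirrored cell."""
    n = len(matrix)
    return [
        [(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)]
        for i in range(n)
    ]

# Position 1 of panel (a); rows and columns are ordered U, C, A, G.
position1 = [
    [0.0, 0.9, 0.2, 0.3],
    [1.5, 0.0, 0.4, 0.3],
    [0.2, 0.3, 0.0, 0.9],
    [0.4, 0.3, 1.4, 0.0],
]

sym = symmetrize(position1)
for row in sym:
    print([round(x, 2) for x in row])
```

For example, the U → C entry 0.9 and the C → U entry 1.5 average to 1.2, which matches the U ↔ C value of position 1 in panel (b).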
Figure 4. Relative frequencies of point mutations in coding sequences in mouse chromosomes 1 and 2 according to their position in the codon. Each bar is divided into non-synonymous (red) and synonymous (or silent; in gray) mutations.
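The distinction between synonymous and non-synonymous mutations can be illustrated by enumerating all single-base substitutions in the standard genetic code. This is only a sketch of the classification itself: it counts every possible substitution once and does not use the mouse SNV data behind the figure. Stop codons are treated as one class of their own (an assumption, in line with the 21 classes used for the code tables), and the function names are hypothetical.

```python
BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ..., GGG;
# '*' marks the stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def is_synonymous(codon, position, new_base):
    """True if replacing the base at `position` (1-3) keeps the class."""
    mutated = codon[:position - 1] + new_base + codon[position:]
    return CODE[codon] == CODE[mutated]

def synonymous_counts():
    """Count synonymous substitutions per codon position over all 64 codons."""
    counts = {1: 0, 2: 0, 3: 0}
    for codon in CODE:
        for position in (1, 2, 3):
            for new_base in BASES:
                if new_base == codon[position - 1]:
                    continue
                if is_synonymous(codon, position, new_base):
                    counts[position] += 1
    return counts

print(synonymous_counts())
```

Each position admits 64 × 3 = 192 substitutions, so the counts can be read directly as fractions of 192 when comparing the three positions.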
Table 5. Optimized code tables with 21 classes. (a) The optimal genetic code table as published in [13], where all weights are set to 1. (b) The optimization result obtained with the weights from Table 1. (c) The optimization result obtained with the weights from Table 3b. Amino acids are displayed when they match the SGC.
(a)

| First base | U | Class | C | Class | A | Class | G | Class | Third base |
|---|---|---|---|---|---|---|---|---|---|
| U | UUU | | UCU | | UAU | | UGU | | U |
| | UUC | | UCC | | UAC | | UGC | | C |
| | UUA | | UCA | | UAA | | UGA | | A |
| | UUG | | UCG | 8 | UAG | | UGG | | G |
| C | CUU | | CCU | | CAU | | CGU | | U |
| | CUC | | CCC | | CAC | | CGC | | C |
| | CUA | | CCA | | CAA | | CGA | | A |
| | CUG | | CCG | | CAG | | CGG | | G |
| A | AUU | | ACU | | AAU | | AGU | | U |
| | AUC | | ACC | | AAC | | AGC | | C |
| | AUA | | ACA | | AAA | | AGA | | A |
| | AUG | | ACG | | AAG | | AGG | | G |
| G | GUU | | GCU | | GAU | | GGU | | U |
| | GUC | | GCC | | GAC | | GGC | | C |
| | GUA | | GCA | | GAA | | GGA | | A |
| | GUG | | GCG | | GAG | | GGG | | G |

(b)

| First base | U | Class | C | Class | A | Class | G | Class | Third base |
|---|---|---|---|---|---|---|---|---|---|
| U | UUU | | UCU | | UAU | | UGU | | U |
| | UUC | | UCC | | UAC | | UGC | | C |
| | UUA | | UCA | | UAA | | UGA | | A |
| | UUG | | UCG | | UAG | | UGG | | G |
| C | CUU | | CCU | | CAU | | CGU | | U |
| | CUC | | CCC | | CAC | | CGC | | C |
| | CUA | | CCA | | CAA | | CGA | | A |
| | CUG | | CCG | | CAG | | CGG | | G |
| A | AUU | | ACU | | AAU | | AGU | | U |
| | AUC | | ACC | | AAC | | AGC | | C |
| | AUA | | ACA | | AAA | | AGA | | A |
| | AUG | | ACG | | AAG | | AGG | | G |
| G | GUU | | GCU | | GAU | | GGU | | U |
| | GUC | | GCC | | GAC | | GGC | | C |
| | GUA | | GCA | | GAA | | GGA | | A |
| | GUG | | GCG | | GAG | | GGG | | G |

(c)

| First base | U | Class | C | Class | A | Class | G | Class | Third base |
|---|---|---|---|---|---|---|---|---|---|
| U | UUU | | UCU | | UAU | | UGU | | U |
| | UUC | | UCC | | UAC | | UGC | | C |
| | UUA | | UCA | | UAA | | UGA | | A |
| | UUG | | UCG | | UAG | | UGG | | G |
| C | CUU | | CCU | | CAU | | CGU | | U |
| | CUC | | CCC | | CAC | | CGC | | C |
| | CUA | | CCA | | CAA | | CGA | | A |
| | CUG | | CCG | | CAG | | CGG | | G |
| A | AUU | | ACU | | AAU | | AGU | | U |
| | AUC | | ACC | | AAC | | AGC | | C |
| | AUA | | ACA | | AAA | | AGA | | A |
| | AUG | | ACG | | AAG | | AGG | | G |
| G | GUU | 5 (Val/V) | GCU | 9 (Ala/A) | GAU | 15 (Asp/D) | GGU | 21 (Gly/G) | U |
| | GUC | 5 (Val/V) | GCC | 9 (Ala/A) | GAC | 15 (Asp/D) | GGC | 21 (Gly/G) | C |
| | GUA | 5 (Val/V) | GCA | 9 (Ala/A) | GAA | 16 (Glu/E) | GGA | 21 (Gly/G) | A |
| | GUG | 5 (Val/V) | GCG | 9 (Ala/A) | GAG | 16 (Glu/E) | GGG | 21 (Gly/G) | G |