Tom Froese, Jorge I Campos, Kosuke Fujishima, Daisuke Kiga, Nathaniel Virgo.
Abstract
Theories of the origin of the genetic code typically appeal to natural selection and/or mutation of hereditable traits to explain its regularities and error robustness, yet the present translation system presupposes high-fidelity replication. Woese's solution to this bootstrapping problem was to assume that code optimization had played a key role in reducing the effect of errors caused by the early translation system. He further conjectured that initially evolution was dominated by horizontal exchange of cellular components among loosely organized protocells ("progenotes"), rather than by vertical transmission of genes. Here we simulated such communal evolution based on horizontal transfer of code fragments, possibly involving pairs of tR…
Year: 2018 PMID: 29476089 PMCID: PMC5824800 DOI: 10.1038/s41598-018-21973-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Assignments of the 64 codons of the genetic code. The bases of the codon table are arranged according to their specific error robustness: least (top), middle (left), and most robust (right). An amino acid’s slot is coloured according to its polar requirement to illustrate chemical similarity. Its aminoacyl-tRNA synthetase class is I or II. (a) The highly ordered standard genetic code. Stop codon slots are coloured white. (b) A highly robust artificial code emerging from the iterated learning model. Stop codons were not included in the model.
Figure 2. ‘Black box’ of a protocell’s primitive translation system. For simplicity, and following previous work on the iterated learning model, we used a fully interconnected feed-forward multi-layer perceptron network to model the translational mapping from a codon to its corresponding amino acid. There are three input nodes, one for each of a codon’s bases. The order of base positions is arbitrary and interchangeable (no third base ‘wobble’). There are six hidden nodes. Output is an 11-dimensional vector that specifies an amino acid in terms of properties by which it can be uniquely distinguished in chemical space.
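The network shape described in this caption (three inputs, six hidden nodes, 11 outputs) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the numeric base encoding `BASE_VALUE`, the sigmoid activation, the bias terms, and the weight range are all assumptions that the caption does not specify, and the function names are hypothetical.

```python
import math
import random

# Assumed numeric encoding of the four bases (not given in the caption).
BASE_VALUE = {"U": 0.0, "C": 1 / 3, "A": 2 / 3, "G": 1.0}

# Layer sizes taken from the caption: 3 inputs, 6 hidden nodes, 11 outputs.
N_IN, N_HIDDEN, N_OUT = 3, 6, 11


def sigmoid(x):
    """Logistic activation (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(rng):
    """Return random weights for two fully connected layers.

    Each row holds one node's incoming weights plus a trailing bias weight.
    """
    w1 = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HIDDEN)]
    w2 = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN + 1)] for _ in range(N_OUT)]
    return w1, w2


def layer(weights, inputs):
    """Apply one fully connected layer followed by the sigmoid."""
    extended = inputs + [1.0]  # constant input for the bias weight
    return [sigmoid(sum(w * x for w, x in zip(row, extended))) for row in weights]


def translate(network, codon):
    """Map a three-base codon to an 11-dimensional property vector."""
    w1, w2 = network
    x = [BASE_VALUE[base] for base in codon]
    return layer(w2, layer(w1, x))


network = init_network(random.Random(1))
properties = translate(network, "AUG")
print(len(properties))  # number of output dimensions
```

In a fuller model, the output vector would be matched to the nearest amino acid in the 11-dimensional property space; that lookup is omitted here because the property values are not given in this record.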
Figure 3. A model of communal evolution of the genetic code. (a) A small group of protocells is initialized such that their ‘black box’ primitive translation systems encode random genetic codes consisting of few amino acids. Then the ‘iterative learning’ cycle begins. (b) Two protocells are randomly selected for horizontal transfer of a fragment of the donor’s genetic code to the recipient. (c) A small subset of codon assignments is randomly chosen and transferred; occasionally, codon assignment inaccuracies can occur in the transferred components. (d) The recipient adjusts its genetic code to be more like the donor’s code according to the received assignments. (e) The process of horizontal transfer is completed. Then the cycle starts again by going back to (b).
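The cycle in steps (a) to (e) can be sketched in a deliberately simplified form. In this sketch each protocell's code is a plain lookup table and the recipient copies the received assignments directly, whereas the paper's protocells adjust a neural network; the group size, fragment size, error rate, and number of starting amino acids are hypothetical values chosen only for illustration.

```python
import random
from itertools import product

CODONS = ["".join(bases) for bases in product("UCAG", repeat=3)]  # 64 codons
AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")  # one-letter codes, 20 amino acids


def transfer(rng, donor, recipient, fragment_size=5, error_rate=0.01):
    """Steps (c)-(d): copy a random code fragment from donor to recipient.

    fragment_size and error_rate are hypothetical parameters.
    """
    for codon in rng.sample(CODONS, fragment_size):
        amino_acid = donor[codon]
        if rng.random() < error_rate:
            # Occasional inaccuracy in the transferred assignment.
            amino_acid = rng.choice(AMINO_ACIDS)
        recipient[codon] = amino_acid


def run(rng, n_cells=10, n_transfers=1000, initial_aas=2):
    """Steps (a)-(e): initialize a group, then iterate horizontal transfers."""
    cells = []
    for _ in range(n_cells):
        # (a) Random code built from only a few amino acids.
        starting_set = rng.sample(AMINO_ACIDS, initial_aas)
        cells.append({codon: rng.choice(starting_set) for codon in CODONS})
    for _ in range(n_transfers):
        # (b) Pick two distinct protocells: a donor and a recipient.
        donor_i, recipient_i = rng.sample(range(n_cells), 2)
        transfer(rng, cells[donor_i], cells[recipient_i])
    return cells


cells = run(random.Random(0))
print(len(cells), len(cells[0]))  # group size, codons per code
```

Because direct copying replaces the learning step, this sketch should not be expected to reproduce the optimization results reported in the paper; it only shows the control flow of the transfer cycle.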
Figure 4. Emergence of artificial genetic codes. Results are averaged from 50 runs and plotted in intervals of 500 transfers. Expressivity counts the receiver’s encoded amino acids after a transfer (range [1, 20]), plotted as a box plot where the dark green bar represents the overall mean, the lighter green bar represents lower and upper quartiles, dotted lines represent minimum and maximum non-outliers, and circles represent outliers. Δcode represents optimality as the code’s robustness to single nucleotide changes (red box plot). The standard genetic code (SGC) has an expressivity of 20, the number of amino acids encoded in the code. The Δcode of SGC’s codons (excluding stop codons) is 5.24 (red line). The most robust artificial code with the same expressivity as the SGC has a Δcode of 4.17 (see Fig. 1b for details). Universality is measured as the average distance between all codes in a group of protocells, where distance is calculated as the number of different codon assignments (range [0, 64]). We plot the overall mean distance and its standard deviation, with the final average of 16.86 different assignments being the smallest overall average encountered for the duration of these runs.
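Two of the measures defined in this caption, expressivity and universality, are simple enough to sketch directly. The example assumes a code is stored as a mapping from codon to amino acid and that the average is taken over all unordered pairs of codes; both are assumptions, as are the function names. Δcode is not sketched because its formula is not given in this record.

```python
from itertools import combinations


def expressivity(code):
    """Number of distinct amino acids a code encodes."""
    return len(set(code.values()))


def code_distance(code_a, code_b):
    """Number of codons that the two codes assign differently."""
    return sum(1 for codon in code_a if code_a[codon] != code_b[codon])


def universality(codes):
    """Mean distance over all unordered pairs of codes in a group."""
    pairs = list(combinations(codes, 2))
    return sum(code_distance(a, b) for a, b in pairs) / len(pairs)


# Toy codes over three codons only, to keep the arithmetic visible.
a = {"AAA": "K", "AAU": "N", "AAC": "N"}
b = {"AAA": "K", "AAU": "K", "AAC": "N"}
c = {"AAA": "R", "AAU": "K", "AAC": "N"}

print(expressivity(a))         # 2 (K and N)
print(code_distance(a, c))     # 2 (AAA and AAU differ)
print(universality([a, b, c])) # (1 + 2 + 1) / 3
```

With full 64-codon codes the distance falls in the range [0, 64] stated in the caption, and lower group averages indicate codes that are closer to universal.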
Figure 5. Regularities of the artificial genetic codes. We analysed the average properties of the 50 most optimal artificial genetic codes, one from each of the 50 independent runs. (a) Like the standard genetic code, the class of simple amino acids has more assignments than the complex and sulfur classes (red). This may partly result from the fact that the simple class is more frequent among the 20 encoded amino acids, but this tendency remains even if we correct for the unequal distribution of classes (blue). (b) Like the standard genetic code, there is a positive correlation between an amino acid’s frequency in proteins, modelled in terms of probability of amino acid transfer, and number of assignments (black). There is also a negative correlation between its molecular weight and number of assignments (purple). Again, this may partly result from the fact that lighter amino acids are more frequent among the 20 encoded amino acids.