Anton A Polyansky, Mario Hlevnjak, Bojan Zagrovic.
Abstract
Despite more than 50 years of effort, the origin of the genetic code remains enigmatic. Among different theories, the stereochemical hypothesis suggests that the code evolved as a consequence of direct interactions between amino acids and appropriate bases. If true, such a physicochemical foundation of the mRNA/protein relationship could also lead to novel principles of protein-mRNA interactions in general. Inspired by this promise, we have recently explored the connection between the physicochemical properties of mRNAs and their cognate proteins at the proteome level. Using experimentally and computationally derived measures of the solubility of amino acids in aqueous solutions of pyrimidine analogs, together with knowledge-based interaction preferences of amino acids for different nucleobases, we have revealed a statistically significant matching between the composition of mRNA coding sequences and the base-binding preferences of their cognate protein sequences. Our findings provide strong support for the stereochemical hypothesis of the origin of the genetic code and suggest the possibility of direct complementary interactions between mRNAs and cognate proteins even in present-day cells.
Keywords: knowledge-based statistical potentials; mRNA-cognate protein complementarity; origin of the genetic code; polar requirement; stereochemical hypothesis
Year: 2013 PMID: 23945356 PMCID: PMC3817144 DOI: 10.4161/rna.25977
Source DB: PubMed Journal: RNA Biol ISSN: 1547-6286 Impact factor: 4.652

Figure 1. Matching of mRNA coding-sequence pyrimidine (PYR) profiles and their cognate protein sequence polar-requirement (PR) profiles. (A) Distribution of Pearson correlation coefficients (R, x-axis) between window-averaged PYR-content profiles of individual mRNA coding sequences and window-averaged PR sequence profiles of their cognate proteins for the human proteome (window size is 21 amino acids/codons). P (y-axis) corresponds to bin-size-normalized probability density. Inset: the median Rs for the human proteome obtained using protein PR sequence profiles and different nucleobase density mRNA profiles. (B) Comparison between mRNA PYR-content profiles and their cognate protein PR profiles for three exemplary human proteins. The experimental PR scale was used for all comparisons presented here. Note that, due to the definition of the PR scale, negative correlations indicate positive matching between the given content of mRNAs and the affinity of their cognate proteins for PYR analogs, and vice versa.
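
The profile comparison described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline. It assumes the coding sequence is given without its stop codon, that the PYR content is computed per codon so that both profiles have one value per residue, and that the amino-acid scale (for example, a PR scale) is supplied by the caller as a dictionary. All function names are hypothetical.

```python
from statistics import mean

# U for mRNA; T is accepted too in case the sequence is given as cDNA.
PYRIMIDINES = frozenset("CUT")


def window_average(values, window=21):
    """Sliding-window mean; returns one value per full window."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def codon_pyr_fraction(cds):
    """Fraction of pyrimidine bases in each codon of a coding sequence."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [
        sum(base in PYRIMIDINES for base in cds[i:i + 3]) / 3
        for i in range(0, len(cds), 3)
    ]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def profile_correlation(cds, protein, scale, window=21):
    """Correlate the window-averaged codon PYR profile of `cds` with the
    window-averaged profile of `protein` under the amino-acid `scale`
    (a dict mapping one-letter codes to values, supplied by the caller)."""
    pyr_profile = codon_pyr_fraction(cds)
    aa_profile = [scale[aa] for aa in protein]
    if len(pyr_profile) != len(aa_profile):
        raise ValueError("cds must encode exactly len(protein) residues")
    return pearson_r(
        window_average(pyr_profile, window),
        window_average(aa_profile, window),
    )
```

Calling `profile_correlation(cds, protein, scale)` for each mRNA/protein pair and collecting the results would give a distribution of R values of the kind shown in panel A. The same functions apply to other nucleobase classes if `PYRIMIDINES` is swapped for a different base set.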

Figure 2. Matching of mRNA coding-sequence purine (PUR) density profiles and protein sequence knowledge-based interaction preference profiles. (A) Distribution of Pearson correlation coefficients (R, x-axis) between mRNA PUR profiles and G- (blue) or A-preference (magenta) sequence profiles of human proteins. P (y-axis) corresponds to bin-size-normalized probability density. Inset: the median Rs for the human proteome obtained using PUR mRNA profiles and different knowledge-based interaction preference scales for cognate protein sequences. (B) Comparison between mRNA PUR density profiles and their cognate protein G-preference profiles for the same three exemplary human proteins as given in Figure 1B. Note that due to the definition of knowledge-based scales, negative correlations indicate positive matching between the given content of mRNAs and the binding preferences of their cognate proteins and vice versa.

Figure 3. Connection between knowledge-based nucleobase-binding preferences of amino acids and the base content of their cognate codons. (A) Correlation between G-interaction preferences of amino acids from the Miller-Urey experiment and the average G-content of their codons in mRNAs of the entire human proteome. (B) Correlation between A-interaction preferences of other amino acids and the average A-content of their codons. (C) Correlation coefficients (R) between different amino-acid interaction preferences and the respective compositions of their codons. Amino acids are grouped into three subsets, which are colored according to the legend. Note that, due to the definition of knowledge-based scales, negative correlations indicate positive matching between the given content of codons and the binding preferences of their cognate amino acids, and vice versa.
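
The "average G-content of their codons" in the Figure 3 caption can be illustrated with a short Python sketch. It assumes the standard genetic code and a usage-weighted average, in which every codon occurrence in the supplied coding sequences counts once. The exact averaging used for the figure is not stated in the caption, so this is one plausible reading, and the function name is hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in the conventional TCAG ordering; '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def mean_codon_base_content(coding_sequences, base="G"):
    """Usage-weighted mean fraction of `base` in the codons of each amino acid.

    Every codon occurrence in `coding_sequences` contributes once, so
    frequently used codons weigh more. Stop codons, codons with
    non-standard letters, and trailing partial codons are skipped.
    """
    base = base.upper().replace("U", "T")
    totals = defaultdict(float)
    counts = defaultdict(int)
    for cds in coding_sequences:
        cds = cds.upper().replace("U", "T")
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)
            if aa is None or aa == "*":
                continue
            totals[aa] += codon.count(base) / 3
            counts[aa] += 1
    return {aa: totals[aa] / counts[aa] for aa in counts}
```

The resulting per-amino-acid values could then be correlated against an interaction-preference scale supplied by the reader, which is the comparison plotted in panels A and B.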