Andrey Chursov, Mathias C Walter, Thorsten Schmidt, Andrei Mironov, Alexander Shneider, Dmitrij Frishman.
Abstract
It is generally accepted that functionally important RNA structure is more conserved than sequence, because compensatory mutations may alter the sequence without disrupting the structure. For small RNA molecules, sequence-structure relationships are relatively well understood. However, structural bioinformatics of mRNAs is still in its infancy due to a virtual absence of experimental data. This report presents the first quantitative assessment of sequence-structure divergence in the coding regions of mRNA molecules, based on a recently published transcriptome-wide experimental determination of their base pairing patterns. Structural resemblance in paralogous mRNA pairs drops quickly as sequence identity decreases from 100% to 85-90%. Structures of mRNAs sharing sequence identity below roughly 85% are essentially uncorrelated. This outcome is in dramatic contrast to small functional non-coding RNAs, where sequence and structure divergence remain correlated down to very low levels of sequence similarity. The fact that very similar mRNA sequences can have vastly different secondary structures may imply that the particular global shape of base-paired elements in coding regions does not play a major role in modulating gene expression and translation efficiency. Apparently, the need to maintain stable three-dimensional structures of the encoded proteins places a much higher evolutionary pressure on mRNA sequences than on their RNA structures.
Year: 2011 PMID: 21954438 PMCID: PMC3273797 DOI: 10.1093/nar/gkr790
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Sequence alignment, experimental and theoretical structures of the first and last 50 nt for the pair of yeast mRNA sequences YBR092C (dashed lines) and YBR093C (dotted lines).
Figure 2. The profile of local structural similarity versus local sequence identity for the pair of yeast mRNA sequences YBR092C and YBR093C. The length of the sliding window is 300. The global sequence identity between these two sequences is 86.5%.
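A profile like the one in Figure 2 can be approximated with a simple sliding-window pass over an aligned pair. The sketch below is illustrative only and is not the authors' pipeline. It assumes the alignment is given as two equal-length gapped strings, that per-position structure scores (for example PARS scores) are already mapped onto alignment columns, and that gap columns count as mismatches. The function names `local_identity` and `local_rmsd` are hypothetical.

```python
from math import sqrt


def local_identity(aln_a, aln_b, window):
    """Percent identity in each sliding window over alignment columns.

    Assumption: a column counts as a match only if both residues are
    equal and neither is a gap ('-').
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    matches = [int(a == b and a != "-") for a, b in zip(aln_a, aln_b)]
    return [
        100.0 * sum(matches[start:start + window]) / window
        for start in range(len(matches) - window + 1)
    ]


def local_rmsd(scores_a, scores_b, window):
    """Root-mean-square deviation between two per-position score
    profiles in each sliding window (lower means more similar)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score profiles must have equal length")
    sq = [(x - y) ** 2 for x, y in zip(scores_a, scores_b)]
    return [
        sqrt(sum(sq[start:start + window]) / window)
        for start in range(len(sq) - window + 1)
    ]
```

With `window=300`, as in the caption, the two returned lists have one entry per window start and can be plotted against each other. The re-summing of each window is kept for clarity; a running sum would be faster on long transcripts.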
Correlation coefficients and P-values for different ranges of sequence identity
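| Sequence identity range (%) | Sequence identity versus RMSD between experimental structures: correlation coefficient | Sequence identity versus RMSD between experimental structures: P-value | Sequence identity versus RMSD between theoretical structures: correlation coefficient | Sequence identity versus RMSD between theoretical structures: P-value | RMSD between experimental structures versus RMSD between theoretical structures: correlation coefficient | RMSD between experimental structures versus RMSD between theoretical structures: P-value |
|---|---|---|---|---|---|---|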
| 50–60 | 0.12 | 0.39 | −0.07 | 0.62 | 0.14 | 0.31 |
| 60–70 | 0.14 | 0.22 | −0.10 | 0.37 | −0.02 | 0.87 |
| 70–80 | −0.08 | 0.67 | −0.08 | 0.67 | −0.24 | 0.21 |
| 80–90 | 0.01 | 0.91 | −0.14 | 0.40 | 0.04 | 0.79 |
| 90–100 | −0.92 | 5.66e−27 | −0.75 | 1.24e−12 | 0.69 | 3.56e−10 |
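
The per-range coefficients in the table can be reproduced in outline by binning mRNA pairs by sequence identity and correlating identity with structural distance within each bin. The following is a minimal sketch under stated assumptions: the source does not say which correlation coefficient was used, so Pearson's r is an assumption here, and P-values are left out because they would need a statistics library. The names `pearson_r` and `correlation_by_bin` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed; the source does not
    name the coefficient). Returns NaN if either variable is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of size >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def correlation_by_bin(pairs, edges):
    """Correlate identity with distance inside each identity range.

    pairs: iterable of (sequence_identity_percent, rmsd) tuples.
    edges: ascending bin edges, e.g. [50, 60, 70, 80, 90, 100].
    Bins are half-open [lo, hi), except the last, which includes hi.
    Bins with fewer than two pairs are skipped.
    """
    pairs = list(pairs)
    result = {}
    for lo, hi in zip(edges, edges[1:]):
        is_last = hi == edges[-1]
        sel = [(i, d) for i, d in pairs
               if lo <= i < hi or (is_last and i == hi)]
        if len(sel) >= 2:
            result[(lo, hi)] = pearson_r([i for i, _ in sel],
                                         [d for _, d in sel])
    return result
```

Calling `correlation_by_bin(pairs, [50, 60, 70, 80, 90, 100])` returns a dictionary keyed by `(lo, hi)` range. A strongly negative value in the 90-100 bin would correspond to the pattern reported in the last table row.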
Figure 3. Boxplots of distances between structures of aligned paralogous mRNAs in different ranges of sequence similarity. Each box corresponds to a 2.5% range of similarity. The box extends from the lower to the upper quartile values, with a horizontal line at the median value. Whiskers show the entire range of the data. Crosses show outliers. (a) Distances between experimental structures. The average level of PARS score distances for alignments of random sequence pairs is 2.14 (dashed line). (b) Distances between theoretical structures. The average level of probability distance for alignments of random sequence pairs is 0.5 (dashed line).
Figure 4. Boxplot of distances between theoretical structures of aligned orthologous mRNAs in different ranges of sequence similarity. Notation as in Figure 3.