Abstract
Structural 3D motifs in RNA play an important role in RNA stability and function. Previous studies have focused on the characterization and discovery of 3D motifs in RNA secondary and tertiary structures. However, statistical analyses of the distribution of 3D motifs along the RNA appear to be lacking. Herein, we present a novel strategy for evaluating the distribution of 3D motifs along the RNA chain and for identifying those motifs whose distributions are significantly non-random. When the method was applied to the X-ray structure of the large ribosomal subunit from Haloarcula marismortui, helical motifs were found to cluster together along the chain and in the 3D structure, whereas the known tetraloops tend to be sequentially and spatially dispersed. That the distribution of key structural motifs such as tetraloops differs significantly from a random one suggests that our method could also be used to detect novel 3D motifs of any size in sufficiently long/large RNA structures. The motif distribution type can help in the prediction and design of 3D structures of large RNA molecules.
Year: 2010 PMID: 20159997 PMCID: PMC2887949 DOI: 10.1093/nar/gkq074
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic diagrams showing the various tetraloop motifs and their hydrogen-bonding interactions, adapted from Hsiao et al. (11), Figure 3. (a) The standard tetraloop. (b) A tetraloop with a 3–2 switch, where the bases of the j+2 and the j+3 residues are switched. (c) A tetraloop with insertion, where a residue (in pink) is inserted between the j+1 and the j+2 residues. More than one residue can be inserted, and a sufficiently extensive insertion would produce a strand clip. (d) A tetraloop with deletion, where the j+2 residue in the standard tetraloop is absent so that the j+3 residue becomes the j+2 residue. Rectangles, pentagons and circles denote base, sugar and phosphate groups, respectively, while dashed lines denote characteristic hydrogen bonds.
Figure 3. Three-dimensional backbones and secondary structures corresponding to the representative 4-mer motifs in Table 1: (a) helical motif, 178−181, (b) part of an internal loop, 1052−1055, (c) tetraloop motif, 1794−1797, (d) part of an internal loop, 209−212 and (e) part of an internal loop, 2689−2692. In the secondary structures, circles denote residues, and filled circles denote the residues comprising the 4-mer motif. Single and double lines denote one and two hydrogen bonds, respectively, a dot on a line marks a non-Watson–Crick base pair, and red lines represent base triples. Secondary structures were prepared with the program VARNA (47).
Figure 2. The shape histogram of a given RNA fragment. (a) The backbone atoms of the RNA fragment 1794–1797 and some of their distances to the centroid. (b) The shape histogram of the fragment, represented as the frequency of each integer distance in ångströms.
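The shape histogram described in Figure 2 can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes that distances are rounded to the nearest integer ångström, and that the "Cos" threshold quoted in the table captions refers to a cosine similarity between two such histograms. The function names and the toy coordinates are hypothetical.

```python
import math
from collections import Counter


def centroid(points):
    """Return the geometric centre of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def shape_histogram(points):
    """Count atoms per integer distance (in Å) from the fragment centroid.

    Assumption: distances are rounded to the nearest integer.
    """
    c = centroid(points)
    return Counter(round(math.dist(p, c)) for p in points)


def cosine_similarity(h1, h2):
    """Cosine of the angle between two histograms viewed as sparse vectors."""
    dot = sum(h1[k] * h2[k] for k in h1.keys() & h2.keys())
    norm1 = math.sqrt(sum(v * v for v in h1.values()))
    norm2 = math.sqrt(sum(v * v for v in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)


# Toy coordinates (not real backbone atoms): a small and a large square.
frag_small = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
frag_large = [(0, 0, 0), (6, 0, 0), (0, 6, 0), (6, 6, 0)]

hist_small = shape_histogram(frag_small)
hist_large = shape_histogram(frag_large)
```

Under this reading, two fragments would be candidates for the same motif when the cosine similarity of their histograms is at least 0.95. Because the histogram discards atom order and orientation, a second check such as the RMSD threshold is still needed to confirm a match.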
Table 1. 4-mer motifs derived using Cos = 0.95 and RMSD = 1.5 Å
| Motif (a) | Frequency (b) | C (c) | σ (d) | Distance ratio (e) | Consensus sequence (f) |
|---|---|---|---|---|---|
| 178 | 1029 | 8.55 | 1.26 | 0.41 | C (30%) G (32%) G (36%) G (36%) |
| 1052 | 117 | −1.75 | 0.83 | 1.26 | G (38%) C (33%) G (32%) G (47%) |
| **1794** | 34 | −1.90 | 0.66 | 2.16 | G (65%) A (50%) A (65%) G (35%) |
| 209 | 23 | −1.84 | 0.61 | 2.22 | C (35%) G/C (30%) C (48%) A (56%) |
| 2689 | 16 | −1.93 | 0.53 | 2.98 | C (37%) C (62%) A (56%) G (50%) |
(a) Position of the representative l-mer of the motif along the RNA chain (‘Methods’ section); the number in bold corresponds to the representative of the tetraloop motif.
(b) The number of times the l-mer motif is found along the RNA chain.
(c) C score calculated according to Equations (4) and (5).
(d) σ score calculated according to Equation (3).
(e) The average of the distances between the centroid of each l-mer and the centroid of its nearest neighbor, divided by the average of the distances between any two centroids of all l-mer motifs in the 1jj2 structure.
(f) The consensus sequence, with the percentage frequency of each base in parentheses.
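The distance ratio defined in the footnote on nearest-neighbor centroid distances can be sketched as below. This is one possible reading and not the authors' code: the numerator is taken over the occurrences of a single motif, and the denominator over the centroids of all l-mer motifs in the structure. The function names and toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def mean_nearest_neighbor_distance(centroids):
    """Average distance from each centroid to its closest other centroid."""
    total = 0.0
    for i, c in enumerate(centroids):
        total += min(
            math.dist(c, other)
            for j, other in enumerate(centroids)
            if j != i
        )
    return total / len(centroids)


def mean_pairwise_distance(centroids):
    """Average distance over all unordered pairs of centroids."""
    pairs = list(combinations(centroids, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def distance_ratio(motif_centroids, all_centroids):
    """Nearest-neighbor spacing of one motif relative to the overall spacing.

    Both inputs need at least two centroids.
    """
    return (
        mean_nearest_neighbor_distance(motif_centroids)
        / mean_pairwise_distance(all_centroids)
    )


# Toy example: two occurrences 1 Å apart inside a set spanning 10 Å.
motif = [(0, 0, 0), (1, 0, 0)]
everything = [(0, 0, 0), (1, 0, 0), (10, 0, 0)]
ratio = distance_ratio(motif, everything)
```

A small ratio means that occurrences of a motif lie close together in space relative to the typical separation in the structure. In the table above, the helical motif at position 178 has the smallest value (0.41), which agrees with the clustering reported in the abstract.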
All 4-mers encompassing tetraloop motifs derived using Cos = 0.95 and RMSD = 1.5 Å
(a) Position of the first residue of the 4-mer along the RNA chain.
(b) Sequence of the 4-mer.
(c) Computed relative to the backbone atoms of the 1794−1797 fragment.
(d) Tetraloop type, according to the annotation by Hsiao et al. (11), Figure 4 (see also Figure 1); FP means false positive, i.e. the 4-mer is not one of the 43 tetraloops annotated by Hsiao et al., and is shaded grey.
3-mer motifs derived using Cos = 0.95 and RMSD = 1.5 Å
(a) See footnotes to Table 1.
(b) Shaded motifs are found as part of 4-mer motifs in Table 1; the motifs in italics do not overlap with known motifs in the SCOR database.
5-mer motifs derived using Cos = 0.95 and RMSD = 1.5 Å
(a) See footnotes to Table 1.
(b) Shaded motifs are found as part of 4-mer motifs in Table 1.
Sequence signatures of the recurrent motifs derived using Cos = 0.95 and RMSD = 1.5 Å
| Motif (a) | Frequency | C | σ | Distance ratio | Consensus sequence |
|---|---|---|---|---|---|
| 3-mer | | | | | |
| 212 | 11 | −0.12 | 0.84 | 3.62 | A (81%) G (81%) U (72%) |
| 534 | 11 | 0.54 | 0.99 | 3.23 | G (63%) A (63%) A (90%) |
| 552 | 14 | −0.63 | 0.77 | 3.16 | A (71%) C (85%) C (64%) |
| 4-mer | | | | | |
| 567 | 16 | −0.34 | 0.84 | 2.79 | C (56%) G (81%) A (81%) A (75%) |
| 1389 | 28 | −1.65 | 0.67 | 2.45 | G (64%) A (57%) A (71%) A (75%) |
| 1862 | 14 | −1.33 | 0.62 | 3.32 | C (57%) G (71%) A (57%) A (64%) |
| 5-mer | | | | | |
| 469 | 16 | −1.12 | 0.69 | 3.12 | G (75%) A (50%) A (68%) A (81%) G (62%) |
| 690 | 16 | −2.61 | 0.39 | 3.34 | C (50%) G (81%) A (50%) A (68%) A (81%) |
(a) See footnotes to Table 1.
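The consensus sequences in these tables give, for each position of the l-mer, the most frequent base and its percentage frequency. A minimal sketch of that calculation follows. It is an illustration and not the authors' code. The percentages are truncated rather than rounded, because the tabulated values appear to be truncated (for example, 9 of 11 occurrences would give the 81% listed for motif 212); this is an inference from the numbers. Ties are joined with a slash in alphabetical order. The function names and the input sequences are hypothetical.

```python
from collections import Counter


def consensus(sequences):
    """Return [(base_or_bases, percent), ...] for equal-length sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    result = []
    for column in zip(*sequences):
        counts = Counter(column)
        top = max(counts.values())
        # Join tied bases alphabetically, e.g. "C/G".
        bases = "/".join(sorted(b for b, n in counts.items() if n == top))
        # Integer division truncates the percentage (assumption, see above).
        result.append((bases, 100 * top // len(sequences)))
    return result


def format_consensus(sequences):
    """Render the consensus in the style used in the tables."""
    return " ".join(f"{base} ({pct}%)" for base, pct in consensus(sequences))


# Made-up 4-mer occurrences, not taken from the 1jj2 structure.
example = ["GAAA", "GAAG", "GCAA", "CAAA"]
summary = format_consensus(example)
```

For the made-up input above, `summary` is `G (75%) A (75%) A (100%) A (75%)`.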