Feng Ding, Christopher A Lavender, Kevin M Weeks, Nikolay V Dokholyan.
Abstract
Molecular modeling guided by experimentally derived structural information is an attractive approach for three-dimensional structure determination of complex RNAs that are not amenable to study by high-resolution methods. Hydroxyl radical probing (HRP), which is performed routinely in many laboratories, provides a measure of solvent accessibility at individual nucleotides. HRP measurements have, to date, only been used to evaluate RNA models qualitatively. Here we report the development of a quantitative structure refinement approach using HRP measurements to drive discrete molecular dynamics simulations for RNAs ranging in size from 80 to 230 nucleotides. We first used HRP reactivities to identify RNAs that form extensive helical packing interactions. For these RNAs, we achieved highly significant structure predictions given the inputs of RNA sequence and base pairing. This HRP-directed tertiary structure refinement approach generates robust structural hypotheses that are useful for guiding explorations of structure-function interrelationships in RNA.
Year: 2012 PMID: 22504587 PMCID: PMC3422565 DOI: 10.1038/nmeth.1976
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Figure 1. Relationship between RNA structure and HRP reactivity
(a) Structure of the M-Box riboswitch shown in cartoon representation. Nucleotides are colored according to HRP reactivity (blue to red); nucleotides without HRP data are gray. A solvent-exposed nucleotide with low HRP reactivity (blue) and a buried nucleotide with high HRP reactivity (red) are emphasized with all-atom representations (asterisks). (b) Structure-reactivity correlation coefficient, C, as a function of the contact cutoff distance, d, for the six training RNAs using HRP data smoothed over a three-nucleotide window (Online Methods). (c) Comparison of experimentally measured HRP reactivities (red) with the number of through-space contacts (black) for the TPP riboswitch RNA using a d of 14.0 Å. Buried and exposed nucleotide segments are denoted with blue and red lines, respectively (top); arrows indicate the representative nucleotides characteristic of each nucleotide segment. Dashed horizontal lines represent the exposed (R), buried (R), and intermediate (R) threshold values.
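The quantities in panels (b) and (c) can be illustrated with a minimal sketch. It assumes one representative 3D point per nucleotide and excludes only sequence-adjacent neighbors; both are simplifications chosen for this example. The function names (`contact_counts`, `smooth`, `pearson`) are hypothetical and are not taken from the authors' software.

```python
import math

def contact_counts(coords, d=14.0, min_separation=2):
    """Count through-space neighbors within d angstroms for each nucleotide.

    coords: list of (x, y, z) tuples, one per nucleotide (an assumption;
    the published model may use a different representation).
    Pairs closer than min_separation in sequence are skipped.
    """
    n = len(coords)
    counts = [0] * n
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= d:
                counts[i] += 1
                counts[j] += 1
    return counts

def smooth(values, window=3):
    """Moving average over a centered window, truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Under these assumptions, C for a given d would be `pearson(contact_counts(coords, d), smooth(reactivities))`. A negative value means that nucleotides with more contacts tend to be less reactive, which matches the sign of the C values reported in the table.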
Summary of HRP-directed RNA fold refinement for the studied RNAs
The first six RNAs comprise the training set used for algorithm optimization and applicability determination: yeast tRNAAsp[27], thiamine pyrophosphate (TPP) riboswitch[28], specificity domain of ribonuclease P[29], P546 domain of the Tetrahymena thermophila group I intron[19], M-Box riboswitch[24], and Azoarcus group I intron[30]. The last four RNAs were used for testing the performance: glmS ribozyme[31], lysine riboswitch[32], catalytic domain of ribonuclease P[33], and Oceanobacillus iheyensis group II intron[34]. The fraction of highly protected nucleotides, f0.25, was computed using only the experimental HRP data; f0.25 values above 0.25 are in bold. The structure-reactivity correlation, C, was calculated with reference to the accepted experimental structure. One hundred selected structures were clustered by pairwise RMSD (Online Methods). Small clusters (1 or 2 structures) were excluded. For each cluster, P-values were calculated based on the average RMSD with respect to the accepted experimental structure[16]; highly significant predictions (P < 0.01) are in bold.
| RNA | Length (nts) | f0.25 | C | Number of clusters | Large clusters: size | Large clusters: RMSD (Å) | Large clusters: P-value |
|---|---|---|---|---|---|---|---|
| Azoarcus group I intron | 214 | | −0.45 | 1 | 100 | 16.8±2.1 | |
| M-Box riboswitch | 161 | | −0.68 | 1 | 100 | 11.6±2.3 | |
| P546 domain | 158 | | −0.57 | 3 | 66 | 19.8±1.4 | |
| | | | | | 32 | 15.1±1.9 | |
| RNase P specificity domain | 152 | 0.25 | −0.30 | 3 | 93 | 24.9±1.2 | 0.67 |
| | | | | | 4 | 24.1±3.2 | 0.50 |
| | | | | | 3 | 22.7±0.5 | 0.22 |
| TPP riboswitch | 80 | 0.21 | −0.50 | 4 | 44 | 12.6±1.3 | 0.10 |
| | | | | | 42 | 14.6±1.9 | 0.44 |
| | | | | | 12 | 9.9±1.8 | |
| tRNAAsp | 75 | 0.25 | −0.45 | 8 | 29 | 14.1±1.3 | 0.50 |
| | | | | | 23 | 17.5±1.5 | 0.97 |
| | | | | | 18 | 16.4±0.8 | 0.90 |
| | | | | | 8 | 18.9±0.8 | 0.99 |
| | | | | | 6 | 12.3±1.5 | 0.17 |
| | | | | | 6 | 15.7±0.4 | 0.82 |
| | | | | | 5 | 18.9±0.8 | 0.99 |
| Group II intron | 412 | 0.21 | −0.30 | - | - | - | |
| RNase P catalytic domain | 231 | | −0.50 | 4 | 46 | 19.2±1.6 | |
| | | | | | 41 | 21.6±2.1 | |
| | | | | | 8 | 25.0±0.7 | |
| | | | | | 5 | 24.4±2.0 | |
| Lysine riboswitch | 174 | | −0.57 | 3 | 57 | 12.0±1.6 | |
| | | | | | 42 | 18.1±1.0 | |
| glmS ribozyme | 152 | | −0.55 | 2 | 74 | 16.6±1.7 | |
| | | | | | 26 | 8.5±1.3 | |
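The f0.25 screen described in the table legend could be sketched as follows. The legend says only that f0.25 is the fraction of highly protected nucleotides computed from the HRP data. This sketch assumes that "highly protected" means a normalized reactivity below 0.25; that reading is an assumption inferred from the subscript, not a definition confirmed by the text. The function names are hypothetical.

```python
def protected_fraction(reactivities, cutoff=0.25):
    """Fraction of measured nucleotides with reactivity below cutoff.

    reactivities: normalized HRP reactivities, with None where no data
    exist. The 0.25 cutoff on reactivity is an assumed interpretation.
    """
    measured = [r for r in reactivities if r is not None]
    if not measured:
        raise ValueError("no HRP measurements supplied")
    return sum(1 for r in measured if r < cutoff) / len(measured)

def passes_screen(reactivities, cutoff=0.25, min_fraction=0.25):
    """True when the protected fraction exceeds 0.25, as in the bold entries."""
    return protected_fraction(reactivities, cutoff) > min_fraction
```

For example, `passes_screen([0.1, 0.2, 0.5, None, 0.9])` evaluates the fraction over the four measured positions only.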
Figure 2. Assignment of potentials for incorporating HRP reactivities into DMD simulations
(a) Scheme for modeling the number of allowed contacts. Each nucleotide is assigned a threshold number of contacts (N) within the cutoff distance (d = 14 Å). For a given nucleotide i, its n through-space neighbors are denoted as i1, i2, i3 … An approaching nucleotide can form a new contact (indicated by the inward arrow) if the number of total contacts is smaller than N. If n is larger than N, the approaching nucleotide can form a contact only if the total DMD kinetic energy is sufficient to overcome the energy penalty for over-packing (Online Methods). Otherwise, the nucleotide reflects back without forming a new contact (denoted by the outward arrow). (b) Fraction of nucleotides, f(n), forming at most a given number of contacts, n. Mean (open circles) and standard deviations (error bars) were computed over all single-chain RNA structures in the RCSB database. Adjacent and same-helix nucleotide neighbors were excluded from the number of contacts calculation. Vertical dashed lines correspond to the minimal and maximal number of contacts, 0.5 and 11, respectively. (c) HRP-directed DMD simulation algorithm.
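The contact rule in panel (a) can be written as a small decision function. This is a minimal sketch, not the authors' DMD code. It assumes that a nucleotide already at its threshold (n equal to N) is treated as over-packed, and that crossing the penalty subtracts it from the kinetic energy, as is usual for step potentials in discrete molecular dynamics. The name `resolve_approach` and its parameters are hypothetical.

```python
def resolve_approach(n_contacts, n_threshold, kinetic_energy, penalty):
    """Decide whether an approaching nucleotide forms a new contact.

    Returns a tuple (outcome, remaining_kinetic_energy), where outcome is
    'contact' or 'reflect'.
    """
    if n_contacts < n_threshold:
        # Below the allowed number of contacts: the contact forms freely.
        return "contact", kinetic_energy
    if kinetic_energy >= penalty:
        # Over-packed, but enough kinetic energy to pay the penalty
        # (energy bookkeeping here is an assumption of this sketch).
        return "contact", kinetic_energy - penalty
    # Over-packed and too little energy: reflect without a new contact.
    return "reflect", kinetic_energy
```

For example, `resolve_approach(6, 5, 1.0, 1.5)` returns `("reflect", 1.0)`, because the nucleotide is over its threshold and the available kinetic energy is smaller than the penalty.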
Figure 3. HRP-directed RNA fold refinement for the training set
RNAs are shown with backbone traces. The left-most panel shows the accepted structure for each RNA. Right-hand panels show representative structures for each highly populated cluster. Small clusters (with 1 or 2 structures) are not shown. Backbones are colored from blue to red in the 5′ to 3′ direction. For each cluster, the number of structures, mean RMSD, and P-value are shown. Significant P-values[16] are emphasized in bold.