Rhiju Das, John Karanicolas, David Baker.
Abstract
We present fragment assembly of RNA with full-atom refinement (FARFAR), a Rosetta framework for predicting and designing noncanonical motifs that define RNA tertiary structure. In a test set of thirty-two 6-20-nucleotide motifs, FARFAR recapitulated 50% of the experimental structures at near-atomic accuracy. Sequence redesign calculations recovered native bases at 65% of residues engaged in noncanonical interactions, and we experimentally validated mutations predicted to stabilize a signal recognition particle domain.
Year: 2010 PMID: 20190761 PMCID: PMC2854559 DOI: 10.1038/nmeth.1433
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Attainment of native-like structure by de novo Fragment Assembly of RNA with Full Atom Refinement (FARFAR), using the full-atom Rosetta energy function. The 500 lowest-energy conformations out of 50,000 refined conformations were clustered with a model-model heavy-atom RMSD cutoff of 2.0 Å. The five lowest-energy clusters were taken as the de novo models; features of the best of these clusters (the one with the lowest RMSD to the experimental structure) are listed. See Supplementary Fig. 2 for motif definitions.
| Motif | No. residues | No. strands | Cluster rank | Cluster size | Cluster center RMSD (Å) | Cluster center fNWC | Lowest energy RMSD (Å) | Lowest energy fNWC | Lowest RMSD (Å) |
|---|---|---|---|---|---|---|---|---|---|
| G-A base pair | 6 | 2 | 1 | 471 | 1.19 | 1/ 1 | 1.89 | 0/ 1 | 0.54 |
| UUCG tetraloop | 6 | 1 | 1 | 498 | 1.12 | 1/ 1 | 1.14 | 1/ 1 | 0.64 |
| GAGA tetraloop from sarcin/ricin loop | 6 | 1 | 1 | 500 | 0.82 | 1/ 1 | 1.00 | 1/ 1 | 0.52 |
| Loop 8, A-type Ribonuclease P | 7 | 1 | 5 | 27 | 1.38 | 0/ 0 | 1.41 | 0/ 0 | 1.13 |
| Pentaloop from conserved region of | 7 | 1 | 3 | 237 | 1.10 | 1/ 1 | 1.48 | 1/ 1 | 0.88 |
| L3, thiamine pyrophosphate riboswitch | 7 | 1 | 4 | 6 | 2.00 | 0/ 1 | 2.68 | 0/ 1 | 1.44 |
| Fragment with A-C pairs, SRP helix VI | 8 | 2 | 1 | 284 | 1.83 | 2/ 2 | 2.74 | 1/ 2 | 0.48 |
| Helix with U-C base pairs | 8 | 2 | 2 | 491 | 2.10 | 2/ 2 | 2.56 | 1/ 2 | 1.11 |
| Rev response element high affinity site | 9 | 2 | 2 | 4 | 3.95 | 1/ 2 | 4.42 | 0/ 2 | 1.96 |
| J4/5 from P4-P6 domain, | 9 | 2 | 1 | 335 | 1.76 | 1/ 2 | 2.12 | 1/ 2 | 1.09 |
| Tetraloop/helix interaction, L1 ligase | 10 | 3 | 1 | 500 | 1.10 | 1/ 3 | 1.21 | 2/ 3 | 0.69 |
| Hook-turn motif | 11 | 3 | 5 | 121 | 2.56 | 3/ 3 | 2.06 | 3/ 3 | 1.37 |
| Helix with A-C base pairs | 12 | 2 | 2 | 242 | 2.45 | 1/ 4 | 1.81 | 2/ 4 | 1.53 |
| Curved helix with G-A and A-A base | 12 | 2 | 1 | 205 | 1.74 | 2/ 4 | 1.06 | 4/ 4 | 0.96 |
| Fragment with G-G and G-A base | 12 | 2 | 3 | 98 | 3.27 | 0/ 5 | 4.25 | 0/ 5 | 0.86 |
| Signal recognition particle Domain IV | 12 | 2 | 4 | 321 | 1.54 | 2/ 5 | 1.22 | 4/ 5 | 0.93 |
| Stem C internal loop, L1 ligase | 12 | 2 | 1 | 489 | 2.24 | 2/ 3 | 2.42 | 2/ 3 | 1.88 |
| Four-way junction, HCV IRES | 13 | 4 | 3 | 30 | 10.09 | 1/ 4 | 10.63 | 1/ 4 | 2.99 |
| Bulged G motif, sarcin/ricin loop | 13 | 2 | 1 | 81 | 1.46 | 4/ 4 | 1.66 | 3/ 4 | 0.86 |
| Kink-turn motif from SAM-I riboswitch | 13 | 2 | 1 | 7 | 1.43 | 3/ 3 | 1.36 | 3/ 3 | 1.22 |
| Three-way junction, purine riboswitch | 13 | 3 | 3 | 24 | 6.15 | 0/ 3 | 6.10 | 0/ 3 | 3.16 |
| J4a-4b region, metal-sensing | 14 | 2 | 3 | 4 | 3.71 | 0/ 2 | 3.52 | 0/ 2 | 1.27 |
| Kink-turn motif | 15 | 2 | 2 | 25 | 8.85 | 1/ 3 | 9.43 | 2/ 3 | 3.05 |
| Tetraloop/receptor, P4-P6 domain, Tetr. | 15 | 3 | 4 | 13 | 3.31 | 2/ 5 | 2.89 | 2/ 5 | 2.21 |
| Tertiary interaction, hammerhead | 16 | 3 | 2 | 4 | 7.82 | 0/ 3 | 8.50 | 1/ 3 | 4.37 |
| Active site, hammerhead ribozyme | 17 | 3 | 4 | 5 | 8.64 | 1/ 3 | 9.28 | 1/ 3 | 4.41 |
| J5-5a hinge, P4-P6 domain, Tetr. | 17 | 2 | 3 | 12 | 9.99 | 0/ 4 | 10.12 | 0/ 4 | 4.23 |
| Loop E motif, 5S RNA | 18 | 2 | 2 | 40 | 1.64 | 3/ 6 | 2.16 | 6/ 6 | 1.43 |
| L2-L3 tertiary interaction, purine | 18 | 2 | 2 | 10 | 8.19 | 0/ 7 | 8.08 | 0/ 7 | 5.04 |
| Pseudoknot, domain III, CPV IRES | 18 | 2 | 4 | 11 | 3.55 | 0/ 0 | 3.90 | 0/ 0 | 2.29 |
| Pre-catalytic conformation, | 19 | 3 | 5 | 2 | 8.44 | 1/ 4 | 7.66 | 0/ 4 | 4.80 |
| P1-L3, SAM-II riboswitch | 23 | 2 | 5 | 5 | 7.40 | 0/ 1 | 7.47 | 0/ 1 | 3.99 |
RMSD: heavy-atom RMSD to the crystal structure, in Å.
fNWC: number of non-Watson-Crick base pairs in the crystal structure that are recovered in the model, given as recovered/total. Base pairing was assigned by an automated method based on the RNAVIEW algorithm; counts of correct base pairings are lowered by ambiguities in assigning bifurcated base pairs, pairs connected by single hydrogen bonds, or pairs that are not completely coplanar.
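The clustering step described in the table caption can be sketched as follows. This is only an illustration under stated assumptions, not the Rosetta implementation: the source gives the pool size (500 of 50,000), the 2.0 Å cutoff, and the choice of the five lowest-energy clusters, but not the clustering algorithm itself. The sketch assumes a greedy scheme in which the model with the most neighbors becomes the cluster center and clusters are ranked by their lowest-energy member. The function name `cluster_models` and the precomputed pairwise RMSD matrix are hypothetical.

```python
from typing import Dict, List, Sequence


def cluster_models(energies: Sequence[float],
                   rmsd: Sequence[Sequence[float]],
                   n_keep: int = 500,
                   cutoff: float = 2.0,
                   n_clusters: int = 5) -> List[Dict]:
    """Greedy clustering of the n_keep lowest-energy models (assumed scheme).

    energies[i] is the energy of model i; rmsd[i][j] is the precomputed
    model-model heavy-atom RMSD in angstroms. Returns up to n_clusters
    clusters, ranked by the energy of their lowest-energy member.
    """
    # Keep the lowest-energy models, ordered from best to worst energy.
    pool = sorted(range(len(energies)), key=lambda i: energies[i])[:n_keep]
    clusters: List[Dict] = []
    while pool:
        # Neighbors of every remaining model within the RMSD cutoff.
        neighbor_lists = {
            c: [i for i in pool if rmsd[c][i] <= cutoff] for c in pool
        }
        # Assumption: the center is the model with the most neighbors.
        # max() keeps the first maximum, so ties go to the lower energy.
        center = max(pool, key=lambda c: len(neighbor_lists[c]))
        members = neighbor_lists[center]
        clusters.append({
            "center": center,
            "members": members,
            # members preserves the energy ordering of pool.
            "lowest_energy": members[0],
        })
        assigned = set(members)
        pool = [i for i in pool if i not in assigned]
    # Rank clusters by their lowest-energy member and keep the top few.
    clusters.sort(key=lambda c: energies[c["lowest_energy"]])
    return clusters[:n_clusters]
```

Under this scheme the cluster center and the lowest-energy member of a cluster can be different models, which is consistent with the table reporting separate RMSD and fNWC values for each.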
Figure 1. Successes of de novo modeling of noncanonical RNA structure with Fragment Assembly of RNA with Full Atom Refinement (FARFAR). Two-dimensional annotations (ref. 15) and three-dimensional representations are shown for (a) the E. coli signal recognition particle Domain IV RNA, (b) the bulged-G motif from the E. coli sarcin-ricin loop, (c) the E. coli loop E motif, (d) the kink-turn motif from the SAM-I riboswitch (T. tengcongensis), and (e) the hook-turn motif. (PDB codes are 1LNT, 1Q9A, 354D, 2GIS, and 1MHK, respectively.) Each panel depicts the experimentally observed structure (left) and the best of five low-energy cluster centers (right). In (a), a conserved A-C interaction that was missed by automated annotation is shown in gray. (f) All-heavy-atom RMSD for the best of five final predictions (low-energy cluster centers) plotted against the number of residues in the modeled motif. Filled symbols denote atomic-accuracy models (see text).
Figure 2. Computational and experimental tests validate sequence design and thermostabilization. (a) Sequence recovery over 15 high-resolution side-chain-stripped RNA structures, optimizing the Rosetta full-atom energy (black bars), was better than chance (25%, dashed line) and better than tests with the FARNA score function (gray bars). (b) Sequence preference predicted from 1000 redesigns (top) compared to an alignment of SRP Domain IV RNA sequences drawn from all three kingdoms of life (ref. 16), in sequence logo format (ref. 17). Two mutations (I and II) predicted by the Rosetta redesigns to stabilize folding are indicated. (c) Dimethyl sulfate (DMS) modification data probe the structure and thermodynamics of the SRP motif and variants. Sites of chemical modification were read out by reverse transcription of modified RNA with fluorescently labeled DNA primers, and the products were separated by multiplexed capillary electrophoresis. (d) Schematic of the construct's tertiary structure. Wedges mark residues that remained accessible to dimethyl sulfate in high-Mg²⁺ folding conditions for the wild-type RNA; the pattern for the mutant construct is indistinguishable except at the sites of mutation. (e) Folding isotherms by Mg²⁺ titration for four separate residues involved in the SRP motif's noncanonical structure (cf. symbols in c and d) overlay well and indicate that the Rosetta-predicted double mutant folds more stably than the wild-type sequence. The left-most symbols represent conditions without Mg²⁺. Full electrophoretic profiles and single mutant fits are presented in Supplementary Fig. 6.
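The two quantities in panels (a) and (b) of Figure 2 can be illustrated with a short sketch. It assumes that sequence recovery is the fraction of positions at which a redesigned sequence matches the native base, and that the sequence preference is the per-position base frequency across redesigns; the source does not spell out either computation. The function names and the example sequences are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence

BASES = "ACGU"
CHANCE_RECOVERY = 1 / len(BASES)  # 25% for a four-letter alphabet


def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed base equals the native one."""
    if not native or len(native) != len(designed):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(n == d for n, d in zip(native.upper(), designed.upper()))
    return matches / len(native)


def base_frequencies(designs: Sequence[str]) -> List[Dict[str, float]]:
    """Per-position base frequencies over a set of equal-length redesigns."""
    if not designs:
        raise ValueError("at least one design is required")
    length = len(designs[0])
    if any(len(d) != length for d in designs):
        raise ValueError("all designs must have the same length")
    profile = []
    for pos in range(length):
        counts = Counter(d[pos].upper() for d in designs)
        profile.append({b: counts[b] / len(designs) for b in BASES})
    return profile


# Hypothetical example: one mismatch out of six positions.
recovery = sequence_recovery("GGAUCC", "GGAACC")
print(recovery > CHANCE_RECOVERY)
```

A profile of this kind is the input that a sequence logo summarizes, one column per position.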