Structures of artificially designed discrete RNA nanoarchitectures at near-atomic resolution.

Di Liu1, Yaming Shao2, Joseph A Piccirilli1,2, Yossi Weizmann1,3.   

Abstract

Although advances in nanotechnology have enabled the construction of complex and functional synthetic nucleic acid–based nanoarchitectures, high-resolution discrete structures are lacking because of the difficulty in obtaining good diffracting crystals. Here, we report the design and construction of RNA nanostructures based on homooligomerizable one-stranded tiles for x-ray crystallographic determination. We solved three structures to near-atomic resolution: a 2D parallelogram, a 3D nanobracelet unexpectedly formed from an RNA designed for a nanocage, and, eventually, a bona fide 3D nanocage designed with the guidance of the two previous structures. Structural details of their constituent motifs, such as kissing loops, branched kissing loops, and T-junctions, that resemble natural RNA motifs and resisted x-ray determination are revealed, providing insights into those natural motifs. This work unveils the largely unexplored potential of crystallography in gaining high-resolution feedback for nanoarchitectural design and suggests a route to investigate RNA motif structures by configuring them into nanoarchitectures.

Year:  2021        PMID: 34550747      PMCID: PMC8457670          DOI: 10.1126/sciadv.abf4459

Source DB:  PubMed          Journal:  Sci Adv        ISSN: 2375-2548            Impact factor:   14.136


INTRODUCTION

A prominent theme in chemistry and biology is to understand the structures and functions of nucleic acids (NAs) and to engineer them for a wide range of applications (). While functional NAs can be obtained by harnessing the natural ones (), they can also be artificially created either in a combinatorial manner using in vitro selection/evolution (–) or in a more rational manner established by the field of DNA/RNA nanotechnology (–). Although many high-resolution structures of natural or in vitro selected/evolved NAs have been determined (), with the exception of a few crystallographic studies of self-assembled lattices (–) or relatively simple two-dimensional (2D) cyclic nano-objects (–), the majority of de novo designed discrete NA-based nanostructures have been studied and confirmed only by microscopic methods at low resolution (worse than 10 Å), which allows only approximate fitting of helices. The high-resolution structures of more complex nanoarchitectures, especially the 3D ones, as well as the precise details of critical constituent motifs joining the helices, remain to be determined. Factors impeding the structural analysis of artificial NA nanoarchitectures include (i) the high conformational plasticity of these structures and (ii) the difficulty of crystallizing NA molecules in general (). To address these problems, we report here a strategy for nanostructure design based on tiles that are folded from a single RNA strand and homooligomerize into geometrically closed 2D and 3D shapes (Fig. 1, A and B) for crystallographic studies. Using this homomeric design based on one-stranded tiles, we expect to rigidify the structure by minimizing strand breaks and to improve homogeneity by dispensing with the need for exact stoichiometry. To ensure the tiles’ one-strandedness, we use programmable kissing interactions such as kissing loops (KLs) and branched KLs (bKLs; Fig. 1C) () to mediate the assembly. 
These interactions are often referred to as paranemic cohesions (–) because they are formed via the association of two topologically closed moieties that can be separated from each other without strand scission. In addition, the resulting structures are homooligomers with intrinsic symmetries that are beneficial for biomacromolecule crystallization (, ). Furthermore, on the basis of the principles of RNA tectonics (, ), we incorporate well-structured bent motifs as the joining regions (“J”) to provide the curvature necessary for ring closure. These bent motifs can also serve as search models for molecular replacement when solving the crystal structures. In total, we have successfully solved three RNA nanoarchitectures. Besides providing structural feedback on the nanostructure designs, solving these structures enables us to gain structural insights into their constituent motifs [such as a KL complex, three bKLs, and a T-junction (T-J)] that resemble natural RNA motifs and previously resisted x-ray crystallographic determination. Determining the 3D structures of these kissing interactions is crucial for understanding the structures and functions of those natural motifs within their natural contexts.
Fig. 1.

Design of 2D and 3D homomeric nanoarchitectures self-assembled from RNA tiles folded from a single strand.

(A and B) Schematics illustrating the design of one-stranded RNA tiles that self-assemble into 2D (A) and 3D (B) nanostructures via KL or bKL interactions. Pn indicates paired helical regions. Jmn indicates joining regions between Pm and Pn and provides bending between them. Ln and Lmn indicate loop regions mediating the kissing interactions. Dashed double arrows indicate intermolecular kissing interactions between the loops in the same colors (red or green) as the arrows. In each shown example of homodimer, the other copy of the RNA tile is colored blue. (C) The formation of a bKL is mediated by Watson-Crick (WC) base-pairing between the single-stranded regions of a bulged helix and a hairpin loop. The formed bKL has four helical domains: Hb1 and Hb2 are the 5′ and 3′ flanking helices of the bulge, respectively. Hl is connected to the loop. Hk is the helix resulting from the kissing interaction.

RESULTS

Design and structure of a dimeric parallelogram

The first solved structure is of a 2D parallelogram (PLM) assembled from a 51-nucleotide (nt) RNA tile (Fig. 2A). The joining motif J12 is a K-turn () expected to form a ~60° angle. The KL is adapted from a 7–base pair (bp) KL of the RNA I–RNA II complex of the Escherichia coli ColE1 plasmid () with a ~120° angle as determined by nuclear magnetic resonance (NMR) (). We optimized the assembly by testing annealing buffers of different cation contents (Fig. 2B): An exclusive product (subsequently proved by x-ray crystallography to be the expected dimer, PLM) is obtained in buffers containing ≤1 mM Mg²⁺ (lanes 1 to 3), and a slower-migrating band (presumed to be a trimer) starts to emerge at a higher concentration (3 mM) of Mg²⁺ (lane 4). Large-scale assembly of PLM was conducted in a buffer of 1 mM Mg²⁺, and the product was concentrated for crystallization without any purification (see Materials and Methods). The crystal structure was solved to 2.16 Å resolution (Fig. 2C and table S1); the crystal has P4₃2₁2 space-group symmetry, with each crystallographic asymmetric unit (ASU) containing one RNA tile.
Fig. 2.

Design and crystal structure of the dimeric parallelogram.

(A) Sequence and secondary structure of the RNA tile in the dimeric parallelogram structure (PLM). WC base pairs are shown as sticks, and non-WC base pairs (including wobble base pairs) are shown as dots. (B) Native polyacrylamide gel electrophoresis (nPAGE; 6%) analyses of the assembly products of the RNA under different annealing buffers. The bands for the target dimer and the speculated trimer are marked on the left of the gel. (C) Electron density map (2Fo − Fc map, gray mesh; contoured at 1.0σ level and carved within 2.0 Å) of one copy of RNA (which is the crystallographic ASU) in PLM. Insets show the zoomed-in views of representative regions of the K-turn (top, labeled 1) and KL (bottom, labeled 2). (D and E) Two (front and side) views of 3D structure of PLM. (F) Comparing the 7-bp KL complex determined by x-ray crystallography in the present study [colored the same as (D)] and that previously determined by NMR [Protein Data Bank (PDB) code: 2BJ2; in gray] (). Arrows indicate the direction of the strands (from 5′ to 3′). Insets show the internal angles of the helices flanking the two KL structures.

The overall shape revealed by crystallography (Fig. 2, D and E) mostly agrees with our design: All the helical regions, K-turns, and KLs form as expected. Furthermore, the structure allowed us to compare the KL solved here with the previous NMR model (Fig. 2F) (). The two models mostly agree with each other: The kissing helix forms with seven Watson-Crick (WC) base pairs and stacks continuously upon the flanking helical regions. Nonetheless, subtle differences in the bend angles are revealed: In the NMR structure, the KL mediates a wide range of angles (107° to 132°; 117° for the average structure) (); in our x-ray model, the nanoarchitecture geometrically constrains the KL to a fixed angle of 134°, larger than that of the NMR average structure by 17° but closer to a previously estimated angle of ~135° by comparative gel electrophoresis (). 
Beyond this 7-bp KL that regulates the ColE1 plasmid replication, there are many other biologically important KLs (). However, their crystal structures (, ) are scarce, reflecting the difficulties of obtaining suitable crystals, probably due to their intrinsic structural flexibility and elongated shapes. Our results imply the feasibility of obtaining the crystal structures of these KLs by configuring them within well-formed nanoarchitectures.
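
The bend angles compared above are angles between the axes of the two helices flanking the kissing helix. As a minimal illustration only (this section does not describe how the authors measured the angles), the sketch below computes such an angle from two axis direction vectors and the difference from the quoted NMR average. The function name and the example vectors are hypothetical; in practice the axes would be fitted to the helical coordinates.

```python
import math

def internal_angle_deg(u, v):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

# Hypothetical axis vectors, each pointing away from the kissing helix,
# constructed here so that they enclose the 134 degree angle quoted in the text
helix_a = (1.0, 0.0, 0.0)
helix_b = (math.cos(math.radians(134.0)), math.sin(math.radians(134.0)), 0.0)

xray_angle = internal_angle_deg(helix_a, helix_b)
nmr_average = 117.0  # angle of the average NMR structure, as quoted in the text
difference = xray_angle - nmr_average
print(round(xray_angle, 1), round(difference, 1))
```

With this convention, coaxially stacked helices pointing in opposite directions give 180°, and smaller values indicate a sharper bend.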

Emergence of an unexpected dimeric structure

Encouraged by the success of obtaining the first crystal structure of PLM, we next set out to design a more complex 3D RNA nanoarchitecture for crystallographic study. The design of a 3D architecture necessitates the inclusion of branched structural motifs so that the helices do not all lie in the same plane. For this purpose, we chose the bKL motif (), an artificially designed motif formed by the WC pairing between the single-stranded regions of a bulged helix and a hairpin loop (Fig. 1C). Besides serving as a paranemic cohesion for the convenient design of a homooligomeric self-assembly system, this bKL motif provides a three-way branched junction with the kissing helix Hk spanning the width of the helix Hb1 and coaxially stacking between helices Hb2 and Hl. Moreover, by solving the crystal structure of the designed nanoarchitecture, we envision an opportunity to gain structural insights into the bKL that governs the assembly. Figure 3A shows the expected secondary structure of the designed tiles: J23 and J45 are the 5-nt (AACUA) bulge from the internal ribosome entry site (IRES) RNA of the hepatitis C virus (HCV) for creating an angle of ~90° (); eight bKLs are anticipated to mediate the assembly of four tiles into the tetrameric nanocages, whose symmetries (either C4 or D2) could be controlled by the loops’ complementarity (Fig. 3, B and C). Because of structural flexibility, the tile of C4-symmetry design (Fig. 3B) resulted in multiple assemblies (Fig. 3D, lanes 1 and 2), and at a relatively low Mg²⁺ concentration (1 mM; lane 2 of Fig. 3D), the trimer (kinetically more favorable than tetramer) predominates. While the tile of D2-symmetry design (Fig. 3C) inhibits the formation of trimer or pentamer as expected, it fails to generate the target tetramer with a good yield (Fig. 3D, lanes 3 and 4): A faster-migrating species (presumed to be a dimer on the basis of its electrophoretic mobility) formed as the dominant product. 
Further annealing tests (Fig. 3E, lanes 1 to 3) of this D2-symmetry design tile reveal that the dimer can be exclusively formed in a buffer containing low (0.3 mM) Mg²⁺ and no Na⁺ (lane 2). Circularization of the tile [by splinted ligation (); text S1] imparts different self-assembly behaviors (lanes 4 to 6 of Fig. 3E): A series of products are formed under low (0.3 mM) Mg²⁺ and no Na⁺ (lane 5); the target tetramer forms almost exclusively under 0.3 mM Mg²⁺ and 100 mM Na⁺ (lane 4). To learn the approximate shape of the unexpectedly formed, presumed dimer of D2-symmetry design, we conducted cryo–electron microscopy (cryo-EM) analysis (Fig. 3, F and G), which revealed a ring-shaped structure.
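
Because the kissing patterns are set by which single-stranded regions are Watson-Crick complementary, a candidate bulge/loop pair can be screened with a simple reverse-complement check. The following is a minimal sketch under stated assumptions: the 6-nt sequences are hypothetical placeholders (the actual loop sequences appear only in Fig. 3), only strict WC pairs are counted (no G:U wobble), and the function names are illustrative.

```python
# Strict Watson-Crick partners for RNA; wobble pairs are deliberately excluded
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the antiparallel WC complement of an RNA sequence (5' to 3')."""
    return "".join(WC_PARTNER[nt] for nt in reversed(seq.upper()))

def can_kiss(region_a, region_b):
    """True if two single-stranded regions can form a fully WC-paired kissing helix."""
    return region_b.upper() == reverse_complement(region_a)

# Hypothetical 6-nt kissing regions, for illustration only
bulge = "GGACUC"
loop = "GAGUCC"
print(can_kiss(bulge, loop))        # complementary pair
print(can_kiss(bulge, bulge))       # not self-complementary
print(can_kiss("GCGCGC", "GCGCGC")) # self-complementary region
```

A self-complementary region can kiss another copy of itself, whereas a non-self-complementary region requires a distinct partner; restricting which regions are mutually complementary is one way to limit the set of allowed tile-to-tile contacts.
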
Fig. 3.

Emergence of an unexpected dimeric structure in the preparation of a designed tetrameric nanocage.

(A) Sequence and expected secondary structure of the RNA tile for the construction of tetrameric nanocages. The nucleotides denoted by N’s (gray) in the loop regions are indicated in (B) and (C) for designs of different symmetries. (B and C) The kissing patterns of the loop regions of the RNA tiles dictate the symmetries, either C4 (B) or D2 (C), of the prospective tetrameric nanocages. The D2 design was intended to inhibit the formation of undesired products such as trimer and pentamer. (D) nPAGE (6%) analyses of the assembly products of the C4 (lanes 1 and 2) and D2 (lanes 3 and 4) designs after annealing in different buffers. The bands speculated as the target tetramer, undesired trimeric and pentameric products, and unexpected dimeric species are marked on the left. (E) nPAGE (4%) analyses of the assembly products of the linear (lanes 1 to 3) or circular (lanes 4 to 6) RNA of the D2 design. (F and G) Cryo-EM image (F) and representative reference-free 2D class averages (G) reveal a ring-shaped structure of the unexpected dimer. Scale bars, 10 nm (F and G).

Crystal structure of the unexpected dimer

We undertook a crystallographic study (see Materials and Methods) of the unexpectedly formed dimer to elucidate its structure. The crystal structure was solved to 3.07 Å resolution (Fig. 4A and table S1): The crystal is of P2₁ space-group symmetry, and each ASU is the dimer consisting of two tiles. A shape of a two-eared bracelet (hereafter, we refer to this dimer as BRC; Fig. 4, B and C) is revealed, agreeing with the ring-shaped structure observed by cryo-EM (Fig. 3, F and G). The two RNA chains in each BRC particle are related by a noncrystallographic C2 symmetry and are near-identical (Fig. 4D). From the crystal structure, the exact secondary structure of the tile can be inferred (Fig. 4E). A prominent feature is the unexpected formation of an intramolecular bKL (intra-bKL; via L14/L3 kissing) within each tile. Two tile copies are then associated via intermolecular bKLs (inter-bKLs; via L12/L5 kissing), as illustrated in Fig. 4F. The bracelet shape has an open cross section of ~18 nm² (Fig. 4B), and the view from either side reveals an ear-like obtuse triangle (Fig. 4C) that is enclosed by the helical domains P1a-P1b, P2, and P3, which are joined by J23 and crotches of the inter- and intra-bKLs.
Fig. 4.

Crystal structure of the unexpected dimer.

(A) Electron density map (2Fo − Fc map, gray mesh; contoured at 1.0σ level and carved within 1.6 Å) of the unexpected dimer, referred to as BRC because of its shape of a two-eared bracelet. Two copies of RNA chains (chain A in orange and chain B in blue) are present in the crystallographic ASU. Insets (map contoured at 2.0σ level) show the zoomed-in views of representative regions of the inter-bKL (top, labeled 1) and intra-bKL (bottom, labeled 2). (B and C) Two (front and side) views of 3D structure of the BRC nanoparticle. Its emergence is largely due to the unexpected formation of an intra-bKL (red) within each tile. Two copies of the tiles dimerize through inter-bKLs (green). (D) The two chains of the nanoparticle are near-identical, as shown by their superposition. (E) Secondary structure of the BRC RNA illuminated by the crystal structure. (F) Schematic illustrating the formation of BRC. (G) Crystal packing of the BRC nanoparticles is mediated by shape complementarity. Five neighboring particles (1 to 5) are shown with surface rendering.

The distinct geometric features play an important role in mediating the crystal packing via shape complementarity that is reminiscent of a mortise and tenon joint, especially along the crystallographic axes a and b (Fig. 4G and fig. S1). Along these two axes, each BRC particle directly contacts four neighboring particles that are related to it by the 2₁ screw-axis symmetry. The two “ears” of a BRC particle (see particle 1 in Fig. 4G and fig. S1) create a cleft in between, which accommodates the insertion of two ears of two neighboring particles (particles 2 and 5) on the layers above and below it (along the b axis); concomitantly, its own two ears fill the clefts of the other two neighboring particles (particles 3 and 4). Although this shape complementarity–mediated packing prevents the formation of extended channels through the openings of the bracelets, a cavity of ~29 nm³ is enclosed within each bracelet by its neighbors. 
Except for the flipped-out, exposed U’s (U27 and U95) of the HCV IRES bulges J23 and J45, all the bases are buried toward the helical interiors, and no base-pairing or stacking interaction is observed at the interparticle packing contacts in the crystal. Instead, the packing contacts are largely mediated by hydrogen bonds involving the phosphate oxygens and 2′-hydroxyls (fig. S1) as normally observed for RNA helices ().
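
For readers less familiar with screw axes, the relation between neighboring particles can be made concrete. The sketch below assumes the standard setting of this space group, in which the twofold screw axis runs along b and maps fractional coordinates (x, y, z) to (−x, y + 1/2, −z); the function name and the example point are hypothetical.

```python
def screw_21_along_b(frac):
    """Apply a twofold screw operation along b: (x, y, z) -> (-x, y + 1/2, -z)."""
    x, y, z = frac
    return (-x, y + 0.5, -z)

# Hypothetical fractional coordinates of a point in one particle
point = (0.10, 0.20, 0.30)

once = screw_21_along_b(point)   # symmetry mate half a cell along b
twice = screw_21_along_b(once)   # same x and z, shifted one full cell along b
print(once, twice)
```

Applying the operation twice returns the original x and z with y advanced by one unit cell, which is why screw-related copies alternate in orientation from layer to layer along the axis.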

Inter- and intra-bKLs at BRC

The crystal structure of BRC captures two different kinds of bKLs that were initially anticipated to form the same 3D configuration. The configuration of the inter-bKL (Fig. 5A) is mostly in accordance with the original bKL design (): (i) A 6-bp intermolecular kissing helix (Hk; via L12/L5 kissing) forms as designed and (ii) this Hk stacks between the flanking P2 and P5 (i.e., Hb2 and Hl of bKL as defined in Fig. 1C). On the bulge (L12) side, the unpaired A18 stacks between C19 from P2 and C17 from the Hk. On the loop (L5) side, A103 and A111 form a trans-WC/WC (tWW) pair [according to Leontis/Westhof base-pairing classification (, ); Fig. 5B], and A104 forms a base triple with the G12:C110 pair (Fig. 5C). A somewhat unexpected feature of this inter-bKL is the sharp angle (28°) between P2 and P1a, which was previously modeled and measured to be ~60° (). This small discrepancy reflects the flexibility of the bKL motif. The intra-bKL (Fig. 5D), however, deviates substantially from the initial bKL design, although the L14/L3 kissing still forms the 6-bp Hk stacking between P3 and P4. Notably, rather than pairing with C78, G124 unexpectedly pairs with A85 (originally designed to be unpaired) to form a cis-WC/WC (cWW) pair (Fig. 5E), leaving C78 unpaired. On the loop (L3) side, A36 and A44 form a cWW pair (Fig. 5F) that stacks between P3 and the Hk, and A37 inserts itself and stacks between C78 and A125 of P1b. These structural features of the intra-bKL account for the enlarged angle (111°) between P1b and P4, accommodating the geometry of the ear-like triangles of BRC.
Fig. 5.

Structural features of the inter- and intra-bKLs in BRC.

(A) Structure of the inter-bKL. (B and C) Structural details of three A’s connecting the kissing nucleotides and the helix in the apical loop at the inter-bKL. A noncanonical tWW [according to Leontis/Westhof base-pairing classification (, )] base pair is formed between A103 and A111 (B); A104 forms a base triple with the WC G12:C110 pair by interacting with G12 via a cis-WC/Hoogsteen interaction (C). (D) Structure of the intra-bKL. (E and F) Two noncanonical base pairs are formed at the intra-bKL. In the bulge part of the intra-bKL, a cWW base pair is formed by A85 and G124 (E); in the loop part, a cWW base pair is formed by A36 and A44 (F). Insets in (A) and (D) show the angles between the flanking helices of the two bKLs.

Together with the previous data from the assembly assays (Fig. 3, D and E), the structural insights gained from the crystal structure shed some light upon the possible mechanisms by which the RNA folds and assembles into BRC. The intra-bKL likely forms during the early stage of annealing because of the kinetic advantage offered by intramolecularity. Thus, two possible pathways are suggested in Fig. 6A: a two-step pathway consisting of steps (i) and (ii) and a one-step pathway (iii). The two-step pathway likely governs assembly of the linear RNA tile because of the nick between P1a and P1b. The higher temperature of the initial annealing stage can overcome the coaxial stacking of P1a and P1b so that the conformations with bending between P1a and P1b become populated to allow the intramolecular bulge-loop kissing interaction between L14 and L3. The intermediate (the middle structure in Fig. 6A) formed at this stage would have the desired intra-bKL configuration (similar to the inter-bKL in Fig. 5A) and a bent P1a-P1b helix. As the temperature decreases during the annealing process, stacking of P1a and P1b would become more favorable and deform the configuration of the intra-bKL in the intermediate to that observed in BRC (Fig. 5D). 
In contrast, the one-step pathway to the dimer would require direct formation of the deformed intra-bKL configuration, making kinetic capture of the intra-bKL more difficult. Accordingly, either annealing at a higher salt concentration or sealing the nick to fuse P1a and P1b (as in the circular RNA tile) promotes tetramer formation (lanes 1 and 4 in Fig. 3E), because both would impede the bending of the P1a-P1b helix and therefore render the two-step pathway less favorable (). Furthermore, the location of the nick within the linear RNA also dictates which bKL forms intramolecularly, the one involving L14 and L3 instead of that involving L12 and L5, because the helix tends to bend toward the direction approximately opposite to the nick (). For the circular RNA, there are no preferred kissing patterns because of the absence of the nick, explaining the emergence of assemblies containing an odd number of tiles (lane 5 of Fig. 3E). For example, formation of a trimer can result from assembly of tiles with intra-bKLs formed by different kissing pairs (Fig. 6B).
Fig. 6.

Putative mechanisms of the folding and assembly of the BRC RNA.

(A) Proposed pathways by which each monomeric tile in the BRC is folded. The tile can be folded via a two-step pathway: (i) The formation of the intra-bKL (via the kissing of L3 and L14) is facilitated by the bending between P1a and P1b at the nick (indicated by a gray arrow); (ii) the coaxial stacking between P1a and P1b is restored through the rearrangement of the base-pairing pattern of the intra-bKL so that the tile is in a proper conformation for dimerization. Alternatively, the tile, especially for the circular RNA, can be folded directly in one step via the pathway (iii). (B) A possible mechanism by which a trimeric side product emerges. The presumed trimer from the circular RNA (see lane 5 of Fig. 3E) is likely to be formed via this mechanism.

Structural implications for HIV-1 dimerization initiation site KL complex

We note that the loop components (L3 and L5) of the bKLs in the structure of BRC follow a sequence pattern (2A-6N-1A) based on the KL complex in the HIV-1 dimerization initiation site (DIS) (). Previous structural studies of this KL using x-ray crystallography () or NMR (–) revealed remarkable differences, especially with respect to the highly debated conformations () of three nominally unpaired A’s (corresponding to A36, A37, and A44 in the intra-bKL of BRC; A103, A104, and A111 in the inter-bKL of BRC; and A272, A273, and A280 in the HIV-1 DIS KL complex). Unlike the intra-bKL, where A37 is involved in a long-range stacking interaction, the loop part of the inter-bKL in BRC is in a structural context similar to that of the loops in the KL complex [see Fig. 7 (A to F) for the comparison of the inter-bKL with the previously determined HIV-1 DIS KL structures]. The NMR structure by Baba et al. () (PDB ID: 2D1B; Fig. 7E) shares two features with our inter-bKL: (i) the 5′-most A (A272 in KL and A103 in inter-bKL) and the 3′-most A (A280 in KL and A111 in inter-bKL) form a noncanonical base pair and (ii) the second A from the 5′ end (A273 in KL and A104 in inter-bKL) is flipped toward the interior of the kissing duplex. These features are also predicted by a recent molecular dynamics study (). In contrast, the previous crystallographic model (PDB ID: 2B8R; Fig. 7C) () has A272 and A273 bulged out of the helix to mediate stacking interactions in the crystal lattice (). This conformation, which has not been observed in any of the three NMR structures (Fig. 7, D to F), does not necessarily represent the most favorable conformation of the KL in solution.
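
The 2A-6N-1A pattern and the residue correspondences quoted above can be expressed compactly. The sketch below reads the pattern as two A’s, six arbitrary nucleotides, and one A (a 9-nt loop), which is consistent with the residue numbers given in the text; the function names and the example loop sequence are hypothetical and are not taken from the paper.

```python
import re

# Two A's, six arbitrary nucleotides (the kissing core), then one A
LOOP_PATTERN = re.compile(r"^AA([ACGU]{6})A$")

def kissing_core(loop):
    """Return the six central nucleotides if the loop fits 2A-6N-1A, else None."""
    match = LOOP_PATTERN.match(loop.upper())
    return match.group(1) if match else None

def unpaired_positions(first_residue):
    """Residue numbers of the three A's, given the number of the loop's first residue."""
    return (first_residue, first_residue + 1, first_residue + 8)

example_loop = "AAGCGCGCA"  # hypothetical loop used only to exercise the pattern
print(kissing_core(example_loop))
print(unpaired_positions(36), unpaired_positions(103), unpaired_positions(272))
```

Under this reading, loops starting at residues 36, 103, and 272 place the three A’s at (36, 37, 44), (103, 104, 111), and (272, 273, 280), matching the correspondences listed for the intra-bKL, the inter-bKL, and the HIV-1 DIS KL complex.
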
Fig. 7.

Structural information of the loop part at the inter-bKL of BRC provides insights into the structure of the HIV-1 DIS KL complex.

(A) Two views (related by a 90° rotation) of the inter-bKL in BRC. The bulge part is colored in gray, and the nucleotides in the loop part are in three colors: Those in the stem are in orange; the six nucleotides participating in the kissing interaction are in green, and the three unpaired A’s are in magenta. (B) Sequence and secondary structure of the HIV-1 DIS KL complex (subtype B). One of the stem loops in the complex is colored in gray, and the other stem loop is color coded as the loop part of the bKL in (A). The three unpaired A’s are denoted according to the numbering of the HIV-1 RNA, and the corresponding locations of their counterparts in BRC RNA are indicated in parentheses. (C to F) Four structures of the KL complex previously solved by x-ray (C) () and NMR (D to F) (–). The PDB codes are given in parentheses. Regarding the configuration of the three unpaired A’s, the structure in (E) () is the closest to the inter-bKL structure shown in (A), featuring a noncanonical A:A base pair with the third A buried in the kissing helix’s major groove. (G) A 3D model of the HIV-1 DIS KL complex generated by SimRNA simulation () using the distance constraints of the inter-bKL of BRC. (H) A comparison of three KL structures gained from x-ray, NMR, and SimRNA modeling with their 6-bp kissing helices superimposed. The inset is a zoomed-in view of the junction of the stem and the kissing helix, showing the overtwisting of the x-ray model compared to the other two models.

We further modeled the structure of the HIV-1 DIS KL computationally with SimRNA (, ) using the distance constraints derived from the three unpaired A’s at the inter-bKL (Fig. 7G). Aligning the kissing duplexes of this SimRNA model with the x-ray (2B8R) and the NMR (2D1B) models (Fig. 7H) revealed an overtwisting for the x-ray model at the junction of the kissing duplex and the helical stem.
Programmable KL motifs play an important role in the field of RNA nanotechnology, especially for the single-stranded RNA origamis (, , ). The KL from HIV-1 DIS has one additional advantage as a programmable KL: Its overall geometry is approximately linear so that it can directly replace an A-form RNA helix for connecting other motifs such as kinks and junctions. Currently, the crystal structure of 2B8R () provides the structural basis for such helix replacement (, ) together with the assumption that the KL junction (the 6-bp kissing helix plus the unpaired A’s at both ends) contributes a twist that is equivalent to a 9-bp A-form helix. For example, to replace a two-turn (22-bp) A-form helix with the KL, the sum of the lengths of the two helical stems of the KL would need to be 13 bp so that the total twist is equivalent to 22 bp. Nevertheless, our SimRNA model derived from our crystal structure of BRC, as well as the 2D1B NMR structure, suggests that the KL junction may contribute a twist equivalent to an 8-bp A-form helix (fig. S2). As a consequence, previous RNA nanostructures designed based on the 2B8R structure may need to be refined by adding one more base pair in the stem regions.
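The helix-replacement arithmetic in this paragraph can be written out as a minimal sketch. It assumes only the simple additive rule stated in the text (total twist in A-form base-pair equivalents = stem base pairs + KL-junction equivalent); the function name and constants are illustrative and are not part of any design software.

```python
def stem_bp_needed(target_helix_bp: int, kl_junction_equiv_bp: int) -> int:
    """Combined length (bp) of the two KL stems needed so that the
    stems plus the KL junction match the twist of an A-form helix
    of target_helix_bp base pairs (simple additive assumption)."""
    if kl_junction_equiv_bp >= target_helix_bp:
        raise ValueError("KL junction alone already exceeds the target helix")
    return target_helix_bp - kl_junction_equiv_bp


TWO_TURN_HELIX_BP = 22  # two-turn A-form helix, as in the text

# Junction twist assumed from the 2B8R crystal structure: 9-bp equivalent
stems_2b8r = stem_bp_needed(TWO_TURN_HELIX_BP, 9)

# Junction twist suggested by the SimRNA model and 2D1B: 8-bp equivalent
stems_revised = stem_bp_needed(TWO_TURN_HELIX_BP, 8)

print(stems_2b8r, stems_revised, stems_revised - stems_2b8r)
```

Under these assumptions, the two stems together need 13 bp with the 2B8R-based value and 14 bp with the revised value, which is the one extra base pair mentioned above.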

Design and crystal structure of a 3D nanocage

Building upon our experience with BRC, we next aimed to design an RNA molecule that could assemble into a bona fide 3D cage. To prevent the formation of the intra-bKL and ensure the assembly of a rigid and homogeneous cage, we introduced five adjustments into the design of the RNA tile (Fig. 8A). (i) We decreased the length of the P1 helix by 10 bp (about one helical turn) and removed the nick (as seen between P1a and P1b of BRC) to disfavor its bending, which is speculated to facilitate the formation of the intra-bKL. (ii) We circularly permuted the RNA so that the 5′ and 3′ ends of the RNA molecule now reside at the terminus of P3, rendering the original apical loop L3 into a 5′ overhang to mediate an intermolecular kissing interaction with L12. This kissing interaction comprises a motif known as the T-J, first conceived as a DNA motif () and later extended to an RNA motif (). (iii) We deleted the unpaired A (corresponding to A18 in BRC) at the 3′ side of the bulge L14 to restrict the bending angle between P1 and P4. (iv) To further render the intramolecular kissing less accessible, we altered the kissing pattern from the symmetry of Dn to Cn (n is the number of tiles in the assembled structure) so that L3 pairs with L12 and L5 with L14. (v) To mitigate the problem that the Cn-symmetry design could potentially result in byproducts, such as those containing n + 1 or n − 1 tiles, we introduced K-turns (with a larger bending angle than the AACUA bulge) as bending moieties for J23 and J45 to reduce n so that the formation of the (n + 1)–mer or (n − 1)–mer would be much less energetically favorable than that of the n-mer. Accordingly, if the kissing helices in the formed bKLs and T-Js stack coaxially with their respective flanking helices without substantial bending, a trimeric nanocage (Fig. 8A, top-right inset) would be expected. Nevertheless, a dimeric nanocage (Fig. 8A, bottom-left inset) might be favored kinetically, considering the likely flexibility of the RNA tile, especially at the locations of T-Js and bKLs, both of which may contribute to the bending at the junctions of kissing helices and the flanking helices.
Fig. 8.

Design and crystal structure of a dimeric nanocage.

(A) Sequence and secondary structure of the RNA tile designed to self-assemble into a nanocage. Insets show the schematics of the possibly formed dimeric (bottom left) and trimeric (top right) cages. The shown secondary structure is derived from the crystal structure of the dimeric cage, DCG. (B) nPAGE (4%) analyses of the assembly products under different annealing buffers. The bands for the dimeric and the speculated trimeric species are marked on the left of the gel. (C) Electron density map (2Fo − Fc map, gray mesh; contoured at 1.0σ level and carved within 2.0 Å) of DCG. Two copies of the RNA monomers (orange and blue) are present in the crystallographic ASU. Insets show the zoomed-in views of representative regions of the K-turn (top, labeled 1) and the T-J (bottom, labeled 2). (D to F) Three (front, side, and top) views of the 3D structure of DCG. (G) A recurrent contact of K-turns mediates the crystal packing of DCG. For clarity, the two interacting K-turns are colored in orange and blue, while the other parts are in gray. The red dashed box highlights the continuous base stacking to connect the NC helices of the two contacting K-turns. Bottom left inset: Sequence of K-turn with the standard nomenclature of the nucleotides (b, n, or L) and the flanking helices (C or NC) (). Bottom right inset: Zoomed-out view of two contacting DCG particles. (H and I) Structural details of the T-J (H) and bKL (I) interactions in DCG. Insets show the angles between the flanking helices of the cohesions.

The newly designed RNA tile yields different homomeric assemblies in different buffer conditions (Fig. 8B), again underscoring the influence of cation contents and their relative contributions to the intramolecular folding and intermolecular assembly. In the presence of 0.3 mM Mg2+ and 100 mM Na+, the trimer is formed almost exclusively (lane 1). Omitting Na+ shifts the assembly to the kinetically favored dimer (lane 2). 
Further increasing the Mg2+ concentration (in the absence of Na+) yields some trimer, but dimer formation still prevails (lane 3). We were able to crystallize both the dimer and trimer, but only the dimer (referred to as DCG) yielded crystals of good diffraction quality, which allowed its structure to be solved to 3.21 Å resolution (Fig. 8C and table S1). The crystal has P1 space-group symmetry, and each ASU contains one entire DCG dimer comprising two RNA tiles. DCG has an overall shape of a twisted cage (Fig. 8, D to F), with the P1 helix linking similarly shaped top and bottom parallelograms. The resemblance of the two parallelograms makes it difficult to determine the nanoparticle orientation in the crystal (fig. S3); therefore, we prepared a heavy-atom modified DCG by substituting U118 with a 5-bromouracil (fig. S3 and text S2) and thereby unambiguously established the orientation. Furthermore, inspection of the crystal packing enables the identification of a recurrent contact of stacking K-turns involving the L2 bases (Fig. 8G for DCG and fig. S4 for two other K-turn–containing structures (, )). Through this K-turn–mediated contact, the parallelograms at both the top and bottom of the DCG nanoparticle form an infinite 2D array in the crystal along crystallographic axes a and c, and layers of these arrays pack along axis b via the shape complementarity of the corrugated surfaces (fig. S5). We did not observe this contact in the crystal of PLM, although the K-turns still play an important role in mediating the crystal packing (fig. S5). Considering its apparent propensity to mediate crystal contacts, the K-turn motif might be strategically installed onto unknown RNA structures as a crystallization module. The 3D configurations of the two kissing motifs, T-J and bKL, are consequently revealed, both adopting a T-shaped geometry with similar stacking patterns. In the T-J (Fig. 8H), the intended G7:C129 pair does not form, probably as a means to accommodate the bending between P2 and P3 for the top parallelogram formation. Regarding the bKL (Fig. 8I), the electron densities of the unpaired A’s (A64, A65, and A72) in L5 are relatively weak, implying high flexibility. Compared with the inter-bKL of BRC (Fig. 5A), Hb1 and Hb2 intersect at a larger angle in this bKL, probably as a consequence of the removal of the unpaired A (corresponding to A18 in BRC) in L14 for the intended restriction of the bending of P1 and P4.

Naturally occurring T-J and bKL motifs

The T-J and bKL motifs in DCG, as well as the two bKLs in BRC, represent de novo designed motifs that are conceived specifically for nanoarchitectural purposes. Nonetheless, a number of natural RNA motifs share the same or close topological features with the T-J or bKL motifs. For example, in the antiterminator of the T-box riboswitch (), four conserved nucleotides from a 7-nt bulge form base-pairing interactions with the 3′ tail of uncharged tRNA. It has been previously predicted that the acceptor stem of tRNA, the kissing helix, and helix A1 of the T-box antiterminator form a coaxially stacking arrangement (), making this interaction a T-J motif. Recently solved crystal structures (, ) support this predicted structural arrangement. Compared to the T-J motif, the bKL motif, defined as the kissing interaction formed by an apical loop and a bulge/internal loop, occurs more often in biological RNA folds. In Fig. 9, we present 14 different bKLs from natural functional RNAs. Five of them have a stacking arrangement similar to the bKLs in the present study (i.e., a continuous stacking arrangement is formed by Hl, Hk, and Hb2): the T-box riboswitch-tRNA complex (Fig. 9A) (), the assembly of phi29 prohead RNAs (Fig. 9B) (), bacterial ribonuclease (RNase) P RNA (Fig. 9C) (), the tetrahydrofolate (THF) riboswitch (Fig. 9D) (), and the pistol ribozyme (Fig. 9E) (). Another four bKLs adopt different configurations in which the Hl-Hk-Hb2 coaxial stack is not formed: E. coli 16S ribosomal RNA (rRNA; Fig. 9F) (), adenovirus virus–associated I RNA (Fig. 9G) (), the 5-aminoimidazole-4-carboxamide riboside 5′-triphosphate (ZTP) riboswitch (Fig. 9H) (), and the group II intron (Fig. 9I) (, ). Three of these bKLs (Fig. 9, F, H, and I) have a longer linker of unpaired nucleotides (three for all of them) 3′ to the kissing region of the bulge. We suspect that the length of this linker may be one of the determinants of the stacking behaviors of the bKLs. The bKLs found in the twister ribozyme (Fig. 9J) () and the class II c-di-GMP [bis-(3′-5′)-cyclic dimeric guanosine monophosphate] riboswitch (Fig. 9K) () are more complex in that their loop moiety, besides participating in the kissing interactions (i.e., Hk) with the bulge, is involved in base-pairing interactions with nucleotide(s) nearby. The final three cases are bKLs from RNAs of unknown structures: the tertiary interaction of P11 in the subgroup IA1 introns (Fig. 9L) (), the pseudoknot in the programmed −1 ribosomal frameshifting region of the E. coli transposable element IS3411 (Fig. 9M) (), and a long-distance interaction in transmissible gastroenteritis coronavirus (TGEV) genomic RNA (gRNA) regulating subgenomic N protein mRNA synthesis (Fig. 9N) (). We predict that they should adopt a stacking pattern similar to the bKLs in this work because of the analogous secondary structures.
Fig. 9.

Examples of naturally occurring bKL-containing RNAs.

(A to E) Five structures (in green boxes) containing bKLs adopting a stacking arrangement similar to those in this work. (F to I) Four structures (in red boxes) containing bKLs adopting a different stacking arrangement to those in this work. (J and K) Two RNA structures (in purple boxes) have extra base pair(s) (purple or pink) in their bKLs. (L to N) Three RNAs (in gray boxes) containing bKLs of unknown structures. See Fig. 1C for the bKL’s helix definition (Hb1, Hb2, Hl, and Hk). Red circled number in each panel indicates the number of base pairs in the kissing helix (Hk). References: (A), (); (B), (); (C), (); (D), (); (E), (); (F), (); (G), (); (H), (); (I), (, ); (J), (); (K), (); (L), (); (M), (); (N), subgenomic mRNA (sgmRNA), (). In (B), underlined nucleotides are present in the wild-type RNA and are removed in the crystallographic construct. In (J), the 3-bp bKL formation is accompanied by a nearby A:A pair (pink) and a 2-bp interaction (purple). In (K), the bKL has a discontinuous 7-bp kissing interaction with an A:U pair (purple) inserted. For the 11 known structures [(A to K); PDB ID indicated in parentheses], arrows in the models indicate the RNA strand direction (from 5′ to 3′) and secondary structures of the bKLs are presented below the corresponding models. Bound ligands are shown as sticks in (D) and (K). For the three unknown structures (L to N), the bKL secondary structures are shown with a stacking pattern similar to those in this work due to the analogous secondary structures.

Although the bKL motif is a recurrent structural motif in natural RNAs, it has been overlooked to some degree and has been coarsely categorized as a pseudoknot when formed intramolecularly. However, close inspection reveals that an intra-bKL does not belong to any of the six basic pseudoknot types according to the most commonly used classification proposed by PseudoViewer () and adopted by PseudoBase database (, ). 
Rather, bKLs occur only in some more complex pseudoknots that result from the combination of two basic pseudoknot types (one contains a hairpin, designated by H, and the other contains a bulge loop, designated by L). Three such examples are presented in Fig. 10.
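Why an intramolecular bKL counts as a pseudoknot at all can be illustrated with a minimal sketch that tests a base-pair list for crossing pairs, the defining feature of a pseudoknot. The toy numbering below (helices Hb1, Hb2, and Hl plus a 3-bp kissing helix Hk between a bulge and an apical loop) is hypothetical and is not taken from any structure in this work; the sketch also does not attempt the PseudoViewer type classification.

```python
def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l.
    A secondary structure is pseudoknotted if this list is non-empty."""
    ordered = sorted((min(p), max(p)) for p in pairs)
    crossings = []
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


# Hypothetical single-strand toy tile (1-based nucleotide indices)
HB1 = [(1, 50), (2, 49), (3, 48)]      # helix on one side of the bulge
HB2 = [(10, 47), (11, 46), (12, 45)]   # helix on the other side of the bulge
HL = [(20, 35), (21, 34), (22, 33)]    # stem closing the apical loop (23-32)
HK = [(5, 29), (6, 28), (7, 27)]       # kissing helix: bulge (4-9) with loop

nested_only = crossing_pairs(HB1 + HB2 + HL)
with_kissing = crossing_pairs(HB1 + HB2 + HL + HK)

print(len(nested_only), len(with_kissing))
```

In this toy case the three stems alone are fully nested, and every crossing that appears after adding Hk involves a kissing base pair, which is why the motif is formally grouped with pseudoknots even though it combines a hairpin element and a bulge element.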
Fig. 10.

An intra-bKL can be generated by fusing two simple pseudoknots at the position of the kissing helix.

(A to C) Three bKLs of different configurations are generated by fusing H- and HLout-type pseudoknots (A), H- and LLin-type pseudoknots (B), and HH- and HLout-type pseudoknots (C). The representation and denotation for the pseudoknots are from the convention of PseudoViewer (), where H designates a hairpin and L designates a bulge, interior, or multiple loop. The symbol #, which is adopted from the denotation for the operation of connected sum in topology, designates the fusion of pseudoknots. The hairpin parts are colored in blue. The bulge parts are in orange. The kissing helices are in red, and the rest are in gray. Using a similar fusion operation, we can further get composite pseudoknots of HH # LLin, HLin # HLout, and HLin # LLin that contain bKLs, and their schematics are omitted here. We did not include HHH-type pseudoknot in our discussion because it can be viewed as H # H.

Solving unknown bKL structures may have profound significance in biomedical sciences. For example, genetic and biochemical studies have identified several long-distance interactions in viral RNAs involving the base-pairing of RNA elements that are distantly positioned in the primary sequences (). These interactions play diverse and important roles in regulating viral processes, and a large portion of them are essentially bKLs. As a conserved bKL in some coronavirus species, the one found in the TGEV gRNA (Fig. 9N) is a remarkable example, which involves a hairpin loop and a bulge located more than 25,000 nucleotides away. Understanding the structures of these bKL-mediated long-distance interactions would enable better understanding of mechanisms of the relevant viral processes. Furthermore, it has been argued that appropriate druggable RNA motifs should have greater structural complexity than the simple helices, bulges, and stem loops so that they can achieve both specificity and binding affinity for drug-like molecules (). 
From this perspective, as three-way branched tertiary interactions, bKLs found in viral RNAs could be attractive targets for developing small-molecule antivirals. Three riboswitches from the bKL-containing structures shown in Fig. 9 have their ligands bound to pockets defined by bKLs, implying that bKLs are predisposed by nature to bind small molecules. Just as we could configure unknown KLs within 2D circularly closed nanoarchitectures (such as PLM) to elucidate their structures, we can also configure unknown bKLs within 3D cages to elucidate the bKL structures. As is seen from the solved bKLs shown in Fig. 9 (A to K), several factors appear to dictate the topologies of bKLs, such as the length of the kissing duplex (i.e., Hk), the unpaired nucleotides (number and identity) linking the helical domains, and the structural contexts in which the bKLs reside. As more bKL structures become experimentally determined, sufficient knowledge of this class of motif will accumulate to unveil a set of rules to guide the predictions of structural features of bKLs, analogous to what has been achieved for RNA multiway junctions (). In addition, the solved structures could serve as templates for computational homology modeling (, ) of other unknown bKLs with similar secondary structures.

DISCUSSION

In this work, we have solved the crystal structures of three synthetic RNA nanoarchitectures as our effort to integrate x-ray crystallography and NA-based nanotechnology. Crystallography has played an indispensable role in the early development of the field of structural DNA and RNA nanotechnology. On the one hand, the structural features of helices and other motifs that are revealed mostly by crystallography have informed the design principles of synthetic DNA and RNA nanostructures. On the other hand, the initial motivation for the field was actually derived from the desire to rationally construct 3D crystalline lattices from an infinite network of DNA multiway junctions connected by sticky end cohesions (). This goal was partly achieved by Paukstelis and co-workers () in 2004 using noncanonical DNA interactions and ultimately achieved by the laboratories of Seeman and Mao () in 2009 using DNA tensegrity triangle, in which all the interactions are rationally designed WC base-pairings. Later, Yan and co-workers (, ) showed how more diverse DNA motifs can be used for constructing rationally designed DNA lattices. Whereas the previous work aimed to construct DNA lattices with precise control, the present work aims to demonstrate the feasibility of getting crystal structures from discrete RNA nanostructures and to elucidate the structural details. The structures presented in this work are among the most complex synthetic NA-based nanoarchitectures that have been determined by crystallography so far; nevertheless, they remain relatively simple compared to the architectures being designed and constructed in the field currently, such as those based on DNA origami (, ) or DNA bricks (, ). 3D DNA arrays have been recently constructed with DNA origami building blocks (, ); however, these 3D arrays are not amenable to x-ray structural determination because of the imperfections caused by growth defects and/or the flexibility and heterogeneity of their building blocks. 
In our experiments, we screened a total of 29 RNA nanostructures, including some larger and more complex RNA homooligomeric nanocages. Although we could obtain crystals from 17 structures, only six yielded crystals of diffraction better than 4 Å. Besides the three solved structures presented in the current work, the other three unsolved structures are a monomeric nanopyramid (3.45 Å), a 2D trimeric hexagon (3.10 Å), and a 3D trimeric nanocage (3.53 Å). We are still in the process of solving these crystal structures, and preliminary trials of molecular replacement using the constituent bent motifs of known structures failed, implying substantial distortions of the motifs used in these structures. Nonetheless, these efforts conveyed two related empirical lessons. First, the crystal quality seems to deteriorate as the size of the cage or the number of subunits increases, likely reflecting the increase in both the structural flexibility and crystal solvent content associated with a larger enclosed cavity. Second, structurally stressed or kinetically favorable assemblies appear to generate better diffracting crystals. BRC and DCG represent two case-in-point examples illustrated by our work. Although the dimeric DCG yielded good-diffracting crystals, the trimeric nanocage assembled from the same RNA did not. Replacing one of the K-turns (~60° internal angle) in the monomer with an AACUA bulge (~90° internal angle) produced a trimeric cage (expected to be more structurally stressed than the trimer assembled from DCG RNA) that yielded a crystal diffracting to 3.53 Å resolution. Going forward, we plan to use postcrystallization treatments () such as cation replacement and crystal dehydration (), which can improve the diffraction by inducing favorable rigid body–like lattice rearrangements driven by solvent loss. 
Therefore, we see broad opportunities to extend the crystallographic investigations into artificially designed NA-based nanoarchitectures having greater complexity or larger cavities. This work also demonstrates the feasibility of determining the structures of unknown motifs or validating motif structures obtained by other methods (such as NMR or computer modeling) by solving the nanostructures containing them. Many RNA and DNA structural motifs have intrinsic flexibility, which can hinder crystallization; configuring them as constituent elements in closed nanostructures could impose geometrical constraints that mitigate their flexibility. Moreover, different nanostructures may capture snapshots of different configurational states of the same motif, enabling a more comprehensive understanding of its structural dynamics. In a similar vein to our current approach, configuration of a tetraloop-tetraloop receptor complex into a dimeric nanostructure () facilitated its NMR study. In addition, a conceptually similar nanoarchitectural platform has been recently devised to enable high-throughput investigation of the thermodynamics governing diverse RNA two-way junctions (). Besides the assets associated with closed geometry, using homomeric nanostructures for crystallographic studies offers an additional benefit that the symmetry could promote crystallization by reducing the number of distinct crystal contacts required to form a connected network in 3D space (, ). This symmetry principle operates for some natural RNAs that crystallize as domain-swapped dimers [such as the Plautia stali intestine virus IRES RNA (), the S-adenosyl-(L)-homocysteine riboswitch (), the ZTP riboswitch (), the Varkud satellite ribozyme (), and the glutamine-II riboswitch ()]. 
In conclusion, we believe that the interactive and iterative interplay between structural determination and nanoarchitectural construction yields valuable guidelines for nanoarchitectural design with a greater precision and provides a potentially robust methodology for solving challenging structures, benefiting both nanotechnologists and structural biologists.

MATERIALS AND METHODS

RNA sequence design and preparation

The sequences were designed with the assistance of the program NUPACK (). The secondary structure of each RNA molecule was further evaluated by Mfold () to ensure the correct folding. All RNA molecules were synthesized by in vitro transcription using the HiScribe T7 High Yield RNA Synthesis Kit from New England Biolabs (NEB). The corresponding DNA templates for in vitro transcription were the polymerase chain reaction (PCR) products of gBlocks gene fragments (sequences shown in table S2) from Integrated DNA Technologies. The PCR experiments were conducted using the Q5 Hot Start High-Fidelity DNA Polymerase (NEB) following the protocol recommended by NEB. Two methods were used to reduce transcriptional heterogeneity at the 3′ ends of the RNAs: For PLM, the RNA was transcribed with a self-cleaving hepatitis delta virus ribozyme at the 3′ end (); for BRC and DCG, the first two nucleotides of the reverse primer were modified with 2′-O-methyl groups (). Experimental details to prepare the circular version of the BRC RNA and the 5-bromouracil–substituted DCG RNA are presented in texts S1 and S2. All RNA molecules were purified by denaturing polyacrylamide gel electrophoresis (PAGE) (containing 7 M urea), precipitated with ethanol, and resuspended in pure water. The RNA concentration was determined by measuring optical density at 260 nm.

RNA nanostructure assembly

Before assembly, RNAs were denatured at 85°C for 1 min and snap-cooled on ice. Then, the annealing buffers containing 20 mM tris acetate (pH 8.0) and varied concentrations of Mg2+ (using 100 mM Mg2+ stock solution containing 110 mM MgCl2 and 10 mM EDTA) and Na+ (using 1 M Na+ stock solution containing 1 M NaCl) were added to the denatured RNAs (to a final RNA concentration of ~800 nM). The mixtures were then annealed from 70° to 4°C with the following protocol: 70° to 50°C over 6 min, 50° to 37°C over 20 min, and 37°C to 4°C over 2 hours. The annealed RNAs were analyzed by native PAGE (nPAGE; 4 or 6%) in 0.5× tris-borate EDTA (TBE) buffer supplemented by 3 mM MgCl2 (so that the final concentration of Mg2+ is 2 mM because 1 mM EDTA is included in 0.5 × TBE). To prepare the samples for crystallization experiments, the cation concentrations in the annealing buffers were chosen on the basis of the nPAGE results: For PLM, the annealing buffer contains 1 mM Mg2+; for both BRC and DCG, the annealing buffer contains 0.3 mM Mg2+. The assembled nanostructures need to be concentrated for the crystallization experiments. Before concentration, Mg2+ was added to the annealing mixtures to a final concentration of 2 mM (using the 100 mM Mg2+ stock solution). Then, the samples were concentrated to ~5 μg/μl using Amicon Ultra centrifugal filters (molecular weight cutoff, 10 kDa).
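The cation bookkeeping and the annealing ramp described above can be checked with a short sketch. It assumes that EDTA chelates Mg2+ with 1:1 stoichiometry and that each ramp segment is linear in time; both are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
def effective_mg_mM(mgcl2_mM: float, edta_mM: float) -> float:
    """Mg2+ left unchelated, assuming 1:1 Mg2+:EDTA binding."""
    return max(mgcl2_mM - edta_mM, 0.0)


# (start degC, end degC, duration in minutes), as listed in the protocol
ANNEALING_SEGMENTS = [(70, 50, 6), (50, 37, 20), (37, 4, 120)]


def cooling_rates(segments):
    """Average cooling rate (degC per minute) of each segment,
    assuming a linear ramp within the segment."""
    return [(start - end) / minutes for start, end, minutes in segments]


stock_mg = effective_mg_mM(110, 10)  # "100 mM Mg2+" stock solution
gel_mg = effective_mg_mM(3, 1)       # 0.5x TBE supplemented with 3 mM MgCl2
rates = cooling_rates(ANNEALING_SEGMENTS)
total_minutes = sum(minutes for _, _, minutes in ANNEALING_SEGMENTS)

print(stock_mg, gel_mg, [round(r, 3) for r in rates], total_minutes)
```

Under these assumptions, the stock works out to 100 mM and the gel buffer to 2 mM effective Mg2+, matching the values quoted in the text, and the ramp slows from roughly 3.3°C/min in the first segment to about 0.28°C/min in the last.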

Crystallization and structure determination

The initial crystallization trials were carried out using the high-throughput hanging-drop vapor-diffusion method in 96-well plates set up by a mosquito liquid handling robot (TTP Labtech). Each drop contained 0.1 μl of RNA and 0.1 μl of reservoir solution. Four commercially premade screening kits [Natrix (Hampton), Nucleix (QIAGEN), Nuc-Pro (Jena Biosciences), and Crystallization Kit for RNA (Sigma-Aldrich)] were used. Initial hits were identified, and the drops were enlarged to contain 1.2 μl of RNA and 1.2 μl of reservoir solution on siliconized glass slides. The final crystallization condition for PLM contains 50 mM Hepes (pH 7.0), 20 mM KCl, 5 mM MnCl2, and 35% (v/v) MPD (2-methyl-2,4-pentanediol). The final crystallization condition for BRC contains 80 mM NaCl, 12 mM KCl, 20 mM MgCl2, 40 mM sodium cacodylate (pH 6.0), 30% (v/v) MPD, and 12 mM spermine tetrahydrochloride. The final crystallization condition for DCG contains 20 mM MgCl2, 0.3 mM spermine tetrahydrochloride, 50 mM sodium succinate (pH 5.5), and 3.0 M ammonium sulfate. Datasets were collected at Northeastern Collaborative Access Team beamlines 24-ID-C&E at the Advanced Photon Source of Argonne National Laboratory. To solve the structure of PLM, molecular replacement was performed using the structure of K-turn (PDB ID: 4CS1) () with flanking 9- and 6-bp A-form helices as the search model. The KL was built by placing a 7-bp A-form helix into its supposed position in the electron density map, followed by rigid body refinement. The model was completed by ligating all fragments. To solve the structure of BRC, the AACUA bulge motif (PDB ID: 2NOK) () flanked by two standard 7-bp A-form helices was built. Molecular replacement was performed by using two copies of the bulge structure and two copies of standard 10-bp A-form helices as the search models. Relying on the initial map from this solution, more nucleotides were built into the map. 
The structure of DCG was solved by molecular replacement using the structure of the K-turn (PDB ID: 4CS1) () with flanking 10- and 5-bp A-form helices as the search model. The complete model was obtained by iteratively building more nucleotides and improving the map. To determine the orientation of DCG in the lattice, we collected anomalous diffraction data of 5-bromouracil–substituted DCG at a wavelength of 0.9195 Å. The locations of the bromine atoms were determined by MR-SAD using PHENIX (). All models were built in COOT (). All molecular replacement and refinement were performed using PHENIX (). All crystal structure figures were prepared with PyMOL (DeLano Scientific LLC). Analyses of crystal structures (including identification of noncanonical base pairs and determination of interhelical angles) were assisted by DSSR (). The statistics of data collection and refinement are tabulated in table S1.
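The choice of 0.9195 Å for the bromine anomalous data can be related to photon energy with the standard conversion E = hc/λ. The Python sketch below is illustrative only. The bromine K-edge value (about 13.474 keV) is an approximate tabulated figure supplied here as an assumption rather than a number reported in this work, and the function name is hypothetical.

```python
# Illustrative conversion; the edge energy below is an approximate tabulated
# value assumed for this sketch, not a number taken from the paper.

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV * angstrom
BR_K_EDGE_KEV = 13.474          # approximate bromine K absorption edge (assumption)


def wavelength_to_energy_kev(wavelength_angstrom):
    """Convert an x-ray wavelength in angstroms to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


energy_kev = wavelength_to_energy_kev(0.9195)
above_edge = energy_kev > BR_K_EDGE_KEV
```

Under these assumptions, 0.9195 Å corresponds to roughly 13.48 keV, which lies just above the assumed bromine K-edge, the region where the anomalous signal from bromine is expected to be strong.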

Cryo-EM imaging

Cryo-EM imaging was performed for the structure of BRC. The concentrated sample (~5 μg/μl) prepared for the crystallization experiment was diluted 10-fold with 1× tris-acetate-EDTA–Mg buffer [11 mM MgCl2, 40 mM tris, 20 mM acetic acid, and 1 mM EDTA (pH 8.0)], and 3 μl of the diluted sample was applied onto a glow-discharged C-flat holey carbon grid (CF-1.2/1.3-4C, EMS), blotted for 5.5 s, and immediately flash-frozen in liquid nitrogen–cooled liquid ethane with a Cryoplunge 3 System (GATAN). Images were collected on a JEOL 3200FS transmission electron microscope (300 kV) equipped with a K2 Summit camera (GATAN) under low-dose mode. Images were recorded at ×25,000 microscope magnification with the defocus ranging from about −1.0 to −4.0 μm. Reference-free class averages were generated using EMAN2 (). Because of the small molecular weight and presumed structural flexibility, we did not collect a large dataset to attempt single-particle reconstruction.

Computer modeling of HIV-1 KL complex using SimRNA

From the crystal structure of BRC, we extracted seven nucleotides (G12 of chain A and G102, A103, A104, C110, A111, and C112 of chain B) to generate the distance constraints for the corresponding nucleotides in the HIV-1 KL complex (G274 of chain A or chain B, and G271, A272, A273, C279, A280, and C281 of the other chain). These constraints ensured that the three unpaired A’s of the target KL structure would have the same conformations and stacking environments as those in the inter-bKL of BRC. The SimRNA (, ) simulation was run with the provided sequence file, secondary structure file, and distance constraints file. After the simulation, the structure with the lowest energy was retrieved from the trajectory output file. The output structure was further minimized/refined with QRNAS () using default settings.
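The residue correspondence used for the distance constraints can be written out explicitly. The Python sketch below is illustrative bookkeeping only: it encodes the seven nucleotide pairs named in the text and enumerates the residue pairs between which distances could be measured. It does not reproduce the SimRNA restraint file format or the atoms and distance values actually used, none of which are given here, and all function and variable names are hypothetical.

```python
from itertools import combinations

# (BRC chain, BRC residue, HIV-1 KL residue, strand role) as listed in the text.
# "first" and "second" mark the two strands of the HIV-1 KL dimer; because the
# dimer is symmetric, either chain A or chain B can play the "first" role.
CORRESPONDENCE = [
    ("A", "G12", "G274", "first"),
    ("B", "G102", "G271", "second"),
    ("B", "A103", "A272", "second"),
    ("B", "A104", "A273", "second"),
    ("B", "C110", "C279", "second"),
    ("B", "A111", "A280", "second"),
    ("B", "C112", "C281", "second"),
]


def hiv_targets(first_chain="A"):
    """Map each (BRC chain, residue) to its (HIV-1 KL chain, residue) target."""
    if first_chain not in ("A", "B"):
        raise ValueError("first_chain must be 'A' or 'B'")
    second_chain = "B" if first_chain == "A" else "A"
    chain_for = {"first": first_chain, "second": second_chain}
    return {
        (brc_chain, brc_res): (chain_for[role], hiv_res)
        for brc_chain, brc_res, hiv_res, role in CORRESPONDENCE
    }


def residue_pairs(mapping):
    """All unordered pairs of target residues that a distance could constrain."""
    return list(combinations(sorted(mapping.values()), 2))
```

Calling `hiv_targets("A")` and `hiv_targets("B")` gives the two symmetric assignments described in the text, and `residue_pairs` lists the 21 residue pairs available among the seven nucleotides of each assignment. Which of these pairs, and which atoms within them, were actually constrained is not stated in the text.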

1.  Paranemic cohesion of topologically-closed DNA molecules.

Authors:  Xiaoping Zhang; Hao Yan; Zhiyong Shen; Nadrian C Seeman
Journal:  J Am Chem Soc       Date:  2002-11-06       Impact factor: 15.419

2.  Functional architecture of HCV IRES domain II stabilized by divalent metal ions in the crystal and in solution.

Authors:  Sergey M Dibrov; Hillary Johnston-Cox; Yi-Hsin Weng; Thomas Hermann
Journal:  Angew Chem Int Ed Engl       Date:  2007       Impact factor: 15.336

3.  Using Rosetta for RNA homology modeling.

Authors:  Andrew M Watkins; Ramya Rangan; Rhiju Das
Journal:  Methods Enzymol       Date:  2019-06-11       Impact factor: 1.600

4.  Apical loop-internal loop RNA pseudoknots: a new type of stimulator of -1 translational frameshifting in bacteria.

Authors:  Marie-Hélène Mazauric; Patricia Licznar; Marie-Françoise Prère; Isabelle Canal; Olivier Fayet
Journal:  J Biol Chem       Date:  2008-05-12       Impact factor: 5.157

5.  Structure of the S-adenosylmethionine riboswitch regulatory mRNA element.

Authors:  Rebecca K Montange; Robert T Batey
Journal:  Nature       Date:  2006-06-29       Impact factor: 49.962

6.  Structure and assembly of the essential RNA ring component of a viral DNA packaging motor.

Authors:  Fang Ding; Changrui Lu; Wei Zhao; Kanagalaghatta R Rajashankar; Dwight L Anderson; Paul J Jardine; Shelley Grimes; Ailong Ke
Journal:  Proc Natl Acad Sci U S A       Date:  2011-04-06       Impact factor: 11.205

Review 7.  The emerging field of RNA nanotechnology.

Authors:  Peixuan Guo
Journal:  Nat Nanotechnol       Date:  2010-11-21       Impact factor: 39.213

8.  PseudoBase++: an extension of PseudoBase for easy searching, formatting and visualization of pseudoknots.

Authors:  Michela Taufer; Abel Licon; Roberto Araiza; David Mireles; F H D van Batenburg; Alexander P Gultyaev; Ming-Ying Leung
Journal:  Nucleic Acids Res       Date:  2008-11-06       Impact factor: 16.971

9.  Twenty years of RNA crystallography.

Authors:  Eric Westhof
Journal:  RNA       Date:  2015-04       Impact factor: 4.942

10.  DSSR: an integrated software tool for dissecting the spatial structure of RNA.

Authors:  Xiang-Jun Lu; Harmen J Bussemaker; Wilma K Olson
Journal:  Nucleic Acids Res       Date:  2015-07-15       Impact factor: 16.971
