Isotope Labels Combined with Solution NMR Spectroscopy Make Visible the Invisible Conformations of Small-to-Large RNAs.

Theodore K Dayie, Lukasz T Olenginski, Kehinde M Taiwo.

Abstract

RNA is central to the proper function of cellular processes important for life on earth and implicated in various medical dysfunctions. Yet, RNA structural biology lags significantly behind that of proteins, limiting mechanistic understanding of RNA chemical biology. Fortunately, solution NMR spectroscopy can probe the structural dynamics of RNA in solution at atomic resolution, opening the door to their functional understanding. However, NMR analysis of RNA, with only four unique ribonucleotide building blocks, suffers from spectral crowding and broad linewidths, especially as RNAs grow in size. One effective strategy to overcome these challenges is to introduce NMR-active stable isotopes into RNA. However, traditional uniform labeling methods introduce scalar and dipolar couplings that complicate the implementation and analysis of NMR measurements. This challenge can be circumvented with selective isotope labeling. In this review, we outline the development of labeling technologies and their application to study biologically relevant RNAs and their complexes ranging in size from 5 to 300 kDa by NMR spectroscopy.

Year:  2022        PMID: 35442658      PMCID: PMC9136934          DOI: 10.1021/acs.chemrev.1c00845

Source DB:  PubMed          Journal:  Chem Rev        ISSN: 0009-2665            Impact factor:   72.087


Introduction

RNA is central to medicine, chemical and structural biology, and basic research. For more than a half-century, it has been known that the code of life is imprinted in DNA sequences, following the so-called “sequence hypothesis”, usually wrongly labeled as the “central dogma” in popular parlance.[1] In the last several decades, it has become increasingly clear that the functions of cells are also transacted by DNA’s lesser-known relative, RNA.[2] Indeed, the varied roles that RNAs play in both normal and dysfunctional cells have motivated RNA-based therapeutic development, as highlighted by the recent SARS-CoV-2 mRNA vaccines.[3−9] Additionally, RNAs are central to the workings of molecular nanomachines such as the ribosome[10−12] and the spliceosome,[13−15] to name a few. Moreover, thanks to the advent of genomic sequencing efforts, we now understand that the amount of RNA sequence transcribed in humans exceeds the number of protein sequences translated by at least 50-fold (Figure A).[16] Paradoxically, the fraction of RNA-only structures deposited in the Protein Data Bank (PDB) remains below 1%, whereas that of protein-only structures is a staggering 87% (Figure B). This paucity undercuts current understanding of RNA structure–function relationships.
Figure 1

(A) Percentage of protein coding and nonprotein coding genomic material in selected genomes.[16] Organismal complexity increases with RNA coding but decreases with protein coding capacity as a percentage of the DNA genomic output. (B) Percentage of RNA-only and protein-only structures deposited in the PDB. Given that this analysis excluded DNA-only structures and structures of protein–DNA/RNA complexes, the percentages do not sum to 100%. (C) Percentage of RNA-only and protein-only structures deposited in the Nucleic Acid Database (NDB) and PDB, sorted by structure determination technique. Given that this analysis is self-contained within categories, the percentages sum to 100%. NMR accounts for a larger fraction of RNA structures as compared to proteins. PDB and NDB statistics were accessed from https://www.rcsb.org/ and http://ndbserver.rutgers.edu/ in January 2022.

Nuclear magnetic resonance (NMR) spectroscopy accounts for ∼35% of the RNA structures deposited in the PDB and ∼7% of the protein structures, making it competitive with other biophysical tools such as X-ray crystallography and, more recently, cryo-electron microscopy (cryo-EM) (Figure C).[17] Moreover, NMR spectroscopy provides high-resolution structural dynamic information in solution, rendering it an ideal tool to study RNA and its interactions with macromolecules or small drug-like compounds or both.[18−25] However, unlike proteins, which are made up of 20 unique amino acid building blocks, RNAs are composed of only four aromatic nucleotides [i.e., adenosine (Ade or A), guanosine (Gua or G), cytidine (Cyt or C), and uridine (Uri or U)] that resonate over a very narrow chemical shift region. This poor chemical shift dispersion is further exacerbated with increasing RNA size. To overcome these limitations, novel isotope labeling strategies that incorporate atom-specific labels (e.g., uridine 13C6) or expand the number of NMR probes beyond the traditional 1H–15N and 1H–13C spin pairs (e.g., 13C–19F) have been developed. In this review, we will outline the development of isotope labeling technologies for RNA NMR and some of the exciting new applications enabled by these labels to study small-to-large RNAs. Specifically, we will begin by detailing the benefits afforded by each common NMR-active isotope (Section ). Next, we outline the various technologies that incorporate such labels into RNA building blocks and eventually into RNA (Section ). This discussion will center around chemo-enzymatic labeling, a method that our group has developed extensively over nearly a decade. Next, we will examine how these labels benefit dynamics measurements (Section ) and can be leveraged to study interactions involving large RNA systems (Section ). Finally, we will comment on how isotope labeling can advance the field of RNA chemical and structural biology (Section ).

Stable Isotopes in NMR Spectroscopy

Frederick Soddy is credited with coining the word “isotope” from the Greek isos (ἴσος) and topos (τόπος) meaning “same place”,[26] with the idea that stable isotopes are chemical elements that occupy the same position in the periodic table but differ in mass due to a different number of neutrons within the atomic nucleus. Stable isotopes have been used in a wide range of applications in industry, academia, and medicine.[26] In particular, stable isotopes have significantly impacted methods such as NMR and mass spectrometry (MS). For this work, we will focus on how these probes impact RNA NMR spectroscopy, with special emphasis on proton (hydrogen-1 or 1H), deuterium (hydrogen-2 or 2H), carbon-13 (13C), nitrogen-15 (15N), fluorine-19 (19F), and phosphorus-31 (31P) (Table ).
Table 1

Stable Isotopes Relevant to RNA NMR Spectroscopy[27,28]

isotope    natural abundance (%)    γ (rad Hz T^-1)     spin
1H         99.99                    26.752 × 10^7       1/2
2H         0.01                     4.107 × 10^7        1
12C        98.90                    NMR inactive        NMR inactive
13C        1.10                     6.728 × 10^7        1/2
14N        99.63                    1.934 × 10^7        1
15N        0.37                     –2.713 × 10^7       1/2
19F        100.00                   25.181 × 10^7       1/2
31P        100.00                   10.839 × 10^7       1/2
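
As a rough illustration of how the gyromagnetic ratios in Table 1 translate into resonance frequencies, the sketch below evaluates the Larmor relation ν = γB0/2π. The field strength of 14.1 T (a nominal "600 MHz" magnet) and the function name `larmor_mhz` are assumptions chosen for the example; they do not come from the review.

```python
import math

# Gyromagnetic ratios from Table 1, in rad s^-1 T^-1
# (the table lists the unit as "rad Hz T^-1").
GAMMA = {
    "1H": 26.752e7,
    "2H": 4.107e7,
    "13C": 6.728e7,
    "15N": -2.713e7,
    "19F": 25.181e7,
    "31P": 10.839e7,
}


def larmor_mhz(isotope: str, b0_tesla: float) -> float:
    """Signed Larmor frequency in MHz: nu = gamma * B0 / (2 * pi)."""
    return GAMMA[isotope] * b0_tesla / (2 * math.pi) / 1e6


# Hypothetical field strength, chosen only for illustration.
B0 = 14.1
for iso in GAMMA:
    print(f"{iso:>4s}: {larmor_mhz(iso, B0):9.2f} MHz")
```

The negative sign for 15N simply reflects its negative γ; spectrometer frequencies are usually quoted as magnitudes.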

Proton Isotope

The proton isotope has high natural abundance (∼100%) and the highest sensitivity of NMR receptive and stable nuclei (Table ). Therefore, homonuclear two-dimensional (2D) 1H–1H NMR methods were attractive in the early days of NMR analysis. However, the very limited resolution of ribose and aromatic nucleobase resonances in the RNA 1H spectra restricted such studies to small RNAs (<5 kDa). Within the ribose, all protons with the exception of H1′ (i.e., H2′, H3′, H4′, H5′, and H5′′) are clustered within a narrow ∼0.6–0.8 ppm range (Figure A).[29] Within the nucleobase, the chemical shift distribution of all protons is limited to 1 ppm or less, except for imino protons with a dispersion of ∼4 ppm (Figure A).[30,31] Taken together, the distribution of proton resonances leads to severe chemical shift overlap that worsens as RNAs grow in size due to increased line broadening (Figure B). This, in part, explains the paucity of NMR structures of large RNAs (e.g., > 60 nt) (Figure C).
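
The sensitivity advantage of the proton can be made concrete with a commonly used estimate of receptivity at natural abundance, which scales as |γ|^3 × abundance × I(I+1). The sketch below applies that textbook proportionality to the values in Table 1; the formula and the helper names are assumptions introduced here for illustration and are not taken from the review.

```python
# (gamma in 1e7 rad s^-1 T^-1, natural abundance in %, spin I), from Table 1.
ISOTOPES = {
    "1H": (26.752, 99.99, 0.5),
    "2H": (4.107, 0.01, 1.0),
    "13C": (6.728, 1.10, 0.5),
    "15N": (-2.713, 0.37, 0.5),
    "19F": (25.181, 100.00, 0.5),
    "31P": (10.839, 100.00, 0.5),
}


def receptivity(gamma: float, abundance: float, spin: float) -> float:
    """Unnormalized receptivity: |gamma|^3 * abundance * I * (I + 1)."""
    return abs(gamma) ** 3 * abundance * spin * (spin + 1)


def relative_receptivity(isotope: str, reference: str = "1H") -> float:
    """Receptivity of `isotope` relative to `reference` at natural abundance."""
    return receptivity(*ISOTOPES[isotope]) / receptivity(*ISOTOPES[reference])


for iso in ISOTOPES:
    print(f"{iso:>4s}: {relative_receptivity(iso):.2e}")
```

With these inputs, 19F comes out at roughly 0.83 of the proton value, while 13C and 15N are several orders of magnitude lower at natural abundance, which is one reason isotopic enrichment matters.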
Figure 2

(A) 1H NMR spectrum of a 61 nt RNA emphasizes the narrow chemical shift dispersion of RNA protons. Here, bp and nc refer to canonical Watson–Crick base pairs and noncanonical base pairs, respectively. A schematic of RNA ribose and nucleobase structures and numbering is shown above the spectrum. (B) Nucleobase region of 1H NMR spectra for RNAs of increasing size. Both signal overlap and broad linewidths worsen as RNAs grow in size. In fact, for the best visual representation, the signals corresponding to the 61 and 232 nt RNAs were increased to display them on a similar scale to that of the 14 nt RNA. (C) Histogram of RNA NMR structures in the NDB, sorted by RNA size (in nt, bin = 10 nt). Given the challenges faced by RNA NMR, there are only 23 NMR structures corresponding to RNAs > 60 nt. NDB statistics were accessed from http://ndbserver.rutgers.edu/ in January 2022.

Heteronuclear 15N and 13C Isotopes

Unlike protons, with a chemical shift span of 2–15 ppm, 15N and 13C nuclei in nucleic acids have larger chemical shift distributions among the various atomic sites. For example, 13C nuclei in RNA have chemical shifts from 61 (C5′-ribose) to 170 (nonprotonated pyrimidine nucleobase C4) ppm, and 15N nuclei from 70 (amino nitrogen) to 240 (nonprotonated purine nucleobase N7) ppm.[29−31] Introduction of the 15N isotope (0.37%) into RNA nucleobases circumvents the extensive line broadening arising from the electric quadrupole moment of the naturally abundant 14N isotope (99.63%) (Table ). Incorporation of the 15N isotope has several additional advantages. As a spin 1/2 nucleus with low gyromagnetic ratio (γ) (Table ), the 15N isotope provides very narrow spectral lines. Nitrogen atoms, like protons and carbons, are distributed in nucleic acid major and minor grooves, and both grooves serve as important sites for metal, drug, or macromolecule interactions. Given the wider chemical shift dispersion of 15N over the 1H nucleus and its narrower linewidths over 13C and 1H nuclei, 15N is better suited to monitor those grooves, especially in larger RNAs. However, nitrogen’s low γ is also an “Achilles heel”. In the absence of appropriate NMR cryogenic probes and high magnetic fields, detecting low-γ nuclei such as 15N has been unattractive. Increasing availability of such probes is expected to reverse this trend. Taken together, these considerations suggest that the shortcomings of proton NMR can be overcome by heteronuclear NMR methods.[32] Beginning in the 1980s, several groups introduced 15N, 2H, and 13C labels to facilitate NMR studies of RNAs and proteins.[33−42] Depending on the scientific question, these labels were introduced uniformly or selectively using bacteria in vivo or enzyme-catalyzed synthesis in vitro. Selective enrichment was achieved by growing auxotrophs on obligate chemically synthesized compounds.
13C-labeling of bacterial tRNAs[33−35] and 15N-labeling of tRNA and 5S rRNA enabled various atomic sites in these RNAs to be monitored by NMR. Uniform 15N-labeling was also applied to 5S rRNA in vivo.[36−39] To extend this labeling to additional RNAs, several research groups developed in vitro methods to convert ribonucleoside 5′-monophosphates isolated from bacteria grown on 15N-, 2H-, and 13C-sources into the corresponding triphosphates for in vitro transcription.[43−47] These uniform 15N- and 13C-labeling technologies did extend the use of NMR to medium-sized RNAs (MW < 20 kDa). However, two perennial challenges of low signal-to-noise and decreased spectral resolution remained. The latter problem arises from the reintroduction of spectral overlap along the heteronuclear dimension as the RNA grows in size, and the former arises from increased relaxation that results from the slower overall tumbling of large biomolecules. The next section will describe recent labeling methods to overcome both problems.
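
To give a feel for the chemical shift ranges quoted above, the following sketch converts them from ppm into frequency spans in Hz. It assumes a hypothetical spectrometer with a 600 MHz proton frequency and derives the 13C and 15N carrier frequencies from the γ values in Table 1; it is an illustration only, since usable resolution also depends on linewidths, which are not modeled here.

```python
# Gyromagnetic ratios from Table 1 (rad s^-1 T^-1).
GAMMA = {"1H": 26.752e7, "13C": 6.728e7, "15N": -2.713e7}

# Chemical shift ranges (ppm) as quoted in the text above.
SHIFT_RANGE_PPM = {
    "1H": (2.0, 15.0),
    "13C": (61.0, 170.0),
    "15N": (70.0, 240.0),
}


def span_hz(isotope: str, proton_mhz: float = 600.0) -> float:
    """Width of the quoted shift range in Hz at the given proton frequency."""
    # Carrier frequency of the nucleus scales with |gamma| relative to 1H.
    carrier_mhz = proton_mhz * abs(GAMMA[isotope]) / GAMMA["1H"]
    low, high = SHIFT_RANGE_PPM[isotope]
    # ppm multiplied by MHz gives Hz.
    return (high - low) * carrier_mhz


for iso in SHIFT_RANGE_PPM:
    low, high = SHIFT_RANGE_PPM[iso]
    print(f"{iso:>4s}: {high - low:5.0f} ppm = {span_hz(iso):8.0f} Hz")
```

Under these assumptions the 13C and 15N ranges span more Hz than the proton range even though both nuclei resonate at much lower frequencies.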

Deuteration in Context of Heteronuclear 15N and 13C Isotopes

Deuteration (i.e., replacement of protons with deuterons) simplifies the multiplicity of spin–spin interactions, eliminates nonessential resonance lines, reduces spectral crowding, helps to identify coupling patterns, and improves the precision with which coupling constants can be calculated.[48] Given the smaller γ of the deuterium spin relative to proton (γD ≈ γH/6.5) (Table ), the relaxation rates for deuterated nuclei are scaled down to ∼2% [(γD/γH)^2 ≈ 0.02]. By eliminating competing relaxation pathways of dipolar coupled protons, deuteration suppresses spin diffusion within a relaxation network, leading to smaller linewidths and higher signal-to-noise for the remaining protons and directly attached 13C and 15N nuclei.[47−49] Given these advantages, 2H-labeling has played an important role in probing the structure, dynamics, and interactions of large RNAs by NMR.[17,50−55]
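
The scaling factor quoted for deuteration follows directly from the gyromagnetic ratios in Table 1. The short sketch below reproduces only that γ^2 factor; it deliberately ignores other contributions (such as the different spin quantum numbers of 1H and 2H), so it should be read as a check of the arithmetic rather than a relaxation model. The function names are illustrative.

```python
# Gyromagnetic ratios from Table 1 (rad s^-1 T^-1).
GAMMA_H = 26.752e7
GAMMA_D = 4.107e7


def gamma_ratio() -> float:
    """gamma(1H) / gamma(2H); the text quotes this as about 6.5."""
    return GAMMA_H / GAMMA_D


def gamma_squared_scaling() -> float:
    """(gamma_D / gamma_H)^2; the text quotes this as about 0.02."""
    return (GAMMA_D / GAMMA_H) ** 2


print(f"gamma_H / gamma_D     = {gamma_ratio():.2f}")
print(f"(gamma_D / gamma_H)^2 = {gamma_squared_scaling():.4f}")
```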

Fluorination in Context of 15N, 2H, and 13C Isotopes

In addition to 2H, magnetically active nuclei such as 19F have valuable spectroscopic properties that confer clear advantages in the study of macromolecular structure and conformational changes.[56] These benefits include the 100% natural abundance of 19F (Table ), a comparably large γ (94% of 1H) (Table ), and a superior chemical shift dispersion that is ∼6-fold that of 1H.[18,57] Furthermore, 19F is sensitive to changes in its local chemical environment, making it a useful probe of conformational changes.[18,56,57] In addition, fluorine has an atomic radius (1.35 Å) slightly larger than that of a hydrogen (1.20 Å) but slightly smaller than that of a methyl group (2.00 Å). The 19F nucleus is therefore expected to substitute for either group without serious structural perturbations,[58] making it a valuable tool for the in vitro study of medically important RNAs.[59] Finally, 19F is virtually absent in biological systems and therefore offers 19F NMR the bioorthogonal advantage of background-free drug screening.[60] Taken together, 19F is an attractive probe for studying RNAs in solution. Details of new technologies developed to incorporate 19F into nucleobases will be presented in Section , and its utility to expand NMR studies to larger RNAs will be discussed in Section .

Preparation of 15N, 2H, 13C, and 19F Isotope-Labeled RNA

A number of companies (Cassia LLC, Cambridge Isotope Laboratories (CIL), INNotope, Sigma-Aldrich, and Silantes) offer isotope-labeled RNA building blocks with uniform and selective labeling. However, most comprehensive labels are made by academic laboratories using biochemical, biomass, chemical, and chemo-enzymatic approaches, as reviewed in the past.[61−67] In this section, we outline promising developments in the chemical synthesis of isotope-labeled purine [i.e., adenine (Ade or A) and guanine (Gua or G)] and pyrimidine [i.e., cytosine (Cyt or C) and uracil (Ura or U) (in RNA) or thymine (Thy or T) (in DNA)] nucleobases and their incorporation into RNA. The main approaches to obtain isotope-labeled RNA are enzymatic or solid-phase chemical synthesis. The enzymatic approach involves DNA template-directed T7 RNA polymerase-based in vitro transcription using ribonucleoside 5′-triphosphates (rNTPs).[44,45,68−76] The alternative method is chemical solid-phase synthesis using RNA phosphoramidites (amidites).[77−80] Both methods can use unlabeled and isotope-labeled building blocks (rNTPs and amidites) to generate versatile RNA labeling patterns, as recently reviewed.[61,62,66,67]

Chemical Synthesis of Nucleobases

In this section, we give a general overview of the chemical synthetic methods to label RNA nucleobases at specific positions with 15N, 2H, 13C, and 19F isotopes. These nucleobases can then serve as the building blocks for the synthesis of the rNTPs or amidites that enable the eventual enzymatic or chemical production of labeled RNAs of defined sequence and length.

Specific 13C Labeling

Pyrimidine Synthesis with 15N, 2H, 13C, and 19F Labels

The uracil nucleobase is easily assembled using a method initially devised by Roberts and Poulter,[81] later streamlined by SantaLucia and Tinoco and co-workers,[71] and further improved by Kreutz and co-workers.[82] In the original synthetic eight-step pathway described by Roberts and Poulter, the 13C label can be placed in any position of the six-membered ring simply by changing the 13C-source.[81] SantaLucia and Tinoco and co-workers streamlined this to a three-step reaction scheme to make 13C-labeled cyanoacetyl urea from inexpensive commercially available 13C-labeled precursors.[71] A slightly modified approach from Kreutz and co-workers uses bromoacetic rather than chloroacetic acid. Bromoacetic acid is the preferred starting material due to the lower costs and better handling of the cyanide reagent.[74,82] Other methods with fewer steps exist such as condensation of malic or propiolic acid and urea.[83,84] Even though these are straightforward two-step reactions, execution is not as convenient or cost-effective. Using the Poulter-SantaLucia-Kreutz approach,[71,74,81,82] [1-13C]- and [2-13C]-bromoacetic acid selectively incorporate 13C at uracil C4 and C5, respectively. Use of 13C-urea, on the other hand, delivers 13C at the C2 site, and that of 13C-potassium cyanide (13C-KCN) labels the C6 site. Finally, 15N-urea installs 15N at N1 and N3. All possible uracil heteroatom positions can therefore be labeled in good yields, and these reactions can be easily scaled to gram quantities.[74,82] An example of a synthetic scheme using the Poulter-SantaLucia-Kreutz approach[71,74,81,82] is shown for uracil C6 labeling (Scheme ).[82,85] In brief, bromoacetic acid 1 reacts with 13C-KCN and sodium carbonate (Na2CO3) in a Kolbe nitrile reaction to form 2-[cyano-13C]acetic acid 2. 
Treatment of 2 with urea in the presence of acetic anhydride (Ac2O) then yields a urea intermediate 3 that can be readily converted to [6-13C]-uracil 4 using a palladium catalyst (e.g., Pd/BaSO4) under hydrogen atmosphere (H2). Given that pyrimidine H5/H6 protons have three-bond scalar coupling (J ≈ 8 Hz[29]) and strong dipolar coupling (H5–H6 distance of 2 Å) that complicate NMR experiments, selective and quantitative deuteration can be achieved by reacting 4 with triethylamine (TEA) to form the desired [6-13C, 5-2H]-uracil 5.[85] Taken together, 5 was synthesized in four steps with 63% overall yield (Scheme ).[82,85]
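
The precursor-to-position relationships described above for the Poulter-SantaLucia-Kreutz route can be captured in a small lookup table. The sketch below is only a bookkeeping aid built from the assignments stated in the text; the function name `precursors_for` is hypothetical, and the code lists singly labeled reagents, so requesting two positions served by the same reagent type (e.g., C2 and N1 via urea) would in practice call for a single multiply labeled reagent rather than two separate ones.

```python
# Labeled precursor -> uracil ring positions it labels, as stated in the text.
PRECURSOR_TO_POSITIONS = {
    "[1-13C]-bromoacetic acid": ("C4",),
    "[2-13C]-bromoacetic acid": ("C5",),
    "13C-urea": ("C2",),
    "13C-KCN": ("C6",),
    "15N-urea": ("N1", "N3"),
}


def precursors_for(targets):
    """Return the sorted labeled precursors needed for the requested positions."""
    needed = set()
    for position in targets:
        matches = [
            precursor
            for precursor, sites in PRECURSOR_TO_POSITIONS.items()
            if position in sites
        ]
        if not matches:
            raise ValueError(f"no labeled precursor listed for {position}")
        needed.add(matches[0])
    return sorted(needed)


print(precursors_for(["C6"]))        # the C6 labeling described in the text
print(precursors_for(["C5", "N1"]))  # a combined 13C/15N pattern
```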
Scheme 1

Synthetic Route to [6-13C, 5-2H]-Uracil[82,85]
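
To illustrate why the H5–H6 dipolar interaction mentioned above is described as strong, the sketch below evaluates the standard dipolar coupling constant, d = (μ0/4π) γ1 γ2 ħ / (2π r^3), for two protons. It uses the 2 Å distance quoted in the text and the 1H gyromagnetic ratio from Table 1; the physical constants and the function name are supplied here as assumptions for the example.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1 (sufficient precision for this estimate)
HBAR = 1.054571817e-34     # J s
GAMMA_H = 26.752e7         # rad s^-1 T^-1, from Table 1


def dipolar_coupling_hz(gamma_i: float, gamma_s: float, r_angstrom: float) -> float:
    """Magnitude of the dipolar coupling constant in Hz for two spins r apart."""
    r_m = r_angstrom * 1e-10  # convert angstrom to metres
    return MU0_OVER_4PI * abs(gamma_i * gamma_s) * HBAR / r_m**3 / (2 * math.pi)


d_h5_h6 = dipolar_coupling_hz(GAMMA_H, GAMMA_H, 2.0)
print(f"1H-1H dipolar coupling constant at 2 A: {d_h5_h6 / 1e3:.1f} kHz")
```

At 2 Å the constant is on the order of 15 kHz, far larger than the ∼8 Hz scalar coupling, and it falls off as r^-3, so replacing H5 with a deuteron removes a major relaxation partner for H6.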

Given the valuable spectroscopic properties of 19F (Section ), uracil can be fluorinated with the commercially available Selectfluor, as recently reported.[18,57,86] This synthetic scheme is similar to that described for uracil C6 labeling (Scheme ),[82,85] except that [2-13C]-bromoacetic acid 6 is used as the starting material. Kolbe nitrile reaction of 6 forms an intermediate 7 that reacts with 15N-urea and Ac2O to yield 8. Addition of Pd/BaSO4 in H2 to 8 then forms [5-13C, 1,3-15N2]-uracil 9, which can then be fluorinated with Selectfluor to yield [5-13C, 5-19F, 1,3-15N2]-uracil (5FU) 10. Again, the coupling to H6 (J ≈ 7.1 Hz[88]) that complicates NMR spectra can be removed by selective and quantitative deuteration of H6, achieved by heating 5FU 10 in sodium deuteroxide (NaOD) to form [6-2H]-5FU 11.[18,86,87] In summary, 11 was synthesized in five steps with a total yield of 38% (Scheme ).[18,57,82,85,86]
Scheme 2

Synthetic Route to [6-2H]-5FU[18,57,82,85,86]

Finally, thymine C6 can be selectively labeled with a three-step synthesis in a manner similar to uracil labeling (Schemes and 2).[18,57,82,85,86] In brief, bromopropionic acid 12 is used in a Kolbe nitrile reaction followed by addition of urea and Ac2O to form intermediates 13 and 14.[89,90] Then reaction of 14 with Pd/BaSO4 in H2 forms the desired [6-13C]-thymine 15 in 45% overall yield (Scheme ).[89]
Scheme 3

Synthetic Route to [6-13C]-Thymine[89]

Purine Synthesis with C8 Specific Labeling

As with pyrimidines, purine nucleobases can be selectively labeled with 13C and 15N isotopes using commercially available precursor compounds. In the early 1990s, SantaLucia and Tinoco and co-workers described an effective purine synthesis using 13C-formic acid to label purine C8.[71] More recently, Kreutz and co-workers streamlined and improved the efficiency of such labeling in one-step reactions.[75,85,91] Here, the condensation of 13C-formic acid 16 with morpholine forms a morpholinium formate intermediate that immediately reacts with either 4,5,6-triaminopyrimidine 17 to yield [8-13C]-adenine 18 (Scheme ) or 2,5,6-triaminopyrimidin-4-ol sulfate 19 to form [8-13C]-guanine 20 (Scheme ) with 64% and 94% yield, respectively.[75,85]
Scheme 4

Synthetic Route to [8-13C]-Adenine[75,85]

Scheme 5

Synthetic Route to [8-13C]-Guanine[75,85]

Purine Synthesis with C2 Specific Labeling

As with purine C8 labeling, adenine C2 can be readily labeled. Labeling C2 is attractive because its chemical shift can monitor protonation at adenine N1,[92] which cannot be achieved with 15N NMR experiments due to severe line broadening.[92−94] Unlike the environments of single-stranded RNA, those in structured RNAs can shift the pKa values of protonated adenosine or cytidines significantly toward neutrality, serving both catalytic and structural functions in RNA enzymes.[94−97] The 13C isotope can be incorporated at the purine C2 site starting with 5-aminoimidazole-4-carboxamide (AICA) and ethylsodium 13C-xanthate to form [2-13C]-hypoxanthine, [2-13C]-adenine, or [2-13C]-guanine.[98] A preferred alternative for purine C2 labeling uses the method of Battaglia and Ouwerkerk and co-workers, wherein sodium ethoxide (C2H5ONa) mediates cyclization of ethyl cyanoacetate 21 with 13C-thiourea 22 to give [2-13C]-6-amino-2-thiouracil 23.[99,100] Unlabeled sodium nitrite (NaNO2) is then used for nitrosylation (the 15N-labeled form can also be used to introduce a second isotope label) to form 24. 
Then sodium dithionite (Na2S2O4) mediates the reduction of the nitroso group to yield 25 followed by desulfurization over Raney-Nickel to form the diaminopyrimidine 26.[101] Treatment of the product with sulfuric (H2SO4) and formic (HCOOH) acids yields [2-13C]-hypoxanthine 27.[102] Subsequent reaction with phosphorus oxychloride (POCl3) and N,N-dimethylaniline (N,N-DMA) yields [2-13C]-6-chloropurine 28.[103] In the final step, reaction with methanolic NH3 in a microwave reactor yields the desired [2-13C]-adenine 29 (Scheme ).[100] Alternative purine synthesis pathways have been devised to enable specific labeling of adenine C2 or any purine nitrogen position.[98,100,102,104−108] We recently synthesized [7-15N]-labeled 29 through intermediates 21–23 and 15N-labeled intermediates 24–28 using the Battaglia-Ouwerkerk approach[99,100] and demonstrated its utility in NMR analysis of RNA structure and dynamics (Scheme ).[104]
Scheme 6

Synthetic Route to [2-13C]-Adenine

Adapted with permission from Dayie and co-workers. Copyright 2020 Springer Nature.[104] Adenine can be labeled at N7 by using 15N-labeled sodium nitrite in the second chemical step.

Specific 15N Labeling

Several approaches have been reported for the synthesis of atom-specific 15N-labeled nucleobases and nucleosides as well as their incorporation into the corresponding rNTPs and amidites for RNA synthesis.[98,100−102,104−114] Here, we highlight those methods that allow streamlined 15N-labeled nucleobase synthesis in high yield. These labeling patterns permit direct monitoring of Watson–Crick base pairs or analysis of interconverting duplex, triplex, and quadruplex structures by multidimensional NMR.[110−113]

Pyrimidine N1, N3, and N4 Labeling

As described above, using the Poulter-SantaLucia-Kreutz approach,[71,74,81,82] 15N-urea delivers 15N at uracil N1 and N3 sites. Cytosine labeling, on the other hand, occurs through uracil, given that the corresponding CTP can be built directly from enzymatic conversion (with ammonium chloride, NH4Cl) from UTP[74,115] or by chemical synthesis from a transiently protected uridine amidite.[85] In this way, all uracil isotope labeling patterns will be retained in CTP and cytidine amidites. Moreover, additional 15N-labeling of the cytidine N4 amino group can be achieved using 15NH4Cl in the enzymatic[74] or chemical[85] reaction, as will be described in Sections and 3.3.

Purine N1, N3, N7, and N9 Labeling

Adenine N1 labeling is achieved in two steps.[101] Here, commercially available 5-aminoimidazole-4-carbonitrile 30 reacts with diethoxymethyl acetate (DEMA) to yield intermediate 31. Subsequent reaction of 31 with aqueous ammonia (NH3) readily forms the desired product [1-15N]-adenine 32 with a total yield of 60% (Scheme ).[101]
Scheme 7

Synthetic Route to [1-15N]-Adenine[101]

Adenine labeled at N3, on the other hand, can be synthesized in six steps.[108] In brief, commercially available 4-imidazolecarboxylic acid 33 is nitrated with ammonium nitrate (NH415NO3) to afford 5-[nitro-15N]1H-imidazole-4-carboxylic acid 34. Activation of 34 with 1,1′-carbonyldiimidazole (CDI) in dimethylformamide (DMF) and excess NH3 forms carboxamide 35. Importantly, addition of 15NH4Cl in this step can also introduce a 15N label at the N1 site, permitting the eventual production of [1,3-15N2]-adenine.[108] Catalytic reduction of 35 affords [5-15N]-AICA 36. Ring closure of 36 with triethyl orthoformate (HC(OC2H5)3) gives a hypoxanthine intermediate 37, which readily forms [3-15N]-6-chloropurine 38 upon chlorination with POCl3 and N,N-DMA. Finally, ammonolysis with ammonium hydroxide (NH4OH) yields the desired [3-15N]-adenine 39 with ∼47% total yield (Scheme ).[108]
Scheme 8

Synthetic Route to [3-15N]-Adenine[108]

Adenine N3 and its amino group can also be labeled by 15NH4Cl and 15NH4OH in the second and final chemical steps, respectively.

In addition, purine N7 labeling is readily achieved and has been widely adopted.[99,100,102,104,106,111] For example, synthesis of [7-15N]-guanine is achieved in three steps. Nitrosylation of commercially available 2,6-diaminopyrimidin-4-ol 40 by Na15NO2 yields 2,6-diamino-5-[nitroso-15N]pyrimidin-4-ol 41. Reduction of 41 with sodium dithionite followed by acidification with H2SO4 forms 2,6-diamino-5-[amino-15N]pyrimidin-4-ol 42. In the final step, reflux with formamide (HCONH2) followed by HCOOH provides the desired [7-15N]-guanine 43 with a total yield of 65%[500] (Scheme ).
Scheme 9

Synthetic Route to [7-15N]-Guanine

Dayie and co-workers.[500]

Several direct routes to 15N-labeled adenine initiate from commercially available aminopyrimidines.[102,106] However, Micura and Kreutz and co-workers[111] employed a sodium ethoxide mediated cyclization of 21 with 44 to form 6-amino-2-thiouracil 45.[116] Subsequent nitrosylation of 45 installs the 15N label using Na15NO2 to yield the nitroso-containing 46.[102] A sodium dithionite mediated reduction of the nitroso group forms 47, and desulfurization over Raney-Nickel affords 48.[102] Subsequent treatment with H2SO4 and HCOOH yields hypoxanthine 49,[102] which is then reacted with POCl3 and N,N-DMA to give [7-15N]-6-chloropurine 50.[103] In the final step, reaction with methanolic NH3 in a microwave reactor gives the desired [7-15N]-adenine 51 with a total yield of 18% (Scheme ).[100,104,106,111] As mentioned above, we recently showcased the same synthetic scheme while also incorporating selective 13C labeling at C2.[104]
Scheme 10

Synthetic Route to [7-15N]-Adenine

Adapted with permission from Dayie and co-workers. Copyright 2020 Springer Nature.[104] Adenine C2 can also be labeled if 13C-labeled thiourea is used as the starting material.

Finally, in the synthesis of N9-labeled adenine, 5-amino-4,6-dichloropyrimidine 52 is converted to a [9-15N]-6-chloropurine 53 using aqueous 15NH3 and DEMA.[117] Then a reaction with aqueous NH3 yields the desired [9-15N]-adenine 54. This simple three-step reaction proceeds with an overall yield of 79% (Scheme ).[117]
Scheme 11

Synthetic Route to [9-15N]-Adenine[117]

Nucleobase Labels: Summary and Outlook

As described in Sections and 3.1.2, and shown in Schemes –11, a wide range of isotope-labeled nucleobases (Table ) are now available to the scientific community. Of all synthetic procedures, purine C8 sites are most readily labeled in one chemical step in a single day and with high yield (64–94%) (Table ). Conversely, adenine N3 is the least readily labeled, taking 11 days (Figure ). Adenine C2 and N7 have the lowest overall yields of 18% (Table ). In future work, it would be advantageous to focus on improving yields and reducing the number of chemical steps. Nevertheless, these RNA labeling patterns are commonly chosen based on the experimental information required and less often dictated by the relative time and yield of the building blocks.
Table 2

Summary of All Nucleobase Labels As Outlined in Schemes –11

nucleobase label | time (days)^a | chemical steps^b | yield (%) | ref
[8-13C]-adenine | 1 | 1 | 64 | (75), (85)
[8-13C]-guanine | 1 | 1 | 94 | (75), (85)
[2-13C]-adenine^c | 2.5 | 7 (1) | 18 | (104)
[1-15N]-adenine | 2.5 | 2 (1) | 60 | (101)
[3-15N]-adenine | 11 | 6 (2) | 47 | (108)
[7-15N]-adenine^c | 2.5 | 7 (1) | 18 | (104)
[7-15N]-guanine^d | 1.5 | 3 | 65 |
[9-15N]-adenine | 5.5 | 3 (3) | 79 | (117)
[6-13C, 5-2H]-uracil | 7 | 4 | 63 | (82), (85)
[5-13C, 5-19F, 6-2H]-uracil | 8 | 5 | 38 | (18), (57), (82), (85), (86)
[6-13C]-thymine | 2.5 | 3 | 45 | (89)

^a Total reaction time was based on the time required for all chemical steps. In addition, 16 h were added for any explicit mention of overnight procedures, and 24 h were added for any chromatographic purifications.

^b Number in parentheses represents the number of chromatographic purification steps.

^c All data for [2-13C]-adenine and [7-15N]-adenine labeling came from the same doubly labeled [2-13C, 7-15N]-adenine labeling scheme.[104]

^d This synthetic procedure is from Dayie and co-workers.[500]
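The time-accounting convention in the table footnotes can be written out as a short calculation. The sketch below is illustrative only: the function name, its parameters, and the example step durations are hypothetical and are not taken from any of the cited procedures.

```python
def estimated_days(step_hours, overnight_steps=0, chromatography_steps=0):
    """Estimate total synthesis time in days using the footnote convention.

    step_hours: hands-on/reaction time of each chemical step, in hours.
    overnight_steps: number of explicitly mentioned overnight procedures.
    chromatography_steps: number of chromatographic purifications.
    """
    total_hours = sum(step_hours)              # time for all chemical steps
    total_hours += 16 * overnight_steps        # 16 h per overnight procedure
    total_hours += 24 * chromatography_steps   # 24 h per chromatographic purification
    return total_hours / 24


# Hypothetical two-step route: 4 h and 8 h reactions, one overnight step, one column.
print(round(estimated_days([4, 8], overnight_steps=1, chromatography_steps=1), 1))
```

Under this convention, purification and overnight steps can dominate the reported time even when the reactions themselves are short.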


Chemo-enzymatic Labeling

With chemically synthesized isotope-labeled nucleobases in hand, this section outlines the various enzymatic methods that can be used to build them into isotope-labeled rNTPs (and dNTPs). Alternatively, this can be accomplished using Escherichia coli[45,118−120] or Methylophilus methylotrophus[44] grown on 13C- or 15N-enriched media, as reviewed elsewhere.[62,66]

Enzymatic Coupling of Nucleobase and Ribose Sources

The first enzymatic approach to prepare isotope-labeled rNTPs was the Gilles-Schramm-Williamson pentose phosphate pathway method,[65,121−124] which uses isotope-labeled d-glucoses as the precursor and requires 18 enzymes (Table ) and several coenzymes. This method is appealing for uniform ribose labeling using commercially available uniformly 13C- or 2H-labeled d-glucoses.
Table 3

Enzymes of Glycolysis, Pentose Phosphate, and Nucleotide Biosynthesis and Salvage Pathway for rNTP Synthesis

enzyme^a | abbreviation | EC number | source
Gilles-Schramm-Williamson and Co-workers[65,121−124]
Hexokinase | HXK | 2.7.1.1 | Baker’s yeast
Glucose-6-phosphate isomerase | PGI1 | 5.3.1.9 | Baker’s yeast
Glucose-6-phosphate dehydrogenase | ZWF | 1.1.1.49 | L. mesenteroides
Phosphogluconate dehydrogenase | GND | 1.1.1.44 | Torula yeast
Ribose-5-phosphate isomerase | RPI1 | 5.3.1.6 | Spinach
Phosphoribosylpyrophosphate synthetase | PRPPS | 2.7.6.1 | E. coli
Adenine phosphoribosyltransferase | APRT | 2.4.2.7 | JM109/pTTA6
Uracil phosphoribosyltransferase | UPRT | 2.4.2.9 | JM109/pTTU2
Xanthine-guanine phosphoribosyltransferase | XGPRT | 2.4.2.22 | JM109/pTTG2
Nucleoside-monophosphate kinase | NMPK | 2.7.4.4 | Bovine liver
Myokinase (Adenylate kinase) | MK | 2.7.4.3 | Rabbit muscle
Guanylate kinase | GK | 2.7.4.8 | Porcine brain
3-Phosphoglycerate mutase | YIBO | 5.4.2.1 | Rabbit muscle
Enolase | ENO | 4.2.1.11 | Baker’s yeast
Pyruvate kinase | PYKF | 2.7.1.40 | Rabbit muscle
Glutamate dehydrogenase (NAD(P)+) | GLUD | 1.4.1.3 | Bovine liver
CTP synthase | CTPS | 6.3.4.2 | JM109/pMW5
l-Lactate dehydrogenase | LDH | 1.1.1.27 | Rabbit muscle
Dayie and Co-workers[74,75,128]
Ribokinase | RK | 2.7.1.15 | E. coli
Creatine kinase | CK | 2.7.3.2 | Chicken muscle
UMP kinase | UMPK | 2.7.4.22 | E. coli
Serianni and Co-workers[129]
Purine nucleoside phosphorylase | PNPase | 2.4.2.1 | E. coli
Xanthine oxidase | XO | 1.1.3.22 | Buttermilk
Catalase | CT | 1.11.1.6 | Bovine liver
Uridine phosphorylase | UPase | 2.4.2.3 | E. coli

^a Given that there is overlap in the enzymes used in the methods of Schramm-Williamson and co-workers[65,121−124] and Dayie and co-workers,[74,75,128] only the unique enzymes are listed for the latter. All enzymes are commercially available except APRT, UPRT, XGPRT, CTPS, and RK.[128] These are currently only available in a few academic laboratories. At some point, these plasmids would be available at Addgene.

In brief, hexokinase (HXK) (EC 2.7.1.1) phosphorylates 13C-labeled d-glucose 55 at its O6 position to yield glucose-6-phosphate 56. Then glucose-6-phosphate dehydrogenase (ZWF) (EC 1.1.1.49) oxidizes 56 to 6-phosphogluconate 57, and phosphogluconate dehydrogenase (GND) (EC 1.1.1.44) further oxidizes 57 to 58. Next, ribose-5-phosphate isomerase (RPI1) (EC 5.3.1.6) isomerizes 58 to ribose-5-phosphate 59. Following isomerization, phosphoribosylpyrophosphate synthetase (PRPPS) (EC 2.7.6.1) pyrophosphorylates 59 at its O1′ site to yield 60. Then, adenine (APRT) (EC 2.4.2.7), guanine (XGPRT) (EC 2.4.2.22), or uracil (UPRT) (EC 2.4.2.9) phosphoribosyl transferases facilitate the nucleophilic attack of the adenine or guanine N9 or uracil N1 on the C1′ of 60 to yield 5′-monophosphates 61–63, respectively. Adenylate (MK) (EC 2.7.4.3), guanylate (GK) (EC 2.7.4.8), or nucleoside monophosphate (NMPK) (EC 2.7.4.4) kinases phosphorylate 61–63 to form the 5′-diphosphates 64–66, respectively. Pyruvate kinase (PYKF) (EC 2.7.1.40) then catalyzes the final phosphorylation to form the 5′-triphosphates 67–69 (Scheme ).[65,121−124] Finally, UTP 69 can be converted to CTP 70 by CTP synthase (CTPS) (EC 6.3.4.2) (Scheme ).[65,121−124] Importantly, 15N-labeling of the cytidine amino group can be achieved by using 15NH3 in the final step (Scheme ).[65,121−124]
Scheme 12

Enzymatic Synthesis of Isotope-Labeled rNTPs from d-Glucose Sources[65,121−124]

Moreover, Williamson, Hennig, and co-workers demonstrated that the Gilles-Schramm-Williamson method[65,121−124] is compatible with 19F-labeled nucleobases[58,125,126] by synthesizing [2-19F]-ATP,[126] [5-19F]-UTP,[125] and [5-19F]-CTP.[125] However, d-ribose is a more cost-effective labeled precursor than d-glucose for the selective 13C- or 2H-ribose labeling of rNTPs.[127] On the basis of earlier work by Whitesides and co-workers,[130−132] our group truncated the relatively complex Gilles-Schramm-Williamson method[65,121−124] to use 10 enzymes instead of 18, and two cofactor regeneration systems (dATP and creatine phosphate) (Table ). This chemo-enzymatic labeling[74,75,128] is a versatile technology that couples nucleobase to ribose and then phosphorylates the product to the rNTP in a one-pot enzymatic reaction.[74,75,128] The nucleobase and ribose building blocks can be unlabeled, isotope-labeled, chemically synthesized, or commercially available. This method therefore permits a diverse set of labeling patterns. Moreover, this approach has many advantages over previously reported de novo[72,73] or chemical[133−137] synthesis methods including fewer enzymes, fewer synthetic steps, and greater yields. This method affords the facile coupling of chemically synthesized uniformly 15N- and 13C/15N-labeled uracil (Scheme )[82,85] to commercially available unlabeled d-ribose and 13C-labeled d-ribose. The resulting uniformly 15N-labeled and uniformly 13C/15N-labeled UTP provided 338- and 14-fold savings over the commercially available material from CIL, respectively. However, the main advantage of chemo-enzymatic synthesis is the ability to generate noncommercially available atom-specific labeling patterns. 
We showcased the power of this method with the synthesis of [1′,5′,6-13C3, 1,3-15N2]-pyrimidine rNTPs using six enzymes (Table ).[74] We also used this method to synthesize [1′,8-13C2]-, or [2′,8-13C2]-, or [1′,5′,8-13C3]-ATPs and -GTPs with five enzymes (Table ).[75] First, 13C-labeled d-ribose 71 is phosphorylated at its O5 position by ribokinase (RK) (EC 2.7.1.15) to yield ribose-5-phosphate 72, followed by pyrophosphorylation at the O1 site by PRPPS to afford 73. Then APRT, XGPRT, or UPRT catalyzes the nucleophilic attack of the adenine or guanine N9 or uracil N1 on the C1′ of 73 to yield 5′-monophosphates 74–76, respectively. Phosphorylation of 74–76 is achieved by MK, GK, or UMP kinase (UMPK) (EC 2.7.4.22) to form the 5′-diphosphates 77–79, respectively. Creatine kinase (CK) (EC 2.7.3.2) then facilitates the final phosphorylation to afford the 5′-triphosphates 80–82 (Scheme ).[74,75,128] Similar to the Gilles-Schramm-Williamson method,[65,121−124] a final 15N label can be introduced at the CTP 83 amino group if 15NH4Cl is used alongside CTPS in the final enzymatic step (Scheme ).[74,75,128] These atom-specifically labeled rNTPs can then be used with in vitro transcription to make RNAs without any size limit. Importantly, these labeling patterns reduced spectral crowding, increased signal-to-noise ratios, facilitated direct carbon detection experiments, and eliminated 13C–13C scalar and dipolar couplings.[63,74,75,86,104]
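The order of enzymes in the d-ribose route described above can be summarized as a small lookup. This is only a bookkeeping sketch of the steps as stated in the text. The dictionary and function names are hypothetical, and cofactor regeneration and reaction conditions are not modeled.

```python
# Enzyme abbreviations follow the text; the data layout itself is an assumption.
COMMON_STEPS = ["RK", "PRPPS"]                       # d-ribose -> ribose-5-phosphate -> PRPP
COUPLING = {"A": "APRT", "G": "XGPRT", "U": "UPRT"}  # nucleobase + PRPP -> 5'-monophosphate
NMP_KINASE = {"A": "MK", "G": "GK", "U": "UMPK"}     # 5'-monophosphate -> 5'-diphosphate


def enzymes_for(ntp):
    """Return the ordered enzyme list for ATP, GTP, UTP, or CTP."""
    base = ntp[0].upper()
    if base == "C":
        # CTP is obtained from UTP by CTP synthase in a final step.
        return enzymes_for("UTP") + ["CTPS"]
    if base not in COUPLING:
        raise ValueError(f"unsupported rNTP: {ntp}")
    # Creatine kinase (CK) performs the final phosphorylation to the triphosphate.
    return COMMON_STEPS + [COUPLING[base], NMP_KINASE[base], "CK"]


for ntp in ("ATP", "GTP", "UTP", "CTP"):
    print(ntp, " -> ".join(enzymes_for(ntp)))
```

Counted this way, each purine rNTP uses five enzymes and the pyrimidine route through CTP uses six, which appears consistent with the enzyme counts quoted in the text.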
Scheme 13

Enzymatic Synthesis of Isotope-Labeled rNTPs from d-Ribose Sources[74,75,128]

As with the Gilles-Schramm-Williamson method,[65,121−124] the approach developed by Dayie and co-workers[74,75,128] is also compatible with 19F-labeled nucleobases (e.g., [2-19F]-adenine and [5-19F]-uracil[18,86]). It is worth noting that Serianni and co-workers have also developed a complementary approach to enzymatically couple nucleobase and ribose sources using four enzymes (Table ).[129] Their method uses hypoxanthine 84 and 1-O-acetyl-2,3,5-tri-O-benzoyl-α-d-ribofuranoside (ATBR) 85 in a Vorbrüggen reaction (detailed in Scheme ) to yield inosine 86. Then purine nucleoside phosphorylase (PNPase) (EC 2.4.2.1) replaces the hypoxanthine moiety on the C1 position of 86 with a phosphate group to give α-d-ribofuranosyl-1-phosphate sodium salt (αR1P) 87 (Table ).[129] Then 87 is glycosylated by PNPase with adenine or guanine or by UPase (EC 2.4.2.3) with uracil to form nucleosides 88–90, respectively (Scheme ).[129] Products 88–90 can then be converted to the desired rNTP or amidite with further enzymatic or chemical synthesis.
Scheme 14

Enzymatic Synthesis of Isotope-Labeled Nucleosides from Inosine[129]

Scheme 15

Synthetic Route to [6-13C, 5-2H]-Uridine 2′-O-TOM Amidite[85]

Enzymatic Methods for Position-Specific Labeling

While these chemo-enzymatic methods enable straightforward atom-specific labeling, they rely solely on DNA template-directed T7 RNA polymerase-based in vitro transcription and are therefore unable to incorporate these labels position-specifically (e.g., nucleotide 5). Fortunately, there are two alternative enzymatic methods capable of such position-specific labeling, both of which are compatible with the isotope-labeled rNTP building blocks described above. Wang and co-workers developed a hybrid solid–liquid phase transcription technique that employs an automated robotic platform known as position-selective labeling of RNA (PLOR).[138] In PLOR, the DNA template is attached to beads and RNA synthesis is initiated by the addition of T7 RNA polymerase and a mixture of three of the four rNTP building blocks (e.g., ATP, GTP, and CTP). The beads are then washed and a new rNTP mixture is added, this time containing the previously omitted building block. Thus, PLOR can incorporate any isotope-labeled rNTP (e.g., [6-13C, 5-2H]-UTP) position-specifically, assuming the desired labeling site (e.g., uridine 10) does not coincide with a stretch of identical nucleotides (e.g., UUU). While isotope labeling by PLOR has aided NMR studies of RNA,[138−140] its widespread use is still limited by the specialized equipment it requires and its laborious nature. Schwalbe and co-workers developed an alternative chemo-enzymatic approach for position-specific labeling.[141] Importantly, this method uses standard laboratory equipment and the commercially available enzymes T4 RNA ligase 1 (EC 6.5.1.3), recombinant shrimp alkaline phosphatase (rSAP) (EC 3.1.3.1), and T4 RNA ligase 2 (EC 6.5.1.3), making it more accessible than PLOR. In their method, a modified nucleoside 3′,5′-bisphosphate is incorporated at the 3′-end of an RNA fragment by T4 RNA ligase 1 followed by dephosphorylation by rSAP and DNA-splinted ligation by T4 RNA ligase 2. 
This technique has been used to introduce modified nucleosides (i.e., photocaged, photoswitchable, and isotope-labeled) into RNAs up to 392 nts. While this method holds great promise for NMR applications, low yields of bis-phosphorylation (6–22%) and ligation (9–49%) reactions are a major drawback.[141] More recent efforts by Schwalbe and co-workers to improve this technology include the addition of magnetic streptavidin beads as a solid-support and 5′-biotinylated RNA.[142]
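The PLOR constraint noted above, that a labeling site should not fall within a stretch of identical nucleotides, can be checked on a sequence before planning an experiment. The sketch below is a simplified feasibility check under that single assumption. The function names and the example sequence are hypothetical, and a real PLOR design involves further considerations that are not modeled here.

```python
def in_homopolymer_run(seq, position):
    """Return True if the residue at 1-based `position` has an identical neighbor."""
    i = position - 1
    if not 0 <= i < len(seq):
        raise IndexError("position outside sequence")
    left_same = i > 0 and seq[i - 1] == seq[i]
    right_same = i < len(seq) - 1 and seq[i + 1] == seq[i]
    return left_same or right_same


def single_site_labeling_feasible(seq, position):
    """Rough check: an isolated residue can be singled out by omitting its rNTP."""
    return not in_homopolymer_run(seq.upper(), position)


rna = "GGACAGCAGUCCGAUUUACG"  # hypothetical 20-nt sequence
print(single_site_labeling_feasible(rna, 10))  # U10 is flanked by G9 and C11
print(single_site_labeling_feasible(rna, 16))  # U16 lies inside the U15-U17 run
```

In this toy sequence, uridine 10 passes the check whereas uridine 16 does not, mirroring the UUU example in the text.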

rNTP Labels: Summary and Outlook

As described in Section and shown in Scheme , the chemo-enzymatic labeling method developed by Dayie and co-workers[74,75,128] permits the synthesis of a versatile assortment of rNTPs with atom-specific isotope labels (Table ). While there are other enzymatic methods to generate both atom-specific (e.g., the Gilles-Schramm-Williamson[65,121−124] or Serianni[129] methods shown in Schemes and 14, respectively) and position-specific (e.g., PLOR[138] and the Schwalbe method[141,142]) labels, no other technique offers the versatility and simplicity that is afforded by the Dayie method. Our one-pot chemo-enzymatic approach can produce isotope-labeled purine and pyrimidine rNTPs in a few days and with high yield (75–95%) (Table ). The main disadvantage of this method is the need to express and purify five noncommercial enzymes in-house (Table ). However, providing these plasmids to Addgene will make our method widely accessible to the field.
Table 4

Summary of rNTP Labels Made from Chemo-enzymatic Synthesis[74,75,128]

rNTP label^a | time (days)^b | enzymatic steps^c | yield (%) | ref
[8-13C]-ATP | 1.5 | 1 (1) | 90 | (75)
[8-13C]-GTP | 1.5 | 1 (1) | 75 | (75)
[1′,5′,6-13C3, 1,3-15N2]-CTP | 3 | 3 (2) | 95 | (74)
[1′,5′,6-13C3, 1,3-15N2]-UTP | 2.5 | 2 (2) | 90 | (74)

^a [8-13C]-adenine and -guanine were coupled to [1-13C]-, or [2-13C]-, or [1,5-13C2]-d-ribose to generate a variety of ATPs and GTPs.[75] The [6-13C, 1,3-15N2]-uracil and -cytosine nucleobases, on the other hand, were coupled to [1′,5′-13C2]-d-ribose only.[74] Nevertheless, the reported times, enzymatic steps, and yields are representative of all ATP, GTP, CTP, and UTP reactions made with this method.

^b Total reaction time was based on the time required for all chemical steps. In addition, 24 h were added for any chromatographic purification.

^c Number in parentheses represents the number of chromatographic purification steps. Since the time of our original publication,[74] pyrimidine rNTP synthesis now only requires one chromatographic purification.[18,86]


Synthesis of Labeled RNA Phosphoramidites

While the enzymatic production of RNA with isotope-labeled rNTPs[44,45,69−75] is the most widely used approach to obtain labeled RNA, an attractive alternative is to use isotope-labeled amidites and solid-phase synthesis. Like PLOR introduced by Wang and co-workers[138] and the chemo-enzymatic approach developed by Schwalbe and co-workers,[141,142] the amidite method offers the advantage of position-specific RNA labeling. However, even though amidite labeling is currently the most effective and widely used method for position-specific labeling, its utility for NMR studies is limited to RNAs of up to ≈60 nt.

15N and 13C Labeling

The Kreutz and Micura groups have used isotope-labeled nucleobases to prepare 2′-O-tert-butyldimethylsilyl (tBDMS) and 2′-O-[(triisopropylsilyl)oxy]methyl (TOM) phosphoramidites for NMR studies,[57,82,85,89,110,111,143,144] as recently reviewed.[61] A representative example of [6-13C, 5-2H]-pyrimidine 2′-O-TOM amidite syntheses is shown in Schemes and 16.[85] In brief, [6-13C, 5-2H]-uracil 5 is coupled to ATBR under Vorbrüggen conditions[137] to give the 2′,3′,5′-O-benzoyl (Bz)-protected 91, which is then fully deprotected to nucleoside 92 after treatment with methylamine (CH3NH2) in ethanol (C2H5OH). Addition of 4,4′-dimethoxytrityl chloride (DMT-Cl) and TOM-Cl protects the 5′- and 2′-hydroxyl (OH) to form 93 and 94, respectively. Finally, phosphitylation of the 3′-OH of 94 with 2-cyanoethyl N,N-diisopropylchlorophosphoramidite (CEP-Cl) and N,N-diisopropylethylamine (DiPEA) yields the desired [6-13C, 5-2H]-uridine 2′-O-TOM amidite 95 in five steps with 22% total yield (Scheme ).[85]
Scheme 16

Synthetic Route to [6-13C, 5-2H]-N4–Ac-Cytidine 2′-O-TOM Amidite[85]

The corresponding cytidine derivative is obtained from 94 in four additional steps (Scheme ).[85] First, the 3′-OH of 94 is transiently acetylated with Ac2O to afford 96. Then treatment with 2,4,6-triisopropylbenzenesulfonyl chloride (TiBSC) and TEA yields the 5′-O-DMT-2′-O-TOM cytidine 97, which is immediately N4-acetylated (Ac) with Ac2O to form 98. Finally, 3′-OH phosphitylation yields the desired [6-13C, 5-2H]-N4–Ac-cytidine 2′-O-TOM amidite 99. Starting from uracil 5, this cytidine synthesis has an overall yield of 14% (Scheme ).[85] In contrast to pyrimidines, the starting purine is protected before beginning the nucleosidation reaction. Representative examples of [8-13C]-purine 2′-O-TOM amidite syntheses are shown in Schemes and 18.[85] Starting with [8-13C]-adenine 18, N6-Bz-protected adenine 100 is formed with a yield of 86%. A subsequent Vorbrüggen reaction[137] gives the 2′,3′,5′-O-Bz-protected 101, which is readily 2′,3′,5′-O-deprotected to nucleoside 102 after treatment with sodium hydroxide (NaOH) in pyridine and C2H5OH. Then, 5′-OH tritylation, 2′-OH TOM protection, and 3′-OH phosphitylation yield 103, 104, and 105, respectively. Taken together, [8-13C]-N6-Bz-adenosine 2′-O-TOM amidite 105 was synthesized with 17% total yield (Scheme ).[85]
Scheme 17

Synthetic Route to [8-13C]-N6-Bz-Adenosine 2′-O-TOM Amidite[85]

Scheme 18

Synthetic Route to [8-13C]-N2-iBu-Guanosine 2′-O-TOM Amidite[85]

Guanosine synthesis, on the other hand, proceeds from a N2-isobutyryl (iBu) protected guanine 106 made from [8-13C]-guanine 20 with a yield of 77%. From there, however, synthesis proceeds as with adenine. That is, 106 is reacted under Vorbrüggen conditions[137] to form 107, which is then 2′,3′,5′-O-deprotected to nucleoside 108. Again, 5′-OH tritylation, 2′-OH TOM protection, and 3′-OH phosphitylation yield 109, 110, and 111, respectively. In summary, [8-13C]-N2-iBu-guanosine 2′-O-TOM amidite 111 was synthesized with an overall yield of 18% (Scheme ).[85] Importantly, Schemes −18 can be adapted to prepare 2′-O-tBDMS amidites simply by altering the 2′-OH protection reaction steps. However, these 2′-O-tBDMS or 2′-O-TOM amidites are not suitable for producing RNAs > 60 nts. Instead, amidites with 2-cyanoethoxymethyl (CEM) as the 2′-OH protecting group[145,146] are used, due to their increased coupling efficiency, which rivals that in DNA synthesis.[80] Using a protocol developed by Yano and co-workers,[145,146] Kreutz and co-workers prepared [6-13C, 5-2H]-pyrimidine, [8-13C]-purine, and the modified [1,3-15N2]-dihydrouridine and [2,8-13C2]-inosine 2′-O-CEM amidites.[91] While the higher coupling efficiency of the CEM amidite method is attractive, it has not gained widespread use due to the commercial unavailability of both unlabeled and isotope-labeled CEM amidites.
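The link between coupling efficiency and the practical length limit of solid-phase synthesis can be illustrated with the standard stepwise-yield relationship, in which the full-length fraction is the per-cycle coupling efficiency raised to the number of coupling cycles. This is a generic sketch rather than data from the cited work, and the efficiencies used below (98% and 99.5%) are assumed values chosen only for illustration.

```python
def full_length_fraction(n_nt, coupling_efficiency):
    """Fraction of chains reaching full length after n_nt - 1 coupling cycles."""
    if n_nt < 1:
        raise ValueError("length must be at least 1 nt")
    if not 0 < coupling_efficiency <= 1:
        raise ValueError("coupling efficiency must be in (0, 1]")
    return coupling_efficiency ** (n_nt - 1)


# Assumed per-cycle efficiencies; not measured values for any specific amidite.
for efficiency in (0.98, 0.995):
    for length in (30, 60, 100):
        fraction = full_length_fraction(length, efficiency)
        print(f"efficiency {efficiency:.3f}, {length:>3} nt: {fraction:.2f}")
```

With these assumed numbers, a 60-nt RNA would reach roughly 30% full-length product at 98% per cycle but about 74% at 99.5%, which is why small gains in coupling efficiency matter for longer RNAs.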

19F Labeling and Post-transcriptional Modifications

Another benefit of labeling with amidites is the position-specific incorporation of modified building blocks. Indeed, many epigenetic and post-transcriptional modifications modulate the structure, dynamics, and folding of RNAs, and NMR is providing new insights into their functions.[147] These studies have been greatly aided by the synthesis of 13C- or 15N-labeled amidites bearing modifications such as uridine 5-oxyacetic acid (cmo5U)[148] and N6-methyladenine (m6A).[149,150] In collaboration with the Al-Hashimi group, Kreutz and co-workers synthesized a 15N-labeled cmo5U amidite.[148] Their synthetic route begins from bromoacetic acid 1 and proceeds through intermediates 112 and 113 to assemble [1,3-15N2]-uracil 114, as in Schemes [82,85] and 2.[18,57,82,86] Then 114 was coupled to ATBR under Vorbrüggen conditions, 2′,3′,5′-O-deprotected, and hydroxylated at the C5 position to yield 115, 116, and 117, respectively. Addition of para-toluene sulfonic acid (pTSA) and dimethoxypropane ((CH3)2C(OCH3)2) then formed the 2′,3′,5′-O-protected nucleoside 118. Reacting 118 with ethyl-2-iodo acetate in C2H5OH and NaOH transformed the 5-OH into an ethylcarboxymethoxy group while also deprotecting the 5′-OH to afford 119. After transient 2′,3′-O-deprotection of 119 to form 120, the 3′- and 5′-OH were immediately protected along with 2′-O-tBDMS protection to yield 121 by adding di-tert-butylsilyl bis(trifluoromethanesulfonate) (DtBS) and tBDMS-Cl. Addition of pyridine and CH3OH to 121 forms 122, and subsequent treatment with nitrophenyl ethanol (NPE), 4-(dimethylamino)pyridine (DMAP), and N-ethyl-N′-(3-dimethylaminopropyl)carbodiimide (EDC) constructs the NPE-protected cmo5 group to yield 123. Reaction of 123 with hydrogen fluoride (HF) affords the 3′,5′-O-deprotected 124, which can then be 5′-O-tritylated to yield 125. 
Finally, phosphitylation of the 3′-OH of 125 with 2-cyanoethyl N,N,N′,N′-tetraisopropylphosphorodiamidite (TiPCEP) yields 126 (Scheme ).[148] Taken together, [1,3-15N2]-cmo5U 2′-O-tBDMS amidite 126 was synthesized in 15 steps with 1% total yield (Scheme ).[148]
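Because the overall yield of a linear synthesis is the product of its step yields, the reported totals can be converted into an average (geometric-mean) yield per step to compare routes of different length. The sketch below applies that relationship to two totals quoted in the text (15 steps at 1% and five steps at 22%). The function names are hypothetical, and the individual step yields of the published routes are not reproduced here.

```python
def overall_yield(step_yields):
    """Overall fractional yield of a linear route as the product of step yields."""
    product = 1.0
    for step in step_yields:
        product *= step
    return product


def mean_step_yield(total_yield, n_steps):
    """Geometric-mean per-step yield implied by an overall fractional yield."""
    if not 0 < total_yield <= 1 or n_steps < 1:
        raise ValueError("need 0 < total_yield <= 1 and n_steps >= 1")
    return total_yield ** (1 / n_steps)


# Totals quoted in the text: cmo5U amidite (15 steps, 1%) and uridine TOM amidite (5 steps, 22%).
for label, total, steps in (("cmo5U amidite", 0.01, 15), ("uridine TOM amidite", 0.22, 5)):
    print(f"{label}: {100 * mean_step_yield(total, steps):.1f}% per step on average")
```

Both routes come out near 74% per step on average, suggesting that the very low total for the modified amidite mainly reflects the number of steps rather than unusually poor individual reactions.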
Scheme 19

Synthetic Route to [1,3-15N2]-cmo5-Uridine 2′-O-tBDMS Amidite[148]

Another example from the Al-Hashimi and Kreutz groups showcases the synthesis of a 13C-labeled m6A amidite.[149] Their synthetic route begins with ethyl cyanoacetate 21 and 13C-thiourea 22 and proceeds through intermediates 23–25 to assemble [2-13C]-5,6-diamino-4-pyrimidinone 26, as in Scheme .[104] In contrast to Scheme , however, H13COOH was used with H2SO4 to introduce a second 13C label and form [2,8-13C2]-hypoxanthine 127. Then the familiar Vorbrüggen reaction of 127 with ATBR yields the 2′,3′,5′-O-Bz-protected 128 followed by addition of sulfuryl chloride (SO2Cl2) to yield 6-chloropurine nucleoside 129. Sequential addition of CH3NH2 in C2H5OH and then H2O affords the m6A nucleoside 130. Again, the synthetic route ends with 2′-O-tBDMS protection, 5′-O-tritylation, and 3′-O-phosphitylation to yield 131, 132, and 133, respectively (Scheme ).[149] In summary, [2,8-13C2]-N6-methyladenosine 2′-O-tBDMS amidite 133 was synthesized in 11 steps with an overall yield of 4% (Scheme ).[149]
Scheme 20

Synthetic Route to [2,8-13C2]-N6-Methyladenosine 2′-O-tBDMS Amidite[149]

Commercially, INNotope has 13C-labeled N1-methyladenine, m6A, and N3-methylcytidine 2′-O-tBDMS amidites available. Finally, [1,3-15N2]-pseudouridine (Ψ) amidites can be made from 15N-labeled uracil in 11 steps with 6% total yield.[151] Additionally, building on the work shown in Scheme ,[18,57,82,85,86] Kreutz and co-workers showcased new methods to incorporate 19F–13C into the pyrimidine nucleobase of amidites.[18,57,86] Starting from [6-13C]-uracil 4, fluorination is achieved with Selectfluor to yield 5FU 134, as in Scheme .[18,57,82,85,86] The remaining chemical steps are similar to those for other 2′-O-tBDMS amidite syntheses (Schemes [148] and 20[149]). That is, 134 is coupled to ATBR under Vorbrüggen conditions, 2′,3′,5′-O-deprotected, and then 3′,5′-O-protected and 2′-O-tBDMS protected to yield 135, 136, and 137, respectively. Finally, 137 is 5′-O-tritylated and 3′-O-phosphitylated to yield 138 and 139, respectively (Scheme ).[57] Taken together, [5-13C, 5-19F]-uridine 2′-O-tBDMS amidite 139 was synthesized in six steps with 8% total yield (Scheme ).[57] The corresponding cytidine derivative is obtained from 137 through intermediates 140–142 to afford the desired 143 (Scheme ),[57] as in Scheme .[85] In summary, [5-13C, 5-19F]-N4-Ac-cytidine 2′-O-tBDMS amidite 143 was synthesized in eight steps with an overall yield of 4% (Scheme ).[57] These labeling topologies not only capitalize on the beneficial spectroscopic properties of the 19F nucleus (Section ) but also open the door to NMR studies of large RNAs, as will be discussed in greater detail in Section .
Scheme 21

Synthetic Route to [5-13C, 5-19F]-Uridine 2′-O-tBDMS Amidite[57]

Scheme 22

Synthetic Route to [5-13C, 5-19F]-Cytidine 2′-O-tBDMS Amidite[57]

Synergy between Phosphoramidites and Chemo-enzymatic Labeling

In principle, any nucleobase labeling scheme described in Section can be coupled to any commercially available 13C- or 2H-labeled d-ribose (from Omicron Biochemicals or CIL) with the chemo-enzymatic method (Section ) and built into an amidite with a variety of 2′-OH protecting groups (Section ). Indeed, our group recently made [1′,8-13C2]-N6-Bz-adenosine 2′-O-tBDMS[144] and [1′,6-13C2, 5-2H]-uridine 2′-O-CEM[152] amidites via chemo-enzymatic synthesis, dephosphorylation with rSAP, and chemical synthesis. These amidites can then be used to make RNA via solid-phase synthesis. Given that the Kreutz and Micura groups have implemented a wide variety of atom-specific labeling schemes into the nucleobase of RNAs,[57,61,82,85,89,110,111,143,144] this hybrid approach is only needed if ribose labeling is desired in a position-specific manner. However, INNotope and Silantes have [1′,2,8-13C3]-N6-Ac-adenosine, [1′,8-13C2]-adenosine, [1′,8-13C2]-N2-Ac-guanosine, [1′,6-13C2, 5-2H]-uridine, and [1′,6-13C2, 5-2H]-N4-Ac-cytidine 2′-O-tBDMS amidites available.

Phosphoramidite Labels: Summary and Outlook

As described in Sections and 3.3.2, and shown in Schemes –22, a wide range of isotope-labeled amidites (Table ) is likewise becoming available to the scientific community. For all synthetic protocols, pyrimidine C6/C5 and purine C8 sites are most readily labeled. The production of these 2′-O-TOM amidites is streamlined[85] and proceeds quickly (∼1 week) and with adequate yields (14–22%) (Table ). The introduction of 19F labels and post-transcriptional modifications, on the other hand, increases the time of synthesis (i.e., up to 10 days) and reduces the overall reaction yields (i.e., as low as 1%) (Table ). Nevertheless, the benefits afforded by the position-specific incorporation of these labels into RNA more than offset these shortcomings. As with nucleobase labeling, researchers are typically motivated by the scientific question they are pursuing rather than the relative yields of each labeling reaction. Still, improvements in reaction yields and reduction in chemical steps would be advantageous for future work.
Table 5

Summary of All RNA Phosphoramidite Labels As Outlined in Schemes –22

RNA phosphoramidite label^a | time (days)^b | chemical steps^c | yield (%) | ref
[8-13C]-N6-Bz-adenosine (TOM) | 4.5 | 5 (4) | 17 | (85)
[2,8-13C2]-N6-methyladenosine (tBDMS) | 8 | 11 (5) | 4 | (149)
[8-13C]-N2-Ac-guanosine (TOM) | 5 | 5 (4) | 18 | (85)
[6-13C, 5-2H]-N4-Ac-cytidine (TOM) | 8 | 8 (6) | 14 | (85)
[5-13C, 5-19F]-N4-Ac-cytidine (tBDMS) | 10 | 8 (6) | 4 | (57)
[6-13C, 5-2H]-uridine (TOM) | 4 | 5 (3) | 22 | (85)
[5-13C, 5-19F]-uridine (tBDMS) | 7.5 | 6 (4) | 8 | (57)
[1,3-15N2]-cmo5-uridine (tBDMS) | 8 | 15 (3) | 1 | (148)

^a The 2′-OH protecting groups are listed in the parentheses.

^b Total reaction time was based on the time required for all chemical steps. In addition, 16 h were added for any explicit mention of overnight procedures and 24 h were added for any chromatographic purifications.

^c Reactions for amidites harboring post-transcriptional modifications begin with isotope-labeled precursors whereas reactions for unmodified amidites begin with isotope-labeled protected nucleobase. Also, the number in parentheses represents the number of chromatographic purification steps.


Current State of RNA Labeling: Where We Are and Where We Are Headed

Despite the synergy between the synthesis of nucleobases (Section ), rNTPs (Section ), and amidites (Section ), and their contribution to RNA labeling for applications with solution NMR spectroscopy, a number of limitations remain for RNAs prepared enzymatically (using, e.g., T7 RNA polymerase) and chemically (i.e., solid-phase synthesis). The former is incapable of position-specific labeling and the latter is size limited, even though both methods can install isolated 1H–13C spin pairs into RNA that remove the 13C–13C scalar and dipolar couplings that are normally present in uniformly labeled RNA, as will be detailed in Section . Again, unlike DNA template-directed in vitro transcription, a tremendous advantage to the field is that amidite labeling and solid-phase synthesis can provide direct read-outs of the biophysical consequences of post-transcriptional modifications. This will be discussed in greater detail in Section . However, despite this strength, the “size problem” of solid-phase synthesis limits the production of RNAs to ∼60 nt, beyond which it is exceedingly difficult to prepare RNA in high yield and sufficient purity for NMR studies. Even though the 2′-O-CEM[91,145,146] protecting group initially held promise for synthesizing larger RNAs, it has not gained widespread use. Conversely, while much larger RNAs can be transcribed enzymatically, larger RNAs always carry with them more extensive signal overlap and broader linewidths. These complications make NMR analysis of RNAs > 60 nt extremely difficult, even when atom-specific labeling is used. However, introducing 13C–19F spin pairs into RNA,[18,57,86] leveraging the spectral properties of the 15N nucleus,[53,153] or combining selective deuteration with 1H NMR[17,53−55] all hold promise to lessen the burden imposed by overlap and broad lines. This will be discussed in detail in Section . 
It is clear that elucidating the structure, interactions, and dynamics of large RNAs and their complexes (e.g., those implicated in viral transcription, splicing, nuclear export, translation, packaging, and particle assembly) requires developing breakthrough technologies and new experimental strategies to solve the structures of such large RNAs rapidly and accurately. While the advances in the synthesis of atom-specific isotope-labeled rNTPs and amidites are essential first steps in this direction, the ability to incorporate these labels position-specifically will be a game changer for RNA structural and chemical biology. Overnight, it would transform our ability to perform position-specific readouts in vitro and in vivo. Moreover, it would enable scientists to peer directly into the active site of RNA enzymes, visualize the binding pockets of RNA–drug complexes, and exquisitely map out the interfaces of RNA–protein, RNA–RNA, or RNA–DNA–RNA hybrids. At least that is the dream. While we await these technological advances, the availability of these isotope-labeled RNA building blocks with diverse labeling topologies (Figure ) still bodes well to address structural dynamic features of RNAs with NMR spectroscopy as well as MS or small angle neutron/X-ray scattering. The remaining sections highlight how the labels described in Section can be exploited to study RNA structure, interactions, and dynamics by NMR spectroscopy.
Figure 3

List of possible atom-specifically isotope-labeled nucleobase and ribose labeling patterns. These can be coupled to form rNTPs via chemo-enzymatic synthesis but also converted into amidites with further chemical synthesis. Nucleobase labeling patterns (unmodified and modified) are based on the synthetic schemes described in Sections and 3.3. These need not be mutually exclusive, and some labeled sites can be incorporated simultaneously. Labeled ribose, on the other hand, is available from commercial sources (Omicron Biochemicals and CIL).

NMR Probes of Macromolecular Dynamics

Originating more than 45 years ago, early investigations of RNA dynamics were limited to the study of bacterial tRNAs using one-dimensional (1D) NMR methods.[154] More than a decade later, development of 1D and 2D heteronuclear polarization transfer schemes to measure heteronuclear relaxation rates[155−157] uniquely positioned solution NMR spectroscopy to probe protein[158−161] and RNA[162−166] dynamics. With multidimensional NMR spectroscopy, we can measure the dynamics of ribose, nucleobase, and phosphorus nuclei distributed along the entire RNA structure.[167−171] We can especially characterize motions that range from picosecond to seconds and visualize conformers that are transient and sparsely populated (Figure ). For these low populated states, we can extract chemical shifts (structure), rates (kinetics), and populations (thermodynamics) under various physiological conditions of temperature, salt, pH, and cellular environment. Finally, we can examine how the cellular milieu modulates the structure, dynamics, and interactions of RNA in real time.
Figure 4

Dynamic processes in RNA and corresponding NMR methods and RNA nuclei that can be used to characterize such motions. The highlighted 15N and 13C sites have been used extensively in NMR spin relaxation and relaxation dispersion experiments,[147] whereas 31P[169,170] and 2H[171] sites are probed less frequently. Alternative time charts can be found elsewhere.[147,168,172,173]

Probing Fast Motions with Uniform and Selective Labels

On the picosecond (ps)-to-nanosecond (ns) (ps-ns) time scales, spin relaxation provides information about the amplitude and time scale of motions powered by the bond vectors (e.g., 15N–1H, 13C–1H, 13C–19F, 1H–1H) reorienting relative to the external applied magnetic field (Figure ).[168,174−176] Longitudinal relaxation describes the return to the equilibrium distribution of spins along the z-axis, with a characteristic exponential time constant T1 (or rate constant R1 = 1/T1). Transverse relaxation, on the other hand, describes the decay of magnetization in the transverse xy-plane, with a characteristic decay time constant T2 (or rate constant R2 = 1/T2). Larger R2 values produce broader peaks and lower peak heights in an NMR experiment. The linewidth, defined as full-width at half-height (given in Hz), is Δν1/2 = R2/π. The heteronuclear Overhauser effect (hNOE) measures the enhancement of the heteroatom magnetization that arises from saturating the proton magnetization, and is mediated by their dipolar interaction. For an isolated pair of spin-1/2 nuclei S and I (here, S is 15N, 13C, 31P, 19F; and I is 1H), R1, R2, and the hNOE of nucleus S are related to the rotational diffusion tensor of the molecule according to well-known relations:[177,178]

$$R_1 = \frac{d^2}{4}\left[J(\omega_I - \omega_S) + 3J(\omega_S) + 6J(\omega_I + \omega_S)\right] + c^2 J(\omega_S)$$

$$R_2 = \frac{d^2}{8}\left[4J(0) + J(\omega_I - \omega_S) + 3J(\omega_S) + 6J(\omega_I) + 6J(\omega_I + \omega_S)\right] + \frac{c^2}{6}\left[4J(0) + 3J(\omega_S)\right] + R_{ex}$$

$$\mathrm{hNOE} = 1 + \frac{d^2}{4R_1}\,\frac{\gamma_I}{\gamma_S}\left[6J(\omega_I + \omega_S) - J(\omega_I - \omega_S)\right]$$

where $d = \mu_0 h \gamma_I \gamma_S/(8\pi^2 r_{SI}^3)$, $c^2 = (\omega_S^2/3)(\sigma_x^2 + \sigma_y^2 - \sigma_x\sigma_y)$, σx = σ33 – σ11, σy = σ33 – σ22, σ11, σ22, and σ33 are the principal components of the chemical shielding anisotropy (CSA) tensor,[179,180] J(ω) is a spectral density function, which is assumed to be a Lorentzian (e.g., simplest form is $J(\omega) = \frac{2}{5}\,\tau_C/(1 + \omega^2\tau_C^2)$), ωI and ωS are the Larmor frequencies of spins I and S, μ0 is the vacuum permeability, γi is the gyromagnetic ratio of spin i, rSI is the distance between spins I and S, h is Planck’s constant, and Rex is the exchange contribution to R2 due to slow (i.e., microsecond-to-millisecond, μs-ms) motions. The raw data represented by the three relaxation parameters (R1, R2, and hNOE) reveal the nucleotide level variation of the dynamic motions encoded in the RNA primary sequence.
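To make the dependence of R1, R2, and the hNOE on the correlation time concrete, the following is a minimal numerical sketch for an isolated 13C–1H spin pair. It assumes the standard dipolar/CSA expressions with a Lorentzian spectral density J(ω) = (2/5)τC/(1 + ω²τC²), an axially symmetric CSA (so that c = ωSΔσ/√3), and illustrative values for the C–H distance (1.09 Å) and CSA (150 ppm). The function names and default values are hypothetical choices for illustration and are not taken from the cited works.

```python
import math

# Physical constants (SI units).
MU0 = 4e-7 * math.pi        # vacuum permeability, T m / A
H_PLANCK = 6.62607015e-34   # Planck's constant, J s
GAMMA_H = 2.6752218744e8    # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_C = 6.728284e7        # 13C gyromagnetic ratio, rad s^-1 T^-1


def spectral_density(omega, tau_c):
    """Lorentzian J(w) = (2/5) tau_c / (1 + w^2 tau_c^2), rigid isotropic tumbling."""
    return 0.4 * tau_c / (1.0 + (omega * tau_c) ** 2)


def relaxation_rates(tau_c, b0_tesla, r_ch=1.09e-10, csa_ppm=150.0, r_ex=0.0):
    """Return (R1, R2, hNOE) for 13C in an isolated 13C-1H spin pair.

    r_ch (metres) and csa_ppm (axially symmetric CSA) are illustrative
    assumptions, not values from the text.
    """
    w_i = GAMMA_H * b0_tesla   # 1H Larmor frequency, rad/s
    w_s = GAMMA_C * b0_tesla   # 13C Larmor frequency, rad/s
    d = MU0 * H_PLANCK * GAMMA_H * GAMMA_C / (8.0 * math.pi ** 2 * r_ch ** 3)
    c = w_s * csa_ppm * 1e-6 / math.sqrt(3.0)

    def j(omega):
        return spectral_density(omega, tau_c)

    r1 = (d ** 2 / 4.0) * (j(w_i - w_s) + 3.0 * j(w_s) + 6.0 * j(w_i + w_s)) \
        + c ** 2 * j(w_s)
    r2 = (d ** 2 / 8.0) * (4.0 * j(0.0) + j(w_i - w_s) + 3.0 * j(w_s)
                           + 6.0 * j(w_i) + 6.0 * j(w_i + w_s)) \
        + (c ** 2 / 6.0) * (4.0 * j(0.0) + 3.0 * j(w_s)) + r_ex
    noe = 1.0 + (d ** 2 / (4.0 * r1)) * (GAMMA_H / GAMMA_C) \
        * (6.0 * j(w_i + w_s) - j(w_i - w_s))
    return r1, r2, noe


def linewidth_hz(r2):
    """Full width at half height (Hz) of a Lorentzian line: R2 / pi."""
    return r2 / math.pi


if __name__ == "__main__":
    # 18.8 T corresponds to roughly 800 MHz for 1H.
    for tau_ns in (2.0, 5.0, 20.0):
        r1, r2, noe = relaxation_rates(tau_ns * 1e-9, 18.8)
        print(f"tau_c = {tau_ns:4.1f} ns  R1 = {r1:.2f}  R2 = {r2:.1f}  "
              f"hNOE = {noe:.2f}  linewidth = {linewidth_hz(r2):.1f} Hz")
```

Under these assumptions, R2 (and hence the linewidth) grows with τC while R1 falls, which is the size-dependent line broadening described above.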
Additional motional variables such as the overall correlation time (τC) and generalized order parameter (S) can be fit within a Model Free formalism[181,182] to describe fast (i.e., ps-ns) motions, although, for reasons enumerated below, this becomes problematic for large uniformly labeled RNAs.[183] The RNA motions reported by R1, R2, and hNOE are easily probed by 13C[162−165,183−188] and 15N[162,166,189] nuclei. 15N sites are present in the four nucleobases at the following sites: adenosine (Ade)-H2-N1, Ade-H2-N3, Ade-H8-N7, and Ade-H8-N9; guanosine (Gua)-H1-N1, Gua-H8-N7, and Gua-H8-N9; uridine (Uri)-H3-N3 and Uri-H6-N1; and cytidine (Cyt)-H6-N1 (Figures and 4). These are suitable reporters of hydrogen-bonding and non-hydrogen-bond dynamics that occur in base-paired and nonbase-paired regions. However, solvent-exposed imino regions are usually broadened beyond detection. Nonprotonated nitrogen sites such as Ade-N1 and Ade-N3, purine (Pur)-N7 and Pur-N9, and pyrimidine (Pyr)-N1 remain underutilized. The limited availability of directly protonated imino nitrogen probes has made protonated carbons an attractive alternative for probing RNA relaxation. These sites are found in both the ribose (C1′–C5′) and nucleobase (Ade-C2, Pur-C8, Pyr-C5, and Pyr-C6) moieties (Figures and 4). Despite the greater number of detectable 13C nuclei in RNA, complications arise for measurements and analysis of 13C relaxation. First, the carbon sites are linked by intricate multibond couplings (i.e., to 15N, 13C, and 1H nuclei) that are proximally positioned within 3 Å or less. Therefore, 13C spins do not approximate an isolated two-spin system. In uniformly labeled samples, these extensive dipolar couplings complicate 13C R1 rate measurements and analysis[119,120,184,185,187,188,190−194] in biopolymers of large size (τC > 7 ns).
Given this fact, our group has developed pulse schemes (based on the isolated 1H–15N backbone amide spin pair in proteins[195]) to leverage the isolated 1H–13C spin pairs afforded by our atom-specifically labeled RNA samples (Figure ).
Figure 5

Pulse scheme for transverse relaxation optimized spectroscopy (TROSY)-detected experiments for measuring (A) rotating-frame (R1ρ) (from which R2 can be calculated[185,195]) and (B) 13C R1 rates in selectively labeled RNA, adapted from previous reports.[195] Quadrature detection and sensitivity-enhanced/gradient-selection is implemented using the Rance-Kay[196,197] echo/antiecho scheme with the polarity of G1 inverted and phase Φ4 and Φ5 incremented 180° for each second FID of the quadrature pair.

Theoretical simulations of R1 rates for Pyr-C5 and Pyr-C6, ribose C1′, Ade-C2, and Pur-C8 in uniformly and selectively labeled RNAs suggest that the various 1H–13C, 13C–13C, and 13C–15N dipolar couplings (Figure A) present in uniformly labeled samples lead to overestimated R1 rates (Figure B). Moreover, this discrepancy, measured by the R1 difference (where R1 difference = [100 × (R1,uni – R1,sel)/R1,uni)]), increases with higher molecular weights and magnetic field strengths (Figure B). Experimental measurements with our customized pulse sequence (for selectively labeled RNA) (Figure ) and those of others[185] (for uniformly labeled RNA), corroborated our simulations, suggesting that these discrepancies in R1 cannot be wholly ignored, even for fairly isolated Ade-C2 and Pur-C8 sites.[187,188] Taken together, the contribution of 13C–13C dipolar interactions needs to be explicitly taken into consideration in data analysis of uniformly labeled RNA. Spin relaxation measurements on uniformly labeled RNA from Al-Hashimi and co-workers[185] demonstrate that this is not an insurmountable hurdle. Nevertheless, the focus of our discussion on RNA dynamics will center on slower conformational exchange motions, which will be discussed in Section .
Figure 6

Dipolar couplings complicate dynamics measurements in uniformly labeled RNA. (A) Nucleobase and ribose structures shown to highlight dipolar coupling networks to nuclei of interest (i.e., Ade-C2 and Ade-C8, Uri-C6, and ribose C1′). Distances are shown in units of Å. (B) Simulated R1 rates and R1 difference (defined as above) for the nuclei highlighted in panel A. R1 simulations were carried out for 800 MHz field and R1 difference simulations were run at multiple magnetic fields. All simulations were carried out at various τC values, and additional details can be found in the original works.[187,188]

Probing Slow Motions with Uniform and Selective Labels: Relaxation Dispersion and Saturation Transfer Methods

Spin-1/2 nuclei with a positive gyromagnetic ratio either align parallel (α, high-populated, favorable energetic state) to the static NMR magnetic field (B0) or antiparallel (β, low-populated, unfavorable state). The net bulk magnetization, oriented parallel to B0, can be realigned with radiofrequency (RF) pulses along a direction perpendicular to B0. The magnetization then precesses about B0 at a resonant Larmor frequency (ω) characteristic of the nucleus. When Fourier transformed, this detectable oscillating time-domain signal yields a frequency-domain NMR spectrum with signals at characteristic frequencies for each nucleus. When referenced against a standard frequency (e.g., sodium-3-(trimethylsilyl)-1-propanesulfonate (DSS) for 1H), we obtain a field-independent chemical shift that is directly proportional to the energy difference between the α and β states. For RNA exchanging between two states A and B, the chemical shift difference (Δω) between the two states and the exchange rate constant [kex, sum of the forward (kAB) and reverse (kBA) rate constants] or the exchange lifetime (τex = 1/kex) determine if two distinct NMR peaks are observed and what signal intensity and linewidth are obtained for a given nucleus.[198,199] In the slow exchange regime, two distinct peaks are detected at the chemical shifts of the individual states, and the peak intensities are proportional to the populations of each state. In the fast exchange regime, kex is much larger than Δω, and therefore, a single peak is observed at the population-weighted average chemical shift. In the intermediate exchange regime, which, as its name implies, lies between the fast and slow time scales, kex ≈ Δω. 
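The comparison between kex and Δω that defines the exchange regimes can be sketched in a few lines. The helper names below and the factor-of-10 cutoff used to separate "fast" and "slow" from "intermediate" are illustrative assumptions; the text only states the qualitative conditions (kex much larger than, comparable to, or much smaller than Δω). Δω is converted to rad s–1 before the comparison.

```python
import math


def ppm_to_rad_per_s(delta_ppm, larmor_mhz):
    """Convert a shift difference in ppm to rad/s (ppm x MHz = Hz)."""
    return 2.0 * math.pi * delta_ppm * larmor_mhz


def exchange_regime(k_ex, delta_omega, factor=10.0):
    """Classify two-site exchange from k_ex (s^-1) and delta_omega (rad/s).

    The cutoff `factor` is an arbitrary illustrative choice.
    """
    if delta_omega == 0:
        raise ValueError("delta_omega must be nonzero to define a regime")
    ratio = k_ex / abs(delta_omega)
    if ratio >= factor:
        return "fast"
    if ratio <= 1.0 / factor:
        return "slow"
    return "intermediate"


def fast_exchange_shift(shift_a, shift_b, p_b):
    """Population-weighted average shift observed in the fast-exchange limit."""
    return (1.0 - p_b) * shift_a + p_b * shift_b


if __name__ == "__main__":
    # Example: k_ex = 794 s^-1 and a 228 Hz shift difference.
    d_omega = 2.0 * math.pi * 228.0
    print(exchange_regime(794.0, d_omega))
```

With the cutoff chosen here, kex = 794 s–1 and Δω = 228 Hz fall in the intermediate regime; a different cutoff would move the boundaries but not the qualitative picture.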
Regardless of the exchange regime, if chemical exchange is present, R2 increases by Rex, which depends on kex and Δω and can therefore be modulated by magnetic field strength.[198−202] Dynamics on the intermediate and slow time scales (i.e., μs-ms) can be characterized with relaxation dispersion (RD) using R1ρ,[202,203] Carr–Purcell–Meiboom–Gill (CPMG),[204−206] or chemical exchange saturation transfer (CEST)[207] experiments (Figure ). Moreover, even processes slower than seconds can be studied with real-time NMR (Figure ).[208] For two-site exchange, a general expression for the R2 rate constant (RCPMG(τcp)) for state A (where pA > pB), that encompasses all conformational exchange time scales, is given by the Carver-Richards equation:[198,209]

$$R_{\mathrm{CPMG}}(\tau_{cp}) = \frac{1}{2}\left(R_{2A} + R_{2B} + k_{ex} - \frac{1}{\tau_{cp}}\cosh^{-1}\left[D_+\cosh(\eta_+) - D_-\cos(\eta_-)\right]\right)$$

$$D_\pm = \frac{1}{2}\left[\pm 1 + \frac{\psi + 2\Delta\omega^2}{\sqrt{\psi^2 + \zeta^2}}\right], \qquad \eta_\pm = \frac{\tau_{cp}}{\sqrt{2}}\left[\pm\psi + \sqrt{\psi^2 + \zeta^2}\right]^{1/2}$$

$$\psi = \left(R_{2A} - R_{2B} - p_A k_{ex} + p_B k_{ex}\right)^2 - \Delta\omega^2 + 4p_A p_B k_{ex}^2, \qquad \zeta = 2\Delta\omega\left(R_{2A} - R_{2B} - p_A k_{ex} + p_B k_{ex}\right)$$

where R2A/B and pA/B are the R2 rate and relative populations of the A/B state, respectively, and τcp is the delay between 180° refocusing pulses, so that νCPMG = 1/(2τcp). A main disadvantage of the CPMG experiment is that only the magnitude (and not the sign) of Δω is obtained. Still, this disadvantage of the CPMG experiment is offset by the relative ease of its implementation and data analysis. That is, conformational exchange is easily detected by a nonflat CPMG curve when plotting R2,eff versus νCPMG (Figure A). Nonexchanging nuclei, on the other hand, have no dependence of R2,eff on νCPMG and therefore appear as flat curves (Figure A).
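A short numerical sketch of a CPMG dispersion curve may help illustrate the difference between exchanging and nonexchanging nuclei. It assumes the standard closed form of the Carver-Richards equation with τcp = 1/(2νCPMG), uses the exchange parameters quoted for the simulated curve (kex = 794 s–1, pB = 8.7%, Δω = 228 Hz), and takes an intrinsic R2 of 15 s–1 for both states, which is a hypothetical value chosen only for illustration. The function name is likewise an assumption.

```python
import math


def carver_richards(nu_cpmg, k_ex, p_b, delta_omega, r2a, r2b=None):
    """Effective R2 (s^-1) for two-site exchange under a CPMG train.

    nu_cpmg is in Hz, delta_omega in rad/s; tau_cp = 1/(2 nu_cpmg) is the
    delay between 180-degree pulses. Assumes k_ex > 0.
    """
    if r2b is None:
        r2b = r2a
    p_a = 1.0 - p_b
    tau_cp = 1.0 / (2.0 * nu_cpmg)
    base = r2a - r2b - p_a * k_ex + p_b * k_ex
    psi = base ** 2 - delta_omega ** 2 + 4.0 * p_a * p_b * k_ex ** 2
    zeta = 2.0 * delta_omega * base
    root = math.hypot(psi, zeta)  # sqrt(psi^2 + zeta^2)
    d_plus = 0.5 * (1.0 + (psi + 2.0 * delta_omega ** 2) / root)
    d_minus = 0.5 * (-1.0 + (psi + 2.0 * delta_omega ** 2) / root)
    # max() guards against tiny negative values caused by rounding.
    eta_plus = tau_cp / math.sqrt(2.0) * math.sqrt(max(0.0, psi + root))
    eta_minus = tau_cp / math.sqrt(2.0) * math.sqrt(max(0.0, -psi + root))
    arg = d_plus * math.cosh(eta_plus) - d_minus * math.cos(eta_minus)
    return 0.5 * (r2a + r2b + k_ex - math.acosh(max(1.0, arg)) / tau_cp)


if __name__ == "__main__":
    d_omega = 2.0 * math.pi * 228.0  # 228 Hz expressed in rad/s
    for nu in (25, 50, 100, 200, 400, 1000):
        exchanging = carver_richards(nu, 794.0, 0.087, d_omega, 15.0)
        flat = carver_richards(nu, 794.0, 0.087, 0.0, 15.0)
        print(f"nu_CPMG = {nu:5d} Hz  exchanging = {exchanging:6.2f}  "
              f"no exchange = {flat:6.2f}")
```

Setting Δω = 0 returns the intrinsic R2 at every νCPMG, which reproduces the flat curve expected for a nucleus that does not sense the exchange.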
Figure 7

Simulated NMR RD experiments. (A) CPMG curves for two nuclei: one in exchange (in red, Rex > 0) using the parameters kex = 794 s–1, pB = 8.7%, and Δω = 228 Hz (150 MHz 13C-Larmor frequency) and one without (in black, Rex = 0, or Δω = 0, or both), based on published data.[210] (B) CEST profile for a given nucleus showing evidence of two states A and B. Calculations assumed kex = 121 s–1, pB = 10.8%, γ(1H)B0/2π = 600 MHz, Δω = −4 ppm, R1A = R1B, T = 0.3 s, and the B1 fields specified on the figure. (C) R1ρ profile for a given nucleus showing evidence of two states A and B. Calculations used the same parameters as in (B) but with different B1 fields, which are again specified on the figure. As seen in the CEST and R1ρ profiles, at higher B1 fields, linewidths broaden to the point where state B becomes increasingly difficult to detect. CEST and R1ρ profiles are based on published data.[211]

R1ρ and CEST experiments provide more robust information regarding the chemical shifts of state B. For a two-site model, Δω, kex, and pB can be extracted from CEST profiles using the Bloch-McConnell 7 × 7 matrix (including the equilibrium magnetization terms).[212−214] By combining all data sets, global kex and pB values can be fit numerically for all the CEST profiles, plotted as I/I0 versus spin-lock offset (in Hz) (Figure B). The 7 × 7 two-site Bloch-McConnell equation is derived from the relaxation matrix and the kinetic rate matrix for an exchanging two-site system:[207,211,213,214]

$$\frac{d}{dt}\begin{pmatrix} E/2 \\ I_x^A \\ I_y^A \\ I_z^A \\ I_x^B \\ I_y^B \\ I_z^B \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & -(R_{2A}+k_{AB}) & -\omega_A & 0 & k_{BA} & 0 & 0 \\ 0 & \omega_A & -(R_{2A}+k_{AB}) & -\omega_1 & 0 & k_{BA} & 0 \\ 2R_{1A}p_A & 0 & \omega_1 & -(R_{1A}+k_{AB}) & 0 & 0 & k_{BA} \\ 0 & k_{AB} & 0 & 0 & -(R_{2B}+k_{BA}) & -\omega_B & 0 \\ 0 & 0 & k_{AB} & 0 & \omega_B & -(R_{2B}+k_{BA}) & -\omega_1 \\ 2R_{1B}p_B & 0 & 0 & k_{AB} & 0 & \omega_1 & -(R_{1B}+k_{BA}) \end{pmatrix}\begin{pmatrix} E/2 \\ I_x^A \\ I_y^A \\ I_z^A \\ I_x^B \\ I_y^B \\ I_z^B \end{pmatrix}$$

where R1A/B, ωA/B, and ω1 are the R1 rate of the A/B state, the offset of the B1 spin-lock field from the peaks in the A/B state (in rad s–1), and the B1 field strength (in rad s–1), respectively, E is the identity operator, and the equilibrium z-magnetizations of states A and B are normalized to pA and pB.
The evolution of magnetization for the peak in state A during the CEST spinlock period is given by

Similarly, under the R1ρ model for two-site exchange, the R1ρ value for state A magnetization is given by[215]

and

where Ω = ωrf – Ωobs is the difference between the resonance frequency of the observed nucleus (Ωobs) and the spinlock transmitter frequency (ωrf). For R1ρ experiments, conformational exchange can be detected by plotting R2,eff versus Ω/2π (Figure C). The expressions for CEST and R1ρ (eqs –16) provide insight into the parameters that are important for acquiring useful data. For example, higher B1 fields decrease chemical shift resolution between states and also broaden linewidths (Figure B,C). While almost all RD studies involve two-site systems, expressions for CPMG, R1ρ, and CEST models for characterizing N-site exchange have been described by Arthur Palmer III and co-workers.[198] Indeed, work from Al-Hashimi and co-workers on Watson–Crick mismatches and base pair reshuffling in RNA features R1ρ and CEST data that describe three-site exchange.[216]
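As a rough illustration of how a two-site CEST profile arises, the sketch below propagates the 7 × 7 Bloch-McConnell equations numerically for a single isolated spin. It assumes one common sign convention for the rate matrix, a B1 field applied along x, equilibrium z-magnetizations normalized to pA and pB, and a small pure-Python matrix exponential (scaling and squaring with a truncated Taylor series). The exchange parameters follow the simulated profile (kex = 121 s–1, pB = 10.8%, Δω = −4 ppm at an approximate 150.9 MHz 13C frequency, T = 0.3 s), whereas R1, R2, and the 20 Hz B1 field are hypothetical values chosen for illustration. All function and variable names are assumptions.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=18):
    """exp(a) by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(x) for x in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[x / 2.0 ** squarings for x in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def bloch_mcconnell(offset_hz, b1_hz, k_ex, p_b, dw_hz, r1, r2a, r2b):
    """Rate matrix K (dM/dt = K M) in the basis (E/2, xA, yA, zA, xB, yB, zB).

    offset_hz is the spin-lock offset relative to the state A resonance;
    dw_hz is the shift of state B relative to state A.
    """
    p_a = 1.0 - p_b
    k_ab, k_ba = p_b * k_ex, p_a * k_ex
    w1 = 2.0 * math.pi * b1_hz
    w_a = 2.0 * math.pi * (0.0 - offset_hz)
    w_b = 2.0 * math.pi * (dw_hz - offset_hz)
    return [
        [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.0, -(r2a + k_ab), -w_a, 0.0, k_ba, 0.0, 0.0],
        [0.0, w_a, -(r2a + k_ab), -w1, 0.0, k_ba, 0.0],
        [2.0 * r1 * p_a, 0.0, w1, -(r1 + k_ab), 0.0, 0.0, k_ba],
        [0.0, k_ab, 0.0, 0.0, -(r2b + k_ba), -w_b, 0.0],
        [0.0, 0.0, k_ab, 0.0, w_b, -(r2b + k_ba), -w1],
        [2.0 * r1 * p_b, 0.0, 0.0, k_ab, 0.0, w1, -(r1 + k_ba)],
    ]


def cest_intensity(offset_hz, t_sat, **params):
    """I/I0 of state A after irradiation for t_sat seconds at offset_hz."""
    rate = bloch_mcconnell(offset_hz, **params)
    prop = mat_exp([[x * t_sat for x in row] for row in rate])
    p_a = 1.0 - params["p_b"]
    start = [0.5, 0.0, 0.0, p_a, 0.0, 0.0, params["p_b"]]  # equilibrium
    mz_a = sum(prop[3][j] * start[j] for j in range(7))
    return mz_a / p_a


# r1, r2a, r2b, and b1_hz are hypothetical illustration values.
PARAMS = dict(b1_hz=20.0, k_ex=121.0, p_b=0.108, dw_hz=-4.0 * 150.9,
              r1=2.5, r2a=20.0, r2b=20.0)

if __name__ == "__main__":
    for offset in range(-1200, 601, 100):
        print(offset, round(cest_intensity(float(offset), 0.3, **PARAMS), 3))
```

In this sketch the profile shows a deep dip at the state A resonance and a second, shallower dip near the state B offset; raising `b1_hz` broadens both, consistent with the loss of resolution at higher B1 fields noted above.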

Slow Motions: Are Selective Labels Needed?

As with spin relaxation, the scalar and dipolar couplings present in uniformly labeled samples can lead to complications in RD and CEST experiments. As we have discussed elsewhere,[75] numerous spectroscopic solutions have been proposed to circumvent the problems that arise from 13C–13C couplings that exist in uniformly labeled RNA. These advances include constant time evolution,[217−220] adiabatic band selective decoupling,[221−223] and selective cross-polarization with weak RF fields.[224−226] These solutions have benefited RD and CEST experiments to varying degrees in RNA. Specifically, 13C–13C scalar couplings (e.g., C1′–C2′ or C5–C6) complicate CPMG experiments[227,228] to a much larger degree than both CEST and R1ρ. However, these couplings still pose a problem to CEST[229,230] and R1ρ[211] and oscillations are sometimes observed in the decay profiles of C1′ and C6 nuclei. Moreover, as with spin relaxation, these couplings must be explicitly taken into consideration in data analysis. The number of coupled homogeneous differential equations (n) is equal to (2 × 4^m) – 1, where m is the number of weakly coupled nuclear spins in an m-spin system. Therefore, for 1-, 2-, and 3-spin systems, n = 7, 31, and 127, respectively.[213,214,230] This transforms the CEST matrix (eq ) from 7 × 7 to 31 × 31 for 13C–13C scalar coupled spin pairs found in the nucleobase and ribose moieties. Atom-specific labeling (Section ), on the other hand, circumvents this problem entirely, and dramatically simplifies NMR spectra, especially when incorporated position-specifically via solid-phase synthesis (Section ). However, a drawback for selective labels is the obvious reduction of probe sites.
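The growth in the size of the two-site exchange problem with the number of coupled spins can be checked with a one-line calculation. The sketch below assumes the count n = 2 × 4^m − 1, which reproduces the values 7, 31, and 127 quoted above for one, two, and three spins; the function name is a hypothetical choice.

```python
def n_coupled_equations(m_spins):
    """Number of coupled equations for two-site exchange of m weakly coupled spins.

    Assumes n = 2 * 4**m - 1, the count that reproduces 7, 31, and 127.
    """
    if m_spins < 1:
        raise ValueError("m_spins must be a positive integer")
    return 2 * 4 ** m_spins - 1


if __name__ == "__main__":
    for m in (1, 2, 3):
        n = n_coupled_equations(m)
        print(f"{m}-spin system: {n} equations ({n} x {n} matrix)")
```

Each additional coupled spin roughly quadruples the matrix dimension, which is why isolated spin pairs keep the analysis at the 7 × 7 level.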
Nevertheless, using both selective and uniformly labeled RNA, CEST and R1ρ experiments have now been applied to the protonated nucleobase (Pyr-C5 and Pyr-C6, Pur-C8, and Ade-C2) and ribose (C1′-C5′) carbons, the nucleobase imino (Gua-N1 and Thy/Uri-N3) and amino (Gua-N2) nitrogen, nucleobase (Uri-H3, Gua-H1, Ade-H2, Pur-H8, Pyr-H5, and Pyr-H6) and ribose H1′ protons, as well as nonprotonated (Gua-N7, Ade-N1, and Pur-N7) and amino (Cyt-N4) nitrogen sites (Figure ).[75,147,167,211,231−233] In practice, CPMG experiments are solely implemented on selectively labeled RNA, and mainly from our group[75,144,227] and the Kreutz group,[82,85,89,210,234] though not exclusively.[193] CEST and R1ρ, on the other hand, have been used to great success with uniformly labeled RNA by the Al-Hashimi,[31,147,167,235−245] Petzold,[246,247] and Zhang[211,231,248−250] groups. Moreover, Petzold and co-workers have developed a SELective Optimized Proton Experiment (SELOPE) approach[251] that can be implemented with R1ρ and CEST[252] experiments using unlabeled samples. The rest of this section will highlight recent examples of RD experiments on labeled (selectively and uniformly) and unlabeled RNA.

Examples of Relaxation Dispersion Experiments in Selectively Labeled RNA

As highlighted above, implementation of RD experiments on selectively labeled RNA circumvents all complications from strong 13C–13C scalar couplings and permits straightforward data analysis. The following sections will be devoted to showcasing examples of CPMG, CEST, and R1ρ experiments performed on selectively labeled RNAs. Specifically, we will highlight recent work from our group[227,233] using isotope-labeled rNTPs and from Kreutz and Al-Hashimi and co-workers[150] using isotope-labeled amidites with post-transcriptional modifications.

CPMG in Atom-Specifically Labeled RNA

Until recently, CPMG experiments to measure the chemical shifts of nucleobase methine 1H and ribose methylene C5′(H2) in a low populated, transient state (i.e., state B) were not available. This gap existed, in part, due to complications from 13C–13C scalar couplings. To fill this knowledge gap, our group adapted single-quantum (SQ) 1H CPMG experiments previously designed for methyl groups in protein side-chains[253,254] to obtain CPMG data for the selectively labeled ([2′,8-13C2]-ATP, [1′,6-13C2]-CTP, [1′,8-13C2]-GTP, and [2′,6-13C2]-UTP) bacterial A-site RNA (Figures A,B).[227]
Figure 8

(A) Pulse scheme for SQ 1H CPMG experiment for selectively labeled RNA,[227] adapted from previous reports.[253,254] (B) Secondary structure of the 27 nt bacterial A-site RNA with all nucleotides harboring isotope labels shown bolded in orange. Nucleotides that were found to be in exchange are circled. Exchange parameters were extracted from a global fit of the CPMG data (i.e., G19-C8 and A21-C8). (C) Pulse scheme for methylene CH21H–13C TROSY-detected CPMG experiment for selectively labeled RNA,[227] adapted from previous reports.[256] (D) Secondary structure of the 29 nt iron-responsive element (IRE) RNA with isotope labels and nucleotides in exchange presented as in panel B. Exchange parameters were extracted from a global fit of the CPMG data (i.e., C18–C5′, C18–C1′, and C18–C6) and likely refer to a structural rearrangement in the IRE triloop. Orange circles and D refer to 13C and 2H nuclei, respectively. Additional details can be found in the original work.[227]

The SQ 1H CPMG experiment was amenable to Pur-H8 sites, detecting exchange in G19 and A21. The extracted exchange rate (kex = 4000 ± 100 s–1) from a global fit was consistent with that determined from a standard 1H–13C TROSY CPMG experiment (kex = 3000 ± 800 s–1), demonstrating that these new experiments are feasible for RNA (Figure B).[227] Moreover, these data agree with R1ρ measurements on uniformly labeled RNA from Al-Hashimi and co-workers, which suggests that each measurement, using various methods and labeling techniques, is picking up fundamental motions within this RNA.[255] In addition, these SQ experiments could provide important data on 1H chemical shifts, which are currently lacking, such as ribose H1′ and Pyr-H6. In the latter case, however, the presence of Pyr-H5 can cause dispersive CPMG patterns for the H6 site.[227] Fortunately, Pyr-H5 deuteration is easily achieved (Scheme ),[85] and therefore, this experiment can be readily implemented to obtain data for Pyr-H6 sites.
Our group also designed a CH21H–13C TROSY-detected CPMG pulse sequence (Figure C)[227,256] to leverage the isolated 13C spin at the ribose C5′ position (Figure ) afforded by our chemo-enzymatic labeling (Sections and 3.2).[74,75,128] This new CPMG experiment was implemented using the selectively labeled ([1′,5′,6-13C3, 5-2H]-CTP) iron-responsive element (IRE) RNA and detected exchange in C18–C5′ (Figure D).[227] These data were then globally fit with additional CPMG data from other nuclei to obtain chemical shift (Δω = 2.5 ± 0.2 ppm), population (pB = 1.7 ± 0.2%), and exchange rate (kex = 3600 ± 300 s–1) information that suggests a significant structural rearrangement in the IRE triloop (Figure D).[227]

CEST in Atom- and Position-Specifically Labeled RNA

In addition to using selective labels to benefit CPMG experiments, they can also be used to simplify CEST experiments. Specifically, our group combined enzymatic ligation, chemo-enzymatic labeling, and newly developed CEST experiments (Figure A) to study the conformational equilibria of the SAM-II riboswitch in the apo (ligand-free) state.[233] To understand the formation of the SAM metabolite-binding pocket, a SAM-II RNA was constructed via DNA splinted ligation with T4 DNA ligase (EC 6.5.1.1) of two RNA fragments: an unlabeled 31 nt acceptor fragment and a [1′,6-13C2, 5-2H]-CTP labeled 21 nt donor fragment. This strategy enabled position-specific labeling, given that there was only one cytidine (C43) in the donor sequence and therefore permitted direct monitoring of the G22–C43 base pair interaction in the SAM binding pocket. Moreover, the isolated spin pair labeling topology enabled the design of a 1H CEST experiment, and simplified setup and analysis of 1H and 13C CEST experiments without complications from 13C–13C couplings to Cyt-C1′ and Cyt-C6 sites.[229]
Figure 9

(A) Pulse scheme for 13C and 1H CEST experiments with temperature compensation (TC) and 1H decoupling (1H Dec) for selectively labeled RNA,[233] adapted from previous reports.[211,213] (B) Secondary structure of the 52 nt SAM-II riboswitch RNA. C43 position-specific labeling is shown bolded in orange and circled to indicate that it was the subject of CEST experiments. Exchange parameters were extracted from a global fit of the CEST data (i.e., C43–C1′ and C43–C6) and reveal a transition from an open to a closed conformation that resembles the SAM-bound form. Orange circles and D refer to 13C and 2H nuclei, respectively. Additional details can be found in the original work.[233]

To leverage the labeling scheme, our group designed a new 13C CEST experiment based on previous pulse schemes[211,213] and used it on the apo SAM-II riboswitch (Figure A). The CEST profiles of C43–C1′ and C43–C6 indicated two states of the free SAM-II riboswitch: one that matched the resonance of the ligand-free, highly populated conformation (i.e., state A) and another that matched the ligand-bound, transient conformation (i.e., state B) (Figure B).[233] We then used our new 1H CEST experiment (Figure A) to indirectly obtain the C43–H1′ chemical shift of state A and B.[233] In agreement with the 13C data, the 1H chemical shift of state B matched the ligand-bound SAM-II (Figure B).[233] Taken together, these results suggest that the apo SAM-II exists in a dynamic equilibrium (kex = 36 ± 3 s–1) between an open (highly populated, pA = 90.5 ± 0.5%) and a partially closed (transient, pB = 9.5 ± 0.5%) state (Figure B).[233] Moreover, these results underscore the emerging consensus that transient, low populated states likely enhance rapid ligand recognition and therefore play a potentially ubiquitous role in RNA recognition and signaling.
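The fitted kex and pB values can be converted into individual rate constants and a free-energy difference. The sketch below assumes a simple two-state equilibrium with kex = kAB + kBA and detailed balance (pA·kAB = pB·kBA), so that kAB = pB·kex and kBA = pA·kex. The temperature of 298.15 K used for the free-energy estimate is an assumption made for illustration, since the measurement temperature is not stated here, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def two_state_rates(k_ex, p_b):
    """Return (k_AB, k_BA) from k_ex = k_AB + k_BA and detailed balance."""
    p_a = 1.0 - p_b
    return p_b * k_ex, p_a * k_ex


def free_energy_difference(p_b, temperature_k=298.15):
    """Free energy of state B relative to state A in kcal/mol.

    The default temperature is an illustrative assumption.
    """
    p_a = 1.0 - p_b
    return -R_GAS * temperature_k * math.log(p_b / p_a) / 4184.0


if __name__ == "__main__":
    # apo SAM-II values quoted above: k_ex = 36 s^-1, p_B = 9.5%
    k_ab, k_ba = two_state_rates(36.0, 0.095)
    print(f"k_AB = {k_ab:.2f} s^-1 (lifetime of A: {1.0 / k_ab:.3f} s)")
    print(f"k_BA = {k_ba:.2f} s^-1 (lifetime of B: {1000.0 / k_ba:.1f} ms)")
    print(f"dG(B - A) = {free_energy_difference(0.095):.2f} kcal/mol")
```

For the apo SAM-II values this gives kAB of about 3.4 s–1 and kBA of about 32.6 s–1, that is, the partially closed state forms on a time scale of hundreds of milliseconds and reverts within tens of milliseconds.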

R1ρ and CEST in Atom- and Position-Specifically Labeled RNA Harboring Post-transcriptional Modifications

Perhaps the greatest benefit of selective labeling is the ability to monitor the structural dynamic consequences of epigenetic and post-transcriptional modifications. Using labels created by Kreutz and co-workers, the Al-Hashimi group has been at the forefront of exploring how these modifications alter the dynamic ensembles of nucleic acids.[148−150,232,257−260] One such example is m6A, an abundant RNA post-transcriptional modification that modulates gene expression,[261−263] viral lifecycles,[264−270] and other biological phenomena.[271−274] Recent work from the Al-Hashimi group demonstrated that m6A preferentially slows RNA duplex annealing with minimal effect on the rate of duplex melting.[149] The effect of m6A on hybridization kinetics stands in contrast to the effect of mismatches. Mismatches also slow the rate of duplex annealing but dramatically increase the rate of duplex melting.[275−277] Of critical importance, the methylamino group of the m6A nucleobase can form two rotational isomers that interconvert on the millisecond time scale[278,279] (Figure A). The preferred syn isomer (i.e., high-populated, state A) cannot form a canonical Watson–Crick base pair with uridine due to a steric clash between the uridine keto group and the methylamino[278−280] and is therefore mismatch-like (Figure A). Instead, when base-paired with uridine, the methylamino rotates into the anti isomer (i.e., transient, state B) to form a canonical Watson–Crick m6A:U base pair (Figure A).
Figure 10

(A) Equilibrium between syn:anti conformations of the m6A nucleobase and the types of base pairing that each conformation can adopt.[278−280] (B) Secondary structure of the 9 and 18 nt ssRNA and dsRNA that were position-specifically labeled with isotope-labeled m6A, as shown bolded in orange and circled to indicate that it was the subject of RD and CEST experiments. RNA samples harboring m6A were either made with [2,8-13C2]-m6A (top) or [13CH3]-m6A (bottom) labels to obtain 13C RD and CEST data for CH3 (methyl), C2, or C8 sites. (C) Schematic of the four-state CS-and-IF kinetic model with rate constants shown from RD and CEST data collected at 65 °C.[150] Orange circles refer to 13C. Additional details can be found in the original work.[150]

Kinetic mechanisms that involve binding and conformational change can occur via pathways wherein the conformational change occurs prior to (conformational selection, CS) or post (induced fit, IF) binding. Al-Hashimi and co-workers employed their recently developed RD-based and CEST experiments[31,147,167,236−245] to measure hybridization kinetics of single- and double-stranded RNA (ssRNA and dsRNA, respectively) harboring atom- and position-specifically labeled m6A probes (i.e., [2,8-13C2]-m6A or [13CH3]-m6A) (Figure B) to determine how m6A modulates hybridization.[150] In this way, they had direct readouts of the effects of the m6A isomers on Watson-Crick or mismatch-like hybridizations.
They showed that m6A with the methylamino group in the anti conformation forms a Watson–Crick base pair with uridine that transiently isomerizes on the millisecond time scale to a singly hydrogen-bonded (pB ≈ 1%) mismatch-like conformation, with the methylamino group in the syn conformation.[150] This rapid interconversion between Watson–Crick and mismatch forms, combined with different syn:anti preferences in ssRNA and dsRNA states, hints at how m6A slows duplex annealing without affecting melting via two pathways in which isomerization occurs before (CS) or after (IF) duplex annealing (Figure C).[150]
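To put the quoted minor-state population (pB ≈ 1%) on an energy scale, a two-state Boltzmann relation can be used. The following is a minimal sketch, not part of the original work: the function name is hypothetical, and the temperature of 298.15 K is an assumption chosen only for illustration, since the text does not state the temperature at which this population was measured.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_population(p_minor: float, temperature_k: float) -> float:
    """Free energy (kcal/mol) of the minor state relative to the major state.

    Assumes a simple two-state equilibrium: dG = -RT ln(p_minor / p_major).
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("p_minor must lie strictly between 0 and 1")
    p_major = 1.0 - p_minor
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(p_minor / p_major)


# pB ~ 1% is taken from the text; 298.15 K is an illustrative assumption.
dg_minor = delta_g_from_population(0.01, 298.15)
print(f"Minor state lies {dg_minor:.2f} kcal/mol above the major state")
```

Under these assumptions, a 1% population corresponds to a free-energy penalty of a few kcal/mol, which is why such states are sparsely populated yet still thermally accessible.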

Examples of Relaxation Dispersion Experiments without Selectively Labeled RNA

While RD experiments work well with selective labels, such labels are not a prerequisite, as long as care is taken either to minimize strong 13C–13C scalar couplings (i.e., by probing nuclei where these are minimized) or to take them into account in data analysis. The following sections showcase examples of CEST and R1ρ experiments performed without selectively labeled RNA. We will highlight recent work from the Zhang[249] and Petzold[246] groups using uniformly 13C/15N-labeled rNTPs and also new experiments from the Petzold[251] and Al-Hashimi[252] groups that require no labels at all.

CEST and R1ρ Experiments in Uniformly Labeled RNA

RNA dynamics can regulate biological processes from transcription to translation. One such example is the Bacillus cereus fluoride riboswitch RNA (Figure A), which has been characterized extensively by Zhang and co-workers.[249] Here, they showed that the riboswitch aptamer adopts a near-identical solution structure[249] with (holo) and without (apo) the fluoride ligand, in agreement with X-ray crystal structures (Figure B).[281] Moreover, these states also undergo very similar dynamic motions across a wide range of time scales, as determined from 13C spin relaxation rates and residual dipolar couplings (RDCs).[249] However, functional assays indicate that transcription activation is fluoride-dependent and kinetically driven.[281,282] What is more, mutational studies suggest that a prefolded “holo-like” apo state lowers the kinetic barrier for ligand binding, enabling efficient fluoride sensing to activate transcription below or near the toxicity threshold. Until recently, the mechanism by which this holo-like apo state achieves the “transcription–off” state remained unknown.[249]
Figure 11

(A) Secondary structure of the 48 nt fluoride riboswitch aptamer RNA with domains labeled by color. (B) Solution NMR structure[249] of the apo aptamer (B. cereus) (PDB ID, 5KH8) (left) compared to crystal structures[281] of the apo (PDB ID, 4ENC) and holo (PDB ID, 3VRS) aptamers (T. petrophila). In solution, the aptamer adopts near-identical structures in the apo and holo forms, in agreement with crystallography.[249,281] (C) Schematic of the equilibrium between the highly populated apo state (i.e., State A) and the transient “holo-like” conformation of the apo state (i.e., State B). Exchange parameters were extracted from a global fit of the CEST data. The transient “holo-like” conformation of the apo state (i.e., State B) occludes the formation of a reverse Hoogsteen base pair in the highly populated conformation of the apo state (i.e., State A) to signal transcription termination. Additional details can be found in the original work.[249]

To shed light on this mechanism, 13C CEST experiments were implemented on uniformly 13C/15N-GTP- and uniformly 13C/15N-ATP/UTP-labeled aptamer RNA. For the holo state, CEST profiles consistently showed a single, highly populated conformation (i.e., state A).[249] A subset of CEST profiles of the apo state, on the other hand, revealed the presence of conformational exchange to a transient state (i.e., state B).[249] The nucleotides that undergo chemical exchange were localized to the junction of P3, J13, J23, and the 3′-tail, suggesting a concerted transition (Figure A,B). A global fit of the CEST data determined the population (pB = 1.4 ± 0.1%) and lifetime (τB = 3.2 ± 0.3 ms) of the holo-like conformation of the apo state. This fleeting process differentiates the apo and holo states.
Rapid transition to the holo-like conformation of the apo state, which unlocks the highly conserved reverse Hoogsteen base pair located at the interface between the aptamer domain and the expression platform, promotes strand invasion and provides a path to transcription termination (Figure C).[249] Conversely, fluoride binding allosterically suppresses access to the holo-like conformation of the apo state, ensuring continued gene transcription.[249]

RNA can also regulate the initial steps of translational silencing. This process begins when a mature miRNA binds to the human Argonaute (Ago2) protein to form the RNA-induced silencing complex (RISC).[283] Here, translational silencing is predominantly controlled by base pair complementarity between the “seed” region of the miRNA and the target mRNA.[283−288] Interestingly, data from bioinformatics,[289] structural,[290] and mutational[291] studies all suggest that RNA dynamics within the central bulge of the miRNA–mRNA duplex likely control mRNA fate. To test this hypothesis, Petzold and co-workers used R1ρ experiments coupled with molecular dynamics simulations to investigate the structural dynamics of the interaction between miR-34a and its miRNA recognition element in the 3′-UTR of silent information regulator 1 mRNA (mSirt1) (Figure A).[246] Using these experiments, the authors detected chemical exchange in nucleotides surrounding the central bulge of the miR-34a–mSirt1 duplex (Figure A).[246] In this structural rearrangement, the gG8:tC17 base pair (‘g’ refers to the guide miRNA and ‘t’ refers to the target mRNA) interconverts from a highly populated (i.e., state A) to a transient (i.e., state B) conformation.
A global fit of the R1ρ data determined the exchange rate (kex = 1008 ± 12 s−1) and population (pB = 0.9 ± 0.2%) of the transient state (Figure B),[246] and the chemical shift differences[246] from 1H (Δω = −2.20 ± 0.02 ppm) and 15N (Δω = −3.8 ± 0.1 ppm) R1ρ experiments suggest formation of a gG8:tU21 wobble pair (Figure B),[246] a motif seen in other miRNAs.[292,293] Taken together, these data indicate that the miR-34a–mSirt1 binding site is in equilibrium between a highly populated 7-mer-A1 and a transient 8-mer-GU (Figure B).
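The exchange parameters quoted above for the fluoride riboswitch (pB and τB) and the miR-34a–mSirt1 duplex (kex and pB) are reported in different forms. As a minimal sketch, assuming a simple two-state A ⇌ B model with kex = kAB + kBA, pB = kAB/kex, and τB = 1/kBA, they can be converted to a common set of rate constants. The class and function names are hypothetical, and only the central values from the text are used (uncertainties are ignored).

```python
from dataclasses import dataclass


@dataclass
class TwoStateExchange:
    """Two-state exchange A <-> B described by forward and reverse rates."""
    p_b: float    # fractional population of the transient state B
    k_ab: float   # forward rate constant A -> B (s^-1)
    k_ba: float   # reverse rate constant B -> A (s^-1)

    @property
    def k_ex(self) -> float:
        return self.k_ab + self.k_ba

    @property
    def tau_b_ms(self) -> float:
        # Lifetime of state B in milliseconds
        return 1e3 / self.k_ba


def from_population_and_lifetime(p_b: float, tau_b_s: float) -> TwoStateExchange:
    k_ba = 1.0 / tau_b_s
    k_ab = k_ba * p_b / (1.0 - p_b)
    return TwoStateExchange(p_b, k_ab, k_ba)


def from_population_and_kex(p_b: float, k_ex: float) -> TwoStateExchange:
    return TwoStateExchange(p_b, p_b * k_ex, (1.0 - p_b) * k_ex)


# Central values quoted in the text
riboswitch = from_population_and_lifetime(p_b=0.014, tau_b_s=3.2e-3)
mirna = from_population_and_kex(p_b=0.009, k_ex=1008.0)

for name, system in (("fluoride riboswitch", riboswitch), ("miR-34a-mSirt1", mirna)):
    print(f"{name}: kAB = {system.k_ab:.1f} s^-1, kBA = {system.k_ba:.1f} s^-1, "
          f"kex = {system.k_ex:.1f} s^-1, tauB = {system.tau_b_ms:.2f} ms")
```

Expressed this way, both transient states are entered only a few times per second but are left on the millisecond time scale, which is the regime that CEST and R1ρ experiments are designed to detect.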
Figure 12

(A) Secondary structure of the miR-34a–mSirt1 duplex.[246] Nucleotides that were found to be in exchange are circled. (B) Schematic of the equilibrium between the highly populated 7-mer-A1 and transient 8-mer-GU miR-34a–mSirt1 duplex. Exchange parameters were extracted from a global fit of the R1ρ data (i.e., gG8-H1, gG8-N1, gG8-C8, tC17-C1′, tA19-C8, tU20-C1′, tU21-C6, and tA22-C8). The boxed nucleotides represent the critical switch from the gG8:tC17 to gG8:tU21 base pair. (C) Replotted functional data[246] showing the percentage of target repression for each miR-34a duplex. The transient 8-mer-GU reduces target mRNA levels ∼2-fold compared to the highly populated 7-mer-A1. The 8-mer-GU duplex therefore represents a “catalytically competent RISC”. Additional details can be found in the original work.[246]

Next, Petzold and co-workers sought to investigate the functional relevance of the transient 8-mer-GU state using a functional assay and simulated complexes of human Ago2 with 7-mer-A1 and 8-mer-GU miR-34a–mSirt1 duplexes. Interestingly, the switch to the 8-mer-GU state causes coaxial stacking of the seed and supplementary helices, which fit into Ago2 in a manner reminiscent of an active state in prokaryotic Ago.[294,295] Moreover, this state enhances repression of the target mRNA, revealing the importance of this dynamic miRNA–mRNA structure (Figure C).

CEST and R1ρ Experiments in Unlabeled RNA

After highlighting RD experiments in selectively and uniformly labeled RNA, we will conclude this section with a brief description of two pulse schemes that permit R1ρ[251] and CEST[252] experiments in unlabeled RNA. In the first, Petzold and co-workers developed a SELOPE homonuclear NMR method that combines selective excitation of specific groups of protons with a reduction of spectral crowding through coherence transfer among scalar-coupled protons. These coherence transfers take advantage of the uniform homonuclear three-bond scalar coupling between H5 and H6 in pyrimidine bases (∼8–10 Hz) or between H1′ and H2′ in riboses in the C2′-endo conformation (∼8 Hz). Taken together, SELOPE permits well-resolved 1D and 2D spectra of unlabeled RNA. To demonstrate the utility of this method to probe RNA transient states, Petzold and co-workers adapted the SELOPE pulse scheme to include a spinlock (Figure A).[251] As proof-of-concept, this new 1H R1ρ SELOPE experiment was used to detect chemical exchange in the central bulge region of the GUG RNA (Figure B).[251] Importantly, this method enables the use of lower spinlock strengths to measure slower exchange time scales.[251]
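The link between spinlock strength and accessible exchange time scale can be illustrated with the standard fast-exchange, on-resonance expression for the exchange contribution to R1ρ, Rex(ω1) = pA·pB·Δω²·kex / (kex² + ω1²). The sketch below is not taken from the original works: the function name is hypothetical, the expression is only valid in the fast-exchange limit, and the spinlock strengths (50 Hz and 1000 Hz) and the exchange rate (1000 s^-1) are illustrative assumptions.

```python
import math


def rex_fraction_remaining(k_ex: float, spinlock_hz: float) -> float:
    """Fraction of the exchange contribution that survives a given spinlock.

    Ratio Rex(omega_1) / Rex(0) = k_ex^2 / (k_ex^2 + omega_1^2) in the
    fast-exchange, on-resonance limit; populations and chemical shift
    differences cancel in this ratio.
    """
    omega_1 = 2.0 * math.pi * spinlock_hz  # spinlock strength in rad/s
    return k_ex ** 2 / (k_ex ** 2 + omega_1 ** 2)


K_EX = 1000.0  # s^-1, illustrative exchange rate (assumption)
for spinlock_hz in (50.0, 1000.0):  # illustrative spinlock strengths (assumption)
    fraction = rex_fraction_remaining(K_EX, spinlock_hz)
    print(f"spinlock = {spinlock_hz:6.0f} Hz -> {100 * fraction:5.1f}% of Rex remains")
```

In this simplified picture, a strong spinlock quenches nearly all of the exchange contribution from a slow process, so the dispersion profile is flat over the sampled range. Reaching lower spinlock strengths restores a measurable dependence of R1ρ on ω1 for such slower processes.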
Figure 13

(A) Pulse scheme for 1H R1ρ experiment on unlabeled RNA using a SELOPE readout.[251] (B) Secondary structure of the 25 nt GUG RNA. Nucleotides that were found to be in exchange are circled, and representative exchange parameters for U7–H6 (shaded green) are shown. (C) Pulse scheme for 1H CEST experiment on unlabeled RNA again with a SELOPE readout.[252] (D) Equilibrium between Watson–Crick and Hoogsteen A:T and G:C base pairs is depicted. Exchange rates and populations are shown based on previous reports,[236] and reporter imino protons are shaded red. Additional details can be found in the original works.[251,252]

Building on this work, Al-Hashimi and co-workers introduced a high-power 1H CEST SELOPE experiment to target imino protons (Figure C).[252] To showcase the utility of this method, Watson–Crick to Hoogsteen exchange of G:C and A:T base pairs in DNA was monitored (Figure D).[252] Importantly, Al-Hashimi and co-workers showed that fast exchange events could be characterized using short relaxation delays, which effectively minimize the NOE effects that complicate 1H RD experiments.[213,252,296−301] Moreover, their approach also takes advantage of high-power RF fields, recently shown to extend the time scale sensitivity of CEST to include faster exchange processes that were traditionally only detectable by R1ρ.[252,302] While both of these new advancements hold promise, they are inherently limited to small RNAs. However, RNA biology is increasingly moving toward larger RNAs. This important topic will be the focus of the next section.

Exploring Large Molecular Weight Nucleic Acids

Until now, most studies of RNA dynamics have focused on relatively small systems. However, RNA structural biology is increasingly moving toward larger RNAs, especially as cryo-EM advances in resolution and popularity.[303−305] Solution NMR spectroscopy, unlike X-ray crystallography and cryo-EM, is the only biophysical technique capable of probing nucleic acid conformational dynamics on a wide range of time scales in a physiologically relevant environment. Moreover, four technological advances have expanded the types of problems that NMR can tackle in studies of molecular nanomachines on the order of 1 MDa: (1) commercial availability of high-field magnets, up to 1.2 GHz 1H Larmor frequency (28.2 T),[306] (2) specialized probes (e.g., cryo-probes) that minimize noise associated with the NMR signals,[307] (3) new isotope labeling technologies (described in Section ), and (4) the design of new NMR experiments that are tailored to the isotope labeling used (described in Section ). Our final section will describe how new labeling efforts can be leveraged to study large RNAs by NMR.

Taking inspiration from protein labeling,[308] our group installed 19F directly next to a 13C spin in UTP (Scheme )[18,86] and showed that, compared to the 13C–1H spin pair, 13C–19F had better sensitivity, ∼6-fold wider chemical shift dispersion, and ∼2-fold more favorable relaxation properties in 2D TROSY experiments (Figure A,B).[18] Importantly, the high sensitivity of the 19F nucleus enabled clear delineation of helical and nonhelical regions as well as G:U wobble and Watson–Crick base pairs (Figure C).[18,57] In parallel, the Kreutz group incorporated 13C–19F into both cytidine and uridine 2′-O-tBDMS amidites (Schemes and 22) to show the same effect in RNAs made by solid-phase synthesis.[57] These findings suggest that structural insights are possible even in the absence of complete resonance assignment, which is a substantial bottleneck for large RNAs.
Moreover, these labeling schemes can be readily adapted to exploit 19F CEST and R1ρ experiments, which have been described for proteins up to 360 kDa.[309−314]
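The correspondence between the quoted 28.2 T field and a 1.2 GHz 1H Larmor frequency follows from ν = (γ/2π)·B0. The sketch below checks this and lists the frequencies of the other nuclei discussed in this section at the same field. The gyromagnetic ratios are approximate literature values supplied here as assumptions (they do not appear in the original text), and the names are hypothetical.

```python
# Approximate gyromagnetic ratios divided by 2*pi, in MHz/T (assumed values).
# The 15N ratio is negative; only its magnitude is stored here.
GAMMA_OVER_2PI_MHZ_PER_T = {
    "1H": 42.577,
    "19F": 40.078,
    "13C": 10.708,
    "15N": 4.316,
}


def larmor_frequency_mhz(nucleus: str, b0_tesla: float) -> float:
    """Magnitude of the Larmor frequency (MHz) of a nucleus at field b0_tesla."""
    return GAMMA_OVER_2PI_MHZ_PER_T[nucleus] * b0_tesla


B0_TESLA = 28.2  # field strength quoted in the text
for nucleus in GAMMA_OVER_2PI_MHZ_PER_T:
    print(f"{nucleus:>3}: {larmor_frequency_mhz(nucleus, B0_TESLA):7.1f} MHz")
```

The 19F frequency lies close to that of 1H, which is consistent with the high sensitivity of the 19F nucleus noted above.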
Figure 14

(A) Simulated 13C R2 rates (linewidths) at various magnetic fields in RNAs of various molecular weights (as measured by τC) to compare the relative TROSY effects of 13C–1H or 13C–19F spin pairs, which are shown on the left. (B) Same simulated rates as in A for each spin pair but only at the magnetic fields corresponding to the narrowest linewidths (smallest R2) (600 and 950 MHz for 13C–19F or 13C–1H, respectively, as shown by the gray lines in panel A).[18] (C) Representative 19F–13C TROSY spectrum to highlight the dispersion of resonances based on secondary structure (i.e., G:U wobble base pairs, nonhelical nucleotides, and helical A:U base pairs). Spectral regions are colored to match the respective uridines on the corresponding RNA. Additional details can be found in the original works.[18,57]

An alternative approach to heteronuclear correlation experiments that include nuclei with large CSAs such as 13C and 19F, which broaden the lines of nearby protons, was recently described by Bax and Summers and co-workers.[53] This approach capitalizes on the favorable relaxation properties of 15N nuclei within RNA nucleobases. Here, they employed 1H–15N heteronuclear multiple quantum coherence (HMQC) experiments to measure 15N R1ρ rates and RDCs in a large 232 nt (∼78 kDa) RNA by selectively transferring magnetization from Ade-H2 to Ade-N1/N3 via the two-bond scalar coupling (J ≈ 15 Hz[29]) (Figure ). Extending this method in the same 232 nt RNA, Marchant and Tjandra and co-workers measured pseudocontact shifts using the two-bond scalar couplings of Ade-H8-N7 (J ≈ 11 Hz[29]) and Ade-H8-N9 (J ≈ 8 Hz) for coherence transfer.[315] Importantly, both experiments would benefit from atom-specific labeling. That is, selective 15N labeling of Ade-N1 or Ade-N3 (described in Section ) (Schemes and 8) would reduce crowding considerably and direct magnetization transfer uniquely from Ade-H2 rather than splitting it between both sites, as in uniformly 15N-labeled RNA (Figure ).
In the same way, selective 15N labeling of Pur-N7 or Pur-N9 (described in Section ) (Schemes –11) would again reduce crowding and direct coherence transfer uniquely from Pur-H8 (Figure ). Alternatively, selective pulses can be deployed to achieve the same decrowding and directed transfer. These labeling topologies could then be leveraged to probe two-bond 15N CEST in large RNAs, as recently described by Zhang and co-workers.[231]
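The benefit of directing transfer from Ade-H2 to a single nitrogen can be illustrated with a simple product-operator estimate. For a proton coupled to one active 15N and any number of passive 15N spins, the antiphase term that feeds the active nitrogen builds up as sin(πJ_active·τ) multiplied by cos(πJ_passive·τ) for each passive coupling. The sketch below is a simplified model rather than the published pulse scheme: it neglects relaxation entirely, the function names are hypothetical, and only the J ≈ 15 Hz value comes from the text.

```python
import math


def transfer_amplitude(tau_s, j_active_hz, j_passive_hz=()):
    """Relaxation-free amplitude of 1H-to-15N antiphase buildup after delay tau_s."""
    amplitude = math.sin(math.pi * j_active_hz * tau_s)
    for j_hz in j_passive_hz:
        # Each additional (passive) 15N coupling modulates the buildup by a cosine
        amplitude *= math.cos(math.pi * j_hz * tau_s)
    return amplitude


def best_delay(j_active_hz, j_passive_hz=(), step_s=1e-4, n_steps=500):
    """Grid search (0.1 ms steps up to 50 ms) for the delay with maximal transfer."""
    delays = [i * step_s for i in range(1, n_steps + 1)]
    return max(delays, key=lambda t: transfer_amplitude(t, j_active_hz, j_passive_hz))


J_H2_N = 15.0  # Hz, two-bond Ade-H2 to N1/N3 coupling quoted in the text

# Selectively labeled: only one 15N (N1 or N3) is coupled to H2
tau_selective = best_delay(J_H2_N)
amp_selective = transfer_amplitude(tau_selective, J_H2_N)

# Uniformly labeled: H2 is coupled to both N1 and N3 with similar J
tau_uniform = best_delay(J_H2_N, (J_H2_N,))
amp_uniform = transfer_amplitude(tau_uniform, J_H2_N, (J_H2_N,))

print(f"selective: tau = {1e3 * tau_selective:.1f} ms, amplitude = {amp_selective:.2f}")
print(f"uniform:   tau = {1e3 * tau_uniform:.1f} ms, amplitude = {amp_uniform:.2f}")
```

In this idealized model, the amplitude reaching each nitrogen in the uniformly labeled case is at most half of that in the selectively labeled case. In real samples, relaxation during the delay would further reduce both values, which this sketch does not capture.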
Figure 15

Examples of possible routes for coherence transfer via the two-bond scalar couplings Ade–H2-N1 and Ade–H2-N3 (J ≈ 15 Hz[29]), Pur–H8-N7 (J ≈ 11 Hz[29]), and Pur–H8-N9 (J ≈ 8 Hz[29]) in uniformly and selectively labeled RNA.

Our final example of harnessing the versatility of the 15N nucleus is one that exploits the narrower linewidths in 1H–15N TROSY experiments compared to their 1H–13C counterparts (Figure A). Here, Fürtig and Schwalbe and co-workers investigated several reconstituted complexes between an adenine-sensing riboswitch and the 30S ribosome by NMR spectroscopy.[153] In particular, they implemented the 1H–15N BEST-TROSY pulse scheme[316,317] to obtain remarkable spectra for a very large complex (>800 kDa) (Figure B). Taken together, Fürtig and Schwalbe and co-workers succeeded in illuminating the dynamic network that links the riboswitch RNA regulator, adenine ligand inducer, and ribosome protein S1 modulator during translation initiation.[153]
Figure 16

(A) Simulated TROSY-detected R2 rates (linewidths) for 13C and 15N nuclei at 800 MHz. The 15N nucleus has significantly narrower linewidths (smaller R2) than 13C. (B) Structural model of the >800 kDa complex of adenine-sensing riboswitch bound to the 30S ribosomal complex (structural model built from PDB IDs 1Y26 and 5MLN).[153] Additional details can be found in the original work.[153]

Conclusion

In humans, RNA transcripts outnumber the proteins decoded by more than 50-fold, and yet RNA-only structures account for less than 1% of those deposited in the PDB, preventing a detailed understanding of RNA function (Figure ). It is therefore essential to characterize RNA structural dynamics and interactions at atomic resolution to fill this critical knowledge gap. Over the past two decades, NMR spectroscopy has assumed a central role in determining RNA structures and probing their dynamics on functionally relevant time scales in solution. In this review, we have summarized some of the many contributions of solution NMR studies to our knowledge of RNA structure, dynamics, and interactions, as facilitated by isotope labeling. We have presented a detailed overview of the prominent role stable isotopes continue to play in NMR analysis of nucleic acids (Section 2), how to synthesize these labels and introduce them into RNA (Section 3), and how these labels benefit NMR analysis. Of particular interest, selective isotope labeling alleviates spectral crowding and removes dipolar and scalar couplings to simplify NMR dynamics measurements and data interpretation (Section 4). Moreover, recent advances in labeling open the door to studying large RNA systems in a manner previously thought impossible (Section 5). As new orthogonal technologies are developed to better characterize the functional relevance of RNAs, their structural dynamics will become increasingly important for understanding the cellular basis of the RNA-based dysfunction that leads to various diseases. We anticipate that several imminent breakthrough technologies, some described herein, will enable NMR spectroscopy to continue to play a pivotal role in shining light on the structure, dynamics, and function of RNA, the important “dark matter of the genome”, in vitro, in cellulo, and in vivo.