Philipp Innig Aguion1, Alexander Marchanka1,2, Teresa Carlomagno3
1. Institute for Organic Chemistry and Centre of Biomolecular Drug Research (BMWZ), Leibniz University Hannover, Schneiderberg 38, 30167 Hannover, Germany.
2. Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstr. 1, 69117 Heidelberg, Germany.
3. School of Biosciences/College of Life and Environmental Sciences, Institute of Cancer and Genomic Sciences/College of Medical and Dental Sciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
Abstract
Solid-state NMR (ssNMR) has become a well-established technique to study large and insoluble protein assemblies. However, its application to nucleic acid-protein complexes has remained scarce, mainly due to the challenges presented by overlapping nucleic acid signals. In the past decade, several efforts have led to the first structure determination of an RNA molecule by ssNMR. With the establishment of these tools, it has become possible to address the problem of structure determination of nucleic acid-protein complexes by ssNMR. Here we review first and more recent ssNMR methodologies that study nucleic acid-protein interfaces by means of chemical shift and peak intensity perturbations, direct distance measurements and paramagnetic effects. At the end, we review the first structure of an RNA-protein complex that has been determined from ssNMR-derived intermolecular restraints.
Besides its long-established role as carrier of genetic information in protein translation, RNA acts in many other cellular contexts, with new roles being discovered regularly (Cech and Steitz, 2014). The vast majority of the RNAs made from the human genome have functions distinct from protein coding (non-coding RNAs, ncRNAs), but many of these functions remain unknown. Both coding RNA (mRNA) and ncRNAs may act in complex with specific RNA-binding proteins (RBPs), which contain well-defined RNA recognition motifs (Corley et al., 2020). Being involved in such a high number of biological functions, RNA molecules hold potential as both therapeutic agents and targets.

As for proteins, the development of RNAs either as therapeutic targets or as protein-targeting agents requires understanding their three-dimensional structures and their interaction modes with proteins. Methods to characterize RNA–protein interactions are developing rapidly (Schlundt et al., 2017). X-ray crystallography and cryo electron microscopy (cryo-EM) can be applied to high molecular-weight complexes and have been the methods of choice to study many large ribonucleoprotein complexes (RNPs) in the past years (Ben-Shem et al., 2011, Ghanim et al., 2021, Jackson et al., 2014, Khatter et al., 2015, Nguyen et al., 2016). These techniques work well for RNPs with well-defined structures but fall short when addressing conformational heterogeneity or dynamic processes. RNA molecules can adopt different folds depending on both the environmental conditions and the interacting partners and are often flexible in isolation. The presence of disordered RNA regions can make crystallization of RNPs quite challenging (Blanco and Montoya, 2011), while cryo-EM analysis remains blind to disordered molecular regions. In addition, both X-ray crystallography and cryo-EM are unable to represent the dynamics of the studied system at the atomic level, which, especially in enzymes, provides the crucial link between structure and function.
Finally, several intermolecular interactions with functional regulatory roles are transient in nature; transient complexes are difficult to crystallize and may dissociate during the preparation of cryo-EM grids, rendering both X-ray crystallography and cryo-EM inapplicable.

NMR spectroscopy is a structural biology technique that is able to provide structural information in the presence of disorder and dynamics. As such, NMR is very useful to study RNPs, and more generally nucleic acid–protein complexes, containing flexible regions, and reveals whether and how these disordered regions contribute to binding specificity and/or modulate affinity. NMR spectroscopy also enables structural studies of transiently forming complexes in a wide range of affinities (Campagne et al., 2011, Carlomagno, 2014, Dominguez et al., 2011, Simon et al., 2011, Yadav and Lukavsky, 2016). Finally, NMR allows the direct observation of hydrogen atoms, which remain inaccessible by X-ray crystallography or cryo-EM but play a crucial role in nucleic acid interactions (Yip et al., 2020).

Notoriously, NMR studies in solution are limited to particles of less than ∼50 kDa, due to the direct dependency of line-broadening on the molecular size. For proteins, this limit has been considerably extended by the methyl-TROSY methodology (transverse relaxation-optimized spectroscopy) (Kay, 2011, Rosenzweig and Kay, 2014, Tugarinov et al., 2003) in combination with selective 13CH3 methyl group labeling of highly deuterated proteins. Methyl-TROSY NMR exploits the excellent relaxation properties of methyl groups, which, when present as the only hydrogen-bearing groups in otherwise deuterated proteins, retain usable NMR line widths even in particles as large as 1 MDa (Lapinaite et al., 2013, Mas et al., 2018, Mas et al., 2013, Sprangers and Kay, 2007, Graziadei et al., 2020).
Unfortunately, RNAs do not contain any methyl groups and NMR studies of large RNAs rely on a challenging combination of two-dimensional 1H–1H correlation experiments measured on several selectively-deuterated samples (Brown et al., 2020, Keane et al., 2015). Recently, methyl-TROSY has been applied to high-molecular weight DNA, where methyl groups were engineered at the C5 position of cytosines (5mC) and the N6 position of adenines (6mA) (Abramov et al., 2020). However, because artificial methylation of both DNA and RNA can affect their structures, dynamics and interactions with binding partners (Choy et al., 2010, Helm, 2006, Kawai et al., 1992, Ngo et al., 2016, Williams et al., 2001), nucleic acid methylation is not generally applicable as a route to studying large nucleic acids by NMR. In fact, methylation of both DNA and RNA occurs naturally in the cell, where it exerts a regulatory function by modulating both the structure and the interactome of nucleic acids.

Solid-state NMR spectroscopy (ssNMR) is another form of biomolecular NMR spectroscopy, which has been applied extensively to insoluble and non-crystalline particles, such as membrane proteins (Cady et al., 2010, Lange et al., 2006, Park et al., 2012, Shahid et al., 2012, Shi et al., 2009) and amyloid fibrils (Colvin et al., 2016, Hoop et al., 2016, Tycko, 2011, Van Melckebeke et al., 2010). ssNMR linewidths do not depend on the molecular weight, making the application of ssNMR to large particles feasible, provided that there is enough signal to compensate for the small number of large particles that can be fitted into a rotor of a given size. In addition, because ssNMR lines are intrinsically broader than solution NMR lines, selective isotope labeling is often crucial to resolve spectral overlaps and achieve site-specific assignment even of moderately sized molecules.
Despite these limitations, ssNMR has been applied to large viral assemblies (Andreas et al., 2016, Goldbourt et al., 2007, Lusky et al., 2021, Morag et al., 2015, Morag et al., 2014, Sergeyev et al., 2011, Yu and Schaefer, 2008) and site-specific structural information was obtained for the 46-residue-long major coat protein subunit of the filamentous bacteriophage Pf1, as part of the 36 MDa virion (Goldbourt et al., 2007), thanks to the fact that the 7300 subunits of the virion all adopt the same conformation. Over the years, the ssNMR toolbox has been extended for the application to RNA in isolation (Leppert et al., 2004, Lusky et al., 2021, Riedel et al., 2006, Riedel et al., 2005a, Riedel et al., 2005b, Yang et al., 2017), RNA bound to short peptides (Huang et al., 2010, Huang et al., 2011, Huang et al., 2017, Olsen et al., 2005, Olsen et al., 2010), RNA as part of RNP complexes (Aguion et al., 2021, Ahmed et al., 2020, Marchanka et al., 2013, Marchanka et al., 2015, Marchanka et al., 2018b) and DNA–protein complexes (Boudet et al., 2019, Lacabanne et al., 2020, Malär et al., 2021b, Wiegand et al., 2020b, Wiegand et al., 2019, Wiegand et al., 2017b, Wiegand et al., 2016, Zehnder et al., 2021).

The main challenge in NMR of RNA both in solution and in solid-state is the overlap of the signals due to the limited chemical diversity of the nucleotides. This is especially true in canonical, and thus homogeneous, tertiary structure elements, such as A-form helices. This challenge can be addressed using nucleotide-type selective and/or segmental isotope labeling (Duss et al., 2010, Nelissen et al., 2008, Tzakos et al., 2007), as well as site-specific labeling (Lu et al., 2010, Marchanka et al., 2018a), which reduce spectral crowding. As mentioned above, selective labeling becomes crucial in ssNMR, where the signal overlaps are significant also for RNAs of medium size in the absence of significant structural diversity (i.e. 
in helical regions).

The second challenge in NMR of RNA is the unequal proton distribution. Nucleic acids have a high proton density in the ribose ring but only few protons in the nucleobases and no protons at the backbone phosphate. This leads to a limited number of 1H–1H distance restraints available to determine the conformation at both the backbone and the Watson-Crick and Hoogsteen sides of the nucleobases. Fortunately, in ssNMR, the number of distance restraints that can be collected does not directly correlate with the number of protons, as distance restraints can be measured via both heteronuclear and homonuclear correlations mediated by dipolar couplings. Moreover, the distance range of restraints measured in ssNMR experiments such as rotational echo double resonance (REDOR) (Gullion and Schaefer, 1989a, Gullion and Schaefer, 1989b) or proton-driven spin diffusion (PDSD) (Szeverenyi et al., 1982) can exceed the range of those obtained in solution NMR by NOESY experiments (Huang et al., 2011, Huang et al., 2010, Marchanka and Carlomagno, 2019, Olsen et al., 2005, Olsen et al., 2003, Studelska et al., 1996), providing a useful tool for the refinement of global conformations. For example, REDOR experiments have provided long-range distance restraints up to 16 Å in proteins (Studelska et al., 1996) and 13 Å in nucleic acids (Olsen et al., 2003).

Because of its applicability to particles of large size and the versatility it offers in the design of magnetization transfer pathways, MAS ssNMR can adopt an important role in structural biology of RNP complexes. However, the disadvantages caused by signal overlap have long discouraged the application of ssNMR to RNA-containing particles.
In the past decade our lab has developed a suite of ssNMR experiments that achieve assignment of RNA 13C,15N (Marchanka et al., 2015, Marchanka et al., 2013, Marchanka and Carlomagno, 2019) and 1H resonances (Aguion et al., 2021, Aguion and Marchanka, 2021, Marchanka et al., 2018b) as well as de novo RNA structure determination using distance restraints obtained solely from ssNMR experiments (Marchanka et al., 2015). These experiments, together with those developed in several other laboratories for the structure determination of proteins using 13C,15N detection (Castellani et al., 2003, Zhao, 2012) at low MAS frequencies and 1H detection (Andreas et al., 2016, Schubeis et al., 2018) at fast MAS frequencies (Penzel et al., 2019, Schledorn et al., 2020), allow the structure determination of the individual RNA and protein components of RNP complexes by ssNMR. Once the structures of the individual components are established, the identification of the intermolecular interfaces as well as the measurement of intermolecular distances yield the structure of the complex.

In this review we present recent advances of ssNMR spectroscopy to determine intermolecular contacts in RNP complexes and discuss advantages and challenges of the individual strategies. Due to the similar nature of the interfaces, we also review ssNMR studies of intermolecular interactions in DNA–protein complexes. We present conventional ssNMR methods that rely on the detection of low γ nuclei, such as 13C, 15N and 31P, as well as novel 1H-detected ssNMR experiments under fast magic-angle-spinning (MAS) rates. Finally, we discuss the first example of structure determination of an RNA–protein complex guided solely by ssNMR-derived restraints.
Sample preparation
NMR of nucleic acids is more challenging than that of proteins due to the poor chemical shift dispersion of their NMR signals. Thus, even for relatively small nucleic acids, advanced isotope labeling strategies may be required to obtain site-specific structural information. Nucleic acids for ssNMR studies can be prepared by either chemical (Beaucage and Reese, 2009, Roy and Caruthers, 2013) or enzymatic synthesis; however, the methods available to produce isotope-labeled RNA are considerably more advanced than those available for DNA. An extensive description of isotope labeling strategies for RNA (Marchanka et al., 2018a) and ssDNA/dsDNA (Nelissen et al., 2016) by either chemical or enzymatic synthesis can be found in the literature.

As opposed to X-ray crystallography, ssNMR does not require crystals of any particular size and is therefore applicable to particles with substantial flexible regions, which are difficult to force into well-ordered, large crystals. Common sample preparation techniques in ssNMR include micro (nano)-crystallization (Bertini et al., 2010a, Franks et al., 2005, Huang et al., 2012, Marchanka et al., 2013, McDermott et al., 2000, Yang et al., 2017), ethanol precipitation (Zhao et al., 2019), lyophilization (Huang et al., 2011, Leppert et al., 2004, Olsen et al., 2008, Olsen et al., 2005, Wang et al., 1994), freezing in the presence of a cryoprotectant (Siemer et al., 2012) or sedimentation of soluble macromolecules into the ssNMR rotor using ultracentrifugation (Barbet-Massin et al., 2015, Bertini et al., 2011, Gardiennet et al., 2012, Lacabanne et al., 2020, Wiegand et al., 2016, Wiegand et al., 2020a).
Micro-crystallization, ethanol precipitation and sedimentation can all yield sufficiently narrow linewidths to allow for site-specific assignments in individual samples (Aguion and Marchanka, 2021); however, ethanol precipitation is incompatible with the protein component of RNP complexes, while sedimentation has been successfully applied to RNP complexes, such as the prokaryotic ribosome (Barbet-Massin et al., 2015), but no data are available with respect to RNA linewidths in these samples. In contrast, both lyophilization and flash-freezing have been demonstrated to lead to inhomogeneous line broadening (Huang et al., 2011, Olsen et al., 2005, Siemer et al., 2012), impairing site-specific assignments. Nevertheless, structural information can be obtained also for these samples, in combination with site-specific isotopic labeling (Huang et al., 2011, Huang et al., 2010, Olsen et al., 2005).

ssNMR samples of nucleic acid–protein complexes prepared by either sedimentation or micro-crystallization have been reported to be stable over long periods. A DNA–protein–ATP complex prepared by sedimentation and stored at –20 °C for 3 years has been shown to yield virtually the same 13C–13C DARR spectrum as the freshly prepared sample (Lacabanne et al., 2020, Wiegand et al., 2020a). An RNP complex ssNMR sample prepared by micro-crystallization in our laboratory (Aguion et al., 2021) shows identical 1H–13C and 1H–15N fingerprint spectra after storage at +4 °C for two years, with only minimal loss of signal intensities (Fig. 1).
Fig. 1
Long-term stability of precipitated RNP samples. (A) Overlay of the 2D CP hCH spectra tailored for the base spectral region of uniformly 13C,15N labeled Pyrococcus furiosus (Pf) 26mer Box C/D RNA in complex with the Pf L7Ae protein (Aguion et al., 2021, Marchanka et al., 2018b) immediately after sample preparation (blue) and two years after storage at +4 °C (grey). (B) Representative horizontal 1D trace of the C6-H6 peak from residue U20 taken from the 2D spectra at the indicated dashed line in (A). Peaks after two years of storage at +4 °C show only minimal loss of signal intensity. Spectra were recorded on a Bruker Avance III HD spectrometer at a 1H resonance frequency of 850 MHz, a magic angle spinning (MAS) rate of 100 kHz, and a temperature of 275 K, using a 0.81-mm MAS probe-head developed by the Samoson group. The measurement time for both experiments was 7.5 h and all acquisition and processing parameters were the same as described in detail in (Aguion et al., 2021).
Finally, 1H-detected ssNMR experiments at MAS rates above 100 kHz require low sample quantities (300–800 μg for rotors of 0.7–0.8-mm size) (Aguion et al., 2021, Lacabanne et al., 2020, Marchanka et al., 2018b), thus limiting the cost and time demands of sample preparation.
Characterization of nucleic acid–protein interfaces by MAS ssNMR
The approaches used so far to characterize intermolecular contacts in nucleic acid–protein complexes can be divided into three classes, whereby many of the published studies use a combination of these approaches.

Similar to solution-state NMR, the involvement of a molecular surface in interactions with a binding partner can be detected by either chemical shift perturbations (CSPs) or intensity changes of the ssNMR peaks of the surface atoms, when comparing the free and the bound state of the molecule (Ahmed et al., 2020, Boudet et al., 2019, Lacabanne et al., 2020, Malär et al., 2021b, Wiegand et al., 2020b, Wiegand et al., 2019, Wiegand et al., 2016, Williamson, 2013). For example, the formation of a hydrogen bond at a nucleic acid–protein interface causes a downfield shift of the involved 1H atom (Wagner et al., 1983). Changes in peak intensities can report on changes in the dynamics of one of the binding partners upon complex formation (Lacabanne et al., 2020, Wiegand et al., 2019). In another study, Asami et al. measured a 15N-1H correlation spectrum of a protein in complex with either 1H- or 2H-RNA (Asami et al., 2013) and used the differences in protein peak intensities to identify the RNA–protein interface, exploiting the line-broadening caused by the RNA 1H nuclei in the 1H-RNA–protein complex. These effects are specific to the surface of the protein in contact with the RNA, while CSPs can also occur in regions other than the intermolecular surface, due to allosteric effects. In any case, both CSPs and changes in peak intensities can be used as ambiguous restraints in docking protocols.

Intermolecular distances can be measured directly through intermolecular dipolar correlation experiments. In solution-state NMR, intermolecular distances are measured through 13C,15N-edited, 12C,14N-filtered 1H–1H NOESY experiments (Breeze, 2000, Zwahlen et al., 1997).
In ssNMR, a plethora of methods yield dipolar correlations, such as cross-polarization (CP) (Hartmann and Hahn, 1962), rotational echo double resonance (REDOR) (Gullion and Schaefer, 1989a, Gullion and Schaefer, 1989b) and the closely related transferred echo double resonance (TEDOR) (Hing et al., 1992), as well as dipolar-assisted rotational resonance (DARR) (Takegoshi et al., 2001, Takegoshi et al., 1999), proton-driven spin diffusion (PDSD) (Szeverenyi et al., 1982), proton spin diffusion (PSD) (Lange et al., 2002, Wilhelm et al., 1998), radio frequency-driven dipolar recoupling (RFDR) (Bennett et al., 1992, Sodickson et al., 1993), or combined R2nv-driven (CORD) (Hou et al., 2013) experiments. One of the most popular approaches for the measurement of nucleic acid–protein distances utilizes TEDOR-based 31P–13C and 31P–15N correlations (Bechinger et al., 2011, Huang et al., 2011, Huang et al., 2010, Jehle et al., 2010, Olsen et al., 2005, Yu and Schaefer, 2008). 31P has a high gyromagnetic ratio (Table 1) and is present in nucleic acids exclusively and with 100 % isotopic abundance; thus, in the presence of a 13C,15N-labeled protein, these correlation experiments are sensitive and report exclusively on intermolecular contacts between the protein and the nucleic acid.
Table 1
Nuclei relevant for the identification of nucleic acid–protein binding interfaces in ssNMR.
| Nucleus | Spin quantum number | Natural abundance (%) | Gyromagnetic ratio γ (10^7 rad T^−1 s^−1) | NMR transition frequency at 18.8 T (MHz) |
|---|---|---|---|---|
| 1H | ½ | 99.98 | 26.7519 | 800 |
| 2H | 1 | 0.015 | 4.1066 | 123 |
| 13C | ½ | 1.1 | 6.7283 | 200 |
| 15N | ½ | 0.37 | −2.7126 | 80 |
| 19F | ½ | 100 | 25.1815 | 753 |
| 31P | ½ | 100 | 10.8394 | 324 |
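The last column of Table 1 follows from the gyromagnetic ratios through the Larmor relation ν = |γ|·B0/(2π). The short sketch below recomputes it at 18.8 T; the function and dictionary names are illustrative, and the computed values (for example about 201 MHz for 13C and 81 MHz for 15N) differ slightly from the nominal, rounded frequencies quoted in the table.

```python
import math

# Gyromagnetic ratios from Table 1, converted to rad T^-1 s^-1.
GAMMA = {
    "1H": 26.7519e7,
    "2H": 4.1066e7,
    "13C": 6.7283e7,
    "15N": -2.7126e7,
    "19F": 25.1815e7,
    "31P": 10.8394e7,
}

B0_TESLA = 18.8  # static field used in Table 1


def larmor_mhz(gamma, b0=B0_TESLA):
    """Return the magnitude of the Larmor frequency in MHz: |gamma| * B0 / (2*pi)."""
    return abs(gamma) * b0 / (2.0 * math.pi) / 1e6


if __name__ == "__main__":
    for nucleus, gamma in GAMMA.items():
        print(f"{nucleus:>4}: {larmor_mhz(gamma):7.1f} MHz")
```

The sign of γ (negative for 15N) only sets the sense of precession, so the absolute value is used for the transition frequency.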
The third approach utilizes paramagnetic relaxation enhancement (PRE) effects, which rely on the interaction between nuclei and unpaired electrons. Because the electron spin gyromagnetic ratio is approximately 660 times larger than that of the 1H nucleus, these effects are large and can be used to measure longer distances than internuclear dipolar correlations. In solution, PRE measurements have often been applied to measure long-range intra- and intermolecular distances (up to 30 Å) in nucleic acid–protein complexes (Amrane et al., 2014, Graziadei et al., 2020, Hennig et al., 2015, Lapinaite et al., 2013, Leeper et al., 2010, MacKereth et al., 2011, Martin-Tumasz et al., 2010). For a detailed description of paramagnetic NMR in solution and solid-state, readers are referred to a comprehensive review by Pell and coworkers (Pell et al., 2019). In ssNMR, PRE measurements were first employed to identify residues of metalloproteins in proximity to the paramagnetic metal (Balayssac et al., 2007b, Balayssac et al., 2007a, Pintacuda et al., 2007); recently, PRE experiments have also been used to study nucleic acid–protein complexes (Ahmed et al., 2020, Wiegand et al., 2017a, Zehnder et al., 2021). Notably, Ahmed et al. report the first structure of an RNP complex obtained solely from ssNMR data.

Fig. 2 gives an overview of the methods developed and utilized to date to probe nucleic acid–protein interfaces, which will be discussed in detail in the next chapters. A representative set of ssNMR studies of nucleic acid–protein complexes is given in Table 2 in chronological order.
Fig. 2
Overview of ssNMR methods utilized to date to probe nucleic acid–protein interfaces. Information derived from paramagnetic effects, cross-interface dipolar correlations, chemical shifts and peak intensities are highly complementary and can be utilized as restraints in a molecular docking protocol that builds the nucleic acid–protein complex from the structures of the individual components. Pseudo contact shifts (PCSs) measure changes in chemical shifts due to the influence of a paramagnetic ion with anisotropic tensor. Paramagnetic relaxation enhancement (PRE) measures the increase in relaxation of nuclei in the vicinity of an unpaired electron. Both effects can be quantified and translated into nucleus-electron distances. Cross-interface dipolar correlations directly measure intermolecular distances through a variety of dipolar recoupling techniques. The ssNMR toolbox for the measurement of intermolecular distances covers transferred echo double resonance (TEDOR)/ rotational echo double resonance (REDOR)-based recoupling experiments, 13C–13C correlation experiments, such as dipolar assisted rotational resonance (DARR) and combined R2nv-driven (CORD), spin-diffusion based CHHP/NHHP experiments, and 1H-detected CP-based hPH experiments. Chemical shift perturbation (CSP) mapping reports on changes in chemical shifts occurring in the protein upon binding of the nucleic acid, while temperature mapping measures the temperature dependence of 1H chemical shifts, which in turn depends on the involvement of the atom in hydrogen bonds. Comparison of temperature coefficients in the free and nucleic acid-bound states can reveal which protein residues are involved in nucleic acid binding. Quantification of peak intensities can also reveal which atoms are close to the partner molecule in the complex, because of binding-induced line-broadening effects.
Table 2
Selection of ssNMR studies of intermolecular interactions in nucleic acid–protein complexes since 2005.
| Year | Reference | Complex type | Site specific information | Protein (aa) | NA (nt) | ssNMR technique |
|---|---|---|---|---|---|---|
| 2005 | (Olsen et al., 2005) | 1:1 dsRNA–peptide | Yes | 11 | 29 | 31P-19F REDOR |
| 2008 | (Yu and Schaefer, 2008) | dsDNA–protein (intact bacteriophage) | No | not known | 342,000 | 15N-31P, 31P-15N REDOR |
| 2010 | (Jehle et al., 2010) | 1:1 dsRNA–protein | Yes | 123 | 26 | 31P-15N TEDOR |
| 2010 | (Huang et al., 2010) | 1:1 dsRNA–peptide | Yes | 11 | 29 | 13C/15N-19F REDOR |
| 2011 | (Huang et al., 2011) | 1:1 dsRNA–peptide | Yes | 11 | 29 | 13C-31P, 13C-19F, 15N-31P, 15N-19F, 31P-19F REDOR |
| 2011 | (Sergeyev et al., 2011) | ssDNA–protein (intact bacteriophage) | Partially (only for coat protein) | 46 | 7349 | 13C,13C DARR |
| 2011 | (Bechinger et al., 2011) | Oligomeric dsDNA–peptide | No | 27 | not known | 15N-31P REDOR |
| 2013 | (Asami et al., 2013) | 1:1 dsRNA–protein | Yes | 123 | 26 | Intensity mapping (Dipolar-coupling-mediated line broadening) |
Chemical shift perturbations and intensity changes
Both CSPs and peak intensity changes can be used to reveal intermolecular surfaces. CSP mapping can be achieved with all types of NMR spectra. In 13C,15N-detected ssNMR, chemical-shift differences measured in 2D 13C–13C DARR spectra (Ahmed et al., 2020, Boudet et al., 2019, Wiegand et al., 2016, Wiegand et al., 2019) and 2D CP 13C,15N spectra (Ahmed et al., 2020, Boudet et al., 2019, Wiegand et al., 2019) of 13C,15N-labeled proteins in the free and complexed forms were used to identify surfaces involved in nucleic-acid binding (Fig. 3A–B). Wiegand et al. used 1D 1H-31P CP spectra to detect binding of a protein to (dT)20 ssDNA, whose 31P peaks shifted from ∼ –1 ppm in the free form to 0–1 ppm in the protein-bound form (Wiegand et al., 2016, Wiegand et al., 2019). However, nucleic acid CSPs have never been used to reveal the nucleic acid residues involved in protein recognition. This task is indeed challenging, especially for RNAs, as these do not always adopt a single conformation in the free form and thus the observed CSPs may report on both folding and binding.
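As an illustration of how CSP mapping is commonly turned into a list of candidate interface residues, the sketch below combines 13C and 15N shift differences into one weighted value per residue and flags residues that exceed the mean by one standard deviation. The residue labels, shift values, the 15N scaling factor of 0.4 and the threshold rule are all assumptions chosen for the example; they are not taken from the studies cited above.

```python
import math
from statistics import mean, stdev


def combined_csp(delta_c_ppm, delta_n_ppm, n_scale=0.4):
    """Weighted combined shift change; n_scale down-weights 15N (assumed value)."""
    return math.sqrt(delta_c_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)


def flag_interface(shift_changes, n_sd=1.0):
    """Return residues whose combined CSP exceeds mean + n_sd * standard deviation."""
    csps = {res: combined_csp(dc, dn) for res, (dc, dn) in shift_changes.items()}
    threshold = mean(csps.values()) + n_sd * stdev(csps.values())
    return sorted(res for res, value in csps.items() if value > threshold)


# Hypothetical (free minus bound) shift differences in ppm: (13C, 15N).
example = {
    "K10": (0.05, 0.10),
    "R41": (0.90, 2.00),
    "G55": (0.02, 0.05),
    "E60": (0.10, 0.20),
}

if __name__ == "__main__":
    print(flag_interface(example))
```

Residues flagged in this way would typically enter a docking protocol as ambiguous restraints rather than as explicit distances, because CSPs can also arise from allosteric changes away from the interface.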
MAS ssNMR techniques to measure internuclear distances were first applied to an RNA–peptide complex by Olsen et al., who, however, did not measure intermolecular distances but detected peptide binding through the change in the value of a 19F-31P distance in the RNA (Olsen et al., 2005). The 19F atom was introduced as a 2′-fluorine at an individual site, while the 31P belonged to a single phosphorothioate (pS) label in the RNA backbone and thus could be easily assigned (Fig. 5A–B). The distance was measured in a 31P-19F REDOR experiment (Gullion and Schaefer, 1989a, Gullion and Schaefer, 1989b, Merritt et al., 1999) (Fig. 5C). Distance measurements between two unique sites had been used before to detect binding in a DNA–protein complex (Yu et al., 2004) and in DNA–small molecule complexes (Mehta et al., 2004, Olsen et al., 2003). The introduction of fluorine atoms at individual positions in nucleic acids, either in the ribose as 2′-F, or in the base, as 5-fluorouridine (5FU) and 5-fluorocytidine (5FC), has become popular because of the large chemical shift dispersion of 19F and the high sensitivity of its chemical shift to the structural and chemical environment (Hennig et al., 2007, Marchanka et al., 2018a, Scott et al., 2004). Moreover, substitution of 1H by 19F in nucleic acids has a negligible effect on both structure (Hennig et al., 2007, Merritt et al., 1999) and intermolecular interactions (Olsen et al., 2005).
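REDOR-type experiments report on the heteronuclear dipolar coupling, which scales with the product of the two gyromagnetic ratios and with the inverse cube of the internuclear distance. The sketch below evaluates the standard point-dipole expression d = (μ0/4π)·|γI·γS|·ħ/(2π·r³) for spin pairs from Table 1 and inverts it to recover a distance. It is a minimal illustration with assumed function names; it does not model the REDOR dephasing curve, relaxation or molecular motion.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1
HBAR = 1.054571817e-34     # J s

# Gyromagnetic ratios from Table 1, in rad T^-1 s^-1.
GAMMA = {"13C": 6.7283e7, "15N": -2.7126e7, "19F": 25.1815e7, "31P": 10.8394e7}


def dipolar_coupling_hz(gamma_i, gamma_s, r_angstrom):
    """Magnitude of the dipolar coupling constant in Hz for a spin pair at r (in Angstrom)."""
    r_m = r_angstrom * 1.0e-10
    return MU0_OVER_4PI * abs(gamma_i * gamma_s) * HBAR / (2.0 * math.pi * r_m ** 3)


def distance_from_coupling(gamma_i, gamma_s, coupling_hz):
    """Invert the r^-3 dependence to obtain the distance in Angstrom."""
    prefactor = dipolar_coupling_hz(gamma_i, gamma_s, 1.0)  # coupling at 1 Angstrom
    return (prefactor / coupling_hz) ** (1.0 / 3.0)


if __name__ == "__main__":
    for r in (5.0, 8.0, 13.0):
        d = dipolar_coupling_hz(GAMMA["31P"], GAMMA["19F"], r)
        print(f"31P-19F at {r:4.1f} A: {d:7.1f} Hz")
```

Because both 31P and 19F have high gyromagnetic ratios, the 31P-19F coupling remains sizeable at distances where a 13C-15N coupling would already be very small, which is one reason why such pairs are attractive for long-range measurements.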
The first implementation of PREs in ssNMR of a protein–nucleotide complex took advantage of the fact that ATP-binding is often accompanied by binding of Mg2+ ions, which can be substituted by paramagnetic ions such as Mn2+ (Bonneau and Legault, 2014, Tamaki et al., 2016) and Co2+ (Balayssac et al., 2008, Bertini et al., 2010b) without affecting protein function (Otting, 2010). The presence of a paramagnetic center can cause an increase of the relaxation rates (paramagnetic relaxation enhancement, PRE) and/or a change in the chemical shifts (pseudo-contact shifts, PCS) of close-by nuclei (Jaroniec, 2012). These effects depend quantitatively on the distance between the nucleus and the paramagnetic center. Through the analysis of signal intensities in 2D 13C,13C DARR spectra of the diamagnetic (in the presence of Mg2+) and paramagnetic (in the presence of Mn2+) protein–nucleotide complex, Wiegand et al. identified protein residues close to the nucleotide binding site (Wiegand et al., 2017a). Recently, Zehnder et al. extended this approach to a ssDNA–protein–nucleotide complex, in which Mn2+ and Co2+ ions were used as paramagnetic centers (Zehnder et al., 2021). 2D 13C,13C DARR, as well as 2D NCA and 3D NCACB spectra (Baldus et al., 1998) were recorded for diamagnetic and paramagnetic ssDNA–protein–nucleotide complexes (Fig. 9A) to localize the metal ion in the complex. Moreover, PCSs of the (dT)20 ssDNA observed in 1D 1H-31P CP spectra allowed localization of the DNA phosphate groups in the complex (Fig. 9B).
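For orientation, the distance dependences referred to above can be written in their standard textbook (point-dipole) forms, which are not spelled out in the studies reviewed here. The symbols are assumptions of this sketch: r is the distance between the nucleus and the paramagnetic center, θ and φ are the polar angles of the nucleus in the principal frame of the magnetic susceptibility anisotropy tensor, and Δχ_ax and Δχ_rh are the axial and rhombic components of that tensor.

```latex
% PRE: the paramagnetic contribution to the relaxation rate falls off as r^{-6}
\Gamma^{\mathrm{PRE}} \propto \frac{1}{r^{6}}

% PCS: falls off as r^{-3} and depends on the orientation relative to the tensor
\delta^{\mathrm{PCS}} = \frac{1}{12\pi r^{3}}
  \left[ \Delta\chi_{\mathrm{ax}} \left( 3\cos^{2}\theta - 1 \right)
       + \frac{3}{2}\, \Delta\chi_{\mathrm{rh}} \sin^{2}\theta \cos 2\varphi \right]
```

The steeper r^{-6} dependence makes PREs very sensitive close to the paramagnetic center, whereas the r^{-3} dependence of PCSs extends the accessible range and adds angular information.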
Structure determination of an RNP complex by ssNMR only
Ahmed et al. succeeded in determining the structure of the complex formed by the Pf protein L7Ae and the 26mer Box C/D RNA using exclusively ssNMR-derived restraints (Ahmed et al., 2020). They achieved this goal through a docking protocol, which started from the individual structures of protein-bound 26mer RNA determined by ssNMR (Marchanka et al., 2015) and RNA-bound L7Ae protein (Xue et al., 2010). The RNA-bound protein structure was determined by X-ray crystallography, but in principle it would have been accessible by ssNMR through well-established methodology (Schubeis et al., 2018, Zhao, 2012). The docking protocol was implemented in the program Haddock (Dominguez et al., 2003) and used a combination of ssNMR-derived CSP and PRE data to guide complex assembly.
PRE-based restraints were derived from the quantification of PRE effects observed on nucleotide-type specific 13C,15N labeled (Alab and Ulab) RNA in 2D 13C,13C SPC-5 spectra and induced by three different paramagnetic tags coupled to the unlabeled protein. PRE effects were quantified as peak volume ratios in spectra measured for the same sample in either the paramagnetic state (with the nitroxide radical coupled to a cysteine residue) or the diamagnetic state (after addition of ascorbic acid to reduce the radical). These ratios (V) were converted into distance restraints in a semi-quantitative manner, whereby the intramolecular V ratios of the protein peaks, whose electron-nucleus distances were known from the protein structure, were used to calibrate a linear regression between the V ratio and the electron-nucleus distance. Although not rigorously correct, this linear regression, with the appropriate tolerance bounds, was good enough to deliver broad distance ranges that defined the relative position of the protein and the RNA to a good level of precision.
CSP-derived ambiguous restraints were measured for the protein in 2D 13C–13C DARR and 2D CP 13C,15N spectra of uniformly 13C,15N labeled protein in the free and RNA-bound states.
In total, 72 restraints were used for the docking and led to a 26mer Box C/D RNA–L7Ae structure that is very similar to that of an orthologous Box C/D RNA–L7Ae complex determined by crystallography, thus verifying the accuracy of the ssNMR-derived structure (Fig. 10). This exemplary study demonstrates the ability of ssNMR to yield nucleic acid–protein complex structures at atomic resolution.
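The semi-quantitative calibration described above can be sketched in a few lines. The numbers below are synthetic and chosen only to make the example easy to follow, the function names are hypothetical, and the 5 Å tolerance is an assumed placeholder rather than the value used by Ahmed et al.; the sketch simply fits a straight line to (V, distance) pairs with known distances and turns a new V ratio into a distance range.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def distance_range(v_ratio, slope, intercept, tolerance=5.0):
    """Convert a V ratio into (lower, upper) distance bounds in Angstrom.

    tolerance is an assumed, illustrative half-width of the restraint.
    """
    r = slope * v_ratio + intercept
    return max(0.0, r - tolerance), r + tolerance


# Synthetic calibration set: V ratios of protein peaks and their
# electron-nucleus distances (Angstrom), taken as known from the structure.
protein_v = [0.2, 0.4, 0.6, 0.8]
protein_r = [10.0, 14.0, 18.0, 22.0]

slope, intercept = fit_line(protein_v, protein_r)

# Hypothetical V ratio measured for an RNA peak in the same sample.
lower, upper = distance_range(0.5, slope, intercept)
print(f"restraint: {lower:.1f} to {upper:.1f} A")
```

Broad ranges of this kind are deliberately tolerant: they are meant to position the two molecules relative to each other during docking, not to pin down individual atom pairs.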
The ssNMR methods discussed here build a suite of complementary experiments that identify interaction interfaces and yield intermolecular distance restraints in nucleic acid–protein complexes. They provide a powerful tool to determine the structural basis of intermolecular recognition for those nucleic acid–protein complexes that are not amenable to X-ray crystallography, cryo-EM or solution-state NMR. The structure of the individual components of the complex can be obtained by ssNMR as well, with previously published and often reviewed methodology (Marchanka and Carlomagno, 2019, Schubeis et al., 2018, Zhao, 2012).
In 2021 the program AlphaFold (Jumper et al., 2021) revolutionized structural biology of folded protein domains and their complexes by demonstrating an unprecedented accuracy in the prediction of protein folding and interactions based on database knowledge. Nucleic acid structures, in particular those of RNA, are difficult to predict, as they largely depend on the environment and on the binding partners. Thus, it is unclear whether AlphaFold will ever be expanded to this class of polymers. In view of this, NMR spectroscopy, with its power to illuminate intermolecular interactions involving flexible molecules, gains unique relevance.
To date, many studies of nucleic acid–protein complexes have focused on obtaining site-specific information for the proteins in the complex. The nucleic acid component has often been neglected due to lack of spectral resolution and limited access to advanced isotope labeling techniques. In the past few years, the ssNMR toolkit for studying nucleic acids and their complexes has grown steadily, including both technical developments in MAS ssNMR and isotope labeling techniques. Consequently, we expect MAS ssNMR of nucleic acid–protein complexes to rapidly grow in relevance and scope, including site-specific structural information on both the protein and nucleic acid components of the complex.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.