¹³C Direct Detected NMR for Challenging Systems.

Isabella C Felli¹, Roberta Pierattelli¹.

Abstract

Thanks to recent improvements in NMR spectrometer hardware and pulse sequence design, modern 13C NMR has become a useful tool for biomolecular applications. The complete assignment of a protein can be accomplished by using 13C detected multinuclear experiments and it can provide unique information relevant for the study of a variety of different biomolecules including paramagnetic proteins and intrinsically disordered proteins. A wide range of NMR observables can be measured, concurring to the structural and dynamic characterization of a protein in isolation, as part of a larger complex, or even inside a living cell. We present the different properties of 13C with respect to 1H, which provide the rationale for the experiments developed and their application, the technical aspects that need to be faced, and the many experimental variants designed to address different cases. Application areas where these experiments successfully complement proton NMR are also described.

Year: 2022 PMID： 35025504 PMCID： PMC9136920 DOI： 10.1021/acs.chemrev.1c00871

Source DB: PubMed Journal: Chem Rev ISSN： 0009-2665 Impact factor: 72.087

Introduction

NMR spectroscopy is an indispensable tool for investigations of biological molecules and their interactions. The power of NMR to link structural, dynamic, kinetic, and thermodynamic information makes it an essential component of cutting-edge research in structural biology. Provided NMR spectra can be acquired with high resolution and sensitivity, a virtually unlimited amount of atomic-resolution information can be obtained starting from chemical shift values, nuclear spin relaxation rates, scalar couplings, exchange effects, diffusion coefficients, and other highly sophisticated observables. This places NMR in a unique position with respect to the many spectroscopic methods that provide global information and to other high-resolution methods, such as X-ray crystallography and cryo-electron microscopy, which however fail to provide information about macromolecules in solution, in particular highly dynamic or heterogeneous ones. Needless to say, continuous development of the experimental approach is necessary to make the best use of this powerful spectroscopic technique and to extend the complexity of the systems under investigation, as required by the many challenges in biomedical research. These would benefit from the availability of high-resolution structural and dynamic details on biologically relevant molecular components. This in turn triggers instrumental technological improvements that enable expansion of the applications to problems of increasing complexity. We focus here on the example of carbon-13 direct detection NMR in solution, one of the widely used tools to characterize biological macromolecules. We start by introducing the key properties of heteronuclear spins that make heteronuclear direct detection interesting and discuss the many strategies developed to overcome potential critical points, such as the problem posed by homonuclear decoupling in the direct acquisition dimension. 
The issue of the starting polarization source as well as coherence transfer methods is described since they constitute common basic ingredients of more complex NMR experiments. The flow of the review then proceeds to illustrate the many experiments developed, focusing on the applications where they reveal unique additional information with respect to more conventional approaches. Current challenging research areas where these methods provide useful data are also presented.

Properties of Heteronuclear Spins

The term “heteronuclear” refers to nuclear spins other than protons, the first ones to be considered in general in NMR spectroscopy (the magnetic field of an instrument, B0, is indeed often indicated through the 1H Larmor frequency). The most widely used nuclear spins for biomolecular NMR investigations are 1H, 13C, 15N, and 31P (the latter of interest for the investigation of nucleic acids); we will focus here on the contributions of 13C and 15N to the study of proteins even if similar arguments can of course be extended to nucleic acids (including 31P) and carbohydrates. While sharing the same spin angular momentum (S = 1/2), heteronuclei have a smaller gyromagnetic ratio (in absolute value) than the proton and thus a smaller magnetic moment. The latter, which interacts with the external magnetic field B0 and determines the Larmor frequency, is responsible for the intrinsic sensitivity, which scales down moving from 1H to 13C to 15N.[1] The magnetic moment associated with nuclear spins is also responsible for the dipole–dipole (DD) interactions with other nuclear spins in the surroundings. In isotropic solution these interactions are largely averaged by fast molecular tumbling; they do not affect line positions and, unlike the scalar coupling, do not cause line splitting. They are however the major interactions that promote nuclear relaxation. Therefore, in principle heteronuclear spins sense lower dipolar contributions to relaxation as a result of their lower magnetic moments. Considering the interaction between two dipoles at a specified distance, heteronuclei sense a lower contribution to relaxation with respect to protons by a factor of approximately (γX/γH)² (neglecting contributions from the different spectral densities). Paramagnetic contributions to nuclear relaxation provide a clear example in this respect. 
Once a particular paramagnetic center is defined, dipolar contributions to relaxation of nuclear spins in the surroundings depend on the distance (1/r⁶) and on the square of the gyromagnetic ratio (γX²) of the nucleus itself (as well as on spectral densities). Therefore, at a fixed distance from the paramagnetic center, heteronuclear spins sense a smaller dipolar interaction and thus a smaller contribution to relaxation with respect to protons; in other words, the so-called paramagnetic relaxation enhancements at a specified distance from a paramagnetic center are smaller for heteronuclei than for protons. Similar arguments hold for diamagnetic contributions to relaxation: dipolar interactions involving heteronuclear spins are weaker than those involving protons, although the outcome is not easy to generalize in this case because the final result depends on many contributions, which in turn depend on the local chemical structure. On the other hand, contributions to dipolar relaxation deriving from proton spins are generally dominant, both for protons themselves and for heteronuclei, due to the large γH, a situation that pushed the development of isotopic enrichment in deuterium to reduce the bath of proton dipoles in which heteronuclei are immersed and thus reduce contributions to nuclear relaxation.[2−4] Another example in this respect is provided by 15N relaxation, often measured to investigate local dynamics in proteins. In this case, the dominant dipolar contribution to 15N relaxation is provided by the directly bound proton. The other striking difference when moving from protons to heteronuclei is the influence of the electronic structure/chemical environment on signal chemical shifts. Although the chemical shift tensor is largely averaged in solution by fast molecular tumbling, its isotropic part determines peak positions. 
The major contributions to chemical shifts thus derive from the local molecular topology, which translates into the different chemical shifts expected for the different functional groups, a property often exploited in the study of small molecules. In proteins, the chemical structure of the different amino acids as well as their link through the peptide bond is the major determinant of the observed chemical shifts. The local 3D structure provides an additional contribution, which clearly shows a general trend toward larger chemical shift dispersion for heteronuclei with respect to protons, as can be appreciated by inspecting the Biological Magnetic Resonance Data Bank (BMRB, https://bmrb.io/), the database in which chemical shifts of assigned proteins are deposited. The anisotropic part of the chemical shift tensor instead contributes to nuclear relaxation. Therefore, large contributions to relaxation are expected when significant chemical shift anisotropy (CSA) is present, for example for nuclear spins involved in the peptide bond or in aromatic rings, certainly an important aspect to consider when exploiting heteronuclear spins. Several solutions have indeed been proposed to make constructive use of the interference between CSA and DD to mitigate the contributions to transverse relaxation that may broaden lines beyond detection for globular proteins of increasing molecular mass due to the increasing rotational correlation time. On the other hand, these contributions to nuclear relaxation do not have a detrimental impact when focusing on highly flexible proteins, in which the fast motions reduce contributions to transverse relaxation. Scalar couplings have a strong impact on the spectra, in particular in the direct acquisition dimension, and their magnitude and topology are also influenced by the type of nuclei under investigation. 
These are mediated by electronic effects and depend on many factors including the gyromagnetic ratio of the nuclear spins involved in the coupling as well as the electronic structure and the local geometry. However, the most striking difference moving from 1H to 13C relates to the fact that large one-bond homonuclear scalar couplings are present when focusing on 13C nuclear spins in uniformly labeled samples, while analogous ones are not observed for protons. This property, widely exploited in many multidimensional experiments to achieve coherence transfer, also causes very large signal splittings, and the resulting complex multiplet structures complicate direct detection of 13C. This constitutes an important aspect to be considered in order to convert 13C direct detection into a useful tool for the study of complex macromolecules. On the other hand, when moving to multiple bond effects, these are smaller than the analogous ones involving 1H (for example, ranges for 3JHH are often larger than those for 3JCC in aliphatic chains). Heteronuclei constitute the molecular backbone; protons are at the edge of chains of chemical bonds and in many cases form the exposed surfaces of macromolecules. This property is often used in the study of interactions in which changes in 1H chemical shifts are investigated to identify interaction surfaces. Heteronuclei are of course also affected by changes in their surroundings but are more sensitive to changes in the local structure, such as changes in dihedral angles, and these complementary features can prove useful. Proteins are studied in water, the solvent of life. Interactions of proteins with water are key for many aspects: they prompt polypeptide chains to fold into stable globules or to remain flexible and solvent-exposed as well as to create membrane-less organelles through liquid–liquid phase separations. Therefore, the interactions with the solvent have a very relevant role in protein function. 
Proton NMR can be used to detect changes with pH or protonation state of a protein, but it is also influenced by exchange processes that, depending on their magnitude, can broaden lines beyond detection. Heteronuclei in this context can act as “spies” of nonprotonated states providing information also in cases in which protons are not present or when fast exchange between the free and bound forms broadens 1H NMR lines beyond detection.
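
The (γX/γH)² factor discussed in this section can be put into numbers with a short Python sketch. It is only illustrative: the gyromagnetic ratios are rounded literature values, the function name is hypothetical, and differences in spectral densities are neglected, as in the text.

```python
# Gyromagnetic ratios in 1e6 rad s^-1 T^-1 (rounded literature values).
GAMMA = {"1H": 267.522, "13C": 67.283, "15N": -27.116}

def dipolar_scaling(nucleus, reference="1H"):
    """Return (gamma_X / gamma_ref)**2, the approximate factor by which a
    dipolar contribution to relaxation is scaled at a fixed distance."""
    return (GAMMA[nucleus] / GAMMA[reference]) ** 2

for nucleus in ("13C", "15N"):
    print(f"{nucleus}: (gamma_X/gamma_H)^2 = {dipolar_scaling(nucleus):.4f}")
```

With these values the factor is roughly 0.06 for 13C and 0.01 for 15N, that is, about 16-fold and 100-fold smaller dipolar contributions than for a proton at the same distance.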

Building Blocks of NMR Experiments: What’s New for 13C Direct Detection?

Heteronuclear spins have interesting properties that are also exploited in the indirect dimensions of many multidimensional NMR experiments based on 1H direct detection (“inverse detection” of heteronuclei). We would like to discuss here the most important aspects to consider when moving to 13C detection. Starting from simple 1D 13C NMR spectra, often used also for the study of small molecules to identify specific functional groups or coupling topologies, the large chemical shift dispersion is accompanied, when studying isotopically labeled macromolecules, by the onset of complicated multiplets determined by the large one-bond homonuclear 13C–13C scalar couplings. These can range from simple doublets, for 13C nuclei that have only one directly bound 13C, to the more complicated multiplet structures observed for the different side-chains with 13C nuclei directly bound to two or three other 13C nuclei. The onset of complicated multiplets, which on one hand can provide valuable information regarding the type of spin system, is detrimental when seeking the high-resolution information needed to study macromolecules. Indeed, the complex multiplet structure of 13C signals deriving from homonuclear scalar couplings drastically reduces both sensitivity and resolution of the NMR spectra, two key features for the study of complex macromolecules. Several strategies have become widely used to suppress these homonuclear couplings in the indirect dimensions of NMR spectra such as the inclusion of constant time evolution periods, the use of band-selective inversion pulses to refocus scalar coupling evolution, etc.[5] In many cases, the resolution that can be achieved in the indirect dimension is however limited by the number of points that can be acquired in the time that can be dedicated to a specific experiment and the reduction in resolution brought about by signal splitting is seldom a limiting factor. 
On the other hand, when nuclear spins are investigated in the direct acquisition dimension, the FID can in principle be acquired as long as desired (just limited by the transverse relaxation properties of the system under investigation rather than by the time needed to accumulate increments to construct an indirect dimension). This constitutes a contribution to spectral resolution, provided the complex multiplet structures are simplified by the implementation of homonuclear decoupling. The problem of 13C homonuclear decoupling is more demanding than heteronuclear decoupling because the two nuclear spins involved in the coupling are close in frequency and thus radio frequency irradiation on one of them can be sensed by the other. However, several elegant solutions have been proposed and constitute a relevant aspect for the design of NMR experiments for the study of complex biomolecules.
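
The loss of sensitivity caused by multiplet splitting can be illustrated with a minimal first-order sketch in Python. The coupling values (about 55 Hz for C′–Cα and about 35 Hz for Cα–Cβ) are assumed typical numbers used only for illustration, and the function name is hypothetical.

```python
from itertools import product

def multiplet_lines(couplings_hz):
    """Return sorted (offset_hz, relative_intensity) pairs for a spin-1/2
    nucleus split by first-order couplings to other spin-1/2 nuclei."""
    lines = {}
    weight = 1.0 / 2 ** len(couplings_hz)
    # Each coupled partner is either "up" or "down", shifting the line by +/- J/2.
    for signs in product((-0.5, 0.5), repeat=len(couplings_hz)):
        offset = sum(s * j for s, j in zip(signs, couplings_hz))
        lines[offset] = lines.get(offset, 0.0) + weight
    return sorted(lines.items())

# Assumed typical one-bond couplings in Hz.
carbonyl = multiplet_lines([55.0])        # doublet: C' coupled to Calpha only
c_alpha = multiplet_lines([55.0, 35.0])   # doublet of doublets: Calpha coupled to C' and Cbeta

print(carbonyl)
print(c_alpha)
```

Under these assumptions each carbonyl line carries half of the total intensity and each Cα line only a quarter, which is the sensitivity and resolution penalty that homonuclear decoupling aims to recover.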

Homonuclear Decoupling

Let us start by discussing the “simple” case of a 13C nuclear spin that has only one large one-bond homonuclear coupling, such as carbonyl nuclear spins (13C′) in protein backbones, which share a large one-bond scalar coupling with 13Cα nuclear spins. One nice feature of the one-bond scalar couplings (1JC′C) is that they are not as variable as, for example, three-bond couplings, which largely depend on the local conformation of the molecule. In the case of carbonyl moieties of the protein backbone, the one-bond scalar coupling with Cα is fairly uniform throughout the primary sequence and relatively independent of the type of amino acid and of the local conformation,[6] a property that renders the problem of homonuclear decoupling generally tractable and easier to address than, for example, 1H homonuclear decoupling. The most straightforward approach thus consists of deconvolution of the spectra, exploiting a defined value of the coupling. Initially proposed for indirect dimensions of triple resonance experiments[7] and implemented for 13C detected experiments,[8,9] this approach has been recently revived by the incorporation of AI methods.[10] The other possibility, provided the chemical shifts of the two nuclear spins involved are sufficiently different to allow for their selective irradiation, consists of band-selective homonuclear decoupling in which the acquisition time is shared between acquisition and decoupling mode in alternating time intervals (Figure 1).[11−13] This approach requires an additional radio frequency channel; decoupling sidebands are observed depending on the frequency of acquisition and decoupling periods. An elegant alternative approach to homonuclear decoupling in which the acquisition time is shared between acquisition and 180° refocusing pulses is BASHD.[14] Introduced for solid-state NMR experiments,[15] it was adapted for direct detection of carbonyl carbon nuclei in solution.

Figure 1

Homonuclear decoupling strategies initially implemented for 13C direct detection: band-selective homonuclear decoupling (top panel), virtual decoupling achieved through the IPAP (middle panel), and S3E (bottom panel) approaches. A scheme illustrating the linear combinations performed to achieve virtual decoupling is also reported on the right side of the middle panel. The contribution of the IP and AP components to the two independent FIDs acquired through the S3E approach is also schematically indicated on the right side of the bottom panel.

The most widespread approach used nowadays is virtual homodecoupling, which exploits spin-state-selective methods in which scalar couplings are preserved and different components of the signal are collected: these constitute the basis to achieve virtual decoupling through the appropriate linear combinations of the acquired signals to separate the different multiplet components, followed by a shift to the center of the original split signal.[16] Several experimental variants based on this idea have been developed and may prove useful for different applications. The most straightforward implementation of this idea is the IPAP approach[17] in which the in-phase (IP) and antiphase (AP) components of the carbonyl carbon signals (with respect to Cα) are acquired and separately stored (Figure 1).[18−20] The postacquisition treatment of the acquired data, which can also be performed directly in the time domain, enables the removal of the large one-bond splitting from the spectra. 
Interestingly, this approach preserves the coupling and can in principle be applied to mutually decouple and still observe the two nuclear spins involved in the coupling, as demonstrated in solid-state applications.[21] An experimental variant in which AP and IP signals for virtual decoupling are acquired sequentially in a single scan has been recently proposed.[22] When fast transverse relaxation becomes a limiting factor, shorter experimental variants can be implemented.[18,20,23] Indeed the IPAP approach relies on complete interconversion between the IP and AP components of the signal and requires a time that is inversely proportional to the scalar coupling itself (1/(2JC′C)). However, partial interconversion between in-phase and antiphase already provides the two components in half the time; in this case, changing the sign of one of the two components in alternate scans allows storage of two FIDs that can be used to separate the two multiplet components needed to perform homonuclear decoupling through appropriate manipulations of the acquired FIDs (Figure 1).[20] Direct acquisition of the AP signal component was also proposed for systems in which transverse relaxation is a key limitation such as for very large proteins[24] or for paramagnetic ones.[25] Finally, different variants exploiting sensitivity enhancement strategies were proposed for homonuclear decoupling in COCA (COCAINE)[26] and CON[27] experiments. They were also shown to be useful alternatives for heteronuclear decoupling.[28] These principles implemented for backbone carbonyl homonuclear decoupling can of course be extended to analogous cases such as 13C nuclear spins that have only one large scalar coupling with a second 13C nuclear spin whose chemical shift is sufficiently different to allow band-selective inversion of the two spins independently. 
Indeed this strategy also performs well for amino acid side-chains that have a carbonyl/carboxylate moiety such as aspartate, glutamate, asparagine and glutamine residues. It has also been implemented to investigate terminal nuclear spins of aliphatic side-chains and to decouple them from their next neighbors.[29−32] The situation becomes more complex for spins that are coupled to more than one additional 13C spin through large one-bond scalar couplings such as 13Cα spins (for all amino acids except glycine) as well as for the vast majority of spins of amino acid side-chains. The ideas described can in principle be extended also to this case; selected applications were so far implemented. As an example, several approaches were proposed to perform homonuclear decoupling of 13Cα signals by clever combinations of spin-state selective approaches (DIPAP, DS3E).[20,30] The capacity to selectively invert 13Cα from 13Cβ (in addition to 13C′) constitutes a key feature, and some compromises may be necessary for signals with similar 13Cα and 13Cβ shifts that fall close to the transition regions of the inversion profiles of the band-selective pulses employed. The acquisition of the different components also has a cost in terms of overall sensitivity. Despite these additional complications, these methods were successfully used to investigate very large proteins in which extensive isotopic enrichment was mandatory and for which carbonyl carbon direct detection provided too broad lines.[30] Similar approaches were also proposed for the investigation of nucleic acids through 13C direct detection[33−37] as well as to focus on aromatic residues in proteins.[38]
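
The linear combinations behind IPAP virtual decoupling can be sketched in the time domain on synthetic data. This is a minimal illustration and not the processing used in the cited work: the coupling (55 Hz), offset, T2, and sampling are assumed values, the AP FID is taken as already phase-corrected so that it combines directly with the IP FID, and the function names are hypothetical.

```python
import cmath
import math

def ipap_virtual_decoupling(fid_ip, fid_ap, times, j_hz):
    """Combine in-phase (IP) and antiphase (AP) FIDs in the time domain.

    IP + AP and IP - AP isolate the two doublet components; each is shifted
    by J/2 toward the center of the doublet and the two are then summed.
    """
    decoupled = []
    for ip, ap, t in zip(fid_ip, fid_ap, times):
        high = (ip + ap) * cmath.exp(-1j * math.pi * j_hz * t)  # line at nu0 + J/2
        low = (ip - ap) * cmath.exp(+1j * math.pi * j_hz * t)   # line at nu0 - J/2
        decoupled.append(high + low)
    return decoupled

def dft_at(fid, times, freq_hz):
    """Real part of the discrete Fourier sum of the FID at one frequency."""
    return sum(s * cmath.exp(-2j * math.pi * freq_hz * t)
               for s, t in zip(fid, times)).real

# Synthetic data; every number below is an illustrative assumption.
J_HZ, NU0_HZ, T2_S = 55.0, 120.0, 0.05
times = [k / 2000.0 for k in range(1024)]
envelope = [math.exp(-t / T2_S) * cmath.exp(2j * math.pi * NU0_HZ * t) for t in times]
fid_ip = [math.cos(math.pi * J_HZ * t) * e for t, e in zip(times, envelope)]
fid_ap = [1j * math.sin(math.pi * J_HZ * t) * e for t, e in zip(times, envelope)]

fid_dec = ipap_virtual_decoupling(fid_ip, fid_ap, times, J_HZ)
center = dft_at(fid_dec, times, NU0_HZ)           # intensity at the doublet center
side = dft_at(fid_dec, times, NU0_HZ + J_HZ / 2)  # intensity at a former doublet line
print(f"center/side intensity ratio: {center / side:.1f}")
```

In this idealized case the combined FID equals twice the uncoupled FID, so the spectrum collapses to a singlet at the center of the doublet. Real data add noise, imperfect IP/AP balance, and relaxation during the additional delays, none of which is modeled here.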

Starting Polarization Source

The starting polarization source used in NMR experiments also constitutes an important element that can be used to modulate experimental sensitivity. On the other hand, the latter also depends on the kind of information desired. Direct detection of heteronuclei for biomolecular applications in solution, after pioneering work in 1988,[39] was abandoned in favor of inverse detection methods.[40−42] It was proposed again in the early 2000s as a strategy to recover information on paramagnetic proteins in regions in which proton resonances are broadened beyond detection due to paramagnetic relaxation enhancements.[38,43−47] Dipolar contributions to relaxation sensed by nuclear spins when a paramagnetic center is present in a molecule depend on the properties of the paramagnetic center itself, on the effective correlation time modulating the interaction, on the electron–nuclear distance (1/r⁶), and on the square of the gyromagnetic ratio of the nuclear spin under investigation (γX²).[48] Therefore, shifting our attention from 1H to 13C and 15N ensures reduction of the paramagnetic enhancement (considering the same distance); from a different point of view, similar enhancements are sensed at shorter distances from the paramagnetic center and thus shifting the focus from 1H to 13C (and to 15N) enables researchers to observe resonances of nuclear spins closer to the paramagnetic center since the additional contributions to relaxation are scaled by (γX/γH)².[49,50] When considering experiments that are more complicated than 1D experiments, these should not actively perturb 1H nuclear spins in any of the coherence transfer steps because this would reintroduce the dependence on 1H transverse relaxation, which is much more sensitive to paramagnetic relaxation enhancements. 
For this reason initial variants of 13C detected experiments for biomolecular applications in solution were based on 13C as a starting polarization source and never exploited protons in any of the coherence transfer steps; these experiments were called “protonless” NMR experiments.[20] These experiments also proved useful for the study of very large proteins in which high levels of deuteration were necessary,[51] reducing the amount of information that could be achieved through proton direct detection.[9,19,52] While focusing on this topic, it became evident that heteronuclear direct detection could also prove useful for different applications in which 1H fast relaxation does not constitute a limiting factor.[23,53,54] For these applications, the use of 1H as a starting polarization source brings a significant increase in experimental sensitivity,[55−58] still exploiting heteronuclear chemical shifts in all the detected dimensions of multidimensional NMR experiments to benefit from their contribution to resolution, a key feature when focusing, for example, on proteins devoid of a stable tertiary structure. These experiments were thus generally referred to as “exclusively heteronuclear” to indicate that only heteronuclei were frequency labeled in all dimensions, regardless of the starting polarization source exploited.[24,28,53,59] Considering for the moment backbone nuclear spins, the proton polarization source can be provided by amide protons (1HN) as well as by aliphatic protons (1Hα) with advantages and disadvantages that depend on the properties of these two nuclei (Figure 2). Amide protons, which can be easily correlated to the directly bound nitrogen and then to carbonyl carbon nuclei through the 1JHN and 1JC′N, respectively, are influenced by solvent exchange that can become so pronounced that coherence transfer becomes inefficient. 
In addition proline residues do not have an amide proton, a feature that reduces the information content of CON spectra that exploit 1HN polarization as a starting source, in particular for proline-rich proteins.
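
The distance argument for paramagnetic relaxation enhancements made earlier in this section follows from the γX²/r⁶ dependence: equal enhancements are reached when the heteronucleus sits at r_X = r_H·|γX/γH|^(1/3). A minimal Python sketch, assuming rounded literature gyromagnetic ratios, identical spectral densities for the two nuclei, and an arbitrary proton distance of 10 Å chosen only for illustration (the function name is hypothetical):

```python
# Gyromagnetic ratios in 1e6 rad s^-1 T^-1 (rounded literature values).
GAMMA = {"1H": 267.522, "13C": 67.283, "15N": -27.116}

def equivalent_distance(r_proton, nucleus):
    """Distance at which `nucleus` senses the same dipolar paramagnetic
    enhancement that a proton senses at r_proton (rate ~ gamma**2 / r**6)."""
    return r_proton * abs(GAMMA[nucleus] / GAMMA["1H"]) ** (1.0 / 3.0)

for nucleus in ("13C", "15N"):
    r_x = equivalent_distance(10.0, nucleus)
    print(f"{nucleus}: {r_x:.1f} (same units as the proton distance)")
```

Under these assumptions a 13C spin at about 6.3 Å and a 15N spin at about 4.7 Å experience the enhancement that a proton experiences at 10 Å, which is why heteronuclear detection can report on spins closer to the paramagnetic center.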

Figure 2

Intensity of one of the cross peaks obtained in 2D CON experiments (the correlation for C′59–N60 of ubiquitin indicated by an arrow in the full spectrum reported on the left) recorded with different pulse schemes and different interscan delays (d1) is compared by showing its trace. From left to right: 13C-start (relaxation delay d1 = 2.5 s), 1H-start (d1 = 1.5 s), and 1Hα-flip with different d1 (1.5, 1.2, 0.9, 0.7, 0.5, 0.35, 0.2 s). Adapted from ref (59). Copyright 2009 American Chemical Society.

Backbone aliphatic protons (Hα) are instead present for all amino acids and are nonexchangeable nuclei. Therefore, experiments based on 1Hα as a starting polarization source can in principle provide complete information. Also in this case coherence transfer pathways exploiting large scalar couplings (1JH, 1JC, and 1JC′N) enable the transfer of 1H polarization to backbone nuclear spins in an efficient way without major losses; care should be taken to consider the role of 13Cα–13Cβ couplings in the coherence transfer pathway to ensure that no information is lost for amino acids with similar 13Cα and 13Cβ chemical shifts. 
Longitudinal relaxation enhancement (LRE) strategies can be used to reduce the interscan delay and reduce experimental time (or increase the S/N per unit time).[60] Borrowing ideas proposed for amide proton detected NMR experiments, in which amide protons are selectively perturbed to enhance the recovery to equilibrium and reduce interscan delays (SOFAST, BEST),[61,62] different variants of 1HN-start experiments were proposed exploiting 1HN band-selective pulses (HNBESTCON).[63] It is interesting to note that in these experiments 1H polarization is only used as a starting polarization source so that the longitudinal recovery starts well before acquisition of the FID, a feature that enabled the acquisition of the HNBESTCON without introducing any longitudinal recovery delay after the acquisition of the FID.[63] Similar longitudinal relaxation enhancement approaches would be useful also for 1Hα protons, used as a starting polarization source in several variants of exclusively heteronuclear NMR experiments. However, 1Hα spins are more difficult to manipulate through band-selective pulses than 1HN protons because they fall in a more crowded spectral region. To this end, a variant to selectively manipulate a subset of nuclear spins while leaving others unaffected was proposed that exploits the scalar coupling with the attached heteronuclear spins (Hflip).[59] In this way, longitudinal relaxation enhancement can be implemented in a general way for protons used as a starting polarization source of exclusively heteronuclear NMR experiments, both for 1HN-start and 1Hα-start variants (Figure 2). 
The use of proton–nitrogen cross-polarization, in place of the more widespread INEPT block, was also proposed since this excitation mechanism uses the large water-magnetization reservoir to continuously replenish the amide-proton; combined with the CON reading scheme, this represents another useful approach to study systems in which solvent exchange is very pronounced.[64,65]
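
Why faster longitudinal recovery translates into more signal per unit time can be illustrated with a deliberately simplified model. The model is an assumption made here for illustration and is not taken from the cited work: recovery is mono-exponential with an effective time constant T1, magnetization is fully saturated after each scan, and the scan duration equals the recovery delay d. The S/N per unit time is then proportional to (1 − exp(−d/T1))/√d. Function names and T1 values are hypothetical.

```python
import math

def snr_per_unit_time(delay_s, t1_s):
    """Relative S/N per square root of measuring time for a recovery delay."""
    return (1.0 - math.exp(-delay_s / t1_s)) / math.sqrt(delay_s)

def best_delay(t1_s, step_s=0.001, max_factor=5.0):
    """Grid search for the delay that maximizes snr_per_unit_time."""
    n_steps = int(max_factor * t1_s / step_s)
    delays = [step_s * k for k in range(1, n_steps + 1)]
    return max(delays, key=lambda d: snr_per_unit_time(d, t1_s))

for t1 in (1.0, 0.3):  # hypothetical effective 1H T1 values, in seconds
    d = best_delay(t1)
    print(f"T1 = {t1:.1f} s: best delay ~ {d:.2f} s, "
          f"relative S/N per unit time = {snr_per_unit_time(d, t1):.2f}")
```

In this model the optimal delay is about 1.26 × T1, so shortening the effective T1 both shortens the optimal interscan delay and raises the attainable S/N per unit time, in line with the trend discussed for LRE schemes.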

Suite of 13C Direct Detection NMR Experiments

Sequence-Specific Assignment

Similarly to 1H detected experiments, 13C detected experiments can be differentiated based on the active interactions exploited in the coherence transfer steps (scalar or dipolar coupling), the starting polarization source (13C or 1H), and the kind of detected nuclei (13C′ or 13Cα or others). The experiments initially proposed for assignment purposes were based on 13C-start and 13C-detection and completely avoided protons in any of the magnetization transfer steps (protonless NMR experiments).[20] The set of protonless NMR experiments included various kinds of 2D experiments correlating carbon nuclei such as the 13C–13C COSY, 13C–13C TOCSY, and the CACO MQ correlation experiment.[20,30,43,45,46,66,67] The 13C–13C NOESY experiment also proved useful in cases in which fast transverse relaxation represents a major limitation.[9,19] Despite the 13C–13C cross-relaxation rates being expected to be much smaller than 1H–1H ones, the 13C–13C NOE effects between directly bound carbon nuclei are easily detectable and spin diffusion through their nuclei is a very efficient process able to provide very useful spectra for amino acid type identification. These initial results opened the way to the development of a wide variety of experiments for sequence-specific assignment. A backbone 13C nucleus that can be used to design a whole series of experiments is Cα. It is characterized by a large chemical shift dispersion, so it can provide highly dispersed spectra. However, the coupling to the Cβ present in all amino acids other than glycine makes homonuclear decoupling less straightforward (see the Homonuclear Decoupling section). 
Nevertheless, the CAN experiment is useful to detect the two correlations between Cα and the intra- and inter-residue nitrogen.[50] Variants to highlight the sequential correlation and thus discriminate it from the intraresidue one have been proposed.[68] These also gain in resolution since the inter-residue correlations are generally more resolved than the intraresidue ones.[69] Thanks to the potential of experiments based on Cα direct detection for the study of higher molecular mass proteins, it was proposed to exploit the isotopic labeling strategy designed for solid-state applications that avoids neighboring 13C spins (and thus also the need for homonuclear decoupling in the direct 13C dimension).[51,70] Several experiments based on Cα direct detection exploiting this isotopic labeling scheme were proposed including the CANCA and the CACA TOCSY experiments.[68,71] The most successful applications of 13C direct detected NMR rely on carbonyl carbon detection, which presents a single and uniform splitting that can be easily removed (see the Homonuclear Decoupling section). Carbonyl carbon nuclei can be directly correlated to the neighboring Cα nuclei through a 13C–13C transfer step (1JC′C), which can be further correlated to the Cβ and to the whole aliphatic side chain to obtain the 2D CBCACO and 2D CCCO experiments.[20] By introducing a further transfer step, it is also possible to include an additional dimension in which the attached nitrogen is frequency labeled.[20,72] Therefore, the 3D experiments CACON, CBCACON, and CCCON were designed.[72] These enable the identification of all the spin systems in a protein including those involving proline residues. 
In fact, the correlations between the proline nitrogen and the carbon frequencies of the preceding amino acid (13C′, 13Cα, 13Cβ, and other aliphatic 13C spins) often become the starting point for sequence-specific assignment because, already at this initial stage, they provide inter-residue information on X-Pro pairs, as illustrated for two intrinsically disordered regions of the nucleocapsid protein from SARS-CoV-2 (Figure 3).[73]

Figure 3

Left: The 2D CON of a 248-amino-acid construct of the SARS-CoV-2 nucleocapsid protein comprising the folded NTD domain and the two intrinsically disordered regions flanking it (IDR1-NTD-IDR2). The high resolution provided by this experiment makes it easy to resolve resonances in the usually very crowded Gly region (upper boxed region) and to directly observe connectivities involving proline residues (lower boxed region). In the expansion shown in the center of the map, the resolution of several repeating fragments comprising asparagine residues can be appreciated (the reported assignment refers to the amide nitrogen of the mentioned amino acid). Right: Seven strips from the 3D (H)CBCACON experiment extracted at the 15N chemical shift of proline residues, flanked by a cartoon of the IDR1-NTD-IDR2 construct. The C′, Cα, and Cβ frequencies belong to the preceding amino acid, leading to the X-Pro assignment. The upper part of the panel reports the IDR1-NTD-IDR2 primary sequence (the sequence of the NTD domain is omitted for the sake of clarity). Five proline residues are found in IDR1 and two in IDR2. Adapted from ref (73). Copyright 2021, The Authors under the terms of a Creative Commons CC BY license http://creativecommons.org/licenses/by/4.0/.

Several additional experiments were then proposed in which different couplings were exploited to detect the complementary correlations needed to link amino acid spin systems in a sequence-specific manner. The CANCO experiment[20,74] and its variant that also includes the Cβ chemical shift (CBCANCO)[75] provide the complementary information for sequence-specific assignment. These experiments exploit Cα and Cβ chemical shifts to match the different spin systems identified in CBCACON/CCCON spectra and enable researchers to link them to achieve their sequence-specific assignment. 
However, additional information is useful when focusing on complex systems, in particular when contributions to chemical shifts are drastically reduced due to high flexibility and disorder of the polypeptide. Other experiments were thus designed that provide correlations involving carbonyl carbon nuclei as a further tool to reduce potential ambiguities.[23,67,76] This is the case of the COCON experiment, which correlates the backbone nitrogen with the attached carbonyl carbon and with the previous and following carbonyl carbon nuclei in the sequence by exploiting the 3JC′C′ scalar coupling.[23,76] As previously described, in most cases of practical interest it is possible to exploit 1H as a starting polarization source to boost sensitivity. This has been accomplished for the multinuclear experiments described above to obtain the (H)CBCACON, (H)CCCON, (H)CBCANCO, and (HCA)COCON variants.[28,77] Moreover, additional experiments that needed a leap in S/N to become feasible, owing to their intrinsically low sensitivity, were designed, such as the 3D NMR experiments correlating several residues in a row ((H)NCANCO, (H)N(COCA)NCO, and (HN)CO(CA)CON).[28,78] As an example of the quality of the spectra, a few slices extracted from the 3D spectra acquired on quail osteopontin, a disordered protein, illustrate the process of sequence-specific assignment through this strategy based on 3D C′-detected NMR experiments (Figure 4).

Figure 4

Assignment strips for quail osteopontin. Upper panel: Strip plots of the region 233–238 of the 3D (H)CBCANCO allowing the connectivity of dipeptides by Cα and Cβ carbon nuclei. Lower panel: 3D (HCA)COCON strip plots of region 233–238 showing the connectivity between at least three consecutive C′ signals. Reprinted in part from ref (77). Copyright 2019 Elsevier Ltd. All rights reserved.

In the case of particularly crowded spectra, these experiments can also be tailored to select resonances of specific amino acids to simplify the spectra. There are many ways to select the signals of a specific amino acid or a group of amino acids with common characteristics. Implementation of multiple quantum filters to select XHn groups, exploitation of band-selective pulses for excitation of specific nuclei, tuning of specific delays to select coherence transfer pathways, or matching of increasing numbers of coherence transfer steps to a particular spin topology are all selection blocks that can be included in a pulse sequence. With combinations of these approaches, it is possible to simplify the spectra and identify resonances belonging to virtually all different amino acids. These experiments are particularly well suited to simplify crowded spectra of highly flexible proteins, which generally have favorable relaxation properties that limit losses resulting from the multiple magnetization transfer steps used for filtering, and for which band-selective pulses are highly effective due to the reduced chemical shift dispersion. The first set of experiments dedicated to this purpose was a 13C adaptation of the MUSIC approach developed by the Oschkinat group.[79−81] These experiments, based on the CACON and CANCO sequences, are simple to implement and can provide information to aid sequential assignment[82] (Figure 5). 
On the other hand, many experiments must be acquired to discriminate among the different amino acids as fully as possible, and in some cases the sensitivity is limited by the many transfer steps and pulses included for optimal selection.

Figure 5

Schematic diagram showing the sequence-specific assignment strategy using the amino acid-selection approach for three generic amino acids A, B, and C. The correlations obtained by (CA)CON-based experiments are reported as filled circles, while the additional ones obtained by the (CA)NCO-based experiments are reported as open circles.

A second approach, based on the CBCACO sequence, proposed the use of a selection method based on the 13Cβ topology to distinguish the different amino acid types.[83] From a single experiment recorded with a sequence that includes several carefully chosen selective pulses, it is possible to generate subspectra in which the residue signals are grouped in six classes according to their topology. The classification is coarse but useful for specific purposes. Although these amino acid selective experiments produce spectra similar to those obtainable by amino acid selective labeling, they offer the clear advantage that only a single sample is required for all experiments, and they can be exploited when a specific subset of residues should be monitored, for example, for simplified chemical shift mapping.

Multidimensional Experiments (nD, n > 3)

The boost in S/N ratio achieved with pulse sequence design and hardware innovations opened the way to the implementation of experiments with many coherence transfer steps, such as multidimensional experiments capable of correlating many diverse heteronuclei and, when useful for assignment purposes, of reintroducing the proton dimension.[84] These experiments, like their 1H-detected counterparts, require a very long experimental time, and thus it becomes crucial to implement approaches to obtain highly resolved spectra in a reasonable time frame. The acquisition of a multidimensional NMR experiment is generally performed by sampling time-points equally spaced on a Cartesian grid (by time-point, we refer to each repetition of the experiment with different delays for chemical shift evolution). This is dictated by the Nyquist theorem, which states that the interval between the sampled time-points cannot exceed the inverse of the spectral width within which all the expected peaks appear. On the other hand, the resolution of the NMR signals is inversely proportional to the length of the acquisition time. This results in an enormous increase of experimental time as the dimensionality of the experiment is increased, especially when a large number of increments is required to achieve optimal resolution in the indirect dimensions. In the last decades, alternative sampling approaches known collectively as nonuniform sampling (NUS) have been introduced in NMR to reduce experimental time, to achieve higher resolution in the same amount of time, or to acquire spectra of high dimensionality with appropriate resolution.[85] NUS circumvents the limitation of the conventional acquisition scheme by sampling only a subset of the time-points of the Cartesian grid, according to some predetermined sampling scheme. 
The price to pay is that the data cannot be processed with the usual fast Fourier transform (FFT) but need different strategies for proper treatment of the acquired data and for reconstruction of the final data matrix, including some postprocessing to remove spectral artifacts arising from the so-called “sampling noise” intrinsic to the method.[86] Several algorithms for data reconstruction or processing have been proposed and optimized, such as maximum entropy (ME),[7] multidimensional decomposition (MDD),[87] compressed sensing (CS),[88] and multidimensional Fourier transform (MFT),[89] to mention just a few. Some of these reconstruction methods have been implemented in the acquisition programs of the spectrometers for routine use, and of course not only for 13C direct detected experiments. The use of NUS in 13C direct detected spectra was tested first with the CANCO experiment on ubiquitin, a small globular protein of 76 amino acids.[59] Even without the inclusion of 1H as starting polarization source, it was possible to reduce the number of acquired points to about 40%, reconstructing the final data matrix with the MDD algorithm.[59] Different approaches were also tested for the 3D CBCACON experiment.[90] In the following years, a number of additional experiments with higher dimensionality were designed, all tailored for spin system identification and backbone resonance assignment. 
The most successful example is provided by the suite of experiments developed by Koźmiński and co-workers, which takes advantage of the sparse multidimensional Fourier transform (SMFT) method to handle the spectra.[91] One of the advantages of this method is the possibility to process only a subspace of the full multidimensional spectrum at arbitrary frequency coordinates and to simplify the analysis of multidimensional spectra by displaying only 2D cross-sections computed at some predefined frequencies collected in a lower dimensionality “basis spectrum” (2D/3D), which shares the same dimensions with the higher dimensionality experiment (4D/5D). In this way, the inspection of the 4D/5D is reduced to the analysis of a collection of lower dimensionality maps. The complete set of experiments, particularly useful for the investigation of intrinsically disordered proteins, has the CON and the CACON experiments as reference spectra to process 4D and 5D data sets, respectively. The choice of these two is based on the fact that they provide the best result in terms of reliability of the sequential backbone assignment taking advantage of the excellent chemical shift dispersion of the resulting spectra.[92,93] The multidimensional 13C detected NMR experiments that have been proposed share as common features direct detection of carbonyl carbon nuclei and exploit coherence transfer pathways mediated by the one-bond and two-bonds scalar couplings between backbone heteronuclear spins (1JC′N, 1JC, 2JC, 1JC). Since these involve coherence transfer between several nuclear spins, the extension to higher dimensions can be easily achieved through frequency labeling of different spins exploited for coherence transfer. 
The most straightforward approach thus consists of extending the dimensionality of 3D experiments by frequency labeling additional nuclear spins exploited in coherence transfer pathways, as proposed for 4D HCBCACON, 4D HCCCON, 4D HCBCANCO, 4/5D HNCACON and 4/5D HNCANCO, and 3/4D HCANCACO experiments.[94,95] Additional variants were also proposed that exploit the most informative and well-resolved heteronuclear spins in the indirect dimensions, such as the 5D CACONCACO to focus on backbone assignment, followed by the 5D HC(CC-TOCSY)CACON[96,97] to discriminate between different amino acid types. An approach to accelerating the long NMR experiments consists of the implementation of 1H-start and longitudinal relaxation enhancement that significantly shortens the effective longitudinal recovery time and allows for shorter delays between the consecutive acquisitions (section ). This was exploited to propose improved variants of the 4D and 5D NMR experiments based on selective excitation of 1HN nuclei (not touching the aliphatic ones as well as the water resonance),[96] or on the 1Hflip approach that can be exploited also when 1Hα are used as a starting polarization source.[92] Several variants of the same core NMR experiment can be used (1HNBEST, 1HN-flip, 1Hα-flip) allowing researchers to perform a sequence specific walk through the backbone by “hopping” from one CON correlation to the neighboring one (CON-CON strategy). These experiments can be performed as 4D experiments (by using a 2D CON as a reference) as well as in the 5D version in which the reference spectrum is the 3D CACON. 
The success of this CON-CON strategy then led to the design of the analogous experiments based on 1H direct detection,[98] providing a complete set to meet different experimental needs.[99−102] Importantly, this set of spectra is well suited for the implementation of automated protocols for assignment.[103,104] Another strategy that can be used to speed up the acquisition of a multidimensional spectrum is to exploit projection spectroscopy, in which a series of projections are acquired rather than the full spectrum.[105] In this way, the analysis of the spectrum is facilitated, as it consists of a series of 2D spectra. Automation of such analysis, called automated projection spectroscopy (APSY),[106] yields a peak list of the full dimensionality spectrum without reconstructing it. The methods for 13C assignment purposes have been demonstrated with α-synuclein[107] and successfully applied also to large intrinsically disordered proteins.[102]
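The sampling argument made in this section (Nyquist-limited increments, growth of the Cartesian grid with dimensionality, and NUS as a subset of that grid) can be illustrated with a minimal sketch. It assumes complex (quadrature) sampling, so that the maximum increment is 1/SW, and it draws a plain uniform-random schedule; practical NUS schedules are usually weighted, and none of the function names or numbers below come from the cited works.

```python
import itertools
import random


def max_dwell_time(spectral_width_hz):
    """Largest sampling interval (s) allowed by the Nyquist condition, 1 / SW."""
    return 1.0 / spectral_width_hz


def increments_needed(spectral_width_hz, max_evolution_time_s):
    """Uniformly spaced points needed to sample from t = 0 up to t_max."""
    # round() guards against floating-point noise in the product
    return int(round(max_evolution_time_s * spectral_width_hz)) + 1


def full_grid_size(points_per_dim):
    """Total number of time-points on the Cartesian grid of indirect dimensions."""
    total = 1
    for n in points_per_dim:
        total *= n
    return total


def nus_schedule(points_per_dim, fraction, seed=0):
    """Pick a random subset of grid points (illustrative, unweighted schedule)."""
    rng = random.Random(seed)
    grid = list(itertools.product(*[range(n) for n in points_per_dim]))
    n_keep = max(1, round(fraction * len(grid)))
    return sorted(rng.sample(grid, n_keep))


if __name__ == "__main__":
    # Hypothetical indirect dimension: 2000 Hz spectral width, 32 ms evolution
    print(max_dwell_time(2000.0), increments_needed(2000.0, 0.032))
    # Hypothetical 4D experiment with three indirect dimensions
    print(full_grid_size([64, 64, 32]))
    print(len(nus_schedule([8, 8], 0.25, seed=1)))
```

The grid size is the product of the increments in each indirect dimension, which is why each added dimension multiplies the experimental time, while the NUS schedule keeps only the chosen fraction of those points.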

CON and CACO Fingerprints

Acquisition of 2D NMR spectra that provide well-resolved cross-peaks constitutes a useful tool to obtain a fingerprint of a protein. 2D HN spectra are often the first ones used for this purpose, and sometimes the analogous 2D HC spectra are collected to obtain (and follow) 13C signals. However, information about carbonyl carbon nuclei is not available through 1H detected 2D spectra. It is thus worthwhile to record experiments that correlate carbonyl carbon resonances with the directly bound heteronuclear spins, Cα and N, exploiting the one-bond couplings (1JC′Cα and 1JC′N), to obtain the CACO and CON spectra. The various solutions for homonuclear decoupling described in section ensure the removal of the large one-bond coupling in the direct acquisition dimension. The inclusion of an additional building block allows the design of the 2D CBCACO and CCCO NMR experiments, which collectively provide a suite of 2D NMR experiments that significantly enrich the information content that can be obtained through simple 2D experiments (Figure 6).[108]

Figure 6

Schematic representation of the 2D spectra based on 13C direct detection that can be used, in conjunction with the 2D HN (1H–15N HSQC), to obtain a fingerprint of a protein. These include the 2D CON (13C′–15N correlation), 2D CACO (13C′–13Cα correlation), and 2D CBCACO (13C′–13Cα and 13C′–13Cβ correlations) spectra. The scalar couplings exploited to detect the correlations in the various spectra are schematically indicated in the top panel (1JNH, 1JC′N, 1JC′Cα, 1JCαCβ). In addition to the correlations involving backbone nuclei shown in the figure, correlations involving specific side-chains can be observed (for Asn and Gln in HN and CON 2D spectra and for Asn/Asp and Gln/Glu in CACO and CBCACO 2D spectra). Adapted from ref (112). Copyright 2014, Nature Publishing Group, a division of Macmillan Publishers Limited. All Rights Reserved.

The additional information available for a protein, compared to that available in a 2D HN, is evident. The large dispersion of heteronuclear chemical shifts and the additional nuclear spins that can be monitored (13C) provide a useful tool to investigate proteins in solution through simple 2D experiments. 
The correlation of nuclear spins belonging to two different amino acids involved in the peptide bond provides an important contribution to resolution in 2D spectra, particularly for intrinsically disordered proteins, in which resonances tend to cluster in regions typical for each amino acid type.[69] Interestingly, the protonless variants are amenable to combination with 1H detected NMR experiments using multiple receivers in order to collect two experiments simultaneously, as shown through the CON//HN implementation.[109−111] 13C NMR is not particularly sensitive to solution conditions and does not suffer from the detrimental effects of high salt concentration, pH, or temperature, which cause extensive line broadening of the HN resonances in the NMR spectra.[8] This holds in particular for the spectra of highly mobile proteins or protein regions, which have peptidic protons largely exposed to the solvent and not engaged in stabilizing interactions like H-bonds. In these cases, upon increasing ionic strength, pH, or temperature, the quality of 2D HN correlation spectra may get worse due to the more efficient hydrogen-exchange mechanism, resulting in an increasing number of peaks that are extensively line broadened, while the quality of 13C-detected spectra is maintained or improved. In Figure 7, the comparison of the 2D HN and 2D CON correlation spectra acquired for α-synuclein at pH 7.4 clearly shows that 13C NMR allows recovery of the information missing in 2D HN spectra, such as the correlations of Gly, Ser, and Thr residues, which are the first to disappear with increasing temperature.

Figure 7

2D spectra correlating the backbone amide nitrogen either with the directly bound amide proton (1H–15N HSQC spectra, lower panels) or with the directly bound carbonyl (13C–15N CON spectra, upper panels) acquired on α-synuclein at pH 7.4 are shown as a function of increasing temperature, from 286 to 323 K (from left to right). Each spectrum was acquired at 16.4 T with one scan per increment on a 1 mM protein sample in 20 mM sodium phosphate buffer and 200 mM NaCl.

As described in Section , amide protons can also provide a starting polarization source, and different experimental variants of CON spectra have been proposed, including 1HN-start,[27] 1HN-flip,[59] 1HNBEST,[63] and 1HN-CP.[64] These experiments lack correlations involving proline residues, and they reintroduce the dependence on solvent exchange of amide protons, which affects the outcome of the spectra and eventually leads to loss of information when chemical exchange is very efficient. However, if properly tuned, this can even result in a sensitivity increase with respect to the 1HN-start variant by exploiting one of the variants with LRE (1HN-flip, 1HNBEST) or the 1HN-CP one.

Useful NMR Observables

Chemical Shifts and Heteronuclear Relaxation Rates

The process of sequence-specific resonance assignment provides a list of chemical shifts associated with different nuclei in the protein that represents the key to access atomic resolution information on complex macromolecules. These allow us to associate specific nuclei to each of the cross-peaks detected in multidimensional NMR spectra and follow spectral changes upon changes in the experimental conditions. For example, 2D spectra are often used to acquire protein fingerprints, as described in the previous section, and then follow spectral changes upon addition of potential partners, ligands, metal ions, etc. The additional contribution provided by 13C detected NMR experiments consists of the possibility to access information on solvent-exposed regions of proteins, such as external loops of globular proteins, intrinsically disordered proteins or protein regions, to follow proline residues and to recover information on paramagnetic proteins or highly deuterated large proteins. The chemical shifts, in particular heteronuclear ones, are also very informative about the secondary structural elements present in a protein.[113,114] Indeed, on top of contributions deriving from the chemical structure and the presence of specific functional groups, a significant contribution also derives from the local conformation, which in turn is linked to the secondary structural element.[115,116] Therefore, chemical shifts provide the first source of structural and dynamic information just by comparison with chemical shifts expected for a hypothetic “random-coil” state, representative of the contribution from the local chemical topology. The extraction of this information is by no means a trivial task. 
Computational tools have been developed to predict reference random-coil chemical shifts to enable the interpretation of experimentally determined chemical shifts in terms of structural and dynamic properties of a protein.[117−120] As an example, Figure 8 shows the case of one of the “flexible linkers” of CBP, a 207-amino-acid-long intrinsically disordered region of the multidomain CBP protein (CBP-ID4).[99,121] Even if largely unstructured, two partially populated helical elements separated by proline-rich regions can clearly be identified from chemical shift analysis.
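The comparison with random-coil values described above amounts to a per-residue subtraction, sketched below. This is a minimal illustration and not the procedure used in the cited works: the residue numbers and shift values are hypothetical placeholders, and the combined ΔδCα − ΔδCβ difference is used here only as one commonly adopted indicator, with positive values pointing to helical propensity and negative values to extended conformations.

```python
def secondary_shifts(observed_ppm, random_coil_ppm):
    """Observed minus random-coil chemical shift (ppm), keyed by residue number."""
    return {
        res: observed_ppm[res] - random_coil_ppm[res]
        for res in observed_ppm
        if res in random_coil_ppm
    }


def ca_minus_cb(delta_ca, delta_cb):
    """Combined indicator: secondary Calpha shift minus secondary Cbeta shift."""
    return {res: delta_ca[res] - delta_cb[res] for res in delta_ca if res in delta_cb}


if __name__ == "__main__":
    # Hypothetical shifts (ppm) for two residues; not experimental data
    obs_ca = {10: 54.0, 11: 52.5}
    rc_ca = {10: 52.5, 11: 52.5}
    obs_cb = {10: 18.5, 11: 19.0}
    rc_cb = {10: 19.0, 11: 19.0}
    d_ca = secondary_shifts(obs_ca, rc_ca)
    d_cb = secondary_shifts(obs_cb, rc_cb)
    print(ca_minus_cb(d_ca, d_cb))
```

In practice the random-coil reference would come from one of the prediction tools mentioned in the text, since it depends on sequence context and experimental conditions.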

Figure 8

Left: 2D proline-fingerprint spectrum of CBP-ID4. In the 2D spectrum, the signals are numbered according to the proline position in the sequence; C′–N correlations involving identical residue pairs are circled. In the upper part, the primary sequence of the linker is reported, with the proline residues colored in red. Right: In the upper part, the secondary structure propensity (SSP) plot indicates that the protein is largely disordered, with two regions that have a measurable α-helical propensity. In the other two panels, 15N R2 data measured for proline nitrogen nuclei and 15N R2 data measured for the nonproline residues recorded at 16.4 T are reported. Adapted from ref (121). Copyright 2018 Wiley-VCH Verlag GmbH and Co. KGaA, Weinheim.

Variants of basic 2D experiments were developed to determine heteronuclear relaxation rates, which are often used to access information about local flexibility.[122] CON-based experimental variants are available to determine 15N relaxation rates (longitudinal and transverse).[123,124] The most useful experiments, however, are probably those focusing on proline 15N spins, as they provide information that is not accessible through 1H detected analogues. The possibility to focus selectively on the 15N region of prolines also allows researchers to work on a narrow spectral region and achieve excellent resolution with very few increments compared to the whole amide region in proteins, an attractive feature for the investigation of complex proline-rich proteins. It is interesting to note that the absence of the directly bound proton contributes to a reduction of the 15N nuclear relaxation rates, reflected in sharp NMR lines. As an example, the data obtained for CBP-ID4 are shown in Figure 8. 
It is worth noting that the CON spectrum in Figure 8 reveals a peculiarity of this experiment when recorded for unstructured proteins, that is, the clustering of the signals in groups depending on the previous amino acid type, whose nature determines to a great extent the chemical shift of the 13C′ nucleus, since all other contributions to the chemical shift typical of a folded protein are largely averaged out. With increasing pH and temperature, the efficient solvent exchange renders the interpretation of 15N relaxation rates for nonproline amino acids quite complicated, as exchange may also contribute to the observed results, reducing possible advantages of CON-based approaches. In these cases, other heteronuclear relaxation rates involving nonexchangeable nuclei such as 13C may prove useful and provide information complementary to that accessible through 15N relaxation. The picture is more complex because of the dense network of directly bound carbon nuclei, which implies 13C–13C interactions in uniformly labeled samples. However, cases of interest in which these effects are mitigated and valuable information can be obtained include the determination of carbonyl carbon longitudinal and transverse relaxation rates[123,125] as well as heteronuclear 1H–13C NOEs.[126] CON and CACO variants were used for the purpose and tested on model proteins;[123,127] these may prove useful for the investigation of intrinsically disordered proteins, for example, in all the cases in which chemical exchange is so pronounced as to interfere with the determination of 15N relaxation rates.[128]

Chemical Exchange

CON experiments also have interesting properties for the experimental determination of exchange processes with the solvent, an observable used since the early days of NMR to discriminate between amide protons easily accessible to the solvent from those protected into globular cores or involved in tight hydrogen bonds.[129] In fact, the experiment correlates two heteronuclear spins not directly involved in chemical exchange with the solvent and it constitutes a useful tool to recover information also in conditions in which amide protons are broadened beyond detection, as described above (Figure ).[63,72] However, simple modifications of the basic pulse sequences can be designed to reintroduce a dependence on exchange processes with the solvent although in a more indirect way, allowing to monitor the process in a less perturbative way. The 1HNCON variant[126,130] of course constitutes the first obvious one to reintroduce effects deriving from solvent exchange through the starting polarization source used in the experimental variant.[112] Exchange indeed influences the observability of the signals, even if to lower extent with respect to 1H detected experiments. Exchange is also responsible for a pronounced enhancement of the recovery of amide proton polarization to equilibrium provided amide protons are selectively perturbed with respect to those of water.[60,131,132] Therefore, in a certain range of exchange rates, this effect can become a favorable feature for the detection of CON spectra as shown through the example of the 1HNBESTCON acquired on α-synuclein in less than 1 min[63] (without introducing any delay between the end of one transient and the start of the next one thanks to the fast recovery of amide proton magnetization in the experimental conditions used). 
This is actually a qualitative observation of exchange which however could be quantified by exploiting similar approaches to those proposed by Schanda and Brutscher.[131] Selective manipulation of the water resonance to highlight only residues whose amide proton senses this perturbation[133,134] constitutes another widely used strategy that was implemented prior to 1HN-start CON variants to take advantage of the nice chemical shift dispersion typical of the CON reading scheme.[135] However, probably the most interesting approaches are those that exploit the most simple CON variant, the 13C-start one, to indirectly detect effects of exchange by monitoring its effect through 15N spin coherences or spin orders. In an elegant paper, Atreya and co-workers proposed to monitor the effect of isotopic shift induced by deuterium in solutions constituted by 50% H2O and 50% D2O, and then measure exchange effects between the two resonances.[136] Effects of chemical exchange can thus be determined by observing the direct exchange of polarization between the two isotopomers as well as by the effect on the scalar coupling of 15N with 1H, coupling that can be quenched by fast exchange processes. A similar approach was also proposed to investigate exchange rates of arginine side-chains.[137] Finally, the CON reading scheme is particularly well suited to implement variants in which exchange is monitored through subtle effects on coherences or spin orders involving 15N. For example, the determination of 15N relaxation under CPMG, in the presence of efficient exchange processes of amide protons with the solvent, is influenced by whether 1H RF decoupling is applied or not. This difference, properly modeled, can be used to access information on exchange rates also in cases in which direct observation of the 1H resonance is precluded. 
This approach, initially designed to access information on exchange rates of selected amino acid side-chains[138−140] was then implemented also for the study of amide protons in proteins in 1H and 13C detected experimental variants.[141−143] Along similar lines, a three spin order operator can be created without perturbing the water resonance and then allowed to evolve in a free evolution period.[108] In the presence of efficient solvent exchange processes, this becomes the major determinant of the disappearance of the three spin order by “decorrelation”, as initially described by Skrynnikov and Ernst.[144] The CON variant allows implementation of this idea in a very clean way since the water resonance is not perturbed at all in the experiment and the 2D CON reading scheme allows to profit by the excellent resolution also in intrinsically disordered proteins.[108] Finally, solvent accessible protein backbones can be “illuminated” by hyperpolarized HDO produced using dissolution-dynamic nuclear polarization (D-DNP).[145−149] The hyperpolarized solvent provides a polarization reservoir, enhancing the 1H signals of sites undergoing chemical-exchange with the hyperpolarized solvent itself; implementation of a 2D CON reading scheme provides a nice resolution and reduced hurdles in dealing with the resonance of hyperpolarized HDO when performing 1H detection.[149]

Residual Dipolar Couplings

Other observables deriving from dipole–dipole interactions include residual dipolar couplings (RDCs), which result, as the name suggests, from incomplete averaging of dipolar interactions in solution.[150−152] This can originate from the natural magnetic anisotropy of the molecule under investigation, which induces a partial degree of alignment when immersed in high magnetic fields, or from external agents. Residual dipolar couplings are generally determined by measuring changes in the signal splitting resulting from a well-resolved scalar coupling upon induction of a partial degree of alignment of the molecule in solution. The resolved 13C′–13Cα scalar couplings are thus an obvious candidate,[153] easily accessible from carbonyl detected experiments, to determine RDCs. Experiments to measure a variety of different RDCs, including several that involve proton nuclei,[123,154,155] were developed to enable the determination of RDCs also in cases in which methods based on 1H direct detection fail to provide information.[123,156]
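The measurement principle stated above, an RDC read as the change in a resolved splitting upon partial alignment, can be written as a short sketch. The function names and the splitting values are hypothetical and serve only to illustrate the subtraction; sign conventions and error handling used in actual analyses are not addressed.

```python
def residual_dipolar_coupling(splitting_aligned_hz, splitting_isotropic_hz):
    """RDC (Hz) as the change in splitting on alignment: (J + D) - J."""
    return splitting_aligned_hz - splitting_isotropic_hz


def rdcs_per_residue(aligned_hz, isotropic_hz):
    """Apply the subtraction to every residue measured in both conditions."""
    return {
        res: residual_dipolar_coupling(aligned_hz[res], isotropic_hz[res])
        for res in aligned_hz
        if res in isotropic_hz
    }


if __name__ == "__main__":
    # Hypothetical C'-Calpha splittings (Hz) for two residues; not experimental data
    isotropic = {5: 53.0, 6: 53.5}
    aligned = {5: 54.5, 6: 52.0}
    print(rdcs_per_residue(aligned, isotropic))
```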

What about Homonuclear Nuclear Overhauser Effects (NOE)?

By analogy with the observables that can be detected through 1H detected experiments, an important observable that comes to mind is the homonuclear NOE, one of the most robust sources of internuclear distances useful for solution structure determination. However, when moving from 1H to 13C, homonuclear NOEs are drastically reduced due to the lower gyromagnetic ratio of 13C.[157,158] Ideally, long-range 13C–13C NOEs would be very useful observables that would contribute to the investigation of very large systems. However, since they scale with 1/r⁶, they become tiny effects that add to the much stronger interactions between directly bound carbon atoms (from 1 to 3 Å the NOE decreases by 3 orders of magnitude), so small that it is really difficult to measure them experimentally in uniformly labeled protein samples with the current hardware possibilities. On the other hand, 13C–13C NOEs can be easily detected between directly bound carbon nuclei, and spin diffusion becomes a very efficient effect within these networks of directly bound carbon atoms.[9,19,30,52,158] This information is of limited use for recovering unknown structural information, but it can nevertheless be used for assignment purposes.
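The two scaling arguments in this section can be checked with a few lines of arithmetic. The sketch below assumes the standard dipolar dependence of the homonuclear cross-relaxation rate, proportional to γ⁴/r⁶ at fixed dynamics, and uses an approximate value of 0.2515 for the ratio γ(13C)/γ(1H); the function names are illustrative only.

```python
GAMMA_RATIO_13C_1H = 0.2515  # approximate gamma(13C) / gamma(1H)


def distance_attenuation(r_near, r_far):
    """Factor by which a 1/r^6 effect drops when going from r_near to r_far."""
    return (r_far / r_near) ** 6


def homonuclear_gamma_scaling(gamma_ratio=GAMMA_RATIO_13C_1H):
    """Relative homonuclear cross-relaxation at equal distance and dynamics (gamma^4)."""
    return gamma_ratio ** 4


if __name__ == "__main__":
    # From 1 to 3 angstrom: 3**6 = 729, i.e. close to 3 orders of magnitude
    print(distance_attenuation(1.0, 3.0))
    # 13C-13C relative to 1H-1H at the same distance: about 0.004
    print(homonuclear_gamma_scaling())
```

The first number reproduces the roughly thousandfold drop quoted in the text, and the second shows why 13C–13C effects are expected to be much weaker than 1H–1H ones at comparable distances.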

Biomolecular Applications

Focus on Amino Acid Side-Chains

The possibility to focus the NMR experiments on specific amino acids is a useful approach for the investigation of proteins. This can be used to aid the sequence specific assignment of macromolecules, as described in section , but it can also be exploited to address specific biological questions. Amino acid side-chains are generally assigned using 1H-detected experiments, but there are many cases in which the resonances due to the nuclei of a long chain are missing or the signals vanish upon interaction due to conformational exchange. In all these cases, 13C-detected experiments can be of use. In addition, with simple 2D experiments it is possible to focus on the key correlations relevant to monitor the behavior of a side chain upon changes in its environment, due either to changes in the solution conditions, to the interaction with a partner, or to binding of a small molecule/metal ion. Since different kinds of nuclei are affected to a different extent by the various interactions, the analysis of carbon chemical shifts can provide additional information on conformational changes, variations in solvent exposure or in hydrogen bond pairing, etc., which helps to describe the processes of interest.

Monitoring Side-Chains through C′-Detected Experiments

The set composed of CACO, CBCACO, and CCCO experiments provides an excellent tool to monitor resonances of virtually every 13C nuclear spin of aliphatic side-chains by correlating them to the backbone carbonyl of each amino acid. Moreover, it also provides information on the terminal carbon spins of side-chains of residues that contain the carboxylate (aspartate and glutamate) or the carbonyl (asparagine and glutamine) moiety since their correlations can be detected in a clean region of the spectra. The one-bond correlation between the terminal carbon nuclei and the neighboring one can be collected in 2D CACO spectra; the cross-peaks observed for the side-chains of these four amino acid types fall in specific spectral regions, as illustrated in Figure . Comparison with a 2D CBCACO allows detection of the neighboring aliphatic carbon spin, providing the information to assign in a sequence-specific manner the 13C terminal resonance of aspartate and asparagine residues. The 2D CCCO is needed to complete the assignment and also identify in a sequence-specific manner the 13C resonances of glutamate and glutamine residues.

Figure 9

Superposition of the CACO (blue contours) and the CBCACO (green contours) spectra recorded on a 600 μM sample of α-synuclein in 20 mM Tris-Cl buffer, pH 7.4, and 310 K at 16.4 T. On the left, the correlations observable in the two spectra for an Asp residue, sketched above, are schematically reported.

The CON experiment provides instead the intraresidue Nδ–Cγ for asparagine and Nε–Cδ for glutamine residues. A combination of CON and CACO/CBCACO has been proposed to remove from the latter spectra the correlations due to the carbonyl carbon nuclei bound to nitrogen, producing maps where only the cross-peaks of aspartate and glutamate nuclei are present.[159] Since the coupling between the backbone carbon and the carboxylate carbon of the aspartate side chain has a strong conformational dependence, in some favorable cases it is even possible to observe a splitting of the signals, whose magnitude can provide information on the chain’s conformation. Carbonyl and carboxylate functional groups of amino acid side-chains are seldom assigned despite their key role in many biological processes, and their investigation can be very instructive. A recent example in this context is provided by the investigation of α-synuclein when subjected to concentration jumps of calcium ions, a process associated with the transmission of nervous signals. Negatively charged side-chains of aspartate and glutamate residues are expected to be among the first candidates to interact with positively charged ions such as Ca2+, and it is interesting to access direct information on these side-chain functional groups to identify whether specific regions of the polypeptide are affected to different extents by the interaction. 
The set of 2D 13C detected NMR experiments confirmed that the final part of the protein, rich in negatively charged amino acids, is the one sensing the increase in Ca2+ concentration, as also previously described through 1H detected experiments.[160−162] These experiments, by monitoring also the carboxylate groups of glutamate and aspartate residues, revealed that even within a disordered protein in which all of them are expected to be exposed to the solvent, only a subset of them is initially perturbed by the addition of Ca2+ (Figure A,C). Interestingly, among the most affected residues there are aspartate and glutamate residues separated by a proline, in peculiar motifs (DPD and EPE motifs). The presence of a proline residue in between two residues featuring carboxylate groups could be a strategy to adapt the local conformation for Ca2+ interaction. It is noteworthy that these two proline residues were the most perturbed upon the addition of Ca2+, an observation that could be easily made through the 2D CON (Figure B). The 2D CACO and CON spectra thus provide a useful tool to focus on the metal ion coordination sphere and more generally to investigate interactions of negatively charged residues with complementary charged molecules in the cell.

Figure 10

(A) Expansion of CBCACO spectra recorded on α-synuclein at increasing concentration of Ca2+ showing the chemical shift perturbation for the most affected Asp residues during the titration. (B) Expansion of CON spectra recorded on α-synuclein at increasing concentration of Ca2+ showing the chemical shift perturbation for two proline residues (P120 and P138). (C) Comparison of chemical shift perturbations (CSP) of side chain carboxylate/carbonyl carbon chemical shifts (blue) with backbone carbonyl carbon chemical shifts (red) determined from 2D-CACO and 2D-CON spectra (CSP = |Δ(δ 13C)|). Backbone CSP values are smaller in magnitude with respect to those of side-chains and not necessarily maximal for Asp/Glu/Asn/Gln amino acids, reflecting a more indirect effect experienced by backbone nuclear spins upon interaction with calcium ions. At the bottom, the portion of primary sequence affected by metal ion binding is reported, with two sketches of the three-amino acid motifs most affected. Adapted from ref (108). Copyright 2020 Wiley-VCH GmbH.
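
The perturbation measure used in panel (C), CSP = |Δ(δ 13C)|, is simple to compute once the 13C shifts of the free and metal-bound states are tabulated. The following Python sketch shows one possible way to do so; the peak labels and shift values are hypothetical placeholders, not data from the cited study.

```python
def csp_13c(delta_free_ppm, delta_bound_ppm):
    """Chemical shift perturbation as defined in the caption: |delta(bound) - delta(free)|."""
    return abs(delta_bound_ppm - delta_free_ppm)


# Hypothetical (free, bound) 13C shifts in ppm; labels are placeholders.
free_and_bound = {
    "Asp_A_Cgamma": (180.10, 180.45),   # side-chain carboxylate
    "Glu_B_Cdelta": (183.60, 183.72),   # side-chain carboxylate
    "Asp_A_Cprime": (176.20, 176.25),   # backbone carbonyl
}

csp = {name: csp_13c(free, bound) for name, (free, bound) in free_and_bound.items()}

# Rank the nuclei from most to least perturbed
ranked = sorted(csp, key=csp.get, reverse=True)
for name in ranked:
    print(f"{name}: {csp[name]:.2f} ppm")
```

Ranking side-chain and backbone nuclei on the same scale is what makes a comparison such as the one in panel (C) possible.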

Positively Charged Amino Acids

Lysine and arginine side-chains are crucial for driving protein–protein interaction and for forming intramolecular and intermolecular salt bridges. Interactions and salt bridges often involve acidic protein side-chains such as those of aspartate or glutamate. Positively charged residues are relevant also to establish interactions with charged phosphodiester moieties of nucleic acids. The first example of exploitation of CON and CBCACO type experiments for investigating protein–protein interaction dates back to 2006, in the study of the metal-mediated complex formed in the presence of Cu(I) between two copper chaperones, Atx1 and Ccc2a.[29] These two proteins have a very specific role in copper trafficking and the solution structure of the complex obtained in the presence of Cu(I) was solved by NMR.[163] However, the HN signals of some of the residues considered important to characterize the metal-binding region were missing due to exchange with the solvent protons, preventing clear-cut characterization of their role. The correlations of these residues were instead present in the CON experiment and allowed the characterization of the interacting interfaces of the two oppositely charged proteins. In addition, through the comparative analysis of the CON and the CBCACO, it was possible to establish the direct involvement of selected carboxylate moieties in electrostatic interactions as well as in H-bonds necessary to stabilize the conformation of an otherwise very mobile region of the polypeptide. Since Atx1 has several lysine residues considered important for driving binding, an experiment to detect the Cδ–Nε correlation was designed, which allowed the identification of small but meaningful chemical shift variations upon complex formation and thus made it possible to delineate the interaction surface. 
One of the lysine residues (K65) experienced in the apo-form a very peculiar downfield shift of 6 ppm with respect to the usual value, suggesting its involvement in an intramolecular H-bond.[29] Novel experiments were also designed for monitoring arginine side-chains. The guanidinium group of this amino acid is often involved in salt-bridges, H-bonds and in cation-π interactions with aromatic side-chains and with nucleic acid molecules. It is thus very informative to be able to determine its ionization state and its dynamics; however, 1H-detected experiments are often not effective due to line broadening and signal crowding, and 13C detection has been exploited by several groups. For example, Hansen and co-workers proposed an experiment to obtain a 13Cζ–15Nε correlation spectrum avoiding transfer from 13Cζ to 15Nη. Such an experiment proved useful for 15N relaxation measurements and quantification of the squared order parameter, S2, that reports on the motions of the arginine side-chain on some model proteins and on a 42 kDa enzyme, the human histone deacetylase 8 (HDAC8).[164] With a similar approach it is possible to probe the rotational dynamics around the Cζ–Nε bond.[165] This experimental scheme found an interesting application on a mutant of T4 lysozyme, a 19 kDa protein containing 13 arginine residues with different exchange regimes: some are in fast exchange; five of them have 15N resolved resonances, suggesting some restriction in the rotation of the side-chain. A variant that includes the so-called divided-evolution approach[166] in the 13C detected scheme was designed and successfully demonstrated that two arginine residues have an exchange regime consistent with their involvement in a hydrogen-bonding network. 
An additional variant of the experiment including selective pulses proved useful to establish the exchange regime for two slower rotating guanidinium groups, which belong to residues shown to be involved in an interaction network together with a tryptophan side-chain in one case and in strong ionic bidentate hydrogen bonds in the other. By exploiting double-quantum experiments it is also possible to obtain 13Cζ–15Nε correlation spectra with reduced exchange broadening and thus sharper lines. These can report on chemical shift perturbations occurring due to the interaction of the side-chain, such as the very small isotope effects measured when protons are substituted with deuterium to highlight the involvement of a specific group in stabilizing H-bonds or salt bridges.[167] Another approach, proposed by Mulder and co-workers,[31] provides the correlation of the 13Cζ with both 15Nε and 15Nη, and enables the determination of the protonation pattern of the nitrogen nuclei. They tested the experiment, whose efficiency relies on cross-polarization transfers, on three proteins differing in the number of arginine residues present and in their protonation state. In the case of the photoactive yellow protein (PYP), which presents a particularly relevant arginine for protein function (R52), they demonstrated the possibility to detect all the protons bound to nitrogen nuclei, one Hε and four Hη, contrary to previous crystallographic studies (Figure ).[31]

Figure 11

(a) 1H-15N HSQC spectrum of 1 mM PYP in 5 mM potassium phosphate at pH 6.2. The 15Nη–1Hη and 15Nϵ–1Hϵ correlations are indicated in red and blue, respectively. (b) 1H-decoupled and (c) 1H-coupled 15Nη/ϵ–13Cζ correlation spectra of 3 mM PYP in 5 mM potassium phosphate at pH 6.5. The 15Nη–13Cζ and 15Nϵ–13Cζ correlations are indicated in red and blue, respectively. To avoid additional signals due to the presence of 2H isotopomers, D2O for field lock was added externally to a separate compartment of the NMR tube. The number of scans is as follows: (a) 16, (b) 128, and (c) 192. In (a) and (b), 1D traces from the 2D spectra, indicated by the broken lines, are shown. Reproduced with permission from ref (31). Copyright 2017 Wiley-VCH Verlag GmbH and Co. KGaA, Weinheim.

Hydrophobic Amino Acids

The use of 13C direct detected NMR experiments has been proposed also to map methyl groups, often exploited for protein structure and dynamics, particularly in large proteins. In 1H detected experiments the signals of the CH3 groups often fall in regions where other signals resonate, complicating the extraction of selective information. A solution is to resort to selective labeling of the protein;[51] otherwise, one can exploit a spectroscopic filter to remove all the undesired cross-peaks, as in the 13C-Methyl COSY experiment proposed by Atreya and co-workers.[168] In this experiment they exploited the fact that the 13CH3 signal of the methyl group located at the terminal end of an amino acid chain has only a single J-coupling to its directly attached 13C nucleus. Using this selection method they demonstrated that the 13CH3 correlation peaks for the various amino acids can be clearly separated into distinct spectral regions. Such a filtering building block can be incorporated in a 3D experiment wherein the additional dimensions can be utilized to provide intra-amino acid or long-range correlations. This approach is particularly well suited to study highly flexible proteins, for which the overlap is dramatic, but can be of use also for chemical shift mapping of large proteins or to obtain long-range information by accurately quantifying paramagnetic relaxation enhancement. Aromatic residues are often of great interest but not trivial to investigate in detail, in particular in very large globular proteins or in intrinsically disordered ones. They often form the core of protein folds and recently they were shown to play a key role in the formation of membrane-less organelles through the process of liquid–liquid phase separation. 
The quaternary carbon (Cγ) linking aromatic rings to Cβ is not straightforward to detect through 1H detected experiments, while it gives rise to well resolved cross peaks with Cβ that fall in a very clean spectral region of simple 2D 13C–13C COSY, TOCSY, or NOESY spectra. These 2D spectra were used to identify metal ion ligands in paramagnetic proteins by comparison with the spectra obtained with a diamagnetic analogue.[44,45] A special sequence was also designed to remove the two large one-bond scalar couplings influencing the Cβ (1JCβCα and 1JCβCγ) to improve the resolution of 2D spectra that report the Cβ-Cγ correlation.[38]

The Special Case of Proline

The cyclic nature of the proline side-chain and the lack of the backbone peptidic HN atom set it apart from all other natural amino acids. Well studied in globular proteins, proline residues are often very abundant in regions of proteins that would not lead to diffraction in X-ray crystallographic studies and on this basis were classified among “disorder promoting” amino acids in polypeptides. However, due to the constrained side-chain of proline residues, they confer local rigidity. The steric hindrance given by the cyclic structure of proline lowers the energy barrier present between the cis and the trans conformation of the peptide bond involving the nitrogen of this residue. When proline residues are part of globular folds the local structure can favor the cis conformation, still allowing interconversion to the trans conformer through minor changes, inducing a bend in the polypeptide and therefore protein folding.[169] In highly flexible protein regions, while the occurrence of cis conformation is usually around 0.5% for all the residues in a polypeptide, for proline this can be much higher even in the absence of constraints from a stable 3D structure.[170,171] The possibility to directly monitor correlations involving proline residues in clean regions of CON spectra stimulated the design of a series of experiments to investigate the properties of these residues in globular as well as disordered proteins. One of the first experiments proposed was the (H)CCCdNCO experiment,[82] which correlates the intraresidue 15N and the 13C′ of the preceding residue with all the 13C resonances of a proline ring, providing information to identify the isomerization of the peptide bond. 
An alternative strategy to cleanly select resonances involving proline nitrogen nuclei consists of exploiting the evolution of the 1JNH scalar coupling to suppress resonances of all amino acids except proline.[82] Finally, when the 15N chemical shifts of prolines are sufficiently isolated from those of all other amino acids, such as for highly flexible disordered proteins (15N chemical shift range centered at 137 ppm), the use of a band selective pulse on 15N spins of proline residues makes it possible to detect the proline fingerprint with high resolution and a reduced spectral width and to design Pro-selective experiments to focus the assignment on proline residues. This experimental scheme allows tailoring all the experiments to measure observables such as 15N transverse and longitudinal relaxation rates.[121] Lately, several other experiments were designed following a similar approach, which provides the sequence-specific assignment of proline residues to complement any assignment obtained either with 13C or 1H NMR (or, more likely, a combination of the two).[172] These experiments provided a wealth of information that is beginning to reveal possible functional roles of proline in highly flexible proteins. 
The investigation of flexible linkers of CBP, which constitute more than half of the primary sequence of this complex multidomain coactivator of transcription, revealed that proline residues are often clustered in specific regions of the primary sequence, indicating that these residues can be exploited to maintain elongation of the backbone, such as in the case of CBP-ID3,[101] or act as helix-breakers, separating regions with high helical propensities, as in the case of CBP-ID4.[99] Peculiar motifs with several proline residues in a row (PP, PPP, PPPP) could be detected and characterized with the novel experimental tools developed.[172] Other motifs in which prolines are flanked by specific amino acids are also emerging, as described above for the interaction of α-synuclein with calcium ions through the DPD or EPE motifs. Similar motifs were also identified in different proteins, such as viral proteins[102,173−175] as well as in WASP-interacting protein (WIP), an intrinsically disordered polypeptide with a key role in actin polymerization in activated T cells.[176,177] Another interesting example in this context is provided by aromatic-proline pairs as recently monitored in different proteins[77,178,179] and illustrated below through the example of quail osteopontin.[77] The 13C detected experiments used to complete resonance assignment revealed, in addition to the major form, a subset of peaks with lower intensity that could be clearly identified in the clean region of CON spectra reporting the correlation of proline nitrogen (Figure ). This set of peaks could be assigned in a sequence specific manner and the 13C chemical shifts of proline rings confirmed the presence of cis/trans isomers and provided their relative population. All major peaks have a secondary form and for some of them, which appear clustered in specific regions of the primary sequence, even additional forms due to various combinations of cis and trans conformation could be identified. 
The population of cis isomers is higher for proline residues near aromatic residues. In this protein the rigidity of the closed proline ring is exploited to establish π–π interactions with the aromatic side-chain of neighboring amino acids that impart a compact state to the otherwise very flexible tertiary structure.[78]
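
The kind of analysis summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cis/trans classification uses the commonly reported observation that the proline 13Cβ–13Cγ shift difference is close to 4.5 ppm for trans and close to 9.6 ppm for cis peptide bonds, with an arbitrary 7 ppm cutoff chosen here; the population estimate assumes that peak intensities of the two forms are directly proportional to their populations. Function names and all numerical inputs are hypothetical.

```python
def classify_proline(c_beta_ppm, c_gamma_ppm, threshold_ppm=7.0):
    """Classify the Xaa-Pro peptide bond from the Cbeta - Cgamma shift difference.

    The 7 ppm cutoff is an assumption placed between typical trans
    (about 4.5 ppm) and cis (about 9.6 ppm) differences.
    """
    return "cis" if (c_beta_ppm - c_gamma_ppm) > threshold_ppm else "trans"


def minor_form_fraction(intensity_major, intensity_minor):
    """Fraction of the minor form, assuming intensity is proportional to population."""
    total = intensity_major + intensity_minor
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return intensity_minor / total


# Hypothetical shifts (ppm) and peak intensities (arbitrary units)
major_form = classify_proline(c_beta_ppm=32.0, c_gamma_ppm=27.3)
minor_form = classify_proline(c_beta_ppm=34.3, c_gamma_ppm=24.6)
fraction = minor_form_fraction(intensity_major=92.0, intensity_minor=8.0)

print(major_form, minor_form, f"{100 * fraction:.0f}% minor form")
```

Differences in relaxation or transfer efficiency between the two forms would bias an intensity-based estimate, so such fractions are best read as approximate.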

Figure 12

Proline region in the 2D CON spectrum. The main forms of prolines are observed (left) and minor forms appear (right) when the contour levels are lowered. In the inset, the level of the 2D slice and the relative intensity of major and minor forms are reported. Sequence specific assignment of cross-peaks is reported in the spectra; tentative assignments are indicated with an asterisk. Reprinted in part from ref (77). Copyright 2019 Elsevier Ltd. All rights reserved.

An equilibrium between the cis and trans isomers of proline peptide bonds was shown to be relevant in the case of the RNA polymerase II C-terminal domain, an intrinsically disordered region of the protein.[178] Another interesting observation was reported for the BMAL1 transactivation domain that is involved in circadian clock modulation,[179] showing a direct link with function mediated by an aromatic-proline pair within an intrinsically disordered region. The 13C detected experiments are thus revealing information about novel sequence motifs that are not yet well described.

Overcoming Limitations Due to 1H Transverse Relaxation: Paramagnetic and Large Proteins

Paramagnetic centers are known to provide a wealth of additional contributions to nuclear spins in their surroundings, including contributions to nuclear relaxation, to chemical shifts as well as to partial orientation in high magnetic fields.[180] The protonless NMR experiments provide, by default, additional unique information with respect to 1H detected experiments. They enable monitoring signals also at distances from the paramagnetic center in which the effects sensed by protons are so large that resonances are broadened beyond detection, recovering information that would be lost when using as reading schemes 1H detected experiments (such as 1H–15N or 1H–13C HSQCs).[43−47,181−185] A variety of observables that contain structural information can thus be determined. Several approaches were proposed to quantify paramagnetic contributions to relaxation (either longitudinal or transverse, or both, depending on the type of paramagnetic center) and to exploit them to access distance information between nuclei and paramagnetic centers themselves for structure determination (Figure ).[13]

Figure 13

Presence of a paramagnetic center in a molecule affects its NMR spectrum (blue dots) with respect to the one of a diamagnetic analogue (green dots). The changes in relaxation rates, in chemical shifts, and in signal splitting provide restraints called paramagnetic relaxation enhancement (PRE), pseudocontact shift (PCS) or contact shift (CS), and paramagnetism-induced residual dipolar coupling (RDC), respectively. Paramagnetic effects on nuclear relaxation and chemical shifts are generally smaller for 13C with respect to 1H, and protonless NMR allows the measurement of observables closer to the paramagnetic center with respect to proton NMR.

The approach can be applied for structural purposes to establish the reciprocal orientation of two interacting partners or to establish long-range crosstalk within a protein. Paramagnetic contributions to chemical shifts also contain a wealth of structural information and protonless NMR experiments enable detection of these effects also in regions of macromolecules in which protons are broadened beyond detection.[20,25,47,182,183] The same holds for residual dipolar couplings that can be induced when a significant anisotropy derives from the paramagnetic metal ion itself.[123,156] Therefore, the set of protonless experiments can provide excellent information for the structure determination of paramagnetic proteins. After initial pioneering work showing the potential of the approach, the methods were applied to the study of type II Cu(II) containing proteins such as CopC[45] as well as to oxidized Cu(II)Zn(II) superoxide dismutase[13] to demonstrate that a 3D structure can be obtained also for proteins containing paramagnetic centers that broaden proton resonances beyond detection in large spheres around the metal ion itself. 
13C detected experiments were also used to investigate binding of Cu(II) to αB-Crystallin[186] as well as to follow the uptake of iron ions by ferritin.[187] A similar strategy was used to investigate iron-containing proteins[182,188,189] and calcium binding proteins where the metal ion was replaced with lanthanide ions.[47] In the latter case, the lanthanide ion can be selected to modulate the magnitude and type of paramagnetic effects.[190] Paramagnetic tags are also often engineered on purpose to study intermolecular interactions as well as possible residual structure in disordered proteins.[191−197] The additional interactions deriving from the presence of a paramagnetic center are generally larger in magnitude with respect to internuclear ones on the grounds of the larger electronic magnetic moment. For this reason, they are well suited to reveal long-range information as well as information about short-range distances even if only partially populated in the ensemble of conformers describing disordered systems. In this context protonless NMR experiments provide valuable complementary information to that available through 1H detection.[198−200] Actually, different variants of CON experiments were used to modulate the extent of paramagnetic effects that contribute to cross-peak intensity by exploiting 1H-start variants (1Hα, 1HN) and comparing them to effects observed through 13C-start variants, still exploiting the CON reading scheme to take advantage of its excellent resolution.[199] Indeed, while 13C-start variants only sense paramagnetic relaxation enhancements through 13C nuclei, 1H-start variants reintroduce a dependence also on effects sensed by protons, offering an additional way to modulate the extent to which paramagnetic effects are sensed. 
This approach was recently demonstrated for a highly flexible disordered system such as osteopontin.[199] Upon comparing C′ and HαCON-derived long-range PREs in the protein grafted with MSTL, it appears that both experiments highlight the same regions of the protein (indicated by arrows in Figure ), with the former showing less pronounced effects, in agreement with 13C nuclear spins sensing similar paramagnetic effects at shorter distances from the paramagnetic center with respect to protons. The comparison of the PREs observed through the Hα and HNCON, which both exploit 1H as a starting polarization source, highlighted the presence of proline residues, which are clustered in the vicinity of the so-called compact state of the polypeptide. In addition, the HNCON profiles are less uniform with respect to the HαCON profiles, a feature that could be due to additional solvent-mediated relaxation enhancement effects that are picked up by HNCON and do not influence HαCON data.
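
The statement that 13C spins sense similar paramagnetic effects at shorter distances than protons can be made semi-quantitative with a small calculation. The sketch below assumes that the dipolar PRE scales as γ²/r⁶ for a nucleus with gyromagnetic ratio γ at distance r from the unpaired electron, and that all other factors (spectral densities, electron spin properties, Curie contributions) are identical for the two nuclei, which is a simplification. Function names and the 20 Å example distance are illustrative.

```python
# Gyromagnetic ratios in rad s^-1 T^-1 (rounded literature values)
GAMMA_1H = 267.522e6
GAMMA_13C = 67.283e6


def pre_ratio_same_distance(gamma_x=GAMMA_13C, gamma_ref=GAMMA_1H):
    """PRE of nucleus X relative to the reference nucleus at equal distance (gamma^2 scaling)."""
    return (gamma_x / gamma_ref) ** 2


def equivalent_distance(r_ref_angstrom, gamma_x=GAMMA_13C, gamma_ref=GAMMA_1H):
    """Distance at which nucleus X has the same PRE as the reference nucleus at r_ref.

    Follows from gamma_x^2 / r_x^6 = gamma_ref^2 / r_ref^6.
    """
    return r_ref_angstrom * (gamma_x / gamma_ref) ** (1.0 / 3.0)


ratio = pre_ratio_same_distance()
r_c = equivalent_distance(20.0)  # hypothetical 1H distance of 20 A

print(f"13C/1H PRE ratio at equal distance: {ratio:.3f}")
print(f"13C distance matching a 1H at 20 A: {r_c:.1f} A")
```

Under these assumptions a 13C spin experiences roughly one-sixteenth of the PRE of a proton at the same distance, so equal effects are reached at about 63% of the proton distance.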

Figure 14

PRE intensity profiles determined through the CON-based approach exploiting different starting magnetization sources: C′-CON (top), Hα-CON (middle), HN-CON (bottom). The green circle indicates the position of the spin label (S108C). Black arrows indicate the observed long-range contacts with the spin label. Reproduced with permission from ref (199). Copyright 2019 Wiley-VCH Verlag GmbH and Co. KGaA, Weinheim.

Paramagnetic tags can also provide useful long-range information about intermolecular interactions as well as about the relative orientation of two domains in multidomain proteins, such as for the case of tandem RNA Recognition Motif (RRM) domains (RRM1-RRM2) of the splicing factor U2AF65 bound to a polyuridine (U9) RNA oligonucleotide. A paramagnetic tag was introduced at position 155 in the RRM1 domain; the complementary information achieved through 13C protonless experiments in conjunction with 1H detected ones was very effective in defining the relative orientation of the two RRM domains connected by a flexible linker.[198] Paramagnetic relaxation enhancements sensed by proteins upon addition of highly paramagnetic complexes to the solution (solvent PREs) are also modulated by the experimental tool used to measure them, and protonless experiments proved useful to access complementary information to that achieved through 1H–15N HSQC experiments.[201] Alternatively, solvent PREs have been proposed as a tool to shorten longitudinal recovery times.[202,203] Another area where heteronuclear direct detection can provide useful information, on the grounds of the reduced contributions to relaxation sensed by heteronuclear spins with respect to protons as in the case of paramagnetic proteins, is that of multimeric proteins[204] or large protein assemblies, for which extensive enrichment in 2H (in place of 1H) is needed to reduce the dipolar contributions to nuclear relaxation and thus the protons that can be investigated are limited to exchangeable amide protons. These can be investigated through transverse relaxation optimized spectroscopy (TROSY)[205] that exploits the constructive interference of CSA and DD interactions involving the amide proton and the directly bound amide nitrogen. A similar approach is more difficult to implement in a general way for 13C nuclear spins because of the higher variability in chemical topology and magnitude of the effects. Methyl TROSY,[206] which exploits the interference between 1H–1H and 1H–13C DD interactions instead, represents the most successful application but only allows the study of methyl groups; other approaches have been proposed for aromatic rings as well as for methylene groups. However, on highly deuterated samples, 13C direct detection offers the most general strategy to recover information that is not easy to access otherwise. This approach, initially demonstrated on dimeric SOD (in the reduced, diamagnetic state),[9] proved useful to provide information on a multimeric protein of high molecular mass[52] as well as for the Fc portion of immunoglobulin G (IgG), with a molecular mass of 56 kDa, as a model glycoprotein[207] and for malate synthase G (MSG) protein (82 kDa).[32] With increasing field strength, the utility of this approach is expected to increase thanks to the increased sensitivity and resolution provided by the increase in B0, in particular for aliphatic 13C resonances that only experience a modest contribution to relaxation due to the chemical shift anisotropy tensor at increasing magnetic field.[208,209]

Intrinsically Disordered Proteins

The field where 13C direct detection finds most applications is that of intrinsically disordered proteins (IDPs) or regions (IDRs) of complex multidomain/heterogeneous proteins. Research in this area flourished almost in parallel with the development of the suite of NMR experiments based on 13C direct detection, triggering the design of the most widely used experiments, which in turn provided a wealth of valuable information at atomic resolution about their structural and dynamic properties. Since the early studies on unfolded systems, the importance of exploiting heteronuclei to increase the resolution of cross-peaks in multidimensional NMR spectra was recognized.[210] Uniform isotopic labeling with 13C and 15N indeed contributed to the increase in the complexity of the systems that could be studied in unfolded states. However, initial studies were largely based on 1H direct detection, in particular on detection of amide protons that can be easily correlated to the directly bound 15N, the latter greatly contributing to excellent chemical shift dispersion, a key feature to overcome the main critical point arising from the high flexibility and lack of structure typical of IDPs. Amide protons, however, are influenced by exchange processes with the solvent (see section ) and are thus prone to extensive line broadening, in particular in conditions that enhance exchange processes. An interesting study[118] analyzed the chemical shifts of disordered proteins deposited in the BMRB and revealed that most studies were based on 1H detected triple resonance experiments and were performed in conditions deviating from “physiological” ones because a reduction of pH or temperature (or of both) was necessary to diminish the impact of exchange processes on amide proton line widths and to obtain informative 2D HN spectra. 
In this context, the possibility of shifting to direct detection of 13C′ and exploiting the suite of multidimensional NMR experiments offered an excellent alternative for investigating IDPs. Carbonyl carbon nuclei are not influenced by exchange processes and can be easily correlated to the directly bound nitrogen nuclei which, as mentioned above, are the nuclear spins that retain a large chemical shift dispersion even in the absence of a 3D structure. In addition, the C′–N correlation exploits the scalar coupling across the peptide bond, linking one amino acid with the neighboring one. This offers an important contribution to chemical shift dispersion deriving from the covalent structure of the protein, which is particularly relevant for IDPs, in which chemical shifts tend to average to those predicted for each amino acid type.[69] This set of experiments was used to perform the sequence-specific assignment of a large number of IDPs, starting from initial work on α-synuclein,[23] Src[211] and securin.[75] As a result, the size and complexity of the IDPs investigated increased significantly, and the deposited assignments became more complete because proline resonances could also be monitored, contributing to a more accurate characterization of IDPs.

Mimicking Physiological Events

The possibility of studying IDPs in physiological conditions is another important motivation for choosing 13C detected NMR; this aspect matters in general but particularly for IDPs, which are highly solvent-exposed and thus sensitive to environmental conditions. Exposed backbones are often sites of post-translational modifications (PTMs), a field in which NMR can provide valuable contributions by highlighting at atomic resolution which amino acids are affected by these chemical reactions in real time. Phosphorylation of OH groups of serine, threonine and tyrosine residues is one of the post-translational modifications most extensively studied through NMR.[212] The impact of this reaction on amide proton and nitrogen chemical shifts is very large, and 2D HN spectra have often been used to follow these reactions. However, several kinases change their activity depending on pH and temperature, and it is often important to investigate their effect in conditions near physiological ones. Therefore, the possibility of following these reactions with an additional experiment, the CON, in particular if acquired simultaneously with the 2D HN,[111] offers an excellent tool of investigation for this challenging topic.[63,213] Another strategy that has been proposed, in particular in cases in which sensitivity is an issue (because of either low protein concentration or the instrumental setup), is the use of the most sensitive 2D experiment, the 1H-start CACO. 
Several variants focusing on glycine or nonglycine residues were indeed proposed and used to follow phosphorylation reactions, in the presence of purified kinases or in cell extracts, on a range of previously problematic targets, namely Mdm2, BRCA2, and Oct4.[203] In-cell NMR spectroscopy provides a unique spectroscopic tool to investigate a protein in conditions that resemble physiological ones, and the potential of 13C detected experiments has been tested with several proteins characterized by different molecular mass and motional properties.[214] The results clearly indicate that for well-folded proteins the method of choice is 1H NMR, since the line broadening typical of in-cell protein samples severely affects the detectability of the signals, particularly when CON-type experiments are used. The most valuable contribution provided by 13C detected experiments in this context consists of cleanly picking up signals deriving from highly flexible protein regions that retain their high flexibility within whole cells. Indeed, when dealing with IDPs, 13C detected experiments offer a valuable solution to counter the reduced chemical shift dispersion and high solvent exchange rates of the HN signals, allowing the entire protein backbone to be mapped. The case of α-synuclein is paradigmatic.[112] The complementary features of the different variants of 2D CON spectra (1HN-start, 1Hα-start, 1Hα-flip, etc.) as well as of CACO spectra (1Hα-start, 1Hα-flip) offer many opportunities to pick up the desired correlations within the lifetime of in-cell NMR samples. 
As an example, the HNBESTCON and Hα CON can be acquired in less than 1 h on in-cell α-synuclein samples.[63] More recently, the CON//HN multiple receivers variant was implemented to monitor two experiments simultaneously,[111] an important feature for samples of limited stability, such as in-cell NMR samples, showing that the 13C-start variant of the CON can also provide useful information in a very limited time (<1 h, Figure ). Comparison of the two NMR spectra acquired simultaneously reveals that the increase in line width when passing to the in-cell samples is more pronounced for 1H than for 13C; a possible origin for this is local inhomogeneities, which affect nuclear spins with higher gyromagnetic ratios, such as 1H, more strongly. The simultaneous observation of the two experiments of course provides a wealth of atomically resolved information on the state of the protein within cells.
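
The argument about gyromagnetic ratios can be made concrete with a back-of-the-envelope estimate. The sketch below assumes a purely static local field spread ΔB0, for which the additional line width in Hz is γΔB0/2π; the value of ΔB0 is hypothetical and chosen only for illustration, while the gyromagnetic ratios are standard tabulated values.

```python
import math

# Gyromagnetic ratios in rad s^-1 T^-1 (standard tabulated values).
GAMMA_1H = 267.522e6
GAMMA_13C = 67.283e6


def inhomogeneous_broadening_hz(gamma, delta_b0_tesla):
    """Extra line width (Hz) caused by a static field spread: gamma * dB0 / (2*pi)."""
    return gamma * delta_b0_tesla / (2.0 * math.pi)


# Hypothetical local field spread of 0.1 microtesla (illustration only).
delta_b0 = 1e-7

extra_1h = inhomogeneous_broadening_hz(GAMMA_1H, delta_b0)
extra_13c = inhomogeneous_broadening_hz(GAMMA_13C, delta_b0)

# The ratio is independent of delta_b0 and equals gamma(1H) / gamma(13C).
ratio = extra_1h / extra_13c
print(f"1H: {extra_1h:.2f} Hz, 13C: {extra_13c:.2f} Hz, ratio: {ratio:.2f}")
```

Under this simple assumption the same field spread broadens a 1H line, in Hz, about four times more than a 13C line, whatever the actual value of ΔB0.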

Figure 15

Scheme of the CON//HN experiment (top panel) and comparison of the 2D spectra (HN left, CON right) acquired through this experiment on 13C, 15N labeled α-synuclein at 310 K in different experimental conditions. From top to bottom: purified sample in 100 mM NaCl, 50 μM EDTA, 20 mM phosphate buffer at pH 7.4; in Escherichia coli cell lysate resuspended in the same buffer; and in-cell. The traces of a representative signal extracted from the HN (red, left) and the CON (blue, right) spectra are also reported. Adapted with permission from ref (111). Copyright 2019 Biophysical Society.

Liquid–Liquid Phase Separation

Liquid–liquid phase separation (LLPS) is a phase transition process in which a homogeneous solution separates into two different coexisting liquid phases. The interface between the dense and the light phases allows the passage of only certain molecules, making these droplets act as compartments, which are called membrane-less organelles.[215] IDPs and proteins containing IDRs are over-represented among proteins that undergo phase separation processes, especially in condensates containing RNA. This is likely due to their ability to form transient multivalent interactions. The LLPS behavior of IDPs is governed by various interactions, ranging from charge–charge interactions to π–π and cation−π interactions as well as hydrophobic contacts and charge pairing; in vitro, phase separation is often driven by changes in pH, temperature, or ionic strength of the solution.[216] In living cells, condensates can act as organizational centers promoting reactions or sequestering proteins and nucleic acids to inhibit metabolic pathways, and it is crucial to understand the driving forces leading to their formation. Furthermore, phase transitions are also associated with a range of neurodegenerative diseases, and it is pharmaceutically relevant to understand how to interfere with their formation.[217,218] NMR spectroscopy is one of the main players in this area, and 13C NMR has been exploited to characterize some of these processes at the molecular level. An interesting example is the phase transition of Tau, an IDP that functions in microtubule nucleation, assembly, and stabilization. This large protein contains an N-terminal projection domain, two proline-rich regions, and a C-terminal domain that includes three (3R) or four (4R) imperfect 31- or 32-residue repeats. 
The repeat domain constitutes the microtubule-binding domain and has the ability to bind microtubules and to promote their assembly.[219] Tau and its shorter variant containing the 4R repeats (termed K18) undergo liquid–liquid phase separation in vitro at pH 8.8 and 37 °C (Figure ), and 13C NMR was key to mapping at the residue level the changes occurring upon LLPS, since most of the signals of 1HN-detected spectra, and those of Gly, Ser and Thr residues in particular, were broadened beyond detection in these conditions. The high quality of the spectra obtained, including CON and CBCACO, suggested the possibility of following chemical shift perturbations for each residue and of highlighting changes in secondary structure propensities upon droplet formation.[220]
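
As an illustration of how such residue-level changes can be tabulated, the sketch below computes a combined C′/N chemical shift perturbation for each CON cross-peak between two conditions. The peak labels, the shift values, and the 15N weighting factor of 0.3 are all hypothetical choices made for the example; they are not taken from ref (220), and the weighting in particular should be adapted to the data at hand.

```python
import math


def con_csp(delta_c_ppm, delta_n_ppm, n_weight=0.3):
    """Combined C'/N perturbation (ppm); n_weight is an assumed 15N scaling factor."""
    return math.sqrt(delta_c_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


# Hypothetical CON peak positions: label -> (13C' shift in ppm, 15N shift in ppm).
dispersed = {"peak_A": (173.90, 109.80), "peak_B": (174.60, 116.40)}
droplet = {"peak_A": (173.95, 109.70), "peak_B": (174.60, 116.40)}

# Perturbation of each peak when passing from the dispersed to the droplet state.
csp = {
    label: con_csp(
        droplet[label][0] - dispersed[label][0],
        droplet[label][1] - dispersed[label][1],
    )
    for label in dispersed
}

# List peaks from the most to the least perturbed.
for label, value in sorted(csp.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {value:.3f} ppm")
```

Plotting such values against the residue number is one simple way to highlight the regions most affected by the change in conditions.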

Figure 16

NMR spectroscopy of liquid–liquid phase separation of the repeat domain of tau. (a, b) Superposition of 2D 1H-15N HSQC (a) and 2D CON (b) spectra of K18 in the monomeric dispersed state (5 °C, black) and the droplet phase (37 °C, red). To avoid contributions of solvent exchange to NMR signal broadening in 1H-15N correlation spectra, D2O was placed into a separate capillary tube (insert in a). (c) Superposition of CON spectra of K18 recorded at 37 °C in the absence (−hex, red) and presence (+hex, blue) of 3% 1,6-hexanediol, which rapidly dissolved K18 droplets (e). (d) DIC microscopy demonstrates K18 droplet formation at 37 °C. (e) Time-dependent dissolution of K18 droplets in the presence of 3% 1,6-hexanediol observed by DIC microscopy. Reproduced from ref (220) with permission from the Royal Society of Chemistry.

Another example was recently provided by the investigation of the C-terminal disordered regions of two interacting proteins, FMRP and CAPRIN1.[221] These two fragments form condensates individually and in interaction when FMRP is phosphorylated in specific positions. The comparison of the CON spectrum of CAPRIN1 in the presence of phase-separated FMRP and of CAPRIN1 cophase-separated with phosphorylated FMRP (pFMRP) highlighted the involvement of two arginine-rich regions in the interaction as well as the involvement of some tyrosine residues.

Low Complexity Regions

“Low complexity regions” is a term widely used to describe parts of polypeptide chains with a simple amino acid composition, featuring a few or even only one amino acid type. These are not expected to be parts of globular folds, at least on the basis of current knowledge, and are often found as parts of IDPs/IDRs. Among them, an interesting example is provided by the so-called poly-Q tracts, that is, parts of polypeptide chains constituted by many glutamine residues (Qs) in a row. These motifs attracted the attention of the scientific community because some proteins featuring elongated stretches of Qs are linked to the onset of the so-called poly-Q diseases, a set of neurodegenerative diseases caused by malfunction of proteins that only have in common a part of the polypeptide chain with a high number of glutamine residues.[222] The atomic resolution structural and dynamic investigation of these proteins is not easy because they do not crystallize, preventing the use of X-ray crystallography, and their investigation is also challenging from the NMR point of view because of the highly repetitive primary sequence. With 13C detected NMR experiments it was possible to perform the complete sequence-specific assignment of the N-terminal fragment of the androgen receptor (AR) comprising up to 25 Qs in a row (Figure ), not just as an isolated peptide but embedded in the N-terminal fragment of AR (135 and 156 amino acids in the 4Q and 25Q variants, respectively). This study revealed a pronounced helical propensity of this region and highlighted the important role of four leucine residues preceding the poly-Q tract in inducing helicity in the poly-Q region, as confirmed through a deletion mutant.[100,223] This may provide a mechanism to protect the poly-Q tract from aggregation, a relevant process for the onset of a neurodegenerative disease (spinal bulbar muscular atrophy). 
Interestingly, these poly-Q helices, induced by preceding leucine residues, were also identified in functional proteins such as the ID5 linker of CBP.[224] The study of poly-Q containing proteins has now become a very active field of research.[225−227]

Figure 17

Comparison of the (A, B) NMR and (C) CD spectra of 4Q and 25Q; the secondary structure propensity of 4Q and 25Q is reported in (D). The central region of (A) the 13C,15N-CON-IPAP spectrum and (B) the 1H–15N HSQC spectrum of 400 μM 4Q (black) and 25Q (red) at 278 K are shown, with the assignment of the resonances that experience the largest chemical shift variations upon increasing the length of the poly-Q tract. (C) CD spectrum of 120 μM 4Q and 130 μM 25Q at 310 K. In (D), values for residues 55 to 62, corresponding to the 55LLLL58 motif and the first 4 Gln of the poly-Q tract, are shown in blue to highlight the variation of the structural properties of the protein due to the different length of the poly-Q tract. To facilitate the comparison, the values for the residues of 4Q that follow the poly-Q tract are shifted to the right by 21 units. Adapted from ref (100). Copyright 2016 Biophysical Society.

Conclusions and Outlook

Largely neglected for many years in favor of approaches exploiting direct detection of protons on the grounds of the intrinsically higher sensitivity of the latter, 13C direct detected NMR in solution has been rediscovered since the early 2000s thanks to hardware development and experimental improvements. The first spectra we recorded at 16.4 T (700 MHz 1H frequency) were obtained with a spectrometer equipped with a room-temperature inverse detection probehead, with an S/N ratio on the ASTM standard sample of about 200:1; today we routinely use a cryogenically cooled probe for direct 13C observation with an S/N ratio of 2800:1 on the same sample. This increase in sensitivity opened the way to routine use of 13C direct detection by allowing a significant reduction of the experimental time needed to access the same information (a factor of about 200). With the latest magnet and probehead technology, the S/N ratio rises above 4400:1 for a 28.1 T (1.2 GHz) spectrometer. In a stepwise fashion, solutions to overcome the potential limitations of the approach have been devised. Automated removal of the large splitting due to one-bond scalar couplings in the 13C direct dimension was an important contribution to making the approach practical. Different polarization sources were used as the basis to design a wide variety of NMR experiments, ranging from simple 2D experiments to achieve a snapshot of a protein in solution, to 3D experiments for sequence-specific resonance assignment, all the way to complex multidimensional experiments (nD, with n > 3) exploiting the most advanced strategies to acquire and process the data. In addition, a wide variety of experiments to detect different NMR observables were developed and made available in the NMR spectrometers’ library of pulse sequences, thus becoming of general use. 
Browsing the literature, one finds that next to papers dedicated to the development of the experiments and to the exploration of potential novel applications, there are many other papers that include 13C detected experiments side by side with 1H detected ones for fingerprinting or to complete the characterization or the assignment of a protein. This is a clear indication that 13C direct detection NMR in solution, regarded in the early 2000s as a sort of revival of a vintage technology, is now considered a useful complement to 1H NMR that can provide unique information or simplify the assignment and characterization of biomolecules in general. One of the research areas where 13C direct detection has proven very successful, as also witnessed by the many applications populating this account, is the characterization of intrinsically disordered proteins and heterogeneous proteins presenting highly mobile portions of their chain. Indeed, 13C NMR provides larger chemical shift dispersion with respect to proton NMR and benefits from the favorable relaxation properties of disordered polypeptides. We expect that this area will be even more important in the future since these highly flexible disordered proteins constitute a large share of the proteome and need to be investigated at the atomic level. This leads to the topic of the capability of highly flexible proteins to engage in intermolecular interactions even in a nonspecific way, interactions that lead to fuzzy complexes as well as to the process of liquid–liquid phase separation, which is still not well characterized. The investigation of these interactions at atomic resolution has great potential to reveal the driving forces of these processes, and exclusively heteronuclear experiments can play a central role, monitoring both the protein backbone and the key side chains involved.
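
The quoted time saving follows from the usual signal-averaging argument. As a minimal sketch, assuming that S/N grows with the square root of the number of scans (so that the time needed to reach a given S/N scales as the inverse square of the single-scan sensitivity), the S/N values given above reproduce the factor of about 200. The function names are illustrative only, and the 1H gyromagnetic ratio is the standard tabulated value.

```python
# 1H gyromagnetic ratio divided by 2*pi, in MHz per tesla (tabulated value).
GAMMA_1H_MHZ_PER_T = 42.577


def larmor_mhz(b0_tesla, gamma_mhz_per_t=GAMMA_1H_MHZ_PER_T):
    """Larmor frequency (MHz) at field b0_tesla for the given gamma/2pi."""
    return gamma_mhz_per_t * b0_tesla


def time_saving_factor(sn_old, sn_new):
    """Reduction in measuring time for equal final S/N, assuming S/N ~ sqrt(scans)."""
    return (sn_new / sn_old) ** 2


# S/N values quoted in the text for the ASTM standard sample.
factor_cryoprobe = time_saving_factor(200.0, 2800.0)
factor_1p2_ghz = time_saving_factor(200.0, 4400.0)

print(f"1H frequency at 16.4 T: {larmor_mhz(16.4):.0f} MHz")
print(f"1H frequency at 28.1 T: {larmor_mhz(28.1):.0f} MHz")
print(f"time saving, 200:1 -> 2800:1: {factor_cryoprobe:.0f}")
print(f"time saving, 200:1 -> 4400:1: {factor_1p2_ghz:.0f}")
```

With these numbers the S/N ratio improves 14-fold, and 14 squared is 196, consistent with the factor of about 200 quoted above.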

208 in total

1. Cooling overall spin temperature: protein NMR experiments optimized for longitudinal relaxation effects.

Authors: Michaël Deschamps; Iain D Campbell
Journal: J Magn Reson Date: 2005-10-24 Impact factor: 2.229

2. Protonless NMR experiments for sequence-specific assignment of backbone nuclei in unfolded proteins.

Authors: Wolfgang Bermel; Ivano Bertini; Isabella C Felli; Yong-Min Lee; Claudio Luchinat; Roberta Pierattelli
Journal: J Am Chem Soc Date: 2006-03-29 Impact factor: 15.419

3. Structural and dynamic characterization of intrinsically disordered human securin by NMR spectroscopy.

Authors: Veronika Csizmok; Isabella C Felli; Peter Tompa; Lucia Banci; Ivano Bertini
Journal: J Am Chem Soc Date: 2008-12-17 Impact factor: 15.419

4. High-resolution characterization of intrinsic disorder in proteins: expanding the suite of (13)C-detected NMR spectroscopy experiments to determine key observables.

Authors: Ivano Bertini; Isabella C Felli; Leonardo Gonnelli; M V Vasantha Kumar; Roberta Pierattelli
Journal: Chembiochem Date: 2011-10-17 Impact factor: 3.164

Review 5. Fast time-resolved NMR with non-uniform sampling.

Authors: Dariusz Gołowicz; Paweł Kasprzak; Vladislav Orekhov; Krzysztof Kazimierczuk
Journal: Prog Nucl Magn Reson Spectrosc Date: 2019-09-25 Impact factor: 9.795

6. Effect of deuteration on the amide proton relaxation rates in proteins. Heteronuclear NMR experiments on villin 14T.

Authors: M A Markus; K T Dayie; P Matsudaira; G Wagner
Journal: J Magn Reson B Date: 1994-10

7. 13C direct detection experiments on the paramagnetic oxidized monomeric copper, zinc superoxide dismutase.

Authors: Wolfgang Bermel; Ivano Bertini; Isabella C Felli; Rainer Kümmerle; Roberta Pierattelli
Journal: J Am Chem Soc Date: 2003-12-31 Impact factor: 15.419

8. Amino acid recognition for automatic resonance assignment of intrinsically disordered proteins.

Authors: Alessandro Piai; Leonardo Gonnelli; Isabella C Felli; Roberta Pierattelli; Krzysztof Kazimierczuk; Katarzyna Grudziąż; Wiktor Koźmiński; Anna Zawadzka-Kazimierczuk
Journal: J Biomol NMR Date: 2016-02-18 Impact factor: 2.835

9. Phosphorylation induces sequence-specific conformational switches in the RNA polymerase II C-terminal domain.

Authors: Eric B Gibbs; Feiyue Lu; Bede Portz; Michael J Fisher; Brenda P Medellin; Tatiana N Laremore; Yan Jessie Zhang; David S Gilmour; Scott A Showalter
Journal: Nat Commun Date: 2017-05-12 Impact factor: 14.919

10. Interaction between the scaffold proteins CBP by IQGAP1 provides an interface between gene expression and cytoskeletal activity.

Authors: Simone Kosol; Sara Contreras-Martos; Alessandro Piai; Mihaly Varadi; Tamas Lazar; Angela Bekesi; Pierre Lebrun; Isabella C Felli; Roberta Pierattelli; Peter Tompa
Journal: Sci Rep Date: 2020-04-01 Impact factor: 4.379
