NMR Provides Unique Insight into the Functional Dynamics and Interactions of Intrinsically Disordered Proteins.

Aldo R Camacho-Zarco¹, Vincent Schnapka¹, Serafima Guseva¹, Anton Abyzov¹, Wiktor Adamski¹, Sigrid Milles¹, Malene Ringkjøbing Jensen¹, Lukas Zidek^2,3, Nicola Salvi¹, Martin Blackledge¹.

Abstract

Intrinsically disordered proteins are ubiquitous throughout all known proteomes, playing essential roles in all aspects of cellular and extracellular biochemistry. To understand their function, it is necessary to determine their structural and dynamic behavior and to describe the physical chemistry of their interaction trajectories. Nuclear magnetic resonance is perfectly adapted to this task, providing ensemble averaged structural and dynamic parameters that report on each assigned resonance in the molecule, unveiling otherwise inaccessible insight into the reaction kinetics and thermodynamics that are essential for function. In this review, we describe recent applications of NMR-based approaches to understanding the conformational energy landscape, the nature and time scales of local and long-range dynamics and how they depend on the environment, even in the cell. Finally, we illustrate the ability of NMR to uncover the mechanistic basis of functional disordered molecular assemblies that are important for human health.

Year: 2022 PMID: 35446534 PMCID: PMC9136928 DOI: 10.1021/acs.chemrev.1c01023

Source DB: PubMed Journal: Chem Rev ISSN: 0009-2665 Impact factor: 72.087

Introduction

Unexpected discoveries regularly revolutionize our understanding of molecular biology. The remarkable observation that intrinsically disordered proteins are prevalent throughout all known proteomes represents one such example, forcing a reassessment of established approaches for investigating biological function at the molecular level.[1−5] Unlike folded proteins, the primary amino acid sequence of intrinsically disordered proteins (IDPs) does not adopt a stable tertiary fold to function but dynamically samples a broad free-energy surface. IDPs thus access a vast conformational landscape that nevertheless encodes specific biological activity.[6] This conformational heterogeneity endows IDPs with considerable advantages over their folded counterparts, for example, the ability to interact with multiple partners, possibly simultaneously as in the case of hub-proteins. Combining transient and local disorder-to-order transitions with rapid dissociation rates allows efficient processing and provides the necessary level of multivalent, weak intermolecular binding to transiently form membraneless organelles[7] (another phenomenon whose importance has revised our understanding of cell regulation and function). In general, although the potential benefits of conformational disorder are quite well discussed in the literature, we are still discovering the true breadth of functional diversity encoded in IDPs. Structural dynamics are of course essential to biological function in all proteins, and the characterization of the conformational fluctuations that enable function is a vital aspect of our quest for a molecular understanding of biology. 
Complementary to the stabilization of distinct conformational substates and the determination of their three-dimensional structures at given points in a functional cycle, direct physical methods such as infrared,[8,9] terahertz,[10] neutron,[11] dielectric,[12] Mössbauer,[13] and Raman[14] spectroscopies can be used to describe the characteristic time scales of protein motions. Time-resolved X-ray diffraction techniques[15] and X-ray free electron lasers[16] also provide simultaneous access to both high resolution structure and dynamics. Within the broad panoply of physical techniques available to characterize biomolecular dynamics, nuclear magnetic resonance (NMR) spectroscopy occupies a unique place, providing atomic resolution information over an incredibly broad range of motional time scales extending from tens of picoseconds to hours or even days (Figure 1).

Figure 1

NMR probes biomolecular conformational changes on a vast range of time scales. NMR spin relaxation provides accurate information on the reorientational properties of relaxation-active interactions, normally interatomic bonds, up to tens of nanoseconds. In the fast exchange limit, a single NMR peak represents a population weighted average over the chemical shifts of each populated substate. When the exchange rate is in the same range as the difference in chemical shifts of the distinct states, on time scales from tens of microseconds to hundreds of milliseconds in proteins, line-broadening is observed, and 1H, 13C, and 15N NMR exchange approaches can be used to characterize interconversion between the different conformational states. Exchange that is significantly slower than the difference in chemical shifts of the distinct states gives rise to slow exchange, allowing all states to be individually investigated.

Flexibility and dynamics not only define the physical nature but also the biological function of IDPs, and the two major challenges facing interpretation of experimental data from IDPs are related to these characteristics. The first concerns the accurate description of the conformational space sampled by the protein. NMR reports on a population-weighted average over the ensemble of interconverting states sampled at equilibrium so that as long as the exchange rates are fast on an NMR time scale, conformation-dependent parameters, such as chemical shift or scalar and dipolar couplings, report on interconversion between a potentially immense number of conformers. In practice for NMR studies of proteins using 1H, 15N, and 13C nuclei, this means interconversion on time scales faster than hundreds of microseconds. Interpretation of experimental data therefore requires statistical mechanical approaches to evaluate the nature of the conformational ensemble.
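The distinction between fast, intermediate, and slow exchange can be illustrated with a minimal Python sketch. It assumes a simple two-site picture in which the exchange rate is compared with the chemical shift difference expressed in rad/s. The function names, the cutoff factor `ratio`, and the numerical values are illustrative assumptions and are not taken from the review.

```python
import math

def averaged_shift(populations, shifts_ppm):
    """Population-weighted mean shift observed in the fast-exchange limit."""
    total = sum(populations)
    return sum(p * s for p, s in zip(populations, shifts_ppm)) / total

def exchange_regime(k_ex, delta_ppm, larmor_mhz, ratio=10.0):
    """Classify two-site exchange by comparing k_ex (s^-1) with the shift
    difference in rad/s. `ratio` is an arbitrary, illustrative cutoff."""
    delta_omega = 2.0 * math.pi * delta_ppm * larmor_mhz  # ppm x MHz = Hz
    if k_ex > ratio * delta_omega:
        return "fast"
    if k_ex < delta_omega / ratio:
        return "slow"
    return "intermediate"

# Hypothetical 15N example: 2 ppm shift difference at a 60.8 MHz 15N frequency.
observed = averaged_shift([0.8, 0.2], [120.0, 125.0])
regimes = [exchange_regime(k, 2.0, 60.8) for k in (1e6, 1e3, 1e1)]
```

In this toy example the three exchange rates fall into the fast, intermediate, and slow regimes, respectively. In practice the boundaries are gradual rather than sharp, which is why the cutoff is exposed as a parameter.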
The available degrees of conformational freedom that are accessible to IDPs significantly outweigh the ability of the experimental constraints to uniquely define the free-energy surface. Regardless of the approach used to delineate the conformational space, caution must therefore be employed to derive meaningful ensemble models that correctly describe the long-range and local conformational sampling. To this end, there has been considerable methodological development aiming to delineate the contours and limits of local and long-range conformational space sampled by IDPs in solution,[17−27] from NMR, and other complementary biophysical techniques such as small angle scattering and single molecule Förster resonance energy transfer (smFRET).[28−32] Progress in this direction has focused on the use of extensive exploration of conformational space, using for example stochastic sampling of the available degrees of freedom, and subsequent identification of combinations of conformers that when assembled into representative ensembles agree with experimental data and can describe the contours of the Boltzmann ensemble.[33−37] The success of such approaches is predicated on the ability to accurately calculate the expected value of experimental data for a given conformation or conformational sampling regime. 
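The sample-and-select idea described above can be sketched in a few lines of Python. This is a deliberately simplified illustration: it uses a stochastic swap search instead of the genetic algorithms used in published ensemble-selection tools, and the pool, observables, and target values are synthetic. All function names and parameters are hypothetical.

```python
import random

def ensemble_average(pool, members):
    """Average each observable over the selected conformers."""
    n_obs = len(pool[0])
    return [sum(pool[i][k] for i in members) / len(members) for k in range(n_obs)]

def chi2(pool, members, target, sigma):
    """Squared deviation between ensemble-averaged and target observables."""
    avg = ensemble_average(pool, members)
    return sum(((a - t) / sigma) ** 2 for a, t in zip(avg, target))

def select_ensemble(pool, target, size, sigma=1.0, steps=2000, seed=0):
    """Swap one conformer at a time; keep swaps that do not worsen chi2.
    A conformer may appear more than once, acting as a higher weight."""
    rng = random.Random(seed)
    members = rng.sample(range(len(pool)), size)
    start = best = chi2(pool, members, target, sigma)
    for _ in range(steps):
        trial = list(members)
        trial[rng.randrange(size)] = rng.randrange(len(pool))
        score = chi2(pool, trial, target, sigma)
        if score <= best:
            members, best = trial, score
    return members, start, best

# Synthetic pool: 500 "conformers", each with 5 precomputed observables.
rng = random.Random(1)
pool = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(500)]
# Synthetic "experimental" data: averages over a hidden, biased subset.
hidden = [i for i in range(500) if pool[i][0] > 0.5][:20]
target = ensemble_average(pool, hidden)
members, start, best = select_ensemble(pool, target, size=20)
```

Agreement with the target only shows that the selected subensemble is compatible with the data. As the text stresses, it does not by itself show that the conformational sampling is uniquely defined.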
The same end can be achieved via ensemble-restrained molecular dynamics simulation,[38−41] for example, by including experimental data into the force field via a target function applied over the entire ensemble.[42−51] The amount of detail concerning the conformational sampling of IDPs in solution that can be derived from all of these ensemble approaches depends of course heavily on the extent of experimental data available.[52] The advantage of the fast exchange regime, reporting on a population-weighted average over an ensemble of states that interconvert on time scales faster than 100 μs, also highlights its key limitation that more precise information about the associated motional time scales is not explicitly contained in this average. Knowledge of the time scales of diffusion and chain dynamics, of interconversion rates between locally structured binding-competent and incompetent substates, and of transient contacts relating the conformational properties of distant regions of IDPs will all play an essential role in developing a deeper understanding of IDP reaction kinetics and thermodynamics. Understanding the dynamic properties of IDPs complements Cartesian descriptions of their exploration of conformational space, providing a new and essential dimension to our description of their functional behavior. In response to this challenge, time scales of conformational rearrangements of IDPs have been investigated using a vast range of experimental techniques,[53] sensitive to local conformational dynamics such as infrared,[54,55] Raman,[56] or neutron spectroscopy[57−59] or to long-range interactions using single molecule fluorescence,[60−69] electron paramagnetic resonance,[70−72] and NMR paramagnetic relaxation spectroscopies,[73−79] but by far the most powerful technique is the use of NMR spin relaxation.
NMR spin relaxation probes the angular correlation functions of relaxation active mechanisms, typically dipole–dipole interactions between neighboring nuclei, arising due to reorientation processes of macromolecules on time scales ranging from 10s of picoseconds to 10s of nanoseconds or even slower. These time scales are also readily accessible to atomistic molecular dynamics (MD) simulation of fully solvated proteins, rendering the combination of MD and NMR extremely powerful. Advances in molecular simulation, in terms of accuracy of force-fields or sampling of slower dynamic time scales,[80−85] have always accompanied advances in our understanding of the interpretation of NMR relaxation in terms of global and local molecular motions, demonstrating the synergy between these two atomic resolution techniques. Indeed, 15N and 13C NMR relaxation data have often been used to test and benchmark MD force fields and algorithms,[82,86−91] establishing the accuracy of dynamic trajectories of soluble, folded proteins. 15N spin relaxation provides a remarkably sensitive probe of the motional time scales exhibited by IDPs, characterizing the dynamic properties of bond vectors throughout the length of the unfolded protein.[92] The physical interpretation of the dynamic time scales contributing to the quenching of the angular correlation function is however less straightforward than in the case of folded proteins. The amount of information that can be extracted from spin relaxation is also limited by the efficiency with which fast large-scale motions quench the angular correlation function. 
15N spin relaxation in unfolded proteins has nevertheless been measured extensively, leading to the detection of pico- and nanosecond motions throughout the chain, as well as clear correlations between motional time scales and structural propensities detected from chemical shifts and scalar and dipolar couplings.[93−114] Further insight into the actual physical origin of the motional modes and time scales giving rise to NMR spin relaxation can again be derived from the combination of MD simulation with spin relaxation measurements.[115−118] Measured relaxation rates report on population-weighted averages so that accurate simulation should account for fast motions occurring over the ensemble of states sampled by the protein. The value of relaxation rates associated with each substate depends on the nature of this conformation, so that in principle it would be necessary to simulate each of the substates and average the individual rates as a function of their populations, or to simulate sufficiently long trajectories to sample all individual states. In the case of globular proteins, the identification and simulation of distinct conformational substates that are in fast exchange on the chemical shift time scale but that exhibit distinct fast reorientational properties have indeed been shown to significantly improve the description of the ensemble of fast motions, as measured by the reproduction of experimental 15N relaxation rates.[119] This demonstrates the improved accuracy of dynamic information when considering the entire free-energy surface but also the interdependence of fast and slower motions in proteins.
For IDPs, this potential interdependence has an even greater importance and underlines the relevance of adequate sampling of the ensemble of conformational states.[120] Despite major progress in the simulation of highly flexible or unfolded proteins,[42,118,121−123] a more general application of these techniques has been hindered by the inability of state-of-the-art force fields to describe the dynamics of IDPs with acceptable accuracy.[90,124,125] While the degrees of conformational freedom available to internuclear covalent bonds present in folded proteins are mainly dictated by the immediate environment, and therefore intraprotein interactions, for IDPs the solvent protein interactions take on a far greater importance, so that an imbalance between potential energy terms reporting on protein–protein and protein–solvent interactions[124] may result in inaccurate kinetic and thermodynamic behavior. The resolution of this question, and the development of force fields that can describe both folded and unfolded proteins with equal accuracy,[126] remains an important challenge.[90,120,124,127−130] The availability of accurate and calibrated NMR relaxation rates from proteins with well-described conformational behavior will undoubtedly contribute to this important task. Beyond the fast exchange regime, NMR relaxation experiments no longer represent a population-weighted average of the reorientational properties of the exchanging species but report on motions occurring on time scales defined by the difference in chemical shifts of the exchanging subspecies, in the range of micro to milliseconds. 
In this regime, NMR exchange spectroscopy is a particularly powerful way to probe the molecular mechanisms underlying the exchange contributions, providing information on the thermodynamics, free-energy landscape, and kinetics of the interconversion between the species.[131−136] Finally, our understanding of the functional modes adopted by IDPs is enriched by every physiologically relevant complex that is characterized experimentally. The functional interactome of IDPs is vast and potentially highly diverse, and our experimental sampling of the interaction modes employed by IDPs remains extremely sparse. Although specific model systems that are experimentally well-characterized provide useful benchmarks, insight into the true diversity of the IDP interactome requires more sampling, of more diverse systems, at atomic resolution. Exchange NMR, whether fast, intermediate, or slow, provides powerful tools to deliver this essential insight. The aim of this review is to describe recent developments of NMR-based approaches to understand the conformational dynamic behavior of IDPs in physiological, and even cellular, environments, and to illustrate the insight that NMR offers to reveal the mechanistic basis of functional disordered assemblies that are important for human health. Part of the power of NMR spectroscopy lies in its combination with structural techniques such as cryoEM and X-ray diffraction that provide the structural context within which the functional role of IDRs can be best understood. Examples will also be shown of the ability of NMR to characterize large-scale dynamics of complex biomolecular assemblies comprising highly disordered elements.

Accurate Mapping of the Conformational Landscape of IDPs

An accurate understanding of the conformational properties of IDPs, and intrinsically disordered regions (IDRs) of multidomain proteins, is of primordial importance. The dynamic behavior of IDPs is defined by the amino acid sequence, and the ability of the protein to interact via, for example linear motifs, is encoded and controlled by the intrinsic conformational sampling. In addition, IDRs, often linking folded domains, define the free-energy landscape of the protein, providing the degrees of conformational freedom of the entire molecular assembly.[6,137−139] Characteristics such as charge and hydrophobicity distribution of IDPs have been interpreted in terms of their role in controlling physical parameters, for example, compactness and extendedness,[140] and the ability of IDPs to participate in multivalent interactions.[141−144] Similarly, regulation of these degrees of freedom can be achieved by post-translationally modifying the chemical nature of the chain.[145−148] Two recent studies described herein illustrate the importance of a detailed consideration of the averaging properties of different experimental data types to understand the conformational nature of IDRs. In particular, the combination of long-range and local transient structure poses specific challenges to the analysis of disordered proteins in terms of representative ensembles, and certain pitfalls must be avoided to extract accurate structural information. Chemical shifts and scalar couplings present two important features that directly impact their interpretation. First, they depend primarily on the local structural environment of the observed spin, and second, if interconversion between the different states is much faster than the difference between the expectation values of the different states in isolation, the measured NMR spectrum represents a weighted average of the ensemble of states. 
Conversely, parameters whose experimental values depend on time-dependent interactions, such as paramagnetic relaxation for example, require a more detailed consideration of the averaging properties, as has been discussed.[79] Residual dipolar couplings (RDCs) depend on the average of the orientations of the internuclear vector (I–S) with respect to the magnetic field,

$$D_{IS} = K \left\langle P_2(\cos\theta) \right\rangle$$

where θ is the angle between the internuclear vector and the magnetic field, K describes physical constants such as the gyromagnetic ratio and the internuclear distance, and P2(x) = (3x² – 1)/2. In a molecule of fixed shape, we can expand this average,

$$D_{IS} = K \sum_{i,j} S_{ij} \cos\alpha_i \cos\alpha_j$$

where α refers to the orientation of the internuclear vector with respect to a traceless second rank tensor S that describes the alignment properties of the molecule. In highly flexible proteins, S can clearly vary significantly over the ensemble such that proteins of different shape, and therefore different alignment properties, but identical local sampling, would give rise to very different RDCs:

$$D_{IS} = K \left\langle \sum_{i,j} S_{ij}^{(k)} \cos\alpha_i^{(k)} \cos\alpha_j^{(k)} \right\rangle_k$$

where k runs over the conformers of the ensemble. Using simple and intuitive simulation of target ensembles, it was demonstrated that ensemble descriptions derived from RDCs of molecular systems whose shape varies significantly over the ensemble can actually reproduce experimental data very closely, even without explicit consideration of the alignment properties of the component conformations.
However, the orientational properties of the internuclear vectors are then severely compromised and inaccurately describe the conformational space compared to the target ensemble.[149] This reiterates the long-held observation that to accurately describe local and long-range conformational sampling, it is necessary to respect both of these contributions to the average over the ensemble of states.[150] The importance of considering long-range order in the interpretation of RDCs was also illustrated in a recent study of the δ domain of RNA polymerase (δ−RNAP), where multiple NMR parameters and small angle scattering data were combined using the ensemble selection approach, ASTEROIDS, to compare the free energy landscape of different forms of the protein. ASTEROIDS uses extensive conformational sampling described in an initial prior database, broadly sampling amino-acid specific statistical-coil distribution for the unfolded chain,[151,152] and a genetic algorithm, to select representative subensembles of conformers that in combination are in agreement with the experimental data. The sampling of the prior database is modified iteratively until convergence is achieved within the estimated uncertainty.[37] In the case of δ−RNAP, the 90 amino acid C-terminal IDR follows the similarly sized folded domain.[153] The IDR is locally highly charged, with mainly acidic but also basic stretches of amino acids. As in the case of a number of acidic disordered domains in RNA-polymerase machinery, the acidic sequence has been suggested as an RNA mimic.[154] Experimental data used to describe the conformational sampling of the IDR included 13C, 15N, and 1H backbone chemical shifts, paramagnetic relaxation enhancements (PREs), residual dipolar couplings (RDCs), and small-angle X-ray scattering data. 
PREs provide clear evidence of transient long-range order in the IDP, with apparent contacts between regions exhibiting opposite charges (Figure 2).[155] Analysis of δ-RNAP in terms of representative ensembles results in close agreement with expected behavior of the averaged RDCs. Characteristic modulations of multiple RDCs were observed in each peptide unit (manifest as quenching of the RDCs measured between the points of contact), and these RDCs were only correctly predicted when the long-range contact identified from the PREs was included in the analysis.
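The dependence of RDCs on both local orientation and overall alignment can be made concrete with a short Python sketch. It evaluates the coupling as K times the average of P2 over sampled orientations, and, for a single conformer, as a contraction of a traceless alignment tensor with the bond direction. The tensors, the bond vector, and the choice K = 1 are arbitrary illustrative assumptions and do not represent δ-RNAP.

```python
import math

def p2(x):
    """Second-order Legendre polynomial, P2(x) = (3x^2 - 1)/2."""
    return (3.0 * x * x - 1.0) / 2.0

def rdc_from_angles(thetas, k_const=1.0):
    """K times the average of P2(cos theta) over sampled orientations."""
    return k_const * sum(p2(math.cos(t)) for t in thetas) / len(thetas)

def rdc_from_tensor(tensor, unit_vec, k_const=1.0):
    """K * sum_ij S_ij u_i u_j for one conformer with alignment tensor S."""
    return k_const * sum(tensor[i][j] * unit_vec[i] * unit_vec[j]
                         for i in range(3) for j in range(3))

# Isotropic sampling (uniform in cos theta) averages the coupling to ~0.
n = 1000
iso = rdc_from_angles([math.acos(-1.0 + (2 * i + 1) / n) for i in range(n)])

# Same local bond orientation, two conformers with different traceless tensors.
bond = (0.0, 0.0, 1.0)
s_a = [[-5e-4, 0.0, 0.0], [0.0, -5e-4, 0.0], [0.0, 0.0, 1e-3]]
s_b = [[1e-3, 0.0, 0.0], [0.0, -5e-4, 0.0], [0.0, 0.0, -5e-4]]
per_conformer = [rdc_from_tensor(s, bond) for s in (s_a, s_b)]
ensemble_rdc = sum(per_conformer) / len(per_conformer)
```

The two conformers share the same bond orientation yet give couplings of opposite sign, so the ensemble average depends on how alignment varies with overall shape, in line with the argument made in the text.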

Figure 2

Experimental comparison of conformational behavior of the intrinsically disordered δ subunit of bacterial RNA polymerase. (A) Experimental parameters measured on wild-type protein (green bars) compared to ensemble-averaged values calculated from 10 ensembles comprising 200-strong ASTEROIDS ensembles (red lines). From top to bottom: secondary chemical shifts, paramagnetic relaxation enhancements (labeled at residue 132), residual dipolar couplings, and SAXS. Bottom: comparison of distribution of radii of gyration from a statistical coil pool (black) and the ASTEROIDS ensemble (red). Structural models of five conformations are displayed below the plots (ordered domain in green, IDR in yellow with positively and negatively charged residues highlighted in blue and red, respectively). (B) Same parameters for the mutated protein in which the lysine-rich tract 96KAKKKKAKK104 is replaced by 96EAEEEEAEE104. This results in a clear abrogation of long-range contacts with the C-terminal half of the domain that collapse the protein. This collapse, and its abrogation, are visible not only in SAXS and PRE data but also in the residual dipolar coupling data. (Reproduced with permission from Kuban et al. 2019 Copyright 2019 ACS[156]).

Mutation of the cluster of basic amino acids to acidic residues abrogates the long-range contacts, resulting in extinction of the characteristic PRE- and SAXS-derived evidence of compaction in the wild type protein, revealing a highly extended IDR in the absence of the basic cluster, and a disappearance of the characteristic long-range RDC modulation.
The combined analysis thus results in an accurate, integrated description of the ensemble of states sampled by both wild-type and mutant protein in solution, providing insight into the impact of the electrostatic charge distribution on local and long-range conformational behavior.[156] Interestingly, the loss of long-range contacts induced by mutagenesis influences cell fitness and transcription efficiency in vitro. While the complete knockout of the delta subunit makes transcription too fast and insensitive to regulation by initiating nucleoside triphosphates, the mutation disrupting long-range contacts has the opposite effect: it inhibits transcription from promoters that form unstable complexes with RNA polymerase.

NMR Studies of IDP Dynamic Modes and Timescales

NMR Relaxation of IDPs and Models of Correlation Functions

As introduced earlier, NMR relaxation occurs due to angular fluctuations of relaxation-active interactions resulting in transitions and incoherent dephasing that relax the spin state back to equilibrium.[92,157,158] The angular reorientation of such interactions can be described in the time domain (correlation function C(τ)) or the frequency domain (the spectral density function J(ω)). Protein backbone dynamics are typically characterized in solution using longitudinal (R1) and transverse (R2) autocorrelated 15N relaxation rates, heteronuclear 1H–15N cross-relaxation (σ), and 15N longitudinal (ηz) and transverse (ηxy) cross-correlated dipole–dipole/CSA (chemical shift anisotropy) cross-relaxation.[92] The advantage of measuring different rates lies in their distinct dependence on different combinations of the angular spectral density function at the characteristic Larmor frequencies defined by the spin system, ωN, ωH, ωH ± ωN. If enough measurements are available, the spectral density functions can be mapped from the different relaxation rates[159,160] using reduced spectral density mapping[161−164] to estimate J(0), J(ωN) and an approximate mean value at high frequencies throughout the sequence. Alternatively, the correlation function of internal motional modes can be described analytically, in terms of geometric and temporal parameters (for example, n-site jumps or diffusion in a cone), although it can be difficult to differentiate between these models on the basis of NMR relaxation rates alone. A simple and popular alternative is to use the model-free approach, where mathematical contributions to the autocorrelation function are parametrized.
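Reduced spectral density mapping can be sketched as follows. This minimal Python example assumes the standard expressions for 15N R1, R2, and σ of an isolated 1H–15N spin pair, with the spectral density normalized with a 2/5 prefactor, and replaces J(ωH) and J(ωH ± ωN) by a single high-frequency value. Published variants refine the effective high frequency, so the coefficients here follow from this simplest approximation only. The N–H distance, the CSA value, and all function names are assumptions made for illustration.

```python
import math

MU0_OVER_4PI = 1.0e-7       # T m A^-1
HBAR = 1.054571817e-34      # J s
GAMMA_H = 2.6752e8          # rad s^-1 T^-1
GAMMA_N = -2.7126e7         # rad s^-1 T^-1
R_NH = 1.02e-10             # m, assumed effective N-H distance
CSA_N = -172e-6             # assumed 15N chemical shift anisotropy

def coupling_constants(b0_tesla):
    """Squared dipolar (d^2) and CSA (c^2) constants in s^-2."""
    d = MU0_OVER_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3
    c = GAMMA_N * b0_tesla * CSA_N / math.sqrt(3.0)
    return d * d, c * c

def rates_from_j(j0, jn, jh, b0_tesla):
    """Forward model with J(wH) and J(wH +/- wN) replaced by one value jh."""
    d2, c2 = coupling_constants(b0_tesla)
    r1 = 0.25 * d2 * (3.0 * jn + 7.0 * jh) + c2 * jn
    r2 = (0.125 * d2 * (4.0 * j0 + 3.0 * jn + 13.0 * jh)
          + c2 * (4.0 * j0 + 3.0 * jn) / 6.0)
    sigma = 0.25 * d2 * 5.0 * jh
    return r1, r2, sigma

def reduced_mapping(r1, r2, sigma, b0_tesla):
    """Invert the forward model: returns J(0), J(wN), and the high-frequency J."""
    d2, c2 = coupling_constants(b0_tesla)
    jh = 4.0 * sigma / (5.0 * d2)
    jn = (4.0 * r1 - 7.0 * d2 * jh) / (3.0 * d2 + 4.0 * c2)
    j0 = ((r2 - jn * (0.375 * d2 + 0.5 * c2) - 1.625 * d2 * jh)
          / (0.5 * d2 + 2.0 * c2 / 3.0))
    return j0, jn, jh

# Round trip with arbitrary spectral density values (in seconds) at 14.1 T.
j_in = (4e-10, 2e-10, 1e-11)
r1, r2, sigma = rates_from_j(j_in[0], j_in[1], j_in[2], 14.1)
j_out = reduced_mapping(r1, r2, sigma, 14.1)
```

The round trip only checks internal consistency of the algebra. With experimental data, any exchange contribution to R2 would inflate the apparent J(0), which is one reason the self-consistency of data across fields is examined.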
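The approach is simply understood in the case of internal modes in a folded protein,[165−169] where it is possible to express the angular correlation function as

$$C(t) = C_0(t)\, C_I(t)$$

where $C_0(t)$ is the correlation function for global motion, and a faster internal contribution, that is not associated with a specific motional mode, describes restricted motion relative to the molecular frame:

$$C_I(t) = \left\langle P_2\big(\hat{\mu}(0) \cdot \hat{\mu}(t)\big) \right\rangle$$

where μ̂ is a unit orientation vector of the relevant relaxation-active interaction (dipolar or CSA). If the internal correlation function $C_I(t)$ is approximated to a single exponential, the associated spectral density function can be described as

$$J(\omega) = \frac{2}{5}\left[ \frac{S^2 \tau_c}{1 + \omega^2 \tau_c^2} + \frac{(1 - S^2)\,\tau'}{1 + \omega^2 \tau'^2} \right]$$

where $\tau' = (\tau_c^{-1} + \tau_i^{-1})^{-1}$, $\tau_c$ describes the overall rotational diffusion, $\tau_i$ is the internal correlation time, and $S^2$ is the generalized order parameter. Extension[168] to two internal components with distinct correlation times ($\tau_f$ and $\tau_s$) and order parameters ($S_f^2$ and $S_s^2$) gives

$$J(\omega) = \frac{2}{5}\left[ \frac{S_f^2 S_s^2\, \tau_c}{1 + \omega^2 \tau_c^2} + \frac{(1 - S_f^2)\,\tau_f'}{1 + \omega^2 \tau_f'^2} + \frac{S_f^2 (1 - S_s^2)\,\tau_s'}{1 + \omega^2 \tau_s'^2} \right]$$

where $\tau_f' = (\tau_f^{-1} + \tau_c^{-1})^{-1}$ and $\tau_s' = (\tau_s^{-1} + \tau_c^{-1})^{-1}$. This formalism is commonly used to interpret relaxation measured in folded proteins, with the global contribution to the autocorrelation and spectral density functions assumed to be common for all sites.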
Although, due to their high flexibility, IDPs are not expected to exhibit a shared diffusion tensor for distinct regions in the chain, the same mathematical formalism can be used to model the spectral density functions of each site independently, assuming that the time scales of the component modes are sufficiently separated, and that all the motions are isotropic:

$$J(\omega) = \sum_k A_k J_k(\omega)$$

with $\sum_k A_k = 1$, and

$$J_k(\omega) = \frac{2}{5}\, \frac{\tau_k}{1 + \omega^2 \tau_k^2}$$

This formalism has been widely exploited for the interpretation of relaxation from partially denatured proteins and IDPs.[93,95,97,170−173] Alternatively, it is possible to describe the spectral density function in terms of an analytical distribution of motions, of which the model-free approach represents one of the simplest manifestations.[99,103,110,174] Here again, the complexity of the models makes differentiation difficult, although they have been successfully used to explain the dynamic behavior of synthetic homopolymers,[175] and surely provide a more physical representation of the complex dynamics of flexible proteins.[103] In highly dynamic molecules such as IDPs, large-amplitude motions occur in the range of nanoseconds,[93−114] rapidly quenching angular correlation and reducing the slowest sensitive time scales to the nanosecond range (at room temperature and in free solution).
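The model-free spectral densities discussed above are sums of Lorentzian terms and are straightforward to evaluate. The following Python sketch assumes the common 2/5 normalization (exposed as the `prefactor` argument, since conventions differ) and uses illustrative correlation times and amplitudes that are not fitted values from any study.

```python
import math

def lorentzian(omega, tau):
    """Single Lorentzian term tau / (1 + (omega * tau)^2)."""
    return tau / (1.0 + (omega * tau) ** 2)

def j_model_free(omega, s2, tau_c, tau_i, prefactor=0.4):
    """Global tumbling (tau_c) plus one internal motion (tau_i), order parameter s2."""
    tau_eff = 1.0 / (1.0 / tau_c + 1.0 / tau_i)
    return prefactor * (s2 * lorentzian(omega, tau_c)
                        + (1.0 - s2) * lorentzian(omega, tau_eff))

def j_multi_component(omega, amplitudes, taus, prefactor=0.4):
    """Site-specific sum of independent isotropic modes with amplitudes summing to 1."""
    if abs(sum(amplitudes) - 1.0) > 1e-9:
        raise ValueError("amplitudes must sum to 1")
    return prefactor * sum(a * lorentzian(omega, t)
                           for a, t in zip(amplitudes, taus))

# Hypothetical three-component description of one site in a disordered chain.
amplitudes = [0.2, 0.5, 0.3]
taus = [5e-9, 1e-9, 50e-12]              # seconds (slow, intermediate, fast)
omega_n = 2.0 * math.pi * 60.8e6         # 15N Larmor frequency at ~14.1 T, rad/s
j_zero = j_multi_component(0.0, amplitudes, taus)
j_n = j_multi_component(omega_n, amplitudes, taus)
```

Because each term decays differently with frequency, rates measured at several magnetic fields constrain the amplitudes and correlation times far better than data from a single field.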
Nevertheless, the existence of segmental motions was suggested from the bell-shaped dependence of transverse relaxation components (with respect to primary sequence, tailing off to low values at both termini), in chemically denatured and intrinsically disordered proteins,[104] relating to stiffness or side chain bulkiness,[96,176] and from 1H relaxometry.[177] IDRs connected to folded domains have been shown to induce slower components on the rotational diffusion properties of multidomain proteins indicating the importance of local viscosity and drag on dynamic time scales.[178−181] Faster time scales are expected to relate to more local dynamics, for example, of backbone dihedral angles, which may be important in terms of local folding or binding;[6,23,182−192] however, in general the physical origin of observed relaxation rates remains weakly characterized.

Recent Applications of Model-Free Approaches to IDPs

It is clear from the multi-component form of the spectral density function that amplitudes and time scales of the different components may be correlated and that the resulting parametrization will depend on the accurate estimation of the number of contributions. In the context of identifying the most appropriate model for the accurate interpretation of NMR relaxation from IDPs, a number of recent studies used extensive data sets to shed important light on the available information content. Rather than fixing the number of contributions and determining the most appropriate correlation times, Ferrage and co-workers[193] used an array of fixed correlation times (τ), distributed on a logarithmic scale, with variable amplitudes (A), that could also be zero, to analyze the multi-component spectral density function. The backbone dynamics of the partially disordered protein Engrailed 2 were analyzed using a large range of auto- and cross-correlated relaxation rates measured at five magnetic fields between 400 and 1000 MHz 1H frequencies. This provides a grid of motional amplitudes corresponding to six characteristic correlation times for the entire protein, clearly delineating the folded and unfolded domains, and revealing dominant time scales around 1 ns in the unfolded domain. Gill et al.[194] also studied the dynamics of a partly unfolded protein, the basic leucine-zipper region of GCN4. In this case, 15N R1, R2, and σ, measured at 600, 700, 800, and 900 MHz 1H frequency, were analyzed by rearranging the measured relaxation rates using a modified spectral density mapping, and comparing these results to a model-free analysis to determine how many independent contributions can be extracted from this analysis. The results demonstrate that the extended model-free approach accurately describes the experimental data as well as being statistically justified on the basis of the experimental uncertainty. The authors note that more than three contributions cannot be theoretically justified from these data.
A similar study of the dynamic behavior of the 126 amino acid C-terminal disordered domain of Sendai virus nucleoprotein (NT) examined 15N R1, R2, σ, ηz, and ηxy measured at four magnetic field strengths (600, 700, 850, and 950 MHz 1H frequency). In a first step, autocorrelated and cross-correlated rates were analyzed using reduced spectral density mapping at each magnetic field strength, confirming the self-consistency of the data, and the absence of exchange contributions to R2. The data were then analyzed using the multi-component model-free expression to determine the optimal number of contributions. Two procedures were undertaken, the first based on statistical testing, to determine the minimum number of contributions for each site. Models with 2 (τ1 and θ), 4 (τ1, τ2, A, and θ), 5 (A2, A3, τ2, τ3, and θ), or 6 (A2, A3, τ1, τ2, τ3, and θ) parameters, corresponding to 1, 2, or 3 contributions to the relaxation-active correlation function, were compared for all sites in the molecule. The 3-component model was found to be justified throughout the protein. Second, 10% of all data were removed from each data set, and their values predicted from the parameters determined from the remaining data sets, again demonstrating that 3 components are essential to correctly predict experimental values. This implies that sufficient relaxation data have been measured to justify the more complex model. Experimentally measured relaxation rates vary significantly throughout the length of IDPs, exhibiting apparent correlation with transient secondary structure/linear motifs and differential dynamic behavior depending on sequence composition. It is therefore interesting to investigate the physical origin of the three components. The ability to measure NMR relaxation rates in complex environments such as liquid–liquid phase separation[195−197] and in cellulo[198] also calls for a careful analysis of the possible physical mechanisms underlying these experimentally observed dynamic modes.
To this end, two approaches, described below, have recently shed more light on the information content of this site-specific variation of relaxation in IDPs, in particular concerning the relative importance of local backbone conformational sampling and long-range chain-like behavior. The first concerns the dependence of the different components on environmental parameters such as temperature and crowding, and the second combines novel MD-based approaches to the interpretation of relaxation in IDPs.

Developing a Unified Description of IDP Dynamics in Solution

Temperature-Dependent Relaxation Reveals Properties of Distinct Dynamic Modes

The study of NT, a disordered protein containing a short helical linear motif, was extended to measure R1, R2, σ, ηxy, and ηz at four magnetic field strengths (600, 700, 850, and 950 MHz) and over a large range of temperatures (268–298 K) (Figure A).[199] Up to 61 rates were measured for each amide group in the protein and interpreted using a simple Arrhenius relationship to couple the correlation times at the different temperatures (in analogy to the study of the temperature-dependent response of a microcrystalline protein by solid state NMR[200]):

$$\tau_k(T) = \tau_{k,\infty}\,\exp\!\left(\frac{E_{a,k}}{RT}\right)$$

Figure 3

Temperature-dependent 15N relaxation maps three modes of intrinsically disordered protein dynamics. (A) 15N auto- and cross-relaxation rates of NT measured at different magnetic field strengths (green, 600 MHz 1H frequency; blue, 700 MHz; red, 850 MHz; orange, 950 MHz) and at different temperatures (top: 298 K, second row 288 K, third row 278 K, bottom 274 K). (B–F) Analysis of all relaxation data in (A), using a three-component model-free approach, with characteristic correlation times related via an Arrhenius expression. (B) Slow (τ3) and intermediate (τ2) correlation times at 274 K (red), 278 K (orange), 288 K (green), and 298 K (blue). (C) Activation energies for slow (red) and intermediate (blue) time scales for each residue. (D–F) Amplitude of slow (D), intermediate (E), and fast (F) time scale contributions (Reproduced with permission from Abyzov et al. JACS 2016 Copyright 2016 ACS[199]).

The different temperature dependences of the three components are described by temperature coefficients, or activation energies, given by Ea,k (τk,∞ is the Arrhenius prefactor). Fitting to this function requires the determination of parameters defining the relative amplitude of the three components at each temperature and the effective temperature coefficients of the intermediate and slowest contributions (the fastest contribution, around 50 ps, shows insignificant temperature dependence). Again, cross-validation by removal of either 10% of all data, or data from each magnetic field, indicates that the analysis is satisfactorily overdetermined. It is worth pointing out that predictive cross-validation is not so common in the analysis of protein dynamics from NMR spin relaxation but, when applied, shows a reassuring level of confidence in the data analysis.[199] The simultaneous analysis of data from all five temperatures (Figure B–F) reveals fascinating insight into the origin of the three resolved components. 
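To make the Arrhenius coupling of correlation times concrete, the following minimal sketch computes a correlation time at several temperatures from a prefactor and an activation energy, and recovers the activation energy from two of those values. The numerical inputs (22 kJ/mol and 5 ns at 298 K) are illustrative choices within the ranges quoted in the text, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def arrhenius_tau(temp_k, tau_inf, ea):
    """Correlation time at temperature temp_k for prefactor tau_inf and Ea (J/mol)."""
    return tau_inf * math.exp(ea / (R_GAS * temp_k))


def activation_energy(tau_a, temp_a, tau_b, temp_b):
    """Activation energy (J/mol) from correlation times at two temperatures."""
    return R_GAS * math.log(tau_a / tau_b) / (1.0 / temp_a - 1.0 / temp_b)


# Illustrative values: Ea = 22 kJ/mol, prefactor chosen so that tau(298 K) = 5 ns.
EA = 22e3
TAU_INF = 5e-9 / math.exp(EA / (R_GAS * 298.0))

for temp in (268.0, 274.0, 278.0, 288.0, 298.0):
    print(f"{temp:.0f} K: tau = {arrhenius_tau(temp, TAU_INF, EA) * 1e9:.2f} ns")
```

In a global fit, one prefactor and one activation energy per component replace an independent correlation time at every temperature, which is what reduces the number of free parameters relative to the number of measured rates.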
The amplitude of the slowest component exhibits a bell-shaped distribution with respect to primary sequence, with a clear maximum in the helical region. The time scale parallels this distribution, reaching values up to 25 ns in the helical region at 268 K. Although this contribution is dominated by the slowest times experienced by the helix, the effective activation energy, or rate of change of τ with temperature, exhibits a smooth function along the sequence, reaching a maximum (20–25 kJ mol–1) in the center of the sequence. It was proposed that the slowest contribution reports on chain or segmental dynamics. The reason that slower motions are detected in the helix is that C(t) is not as efficiently quenched by the high-amplitude fast motions occurring in the remaining unfolded part of the chain. The residual order left after the more restricted fast motions occurring in the helix allows for the detection of slower motion that has little effect on correlation functions from the less-structured parts of the chain. 
This is further supported by the analysis of data measured using protein constructs engineered to comprise 50, 75, or 126 amino acids, revealing a clear dependence of the τ3 on the length of the peptide chain, as expected for chain dynamics considered using Rouse or Zimm models.[201−203] The intermediate motion has a much flatter distribution over the unfolded regions, and the apparent activation energies are in the range expected from studies of peptide backbone free energy landscapes.[204,205] In this case, there is a discontinuity in activation energy between the unfolded and helical regions, motivating the suggestion that these contributions report respectively on local fluctuations within Ramachandran wells and constrained internal dynamics or partial unfolding in the helix.[54,128,206,120] Although relaxation in IDPs is often thought to provide information essentially concerning subnanosecond motions, the analysis shown here clearly demonstrated that short, structured motifs in unfolded polymers are also dependent on slower, segmental or chain-like motions, or whatever other motion finally quenches the angular correlation function. Most regions are not sensitive to these motions because of the extent of the faster motions, but if one can locally quench these, a great deal of insight can be derived from the resulting relaxation rates. We note that while the contribution of the slowest motion increases at lower temperatures, as the fastest motion falls, the amplitude of the intermediate motion systematically passes through a maximum at 288 K. This may provide us with information about the shape of the actual distribution of correlation times and their impact on the sampled correlation function.

IDP Dynamics under Crowded Conditions Experienced In Cellulo

Although significant progress has thus been made over recent years in our understanding of the information provided by NMR relaxation studies of IDPs, it remained unclear how to interpret data measured in more complex, and more specifically in the more crowded, physiological environments in which they function.[208,209] This question is particularly relevant with respect to NMR in cellulo,[198,210−216] where IDPs function in environments with molecular concentrations reaching 400 g/L,[217−219] very likely strongly affecting the time scales of IDP dynamics.[220−222] The effect of local environment on IDP function is also relevant for understanding the mechanistic role of IDPs in membraneless organelles.[195−197,223−225] IDPs are subjected to extreme solvent accessibility compared to folded proteins, suggesting that the physiological environment in complex multicomponent environments will very likely strongly influence dynamic modes and time scales. Single molecule fluorescence techniques have provided unique insight into the importance of so-called internal and solvent friction on IDP dynamics and partially folded or destabilized protein states as well as on the kinetics of protein folding.[66,226,227] These approaches have been used to investigate the dynamics of IDPs[228,229] and protein function[230] in the cellular environment. 
Similarly, NMR spectroscopy has been used to investigate modulation of the folding/unfolding equilibrium of globular proteins in cellulo, indicating changes in both population and exchange rates as a function of the cellular milieu, and a dependence on weak, so-called quinary[231] interactions between the protein of interest and diverse other molecules constituting the intracellular matrix.[232−234] NMR was also used to describe the impact of the cellular milieu on protein dynamics, from small globular proteins to IDPs.[210,211,215,235−238] In a detailed study, Theillet and co-workers compared the influence of different viscogens on the dynamics of α-synuclein, with 15N relaxation measurements made in mammalian cells, revealing changes in dynamics of the termini of the protein, presumably associated with crowding-induced compaction or inter- and intramolecular interactions. The extent of changes appeared to be more pronounced in cellulo, suggesting additional impact of intermolecular interactions on the relative deceleration of the NH-backbone fluctuations.[198] In the context of these examples, and the growing body of experimental data,[239−244] a physical framework that incorporates the effects of molecular crowding on the dynamics of the protein would provide a welcome tool allowing quantitative interpretation of NMR relaxation measured under physiological conditions. Recent work further addressed this challenge by measuring dynamics of IDPs as a function of environmental complexity. An extensive set of multifield NMR relaxation rates were measured over a broad range of conditions, using inert crowding agents to systematically modify viscosity, as well as temperature (Figure ).[207] This calibration allowed the dynamics of two IDPs to be mapped as a function of environmental conditions, including both viscosity and temperature. The two IDPs exhibit distinct physical properties, comprising both partially folded and highly flexible elements. 
Local, or nanoviscosity was gauged by measuring 1H longitudinal relaxation of water,[157] which, at the high magnetic fields used here, is expected to be dominated by rotational diffusion of the water molecules.[245−247] The overall dependences of the nanoviscosity of the solvent and solute on the concentration of viscogen show similar features, with the intermediate and slow correlation times of the backbone of the protein, and the 1H R1 both deviating from the linear regime in the range of 200 mg/mL (Figure ). Nevertheless, the two motional modes of the protein backbone exhibit very different responses, with friction coefficients that are much steeper (approximately a factor of 3) for the slower motions. As noted from fluorescence-based studies, viscosity probes of different dimensions are expected to measure different effective viscosities,[248−252] so that friction coefficients would be expected to be characterized by distinct length scales and to decrease for smaller probes.[253,254] This suggests, perhaps not surprisingly, that intermediate and slow dynamic modes are associated with fragments of different dimensions, for example, respectively, single and multiple peptide units. The ratio of friction coefficients corresponding to intermediate and slow motions was reproduced for both experimental systems (over 200 amino acids), suggesting that the observation may be general. The observed differences in effective friction coefficients may be related to observations made by Schuler and co-workers that translational diffusion slows down considerably more than rotational diffusion of the IDP prothymosin α inside crowded cells, suggesting very different length scales and susceptibilities to crowding.[229]

Figure 4

Viscosity-dependent 15N relaxation maps distinct response of local and longer-range dynamics in intrinsically disordered proteins. (A) Transverse (R2) and longitudinal (R1) relaxation, transverse cross-correlated DD/CSA (η) and heteronuclear {1H}-15N nuclear Overhauser enhancement (NOE) recorded at 600, 700, and 850 MHz as a function of concentration of Dextran 40. (B) Longitudinal water relaxation (solid red line, normalized to the value in free solution; ρ0) shows a similar dependence on concentration of viscogen to the intermediate time scale motion (green points). The slow motional component (purple) resembles approximately 3ρ0 (dotted line). (C) Friction coefficients (ε) for intermediate backbone (blue) and slower, segmental (red) motions. (D) Cartoon representation of the length scales of intermediate and slower motions (Reproduced with permission from Adamski et al. JACS 2019[207] Copyright 2019 ACS).

On the basis of these observations, it was possible to develop, and test, a single expression describing the dynamic modes of IDPs in complex mixtures, their characteristic time scales, and their temperature and viscosity coefficients, using a minimal set of physical parameters to relate both the intermediate and slow time scales (τ) to the nanoviscosity of the solvent, expressed through ρ(C) = (η – η0)/η0 = (R1,C – R1,0)/R1,0. Here R1,0 and η0 are the longitudinal relaxation rate of water and the viscosity in the absence of viscogen, and R1,C and η are the corresponding values at viscogen concentration C. In this expression, τ′,∞ is a prefactor representing the correlation time at infinite dilution and temperature, and ε is the residue-specific friction coefficient relative to η of intermediate or slow motions. The model turns out to be robust and remarkably transferable in vitro. 
For example, once sequence-specific friction coefficients have been determined as a function of concentration for a particular protein, highly sensitive dynamic probes such as a complete set of 15N relaxation rates measured in very different crowding conditions are predicted with very high accuracy, simply on the basis of the measurement of the water R1 (Figure A).
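The sketch below shows how such a prediction could be organized in practice. It assumes one plausible functional form that is consistent with the definitions given above, namely an Arrhenius term multiplied by a linear dependence on ρ(C), so that the correlation time reduces to the prefactor at infinite dilution and infinite temperature. This form, the function names, and all numerical values are assumptions made for illustration and should be checked against the original publication.[207]

```python
import math

R_GAS = 8.314462618  # gas constant, J mol^-1 K^-1


def rho_from_water_r1(r1_c, r1_0):
    """Relative nanoviscosity from water 1H R1 with (r1_c) and without (r1_0) viscogen."""
    return (r1_c - r1_0) / r1_0


def tau_crowded(tau_inf, ea, eps, rho, temp_k):
    """Assumed form: tau = tau_inf * (1 + eps * rho) * exp(Ea / (R * T))."""
    return tau_inf * (1.0 + eps * rho) * math.exp(ea / (R_GAS * temp_k))


# Hypothetical inputs: water R1 values (s^-1) and per-residue parameters.
rho = rho_from_water_r1(r1_c=0.60, r1_0=0.40)
tau_intermediate = tau_crowded(tau_inf=1.0e-13, ea=20e3, eps=1.0, rho=rho, temp_k=298.0)
tau_slow = tau_crowded(tau_inf=7.0e-13, ea=22e3, eps=3.0, rho=rho, temp_k=298.0)
print(rho, tau_intermediate, tau_slow)
```

With friction coefficients fixed from a calibration series, the only sample-specific input in this scheme is the water R1, which is what makes the approach attractive for crowded or cellular samples.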

Figure 5

Residue-specific friction coefficients are transferable between different in vitro crowding environments and even predict values measured in cellulo. (A) Experimental 15N relaxation rates recorded on Sendai virus NT in the presence of 135 g/L PEG (gray bars) compared to values calculated using sequence-specific friction coefficients (eq ) (red lines) determined as a function of Dextran concentrations and water relaxation in the sample of interest. For comparison, relaxation rates predicted under dilute conditions are shown in blue. (B) Relaxation rates measured at 600 MHz 1H frequency at a concentration of 90 g/L PEG (colors as in (A)). (C) 15N relaxation rates recorded in-cell (red points) compared to values calculated on the basis of dynamic parameters determined in vitro (green bars and line). Orange bars and lines show rates predicted for dilute conditions. Experimentally determined friction coefficients and the experimental measurement of the water R1 in cellulo were used in the prediction. (Reproduced with permission from Adamski et al. JACS 2019[207] Copyright 2019 ACS).

Perhaps most remarkably, the expression reproduces experimental relaxation measured in cellulo in Xenopus oocytes, on the basis of viscosity coefficients measured in vitro and nanoviscosity measured in the cell (Figure B). This unified description offers new insight into the nature of IDPs, and extends our ability to quantitatively investigate their conformational dynamics in complex environments. 
Such a successful application of experimental methodology from in vitro viscogen to in cellulo observation may appear surprising in view of the complexity of the cellular environment[255] and the evident inability of synthetic polymers to reproduce this complexity.[256] This study suggests that such concerns do not prevent the accurate prediction of average reorientational properties of IDPs in cells and indicates that the averaging of observable signals from IDPs and water remains closely coupled even in the multicompartmental environment of the cell.

Interpreting NMR Relaxation in IDPs Using MD Simulation

Accounting for Ensemble Conformational Sampling to Interpret Relaxation from IDPs

Although MD simulation provides unique insight into the conformational dynamics of IDPs,[42,118,122,123] force-fields that accurately describe the behavior of folded proteins often fail to reproduce ensemble averaged properties of IDPs in solution, probably due to the importance of protein–solvent interactions. This in turn has motivated the conception of force fields that have been specifically designed for IDPs.[90,120,124,127−130] Spin relaxation remains the most powerful NMR observable to characterize dynamic time scales at a sequence specific level, and reproduction of experimental values is often the most challenging for MD simulation. As described earlier, assuming conformational exchange that is fast on the chemical shift (and relaxation rate) time scale, experimentally observed rates derive from a population-weighted average over individual relaxation occurring within the different states sampled up to the micro- to millisecond range, such that $\langle R \rangle = \sum_k p_k R_k$ ($p_k$ and $R_k$ are the population and the relaxation rate of state $k$). The problem of reproducing experimental relaxation rates from IDPs using MD simulation is illustrated in Figure , where the 18 rates from Sendai virus NT are compared to those derived from several microseconds of fully solvated trajectories, using (in 2016) state-of-the-art, IDP-adapted force fields.[90,258]

Figure 6

NMR relaxation allows for the identification of ensembles of time-dependent trajectories that represent fast motions in interconverting substates. (A) Experimental 15N relaxation rates recorded on Sendai virus NT at 298 K in dilute conditions (gray bars) compared to values calculated from 4 μs of MD simulation (blue line). The red line shows values calculated from the ABSURD procedure targeting only transverse relaxation measured at 850 MHz (orange box). (B) The ABSURD procedure results in average time-dependent correlation functions that can be decomposed into local and segmental motions of the peptide chain. (Reproduced with permission from Salvi et al. JPCL 2016[125] Copyright 2016 ACS and Salvi et al. Angewandte Chemie 2017[257] Copyright Wiley 2017).

Analysis of these trajectories indicates that the discrepancy originates from the over-representation of rare events, such as long-range contacts, whose frequency is poorly sampled, leading to statistical instability because the sampled correlation time does not fulfill the necessary criterion τ ≪ tmax,[259] where tmax is the maximal sampled time of the angular correlation function. To address this problem, the following procedure was adopted: The entire trajectory, or multiple distinct trajectories nucleated from different conformations, are divided into subtrajectories of 100 ns, from which correlation functions C(τ) (and rates R) are calculated and combined in an ensemble average that explicitly mimics the actual heterogeneous conformational origin of the measured relaxation. The maximum length of each subtrajectory is dictated by the experimental analysis described above for the studies of two IDPs, NT and MKK4. 
At T = 298 K, the slowest contribution to the rotational correlation function detected by experimental spin relaxation (see above) is approximately 5 ns, so that the dynamic reorientations occurring in each distinct substate can be reasonably sampled using a sampling window of 100 ns (tmax = 50 ns). The ABSURD (average block selection using relaxation data) approach then estimates the relative weights of segments of C(τ) with respect to a single experimental relaxation rate, compiling an ensemble of subtrajectories that interchange on time scales significantly slower than the correlation time limit (100 ns) and significantly faster than the chemical shift time scale (100s of μs).[125] In this way, a representative ensemble of time-dependent trajectories is identified, thereby extending the concept of conformationally averaged ensemble-descriptions into the time dimension. Optimization against a unique relaxation rate at a single field identifies an ensemble of trajectories that systematically improves agreement with a broad set of rates, sensitive to motions occurring on a range of time scales (R1, R2, σ, η measured at multiple fields) (Figure ), as well as local (13C chemical shift) and global (SAXS) conformational sampling properties. The fact that the ensemble of trajectories improves reproduction of “passive” dynamic reporters highlights the importance of correctly sampling the free energy landscape of the IDP in solution, and illustrates the complex interdependence of motions occurring on time scales varying over many orders of magnitude. While it has previously been shown that simulating motions occurring in distinct substates improves reproduction of relaxation in folded proteins,[119,260] it is challenging to make this observation for IDPs.[120]
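The block-averaging step of this procedure can be sketched as follows. The example splits a trajectory of N-H bond vectors into blocks, computes the second-order Legendre angular correlation function of each block up to half the block length, and combines the blocks with a set of weights. Only the averaging is shown: the selection of weights against an experimental rate, which is the core of ABSURD, is not implemented here, and the function names and array layout are assumptions made for this example.

```python
import numpy as np


def block_correlation(vectors, max_lag):
    """P2 autocorrelation C(tau) of bond vectors (n_frames, 3) for lags 0..max_lag-1."""
    u = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    n_frames = len(u)
    corr = np.empty(max_lag)
    for lag in range(max_lag):
        # Cosine of the angle between the vector at time t and at time t + lag.
        cos = np.sum(u[: n_frames - lag] * u[lag:], axis=1)
        corr[lag] = np.mean(1.5 * cos ** 2 - 0.5)
    return corr


def ensemble_correlation(trajectory, block_len, weights=None):
    """Weighted average of per-block correlation functions, each sampled to block_len // 2."""
    n_blocks = len(trajectory) // block_len
    corr = np.array([
        block_correlation(trajectory[i * block_len:(i + 1) * block_len], block_len // 2)
        for i in range(n_blocks)
    ])
    if weights is None:
        w = np.full(n_blocks, 1.0 / n_blocks)  # uniform weights as a starting point
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    return w @ corr


# Synthetic stand-in for an MD trajectory of one N-H vector (400 frames, 4 blocks).
rng = np.random.default_rng(0)
demo = rng.normal(size=(400, 3))
print(ensemble_correlation(demo, block_len=100)[:5])
```

Restricting each correlation function to half the block length mirrors the choice of tmax = 50 ns for 100 ns subtrajectories described above.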

Analytical Description of the Dynamics of IDPs Sampled by NMR Relaxation

The ability to simulate the ensemble averaged angular correlation functions is of course only half of the challenge. In principle this function describes all of the molecular mechanisms that are relaxation-active, but in practice it is not straightforward to extract motional modes from this complex function. To address this problem, the correlation function was recently analytically decomposed into three components using internal coordinates to describe librational and reorientational dihedral angle modes relative to the average peptide plane, and tumbling of each peptide relative to the laboratory frame.[257] This deconvolution of the angular components allowed the identification of locally correlated and segmental motions along the chain. The advantage of such an approach was exemplified in a comparison of temperature dependent 15N relaxation measured on Sendai virus NT, and compared to relaxation calculated from average correlation functions derived using different force fields.[261] This allowed the identification of the best force field over a range of temperatures (Figure ) but also the exact dynamic mode that was responsible for the incorrect reproduction of experimental data (in this case the reorientation of water molecules and their correlation with intrasegmental backbone motions). In this way, the combination of ABSURD and the analytical description of the correlation functions can be seen as a forensic tool to improve molecular dynamics force fields with respect to experimental data.

Figure 7

Temperature-dependent NMR relaxation identifies accurate and transferable molecular force fields for IDPs. Experimental 15N{1H} steady-state NOEs (gray bars) measured on Sendai virus NT at different magnetic fields (left 600 MHz, middle 700 MHz, and right 850 MHz) and temperatures. ABSURD-selected ensembles of trajectories using Charmm36m combined with the TIP4P/2005 water model (red) reproduce experimental values better than when combined with TIP3P (blue), at all temperatures. (Reproduced with permission from Salvi et al. Sci. Adv. 2019[261] Copyright 2019 AAAS).

How Do IDPs Function? Time-Resolved Atomic Resolution Descriptions of IDP Complexes

The detailed study of IDP-binding to receptors and cofactors has revealed that IDP-based affinities range from tight subnanomolar binding of highly specific chaperone complexes to multivalent interactions with individual dissociation constants in the millimolar range.[262−267] NMR spectroscopy has the immense benefit of providing residue- or even atomic-resolution detail of the interaction trajectories of IDPs, even in the weak binding regime, and it is in this range of affinities that it most often provides unique functional insight. Depending on the exchange regime between free and bound protein, NMR chemical shifts either report on the population-weighted average of the free and bound forms of the protein (fast exchange, where the exchange occurs at a rate faster than the difference in chemical shifts ΔΔω between the two states) or are resolved separately for the two forms (slow exchange), which in principle allows for simultaneous detection of both environments. The former regime has been elegantly exploited by Brüschweiler et al. to investigate the binding modes of different amino acids present in disordered proteins by measuring the impact of aqueous colloidal dispersions of anionic silica nanoparticles on the transverse relaxation rates of IDPs.[268,269] Electrostatic and hydrophobic interactions are thought to dominate these weak interactions, and these are shown to differ largely between amino acid types. The authors show that these interactions can be parametrized and the binding profile of a given IDP can be accurately predicted using a simple mathematical model. This method also has the considerable advantage that transverse relaxation rates are impacted by motions occurring on time scales that are normally difficult to access by solution state NMR, also providing insight into the intrinsic dynamics of folded proteins.[270] Beyond the fast exchange limit, intermediate exchange, occurring on time scales that are comparable to ΔΔω, leads to line-broadening of the observable peaks (Figure ). 
This latter regime can be particularly informative because NMR exchange spectroscopy can be used to unravel the molecular mechanisms responsible for the observed broadening, even at very low population of bound state, simultaneously providing information both about the exchange kinetics and the free energy surface of the exchanging environments. Rotating frame relaxation (R1ρ),[134,135] Carr–Purcell–Meiboom–Gill (CPMG) relaxation dispersion,[131,132,136] chemical exchange saturation transfer (CEST),[133,271,272] and zz-exchange[273,274] provide information about exchange processes from the tens of microseconds to the subsecond range.
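The distinction between exchange regimes can be illustrated with a few lines of arithmetic for a two-site system. The sketch below computes the population-weighted shift observed in fast exchange, the textbook fast-exchange contribution to transverse relaxation, and a coarse regime label based on the ratio of the exchange rate to the shift difference. The factor of 10 used to separate the regimes is an arbitrary threshold chosen for this example, and the function names and numerical inputs are hypothetical.

```python
import math


def fast_exchange_shift(p_bound, shift_free, shift_bound):
    """Population-weighted chemical shift observed in the fast-exchange limit."""
    return (1.0 - p_bound) * shift_free + p_bound * shift_bound


def fast_exchange_rex(p_bound, delta_omega, k_ex):
    """Two-site fast-exchange contribution to R2 (s^-1); delta_omega in rad/s."""
    return p_bound * (1.0 - p_bound) * delta_omega ** 2 / k_ex


def exchange_regime(k_ex, delta_omega, margin=10.0):
    """Coarse label from k_ex / |delta_omega|; the margin is an assumed threshold."""
    ratio = k_ex / abs(delta_omega)
    if ratio > margin:
        return "fast"
    if ratio < 1.0 / margin:
        return "slow"
    return "intermediate"


# Hypothetical case: 5% bound, 100 Hz shift difference, k_ex = 20000 s^-1.
d_omega = 2.0 * math.pi * 100.0
print(exchange_regime(20000.0, d_omega))
print(fast_exchange_shift(0.05, 8.00, 8.40))
print(fast_exchange_rex(0.05, d_omega, 20000.0))
```

Outside the fast-exchange limit these closed-form expressions no longer hold, which is where relaxation dispersion and CEST analyses based on the full exchange equations become necessary.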

Describing the Interaction Trajectories of IDPs with Their Partner Proteins

The power of CPMG relaxation dispersion to describe complex interaction trajectories of IDPs was demonstrated by Sugase et al.,[182] who studied the interaction between the KIX domain of CREB binding protein and the phosphorylated form of kinase inducible activation domain (pKID). 15N CPMG measurements in the presence of substoichiometric admixtures of KIX provided evidence for weak binding between pKID and KIX, and allowed the authors to propose a model of the binding trajectory according to a three-site exchange model, describing binding via a partially folded encounter complex. This approach has been further exploited, using a combination of 1H, 13C, and 15N CPMG, to map the interaction trajectory of Sendai NT upon binding to the C-terminal domain of the phosphoprotein (PX).[191] While 1H and 15N amide chemical shifts are commonly used as probes to map interaction interfaces, 13C backbone chemical shifts are more sensitive to secondary structure. 1H, 13C, and 15N CPMG experiments, measured at substoichiometric admixtures of PX, were used to map the conformational transitions along the interaction trajectory of the partially formed helical motif (Figure ). This motif had previously been characterized on the basis of RDCs and chemical shifts as a rapidly exchanging ensemble of distinct helical elements.[275] The initial step of the interaction involves the stabilization of one of the helical elements present in the free-state equilibrium in an encounter complex on the surface of PX. This step is mainly characterized by 13C′ differences between the free state and the encounter complex. The second and final step, as reported mainly by 1H and 15N shifts, involves binding of the stabilized NT helix into a groove between two helices on the surface of PX. The combination of multinuclear CPMG, measurements on both partners, and multiple admixtures thus provides the necessary information to reconstruct a complex interaction trajectory involving both folding and binding. 
This study also highlights the importance of the intrinsic conformational dynamics of the binding partners that is already present in their free states. The conformational equilibrium of free NT comprises a pre-existing population of the state that is stabilized in the encounter complex, while the second binding step appears to be limited by breathing motions that open and close the binding pocket on PX in its free form.[108] This example also demonstrates that simple models of intermolecular interaction such as “induced-fit” or “conformational selection” are not necessarily applicable to interactions involving highly dynamic proteins such as IDPs, where a broader terminology, for example, conformational funneling, would be necessary to describe such multistate interaction trajectories.[192]

Figure 8

Multinuclear CPMG relaxation dispersion maps the molecular recognition trajectory of an intrinsically disordered protein as it binds its physiological partner. (A) 1H, 13C, and 15N CPMG were used to map the interaction trajectory of Sendai virus NT with the C-terminal domain of the phosphoprotein (PX). The combination of multinuclear CPMG, measured at multiple substoichiometric admixtures (2, 3.5, 5, and 8% of PX compared to NT) provides the necessary information to reconstruct a complex interaction trajectory involving both folding and binding. (B) The first step involves funnelling of one of the helical elements present in the equilibrium of rapidly exchanging substates, in an encounter complex on the surface of PX. (C) The second step involves binding of the stabilized helix into a groove between two helices on the surface of PX. (D) Relaxation dispersion measured on NT confirms that the second step coincides with events occurring on the surface of NT. (E) Representation of the most likely interaction trajectory derived from the ensemble of the experimental data. (Reproduced with permission from Schneider et al. JACS 2015[191] Copyright 2015 American Chemical Society).

The crowded environment of living cells can clearly influence interactions involving IDPs,[255,276,277] impacting association and dissociation rates, via nonspecific interactions or modulation of the structural and dynamic behavior of the proteins described above. 
Although fluorescence[278] and simulation have provided useful insight, for example, into the potential impact of attractive and repulsive interactions with the cellular milieu on coupled folding and binding,[279] atomic or residue-specific experimental characterizations of IDP-mediated interactions in vivo remain relatively rare.[198,280−282] To achieve a deeper understanding of the effects of crowding on the thermodynamics and kinetics of reactions involving IDPs and their partners, a more detailed, residue-specific picture is required, for example, using relaxation and exchange measurements in crowded environments and living cells. Kay and co-workers have already performed 15N R1ρ relaxation dispersion experiments in a highly concentrated phase-separated state (which can be regarded as a particular form of crowding) of the germ granule protein Ddx4, discovering a slowly exchanging excited state with increased intermolecular contacts.[283]

On the Importance of Multivalent, Weak Interactions in Biology

It is becoming increasingly clear that not all IDPs fold upon binding to their partners, even locally. The nuclear pore is filled with proteins (FG-nucleoporins) comprising extremely long IDRs, that are decorated with phenylalanine-glycine (FG) motifs, that control transition between the cytoplasm and the nucleoplasm. Larger proteins can only pass the filter when bound to nuclear transport receptors (NTRs). Despite the high selectivity of the filter, transport across the pore is extremely fast. The crucial interaction between NTRs and FG motifs was recently investigated using NMR, revealing weak chemical shift perturbations in the nucleoporin Nup153 in the presence of a series of NTRs.[68] In this case, 15N R1ρ and chemical shift titration confirmed that the interaction was in fast exchange, allowing an estimate of the intrinsic individual dissociation constant of a single site of around 8 mM. The presence of multiple motifs in a single protein clearly illustrated the effect of multivalency on the apparent affinity, which decreased with increasing multivalency. Finally, assignment of both free and bound forms of Nup153 demonstrated a complete absence of backbone conformational transition upon binding, with the disordered domain maintaining a high level of plasticity in the complex. On the basis of these results, a model was proposed of rapid passage, assured by the quasi continuum of NTR-binding sites present throughout the pore, and the fast on and off rates that are maintained by multivalent ultraweak binding throughout this continuum. Related results were also found for other nucleoporins,[284,285] suggesting that the mechanism may be general. 
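A simple binding isotherm helps to put the millimolar per-site dissociation constant into perspective. The sketch below computes the occupancy of a single site with Kd = 8 mM, and the probability that at least one of n sites is occupied under the simplifying assumption that the sites bind independently. The independence assumption, the number of sites, and the receptor concentration are illustrative choices and are not taken from the study discussed above.

```python
def fraction_bound(ligand_free, kd):
    """Occupancy of a single site for free ligand concentration and Kd (same units)."""
    return ligand_free / (kd + ligand_free)


def any_site_bound(ligand_free, kd, n_sites):
    """Probability that at least one of n independent, identical sites is occupied."""
    return 1.0 - (1.0 - fraction_bound(ligand_free, kd)) ** n_sites


KD_SINGLE = 8e-3   # per-site dissociation constant quoted in the text, in M
NTR_FREE = 1e-4    # hypothetical free receptor concentration, in M

print(fraction_bound(NTR_FREE, KD_SINGLE))
print(any_site_bound(NTR_FREE, KD_SINGLE, n_sites=20))
```

Even when each individual site is rarely occupied, a chain carrying many such sites is engaged a substantial fraction of the time, while every individual contact can still dissociate rapidly.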
Another example of the physiological importance of ultraweak binding is provided by the study of the chaperone complex between the partially disordered nucleoprotein (N) and the intrinsically disordered phosphoprotein of Measles virus (MeV).[286] Paramyxoviral phosphoproteins (P) are essential cofactors of the replication complex: they are tetrameric and all comprise long IDRs that are hundreds of amino acids in length and whose function remains largely unknown.[287] N has a folded domain that encapsidates the viral genome, protecting it from the host immune system, and a disordered C-terminal domain. ASTEROIDS analysis of the 304 amino acid IDR of P from MeV identifies short helical elements in the N-terminal domain, and an additional fourth helix 150 amino acids downstream of this (α4), adjacent to a highly acidic strand. The N-terminal helices bind tightly to N, maintaining it in its monomeric form prior to encapsidation of the RNA genome. The 90 kDa NP complex, which includes over 450 intrinsically disordered residues, was investigated using NMR, identifying the known N-terminal chaperone binding site, but also a second, previously unknown binding site positioned at the fourth helical element, α4 (Figure 9). 15N CPMG using a molecular construct comprising only this site revealed that the interaction has an intrinsic affinity that is around 5 orders of magnitude weaker than the main interaction site, allowing P to transiently wrap around N, and to exchange between compact and extended forms. Remarkably, the conserved interaction motif is shown to be essential for viral replication. Although the exact role of the second binding site remains unknown, it is possible that conformational fluctuations of the acidic loop between the binding sites on P frustrate access to the surface of N, for example, by cellular RNA, or inhibit self-assembly with other N monomers.
More generally, the combination of two distant interactions involving the same IDR suggests the existence of long-range coupling between the two interaction sites linking opposite ends of N that is regulated by the highly disordered nature of P. This example again highlights the extreme sensitivity of NMR to detect ultraweak interactions, even in the presence of very strong affinity interactions between the same partners.
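To put the five orders of magnitude separating the two binding sites into thermodynamic terms, the following minimal sketch converts a ratio of dissociation constants into a difference in standard binding free energy, using ΔΔG = RT ln(KD,weak/KD,tight). The temperature of 298.15 K and the function name are assumptions made for illustration and are not taken from the study.

```python
import math

R_GAS = 8.314462618  # J mol^-1 K^-1, molar gas constant

def ddg_from_kd_ratio(kd_ratio, temperature_k=298.15):
    """Binding free-energy difference (kJ/mol) for a given KD ratio.

    kd_ratio = KD(weak site) / KD(tight site); the default temperature
    is an assumed value, not one reported in the text.
    """
    return R_GAS * temperature_k * math.log(kd_ratio) / 1000.0

# A ratio of 1e5 corresponds to the "5 orders of magnitude" quoted in the text
ddg = ddg_from_kd_ratio(1e5)
print(f"ddG = {ddg:.1f} kJ/mol")
```

Under this assumption the second site is weaker than the main site by roughly 28.5 kJ/mol, which gives a sense of the dynamic range over which interactions between the same two partners remain detectable by NMR.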

Figure 9

NMR detects essential, ultraweak interactions in the dynamic assembly of the Measles virus nucleo/phosphoprotein complex. (A) 15N–1H HSQC spectrum of the complex formed between PTAIL and the nucleoprotein. The complex comprises more than 450 intrinsically disordered amino acids. (B) Representation of the two interaction sites involved in the complex. The phosphoprotein of Measles virus (yellow) is known to bind the nucleoprotein (gray) in a tight complex at its N-terminal end. NMR reveals a second binding site (δα4) that is 150 amino acids away from the first binding site, in the middle of a long intrinsically disordered domain, and that binds a distal site of the nucleoprotein. NMR exchange measurements, (C) 15N CPMG and (D) rotating frame relaxation, in the free and bound forms of the region 140–304 of PTAIL reveal that the intrinsic affinity of this second site is 5 orders of magnitude lower than that of the known binding site. (E) Normalized peak intensities (I/I0) of P1–304 (50 μM) with P1–50N1–525 (gray, 25; red, 50; green, 100; and blue, 150 μM concentrations of P1–304). (F) Interaction profile of P1–304,HELL → AAAA mutation (concentrations as in E). Mutation of these four residues in the binding site knocks out the second interaction and replication. (Reproduced with permission from Milles et al. Sci. Adv. 2018[286] Copyright 2018 AAAS).

Atomic Resolution Descriptions of Highly Dynamic Molecular Assemblies from NMR

Disordered domains are thought to play a role in the replication of numerous single-stranded RNA viruses, with components of the replication machinery from both negative[287,288] and positive sense[289−293] RNA viruses exhibiting extensive disorder. A recent description of the nucleoprotein of SARS-CoV-2, involved in protection of the viral genome and regulation of gene transcription, revealed that the flexible central region undergoes a disorder-to-order transition, folding around the N-terminal domain of its viral partner nsp3 and inducing a collapse of the remainder of the protein that impacts its ability to bind RNA.[294] Influenza A represents another example where extreme disorder appears to play an essential role in viral function. To efficiently replicate in human cells, avian influenza polymerase undergoes host adaptation, with adaptive mutants (in particular E627K) localized on two C-terminal (627 and NLS) domains of the PB2 polymerase subunit. This region of the protein shows remarkable behavior in solution, populating an equilibrium between open and closed conformations that can be characterized using 15N CEST experiments, which reveal open-form chemical shifts that are essentially identical to those of the isolated domains in free solution and determine the exchange rate to be around 20 s⁻¹.[295] The closed form is stabilized by an interdomain salt bridge,[296] while in the open form the linker connecting the two domains becomes highly dynamic and the two domains evolve freely. The host transcription factor ANP32a was identified as an essential cofactor for the adaptation of the viral polymerase,[297] suggesting a direct interaction between the two proteins. ANP32a has a highly acidic, intrinsically disordered domain whose length varies between species, with the avian form containing a 33 amino acid insert, comprising a unique hydrophobic hexapeptide and a repeat of the first 27 acidic amino acids.
Somehow the absence of this insert in mammals is compensated by a single E627K mutation of the avian polymerase, allowing cross-species infection. It was therefore important to investigate the complexes between these two highly flexible proteins. Here again, the IDR mediates the interaction, with a polyvalent interaction between the acidic tail of ANP32a and the positively charged surface of the 627 domain.[298] The intrinsic KD measured from the side of ANP32a is more than 1 order of magnitude lower than the KD measured from the side of 627 due to the multiple interaction sites on ANP32a dispersed along the IDR visiting the same sites on 627-NLS. To characterize the dynamic ensembles, a series of eight cysteine mutants of the avian and human adapted forms of 627-NLS were made, and PREs were measured on ANP32a. In the fast exchange regime, these data provide a sensitive map of the population-weighted proximity of the two proteins over the dynamic assembly and were used to develop an ensemble description of the human and avian complexes using the ASTEROIDS ensemble approach. This comparison identifies clear distinctions between the binding modes exploited in the two complexes (Figure 10), as shown quantitatively in the average distance map, where closer or more populated contacts are observed between the positively charged 627 domain and the acidic IDR for the human complex than for the avian complex where the electrostatic distribution on the surface of 627 is disrupted by the E627K mutation. This study allows us to speculate further on the role of the interaction in the function of the replication complex and more generally demonstrates the ability of NMR to characterize intermolecular complexes exhibiting extreme levels of flexibility and multivalency.
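The statement that PREs report on population-weighted proximity in fast exchange can be made concrete with a minimal sketch. The PRE scales with the inverse sixth power of the electron-nucleus distance, so under fast exchange the observed effect follows the population-weighted average of r⁻⁶ and is dominated by sparsely populated close contacts. The two-state ensemble below, its distances and populations, and the function name are hypothetical and are not taken from the ANP32a:627-NLS study.

```python
def effective_distance(distances, populations):
    """Distance (same unit as the input) that a single rigid conformer would need
    in order to reproduce the population-weighted <r^-6> average of the ensemble."""
    total = sum(populations)
    mean_inv_r6 = sum(p * r ** -6 for r, p in zip(distances, populations)) / total
    return mean_inv_r6 ** (-1.0 / 6.0)

# Hypothetical ensemble: 5% of conformers in close contact (10 A), 95% distant (40 A)
distances = [10.0, 40.0]
populations = [0.05, 0.95]

r_eff = effective_distance(distances, populations)
r_mean = sum(p * r for r, p in zip(distances, populations)) / sum(populations)
print(f"<r^-6>-averaged distance: {r_eff:.1f} A, linear mean distance: {r_mean:.1f} A")
```

In this sketch the 5% contact population pulls the apparent distance far below the linear mean, which is why interpreting such data requires an explicit ensemble description rather than a single structure.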

Figure 10

Influenza polymerase forms a highly dynamic assembly with the intrinsically disordered host transcription factor ANP32a in a species-specific way. (A) PREs measured on hANP32A (orange, experimental; and blue, representative ensembles selected using ASTEROIDS) in the presence of paramagnetically labeled human adapted 627-NLS. (B) Same information for avANP32A in the presence of paramagnetically labeled avian adapted 627-NLS. (C, D) Representation of the dynamic complexes determined from the data shown in A and B, respectively. Multivalent interactions between ANP32a (yellow/red) and the 627 domain (gray) are localized to the basic patch on the surface of 627. In the case of avANP32A and avian adapted 627-NLS(E), the ANP32A disordered domain is in general closer to the NLS domain (yellow), mediated by the hydrophobic hexapeptide (green). (E) Position of the cysteine residues used to label 627-NLS. (F) Representation of the ensemble of conformers of the hANP32A:627-NLS complex. (G) Average distance difference matrix (in Å) between ANP32A (x-axis) and the 627-NLS domains (y-axis) over the two ensembles. (Reproduced with permission from Camacho-Zarco et al. Nat. Commun. 2020[298]).

It is perhaps not surprising that electrostatic interactions in low complexity IDPs can be responsible for highly multivalent interactions. This was clearly demonstrated by a combination of smFRET and NMR spectroscopy, together with coarse grained MD simulation, to investigate the complex between two IDPs, the strongly basic histone H1 and the highly negatively charged prothymosin-α.[299] Fluorescence spectroscopy reveals affinities in the picomolar range, while NMR and smFRET reveal that the proteins remain dynamic within the complex, implying a high level of dynamic polyvalency and possible formation of transient ternary complexes.[300] The presence of dynamics in the bound state of IDRs was also characterized in recent studies of the disordered domains of the kinases MKK7[301] and MKK4[302,303] in complex with JNK1 and p38α. CEST, CPMG, and spin relaxation were measured as a function of stoichiometric ratio, suggesting that the bound state of MKK7, and the kinase specificity regions flanking the main interaction site of MKK4, both exhibited additional dynamics in the bound state, in the former case on the micro- to millisecond time scale and in the latter on relaxation-active ps-ns time scales. Similar data were used to investigate the interaction between Artemis and the DNA binding domain of ligase IV, in this case identifying a single step binding interaction.[304]

Perspectives

Over the course of this review, we have demonstrated the unique insight that NMR offers concerning the structure, dynamics, and interactions of IDPs at atomic resolution, not only in reduced systems comprising isolated proteins but also in the context of more complex molecular environments that are relevant to physiological function. In particular, we have drawn attention to the importance of describing the ensemble and time-averaging processes that govern interpretation of NMR parameters, and the remarkable insight that this can provide concerning the functional modes exploited by such highly dynamic systems. The power of NMR results in part from an analytical understanding of these ensemble and time-averaging processes, which occur on time scales spanning many orders of magnitude from picoseconds to milliseconds; this remains one of its unique advantages for studying flexible molecules. In addition to providing unique new insight into the relationship between protein flexibility and function, the combination of atomic resolution characterization of essential dynamic processes from NMR with complementary structural and dynamic probes that can be measured on similar sample preparations ensures an exciting future for NMR as an integral tool for the investigation of increasingly complex biological systems.
