Adithi Kannan (1), Athi N Naganathan (1)
(1) Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India.
Abstract
Mutational effects in globular proteins exhibit an exponential-like decreasing dependence on distance from the mutated site, suggestive of long-range modulation of structural-thermodynamic features. Here, we extract the physical origins of this pattern by employing a statistical-mechanical model to construct conformational ensembles of three archetypal proteins. Through large-scale in silico alanine-scanning mutagenesis, we show that inter-residue differential coupling free energies, which are characteristic ensemble thermodynamic properties, follow a similar exponential distance dependence with the effects felt until ∼15-20 Å from the mutated site. From the perspective of an ensemble-averaged structure, this feature arises via long-range reorganization of the interaction network on mutations which is more significant for charged residues compared to hydrophobic residues. Our work highlights how subtle alterations in the microscopic distribution of states manifest as a macroscopic distance dependence, the physical origins of mutation-induced dynamic allostery, and the necessity to consider the global intra-protein interaction network to understand mutational outcomes.
Folding-function mechanisms in proteins are primarily explored via mutational perturbations. In the large majority of studies, mutations typically involve the substitution of large hydrophobic residues in the protein core with alanine (Fersht et al., 1992; Hecht et al., 2013; Morrison and Weiss, 2001; Tang and Fenton, 2017; Xu et al., 1998). It is generally assumed that such mutations modulate only the nearest neighbor interactions, i.e. in the first shell around the mutated site, and that the native structure is unperturbed or does not change. However, evidence is accumulating on the long-range effects of even single-point mutations from hydrogen-deuterium exchange experiments (Offenbacher et al., 2017; Pacheco-Garcia et al., 2021; Puri et al., 2022; Roche et al., 2013), NMR measures of order parameters and chemical shift perturbations (Bouvignies et al., 2011; Consonni et al., 1999; Haririnia et al., 2008; Whitley et al., 2008), double-mutant-cycle measures of thermodynamic coupling (Chi et al., 2008; Fodor and Aldrich, 2004), anisotropic-/elastic-network models and simulations (Chennubhotla and Bahar, 2007; Gerek and Ozkan, 2011; Xiao et al., 2022; Yu et al., 2019), and statistical-mechanical-cum-perturbation analysis of protein structures (Guarnera and Berezovsky, 2019; Liu et al., 2009; Tee et al., 2020). In addition, while charged residue mutations have been successfully exploited to engineer stabilities (Sanchez-Ruiz and Makhatadze, 2001), they can also alter the folding-conformational landscape due to the long-range nature of charge-charge interactions. 
Thus, it is increasingly being realized that mutations could implicitly contribute to allosteric or long-range effects, and this could have implications in not just folding mechanisms but also in understanding the evolution of proteins and functionality (Buller et al., 2018; Maria-Solano et al., 2018; Markin et al., 2021; Petrovic et al., 2018; Tokuriki and Tawfik, 2009; Yang et al., 2016; Zayner et al., 2013).
Though mutational outcomes are highly context-dependent, it is observed that changes in certain structural-thermodynamic features on mutations in globular proteins follow a near-universal exponential-like trend when plotted as a function of distance from the mutated site (Rajasekaran et al., 2017a, 2017b). In other words, mutations not only influence the first shell but also the second and even the third shell around the mutated site with the perturbation “felt” until 15–20 Å, but decreasing in an exponential manner, at least when monitored from the perspective of NMR experiments (Naganathan, 2019). This observation has been exploited to build an empirical treatment of hydrophobic truncation mutations that simultaneously captures both the two-shell propagation (in the thermodynamic sense) and the unfolding thermodynamics (stability changes) (Rajasekaran and Naganathan, 2017). These experimental observations and empirical analysis, however, primarily view proteins as a single structure with perturbations “spreading out” radially from the mutation site. In reality, the “native state” is not a single structure but a collection of conformations, the Boltzmann-weighted average of which is observed as NMR or X-ray crystallographic models. 
The ensemble view of proteins is, in fact, one of the basic tenets of the energy landscape theory of protein folding (Bryngelson et al., 1995), and has been invoked to not only explain protein folding mechanisms but also function, (dynamic) allostery, and epistasis (Biddle et al., 2021; Freire, 1999; Hilser et al., 1998; Luque et al., 2002; Modi et al., 2018; Morrison et al., 2021; Motlagh et al., 2014).
The ensemble view, though attractive, has its associated challenges. How broad are the native ensembles of proteins? Is it possible to define, construct, and refine ensembles on large proteins? Can large-scale in silico mutagenesis be performed to mimic experimental saturation mutagenesis studies? Importantly, can the ensemble view reproduce the exponential-like decay of mutational perturbations when the ensemble effects are mapped on to the native structure? We address these specific questions in the current work by employing the Ising-like statistical mechanical treatment of the folding process termed the Wako-Saitô-Muñoz-Eaton (WSME) model (Muñoz and Eaton, 1999; Wako and Saito, 1978). By studying three classic model systems, PDZ (Lockless and Ranganathan, 1999), CheY (Lee, 2015), and CypA (Doshi et al., 2016) (Figure 1A), and generating more than 300 single-point mutations, we show that subtle alterations in the native ensemble distribution manifest as an exponentially decreasing distance dependence of coupling free energies when mapped on to a single structure. Our work thus reconciles the different viewpoints prevalent in the literature and provides a physical framework for understanding mutational effects and their role in dynamic allostery and protein evolution.
Figure 1
Extracting coupling free energies to study long-range effects of mutations
(A) The structure of the three model protein systems.
(B) Workflow employed to estimate the positive thermodynamic coupling free energies (coupling FEs) and their analyses.
(C–E) The coupling matrices of the wild-type PDZ (panel C), mutant Y397A (panel D), and the Differential Coupling Matrix or DCM (panel E) are depicted. Free energy units are in kJ mol−1 with panels C and D plotted with the same color scale (colorbar in panel D). Changes in the coupling FE between distant residues as evident in panel E indicate the allosteric effect of mutations. The extracted positive coupling FE, the free energy profiles, and the free energy of individual substates are compared with respect to the wild type to explore the origins of exponential decay of perturbation with distance.
Results and discussion
Differential coupling free energies as proxies for mutational effects
The protocol employed to construct ensembles and quantify the thermodynamic coupling free energies between residues is described in the flowchart presented in Figure 1B (also see “method details” section). Briefly, by employing a binary representation of residue conformational status (1 for folded and 0 for unfolded), specific sequence approximations (single- and double-sequence approximations), and considering sets of two (PDZ and CheY) or three residues (CypA) to fold as one contiguous unit (termed the block, b), the bWSME model (Gopi et al., 2019) constructs an ensemble of states whose size is set by the total number of blocks spanning the protein sequence. For example, for the 110 residue PDZ, the total number of blocks comes to 50 when considering a most probable blocksize of 2. This translates to an ensemble of 485,226 conformational substates or microstates, defined by a large array of strings of 1s and 0s, after excluding specific microstates that are not physically possible (“method details”). The populations are determined by the associated statistical weights of the individual microstates, which are in turn determined by the balance between the favorable (free-)energy terms (van der Waals, electrostatics, and solvation free energy) and destabilizing conformational entropic terms, all calculated from the PDB structure. Accordingly, the total number of microstates for CheY and CypA are 1,011,446 and 721,951, respectively. From the microstate probabilities, partial partition functions are generated for the different number of structured blocks enabling the construction of one-dimensional free energy profiles (1D FEPs). 
The 1D FEPs enable the description of native ensembles, macroscopic stabilities, and the monitoring of changes in their properties on mutations.
Every wild-type (WT) protein ensemble can be further divided into two sub-ensembles that consider the folded status of one residue with respect to another residue: the first accounts for the probabilities of all states in which both residues i and j are folded, while the second considers only those states in which residue i is unfolded and j is folded (Naganathan and Kannan, 2021). Note that this partitioning does not account for states that harbor decoupled residues (Naganathan and Kannan, 2021). Given these, the positive thermodynamic coupling free energy (FE) between two residues can then be calculated from the two sub-ensemble probabilities (Equation 1).
By calculating the coupling FE for every residue with respect to every other residue, a coupling matrix can be constructed for the WT (Figure 1C) that provides unique information on the regions of the protein structure that are coupled to varying extents in the native ensemble. The coupling FEs have provided extensive insights into the thermodynamic architecture of protein structures as they account for direct interaction between residues (first-shell effects) and beyond the first shell via intermediary interactions and residues (and hence second-shell effects and so on) (Naganathan and Kannan, 2021). Accordingly, we use these ensemble descriptors as the starting point to understand and quantify mutational effects. Alanine-scanning mutagenesis is performed by replacing every residue with alanine (except for proline, glycine, and alanine), resulting in more than 300 mutations over the three proteins. The mutated structures (mut), and hence altered contact maps and charge-charge interactions (if the mutation involves a charged residue), are fed into the model, and free-energy profiles and coupling free energies for the mutants (Figure 1D) are calculated using identical parameters as the WT. 
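Following this, the differential positive coupling FE matrix, or simply differential coupling matrix (DCM), is generated as the difference between the mutant and WT coupling matrices, $\Delta\Delta G_{+} = \Delta G_{+}^{\mathrm{mut}} - \Delta G_{+}^{\mathrm{WT}}$ (Figure 1E).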
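The construction of a coupling matrix and a DCM can be illustrated with a toy ensemble. The sketch below is not the bWSME implementation: the three-residue state list, the statistical weights, and the function names are hypothetical, and the log-ratio form RT ln(p11/p01) with its sign convention is an assumption made for illustration only.

```python
import math

R_KJ = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1
T = 310.0              # temperature quoted in the method details, K

def normalise(weights):
    """Convert statistical weights into probabilities."""
    total = sum(weights)
    return [w / total for w in weights]

def coupling_matrix(states, probs):
    """Pairwise coupling free energies (kJ/mol) from microstate probabilities.

    Entry [i][j] compares the sub-ensemble with i and j both folded (p11)
    against the one with i unfolded and j folded (p01).
    Assumed form (not taken from the source): RT * ln(p11 / p01).
    """
    n = len(states[0])
    mat = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            p11 = sum(p for s, p in zip(states, probs) if s[i] == 1 and s[j] == 1)
            p01 = sum(p for s, p in zip(states, probs) if s[i] == 0 and s[j] == 1)
            if p11 > 0 and p01 > 0:
                mat[i][j] = R_KJ * T * math.log(p11 / p01)
    return mat

def differential_matrix(mut, wt):
    """Element-wise mutant minus wild-type difference (the DCM)."""
    return [[m - w for m, w in zip(rm, rw)] for rm, rw in zip(mut, wt)]

# Hypothetical three-residue ensemble with made-up statistical weights.
STATES = [(1, 1, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)]
wt_probs = normalise([100.0, 5.0, 5.0, 1.0, 1.0, 1.0, 1.0])
mut_probs = normalise([60.0, 5.0, 8.0, 1.0, 2.0, 2.0, 1.0])

cm_wt = coupling_matrix(STATES, wt_probs)
cm_mut = coupling_matrix(STATES, mut_probs)
dcm = differential_matrix(cm_mut, cm_wt)
```

In this toy case the matrix is asymmetric (entry [0][1] differs from [1][0]) because the two sub-ensembles are defined relative to the folded status of the second residue.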
Mutational effects are long-ranged and exhibit the characteristic exponential distance-dependence
As representative examples, we generate 1D free-energy profiles (FEPs) for two alanine mutants in each of PDZ, CheY, and CypA (Figure 2). The hydrophobic residue mutations considered are either allosteric hotspots (Y397 in PDZ, V29 in CypA) or residues that affect the function of their corresponding protein (Y106 in CheY), whereas the charged residues depicted are chosen at random. The FEPs of the mutants are typically destabilized, and this is also observed in the representative mutant set from the lower free energy of the unfolded ensembles (arrows in Figures 2A, 2D, and 2G). Deviations in the folded ensemble on mutations are minimal due to the nature of the collective variable employed to project the conformations (the number of structured blocks), which is dominated by the statistical weight of the state in which the majority of the residues are folded. In some cases, for example in the D160A variant of CypA (red in Figure 2G), the larger population (or lower free energy) of the partially structured states in the native ensemble is also apparent (red arrow in Figure 2G).
Figure 2
Representative examples of free energy profiles (FEPs; free energies in kJ mol−1) and distance-dependent percolation of mutational effects in PDZ, CheY, and CypA
The FEPs of alanine mutants of hydrophobic residues (Y397A in PDZ, Y106A in CheY, and V29A in CypA, which are at a distance of ∼14 Å, ∼3 Å, and ∼18 Å from the center of mass of the functional site, respectively) are depicted in blue, and those of charged residues (E401A in PDZ, K91A in CheY, and D160A in CypA, located ∼16 Å, ∼16 Å, and 24 Å away from the center of mass of the functional site) are depicted in red (panels A, D, and G). The gray arrows in panel A, D, and G indicate the lower free energies or higher population of the unfolded ensemble, whereas the red arrow in panel G indicates the higher population of partially structured substates in the folded well. The differential coupling indices (DCIs) of the mutants are binned and averaged every 1 Å, represented against the distance from the mutated site and fit to a single exponential to obtain their amplitudes and the coupling distance (panels B, E, and H; DCIs in units of kJ mol−1). The DCIs mapped on to the structure (panels C, F, and I) indicate maximal changes around the site of mutation with other residues perturbed to varying extents. The mutated residues are indicated in orange spheres and the color bar is in kJ mol−1.
Differential coupling matrices (DCMs; Figures S1–S3) were generated following the protocol described in Figure 1. However, these DCMs cannot be directly compared with experiments. This is because in a typical experiment, one measures a set of residue-level structural-thermodynamic parameters (chemical shifts, typically) for the WT and similarly for the mutant. These are then compared against one another to generate distance-dependent profiles. Seldom are the coupling indices between residues available. 
To mimic the experimental output, we average over the rows of the DCM to generate residue-level differential coupling indices (DCIs) with the WT values as reference. Since we are primarily interested in the overall magnitude of perturbation, only the absolute values of the differential coupling FEs are considered.
The DCIs reveal a pattern wherein the effect of the mutation is the largest around the site of mutation, beyond which it decreases (Figures 2B, 2E, and 2H). The trend can be explained by an exponential function, inspired from experimental observations, of the form $\mathrm{DCI}(r) = A\,e^{-r/d_c} + C$, where $A$ is the amplitude of the perturbation (in kJ mol−1), $C$ is a shift parameter (in kJ mol−1), $r$ is the Cα-Cα distance between residues (in Å), and $d_c$ is a characteristic distance of every mutation, which we term the coupling distance (in Å) (Rajasekaran et al., 2017a). A larger $d_c$ implies that the effect of the mutation is felt at a longer distance from the origin of perturbation. Since $d_c$ appears in the exponent, the net effect of mutation is negligible only beyond 15–25 Å for $d_c$ values in the range of 5–9 Å. In each of the 6 mutations displayed in Figure 2, the $d_c$ values range between 5.7 and 9.2 Å, with the effect of charged residue mutations “felt” at longer distances compared to mutations involving hydrophobic residues. Mapping the DCIs on to the structure, it is clear that many protein residues are perturbed and to varying extents (Figures 2C, 2F, and 2I).
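A single-exponential fit of binned DCIs against distance can be sketched as follows. This is an illustrative stand-in, not the fitting procedure used in the study: it scans a grid of candidate coupling distances and, for each one, solves for the amplitude and shift by ordinary linear least squares. The function name, the grid, and the synthetic noise-free data are all hypothetical.

```python
import math

def fit_single_exponential(r, y, d_grid):
    """Fit y = A * exp(-r / d) + C by scanning d and solving A, C linearly.

    For a fixed d the model is linear in x = exp(-r / d), so A and C follow
    from a simple regression; the d with the smallest squared error wins.
    Returns (A, d, C).
    """
    best = None
    n = len(r)
    for d in d_grid:
        x = [math.exp(-ri / d) for ri in r]
        mx, my = sum(x) / n, sum(y) / n
        sxx = sum((xi - mx) ** 2 for xi in x)
        if sxx == 0.0:
            continue
        a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
        c = my - a * mx
        sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, y))
        if best is None or sse < best[0]:
            best = (sse, a, d, c)
    return best[1], best[2], best[3]

# Synthetic, noise-free DCIs (kJ/mol) at 1 A bin centres; values are made up.
distances = [0.5 + k for k in range(30)]
dcis = [1.2 * math.exp(-ri / 7.0) + 0.05 for ri in distances]

d_grid = [0.5 + 0.1 * k for k in range(300)]  # candidate coupling distances, A
a_fit, d_fit, c_fit = fit_single_exponential(distances, dcis, d_grid)

# At r = 3 * d the exponential term has dropped to exp(-3), about 5% of A.
residual_fraction = math.exp(-3.0)
```

The grid scan trades precision for robustness: the recovered coupling distance is only as fine as the grid spacing, which is adequate for a sketch but would normally be refined with a proper nonlinear optimizer.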
Exponential distance dependence of mutational effects is independent of the chemical nature of the mutated residue
To derive general principles, we further group the mutations into three classes (hydrophobic, hydrophilic, and charged residues), bin the DCIs, and extract the functional form that best explains the distance dependence of the DCIs in each of the three proteins. Such grouping ensures sufficient points in every bin, and hence the extracted coupling distance is more reliable. Mutants involving all the three classes exhibit a similar distance-dependent decay of DCIs that is captured well by a single-exponential function (Figures 3A–3C). In addition, higher order trends begin to emerge, with the coupling distances following the order charged > hydrophobic > hydrophilic. This feature is intuitively expected as mutations of charged residues to alanine would modulate both the long-range electrostatics and packing interactions, unlike the other two classes. In the latter classes, since the vast majority of hydrophobic residues are buried within the core of the protein, they exert their effect to longer distances as opposed to the hydrophilic residues that are predominantly solvent-exposed. Given that the protein-wise trends are internally consistent, a further grouping of the mutational dataset to include only the three classes does not alter the pattern (Figures 3D–3F and S4).
Figure 3
Exponential decay of differential coupling indices irrespective of the class of mutation
The mutated residues are categorized into hydrophobic (HB, blue), hydrophilic (HP, green) and charged residues (CH, red), and the distance-dependent decays of their DCIs (in kJ mol−1) follow an exponential trend for each of the model proteins (panels A, B, and C). The exponential trend is also evident when the DCIs of a particular class of mutations are grouped, binned and averaged across the three proteins (panels D, E, and F). The shaded region in panels D, E, and F represents the SD obtained upon averaging the DCIs in each bin. The frequency distributions of the coupling distances (panel G) and amplitudes in kJ mol−1 (panel H) for each class obtained from the single exponential fit indicate the long-range effect of charged residues followed by hydrophobic and hydrophilic residues. The dashed vertical lines represent the mean value for a particular class of mutations. The absolute values of DCIs averaged across all mutations (panels I, K, and M) and the mapping of these values on the structure (panels J, L, and N) indicate that only certain residues are highly perturbed. The functional sites (depicted in red open circles in panels I and K and yellow circles in panel M), as well as sector residues (depicted in yellow circles in panel I), are less perturbed due to their low coupling with the rest of the structure. The green star symbols represent residues that are both considered as sector and functional site residues of PDZ (panel I), the phosphorylation site of CheY (panel K), and the allosteric site of CypA (panel M). The allosteric quartet residues of CheY are depicted as yellow circles in panel K and sticks in panel L. The functional sites of the three proteins are represented as spheres in panels J, L, and N, whereas the allosteric site is shown in orange spheres.
An empirical treatment of the experimentally observed two-shell propagation consistently results in a coupling distance of 4–6 Å (Rajasekaran and Naganathan, 2017), unlike experimental trends that point to significantly larger values (5–12 Å) (Rajasekaran et al., 2017a). To explore if the current unbiased simulations reveal a similar range, we fit the in silico mutation data independently (similar to Figure 2B). The resulting coupling distances span a broader range (2–10 Å) and follow the trends observed for the grouped data, i.e. mutations involving charged residues exert their effect over longer distances. The amplitudes follow a similar trend but with the charged residues exhibiting a broader spectrum compared to the other two classes, signaling the possibility of a wider array of effects involving charged-residue mutations. The set of co-evolving residues in PDZ identified via sequence-based analysis has been termed the “sector” residues (Lockless and Ranganathan, 1999). Interestingly, the mean DCIs (averaged across all mutations) at the residue-level indicate a pattern wherein the majority of the weakly coupled residues fall within the sector region (Figures 3I and 3J), highlighting the relation between thermodynamic coupling and function, further recapitulating our original work on the same (Naganathan and Kannan, 2021). A similar observation can also be made for CypA and CheY, with all the active site residues exhibiting a low DCI compared to other regions (Figures 3K–3N).
Mutations reshape the conformational landscape
It is important to emphasize that the observations in Figures 2 and 3 are a manifestation of redistribution of populations in the native conformational landscape. In other words, when a mutation is introduced, it stabilizes some microstates and destabilizes others. This in turn modulates the magnitude of thermodynamic coupling between residues that are already coupled in the native ensemble, the effective average of which is quantified by the DCIs through Equation 1. This can be observed in Figure 4A, which plots the difference in free energy of microstates in the native ensemble for PDZ. The population of a large number of partially structured microstates increases on mutation (negative free-energy difference in Figure 4A, calculated for each microstate i as the free energy in the mutant relative to that in the WT; Figures 1B and 4), while in some cases the populations decrease (positive free-energy difference). Similar redistribution of population in the native ensemble is also observed for CheY and CypA (Figure S5). Here, we show only the native ensemble free energy differences to highlight that though the magnitudes are small, they match the differences in free energies predicted by Cooper and Dryden (which are of the order of ∼RT or ∼2.5 kJ mol−1) required for “dynamic allostery” (Cooper and Dryden, 1984).
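The sign convention for the microstate free-energy differences can be made concrete with a small numerical sketch. It assumes that the free energy of a microstate is taken as −RT ln p from its probability; the source does not spell out whether probabilities or statistical weights are used, so this choice, the function name, and the four made-up populations are illustrative assumptions.

```python
import math

R_KJ = 8.314462618e-3  # gas constant, kJ mol^-1 K^-1
T = 310.0              # simulation temperature from the method details, K
RT = R_KJ * T          # thermal energy scale, about 2.58 kJ/mol at 310 K

def microstate_ddg(p_mut, p_wt):
    """Per-microstate free-energy change (mutant minus WT), in kJ/mol.

    Assumes G_i = -RT * ln(p_i), so a negative value means the microstate
    is more populated in the mutant than in the wild type.
    """
    return [-RT * math.log(pm / pw) for pm, pw in zip(p_mut, p_wt)]

# Hypothetical populations: fully folded state first, then three partially
# structured states. The numbers are invented for illustration.
p_wt = [0.90, 0.06, 0.03, 0.01]
p_mut = [0.85, 0.08, 0.05, 0.02]

ddg = microstate_ddg(p_mut, p_wt)
gained = [i for i, g in enumerate(ddg) if g < 0]  # microstates that gain population
```

With these invented numbers the fully folded state loses population (positive difference) while the partially structured states gain it (negative differences), all with magnitudes at or below the order of RT.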
Figure 4
Mutations induce conformational redistribution
(A) Differences in free energy of individual microstates in the folded well for all the mutations of PDZ with respect to the WT. The color bar units are in kJ mol−1.
(B) The average differences in the FE of each microstate generated from the model (green) and the microstates corresponding to the native well alone (dark pink) across all mutations in PDZ.
(C) A blow-up of the native ensemble distribution in panel B.
(D and E) Same as C but for CheY and CypA.
(F) Schematic representation of our findings from the analyses of DCIs and free energy changes upon mutations contributing to dynamic allostery.
Figure 4B plots the distribution of the mean difference in free energy between the mutants and the WT, i.e. the distribution of the free-energy differences averaged across all mutations for a particular microstate. The average effect of a mutation is destabilizing as we perform only alanine-scanning mutagenesis. The native ensemble effects highlighted in Figure 4A constitute only a small portion of the overall landscape (Figures 4B and 4C). There are combinatorially more substates possible in the partially structured ensemble, and a large fraction of them are stabilized (compared to the fully folded microstate). The overall effects are very similar for the distributions for CypA and CheY (Figures 4D and 4E). These plots reveal that the large distribution of states modulated on mutations enables access to partially structured states that are otherwise less accessible in the native ensemble. Thus, mutations invariably modulate populations of substates both in the native ensemble and in the partially structured ensemble, potentially contributing to functional tuning. Even in the absence of functional changes, the differential coupling matrix of a mutant is never the same as the WT (as Equation 1 will always be different), particularly in alanine-scanning mutagenesis. 
We would also like to reiterate that such changes at the level of microstates are never observable in experiments.
To summarize, we construct ensembles of microstates from the WSME model starting from a single native structure, introduce in silico mutations, and quantify the differences in populations via coupling matrices. The resulting ensemble-averaged DCIs display a trend that mimics experimentally observed distance-dependent exponential-like decay of thermodynamic features (Figure 4F). It is apparent that population redistributions in the native landscape can contribute to macroscopic exponentially decaying distance-dependent effects of protein structural-thermodynamic features. A unique quantity termed the coupling distance captures the distance-dependent features, with the magnitude of this quantity varying between 2 and 10 Å depending on the residue that is mutated, its location in the protein, and the interactions that it mediates. Mutations can, therefore, subtly influence the folding status and hence dynamics of residues not just in the immediate vicinity but even as far as 20 Å from the mutation site in a context-dependent manner. The results presented here explain why residues around active sites of proteins (up to 30 Å) exhibit lower evolutionary mutational rates compared to residues farther from the active site (Jack et al., 2016). While mutations can be classified as having an effect on function or not (neutral mutation), the same cannot be said for their effect on modulating populations in the native ensemble that can be highly non-intuitive and challenging to probe from an experimental viewpoint. This observation applies even more so for charged residues on the protein surface that exert a larger effect on surrounding residues. Such modulations aid in the evolution of proteins as the intra-protein interaction network is always perturbed on mutations, in the acquisition and loss of function, and form the basis for epistatic effects.
Limitations of the study
Allostery is a broad field with a rich history, and with multiple experimental and computational treatments contributing to the understanding of this phenomenon (Chen et al., 2022; Grutsch et al., 2016; Lisi and Loria, 2016; Verkhivker et al., 2020; Wodak et al., 2019; Yao and Hamelberg, 2019). In this work, we have highlighted and explained the origins of an oft-observed but under-appreciated hallmark of mutational effects: the distance-dependent decay of perturbations, particularly those observed in NMR experiments. Exponential-like decays are the zeroth-order expectations from protein structural perturbations, while deviations from this behavior can be attributed to the specific and idiosyncratic nature of protein structures and interaction energies. Specifically, if regions adjacent to the site of mutations are only weakly coupled, the domino-like effect of perturbations could contribute to significant unraveling of the local structural features (for example, melting of helices or enhanced dynamics in loops). In such cases, these residues will not follow the exponential-like pattern observed here; this is particularly expected in case of mutations involving charged residues or perturbations of charged residues located distant from the mutated site. Deviations from exponential-like trends can be studied only on a case-by-case basis and are not addressed in this work. Moreover, we have considered only the overall magnitude of the perturbation but not its sign: are the distant residues less coupled (negative differential coupling) or more coupled (positive differential coupling), and why? Answering these questions requires a more detailed analysis where the nature of the microstates populated is also studied. Finally, the model considers only native-like energetics, and any potential non-native interactions stabilized by mutations are neither predicted nor accounted for.
STAR★Methods
Key resources table
Resource availability
Lead contact
Further information and requests for resources should be directed to and will be fulfilled by the lead contact, Athi N Naganathan (athi@iitm.ac.in).
Materials availability
The study did not involve experiments.
Method details
Model description
We employ the block version of the Wako-Saitô-Muñoz-Eaton (bWSME) model, in which sets of contiguous residues are treated as blocks (b), and all the residues in a block adopt either a folded (represented as 1) or an unfolded (0) conformation (Gopi et al., 2019). A single structure input to the model generates millions of microstates, with each microstate represented by a string of 0s and 1s. For computational efficiency, only microstates that follow the Single Sequence Approximation (SSA, a single island or contiguous stretch of folded blocks), the Double Sequence Approximation (DSA, two islands with intervening unfolded blocks) and DSA with loop (DSA w/L, two islands with interactions considered between them despite being separated by unfolded blocks) are allowed. The statistical weights of the microstates are estimated from van der Waals interactions with a cut-off distance of 5 Å, electrostatics with no cut-off distance, and implicit solvation terms (together included under the stabilization free energy), as well as conformational entropy terms for fixing a particular residue in its native state. The rigidity of the proline backbone and the high flexibility of glycine residues are accounted for in the conformational entropy terms. The calculations in this study are performed at 310 K, at an effective ionic strength of 0.1 M, and with protonation states corresponding to pH 7.0. The parameters used for the partition function calculation of the three proteins analyzed in this study (PDZ, CheY and CypA) are listed in Table S1.
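The restriction to SSA and DSA microstates can be illustrated with a short enumeration. The sketch below is only a toy, not the bWSME implementation: it lists the allowed 0/1 strings for a small number of blocks and converts a list of microstate free energies into Boltzmann probabilities. The function names, the inclusion of the fully unfolded state, and the kJ/mol units are assumptions for this example. DSA w/L microstates share the same strings as DSA and differ only in energetics, so they are not listed separately, and the energy terms themselves (van der Waals, electrostatics, solvation, entropy) are not modeled here.

```python
from itertools import combinations
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1 (units assumed for this sketch)


def enumerate_microstates(n_blocks):
    """List allowed microstates as strings of 0s (unfolded) and 1s (folded)."""
    n = n_blocks
    # SSA: one contiguous folded island spanning blocks a..b-1.
    ssa = ["0" * a + "1" * (b - a) + "0" * (n - b)
           for a, b in combinations(range(n + 1), 2)]
    # DSA: two folded islands separated by at least one unfolded block.
    dsa = ["0" * a + "1" * (b - a) + "0" * (c - b) + "1" * (d - c) + "0" * (n - d)
           for a, b, c, d in combinations(range(n + 1), 4)]
    # Including the fully unfolded state is an assumption of this sketch.
    return {"unfolded": ["0" * n], "SSA": ssa, "DSA": dsa}


def boltzmann_probabilities(free_energies, temperature=310.0):
    """Convert microstate free energies (kJ/mol) into normalized probabilities."""
    rt = R_KJ * temperature
    g_min = min(free_energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies]
    z = sum(weights)  # partition function (relative to the lowest-energy state)
    return [w / z for w in weights]


toy_states = enumerate_microstates(5)
```

For five blocks, only the three-island string `10101` is excluded, which shows how the approximation prunes the full set of 2^N strings while retaining every state with at most two folded islands.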
Mutations and DCMs
Every residue in the wild-type proteins, except alanine, glycine and proline, is mutated to alanine using PyMOL (The PyMOL Molecular Graphics System, Schrödinger, LLC), resulting in a total of 307 mutations. The structures obtained using PyMOL are used as input to create the contact maps for van der Waals interactions and electrostatics of the corresponding mutant. In other words, each mutant has different contact maps, because the mutated side chain alters the interaction network of the protein. These contact maps are used to generate the ‘allowed’ microstates and to calculate the total partition function and, hence, the probability of every microstate for all the mutants. The microstates are categorized into two sub-ensembles, and the positive thermodynamic coupling free energies are calculated (Equation 1) (Naganathan and Kannan, 2021), the details of which are given in the ‘Results and Discussion’ section. Briefly, the summed probabilities of all the microstates in which residue i is folded, and of those in which i is unfolded, given that residue j is folded, are used to estimate the coupling free energy between each pair of residues for all the wild-type and mutant proteins. Note that the resulting matrix is asymmetric, as the effect of perturbing i on j is not the same as that of j on i because the environments around the two residues differ. The difference between the pairwise coupling free energies of a mutant and those of the wild type is represented as a matrix or heat map, in which the rows and columns indicate the residue numbers and the color indicates the differential coupling free energy. This representation is referred to as the Differential Coupling Matrix (DCM). To analyze the distance-dependent effect of mutations, we average the absolute values of the differential coupling free energies across the rows of the DCMs to obtain the residue-level Differential Coupling Indices (DCIs) in the form of a vector with one entry per residue. The distances from the mutated site (Cα-Cα distances) are calculated using the structure of the protein.
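The bookkeeping described above can be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' code. Equation 1 is not reproduced in this section, so the coupling free energy is taken here to be −RT ln of the ratio of summed probabilities (i folded and j folded) over (i unfolded and j folded); the sign convention, the kJ/mol units, the averaging axis used for the DCI, and all function names are assumptions. Microstates are assumed to be already expanded from blocks to per-residue 0/1 rows.

```python
import numpy as np

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1 (units assumed)


def positive_coupling_matrix(states, probs, temperature=310.0):
    """Entry [i, j]: -RT ln( P(i folded, j folded) / P(i unfolded, j folded) ).

    Assumed form; the exact definition is Equation 1 of the source.
    """
    s = np.asarray(states, dtype=float)        # shape (microstates, residues)
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()                            # normalize probabilities
    both_folded = (s * p[:, None]).T @ s       # summed p where i and j are folded
    j_folded = p @ s                           # summed p where j is folded
    i_unfolded_j_folded = j_folded[None, :] - both_folded
    np.fill_diagonal(i_unfolded_j_folded, np.nan)  # self-coupling is undefined
    with np.errstate(divide="ignore", invalid="ignore"):
        return -R_KJ * temperature * np.log(both_folded / i_unfolded_j_folded)


def differential_coupling(g_mut, g_wt):
    """Return the DCM (mutant minus wild type) and the per-residue DCI."""
    dcm = np.asarray(g_mut) - np.asarray(g_wt)
    # Mean absolute value along each row, skipping the undefined diagonal
    # (the choice of axis is an assumption of this sketch).
    dci = np.nanmean(np.abs(dcm), axis=1)
    return dcm, dci


def ca_distances(ca_coords, mutated_index):
    """C-alpha to C-alpha distances of every residue from the mutated residue."""
    xyz = np.asarray(ca_coords, dtype=float)
    return np.linalg.norm(xyz - xyz[mutated_index], axis=1)


# Toy two-residue ensemble with made-up probabilities (not data from the study).
toy_states = [[0, 0], [1, 0], [0, 1], [1, 1]]
g_wt = positive_coupling_matrix(toy_states, [0.1, 0.2, 0.3, 0.4])
g_mut = positive_coupling_matrix(toy_states, [0.2, 0.2, 0.3, 0.3])
dcm, dci = differential_coupling(g_mut, g_wt)
```

Even in this toy ensemble the matrix is asymmetric, since conditioning on residue j being folded selects a different sub-ensemble than conditioning on residue i.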
REAGENT or RESOURCE | SOURCE | IDENTIFIER
Deposited data
Coupling free energies for the three proteins and the alanine mutants