Literature DB >> 31042036

Configurational Entropy Components and Their Contribution to Biomolecular Complex Formation.

Markus Fleck1,2, Bojan Zagrovic1.   

Abstract

Configurational entropy change is a central constituent of the free energy change in noncovalent interactions between biomolecules. Due to both experimental and computational limitations, however, the impact of individual contributions to configurational entropy change remains underexplored. Here, we develop a novel, fully analytical framework to dissect the configurational entropy change of binding into contributions coming from molecular internal and external degrees of freedom. Importantly, this framework accounts for all coupled and uncoupled contributions in the absence of an external field. We employ our parallel implementation of the maximum information spanning tree algorithm to provide a comprehensive numerical analysis of the importance of the individual contributions to configurational entropy change on an extensive set of molecular dynamics simulations of protein binding processes. Contrary to commonly accepted assumptions, we show that different coupling terms contribute significantly to the overall configurational entropy change. Finally, while the magnitude of individual terms may be largely unpredictable a priori, the total configurational entropy change can be well approximated by rescaling the sum of uncoupled contributions from internal degrees of freedom only, providing support for NMR-based approaches for configurational entropy change estimation.

Entities:  

Mesh:

Substances:

Year:  2019        PMID: 31042036      PMCID: PMC9251725          DOI: 10.1021/acs.jctc.8b01254

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.578


Introduction

Noncovalent interactions between macromolecules are fundamental to a large number of biological processes including transcription, translation, cell signaling, and many other.[1] Given an isothermal–isobaric ensemble with a constant number of particles (NPT), the Gibbs free energy change [see, e.g., ref (2)]captures the likelihood for such a binding process to occur, together with the equilibrium fractions of the species involved. Importantly, the entropic term (−TΔSsystem) remains largely underexplored when it comes to biological macromolecules. This especially concerns the configurational entropy part of the total entropy change, which stems from the solute degrees of freedom only and is notoriously difficult to measure experimentally[3−5] or calculate from atomistic simulations.[6−18] It has traditionally been assumed that the configurational entropy change is negligible in comparison to the change in solvent entropy.[19] Recently, however, it was experimentally demonstrated that the configurational entropy contribution in the case of proteins can be of similar magnitude as the solvent entropy contribution[3−5] and can thus potentially have a strong impact on the thermodynamics of protein interactions. In a more applied context, deeper insight into configurational entropy and the basic physical principles that govern its response to changes in biomolecular dynamics could significantly improve computational drug design by helping to overcome enthalpy/entropy compensation.[20−22] In the present work, we analyze the individual contributions to configurational entropy change of protein binding, stemming from internal[12,23−28] and external (rigid body rototranslational) degrees of freedom. Going beyond the previous studies of entropy change in protein–protein interactions, such as those involving normal mode calculations,[29,30] we investigate here the importance and the relative magnitude of the often ignored coupling (correlation) terms between internal and external degrees of freedom in BAT coordinates. Following previous work,[31] we employ an entropy decomposition[32] known as the mutual information expansion (MIE) in its analytical form. There exists a well-developed theoretical apparatus for the decomposition of such coupling terms (see, e.g., refs (12) and (33) and references therein), mainly for liquids. However, numerical application of MIE as an approximation of the configurational entropy is rather novel in the case of configurational entropy of biomolecules.[12] In the pioneering work by Gilson and co-workers,[12] the MIE expansion is taken at the level of single degrees of freedom as opposed to sets of degrees of freedom, which, as mentioned above, will be treated in this work. The obtained numerical coupling terms can then be summed up to approximate the analytical MIE coupling terms at the level of the sets of degrees of freedom, e.g., external with internal or main-chain with side-chain. In this context, Gilson and co-workers[34] have recently discussed and reviewed the general significance of coupling corrections in configurational entropy estimation. Importantly, they have applied a more recently developed variant of the MIE approximation, the maximum information spanning tree (MIST) approximation.[10,35] However, MIST/MIE analysis of the coupling terms in proteins resulting from the partitioning into external and internal degrees of freedom is, to the best of our knowledge, limited to a single case study,[11] employing the MIE approximation at pairwise order. Here, by distilling the previous work[31,32] to a compact form, we arrive at a decomposition of entropy into sets of external and internal degrees of freedom in the form of a general framework, which takes advantage of separable terms in the underlying potential energy. We compute these contributions to configurational entropy change at pairwise order for a large set of typical protein complexes (Figure and Table ). The isolated binding partners and their binary complexes are captured using microsecond-level classical molecular dynamics simulations (for methodological details and a full description of the simulated set, please see refs (36−38)). The simulated set exhibits a wide range of physical sizes and secondary- and tertiary-structure classes of individual binding partners as well as a variety of the total configurational entropy changes and the uncoupled configurational entropy changes of individual binding partners. Note that five of the simulated complexes involve ubiquitin as one of the binding partners (Figure a), a well-folded, biologically important protein frequently used in biophysical studies; we highlight these complexes for clarity in all of our analyses. This large-scale investigation is made possible by our recent parallel implementation[36] of MIE/MIST. For three representative complexes, we carry out an extensive analysis of configurational entropy convergence, leading to several notable results. Additionally, as there exists a certain fundamental arbitrariness[31,39,40] in decomposing the entropy over external and internal degrees of freedom, we provide an analysis of the impact of different decomposition choices (Figure ). We demonstrate that several coupling terms contribute significantly to the overall configurational entropy change across different proteins, contrary to commonly accepted assumptions. Finally, we provide a justification for the experimental estimation of the total entropy change from the leading internal uncoupled entropy terms even under these circumstances.
Figure 1

Structures of the simulated protein complexes together with the computed configurational entropy contributions to the Gibb’s free energy change of binding of the whole complexes (−TΔStotal) or binding partners alone [−TΔS1D(A) and – TΔS1D(B)]. The latter values stem from the internal degrees of freedom of individual binding partners without any coupling (mutual information) contributions included. On the x-axis, we give the PDB code[41,42] of each complex together with the number of amino acids in each partner (in parentheses). In (a), the second binding partner B (colored red) is always ubiquitin. Please see Table for further details concerning the simulated complexes.

Table 1

Simulated Protein Set

name# atomsaPDB codebcomplexcTΔS1Dd
Tsg101 protein14801KPP1S1Q190.0
ubiquitin7601UBQ1S1Q248.3
gGGA3 Gat domaine9491YD8*1YD844.0
ubiquitin7601UBQ1YD8420.4
ESCRT-I complex subunit VPS2314933R3Q1UZX131.6
ubiquitin7601UBQ1UZX187.5
E3 ubiquitin–protein ligase CBL-B4572OOA2OOB2.4
ubiquitin7601UBQ2OOB213.5
polymerase iota ubiquitin-binding motif4572L0G2KTF85.5
ubiquitin7601UBQ2KTF215.1
subtilisin Carlsberg24331SCD1R0R527.4
ovomucoid4982GKR1R0R106.6
uracil–DNA glycosylase23331AKZ1UGH–65.7
uracil–DNA glycosylase inhibitor7881UGI1UGH503.8
PPIase A16411W8V1AK4–1.4
PR160Gag-Pol14082PXR1AK4367.4
micronemal protein 112262BVB2K2S54.5
micronemal protein 64962K2T2K2S145.0
alkaline protease45031AKL1JIW173.1
alkaline protease inhibitor9972RN41JIW40.0

Number of force field atoms of individual proteins.

PDB codes[41,42] of individual proteins.

PDB codes of complexes.

Entropy change from internal degrees of freedom of individual binding partners upon complex formation without any coupling (mutual information) contributions, given in kJ mol–1, as in Figure .

The constituent GGA3 Gat domain was extracted from the PDB structure of the 1YD8 complex and named 1YD8* accordingly.

Figure 5

Effect of the choice of coordinate system on the importance of the entropy contributions. There is a largely arbitrary choice of root atoms to define the BAT coordinate system. Figure a shows values where the root atoms for each complex have been chosen (out of five sets of root atoms) in order to minimize ΔS1D(Y). The current graph shows the changes relative to this minimum choice when root atoms are chosen in order to maximize ΔS1D(Y). The percentage values describe the spans relative to the total entropy change in Figure a. Green points denote complexes involving ubiquitin (Figure a), while blue points denote complexes not involving ubiquitin (Figure b).

Structures of the simulated protein complexes together with the computed configurational entropy contributions to the Gibb’s free energy change of binding of the whole complexes (−TΔStotal) or binding partners alone [−TΔS1D(A) and – TΔS1D(B)]. The latter values stem from the internal degrees of freedom of individual binding partners without any coupling (mutual information) contributions included. On the x-axis, we give the PDB code[41,42] of each complex together with the number of amino acids in each partner (in parentheses). In (a), the second binding partner B (colored red) is always ubiquitin. Please see Table for further details concerning the simulated complexes. Number of force field atoms of individual proteins. PDB codes[41,42] of individual proteins. PDB codes of complexes. Entropy change from internal degrees of freedom of individual binding partners upon complex formation without any coupling (mutual information) contributions, given in kJ mol–1, as in Figure . The constituent GGA3 Gat domain was extracted from the PDB structure of the 1YD8 complex and named 1YD8* accordingly.

Theory

Configurational Entropy of the Binding Process

An expression for the configurational entropy of a single molecule or complex can be derived from the quasi-classical entropy integral[43,44]where R represents the universal gas constant, h the Planck constant, N the number of atoms in the molecule, and ρ the classical phase-space probability density function (pdf). q⃗ and p⃗ denote, respectively, the spatial degrees of freedom and the canonically conjugate momenta in Cartesian coordinates. Note that due to the factor h3 this integral cannot be split into momentum and spatial parts while preserving physically correct dimensions for both quantities.[38,43] As in this work we are concerned with the spatial part of the entropy (labeled just S to simplify the notation), a convenient choice for separating off the momentum entropy Sm isThe momentum entropy then evaluates to[38,43]Here, m denotes the mass vector of the solute, and kB is the Boltzmann constant. Note that this expression is constant if the temperature and atomic composition remain fixed. Under the assumption of a vanishing external field and a concentration C° associated with a container of volume V° = 1/C° for a single molecule (or complex), the spatial part of eq can be evaluated to[11,12,43]where J(q⃗int) denotes the Jacobian of the chosen molecule internal coordinates (such as anchored Cartesian[23,24] or BAT[12,23−28] coordinates). Thus, the second term on the right-hand side captures the integration over the chosen 3N – 6 internal degrees of freedom q⃗int. The term R ln(8π2V°) in eq results from integration over the 6 external degrees of freedom, which in the absence of an external field can be carried out analytically.[26] Here, following a common practice, we choose a standard concentration of C° = 1/V° = 1 mol L–3. As this term as well as the momentum entropy at fixed temperature and atomic composition is constant,[38,43] the second term on the right-hand side of eq alone is often referred to as configurational entropy. Therefore, for a single molecule, sampling the internal, spatial pdfs is sufficient to calculate the total entropy contribution to the Gibbs free energy change from the solute. Importantly, while neither the momentum entropy (eq ) nor the spatial entropy (eq ), as mentioned above, exhibit physically correct dimensions, the problematic terms cancel for entropy differences.[38,43] Thus, differences (and differences only) of these quantities bear physically valid dimensions of entropy. We now begin our analysis of the configurational entropy change of a binary binding process by deriving the configurational entropy of the unbound state (including external degrees of freedom), followed by the derivation of the configurational entropy of the bound state. As mentioned in the introduction, both derivations follow a previously discussed strategy,[31] making extensive use of the analytical MIE.[32] The configurational entropy change upon binding is then obtained by subtraction. As no other approximations are made apart from the assumption of a vanishing external field, the end result as well as all intermediate results are analytically exact in the classical limit.

Configurational Entropy of the Unbound State from a Statistical-Mechanical Perspective

In the unbound state, the molecules are assumed to be infinitely far apart, and we assume no external fields. The notation used for describing different terms is given in Table . Note that from now on we drop the vector symbol for a more convenient notation. With these assumptions and notation, the potential energy separates asHere, without a loss of generality, the external potential constants are set to zero. Analogously, the pdf factorizes intoThe factor 8π2V° results from the homogeneous probability distribution with respect to the position in the container volume, the full solid angle, as well as the external torsional degree of freedom of the molecules. Finally, using the corresponding external entropy terms R ln(8π2V°) from eq and the notation of Table , the spatial entropy of the unbound state is given as
Table 2

Nomenclature of the Degrees of Freedom

qXexternal degrees of freedom molecule 1
qYexternal degrees of freedom molecule 2a
qAinternal degrees of freedom molecule 1
qBinternal degrees of freedom molecule 2
Xrandom variables from qX
Yrandom variables from qY
Arandom variables from qA
Brandom variables from qB
a quantity associated with the bound state
S1Dentropy from marginal 1D probability density functions only (within a given subsystem)
I2Dmutual information of 2D and higher probability density functions (within a given subsystemb)
I2pairwise mutual information of 2D and higher probability density functions (shared between two subsystemsb)
I3triplet mutual information of 3D and higher probability density functions (shared between three subsystemsc)

In the reference frame of molecule 1.

Numerically approximated by 2D probability density functions in this work.

Not treated numerically in this work.

In the reference frame of molecule 1. Numerically approximated by 2D probability density functions in this work. Not treated numerically in this work.

Configurational Entropy of the Unbound State from an Information-Theoretic Perspective

The same result from the previous subsection can be derived in an information-theoretic framework. First, the MIE is denoted as[32]whereare the so-called higher-order mutual information (MI) terms of order n. Note however that I1(X) = S(X). For further derivation, it will be convenient to first prove the following summary of the previous work[32] in a general formHere, for every set of degrees of freedom q, X denotes the corresponding random variable. This equation states that, if a given set of degrees of freedom can be separated out in the energy function, all (higher-order) MI terms describing the coupling of this set to any possible subset (including the full set) of the remaining degrees of freedom vanishes simultaneously. The proof then proceeds as follows. Analogously to eqs –8, we haveThen, from eq , it follows for the pairwise MIAs for pairwise MI, we have[45]All pairwise combinations involving X1 vanish simultaneously. Furthermore, all higher-order MI terms can be expanded recursively as a sum of such vanishing pairwise MI combinations with X1 using[32]This completes the proof of eq . Now, applying the MIE (eq ) for the subsystems treated in this work yields the expansionAs, according to eq , we have U(q,q,q,q) = U(q) + U(q) + U(q) + U(q), all MI terms vanish for the unbound state. Then, in accordance with eq , one obtains

Configurational Entropy of the Bound State

In the bound state, the two molecules are close together, and therefore, the orientation of molecule 2 with respect to molecule 1 contributes to the potential energy. Because in the bound state the internal degrees of freedom between the molecules also influence each other, only the external degrees of freedom of the first molecule remain separable (as they anchor the whole complex and we assume no external field). Thus, denoting the degrees of freedom in the bound state with a tilde, one can writeThen, using eq for the vanishing MI terms and the MIE for the present subsystems (see eq ), together with the fact that as before S(X̃) = R ln(8π2V°), the configurational entropy in the bound state can be written asNote that the same result could be derived by using S(X̃,Ã,Ỹ,B̃) = S(X̃) + S(Ã,Ỹ,B̃) from the statistical mechanical framework due to eq and then applying the MIE (eq ) just to S(Ã,Ỹ,B̃).

Configurational entropy change upon binding

Using the results from the previous subsections, the configurational entropy change upon binding can be obtained by subtracting eq (or equivalently eq ) from eq asThis final result describes the fully analytical configurational entropy change in the absence of an external field expressed in terms of contributions from external and internal degrees of freedom of the molecules involved. It follows from the singular assumption of the form of the potential energy function in eqs and 18 in the classical limit without any further approximations. For the external degrees of freedom of the second molecule, the term ΔS(Y) = S(Ỹ) – ln(8π2V°) expresses the rototranslational restriction upon binding to the first molecule, in contrast to the motional freedom in the unbound state. Note also that the only contributions that reflect the coupling between the four subsystems stem from the MI in the bound state.

Results and Discussion

Using the above analytical framework, we analyzed the relative importance of the individual contributions to configurational entropy change in the case of 10 protein complexes shown in Figure . As a consequence of the limitations of in silico sampling, application of the MIST approximation at an order higher than pairwise is currently not possible for proteins of biologically relevant sizes. However, one can further dissect eq by separating the uncoupled configurational entropy from the mutual information terms within a given subsystem. Here, note that while, e.g., S(A) appears as a one-dimensional term at the level of individual subsystems, when it comes to degrees of freedom, it stems from a high-dimensional probability density function, which one can expand via eq . The same holds for I2 terms: while at the level of individual subsystems they appear as pairwise mutual information terms, at the level of degrees of freedom they are described by higher-order mutual information terms as in eq . Separating off the coupling terms within one subsystem leads then toHere, as denoted in Table , S1D refers to the sum of the I1 terms in eq and I2 to the sum of all terms I with k > 1, both referring to the equation expressed at the level of degrees of freedom and within one subsystem. While our analysis approximates all I2D and I2 terms in eq from 2D pdfs over the degrees of freedom, the triplet term I3 inherently has dimensions ≥ 3 and is, thus, difficult to sample properly. Note that the term I2D(Ỹ,Ỹ) is zero for the unbound state, corresponding to the total motional freedom of the molecules. Thus, I2D(Ỹ,Ỹ) enters the equation directly without making a difference in the case of the unbound state.

Convergence Analysis

Before analyzing and comparing individual contributions to the configurational entropy change, we would first like to discuss the convergence of our computational estimates. The uncertainty in configurational entropy calculations stems, in principle, from two main sources. First, the underlying simulations need to accurately and exhaustively sample the configurational space explored by a given molecule. While the question of force field accuracy is an important one, its adequate treatment is beyond the scope of the present study. On the other hand, the question of how exhaustively the phase space is sampled may be addressed by monitoring the convergence of the configurational entropy change and its components as a function of simulated time. Second, uncertainty is also influenced by the intrinsic properties of different configurational entropy components and their dependence on sufficient sample size. This question may be addressed by analyzing samples of different size coming from a shuffled trajectory in which the ordering of individual snapshots is randomized, thus removing the physical sources of uncertainty. We have carried out both of these types of analysis for three representative complexes in our set: the smallest one (PDB code 2KTF), the largest one (PDB code 1JIW), and a medium-size one involving S1D terms with noteworthy properties, as further discussed below (PDB code 1UGH). When it comes to total entropy change and its convergence as a function of physical time, the complexes converge to within 9, 7, and 25 kJ/mol from the final value for the 2KTF, 1UGH, and 1JIW complexes, respectively, already 80 ns before the end of the simulated trajectories (Figure ). Considering the configurational entropy components, for 1JIW, the principal determinant of convergence is the TΔS1D term of its larger protein 1AKZ, making up 4503 of its total 5500 atoms. The corresponding mutual information term −TΔI2D, in fact, converges considerably better (Figure ): for example, over the last 80 ns, TΔS1D rises by about 46 kJ mol–1, while −ΔI2D drops by only 5 kJ mol–1. In fact, an analogues statement can be made for all six proteins analyzed: −TΔI2D does not constitute the limiting factor for convergence. It is rather the TΔS1D terms that are more problematic to converge. Another noteworthy observation can be made for 1UGH: the larger protein 1AKZ, constituting 2333 of the 3121 atoms, converges surprisingly well in all of its components (Figure ). The TΔS1D terms of the smaller protein 1UGI, on the other hand, still drop by 35 kJ mol–1 over the last 160 ns. The reason for this is likely the magnitude of the respective TΔS1D values, which is −66 kJ mol–1 for 1AKZ and a considerable 504 kJ mol–1 for 1UGI after the full 800 ns of the simulations. This suggests that physical size is not necessarily the deciding factor in convergence.
Figure 2

Dependence of the calculated configurational entropy change and its components on simulated time for three representative complexes. For each complex and contribution, we give values based on simulated samples with a 1 ps output frequency whose total size increases incrementally in steps of 80 ns. For visualization purposes, the curves have been shifted to converge to zero when reaching their final value at the total simulation length of 800 ns. The reference to binding partners A and B in different complexes as denoted by their PDB codes is the same as that in Figure and Table .

Dependence of the calculated configurational entropy change and its components on simulated time for three representative complexes. For each complex and contribution, we give values based on simulated samples with a 1 ps output frequency whose total size increases incrementally in steps of 80 ns. For visualization purposes, the curves have been shifted to converge to zero when reaching their final value at the total simulation length of 800 ns. The reference to binding partners A and B in different complexes as denoted by their PDB codes is the same as that in Figure and Table . The other terms considered in this study, TΔS1D(Y), −TI2D(Ỹ,Ỹ), −TI2(Ã,B̃), −TI2(Ã,Ỹ), and −TI2(B̃,Ỹ), show rather satisfactory convergence properties for all complexes analyzed, with their values coming to within approximately 2, 2, 5, 2, and 2 kJ mol–1, respectively, of the final values already in a few 80 ns steps. What is left to discuss are the convergence properties of 1UBQ in the 2KTF complex. While ubiquitin is a stable, well-folded protein, its configurational entropy converges rather slowly, especially in its TΔS1D terms, which still drop by about 27 kJ mol–1 over the last 160 ns. The reason for this likely stems from the fact that, while well-folded, ubiquitin explores different conformational substates on a time scale that is slow compared to the simulation length of 800 ns. Indeed, the excellent convergence of configurational entropy changes and their components for all three complexes in the analysis of shuffled trajectories, with all terms converging to within 6 kJ/mol or less from the final value already within the first 80 ns (Figure SI 1), strongly suggests that the key determinant for convergence is not the sheer number of frames used for the configurational entropy calculation, but rather the quality of the underlying coverage of the phase space. In fact, the initial convergence in the analysis of shuffled trajectories turned out to be so rapid for all six proteins of the three complexes studied that we had to fine-grain the first 80 ns to steps of 8 ns to produce SI Figure 1. Putting the convergence issues aside, the final configurational entropy change values, as calculated here, may seem relatively high. There are three separate issues that need to be mentioned in this regard. First, the MIST approximation is by definition an upper bound on the absolute configurational entropy and, if the underlying absolute values are too high, it is likely that the corresponding differences will show the same trend. Note, however, that when compared to the values obtained by the quasi-harmonic approximation the MIST configurational entropy differences are actually lower by a factor of approximately 3.[38] Next, a large change in entropy is frequently accompanied by a large change in enthalpy, resulting in a moderate value for the relevant free energy change.[3] In this sense, our results could very much be physically meaningful. Finally, the main experimental estimates of configurational entropy changes in protein interactions are derived from the changes in the NMR methyl order parameters by using a linear relationship between the two.[38,46] While the proteins in our set indeed exhibit somewhat higher values of configurational entropy change as compared to the proteins that have been studied experimentally,[46] they also explore a significantly larger range of order parameter changes (a factor of ∼3). Taking this into the account, one could claim that our results are approximately consistent with the experimentally measured magnitude of configurational entropy change.

Evaluation of Contributions to Configurational Entropy Change

Acknowledging the uncertainties discussed in the previous section, we now turn back to the numerical investigation of eq . In Figure a, we evaluate this breakdown on our simulated set, each captured by one point in every column, as calculated from the pairwise order MIST approximation. The percentage values given in the graph capture the span of the values in reference to the span of the column ΔStotal. Thus, these values can be interpreted as the numerical measure of the importance of a given contribution. We have opted for such a means of comparing different terms because taking the ratios between individual components for the same binding process, while seemingly more natural, results in some cases in misleadingly extreme values. As expected, the 1D terms of the internal degrees of freedom contribute the most, followed by the coupling within the molecules. The coupling between the internal degrees of freedom of the two molecules makes up 11% of the total variation. Note that, in absolute terms, this corresponds to a variation of 40 kJ mol–1. The smallest variation stems from arguably the most exotic term: the coupling of the external degrees of freedom of molecule 2 with respect to molecule 1 with themselves. However, although fractionally minor, this 1% percent of the span still makes up for 3.6 kJ mol–1, a value that could have physical and biological significance.
Figure 3

Contributions to configurational entropy change upon protein binding. Every column represents one of the contributions in eq , and every binding process contributes one data point in every column. The percentage values represent the span in the given column in relation to the span of the first column describing the total configurational entropy change. Green points denote complexes involving ubiquitin (Figure a), while blue points denote complexes not involving ubiquitin (Figure b). (b) Same as (a) but normalized by the number of degrees of freedom of the complex for each binding process, thus illustrating size dependence.

Contributions to configurational entropy change upon protein binding. Every column represents one of the contributions in eq , and every binding process contributes one data point in every column. The percentage values represent the span in the given column in relation to the span of the first column describing the total configurational entropy change. Green points denote complexes involving ubiquitin (Figure a), while blue points denote complexes not involving ubiquitin (Figure b). (b) Same as (a) but normalized by the number of degrees of freedom of the complex for each binding process, thus illustrating size dependence. It is of interest, especially in the context of rational drug design, to assess whether the above results hold if smaller molecules are involved. To investigate this, we have analyzed the relationship between different configurational entropy contributions normalized by the number of degrees of freedom (3N – 6, where N is the total number of atoms) for each binding process (Figure b). This normalization down-weights the binding contributions of larger complexes or, i.e., up-weights those of the smaller complexes. For this reason, Y, which is comprised of a small but constant number of six external degrees of freedom regardless of the size of the complex, gains in importance. This is reflected in ΔS1D(Y) almost doubling. Also, the other terms involving Y tend to rather increase their impact [with the minor outlier −I2D(B̃,Ỹ)]. The fact that the smallest complex in this study is comprised of 1094 atoms together with the fact that the span of ΔS1D(Y) increases already by a factor of 2 demonstrates the importance of these external degrees of freedom as well as all of their couplings when investigating small systems. Thus, from an entropic point of view, retaining as much rotational and translational freedom as possible at the binding site should turn out especially beneficial for small ligands such as many drug compounds. However, due to the enthalpy/entropy compensation,[20−22] one should also consider the impact on the enthalpic component of any practical optimization in this direction. Note that there exists a fundamental arbitrariness in separating external from internal contributions, as already discussed by Gilson and co-workers.[31,39,40] In the BAT coordinate system,[12,23−28] this is reflected in the choice of root atoms from which the construction of the coordinate system is initiated. Accordingly, in the bound state, a nonphysical pseudobond is introduced connecting to the root atoms of the second molecule in order to form a complete coordinate system. For this reason, we numerically explore the impact of this largely arbitrary choice by performing our calculations for 5 different sets of root atoms in the second molecule for each of the 10 protein complexes. While Figure a illustrates the values chosen from the root atoms that minimize ΔS1D(Y), as proposed in ref (31), Figure shows the changes of the values with respect to the maximization of such terms. The spans relative to the total entropy change in Figure a suggest that the global importance of the individual terms is hardly affected by this fundamental arbitrariness. However, individual terms can exhibit quite a drastic change for certain proteins. Relationship between different configurational entropy contributions. (a) Footprint of configurational entropy contributions for two specific complexes demonstrating the nonpredictable relationship between different terms. (b) The sum of the leading internal uncoupled terms is an excellent linear predictor of the total entropy change. Effect of the choice of coordinate system on the importance of the entropy contributions. There is a largely arbitrary choice of root atoms to define the BAT coordinate system. Figure a shows values where the root atoms for each complex have been chosen (out of five sets of root atoms) in order to minimize ΔS1D(Y). The current graph shows the changes relative to this minimum choice when root atoms are chosen in order to maximize ΔS1D(Y). The percentage values describe the spans relative to the total entropy change in Figure a. Green points denote complexes involving ubiquitin (Figure a), while blue points denote complexes not involving ubiquitin (Figure b). Generally, the footprint of a given molecule does not follow a readily discernible pattern, which is illustrated in the case of the 1S1Q and 1UGH complexes in Figure a (see Figure and Table for further details). While the two complexes exhibit almost the same total entropy change ΔStotal, the contribution of the leading uncoupled terms ΔS1D is vastly different. For 1S1Q, the two binding partners contribute similarly when it comes to ΔS1D. In 1UGH, however, a small ΔS1D contribution of the larger binding partner 1AKZ is accompanied by a large ΔS1D contribution of the smaller binding partner 1UGI. Surprisingly, however, the rest of the terms are virtually the same, which is noteworthy especially when it comes to the internal coupling terms. Remarkably, however, the sum of the internal uncoupled terms ΔS1D(A) + ΔS1D(B) exhibits an excellent linear correlation with the total entropy change for both the ubiquitin-containing and the non-ubiquitin-containing complexes (Figure b). This fact provides fundamental support for the recently developed NMR-based methods for measuring the configurational entropy change of protein interactions,[4,5,46,47] which critically rely on such linear behavior. Nevertheless, although the external as well as the coupling terms obviously average out to a constant fraction rather well, given the ranges in Figure a, a customized recalibration for the system of interest (as done by the NMR methods), may likely be required for improved accuracy.
Figure 4

Relationship between different configurational entropy contributions. (a) Footprint of configurational entropy contributions for two specific complexes demonstrating the nonpredictable relationship between different terms. (b) The sum of the leading internal uncoupled terms is an excellent linear predictor of the total entropy change.

Conclusions

In summary, we have presented here a comprehensive theoretical framework for analyzing different contributions to configurational entropy change over internal and external degrees of freedom. Moreover, we have provided a quantitative assessment of the individual contributions to configurational entropy change in the case of a large set of MD simulations of biomolecular binding processes. While the analytical parts of our study are exact, the latter analysis was subject to different sources of uncertainty, including force field errors and convergence issues, and its results should be treated as such. We hope that these efforts will help to complete the theoretical foundation used for treating the configurational entropy in biomolecular systems. With recent methodological advances on both experimental and computational fronts, it is our firm conviction that such a foundation will be instrumental in numerous fundamental and applied contexts alike.

Methods

MD simulations were performed as described previously[36−38] using the GROMACS 4.0.7 simulation package,[48,49] the GROMOS 45A3 force field,[50] and the SPC water model.[51] Proteins were placed in water boxes, together with the necessary number of sodium or chloride counterions to reach neutrality, and subjected to energy minimization, followed by heating to 300 K for 100 ps and subsequent unconstrained MD simulations. The length of each MD trajectory was 1 μs, with the first 200 ns treated as an equilibration period and the remaining 800 ns analyzed. Simulations were carried out with a time step of 2 fs using 3D periodic boundary conditions, in the isothermal–isobaric (NPT) ensemble with an isotropic pressure of 1 bar and a constant temperature of 300 K, while system coordinates were output every 1 ps. The pressure and the temperature were controlled using the Berendsen thermostat and barostat[52] with 1.0 and 0.1 ps relaxation parameters, respectively, and a compressibility of 4.5 × 10–5 bar–1 for the barostat. Bond lengths were constrained using LINCS.[53] The van der Waals interactions were treated using a cutoff of 14 Å. Electrostatic interactions were evaluated using the reaction-field method,[54] with a direct sum cutoff of 14 Å and relative permittivity of 61. For the complex 1YD8, due to the lack of a separate structure, the ubiquitin binding partner (human GGA3 GAT domain) was extracted from the PDB structure of the complex and equilibrated for an additional 500 ns. The PARENT[36] program suite, a configurational entropy package in parallel architecture, was used for entropy calculations by applying the MIST approximation.[10,35] For sampling probability densities, 50 bins were used in one-dimensional cases and 50 × 50 = 2500 in two-dimensional cases.
  34 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Physical nature of higher-order mutual information: intrinsic correlations and frustration

Authors: 
Journal:  Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics       Date:  2000-09

3.  Announcing the worldwide Protein Data Bank.

Authors:  Helen Berman; Kim Henrick; Haruki Nakamura
Journal:  Nat Struct Biol       Date:  2003-12

4.  Free energy, entropy, and induced fit in host-guest recognition: calculations with the second-generation mining minima algorithm.

Authors:  Chia-En Chang; Michael K Gilson
Journal:  J Am Chem Soc       Date:  2004-10-13       Impact factor: 15.419

5.  Practical conversion from torsion space to Cartesian space for in silico protein synthesis.

Authors:  Jerod Parsons; J Bradley Holmes; J Maurice Rojas; Jerry Tsai; Charlie E M Strauss
Journal:  J Comput Chem       Date:  2005-07-30       Impact factor: 3.376

6.  Study of the insulin dimerization: binding free energy calculations and per-residue free energy decomposition.

Authors:  Vincent Zoete; Markus Meuwly; Martin Karplus
Journal:  Proteins       Date:  2005-10-01

7.  Conformational entropy of biomolecules: beyond the quasi-harmonic approximation.

Authors:  Jorge Numata; Michael Wan; Ernst-Walter Knapp
Journal:  Genome Inform       Date:  2007

8.  MIST: Maximum Information Spanning Trees for dimension reduction of biological data sets.

Authors:  Bracken M King; Bruce Tidor
Journal:  Bioinformatics       Date:  2009-03-04       Impact factor: 6.937

9.  Direct entropy calculation from computer simulation of liquids.

Authors: 
Journal:  Phys Rev A Gen Phys       Date:  1989-10-01

10.  Efficient calculation of molecular configurational entropies using an information theoretic approximation.

Authors:  Bracken M King; Nathaniel W Silver; Bruce Tidor
Journal:  J Phys Chem B       Date:  2012-02-22       Impact factor: 2.991

View more
  1 in total

1.  Selectively tuning ionic thermopower in all-solid-state flexible polymer composites for thermal sensing.

Authors:  Cheng Chi; Meng An; Xin Qi; Yang Li; Ruihan Zhang; Gongze Liu; Chongjia Lin; He Huang; Hao Dang; Baris Demir; Yan Wang; Weigang Ma; Baoling Huang; Xing Zhang
Journal:  Nat Commun       Date:  2022-01-11       Impact factor: 14.919

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.