Quantifying the entropy of binding for water molecules in protein cavities by computing correlations.

David J Huggins.

Abstract

Protein structural analysis demonstrates that water molecules are commonly found in the internal cavities of proteins. Analysis of experimental data on the entropies of inorganic crystals suggests that the entropic cost of transferring such a water molecule to a protein cavity will not typically be greater than 7.0 cal/mol/K per water molecule, corresponding to a contribution of approximately +2.0 kcal/mol to the free energy. In this study, we employ the statistical mechanical method of inhomogeneous fluid solvation theory to quantify the enthalpic and entropic contributions of individual water molecules in 19 protein cavities across five different proteins. We utilize information theory to develop a rigorous estimate of the total two-particle entropy, yielding a complete framework to calculate hydration free energies. We show that predictions from inhomogeneous fluid solvation theory are in excellent agreement with predictions from free energy perturbation (FEP) and that these predictions are consistent with experimental estimates. However, the results suggest that water molecules in protein cavities containing charged residues may be subject to entropy changes that contribute more than +2.0 kcal/mol to the free energy. In all cases, these unfavorable entropy changes are predicted to be dominated by highly favorable enthalpy changes. These findings are relevant to the study of bridging water molecules at protein-protein interfaces as well as in complexes with cognate ligands and small-molecule inhibitors.
Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

Year:  2015        PMID: 25692597      PMCID: PMC4336375          DOI: 10.1016/j.bpj.2014.12.035

Source DB:  PubMed          Journal:  Biophys J        ISSN: 0006-3495            Impact factor:   4.033


Introduction

Experimental techniques such as x-ray crystallography commonly identify water molecules in the internal cavities of proteins (1,2). Analyses of protein structure databases have shown that this observation is widespread and that protein cavities typically accommodate one to three water molecules (3,4). Such water molecules tend to be stabilized by hydrogen bonding interactions and are thus expected to exhibit strong translational and orientational ordering (5). Numerous computational studies have been performed to quantify the binding free energy of such water molecules using free energy methods (6–9). In this context, the binding free energy is the free energy of transfer of a water molecule from the bulk to the cavity. As expected, the binding free energies for crystallographically observed water molecules in protein cavities are predicted to be favorable. However, the degree of ordering relative to bulk water may mean that this favorable free energy is composed of a favorable enthalpy component and an unfavorable entropy component (5). Dunitz calculated the contribution of individual water molecules to the entropy of inorganic crystals (10). This analysis suggests that the entropic cost of transferring an individual water molecule to a protein cavity will not typically be greater than 7.0 cal/mol/K, which corresponds to a contribution of approximately +2.0 kcal/mol to the free energy. From these data, Dunitz concluded that water molecules in proteins are unlikely to make an entropic contribution of greater than +2.0 kcal/mol to the free energy. In this study, we employ the statistical mechanical method of inhomogeneous fluid solvation theory (IFST) (5,11–14) to quantify the enthalpic and entropic contributions of individual water molecules in protein cavities.
IFST has previously been used to rationalize kinase selectivity (15), identify ligand-binding hotspots at protein surfaces (16), and understand the hydrophobic effect (17). It has proven a very useful tool in modeling networks of water molecules in biological systems. IFST has also been used to study water molecules in cavities of IL-1β (18), and in this article we extend such a study to five different proteins and nineteen protein cavities. In addition, we combine the k-nearest neighbors (KNN) (19,20) algorithm with information theory (21–24) to study doubly occupied cavities and explicitly calculate the contributions of changes in water-water correlations and the associated entropy changes. To our knowledge, this approach represents the first estimate of this quantity using mutual information and provides a complete framework to calculate two-particle entropies, as envisioned in the original development of IFST (11,12).

Materials and Methods

Five test systems were chosen for this study: IL-1β (2), T4 Lysozyme (1), FKBP-2, Carbonic Anhydrase (CA-II) (25), and β-Lactamase (26) (see Table 1). From these five structures, 19 internal cavities were identified. Fifteen of these cavities were singly occupied and four were doubly occupied, and thus the analysis considers 23 water molecules in total. Details of the PDB crystal structures and the residue numbers for these 23 water molecules are listed in Table 1.
Table 1

The five proteins considered in this study along with crystallographic data, the number of singly and doubly occupied internal cavities, the initial charge of the protein before adding counterions, and the residues altered during the preparation stage

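| Protein | IL-1β | T4 Lysozyme | FKBP-2 | CA-II | β-Lactamase |
| --- | --- | --- | --- | --- | --- |
| PDB ID | 2NVH | 3DKE | 2PBC | 3GZ0 | 2P74 |
| Protein chain | A | X | A | A | A |
| Resolution (Å) | 1.53 | 1.25 | 1.8 | 1.26 | 0.88 |
| Single cavities | 2 | 2 | 1 | 5 | 5 |
| Double cavities | 2 | 1 | 1 | 0 | 0 |
| Initial charge | 0 | +9 | 0 | 0 | +1 |
| Residues altered | Q14 Flip | H31 Protonate | H55 ε-Hydrogen | H4 Protonated | H112 Flip |
| | H30 Protonate | N68 Flip | Q72 Flip | H10 ε-Hydrogen | Q31 Flip |
| | Q48 Flip | Q69 Flip | | H15 ε-Hydrogen | Q141 Flip |
| | Q116 Flip | Q123 Flip | | H17 ε-Hydrogen | Q206 Flip |
| | N119 Flip | N140 Flip | | H36 ε-Hydrogen | Q254 Flip |
| | Q126 Flip | N144 Flip | | H64 ε-Hydrogen | |
| | N129 Flip | N163 Flip | | H119 ε-Hydrogen | |
| | Q149 Flip | | | N178 Flip | |
| | | | | N253 Flip | |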

System setup

Protein structures were downloaded from the Protein Data Bank (27). Selenomethionines were changed to methionines and missing sidechains were added using Schrödinger’s Preparation Wizard (28), which was also used to check the orientations of the asparagine, glutamine, and histidine residues, as well as the protonation state of all ionizable residues. All heteroatomic species such as buffer solvents and ions were removed, with the exception of the zinc ion in the case of Carbonic Anhydrase. The changes made to each structure to improve the hydrogen bonding patterns are detailed in Table 1. The hydrogen-atom positions were then built using the HBUILD facility of CHARMM (29) with the CHARMM27 energy function (30,31), and the force field parameters and partial charges were assigned from the CHARMM27 force field (30,31). Only the crystallographic water molecules identified in internal cavities were retained, with all others being deleted. Water molecules were modeled with the TIP4P-2005 water model (32). To ensure an overall charge of zero in the system, nine chloride ions were added in the case of T4 Lysozyme and one chloride ion was added in the case of β-Lactamase. All ions were placed farther than 15 Å from the protein.

Molecular dynamics simulations

Equilibration was performed for 1.0 ns in an NVT ensemble at 300 K using Langevin temperature control (33). All systems were brought to equilibrium before continuing, by verifying that the energy fluctuations were stable. MD simulations were performed using an MD time step of 2.0 fs. Electrostatic interactions were modeled with a uniform dielectric and a dielectric constant of 1.0 throughout the equilibration and production runs. Van der Waals interactions were truncated at 11.0 Å with switching from 9.0 Å. Electrostatics were modeled using the particle mesh Ewald method (34), and the systems were treated using rhombic dodecahedral periodic boundary conditions. All non-water atoms were fixed for the entirety of the equilibration and production simulations. MD simulations were performed using NAMD version 2.9 (35). To explore the effect of protein conformation on the IFST results, we generated 10 conformations of T4 Lysozyme by running a 10.0 ns simulation with heavy-atom restraints of 1.0 kcal/mol/Å² and storing the coordinates every 1.0 ns. Each unique protein conformation was then used for a separate 10.0 ns IFST calculation with a fixed protein structure. For comparison, the 10.0 ns simulation with heavy-atom restraints was also analyzed.

Free energy perturbation calculations

Free energy perturbation (FEP) calculations were performed for each internal cavity to calculate the binding free energy of the buried water molecules (ΔG_bind). The total free energy change for an FEP calculation (ΔG_FEP) was calculated as the sum of the free energy changes for a series of N small steps between intermediate states a and b (36). The change in free energy for each small step (ΔG_a→b) was calculated using the partition functions (Q) for the two states, which are calculated from the Hamiltonians (H):

$$\Delta G_{\mathrm{FEP}} = \sum_{i=1}^{N} \Delta G_{a \rightarrow b}^{(i)} \qquad (1)$$

$$\Delta G_{a \rightarrow b} = -k_{\mathrm{B}} T \ln \frac{Q_b}{Q_a} = -k_{\mathrm{B}} T \ln \left\langle \exp\left( -\frac{H_b - H_a}{k_{\mathrm{B}} T} \right) \right\rangle_a \qquad (2)$$

where k_B is the Boltzmann constant, T is the temperature, and the angle brackets denote an ensemble average over state a. The results for the forward and backward FEP simulations were combined using the Bennett Acceptance Ratio (BAR) method (37,38). BAR was implemented using the ParseFEP Plugin from Visual Molecular Dynamics and the statistical error was estimated in each case (39). The estimated statistical error in the FEP free energy predictions using BAR was less than 0.1 kcal/mol in all cases. We used 32 equally spaced λ windows for the forward FEP simulations and 32 equally spaced λ windows for the backward FEP simulations. A soft-core potential was employed with a van der Waals radius-shifting coefficient of 5.0 (40,41), electrostatic interactions were scaled down to zero between λ = 0.0 and λ = 0.5, and van der Waals interactions were scaled down to zero between λ = 0.5 and λ = 1.0 (38). Equilibration was performed for 250 ps for each λ window, and production simulation was performed for 750 ps for each λ window. An NVT ensemble was used throughout and all nonwater atoms were fixed for the entirety of the FEP simulations. The free energy cycle for calculating ΔG_bind can be seen in Fig. 1. The first step transfers the water molecule from 55 M to a fixed point in solution (−ΔG_lib), and the second step annihilates the water molecule from bulk (−ΔG_bulk). ΔG_bulk was calculated to be −6.95 kcal/mol for a fixed TIP4P-2005 water molecule using FEP at 1 atm and 300 K in an NPT ensemble. The third step transfers the water molecule from the fixed location back to 55 M in vacuum (ΔG_lib), and thus the two liberation terms in the cycle cancel one another. The fourth step is to harmonically restrain the oxygen atom of the water molecule to aid convergence of the free energy calculations (7,9,42,43). This harmonic restraint leads to an analytic free energy penalty (ΔG_harm) given by Eq. 3:

$$\Delta G_{\mathrm{harm}} = -RT \ln\left( C_0 V \right) \qquad (3)$$

$$V = \left( \frac{2 \pi R T}{k} \right)^{3/2} \qquad (4)$$

where R is the gas constant, C_0 is the concentration of water in bulk (55 M), V is the volume available to the harmonically restrained water molecule, and k is the harmonic force constant. The value of k was set to 0.5 kcal/mol/Å² in all cases, and thus ΔG_harm was calculated to be +0.23 kcal/mol. The fifth step is exnihilation of the water molecule in the cavity (ΔG_cav) using FEP. The final step is to remove the harmonic restraint (ΔG_release). This contribution is assumed to be zero, as in previous work, which is justified because the dynamics of the water molecule in the cavity are not affected by a force constant that is small in relation to the atomic fluctuations (9,18). For cavities containing two water molecules, the exnihilation is performed on both simultaneously and interactions between the exnihilated water molecules are also scaled to decouple them (18). Considering the steps described above, with the two liberation terms canceling and the restraint-removal term taken as zero, ΔG_bind can be calculated using Eq. 5:

$$\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{cav}} + \Delta G_{\mathrm{harm}} - \Delta G_{\mathrm{bulk}} \qquad (5)$$

The symmetry contribution to the binding free energy (−0.41 kcal/mol in the case of a water molecule) is only appropriate if there is a difference between the sampling of the symmetry-related states in the bound and unbound states (44). The unbound water molecules are treated as fixed and cannot sample the two symmetry-related states. In this case, the bound water molecules were not observed to sample the two symmetry-related states, presumably because of a large kinetic barrier. Thus, there is no symmetry contribution.
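As a rough illustration of the per-window free energy change described above, the following Python sketch applies one-sided exponential averaging to synthetic energy differences. It is not the BAR estimator used in this study. The Gaussian energy differences, the function names, and the sample size are assumptions chosen only so that the estimate can be compared with the analytic Gaussian result μ − σ²/(2k_BT).

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/mol/K


def exp_average_delta_g(delta_h, temperature=300.0):
    """Free energy change for one window from energy differences H_b - H_a
    sampled in state a (one-sided exponential averaging, not BAR)."""
    kt = K_B * temperature
    exponents = [-dh / kt for dh in delta_h]
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


def total_delta_g(windows, temperature=300.0):
    """Sum the per-window free energy changes over all windows."""
    return sum(exp_average_delta_g(w, temperature) for w in windows)


# Synthetic data (assumption): Gaussian energy differences in kcal/mol.
rng = random.Random(0)
mu, sigma = 1.0, 0.5
samples = [rng.gauss(mu, sigma) for _ in range(20000)]

estimate = exp_average_delta_g(samples)
analytic = mu - sigma ** 2 / (2.0 * K_B * 300.0)
print(f"estimate = {estimate:.3f} kcal/mol, analytic = {analytic:.3f} kcal/mol")
```

In practice, one-sided averaging converges poorly when neighboring windows overlap weakly, which is one reason to combine forward and backward simulations with BAR as done here.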
Figure 1

The steps in the free energy cycle used to calculate ΔG_bind for a water molecule. The term aq represents a water molecule in aqueous conditions, the term vac represents a water molecule in vacuum, and the term cav represents a water molecule in a protein cavity. 55 M represents a water molecule at the standard bulk concentration of 55 mol/dm³, fixed represents a water molecule that is at a fixed point, and harm represents a water molecule that is harmonically restrained.
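The arithmetic of this cycle can be sketched in a few lines of Python. The sketch assumes a restraint potential of the form U = ½kr², for which the accessible volume is (2πRT/k)^{3/2}; this form is an assumption, but it reproduces the reported restraint penalty of +0.23 kcal/mol. The function names and the example cavity free energy of −14.0 kcal/mol are hypothetical.

```python
import math

R = 0.0019872041    # gas constant, kcal/mol/K
T = 300.0           # temperature, K
RHO_BULK = 0.03324  # bulk water number density, molecules/A^3 (about 55 M)
DG_BULK = -6.95     # reported free energy for the bulk leg, kcal/mol


def restraint_volume(k):
    """Volume (A^3) sampled in the well U = 0.5*k*r^2 (assumed form)."""
    return (2.0 * math.pi * R * T / k) ** 1.5


def restraint_penalty(k):
    """Free energy cost (kcal/mol) of confining a 55 M molecule to that volume."""
    return -R * T * math.log(RHO_BULK * restraint_volume(k))


def binding_free_energy(dg_cavity, k=0.5):
    """Close the cycle: the liberation terms cancel and the
    restraint-release term is taken as zero."""
    return dg_cavity + restraint_penalty(k) - DG_BULK


dg_harm = restraint_penalty(0.5)
print(f"restraint penalty = {dg_harm:+.2f} kcal/mol")
# Hypothetical exnihilation free energy, for illustration only.
print(f"binding free energy = {binding_free_energy(-14.0):.2f} kcal/mol")
```

Because the penalty depends only on k and T, it is a constant offset for every cavity when a single force constant is used.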

IFST calculations

IFST calculates the hydration free energy of a solute (ΔG_IFST) by computing the difference in free energy between a solution and the same number of solvent molecules (n) modeled as the pure bulk solvent (11,12). ΔG_IFST can be calculated for small subvolumes of the system, allowing the contribution of specific regions to be estimated (13,45,46). In the context of protein systems, water molecules tend to cluster at distinct locations termed hydration sites, and it is natural to compute the contribution of individual water molecules to the hydration free energy (14,18,47,48). For the IFST calculations, 100.0 ns of production simulation in an NVT ensemble were performed at 300 K for each system. System snapshots were saved every 5.0 ps, yielding 20,000 snapshots in total for each system. We calculated a mean and standard deviation for each of the IFST quantities by considering 10 blocks from the 100 ns simulation, each derived from 2000 randomly selected snapshots. ΔG_IFST is calculated from the contributions of the hydration energy (ΔE_IFST) and hydration entropy (ΔS_IFST) to the hydration free energy:

$$\Delta G_{\mathrm{IFST}} = \Delta E_{\mathrm{IFST}} - T \Delta S_{\mathrm{IFST}} \qquad (6)$$

ΔE_IFST is calculated from the mean solute-water interaction energy (E_sw), the mean water-water interaction energy (E_ww), and the mean interaction energy of a bulk water molecule (E_bulk):

$$\Delta E_{\mathrm{IFST}} = E_{sw} + E_{ww} - E_{\mathrm{bulk}} \qquad (7)$$

E_ww and E_bulk are defined as half the interaction energy of a water molecule with all other water molecules in the system. For the TIP4P-2005 water model, E_bulk is calculated to be −11.5748 kcal/mol. The total hydration entropy is calculated as the sum of local (S_local) and nonlocal (S_nonlocal) contributions. S_local is the summed contribution of each local subvolume and S_nonlocal is the sum of the volume entropy (S_vol) and the change in liberation entropy (ΔS_lib) (12). S_vol is calculated from the two-particle correlation function (g_sw), which is a function of the position (r) and orientation (ω) of the water molecule.
The number density of bulk TIP4P-2005 water (ρ) is calculated to be 0.03324 molecules/Å³. In the expressions for S_vol and ΔS_lib (Eqs. 8 and 9), V is the partial molar volume of the solute, α is the thermal expansion coefficient of the solvent, and κ is the isothermal compressibility of the solvent. Equation 9 uses the Kirkwood-Buff relationship for V (49). The two remaining integrals in Eqs. 8 and 9 can be understood as a local term corresponding to a reduction in the volume accessible to the solvent (V_local) and a nonlocal term corresponding to an increase in the volume accessible to the solvent (V_nonlocal) as the system expands under constant pressure. These two terms are equal and cancel one another. ΔS_IFST is calculated from the two-particle correlations using the solute-water entropy (S_sw) and the difference in water-water entropy (ΔS_ww); higher-order correlations are not considered:

$$\Delta S_{\mathrm{IFST}} = S_{sw} + \Delta S_{ww} \qquad (10)$$

$$\Delta S_{ww} = S_{ww} - S_{\mathrm{bulk}} \qquad (11)$$

For the TIP4P-2005 water model, S_bulk is calculated from the values for E_bulk and ΔG_bulk to be −15.5097 cal/K/mol. In recent work, we developed a novel KNN approach to calculate S_sw using the translational and orientational distance metrics (50). The translational distance (d_trans) between two water molecules is simply the Euclidean norm between the Cartesian coordinates of the two water molecules. The orientational distance (d_orient) between two water molecules is the distance between the rotations required to bring the two orientations to the same reference orientation. The correct distance metric for the rotation group is twice the geodesic distance on the unit sphere (51). The KNN algorithm provides an unbiased estimate of the absolute entropy (H) from the general expression in Eq. 12:

$$H = \frac{p}{n} \sum_{i=1}^{n} \ln d_{i,k} + \ln\left( \frac{n \, \pi^{p/2}}{\Gamma(p/2 + 1)} \right) - L_{k-1} + \gamma \qquad (12)$$

where n is the number of samples, d_{i,k} is the distance between sample point i and its kth nearest neighbor, p is the number of degrees of freedom, Γ is the gamma function, L_0 is 0, and γ is Euler’s constant.
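A minimal Python sketch of a KNN entropy estimator of this general type is given below. It assumes the standard nearest-neighbor form (p/n)Σ ln d + ln(nπ^{p/2}/Γ(p/2 + 1)) − L_{k−1} + γ with plain Euclidean distances, and it takes L_{k−1} to be the harmonic sum of 1/j for j below k, which is an assumption for k > 1. The function name and the Gaussian test data are illustrative and are not the hydration-site data used in this work.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def knn_entropy(points, k=1):
    """k-nearest-neighbor entropy estimate (in nats) for p-dimensional points.

    Brute-force O(n^2) neighbor search with Euclidean distances; assumes
    no duplicate points (a zero distance would make the logarithm fail).
    """
    n = len(points)
    p = len(points[0])
    log_dist_sum = 0.0
    for i, a in enumerate(points):
        dists = sorted(math.dist(a, b) for j, b in enumerate(points) if j != i)
        log_dist_sum += math.log(dists[k - 1])
    unit_ball_volume = math.pi ** (p / 2.0) / math.gamma(p / 2.0 + 1.0)
    l_k_minus_1 = sum(1.0 / j for j in range(1, k))  # zero when k = 1
    return ((p / n) * log_dist_sum + math.log(n * unit_ball_volume)
            - l_k_minus_1 + EULER_GAMMA)


# Illustrative check on a 2D standard normal, whose entropy is 1 + ln(2*pi).
rng = random.Random(1)
samples = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(800)]
estimate = knn_entropy(samples)
exact = 1.0 + math.log(2.0 * math.pi)
print(f"KNN estimate = {estimate:.2f} nats, exact = {exact:.2f} nats")
```

Multiplying the result by the gas constant converts it to thermodynamic units, as described for the solute-water entropy.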
In the context of an individual water molecule, the relative solute-water thermodynamic entropy (S_sw) is calculated using the total distance (d_total) between two water molecules in six dimensions (p = 6) (50). It is calculated from the difference between the absolute entropy of the distribution (H) and the absolute entropy of a uniform distribution (H_uniform). We disregard the symmetry number (two in this case) as it is present in both H and H_uniform. We use the first nearest neighbor in this work (k = 1) and the number of samples is equal to the number of frames where a water molecule is present in the hydration site. In the resulting expression for S_sw (Eq. 14), R is the gas constant, which converts the entropy to thermodynamic units, and Γ(4) is equal to 6. The Cartesian coordinates of water molecule j in frame i and its nearest neighbor water molecule k in frame l are denoted by x_ij, y_ij, z_ij and x_kl, y_kl, z_kl, respectively. The quaternion representations of the rotations for water molecule j in frame i and its nearest neighbor water molecule k in frame l are denoted by q_ij and q_kl, respectively. In this work, we extend the nearest neighbor approach to calculate S_ww from the two-particle correlation functions (g_sw for each of the two water molecules and g_ww for the pair) and the triplet correlation function (g_sww), which is a function of the 12 variables representing the positions and orientations of a pair of water molecules (p = 12). In the resulting expressions, I_ww is best understood as a mutual information (MI) term as it represents the additional correlation between two water molecules that is not captured by the solute-water entropy term. The remaining term of S_ww accounts for the exclusion of solvent from the volume occupied by other solvent molecules and is related to the Kirkwood-Buff integral (11,49). For a single water molecule, S_ww can be calculated by considering the pair correlation with all other water molecules in the system. In this work, we restrict the sum to pairs of water molecules within 4.0 Å.
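The distance metrics described above can be sketched as follows. The orientational distance uses 2·arccos(|q1·q2|) for unit quaternions, which is twice the geodesic distance on the unit sphere and equals the rotation angle between the two orientations. Combining the translational and orientational parts as a root sum of squares is an assumption made for this sketch, as are the function names.

```python
import math


def translational_distance(r1, r2):
    """Euclidean distance between two positions (A)."""
    return math.dist(r1, r2)


def orientational_distance(q1, q2):
    """Rotation-group distance between two unit quaternions (radians).

    The absolute value accounts for q and -q encoding the same rotation.
    """
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))  # clamp guards against round-off


def total_distance(r1, q1, r2, q2):
    """Six-dimensional distance (assumed root-sum-of-squares combination)."""
    return math.hypot(translational_distance(r1, r2),
                      orientational_distance(q1, q2))


identity = (1.0, 0.0, 0.0, 0.0)
half_turn_z = (0.0, 0.0, 0.0, 1.0)  # 180 degree rotation about z
print(orientational_distance(identity, half_turn_z))
print(total_distance((0.0, 0.0, 0.0), identity, (3.0, 4.0, 0.0), identity))
```

Note that this combination mixes ångströms and radians, so the relative weighting of translation and rotation is implicit in the choice of units.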
One can easily consider I_ww as a combination of two- and three-particle entropy terms (Eq. 19), in which the two-particle entropies of the two water molecules are computed from the pair data and can be calculated using Eq. 14. The three-particle entropy (S_sww) can be calculated from the total distance (d_total) between two pairs of water molecules. In this case, both hydration sites are occupied in every snapshot for each doubly occupied cavity and thus all n frames contain pair data. In the corresponding expression (Eq. 20), Γ(7) is equal to 720. In practice, problems arise from combining KNN terms of different dimensionality in Eq. 19 (52). Thus, we use the method of permuted fill modes as described by Hensen et al. (53). The permuted set of distances (d_perm) captures the correlation of the individual water molecules with the solute but decouples the correlation between the water molecules by computing the entropy of the artificially decorrelated data. Whereas the individual entropy terms, such as the one defined in Eq. 20, obey a power law convergence (as previously observed), the MI estimate in Eq. 22 does not appear to obey a power law convergence. It is also interesting to note that this MI estimate is not biased, as the bias terms from the two entropy estimates cancel one another. Within a singly occupied cavity, I_ww and the exclusion term are negligible because there are no significant water-water pair correlations. In a cavity with two or more water molecules, only pairs of subvolumes in which the solute-water correlation functions of both water molecules are nonzero will make a contribution to the exclusion term. These regions are very small and the exclusion term is thus expected to be negligible. For this reason, we assume that the exclusion term is zero in all cases. As a comparison, the magnitude of the exclusion term in bulk water can be calculated from the radial distribution function and has a value of −0.99 cal/mol/K for the TIP4P-2005 water model, making a contribution of +0.29 kcal/mol to the excess free energy.
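The idea of estimating mutual information from two entropy estimates of equal dimensionality can be illustrated with a simplified permutation scheme. The sketch below shuffles the pairing between two variables to obtain artificially decorrelated data, then takes the difference between the KNN entropies of the decorrelated and original joint data. It is a stand-in for the general idea only and does not reproduce the fill-mode procedure of Hensen et al.; the function names and the correlated Gaussian test data are assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def knn_entropy(points):
    """First-nearest-neighbor entropy estimate (nats), brute force."""
    n = len(points)
    p = len(points[0])
    total = 0.0
    for i, a in enumerate(points):
        nearest = min(math.dist(a, b) for j, b in enumerate(points) if j != i)
        total += math.log(nearest)
    unit_ball_volume = math.pi ** (p / 2.0) / math.gamma(p / 2.0 + 1.0)
    return (p / n) * total + math.log(n * unit_ball_volume) + EULER_GAMMA


def permuted_mutual_information(x, y, rng):
    """MI (nats) between paired samples x[i], y[i], each a coordinate tuple."""
    joint = [xi + yi for xi, yi in zip(x, y)]
    shuffled = list(y)
    rng.shuffle(shuffled)  # breaks the x-y pairing but keeps both marginals
    decorrelated = [xi + yi for xi, yi in zip(x, shuffled)]
    # Both terms have the same dimensionality, so estimator bias largely cancels.
    return knn_entropy(decorrelated) - knn_entropy(joint)


# Illustrative data: two Gaussian variables with correlation coefficient 0.9.
rng = random.Random(2)
rho = 0.9
x, y = [], []
for _ in range(800):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x.append((a,))
    y.append((rho * a + math.sqrt(1.0 - rho ** 2) * b,))

mi = permuted_mutual_information(x, y, rng)
exact = -0.5 * math.log(1.0 - rho ** 2)
print(f"MI estimate = {mi:.2f} nats, exact = {exact:.2f} nats")
```

For water molecules, the tuples would hold the six translational and orientational coordinates of each molecule, and the distance would be replaced by the six- and twelve-dimensional metrics described in the text.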

Results and Discussion

We begin by considering the IFST estimates of enthalpy, entropy, and free energy contributions of each water molecule to the protein hydration free energy. The calculations for the 23 sites are presented in Table 2.
Table 2

The results of the IFST calculations for the 23 hydration sites, with the mean and standard deviation from 10 blocks reported

| System | PDB water ID | E_sw (kcal/mol) | E_ww (kcal/mol) | ΔE_IFST (kcal/mol) | −TS_sw (kcal/mol) | −TS_ww (kcal/mol) | −TΔS_IFST (kcal/mol) | ΔG_IFST (kcal/mol) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IL-1β | 202 | −15.93 ± 0.03 | −2.69 ± 0.01 | −7.05 ± 0.03 | 6.35 ± 0.04 | 0.07 ± 0.02 | 1.80 ± 0.05 | −5.25 ± 0.04 |
| IL-1β | 204 | −16.73 ± 0.02 | −2.72 ± 0.01 | −7.88 ± 0.02 | 6.41 ± 0.03 | 0.07 ± 0.02 | 1.86 ± 0.04 | −6.02 ± 0.03 |
| IL-1β | 203 | −14.56 ± 0.05 | −1.97 ± 0.02 | −4.96 ± 0.04 | 6.63 ± 0.05 | 0.10 ± 0.03 | 2.11 ± 0.06 | −2.85 ± 0.04 |
| IL-1β | 207 | −13.75 ± 0.02 | −1.99 ± 0.02 | −4.16 ± 0.03 | 5.77 ± 0.03 | 0.10 ± 0.03 | 1.25 ± 0.03 | −2.91 ± 0.03 |
| IL-1β | 200 | −21.21 ± 0.02 | | −9.48 ± 0.02 | 6.53 ± 0.04 | | 1.91 ± 0.04 | −7.57 ± 0.03 |
| IL-1β | 209 | −19.04 ± 0.03 | | −7.33 ± 0.03 | 5.08 ± 0.04 | | 0.46 ± 0.04 | −6.87 ± 0.02 |
| T4 Lysozyme | 902 | −22.13 ± 0.02 | −2.95 ± 0.01 | −13.50 ± 0.02 | 6.72 ± 0.03 | 0.05 ± 0.02 | 2.16 ± 0.03 | −11.34 ± 0.04 |
| T4 Lysozyme | 905 | −18.67 ± 0.02 | −2.97 ± 0.01 | −10.06 ± 0.02 | 6.45 ± 0.03 | 0.05 ± 0.02 | 1.88 ± 0.03 | −8.18 ± 0.03 |
| T4 Lysozyme | 904 | −16.23 ± 0.03 | | −4.65 ± 0.03 | 6.27 ± 0.03 | | 1.65 ± 0.03 | −3.00 ± 0.01 |
| T4 Lysozyme | 920 | −21.10 ± 0.02 | | −9.54 ± 0.02 | 6.40 ± 0.03 | | 1.78 ± 0.03 | −7.76 ± 0.01 |
| FKBP-2 | 207 | −17.96 ± 0.01 | −2.37 ± 0.01 | −8.76 ± 0.01 | 6.01 ± 0.03 | 0.08 ± 0.01 | 1.48 ± 0.03 | −7.29 ± 0.02 |
| FKBP-2 | 208 | −19.54 ± 0.03 | −2.44 ± 0.01 | −10.40 ± 0.02 | 6.09 ± 0.05 | 0.08 ± 0.01 | 1.56 ± 0.06 | −8.85 ± 0.05 |
| FKBP-2 | 203 | −26.49 ± 0.02 | | −14.91 ± 0.02 | 7.03 ± 0.04 | | 2.41 ± 0.04 | −12.50 ± 0.02 |
| CA-II | 2004 | −20.77 ± 0.03 | | −9.47 ± 0.03 | 6.47 ± 0.03 | | 1.85 ± 0.03 | −7.62 ± 0.02 |
| CA-II | 2015 | −27.52 ± 0.03 | | −15.92 ± 0.03 | 6.85 ± 0.04 | | 2.23 ± 0.04 | −13.70 ± 0.01 |
| CA-II | 2031 | −22.40 ± 0.02 | | −11.09 ± 0.02 | 6.28 ± 0.02 | | 1.66 ± 0.02 | −9.44 ± 0.02 |
| CA-II | 2042 | −24.11 ± 0.02 | | −12.56 ± 0.02 | 6.59 ± 0.03 | | 1.97 ± 0.03 | −10.58 ± 0.02 |
| CA-II | 2055 | −17.80 ± 0.02 | | −6.20 ± 0.02 | 6.66 ± 0.03 | | 2.04 ± 0.03 | −4.17 ± 0.03 |
| β-Lactamase | 2023 | −16.08 ± 0.03 | | −3.95 ± 0.03 | 6.02 ± 0.04 | | 1.40 ± 0.04 | −2.55 ± 0.02 |
| β-Lactamase | 2048 | −27.94 ± 0.01 | | −16.33 ± 0.01 | 7.09 ± 0.03 | | 2.47 ± 0.03 | −13.87 ± 0.03 |
| β-Lactamase | 2073 | −28.60 ± 0.03 | | −16.47 ± 0.03 | 6.52 ± 0.03 | | 1.90 ± 0.03 | −14.57 ± 0.02 |
| β-Lactamase | 2105 | −24.70 ± 0.02 | | −13.08 ± 0.02 | 5.93 ± 0.05 | | 1.31 ± 0.05 | −11.77 ± 0.04 |
| β-Lactamase | 2327 | −30.27 ± 0.02 | | −18.70 ± 0.02 | 7.29 ± 0.02 | | 2.67 ± 0.02 | −16.03 ± 0.01 |
| Minimum | | | | −18.70 | | | 0.46 | −16.03 |
| Maximum | | | | −3.95 | | | 2.67 | −2.55 |
| Mean | | | | −10.28 | | | 1.82 | −8.46 |

The standard deviations are all below 0.1 kcal/mol and generally much smaller. The predicted free energies are in the range from −2.6 to −16.0 kcal/mol, and this is in excellent agreement with previous IFST studies that showed a range from −1.9 to −17.2 kcal/mol (5,14). The predicted enthalpies are larger in magnitude than the entropies and make the dominant contributions to the free energies for all 23 water molecules. This is also in agreement with previous studies. The mean entropic contribution (−TΔS_IFST) of +1.82 kcal/mol is in good agreement with the contribution of +2.0 kcal/mol estimated by Dunitz, and the maximum contribution of +2.67 kcal/mol is not far above this value. As expected, the most favorable ΔE_IFST and the most unfavorable −TΔS_IFST are found for protein cavities containing charged amino acid sidechains. Table 2 also shows that the I_ww term (reported as −TS_ww) is small in magnitude for each pair of water molecules in the four doubly occupied cavities. The IFST and FEP results are reported in Table 3.
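The way the columns of Table 2 combine can be checked with a short Python sketch. It assumes that the bulk entropy term follows from the reported bulk values as T·S_bulk = E_bulk − ΔG_bulk (about −4.62 kcal/mol), and that the water-water entropy change is the tabulated water-water term minus the bulk value. The function name is illustrative, and the example uses the tabulated means for IL-1β water 202.

```python
E_BULK = -11.5748            # mean bulk water interaction energy, kcal/mol
DG_BULK = -6.95              # reported bulk free energy, kcal/mol
T_S_BULK = E_BULK - DG_BULK  # T*S_bulk, about -4.62 kcal/mol (assumed relation)


def ifst_site(e_sw, e_ww, minus_ts_sw, minus_ts_ww=0.0):
    """Return (dE, -T*dS, dG) in kcal/mol for one hydration site."""
    delta_e = e_sw + e_ww - E_BULK
    # -T*dS = -T*S_sw - T*(S_ww - S_bulk)
    minus_t_delta_s = minus_ts_sw + minus_ts_ww + T_S_BULK
    return delta_e, minus_t_delta_s, delta_e + minus_t_delta_s


# IL-1β water 202: E_sw, E_ww, -TS_sw, -TS_ww from Table 2.
delta_e, minus_t_delta_s, delta_g = ifst_site(-15.93, -2.69, 6.35, 0.07)
print(f"dE = {delta_e:.2f}, -TdS = {minus_t_delta_s:.2f}, dG = {delta_g:.2f} kcal/mol")
```

Small discrepancies of about 0.01 kcal/mol for other rows are expected because the tabulated inputs are rounded.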
Table 3

The results of the FEP and IFST calculations for the 19 cavities

| System | PDB water IDs | ΔG_bind (kcal/mol) | ΔG_IFST (kcal/mol) | Signed difference (kcal/mol) | Unsigned difference (kcal/mol) |
| --- | --- | --- | --- | --- | --- |
| IL-1β | 202 and 204 | −11.77 | −11.27 | −0.49 | 0.49 |
| IL-1β | 203 and 207 | −6.18 | −5.76 | −0.42 | 0.42 |
| IL-1β | 200 | −7.09 | −7.57 | 0.49 | 0.49 |
| IL-1β | 209 | −6.90 | −6.87 | −0.03 | 0.03 |
| T4 Lysozyme | 902 and 905 | −20.41 | −19.52 | −0.89 | 0.89 |
| T4 Lysozyme | 904 | −3.33 | −3.00 | −0.33 | 0.33 |
| T4 Lysozyme | 920 | −8.29 | −7.76 | −0.53 | 0.53 |
| FKBP-2 | 207 and 208 | −16.70 | −16.13 | −0.57 | 0.57 |
| FKBP-2 | 203 | −13.07 | −12.50 | −0.57 | 0.57 |
| CA-II | 2004 | −8.21 | −7.62 | −0.59 | 0.59 |
| CA-II | 2015 | −14.05 | −13.70 | −0.36 | 0.36 |
| CA-II | 2031 | −10.06 | −9.44 | −0.62 | 0.62 |
| CA-II | 2042 | −11.09 | −10.58 | −0.51 | 0.51 |
| CA-II | 2055 | −4.29 | −4.17 | −0.12 | 0.12 |
| β-Lactamase | 2023 | −2.31 | −2.55 | 0.24 | 0.24 |
| β-Lactamase | 2048 | −14.30 | −13.87 | −0.44 | 0.44 |
| β-Lactamase | 2073 | −13.98 | −14.57 | 0.59 | 0.59 |
| β-Lactamase | 2105 | −12.06 | −11.77 | −0.29 | 0.29 |
| β-Lactamase | 2327 | −16.53 | −16.03 | −0.51 | 0.51 |
| Mean | | | | −0.31 | 0.45 |

The agreement between IFST and FEP is extremely good, with an R² coefficient of determination of 0.995 and a mean unsigned difference (MUD) of 0.45 kcal/mol. The accuracy of the estimates for I_ww is supported by the close agreement of ΔG_bind and ΔG_IFST. In addition to analyzing the protein conformation from the crystal structure, we considered the effect of the protein conformation on the predictions of IFST in the case of T4 Lysozyme. This was achieved by analyzing 10 different protein conformations generated from a simulation with harmonically restrained heavy atoms. The results are presented in Table 4.
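The summary statistics for Table 3 can be recomputed from the tabulated free energies with the Python sketch below. Treating R² as the squared Pearson correlation coefficient is an assumption, since the exact definition is not stated; the variable and function names are illustrative.

```python
# Tabulated free energies from Table 3 (kcal/mol), in row order.
fep = [-11.77, -6.18, -7.09, -6.90, -20.41, -3.33, -8.29, -16.70, -13.07,
       -8.21, -14.05, -10.06, -11.09, -4.29, -2.31, -14.30, -13.98, -12.06,
       -16.53]
ifst = [-11.27, -5.76, -7.57, -6.87, -19.52, -3.00, -7.76, -16.13, -12.50,
        -7.62, -13.70, -9.44, -10.58, -4.17, -2.55, -13.87, -14.57, -11.77,
        -16.03]


def mean(values):
    return sum(values) / len(values)


def pearson_r2(x, y):
    """Squared Pearson correlation coefficient (assumed definition of R^2)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


signed = [a - b for a, b in zip(fep, ifst)]
mean_signed = mean(signed)
mud = mean([abs(d) for d in signed])
r2 = pearson_r2(fep, ifst)
print(f"mean signed = {mean_signed:.2f}, MUD = {mud:.2f}, R^2 = {r2:.3f}")
```

The negative mean signed difference indicates that FEP predicts slightly more favorable binding than IFST on average for these cavities.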
Table 4

The results of the IFST calculations for 10 conformations of T4 Lysozyme, alongside the results for the first 10 ns of the fixed crystal structure simulation, and the results for the 10 ns harmonically restrained simulation

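| PDB water ID | Fixed crystal structure ΔE_IFST (kcal/mol) | Fixed crystal structure −TΔS_IFST (kcal/mol) | Fixed crystal structure ΔG_IFST (kcal/mol) | Ten simulation structures ΔE_IFST (kcal/mol) | Ten simulation structures −TΔS_IFST (kcal/mol) | Ten simulation structures ΔG_IFST (kcal/mol) | Harmonically restrained ΔE_IFST (kcal/mol) | Harmonically restrained −TΔS_IFST (kcal/mol) | Harmonically restrained ΔG_IFST (kcal/mol) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 902 | −13.50 | 2.16 | −11.34 | −13.08 ± 0.83 | 2.32 ± 0.32 | −10.76 ± 0.84 | −13.30 | 1.30 | −12.00 |
| 905 | −10.06 | 1.90 | −8.16 | −9.11 ± 0.39 | 2.12 ± 0.17 | −6.99 ± 0.45 | −8.80 | 1.38 | −7.41 |
| 904 | −4.65 | 1.66 | −2.99 | −5.62 ± 1.40 | 1.21 ± 0.25 | −4.41 ± 1.26 | −5.35 | 0.21 | −5.14 |
| 920 | −9.54 | 1.78 | −7.76 | −8.31 ± 0.96 | 1.53 ± 0.29 | −6.79 ± 0.84 | −8.52 | 0.70 | −7.82 |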

For the 10 conformations, the means and standard deviations are reported.

Comparing the results from the fixed crystal structure with the results from the harmonically restrained structure, the average −TΔS_IFST term is reduced from 1.87 kcal/mol to 0.90 kcal/mol. This is in agreement with previous work showing that the −TΔS_IFST term is smaller in magnitude when using data from a harmonically restrained protein simulation and is the expected result, because of a blurring of the probability densities when the protein is mobile (18). The ΔE_IFST and ΔG_IFST terms also differ to a small extent, with the most notable difference being that the binding free energy is more favorable by 2.15 kcal/mol in the case of water 904. Comparing the IFST results from the fixed crystal structure with the results from the ensemble of 10 structures, the differences are lower than 1.5 kcal/mol in all cases, and the qualitative picture remains the same. However, it is worth noting that the standard deviations from the 10 simulations are relatively large, with a maximum of 1.40 kcal/mol for ΔE_IFST in the case of water 904.

Conclusions

In this work, we have predicted the free energy of transferring water molecules from the bulk into a buried protein cavity using the methods of IFST and FEP. This can be viewed either as the binding free energy of the water molecule (ΔG_bind) or the contribution of the water molecule to the hydration free energy of the protein (ΔG_IFST). The free energy contributions are all strongly favorable and dominated by a favorable enthalpy component. The entropy contributions to the free energy are all unfavorable, but relatively small in magnitude. The water molecules with the strongest binding affinities tend to be in hydrophilic cavities making one (β-Lactamase W2073) or two (β-Lactamase W2327) hydrogen bonds with charged amino acids. Conversely, the water molecules with the weakest binding affinities tend to be in hydrophobic cavities (T4 Lysozyme W904) with very little potential for hydrogen bonding, as expected. The agreement between IFST and FEP for the 19 protein cavities in the five test systems is extremely good, with an R² coefficient of determination of 0.995 and a MUD of 0.45 kcal/mol. To our knowledge, this study also represents the first calculation of the total two-particle entropy term (ΔS_IFST) using information theory. The excellent agreement between IFST and FEP indicates that the mutual information terms between the pairs of water molecules (I_ww) have been calculated correctly and that this term is negligible for water molecules in protein cavities. This is an important result because it demonstrates that the majority of the correlation is captured by the solute-water entropy (S_sw) term. This means that the total change in water-water entropy (ΔS_ww) is approximately equal to −S_bulk (rather than zero) and makes a significantly favorable contribution of −4.62 kcal/mol to the binding free energy for the TIP4P-2005 water model. This result is expected to hold for highly ordered water molecules in protein active sites.
It is important to note that small values of I_ww in no way suggest an absence of correlation between the pair of water molecules in a doubly occupied protein cavity. Solute-water and water-water correlations are explicitly wrapped up in the total two-particle entropy change (ΔS_IFST), and the I_ww term serves to capture additional correlations not quantified by the S_sw term. One could equally well proceed by considering the water-water correlations first, but the choice of a solute reference frame significantly simplifies the calculations. In this study, we have considered cavities containing up to two water molecules. Studies on cavities with three or more water molecules would allow the accuracy of the two-particle approximation to be assessed. This is an important issue that should be addressed in future work. At the same time, it will also be important to estimate the exclusion term of S_ww explicitly, as it may be significant and vary between buried and surface-exposed hydration sites. The average prediction of −TΔS_IFST is +1.82 kcal/mol and this is consistent with the work of Dunitz, who noted that water molecules in inorganic salts differ in entropy from bulk solvent by an amount corresponding to a free energy difference of approximately +2.0 kcal/mol. The largest value of −TΔS_IFST in this study is found for a water molecule near two charged residues and is +2.67 kcal/mol. In comparison, the greatest difference in Dunitz’s analysis is +3.5 kcal/mol in the case of zinc sulfate monohydrate (10). It is important to note that the majority of entropy estimates in this work are for a fixed protein structure. The use of a fixed protein allows direct comparison of the IFST and FEP predictions and thus validation of the IFST calculations. It is clear that using a harmonically restrained or fully mobile protein structure within the current framework of IFST will lead to misestimation of the entropies because of a blurring of the probability densities.
However, the results from calculations on 10 different fixed conformations of T4 Lysozyme suggest that IFST results should be robust in cases where the solute does not show a significant deviation in size, shape, or electrostatics. This is true for the cavities within these compact protein structures and should extend to many cases of interest in biology. Despite this, it is clear that further work is needed to develop probability-based statistical mechanical methods such as IFST to make accurate predictions. The timescales for the IFST and FEP simulations are similar, though the timescales for FEP depend on the number of λ windows. FEP therefore requires either benchmark simulations for each particular case or user input in configuring the calculations. Conversely, the convergence of IFST calculations can be automatically monitored throughout a simulation until the required accuracy is reached. Importantly, a single IFST simulation is informative about every hydration site in a system whereas FEP requires a separate simulation for each water molecule. In addition, IFST can be used to study individual water molecules within a network whereas FEP is best suited to studying single water molecules or groups of water molecules as a whole. This is because annihilating a water molecule from a network requires artificially creating an unphysical vacuum in a hydration site. For this reason, IFST is unique in yielding spatially resolved prediction of water thermodynamics and this is extremely useful in identifying ligand-binding hotspots at protein surfaces (48) and guiding the design of high-affinity small-molecule inhibitors (47). The IFST calculation of the hydration entropy is defined in the context of a fixed solute. The use of a mobile protein has been shown to reduce the magnitude of ΔS_IFST, and this can be attributed to a blurring of the probability densities.
The effect of the protein conformation on the results of IFST has been investigated by performing simulations for 10 different fixed conformations of T4 lysozyme. The results are very similar to those for the crystal structure conformation, with no difference greater than 1.5 kcal/mol. This suggests that IFST predictions will be robust if the protein conformation does not deviate significantly from the conformation observed in the crystal structure. However, the standard deviations are significant, and thus the choice of protein conformation will affect the results if a single protein conformation is used. Combining results from multiple IFST calculations on an ensemble of protein conformations allows one to account for molecular flexibility and estimate the coupling of solute and solvent degrees of freedom.
Clearly, FEP has the major advantage of considering molecular flexibility and, alongside thermodynamic integration, remains the tool of choice for estimation of absolute and relative binding free energies. However, when studying water molecules, IFST has a number of unique advantages, and these make it a very useful tool for understanding the role of hydration in the structure and function of biological systems.
In summary, we have developed what is, to our knowledge, a new approach combining KNN and MI to calculate the total two-particle entropy in the context of IFST, and we have shown that the resulting predictions for the contribution of water molecules in protein cavities to the hydration free energy agree extremely well with equivalent predictions using FEP. The predicted entropy contributions to the free energy are in the range of +0.46 to +2.67 kcal/mol, in excellent agreement with historical estimates. In the future, it will be interesting to apply the entropy estimates developed in this work to extended water networks at protein surfaces and within protein binding sites. In these cases, the coupling of solute and solvent degrees of freedom is expected to be more significant and will need to be treated appropriately.
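To make the combination of KNN and MI mentioned in the summary more concrete, the following is a minimal sketch and not the implementation used in this study. It assumes a Kozachenko-Leonenko k-nearest-neighbor entropy estimator and obtains the mutual information from the simple identity I(X;Y) = H(X) + H(Y) − H(X,Y), rather than from the dedicated estimators of Kraskov et al. The function names, the choice k = 4, the one-dimensional Gaussian test data, and the temperature of 300 K in the unit conversion are all illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329
GAS_CONSTANT_CAL = 1.987204  # cal/(mol K)


def digamma_int(n):
    """Digamma function evaluated at a positive integer n."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(points, k=4):
    """Kozachenko-Leonenko entropy estimate in nats.

    points: list of equal-length tuples of floats (no duplicate points).
    Uses a brute-force neighbor search, so it is only suitable for small samples.
    """
    n = len(points)
    d = len(points[0])
    # Log volume of the d-dimensional unit ball (Euclidean norm).
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_r_sum = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        log_r_sum += math.log(dists[k - 1])  # distance to the k-th neighbor
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_r_sum / n


def knn_mutual_information(xs, ys, k=4):
    """Mutual information in nats from three KNN entropy estimates."""
    h_x = knn_entropy([(x,) for x in xs], k)
    h_y = knn_entropy([(y,) for y in ys], k)
    h_xy = knn_entropy(list(zip(xs, ys)), k)
    return h_x + h_y - h_xy


def entropy_nats_to_minus_t_delta_s(delta_s_nats, temperature_k=300.0):
    """Convert an entropy change in nats (units of R) to -T*dS in kcal/mol.

    The default temperature is an assumption made for illustration.
    """
    return -GAS_CONSTANT_CAL * temperature_k * delta_s_nats / 1000.0


# Illustrative synthetic data only: two correlated Gaussian coordinates.
rng = random.Random(0)
rho = 0.9
xs = [rng.gauss(0.0, 1.0) for _ in range(300)]
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
ys = [rho * x + math.sqrt(1.0 - rho ** 2) * e for x, e in zip(xs, noise)]

mi = knn_mutual_information(xs, ys)
print("mutual information (nats):", mi)
# A correlation of I nats lowers the entropy by R*I, so dS = -I in units of R.
print("-T*dS from this correlation (kcal/mol):", entropy_nats_to_minus_t_delta_s(-mi))
```

In this sketch, a positive mutual information corresponds to a reduction in entropy and hence an unfavorable (positive) contribution to −TΔS. A production calculation would need a spatial index for the neighbor search and a treatment of orientational coordinates, neither of which is attempted here.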
  41 in total

1.  The Protein Data Bank.

Authors:  H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal:  Nucleic Acids Res       Date:  2000-01-01       Impact factor: 16.971

2.  Estimating mutual information.

Authors:  Alexander Kraskov; Harald Stögbauer; Peter Grassberger
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2004-06-23

3.  Good practices in free-energy calculations.

Authors:  Andrew Pohorille; Christopher Jarzynski; Christophe Chipot
Journal:  J Phys Chem B       Date:  2010-08-19       Impact factor: 2.991

4.  The acylation mechanism of CTX-M beta-lactamase at 0.88 a resolution.

Authors:  Yu Chen; Richard Bonnet; Brian K Shoichet
Journal:  J Am Chem Soc       Date:  2007-04-05       Impact factor: 15.419

Review 5.  CHARMM: the biomolecular simulation program.

Authors:  B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus
Journal:  J Comput Chem       Date:  2009-07-30       Impact factor: 3.376

6.  Thermodynamic stability of water molecules in the bacteriorhodopsin proton channel: a molecular dynamics free energy perturbation study.

Authors:  B Roux; M Nina; R Pomès; J C Smith
Journal:  Biophys J       Date:  1996-08       Impact factor: 4.033

7.  Understanding kinase selectivity through energetic analysis of binding site waters.

Authors:  Daniel D Robinson; Woody Sherman; Ramy Farid
Journal:  ChemMedChem       Date:  2010-04-06       Impact factor: 3.466

8.  Grid inhomogeneous solvation theory: hydration structure and thermodynamics of the miniature receptor cucurbit[7]uril.

Authors:  Crystal N Nguyen; Tom Kurtzman Young; Michael K Gilson
Journal:  J Chem Phys       Date:  2012-07-28       Impact factor: 3.488

9.  Symmetry numbers for rigid, flexible, and fluxional molecules: theory and applications.

Authors:  Michael K Gilson; Karl K Irikura
Journal:  J Phys Chem B       Date:  2010-12-16       Impact factor: 2.991

10.  Thermodynamics of buried water clusters at a protein-ligand binding interface.

Authors:  Zheng Li; Themis Lazaridis
Journal:  J Phys Chem B       Date:  2006-01-26       Impact factor: 2.991

