Literature DB >> 35534011

Spatially Resolved Hydration Thermodynamics in Biomolecular Systems.

Abstract

Water is essential for the structure, dynamics, energetics, and thus the function of biomolecules. It is a formidable challenge to elicit, in microscopic detail, the role of the solvation-related driving forces of biomolecular processes, such as the enthalpy and entropy contributions to the underlying free-energy landscape. In this Perspective, we discuss recent developments and applications of computational methods that provide a spatially resolved map of hydration thermodynamics in biomolecular systems and thus yield atomic-level insights to guide the interpretation of experimental observations. An emphasis is on the challenge of quantifying the hydration entropy, which requires characterization of both the motions of the biomolecules and of the water molecules in their surrounding.

Entities: Chemical

Mesh：

Substances：
Water

Year: 2022 PMID： 35534011 PMCID： PMC9150089 DOI： 10.1021/acs.jpcb.2c01088

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 3.466

Introduction

Water has been called the “matrix of life”, as it plays a role in almost all biological processes, covering a broad range from, for example, protein folding, biomolecular recognition, ligand binding, and enzyme activity via ion transport to self-assembly, osmosis, and diffusion.[1] However, despite this importance, the precise nature of the intricate relationships between biomolecules like proteins, nucleic acids, lipids, carbohydrates, etc. and water often remains elusive, particularly at a molecular or atomic level. This can hamper our fundamental understanding of biomolecular processes and impede their targeted modulation, for example, with biochemistry or chemical biology techniques. From a global thermodynamics viewpoint, it is free energy that dictates chemical equilibria: a process will occur spontaneously if the free energy change associated with it is negative (favorable). In the isothermal–isobaric ensemble, the thermodynamic potential that is calculated from the partition function is the Gibbs free energy, whose changes are governed by enthalpy and entropy changes, ΔG = ΔH – TΔS. Thus, although ΔG is the driver, the interplay between enthalpy and entropy is helpful to understand the free energy economy of a process, its temperature dependence, and the underlying thermodynamic driving forces. In many cases, rather small net free energy differences result from large but almost compensating enthalpy and entropy contributions, a phenomenon known as enthalpy/entropy compensation.[2] Biomolecular processes are governed by a plethora of different interatomic interactions, and hydration-related contributions to free energy, enthalpy, and entropy are often of a sizable magnitude, as is further illustrated below. Enthalpy is linked to the strength of interatomic interactions and is directly related to the internal energy U of a molecular system, H = U + pV, where pV is the pressure–volume work. Usually, one is interested in differences between two (or more) states or systems at constant pressure, ΔH = ΔU + pΔV. For almost incompressible condensed-phase systems, such as aqueous solutions at standard ambient conditions, the pΔV term is negligible, and thus enthalpy differences can be readily obtained as potential energy differences. Entropy is related to atomic fluctuations (i.e., motions/dynamics), and its quantification is often more challenging because it in principle requires counting the microstates of the system of interest. While the entropy can be straightforwardly obtained for simple model systems that can be treated analytically, this endeavor becomes highly nontrivial for (biological) macromolecules with a large number of degrees of freedom and a complex energy landscape, which prohibits an analytical treatment. This “entropy challenge” is even aggravated for liquids, which have a pronounced diffusive component and can access an extremely large configuration space volume, which scales exponentially with the number of solvent molecules. Thus, complete numerical sampling is essentially impossible already even for a relatively small number of solvent molecules, and simple harmonic or quasiharmonic approximations are inaccurate. If one aims at obtaining a realistic picture of a biomolecular system, which is typically comprised of biomolecules and a large number of water molecules, plus ions and cosolutes, one is facing a combination of the above challenges. The interactions between biomolecules and water are determined by local and typically short-ranged interatomic forces, and the thermodynamics of a molecular system are determined by the interplay of all these local interactions. However, the idea of “local thermodynamics” is controversial. In principle, thermodynamic properties are macroscopic system properties, and thus a local perspective might appear questionable. For example, solvent reorganization energy and entropy are inherently nonlocal in nature, as they involve displacements of many solvent molecules. Entropy is linked to atomic fluctuations, which can be local but might also involve the collective motions of many atoms and thus are not localizable or assignable to a single particle or a specific volume element of the system in an unambiguous way. Nevertheless, a local and spatially resolved perspective is highly desirable for mechanistic insights, for example, to map out the contributions of individual water molecules to the binding of a ligand inside a protein pocket. Despite the impressive progress that has been made with experimental methods for probing the structure and dynamics of water at the surface of biomolecules and small molecules,[3,4] obtaining a spatially resolved picture at the sub-nanometer scale of the thermodynamic driving forces underlying biomolecular processes remains a major challenge, especially when it comes to individual water molecules. In computer simulations, the positions of all atoms are known, and several computational methods, each coming with certain approximations and limitations, have been developed for studying the links between the atomic interactions and thermodynamic parameters, such as free energy, enthalpy, and entropy. In this Perspective, we focus on physics-based theoretical approaches that can provide a spatially resolved picture of the—in this sense “local”—thermodynamics and thus yield microscopic insights that are difficult to obtain otherwise. The local thermodynamic quantities provided by the different methods must be interpreted with care because the localization and decomposition are not unique (see above); however, this limitation is diminished when one compares the corresponding integrated quantities. It is, in principle, straightforward to compute enthalpies (or differences thereof) from the ensemble-averaged interaction energies, , as defined by the potential energy function used in a molecular dynamics (MD) or Monte Carlo (MC) simulation.[5] However, depending on the particular system, achieving sufficient statistical precision with this “brute force” approach might require rather extensive equilibrium sampling, especially if large fluctuations and slow correlation times are involved.[6] If in addition the free energy difference associated with the process under study is known, for example, from thermodynamic integration (TI), free energy perturbation (FEP), or similar methods, ΔS is directly obtained from the difference, ΔS = (ΔH – ΔG)/T. These rigorous statistical mechanics methods provide accurate results, given the possible inaccuracies of the force field and limited sampling. However, the microscopic insights that can be gained with such approaches concerning the molecular mechanisms at play can be somewhat limited because obtaining spatial resolution is challenging, unless for special cases, for example, when single localized water molecules are investigated that are not part of a larger network (in which annihilation of a water molecule would create a defect). In addition, breaking down the solvation entropy into translational and rotational contributions, or one-body and many-body terms, etc., is not easily possible. Alternatively, entropy can be obtained from the temperature dependence of the free energy or heat capacity. The contributions to the free energy of solvation, ΔGsolv = ΔHsolv – TΔSsolv, can be formally decomposed into contributions from solute–water (SW) and water–water (WW) interactions, ΔHsolv = ΔUSW + ΔUWW and ΔSsolv = ΔSSW + ΔSWW. It is important to notice that the enthalpy and entropy terms that are connected to changes in water–water interactions exactly cancel each other;[7,8] that is, ΔUWW – TΔSWW = 0. As contributions that are exclusively related to water reorganization thus have no net contribution to the free energy, interpreting ΔHsolv and TΔSsolv in terms of thermodynamic driving forces should focus on the noncanceling contributions related to solute–solvent interactions, ΔGsolv = ΔUSW – TΔSSW. The above exact enthalpy–entropy compensation applies to all solvents, but for water the magnitudes of the individual canceling contributions are typically particularly large. Computational methods that provide a spatially resolved picture of solvent thermodynamics include inhomogeneous solvation theory (IST)[9,10] and a three-dimensional (3D) grid-based adaptation (GIST),[11−13] two-phase thermodynamics (2PT),[14,15] and the spatially resolved (grid-based) extension 3D-2PT,[16] cell theory,[17,18] grid cell theory (GCT),[19] and multiscale cell correlation (MCC).[20−22] For more details about the methods, the interested reader is referred to the recent review by Heyden.[23] A notable new method is permutation reduction[24] combined with a mutual information expansion (Per|Mut),[25,26] which is further discussed below. Here, we do not cover integral equation theory methods based on a 3D-reference interaction site model (3D-RISM) as a computationally cheap alternative that does not require explicit Boltzmann sampling of configurations, but refer to work by Nguyen et al. for a comparison of 3D-RISM and GIST.[27] In this Perspective, recent methodological developments and applications are showcased that focus on the spatial decomposition of hydration thermodynamics from explicit solvent simulations of biomolecular systems, with a particular emphasis on proteins. We start with the discussion of water confined inside protein cavities and continue with water that forms the hydration layers around proteins. For more exhaustive reviews we refer the reader to the recent literature.[3,4]

Water Inside Proteins

This section focuses on the role of water in the binding of small molecules to proteins in internal cavities that are at least partially—often even largely—buried inside the protein structure. Processes of interest include, for example, the formation of enzyme–substrate or protein–ligand (or inhibitor) complexes. The overall binding free energy includes different contributions, not all of which are linked to hydration (or at least not directly), such as protein–ligand interactions and the conformational energy and entropy associated with conformational reorganization of the protein and/or the ligand upon binding, etc. Thus, analyzing hydration-related contributions can only reveal a part of the full story. However, especially when comparing differences in binding thermodynamics of similar ligands with similar binding modes, the approximations underlying a hydration-focused view might be justified because the other contributions are expected to cancel out to a large extent. In general, hydration-related contributions to the overall thermodynamics can be sizable in magnitude, and their exploitation has been identified by the pharmaceutical industry as a putative strategy, for example, during the late-stage lead optimization phase of structure-based drug discovery campaigns.[28,29] The idea is that understanding the thermodynamics of active-site water molecules can provide guidance as to whether or not ligand optimization should aim to displace the water, because the energetic cost of displacement would need to be recovered by the ligand–protein interactions.[30] Buried binding pockets typically have a concave shape and provide a confined environment for positionally ordered and precisely located (“structural”) water molecules, which can mediate interactions between the ligand and the receptor and/or be displaced upon complex formation and be released into the bulk solvent. Irrespective of the precise details of the ligand–protein interactions, water molecules are rearranged upon binding, and both protein and ligand surfaces are (partially) desolvated. Thus, hydration effects can have an important impact on the binding process, not only in terms of the thermodynamics but also the kinetics.[31,32] Understanding (let alone predicting) the hydration-related thermodynamic contributions from high-resolution structures alone is notoriously hard, not only in terms of the free energy but also concerning the enthalpy and entropy contributions to it; this challenge is just as difficult when it comes to the kinetics. Furthermore, enthalpy/entropy compensation can hinder the targeted optimization of binding affinity.[33] Interestingly, the rearrangement of water networks upon complex formation was found to contribute to enthalpy/entropy compensation,[34] further supporting the notion that understanding binding is intimately linked with understanding hydration. Computational methods can determine the locations of water molecules in protein structures and provide detailed microscopic insights into the thermodynamics behind their contributions to binding processes. Hence, they have found widespread application, as recently reviewed by Samways et al.[30] Inhomogeneous solvation theory[9,10] and its grid-based extension GIST, as, for example, implemented in Schrödinger’s WaterMap or the AmberTools, are among the most popular approaches.[11−13,35−40] In the following, we briefly introduce the basic methodological concepts and discuss selected case examples from the recent literature in which solvation entropy and energy have been explicitly considered in a spatially resolved manner.

Grid Inhomogeneous Solvation Theory (GIST)

The basic idea of GIST is to use MD or MC to sample the equilibrium (Boltzmann) distribution of water molecules around a given solute and to discretize the water density and solvation energy and entropy on a three-dimensional grid composed of small cubic voxels.[11,13] A typical spacing used for GIST grids is 0.5 Å, which is much smaller than one single water molecule and provides a sufficiently detailed spatial resolution for capturing the anisotropic shapes of water density distributions in biomolecular environments. At the same time, it allows for reasonable statistical convergence of the voxel properties, which are limited by the finite configurational sampling. The GIST entropies are single-body terms, that is, the solvation entropy, which in principle includes contributions of solute–water and water–water correlations, is approximated only by the solute–water term, ΔSsolv = ΔSSW + ΔSWW ≈ ΔSSW. In subsequent work, Nguyen et al. extended the original GIST implementation to also account for water–water two-body correlations in the translational entropy ΔStrans. However, this extension is usually not used in practical applications because the ΔStrans contribution was found to be rather small but very hard to statistically converge, even with a k-nearest-neighbor method (instead of histograms) that reduces the numerical noise in the estimated density distributions.[13] In contrast, the energies ΔU do contain the solute–water and water–water (second-order) terms. After its implementation and demonstration in a proof-of-concept study of the hydration patterns around cucurbit[7]uril, a small model receptor with exceptional solvation properties,[7]uril. J. Chem. Phys.. 2012 ">11] early work of GIST on proteins focused on the binding site hydration of Factor Xa.[12] This serine protease is involved in blood coagulation and has a ligand binding site that is not too deeply buried and thus allows for rather rapid sampling of solvent configurations. Figure shows a rendering of the GIST hydration thermodynamics in the Factor Xa binding pocket and visualizes the distinct sites at which the hydration thermodynamics differs substantially from the bulk. The maxima are located at similar positions in the different panels, which is a consequence of the weighting of the thermodynamic quantities plotted by the water number density, which has maxima at these distinct locations. Figure A,B shows that the protein–water and water–water interaction energies are negative (favorable) and positive (unfavorable) compared to the bulk, respectively. This is expected, as water molecules inside the binding pocket form contacts with the protein while sacrificing water–water H-bonds, and thus they can not form the same H-bonded network as in the bulk liquid. These two effects partly cancel each other, but in sum the total interaction energies are favorable for most sites, with the exception of site 6 (Figure C). The entropy density distribution (Figure D) resembles that of the water–water energy. The solvation entropy is unfavorable for all sites. The truncation of the entropy expansion to single-body solute–water contributions (see above) implies that only unfavorable entropies relative to bulk are possible. It would be interesting to investigate how including higher-order water–water correlations in the entropy would modulate this picture, as is further discussed below.

Figure 1

GIST map of spatially resolved hydration thermodynamics in the binding site of Factor Xa for (A) the solute–water energy, (B) the water–water energy, (C) the total water energy, and (D) the total water entropy (−TΔStrans+rot) density distributions. The isosurfaces shown are contoured at −2 (dark red), −1 (light red), +1 (light blue), and +2 (dark blue) kcal/mol/Å3, relative to the bulk water values. The figure was taken from ref (27), where it was published under the Creative Commons Attribution License (CC BY 4.0). The observation that the formation of energetically favorable strong protein–water H-bonds, typical for polar or charged binding sites, reduces the mobility of the water and thus is accompanied by a compensating unfavorable entropy contribution is commonly found, especially at concave surfaces.[12,16,39,41] Vice versa, in apolar environments, water molecules can be energetically less favorable but have increased entropy due to their increased mobility/fluctuations. This fingerprint was also found in a systematic investigation of the thermodynamic signatures of binding of a single water molecule to model cavities of defined size and polarity by means of thermodynamic integration free energy calculations,[42] and therefore it appears to be a more general principle. In both scenarios discussed above, the enthalpy is typically found to be larger in magnitude than −TΔS, hence resulting in “happy” and “unhappy” water molecules (in terms of free energy) for the polar and apolar binding situations, respectively. In a recent “real-life” application, Ryde and co-workers used GIST to estimate the solvation entropy contributions to the thermodynamics of binding of three congeneric ligands to the protein galectin-3.[43] The ligands differ in the position of a fluorine atom in the fluorophenyl-triazole moiety, with the ortho ligand having a slightly lower binding affinity compared to the meta and para ligands. Isothermal titration calorimetry (ITC) experiments showed that this lower binding affinity of the ortho ligand can be attributed to a less favorable binding enthalpy, whereas the entropy change upon binding is actually less unfavorable for the ortho ligand compared to the other two stereoisomers. The GIST calculations revealed that these thermodynamic signatures associated with the overall binding process can be partially explained by differences in the solvation patterns of the binding site occupied by the ortho ligand compared to the meta and para complexes, which both have a more unfavorable −TΔΔSsolv compared to the ortho complex. Because of the choice of the highly similar congeneric ligands, there are no solvation differences for the unbound state, and thus the observations can be assigned exclusively to the bound complex. However, while the GIST calculations provide valuable microscopic insights that are qualitatively in agreement with—and partially explain—the experimental findings, the free energy contribution due to the observed difference in solvation entropy is 25–30 kJ/mol and thus probably too large in magnitude. Such an overestimation of the solvation entropy contribution is not generally observed, though. For example, in a related previous study of binding of a pair of diastereomeric ligands to the same protein (galectin-3), the GIST calculations of Ryde and co-workers yielded smaller values of −TΔΔSsolv ≈ 3 kJ/mol.[44] One limitation of GIST (and grid-based approaches in general) is that tight restraining potentials must be used to artificially fix the solute atoms at well-defined positions during the simulations. This can obviously be problematic for flexible systems because relevant conformational states might be overlooked. Without restraining potentials, the motions of the solute atoms would smear out the densities, which would hamper the localization of the thermodynamic properties and also lead to systematically overestimated entropies. This limitation can be overcome by first running an unrestrained simulation to obtain representative configurations, for example, through conformational clustering, and then perform restrained simulations separately for each relevant conformer. In this way, the sampling of the solute and solvent configurations are decoupled from each other (in the described sense). Such an approach might be suitable—and to some extent unavoidable, given the limitations set by the requirement for a 3D grid. However, fluctuations in the hydration layers are coupled to protein motions[45] and, thus, can be affected by the restraints.[46,47] A recent study of binding of a model ligand to a hydrophobic surface patch of ubiquitin showed that protein flexibility modulates the density fluctuations, and hence the compressibility, of the surrounding hydration layers in such a way that partial dewetting of the binding interface is facilitated and the friction associated with the binding process is reduced.[48] These results agree with a recent study, which showed that forces associated with joint protein–water motions lower the friction along the reaction coordinate for different biomolecular processes, such as Fe-CO bond rupture in myoglobin, unfolding of a small protein domain, and dissociation of an insulin dimer.[49] Furthermore, by studying fluctuations of the local water density near protein surfaces and the dewetting of surface patches, Rego et al. established a classification of surface hydrophobicity that is independent of the chemical nature of the constituent atoms.[50] These findings are in line with previous work on collective vibrational motions of two separated biomolecular surfaces, which are mediated by the water layers and lead to ultrafast kinetic energy exchange between the biomolecules.[51] Corresponding results were also found for the collective motions of lipids in a bilayer, which are mediated by hydration water in the headgroup region.[52] Although the entropy was not explicitly quantified in these works, they highlight that one might need to be cautious with the application of restraining potentials that could interfere with such collective dynamics, as they can be linked to solvent entropy. However, while correlations between dynamics and entropy have been observed empirically, which has led to the formulation of scaling laws,[53] a general theoretical framework that connects dynamics and entropy is not known. As discussed above, a drug design strategy is to displace thermodynamically unfavorable water molecules in binding pockets through ligand modifications. For GIST, this idea was first put into action by Nguyen et al. by defining solvent-displacement scoring functions,[12] based on previous work.[36] These scoring functions use local GIST solvation energies and/or entropies and include up to six empirical parameters, which are determined by fitting the scores against experimentally obtained differences in binding free energies between pairs of congeneric ligands, ΔΔGbindexpt. Using pairs of similar ligands is very important because, by construction, such GIST scores can be expected to correlate with binding free energy differences only if solvation-related contributions dominate the ΔΔGbindexpt values, which can not be assumed to be the case for dissimilar ligands (or also similar ligands with different binding modes). In the original work,[12] the GIST-based scoring functions were adjusted against relative binding free energy differences between 28 congeneric pairs of Factor Xa inhibitors. An interesting finding was that scoring functions based on both the energy and one-body entropy yielded good correlations with ΔΔGbindexpt and that scoring functions based on energy alone performed equally well. Interestingly, scoring functions based on the entropy alone did not result in a good correlation with ΔΔGbindexpt, which led the authors to conclude that the displacement of entropically unfavorable water molecules from binding sites might not be important for ligand affinity differences, at least not for Factor Xa with the approach used in that work.[12] Recently, Hüfner-Wulsdorf and Klebe revisited the idea of using GIST-based solvent displacement scoring functions for predicting ligand affinity differences.[54] They devised a set of different scoring functions with a varying number of fitting parameters and carefully calibrated, tested, and validated these scoring functions against ΔΔGbindexpt values for a highly congeneric series of 53 ligands binding to thrombin and 12 ligands binding to trypsin, two homologous serine proteases. The GIST calculations not only considered the (de)solvation of the pocket of the apo protein, as was done previously, but additional GIST scoring functions were devised that are based either on the protein–ligand complex or even only on the ligand. Interestingly, the scoring functions trained on protein–ligand complexes had low predictive power, whereas those trained on solvation of the apo protein pocket yielded good correlations. Strikingly, agreement with the experimental binding free energy differences could be obtained even with scoring functions based on the ligands alone, that is, purely based on ligand solvation without any protein information at all. This finding could suggest that the interactions with the water molecules on the surface of the unbound ligand, which partially need to be stripped off when the ligand enters the pocket, constitute a proxy for interactions gained upon binding to the protein. This might open interesting avenues toward using ligand solvation thermodynamics as a drug design principle.[55] Another important finding of Hüfner-Wulsdorf and Klebe was that GIST scoring functions trained for thrombin had low predictive power for trypsin. This issue could not be solved by a multiobjective optimization of GIST solvation energy- and entropy-based scoring functions separately against the corresponding ΔΔHbindexpt and ΔΔSbindexpt values from ITC experiments, as was tried in a follow-up work.[56] These results show that one must be very cautious when transferring a scoring function that was optimized for one system to another, even if the two proteins are highly similar, as is the case for thrombin and trypsin. In fact, the found system-specificity of the scoring functions highlights that refitting of the scoring function parameters is required for every new target. Taken together, these recent developments highlight the potential power of assigning thermodynamic quantities to water molecules to quantitatively connect water positions in protein structures with measured binding affinity differences. Future improvements could focus on efforts to improve the transferability of the GIST-based scoring functions between different proteins or binding pockets by using larger data sets for training, a venture that could also benefit from employing machine learning. First steps in this direction provided promising results for predicting protein–ligand interactions.[57] Furthermore, on the physics side, it would be interesting to see whether improved prediction accuracy can be achieved by improving the entropy term used in GIST by including higher-order correlations beyond two-body translation-translation, that is, translation-rotation and rotation-rotation correlations. Huggins showed that many-body correlations between pairs of water molecules in doubly occupied buried protein cavities only lead to small entropy corrections.[39] However, that study included tightly bound immobilized water molecules that have low entropy and thus only small fluctuations available for building up correlated motions. In contrast, a recent protein folding simulation study of Heinz and Grubmüller[41] showed that differences of the protein-induced many-body water correlations account for more than half of the solvent entropy difference between folded and unfolded states, as is discussed in more detail below.

Two-Phase Thermodynamics Model

The two-phase thermodynamics (2PT) model was originally developed by Lin et al. for extracting the thermodynamic properties of bulk liquids from MD simulations.[14,15] The interested reader is referred to the original work of Lin et al. and to the recent review article by Heyden[23] for a comprehensive overview of the methodological background. The central quantity of the 2PT method is the vibrational density of states (VDOS), which is obtained as the Fourier transform of the time autocorrelation function (TCF) of velocities. For rigid water molecules, translations and rotations are readily separable by considering the center-of-mass velocity and the angular velocity of rotations around the three rotational axes, respectively. The VDOS provides the distribution of translational and rotational degrees of freedom of each single water molecule in the system over the frequency domain, thus providing access to the corresponding translational and rotational entropies. The zero-frequency intensity of the VDOS is directly related to the diffusion coefficient. The 2PT method is based on an ad hoc partitioning of the VDOS into two contributions, “gas-like” and “solid-like”. For the diffusive (gas-like) contribution, fractions of the translational and rotational degrees of freedom are assigned to a hard sphere (HS) fluid model and to a rigid rotor (RR) model, respectively. The corresponding single-water entropies, SHStrans and SRRrot, respectively, can be expressed analytically with the Carnahan–Starling equation of state. In 2PT, these HS/RR contributions are subtracted from the total VDOS to yield the remaining fractional translational and rotational degrees of freedom per solvent molecule, which are treated as a set of harmonic oscillators (HO) and are thus referred to as solid-like. With the frequency-dependent quantum HO partition function, the HO entropies for these translational and rotational degrees of freedom can be obtained, yielding the total entropy of the solvent as S = SHStrans + SHOtrans + SRRrot + SHOrot. This entropy includes both harmonic and anharmonic contributions and also higher-order couplings between water molecules, as are encoded in the translational and rotational velocities. In summary, the 2PT method assumes that the VDOS of the liquid can be partitioned into two contributions, each of which can be described with an analytical model. The weights of the gas-like and the solid-like components in 2PT are reflected in the fluidicity parameter. While the two limiting cases are clearly physically justified, that might not be the case in between, where the ad hoc partitioning is employed. 2PT has been shown to accurately capture the entropy of water for a wide range of systems and under different conditions.[15,58−62] However, its range of applicability is not clear a priori and cannot be blindly assumed. An advantage of the 2PT method is that it naturally provides spectral resolution, and different contributions to the entropy can be obtained by integrating over different frequency domains of the VDOS. The spectrum of center-of-mass vibrations (translations) reports mainly on collective intermolecular H-bond bending and stretching modes of the water H-bond network, resulting in broad bands below 100 cm–1 and at ∼200 cm–1, respectively. The energy difference between vibrational levels in a quantum HO corresponds to the room-temperature thermal energy of ca. 200 cm–1, and thus excited vibrational levels can be populated for the above-mentioned collective water modes, which therefore contribute most to the entropy and heat capacity.[16,23] Higher-frequency vibrations are not excited and hence do not contribute much, such as librational modes, that is, hindered rotations of water molecules constrained within their local H-bond network, which give rise to a broad spectral band between 300 and 1000 cm–1. In a recent work, Päslack et al. studied the entropy of water molecules in the active site of human carbonic anhydrase II (hCAII) and investigated the entropy changes upon binding of an inhibitor, dorzolamide.[63] The apo form of hCAII harbors a structurally conserved network comprised of ca. 10 water molecules in its active site, which is stabilized by H-bonds of the water molecules with themselves, with the catalytic zinc ion, and with a number of amino acids that are crucial for catalysis (Figure ).[64,65] The water network is important for the function of the enzyme, as it is involved in the shuttling of protons during the enzymatic reaction. Unrestrained MD simulations were performed, and the three-dimensional density distributions of the water molecules were found to be in good agreement with the positions of the water molecules found in the high-resolution X-ray and neutron diffraction structures.[64] Interestingly, despite their well-defined average locations inside the protein matrix, the active-site water molecules were found to be in dynamic exchange with the bulk on a broad range of time scales, with water residence times at the individual sites between ∼30 ps and 13 ns (depending on the precise location in the active site); a single water molecule did not exchange at all on the 500 ns time scale of the MD simulations (W1 in Figure ). This observation, together with the experimental finding that binding of the dorzolamide inhibitor displaces three of the active-site water molecules from the active site to the bulk and leads to an overall stiffening of the conformational dynamics of the protein,[64,65] prompted Päslack et al. to study the entropy signatures of the individual water molecules and how they respond to inhibitor binding.[63] To that end, the VDOS was calculated for the active-site water molecules, which were dynamically selected during the MD trajectories based on their distance to the local maxima in the 3D water density. The spatial resolution of the approach is thus determined by the distance cutoff used, which was 1.4 Å. This is small enough to include at most one water molecule and not to overlap with the neighboring volumes, therefore enabling a unique assignment of single water molecules. In other words, the water selection was based on small (local) spherical probe volumes, which is an approximation to the true anisotropic water density distribution. However, this approximation holds quite well in the present case for the strongly immobilized waters. Furthermore, no restraining potentials on the protein atoms were necessary because no grid was used.

Figure 2

Apo (A) and inhibitor-bound (B) structures of hCAII. The investigated water molecules that constitute the active site network are indicated. The figure was adopted from ref (63).

Apo (A) and inhibitor-bound (B) structures of hCAII. The investigated water molecules that constitute the active site network are indicated. The figure was adopted from ref (63). The 2PT results shown in Figure reveal that the water molecules confined in the hCAII active site have lower entropy than bulk water, but to a different extent depending on their location in the protein pocket. For example, for apo hCAII, the entropy of the water molecule at site 1 is 38.8 J·mol–1 K–1 and thus 17 J·mol–1 K–1 lower than the bulk value, which is 55.8 J·mol–1 K–1 for the SPC/Eb water model used in the study of Päslack et al. This entropy reduction associated with tying up a water molecule at position W1 is in line with the very long residence time at this site and is substantial, but still smaller than the upper limit of ca. 30 J·mol–1 K–1 estimated by Dunitz based on crystalline inorganic hydrates.[66] For comparison, the entropy of the water molecule at site W7 is 48.1 J·mol–1 K–1 and thus only slightly lower than in bulk. Binding of the dorzolamide inhibitor displaces three water molecules from the active site into the bulk (W3, W6, and W7), and these waters gain entropy upon dorzolamide binding. Interestingly, the entropy of some of the remaining water molecules is further lowered (Figure ). This effect is most pronounced for site W4, where inhibitor binding lowers the water entropy by 5.4 J·mol–1 K–1, from 50.7 to 45.3 J·mol–1 K–1. In sum, the entropy gain due to the three water molecules released from the active site is larger than the entropy loss of the remaining ones, and thus the total entropy contribution related to hydration changes upon inhibitor binding is positive, ΔSsolv,bind = 15.6 J·mol–1 K–1, corresponding to a favorable free energy contribution of ca. −5 kJ mol–1 at 300 K. We note in passing that, for the strongly immmobilized water molecules in the hCAII active site, the diffusive contribution is very small, and very similar entropies are obtained by using a continuous set of harmonic oscillators, that is, without the gas-like component.[63]

Figure 3

2PT entropies of active-site water molecules in apo and inhibitor-bound hCAII. (A) Total water entropies. (B) Entropy difference between inhibitor-bound and apo hCAII. Water molecules at sites 3, 6, and 7 are released to the bulk upon binding. See Figure for the numbering of the water positions. The results discussed above highlight that the reorganization of the water in the binding pocket can be important and needs to be considered in the full picture. An advantage of the VDOS-based method is that additional insights into these processes can be gained from the spectral resolution provided. The spectral densities of W1 and W4 (Figure ) show that the H-bond network of the water molecules at these locations differs substantially from that of bulk water. In general, the suppressed diffusive component (zero-frequency response) and the blue-shifted peaks indicate a stiffer H-bond network, in line with the lower entropy. The spectral density of the low-entropy water at site W1 does not change substantially upon inhibitor binding (at least not in the spectral range below 200 cm–1 that is most relevant for the entropy). For W4, a further stiffening of the H-bond bending mode is observed upon inhibitor binding, as is reflected in a lowering of the intensity maximum and a blue-shift of the band below 100 cm–1.

Figure 4

Spectral densities of water molecules at sites W1 (upper) and W4 (lower) in the hCAII active site are shown for both the apo enzyme and the dorzolamide-bound form (red and blue curves, respectively). For comparison, the spectral density of bulk water is also shown in gray. The figure was adopted from ref (63). These examples highlight the role of water molecules for the binding of small molecules in buried protein pockets. In addition, water can of course also play a role in stabilizing specific protein conformations or at protein–protein interfaces. For example, recent work revealed that a structurally conserved water network inside the core of the insulin hexamer has strongly slowed-down dynamics and provides a robust scaffold for the hexamer assembly, whose structural integrity breaks down in the absence of the water network.[67] While this rigidity of the cavity water comes with an entropy cost, favorable interactions with protein side chains via long-lived H-bonds provide enthalpic stabilization.

Water around Proteins

The previous section focused on water molecules that are tightly bound inside a protein, which in many cases are even considered to be an integral part of the protein structure. In contrast, the protein hydration layer (PHL) that surrounds a protein is more loosely interacting with the surface and is characterized by a heterogeneous distribution of structural and dynamic properties.[4,68,69] The question up to what length scale the presence of a protein (or other biomolecular) surface perturbs the water around it has been a matter of debate in the literature. There is no abrupt physical boundary between the PHL and bulk water, and the answer to that question depends on the quantity (or observable) that is probed and on the technique used.[69,70] While some techniques, such as NMR, probe single-particle dynamics and are most sensitive to short-ranged interactions with the first hydration layer, other methods such as terahertz spectroscopy probe the collective response of water molecules in an extended H-bonded network and thus longer-ranged effects.[71] The modulations of the structure and dynamics of the PHL result from a combined effect of different local interactions between the protein surface and water molecules, and thus a more detailed picture would be desirable. In the following, we discuss two recent computational methods that provide a spatially resolved map of the thermodynamic properties of water around proteins, the 3D-2PT approach of Persson et al.[16] and the Per|Mut method of Heinz and Grubmüller.[25,26]

Three-Dimensional Two-Phase Thermodynamics (3D-2PT)

Persson et al.[16] extended the 2PT method[14,15] to map the quantities of water onto a 3D grid around the solute. The solvent molecules are dynamically assigned to the voxels based on their positions during an MD simulation. To obtain the VDOS, the motions of the water molecules are tracked for a few picoseconds, which is sufficiently short to remain local. The solvation enthalpy contributions of each voxel i located at position r are obtained similar to GIST (see above) as ΔHsolv(r) = USW(r) + (UWW(r) – UWW,bulk)/2. The solvation entropy (relative to bulk) is obtained as ΔSsolv(r) = S(r) – Sbulk. The integrated thermodynamic quantities are obtained by summing up all individual voxels weighted by the respective occupancies; for example, ΔSsolv = ∑nW(r) · ΔSsolv(r) for the solvation entropy. With the sampling routinely accessible with MD today, typical grid spacings of ca. 1 Å can be used in 3D-2PT, so that the voxels are much smaller than the volume of one water molecule. For illustration, Figure shows the energetic and entropic decomposition of the solvation contributions on a grid around N-methylacetamide. As discussed above for GIST, as a grid-based method also in 3D-2PT the solute atoms need to be tightly restrained during the simulations.

Figure 5

Spatially resolved contributions of the solvation enthalpy (left) and entropy (right) relative to bulk water around N-methylacetamide. The voxels shown represent the first solvation shell and have an average water number density that is higher than in bulk by at least 30%. The voxels in front of the solute are not shown for clarity. The figure was taken from ref (16). The 3D-2PT technique was used to study the properties of the hydration layers around a protein from the metalloproteinase family[72] and around the catalytic centers in designed and optimized Kemp eliminases.[73] In the former work, a classical force field with fixed point charges was used, whereas in the latter work a polarizable force field was employed. On the basis of the local properties of the water molecules around the two enzymes investigated, as obtained from the VDOS and the 3D-2PT entropies, the authors classified the waters into several distinct classes ranging from “bound” via “weakly-bound” to “unbound” and “bulk-like”. A key finding from the studies was that the water molecules that are close to the active sites of the enzymes, but also for other surface regions around the enzymes, have distinct properties, such as lower entropy. These properties might be further modulated by changes in the active-site pockets, for example, amino acid exchanges through directed evolution in the case of the Kemp eliminases. Another question addressed by the authors is which features of the protein surface determine the properties of the PHL. The results of the 3D-2PT analyses stress the heterogeneous character of the hydration shell, which results from a combination of variations of the surface curvature and the chemical nature of the surface residues. The hydrophobicity of the individual amino acids is by itself not a very good predictor of the hydration entropy of the water in its surrounding, a result that was also found by others.[41,60,74,75] Thus, instead of single residues, one should think in terms of somewhat more extended protein surface patches or regions with a particular polarity and topology (concave/convex surfaces). The interplay of solvation-related thermodynamic driving forces with other contributions was illustrated in a recent study of Fajardo and Heyden,[76] who used the 3D-2PT method to dissect the different energy contributions behind the conformational equilibrium between compact and extended states of a polyalanine model peptide. Microsecond all-atom MD simulations in explicit water showed that free energy contributions associated only with peptide degrees of freedom, such as the intramolecular potential energy and conformational entropy of the peptide, favor compact states, whereas the free energy of solvation favors extended states. Further decomposition of the free energy of solvation revealed that solvation enthalpy is more favorable for the extended states of the peptide, but solvation entropy favors compact states. These detailed analyses provide deep insights into the tug of war between the different thermodynamic contributions to the conformational equilibrium of the polypeptide. In the following section, an alternative methodological approach is discussed that was applied to the folding of a larger protein, Crambin.

Permutation Reduction and Mutual Information Expansion (Per|Mut)

Heinz and Grubmüller developed a method, Per|Mut, that yields spatially resolved water entropy contributions from translational and rotational motions as well as from their higher-order correlations on a per-molecule level.[25,26] In this method, an MD simulation of the full system is performed (including solute and solvent). The spread of the water molecules, which, in principle, explore a huge high-dimensional configuration space volume, is reduced by mapping them into a much smaller configurational subvolume via permutation reduction, which resolves the redundant counting of physically identical microstates.[24] The applied relabeling (permutation) of individual water molecules does not change the physics of the system but merely increases sampling drastically (by N!, where N is the number of water molecules). From the permutationally reduced MD trajectories, the water entropy is calculated using a mutual information (MI) expansion and a k-nearest-neighbor probability density estimator. The MI expansion is truncated at third order, , that is, in addition to the single-body (one-water) entropy S1 it includes the many-body correlations between two water molecules, I2(j, k), and between three water molecules, I3(l, m, n). For pairwise MI terms of the translational and rotational entropy, and for the translation-rotation correlation, pairs of water molecules within a maximal average distance of 10 Å were taken into account. For third-order terms, smaller cutoffs of 3.3–4.5 Å were used to reduce the computational effort, which is justified because the correlations are short-ranged. For the translation–rotation coupling, two-body correlations are considered, , where j and k̃ indicate the translational and rotational degrees of freedom of water molecules j and k, respectively. Fourth-order and higher terms are neglected in Per|Mut because their contributions to entropy differences ΔS are expected to be small, while the increasingly higher dimensionalities of the spaces that need to be sampled render their convergence notoriously hard. In fact, already computing the short-ranged pairwise and triplewise correlations is a considerable computational effort. Per|Mut can be considered a first-principles method, whose accuracy is systematically improvable by including higher-order terms in the MI expansion and using larger cutoffs for considering correlations between more distant molecules. The permutation reduction naturally localizes the densities already down to the volume of a single water molecule. To obtain an even finer spatially resolved map on a 3D grid, the entropy per water molecule is calculated by splitting the contributions from the pairwise and triplewise correlations equally between the involved water molecules. (While this splitting is of course sensible, it is also somewhat arbitrary, highlighting the above-discussed nonuniqueness of localization schemes.) The values of each voxel are then calculated as the average contribution of all water molecules that visited the voxel, weighted by the fraction of the time the voxel was visited by each water molecule in the system. For the visualization of the solvation enthalpies (energies), the established procedure that is also followed in GIST and 3D-2PT is adopted (see above). In a first application to a biomolecular system, Heinz and Grubmüller used Crambin as a case example for a small globular protein and studied the differential solvation of the folded state and an unfolded (molten-globule-like) state.[41]Figure visualizes the spatially resolved decomposition of the hydration free energy of Crambin in the native fold (A) and a molten-globule-like conformation (B), which served as a model for the unfolded state. From the interaction energy perspective, ΔU = ΔUSW + ΔUWW of water near the protein surface is more negative than in bulk (Figure , first row), with particularly favorable interactions with charged and polar protein residues. The attractive interactions due to protein–water H-bonds exceed the slightly unfavorable water–water interactions at these locations, which arise from the perturbed water network. Figure also shows that the attractive protein–water interaction energies are accompanied by unfavorable solvation entropies at the respective locations, which provides a microscopic view on enthalpy/entropy compensation. It is interesting to see that translational and rotational entropies, and also the translation–rotation correlation, are comparably large. Closer analyses revealed that the hydration entropy contributions in the native fold are mostly due to water molecules in the first hydration shell that are H-bonded to protein residues, whereas in the molten-globule-like conformation the entropy contributions are spatially more spread out and also the second hydration shell is affected. The authors speculate that this behavior is linked to the hydrophobic effect, as in the unfolded conformation a larger hydrophobic surface area is solvent-exposed, resulting in increased many-body water–water correlations.[41] Furthermore, the local topology of the protein surface was found to be a good predictor for ΔS and ΔU, with hydration entropy being more unfavorable and energy being more favorable at concave surfaces.

Figure 6

Spatially resolved map of the energy (ΔU = ΔUSW + ΔUWW) and entropy (−TΔS) contributions to the hydration free energy of Crambin in the native fold (A) and an unfolded conformation (B). All values (kJ/mol) are relative to bulk water. Shown are 2D slices through the center of the protein. The translational and rotational water entropies shown include single-body terms as well as higher-order correlations (up to the third order). The translation–rotation two-water correlation is shown in the bottom row. The figure was adopted from ref (41), where it was published under the Creative Commons Attribution License (CC BY 4.0). To obtain a complete free energy bill, Heinz and Grubmüller also analyzed the differences between native fold and molten-globule Crambin in the protein–protein interaction energy and protein conformational entropy, in addition to the solvent energies and entropies discussed above. The results (Figure ) demonstrate the tug of war between the different thermodynamic contributions,[41] which by themselves can be large but almost cancel each other, such that the total free energy of folding is relatively small (−53 kJ/mol in the present case). A striking observation is that the solvent entropy favors the native fold by as much as 498 kJ/mol. This maybe unexpectedly large amount underlines the importance of solvent entropy as a thermodynamic driving force for the folding of the protein. Even more surprisingly, at least in light of the previous findings of Huggins[39] and Nguyen et al.,[13] differences in the protein-induced many-body water correlations were found to account for more than half of the total solvent entropy difference between the folded and unfolded states, with the translation–rotation correlation being the most important contribution.

Figure 7

Free energy of folding of Crambin is decomposed into enthalpy (red) and entropy (blue) contributions. Indices P and S denote protein and solvent (water), respectively. All values are given in kJ/mol. Positive values favor the native fold and negative values favor the unfolded state. The figure was taken from ref (41), where it was published under the Creative Commons Attribution License (CC BY 4.0). As is also the case for other grid-based methods, such as GIST and 3D-2PT, also in Per|Mut the solute atoms need to be tightly restrained during the simulations. This is disadvantageous especially for the inherently flexible unfolded state. To asses the influence of protein flexibility on the results, Heinz and Grubmüller repeated the hydration thermodynamics analyses by using four different snapshots picked from unrestrained equilibration MD simulations. The results were found to be consistent, but it is fair to say that further investigations that cover a broader and more representative conformational ensemble would be desirable (but computationally expensive). Furthermore, it would be interesting to see future applications of Per|Mut to strongly immobilized water molecules and their role for ligand binding in buried protein cavities.

Conclusions and Future Challenges

This Perspective highlights selected recent methodological developments and some of their applications to obtain spatially resolved maps of hydration thermodynamics in biomolecular systems from computer simulations. A focus of these developments is on entropy, the computation of which is more challenging than that of enthalpy. The examples shown underline that the thermodynamic driving forces linked to hydration can be large and thus play an important role, for example, for protein folding and ligand binding. The microscopic insights obtained from the simulations are essential for understanding the processes investigated and might be helpful for targeted modifications, for example, through rational ligand or protein design. One of the future challenges is to firmly establish—at the atomic level—the link between hydration on the one hand and the actual biomolecular function on the other hand. In many cases, it is obvious that hydration changes quite drastically during the process, at least in some local region, but elucidating how exactly these solvation changes play a role for the particular biological or biophysical process under study is a formidable challenge. For example, for protein folding, clearly hydration is crucial, but gaining deep insights by quantifying the individual contributions to the thermodynamic driving forces in a spatially resolved manner is challenging. In general, processes in which proteins undergo large-scale conformational changes are typically accompanied by pronounced changes in solvent exposure of distinct parts of the protein structure, and hence hydration is expected to play a role. For example, a recent MD simulation study of an ABC transporter showed that the solvation of the membrane-embedded domains of the protein and of the transported substrate molecule changes strongly during the large-scale alternating access conformational transition of the transporter.[77] This transition mediates the translocation of the substrate across the membrane, which is the key biological function of transporters, and thus it would be highly interesting to study the distinct solvation patterns in the different parts of the protein and to quantify the thermodynamic driving forces that result from the solvation changes. Similar considerations also apply to ion transport. For example, molecular simulation studies of potassium channels underlined that the extent to which the potassium ions strip off water molecules upon passing through the selectivity filter of the channel crucially determines the conduction mechanism, a topic that is controversially debated.[78] Additional actively researched areas include the formation of oligomers and biomolecules under high-concentration conditions, where solvation conditions differ substantially from the dilute limit. For example, recent terahertz experiments of an intrinsically disordered protein domain were interpreted in terms of water entropy as an important determinant for liquid–liquid phase separation (LLPS).[79] Changes in solvation conditions modify the phase diagram of LLPS.[80] However, the solvation thermodynamics in such processes remains to be explicitly elucidated. Furthermore, high-concentration conditions are frequently found in biopharmaceutical formulations, for example, of monoclonal antibodies. The applicability and stability of such formulations strongly depend on the solution-state properties, such as solubility and viscosity,[81,82] rendering future investigations of the solvation thermodynamics important for formulation design. Solvation was also shown to be important for other self-organization processes, such as protein oligomerization and protofilament formation,[83,84] the mechanisms of which are still not fully understood. From the computational perspective, there are two ubiquitous limitations of molecular simulations that always need to be addressed. One concerns the question of whether the potential energy function used to describe the interatomic interactions is sufficiently accurate (the force field challenge), and the other is whether the amount of phase space explored within the given limited simulation time suffices (the sampling challenge). For biomolecular systems, the latter challenge can be especially relevant, for example, if water molecules in buried protein cavities exchange only slowly with bulk water, such that the time scales exceed the simulation times that can be reached in standard MD simulations (typically microseconds). In such cases, the slow convergence of the thermodynamic averages can greatly deteriorate the statistical precision that can be achieved. Even worse, if certain important configurations are not sampled at all during the simulation (and hence no information on them is available), even qualitatively wrong conclusions might be drawn. In such cases, grand canonical Monte Carlo or hybrid MC/MD samplers can be used to overcome sampling barriers, and there are several recent developments in that area.[85−87] Concerning the force-field challenge, a relevant issue is the level of accuracy that can be expected from models with fixed atomic partial charges that lack explicit electronic polarizability. For water, such nonpolarizable models are employed in the majority of the simulation studies. However, it is important to keep in mind that these water potentials were parametrized such that the average polarization of a water molecule by the surrounding waters in the bulk liquid is implicitly included in the model, through the magnitude of the fixed dipole moment of the molecule. These water force fields have thus been parametrized for bulk water, and, given the dependence of the water dipole on its environment, they cannot simply be assumed to be as accurate in different environments. Yu and Rick used thermodynamic integration to study the thermodynamics of binding of a water molecule inside model cavities with systematically tunable hydrophobicity, and they compared a nonpolarizable water model (TIP4P) with a polarizable one (TIP4P-FQ).[42] The results showed that the polarizable water has a lower (less unfavorable) ΔGbind to an entirely apolar (completely hydrophobic) model cavity compared to the nonpolarizable model, by ca. 6 kJ/mol. This free energy difference can be assigned almost entirely to the enthalpy difference, as the binding entropies are similar for the two water models. Interestingly, already for model cavities that contain only one single H-bonding capability, the thermodynamic signatures found with the two water models were similar (deviations below 2 kJ/mol), and introducing more than one H-bonding group further diminished the differences.[42] These results support the notion that, in realistic biomolecular environments, which typically harbor H-bonding functional groups, the classical water potentials can be expected to be reasonably accurate. Clearly, more research is necessary to further evaluate the performance of the force fields used and to further improve their accuracy. Adjustment against accurate quantum-chemical calculations and systematic comparison with experimental data will be essential in this endeavor.

75 in total

1. Compensating enthalpic and entropic changes hinder binding affinity optimization.

Authors: Virginie Lafont; Anthony A Armstrong; Hiroyasu Ohtaka; Yoshiaki Kiso; L Mario Amzel; Ernesto Freire
Journal: Chem Biol Drug Des Date: 2007-06 Impact factor: 2.817

2. Solvent Entropy Contributions to Catalytic Activity in Designed and Optimized Kemp Eliminases.

Authors: Saurabh Belsare; Viren Pattni; Matthias Heyden; Teresa Head-Gordon
Journal: J Phys Chem B Date: 2017-09-27 Impact factor: 2.991

3. WATCLUST: a tool for improving the design of drugs based on protein-water interactions.

Authors: Elias D López; Juan Pablo Arcon; Diego F Gauto; Ariel A Petruk; Carlos P Modenutti; Victoria G Dumas; Marcelo A Marti; Adrian G Turjanski
Journal: Bioinformatics Date: 2015-07-20 Impact factor: 6.937

Review 4. Water Determines the Structure and Dynamics of Proteins.

Authors: Marie-Claire Bellissent-Funel; Ali Hassanali; Martina Havenith; Richard Henchman; Peter Pohl; Fabio Sterpone; David van der Spoel; Yao Xu; Angel E Garcia
Journal: Chem Rev Date: 2016-05-17 Impact factor: 60.622

5. Protein conformational plasticity and complex ligand-binding kinetics explored by atomistic simulations and Markov models.

Authors: Nuria Plattner; Frank Noé
Journal: Nat Commun Date: 2015-07-02 Impact factor: 14.919