Reliable information on partition coefficients plays a key role in drug development, as solubility decisively affects bioavailability. In a physicochemical context, the partition coefficient of a solute between two different solvents can be described as a function of solvation free energies. Hence, substantial scientific efforts have been made toward accurate predictions of solvation free energies in various solvents. The grid inhomogeneous solvation theory (GIST) facilitates the calculation of solvation free energies. In this study, we introduce an extended version of the GIST algorithm, which enables the calculation for chloroform in addition to water. Furthermore, GIST allows localization of enthalpic and entropic contributions. We test our approach by calculating partition coefficients between water and chloroform for a set of eight small molecules. We report a Pearson correlation coefficient of 0.96 between experimentally determined and calculated partition coefficients. The capability to reliably predict partition coefficients between water and chloroform and the possibility to localize their contributions allow the optimization of a compound's partition coefficient. Therefore, we presume that this methodology will be of great benefit for the efficient development of pharmaceuticals.
Reliable information on partition coefficients plays a key role in drug development, as solubility decisively affects bioavailability. In a physicochemical context, the partition coefficient of a solute between two different solvents can be described as a function of solvation free energies. Hence, substantial scientific efforts have been made toward accurate predictions of solvation free energies in various solvents. The grid inhomogeneous solvation theory (GIST) facilitates the calculation of solvation free energies. In this study, we introduce an extended version of the GIST algorithm, which enables the calculation for chloroform in addition to water. Furthermore, GIST allows localization of enthalpic and entropic contributions. We test our approach by calculating partition coefficients between water and chloroform for a set of eight small molecules. We report a Pearson correlation coefficient of 0.96 between experimentally determined and calculated partition coefficients. The capability to reliably predict partition coefficients between water and chloroform and the possibility to localize their contributions allow the optimization of a compound's partition coefficient. Therefore, we presume that this methodology will be of great benefit for the efficient development of pharmaceuticals.
For a sufficiently high bioavailability, a druglike compound generally
needs to follow a set of four basic rules, also known as Lipinski’s
rule of five.[1,2] These simplified guidelines describe
the complex interplay of physicochemical properties, which are required
for a molecule to act as a drug, including solubility in environments
of different polarities. On the one hand, molecules aiming at intracellular
targets have to pass through the cell membrane, which is usually facilitated
through a certain degree of lipophilicity.[3,4] On
the other hand, it is decisive that the molecule features a sufficient
number of hydrophilic groups to ensure solubility in the polar intra-
and extracellular environments. Furthermore, it is well known that
rather lipophilic, that is, “fatty” drugs, such as tetrahydrocannabinol,
actually do show some extent of solubility in polar solvents.[5,6] However, these drugs can accumulate in fatty tissue, which often
is followed by an extended uncontrolled release that causes several
adverse effects.[7−10] Hence, a balanced equilibrium between solvent preferences, which
can be described by the partition coefficient, is of utmost importance
in drug design.[11]One of Lipinski’s rules states that the partition coefficient
between water and octanol of a bioavailable compound, log POW, may not exceed 5. This rule highlights the
importance of a reliable prediction of partition coefficients for
drug discovery and design, as this allows preselection of compounds
with sufficient bioavailability.[12,13] Therefore,
it is not surprising that in various countries, log POW must either be measured or calculated before a chemical
compound can be licensed for commercial use.[14] Yet, the experimental approaches to measure log P suffer from various shortcomings, and recently, steps have been
taken to remedy this situation.[15]Several approaches have been proposed to predict partition coefficients
for different solvents, both knowledge-based[16−26] and physics-based.[27−31] One of the standard methods to calculate the partition coefficient
is the cLog P method.[25,32] This method
splits the query molecule into a set of fragments for which the log P is known. The final value is computed by summing up the
log P values for the fragments multiplied by their
respective occurrences in the query molecule. Furthermore, steric,
hydrogen bonding and electronic interactions are considered for the
final cLog P value. Another approach to the cLog P’s fragment-based method is to use the atomic contributions
to calculate the log P. There are multiple approaches
that make use of this concept, for example, Alog P,[21,33,34] Xlog P,[26] and Mlog P.[23]Most of the physics-based methods rely on the calculation of the
differences in solvation free energies between solvents. One important
approach to calculate the solvation free energy is the three-dimensional
(3D) reference interaction site model (3DRISM).[35,36] This approach has recently been generalized for solvents other than
water. Octanol in particular was the focus of Huang et al.[28] who studied the calculation of partition coefficients
using the 3DRISM approach. The major advantage of 3DRISM over grid
inhomogeneous solvation theory (GIST) is its lower calculation cost.
The theoretical framework of 3DRISM inherently relies on an approximation
when calculating the entropy, called the closure relation. The fundamental
assumptions within the theory of inhomogeneous solvation theory (IST)-based
methods, on the other hand, do not rely on this estimate. Nevertheless,
because of practical limitations, GIST also introduces an approximation
in the entropy calculation, which is usually stopped after the two-body
term for the entropy. The entropy for IST-based approaches is directly
calculated from the phase space of water. Additionally, the GIST method
yields molecular distribution functions, whereas 3DRISM provides separate
atomic distribution functions, which are more difficult to interpret.Here, we propose a method based on GIST.[37−39] GIST calculates
thermodynamic properties of a solvent around a solute on a grid. Recently,
we introduced a new version of this algorithm, implemented on the
GPU (GIGIST).[40] One of the key features
of this algorithm is the tremendous speedup compared to the standard
CPU implementation present in the program cpptraj[37−39,41] of AmberTools.[42] In addition
to speeding up the calculations, we implemented a more generalized
version of GIST enabling the use of chloroform as a solvent.GIST calculations of localized thermodynamic quantities have already
facilitated a broad range of applications, including the analysis
of water structures in streptavidin,[43] calculation
of hydrophobicity,[40,44] integration of solvation thermodynamics
into structure–affinity relationships,[45] improvement of docking scoring functions,[46] and correlation of desolvation of aromatic moieties with binding
free energies.[47] In all these approaches,
the localized thermodynamic properties of water around a solute were
found to be a key aspect in understanding the studied phenomena. Here,
we aim toward extending this powerful tool to a different solvent,
chloroform. Chloroform is particularly interesting, as it has been
used as a simplistic model to approximate the polarity inside of proteins
as well as membranes.[31] It is assumed to
be a better suited model system than many other apolar solvents, as
its relative permittivity of 4.3 is close to that typically expected
for the interior of membranes.[48,49] Additionally, its structure
shows only minimal flexibility, allowing it to be modeled as a fixed
conformation. Therefore, chloroform can be treated similarly as water
molecules in GIST.Differences in solvation free energies in water and chloroform
have previously been studied by Wolf and Groenhof[50] using TI simulations. The focus of their study was to benchmark
the accuracy of varying force fields and water models in predicting
solvation free energies. Additionally, they combine the calculated
solvation free energies in water and chloroform to compare computationally
and experimentally determined partition coefficients between these
two solvents. Their results indicate that there is a much higher impact
of the given force field than of the different water models.Here, we use the model systems studied by Wolf and Groenhof to
test the reliability of GIST solvation free energies in estimating
the partition coefficients between water and chloroform. We present
a more generalized version of GIST that can calculate thermodynamic
properties on a grid for chloroform in addition to water. From these
values, we calculate solvation free energies, which are referenced
to the solute in vacuum. Furthermore, we calculate partition coefficients
between water and chloroform, which do not include the vacuum as a
reference state anymore. We compare our results to earlier calculations
of both, solvation free energies in water and chloroform, as well
as to the experimental partition coefficients between these solvents.
Furthermore, we calculate localized differences in the solvation free
energies and their contributions.
Methods
Compounds
We used the same eight compounds (Figure ) as Wolf and Groenhof[50] to benchmark our approach because these compounds
are common motifs in biomolecules. Therefore, we will refer to these
compounds by the single letter code of these biomolecules; that is,
we will call the nucleobases as A, G, C, T, and U and 3-methylindole will
be referred to as W, p-cresol as Y, and toluene as F in the following.
Figure 1
Eight compounds from Wolf and Groenhof[50] were used as a benchmark data set for our GIST-based approach. (A) 9-methyl-adenine, (C) 1-methyl-cytosine, (U) 1-methyl-uracil, (T) 1-methyl-thymine, (G) 9-methyl-guanine, (W) 3-methyl-indol, (Y) p-cresol, and (F) toluene.
The abbreviation in parenthesis corresponds to the respective nucleobase
or amino acid mimicked by the compound.
Eight compounds from Wolf and Groenhof[50] were used as a benchmark data set for our GIST-based approach. (A) 9-methyl-adenine, (C) 1-methyl-cytosine, (U) 1-methyl-uracil, (T) 1-methyl-thymine, (G) 9-methyl-guanine, (W) 3-methyl-indol, (Y) p-cresol, and (F) toluene.
The abbreviation in parenthesis corresponds to the respective nucleobase
or amino acid mimicked by the compound.We consider these eight compounds as an ideal set to evaluate the
reliability of our GIST calculations because for all but one (Y) of these compounds, reliable and coherent experimental
data are available on their partition coefficients between water and
chloroform. This is particularly beneficial as data on experimental
partition coefficients between water and chloroform are significantly
sparser than those between water and octanol, for example. Second,
all compounds in this selected set are very rigid and essentially
confined to the same conformation in water and chloroform. This eliminates
the need for conformational sampling prior to the GIST simulations,
which would be necessary for flexible solutes to obtain reasonable
ensemble averages in both solvents. Because the grid-based approach
requires restrained molecules to be analyzed, the GIST simulations
would need to be repeated for multiple adequate ensemble representatives.
Grid Inhomogeneous Solvation Theory
A short introduction
to GIST will be given in the following section. For a more complete
introduction into GIST,[37−39,51] and the underlying IST by Lazaridis,[52,53] the reader
is referred to the original publications.Generally, GIST can be used to calculate
the free energy of solvation localized on a grid. This can be achieved
by splitting the free energy (ΔA) into its
two contributions, enthalpy (ΔEtotal) and entropy (ΔSuvtotal), where u denotes the solute and
v denotes the solvent (eq ). The energetic contributions and the entropic contributions to
the free energy are calculated on the grid, indicated by the k. The energetic contributions
can be split into the solvent–solvent contributions (ΔEvv) and the solute–solvent contributions
(ΔEuv), eq . The force field is used to calculate the
different contributions to the energy. These values are then stored
at the appropriate grid positions. The grid positions are assigned
using a “central” atom, that is, oxygen in water and
the carbon atom in chloroform.The entropic contribution can be split into two parts, the translational
entropy (ΔSuvtrans) and the orientational entropy (ΔSuvorient) (eq ). Here, the
calculation is truncated after the two-body term of the entropy.Both two-body entropy terms are calculated using a nearest-neighbor
estimate.[39,54] The calculation of the nearest neighbor
for the translational entropy is straightforward, as this can be calculated
directly by using the Euclidean distance between the molecules. The
nearest neighbor for the orientational entropy is calculated using
quaternion distances (Δω),[55] which are defined in eq .where ⟨q1, q2⟩ denotes the inner product
between the two quaternions (q1, q2) and |.| denotes the absolute value. The quaternions
are constructed for each water molecule using the vectors from the
oxygen to the first hydrogen and from the oxygen to the second hydrogen.
This has the effect that the different hydrogen atoms can be distinguished
from each other, which has the disadvantage that more sampling is
required until convergence.Instead of splitting the entropy in the orientational and translational
part, they can be estimated together, directly approximating the integral
of sixth order. To achieve this, a distance (d) in
the orientational and translational space is defined, which is a simple norm (eq ). Using this distance, again the nearest-neighbor
entropy estimator is used.Δω again refers to the orientational
distance, calculated via eq , and deuclid refers to a simple
Euclidean distance in space.Our implementation of GIGIST uses cpptraj[37−39,41] of AmberTools[42] as a base,
benefitting from its various functionalities and making it straightforward
to use. Similar to water, we also defined a central atom for chloroform
in the GIST calculations. For chloroform, we used the carbon atom
as a center for the binning calculation. Furthermore, we used the
C–H bond and the bond between the carbon and the first chlorine
atom for the calculation of the quaternions. As in the water implementation
for the hydrogen atoms, the chlorine atoms are distinguishable.
System and Simulation Setup
The structures of the solute
molecules were generated with a molecular operating environment (MOE).[56] Hereafter, the parameters of the molecules were
derived with the antechamber module of AmberTools19[42] and the GAFF/GAFF2[57] force fields. Partial charges were derived with Gaussian 16[58] and the RESP[59] procedure
using HF/6-31G*.With LEaP, the molecules were placed in TIP3P,[60] TIP4P,[60] TIP5P,[61] OPC3,[62] OPC,[63] SPC/E,[64] and CHCl3[65] solvent boxes with a minimum
wall distance of 20 Å. Restrained simulations for GIST were performed
with the GPU implementation of the pmemd[66] module of AMBER18[42] on our in-house GPU
cluster. A Langevin thermostat[67] with a
collision frequency of 2 ps–1 was used to maintain
a constant temperature of 300 K, and a Berendsen barostat[68] with a pressure relaxation time of 2 ps was
used to maintain the atmospheric pressure. Long-range electrostatics
were treated with the particle-mesh Ewald method,[69] and a van der Waals cutoff of 8 Å was used. Because
a time step of 1 fs was necessary for the CHCl3 box, the
same time step was used for all simulations for consistency. All nonsolvent
atom positions were restrained with a harmonic potential with a weight
of 1000 kcal/mol·Å2. All simulations were run
for 100 ns, collecting frames every 10 ps, resulting in 10,000 frames
per simulation.
Calculation of Solvation Free Energies
For the calculation
of the solvation free energy, GIST analyses were performed on the
simulations’ trajectories. For the GIST calculations, the reference
density and the solvent–solvent reference energy of the AMBER
manual[42] were used, where they were available.
For the systems for which no reference values were reported in the
manual, a simple solvent box with a box size of 40 × 40 ×
40 Å3 was simulated. From this calculation, both the
solvent–solvent reference energy and the reference density
were calculated (see Supporting Information Table S1).For the solute simulations, the solvation free
energy was obtained with an in-house python script. The reference
energy was subtracted from the solvent–solvent interaction
energy. The energy density was recalculated from the normalized, referenced
energy and the population on the given voxel. Then, the free energy
was calculated as shown in eq . For the entropy contribution, the approximation of the integral
of sixth order was used. The free energy density in each voxel was
integrated around the molecule within 9 Å for the water models
and 11 Å for chloroform to yield the solvation free energy. As
a center for the cutoff, the heavy atoms of the compounds were used.The cut-off distance was chosen based on an radial distribution
function (RDF) analysis of solute simulations (see Supporting Information Figure S1). For both solvents, that
is, water and chloroform, the cut-off distances were chosen to include
the first two solvation shells of the molecules.
Calculation of log P and ΔG Difference
The difference in solvation free energy between
water and chloroform can be understood in terms of the thermodynamic
cycle. The solvation free energy corresponds to the transfer of the
solute from vacuum to the solvent (ΔGwater and ΔGchloroform). The difference
in these solvation free energies closes the thermodynamic cycle[70] and is related to the partition coefficient
between the two solvents via eq .[50]where cwater is
the concentration of the solute in water and cchloroform is the concentration of the compound in chloroform.
ΔGwater and ΔGchloroform are the different solvation free energies,
respectively, and RT is the ideal gas constant multiplied
with the temperature.The difference in solvation free energy
is readily calculated, as mentioned above. The experimental difference
in solvation free energy was calculated by reorganizing eq to yield eq and using the experimentally determined value
for the water–chloroform partition coefficient.
Grid Visualization
The values for the solvation free
energy, the referenced energy, and the entropy were calculated on
grids. To visualize the origins of the different contributions to
the solvation free energy, a factor of 0.5 in the solvent–solvent
interaction energy was retained to avoid double counting.[38] We calculated the differences in the two grids
by simply subtracting the chloroform grid from the water grid. The
resulting grid was visualized with cutoffs of ±0.05 kcal/(mol·A3). For free energy and reference energy, the negative cutoff
(blue surface) shows highly favorable sites of water, and the positive
cutoff (green surface) shows highly favorable sites for chloroform.
For the entropic contributions, we visualized −TΔS: blue regions are entropically favorable
sites for water and green regions entropically favor chloroform. All
molecules and grids were visualized using PyMOL.[71]
Results
Solvation Free Energies
The calculations of the free
energies of solvation in water and chloroform were performed as described
in the Methods section. In Table , the results for the two solvents
are summarized and compared to the experimental solvation free energies
and to the TI-based calculations by Wolf and Groenhof.[50] In general, our values are approximately 2 kcal/mol
lower than the experimental values. Furthermore, the results calculated
with the GAFF2 parameters are consistently lower than the results
obtained by Wolf and Groenhof using the GAFF parameters. For better
comparability, we calculated TIP3P and CHCl3 solvation
free energies using GAFF. Interestingly, even when using these parameters,
GIST results are lower, by about 2 kcal/mol compared to TI.
Table 1
Solvation Free Energies Calculated
Using GAFF2 and GAFF Parameters of AMBER in Combination with Various
Water Modelsa
Solvent
A
G
C
T
U
W
Y
F
GAFF2
TIP3P
–12.2
–18.4
–16.6
–10.2
–11.3
–3.4
–1.8
–0.2
TIP4P
–8.8
–16.6
–15.4
–9.4
–10.7
–1.9
–0.9
0.9
TIP5P
–7.5
–13.8
–11.9
–6.8
–8.4
–3.5
–1.7
–0.3
OPC
–5.8
–11.9
–11.0
–5.7
–6.8
0.8
2.9
3.4
OPC3
–5.2
–11.7
–9.5
–4.4
–5.3
2.7
3.9
4.9
SPCE
–9.5
–17.3
–15.2
–9.1
–10.9
–2.0
–0.9
0.0
CHCl3
–10.3
–14.9
–11.2
–9.8
–9.4
–7.2
–4.2
–3.1
GAFF
TIP3P
–13.6
–20.8
–18.1
–13.1
–13.9
–5.0
–3.7
–2.0
CHCl3
–11.3
–15.7
–11.6
–10.2
–10.1
–6.2
–4.3
–2.3
WOLF[50]
TIP3P
–15.5
–24.5
–21.0
–13.6
–14.1
–5.3
–4.2
–0.5
CHCl3
–12.8
–16.5
–13.2
–12.1
–11.4
–8.8
–6.6
–5.0
EXP
water[72,73]
–13.6
–(9–13)b
–5.9
–6.1
–0.8
CHCl3[50]
–12.5c
–9.0c
–3.9c
The values calculated by Wolf and
Groenhof[50] are provided for comparison.
The experimental values are reported below. All energies are given
in kcal/mol.
The value reported in ref (71) is an approximated range.
These values are calculated from
the experimentally determined solvation free energy in water and the
partition coefficient between water and chloroform.
The values calculated by Wolf and
Groenhof[50] are provided for comparison.
The experimental values are reported below. All energies are given
in kcal/mol.The value reported in ref (71) is an approximated range.These values are calculated from
the experimentally determined solvation free energy in water and the
partition coefficient between water and chloroform.The other water models seem to be less accurate than the TIP3P
calculations. Comparison with experimental data shows that there is
a root-mean-square error (rmse) of about 2.5 kcal/mol between the
experimental values and the calculated values for TIP3P. For the other
water models, the deviation is even larger and can be as high as 8.3
kcal/mol for OPC3. However, despite the offset in absolute values
for OPC3, the Pearson correlation for this model is similar to the
correlations found for TIP3P and TIP4P (all 0.99). While we find the
worst correlation between the experimental partition coefficients
and solvation free energies estimated from chloroform alone, the respective
Pearson correlation coefficient of 0.91 is still very high (Supporting Information Figure S2).The values calculated for chloroform also differ in our approach
compared to the one by Wolf and Groenhof. The values we calculated
with the GAFF2 parameters and our GIST approach are more than 2 kcal/mol
higher, that is, more positive, than the values calculated with TI.Figure shows the
solvation free energy for the different water models (ΔGcalc). It is clearly visible that all the water
models perform very similarly, when analyzed individually. All of
them show a very high correlation with the difference in solvation
free energies between water and chloroform (ΔΔGexp). It should be noted that we compare the
hydration free energy to a difference in solvation free energies in Figure , and therefore,
we do not expect the absolute numbers to match but rather focus on
the correlation between these values. Intriguingly, the SPC/E water
model has the highest correlation of 0.995, albeit only by a very
small margin. The lowest correlation coefficient can be found for
TIP5P, with 0.978, which is still very close to the highest 0.995
of SPC/E.
Figure 2
Solvation free energies of the compounds in the different water
models, compared with the experimentally determined difference in
solvation free energy between water and chloroform. (A) TIP3P, (B)
TIP4P, (C) TIP5P, (D) OPC3, (E) OPC, and (F) SPC/E. The absolute Pearson
correlation coefficient (R) is given for each solvent
within the graph.
Solvation free energies of the compounds in the different water
models, compared with the experimentally determined difference in
solvation free energy between water and chloroform. (A) TIP3P, (B)
TIP4P, (C) TIP5P, (D) OPC3, (E) OPC, and (F) SPC/E. The absolute Pearson
correlation coefficient (R) is given for each solvent
within the graph.
Partition Coefficients
From the difference of solvation
free energies between each individual water model and chloroform,
we then calculated the partition coefficient between the two solvents
using eq . The results
of the calculation for the water models are shown in Table . It is worth noting that we
always use the same chloroform model. The water models show differences
in terms of correlation and rmse with the experimental values. Most
intriguingly, the TIP3P water model in combination with GAFF2 parameters
does not only show a striking correlation with the experimental values
but indeed almost matches them numerically. We find a Pearson correlation
coefficient of 0.96 and an rmse in log P of 0.6 (rmse
in ΔΔG of 0.89 kcal/mol). The correlation
and rmse are worse for the TIP4P water model, which shows a Pearson
correlation coefficient of 0.94 and an rmse of 1.3. Of the TIP water
models, TIP5P clearly performs worst in this study, as it only shows
a correlation coefficient of 0.81 with the experimental results and
a significantly higher rmse of 2.5. While both OPC-type water models
show a moderate Pearson correlation (0.89 for OPC and 0.95 for OPC3),
the rmse values (3.8 for OPC and 4.6 for OPC3) are notably worse,
compared to the results we find for the TIP family.
Table 2
Partition Coefficients between Water
and Chloroform for All Eight Compounds Used in This Study with Experimental
Values Taken from Wolf and Groenhof[50]a
Model
A
G
C
T
U
W
Y
F
rmse
GAFF2
TIP3P
1.4
2.6
4.0
0.4
1.5
–2.7
–1.7
–2.1
0.6
TIP4P
–1.0
1.3
3.1
–0.2
1.0
–3.8
–2.4
–2.9
1.3
TIP5P
–2.0
–0.8
0.6
–2.1
–0.7
–2.6
–1.8
–2.0
2.5
OPC
–3.2
–2.2
–0.1
–2.9
–1.8
–5.8
–5.1
–4.7
3.8
OPC3
–3.7
–2.3
–1.2
–3.9
–3.0
–7.1
–5.9
–5.8
4.6
SPC/E
–0.5
1.8
2.9
–0.5
1.2
–3.7
–2.4
–2.2
1.1
GAFF
TIP3P
1.7
3.8
4.7
2.2
2.8
–0.8
–0.3
–0.2
1.5
WOLF[50]
TIP3P
2.0
5.9
5.7
1.1
2.0
–2.6
–1.8
–3.4
1.6
EXP[74,75]
0.8
3.5
3.0
0.5
1.2
–2.2
–2.3
0
The last column lists the rmse between
the partition coefficient of chloroform and the studied water model.
Because there is no experimental value for p-cresol
(Tyr), we ignored it for the calculation of the rmse.
The last column lists the rmse between
the partition coefficient of chloroform and the studied water model.
Because there is no experimental value for p-cresol
(Tyr), we ignored it for the calculation of the rmse.Compared to the previous study by Wolf and Groenhof (rmse of 1.6),
in our approach, TIP3P, TIP4P, and SPC/E clearly perform better at
reproducing the absolute ΔG values, with rmses
of 0.6, 1.3, and 1.1, respectively. We also compared TIP3P with GAFF
parameters to the calculations of Wolf and Groenhof. Although we find
similar rmse and correlation values for our calculations (rmse 1.5
compared to 1.6; correlation coefficient 0.97 compared to 0.99), our
approach does not reproduce their values exactly.In Figure , the
computed partition coefficients between water and chloroform are plotted
against the experimentally determined ones. Therefore, the ideal values
would be arranged along the diagonal; this is most true for TIP3P,
which shows a slope of the linear fit of 1.02 and an axis intercept
at 0.02. For the TIP4P and SPC/E water models, the slope is also very
close to 1 (TIP4P: 0.99; SPC/E: 0.96), but the axis intercept is significantly
farther away (TIP4P: −1.06; SPC/E: −0.82). Interestingly,
in the linear fit for the OPC-like water models, the slope is still
within an acceptable range (OPC3: 0.85; OPC: 0.74). However, the axis
intercept differs stronger from zero than for the water models mentioned
above (OPC3: −4.43; OPC: −3.48). Finally, the worst
fit is shown by TIP5P, which has a slope of 0.40 and an axis intercept
of −1.67. Interestingly, the linear fit on the TI data of Wolf
and Groenhof is much steeper than the calculations in this study with
a slope of 1.6 and an axis intercept at 0.51 (Supporting Information Figure S4).
Figure 3
Difference of the solvation free energy of chloroform for the different
water models: (A) TIP3P (cyan), (B) TIP4P (red), (C) TIP5P (green),
(D) OPC3 (orange), (E) OPC (magenta), and (F) SPC/E (gray). The identity
line between experimental and calculated values is depicted as a black
line in all subplots.
Difference of the solvation free energy of chloroform for the different
water models: (A) TIP3P (cyan), (B) TIP4P (red), (C) TIP5P (green),
(D) OPC3 (orange), (E) OPC (magenta), and (F) SPC/E (gray). The identity
line between experimental and calculated values is depicted as a black
line in all subplots.
Localized Thermodynamic Properties
An insightful feature
of GIST is the ability to localize the thermodynamic properties and
show where chloroform has more favorable interactions than water and
vice versa. These localized data can be visualized as a grid around
the molecules (free energy in Figure ). Here, atoms that would generally be considered hydrophilic
are shown to prefer water as a solvent, whereas hydrophobic atoms
prefer the chloroform phase more strongly.
Figure 4
Difference in solvation free energy between water and chloroform.
Blue regions are favorable water interaction sites (<−0.05
kcal/mol/Å3) and green regions are favorable chloroform
interaction sites (>0.05 kcal/mol/Å3). (F) toluene, (Y) p-cresol, (A) 9-methyl-adenine, and (T) 1-methyl-thymine.
Difference in solvation free energy between water and chloroform.
Blue regions are favorable water interaction sites (<−0.05
kcal/mol/Å3) and green regions are favorable chloroform
interaction sites (>0.05 kcal/mol/Å3). (F) toluene, (Y) p-cresol, (A) 9-methyl-adenine, and (T) 1-methyl-thymine.For F, chloroform is preferred at all positions around
the molecule. Especially at the pi cloud above the ring, chloroform
is much more favorable than water. Y shows a very similar
behavior. The pi cloud is again strongly favored by chloroform. However,
a strong interaction with water can be found at the hydrogen bond
position of the hydroxyl group. Note that the molecule is constrained
to a single conformation for the GIST calculation; thus, the strong
interaction with the water molecules is only present at one side of
the oxygen atom. Interestingly, this interaction is so strong that
it almost outweighs the two interactions with the pi cloud, which
heavily favor chloroform. However, despite its strength, the hydrogen
bond is limited to a singular interaction site, while the interaction
with the pi cloud is possible above and below the ring. Together with
the remaining (rather apolar) parts of the molecule, these comparably
weaker interactions in sum lead to favorization of chloroform over
water.The other two compounds in Figure , that is, A and T, favor
water over chloroform. Here again, the solvation free energy at the
pi cloud positions is less favorable than for chloroform. Interestingly,
the regions in the ring plane show a very high preference for water.
For these two molecules, this is mostly due to the multiple hydrophilic
groups and possible hydrogen bonding sites. For A, the
most favorable interactions occur around the nitrogen atoms, and for T, the strongest interaction is observed in the space surrounding
the nitrogen atom and encapsulated by the two oxygen atoms. Additionally,
the two oxygen atoms also show strongly favorable interactions with
water.In addition, GIST allows the visualization of contributions from
enthalpy (Figure top)
and entropy (Figure bottom) to the free energy. Interestingly, all compounds in this
study exhibit an energetic preference for water but show a strong
entropic penalty (Supporting Information Tables S2, S3), which is much less pronounced in chloroform. Indicative
for this are the large blue areas around each molecule in the top
of Figure . In contrast,
chloroform shows fewer regions that are entropically unfavorable,
which is indicated by the larger green areas in the bottom of Figure . The entropically
favored regions and the enthalpically favored regions are almost always
complementary to each other, highlighting the effect of enthalpy–entropy
compensation.
Figure 5
Difference in the enthalpic (top) and entropic (bottom) contributions
to the solvation free energy between water and chloroform, visualized
as a grid. Note that for the entropic contributions, −TΔS is visualized. Blue regions are
favorable water interaction sites (−0.05 kcal/mol/Å3), and green regions are favorable chloroform interaction
sites (0.05 kcal/mol/Å3). (F) toluene,
(Y) p-cresol, (A) 9-methyl-adenine,
and (T) 1-methyl-thymine.
Difference in the enthalpic (top) and entropic (bottom) contributions
to the solvation free energy between water and chloroform, visualized
as a grid. Note that for the entropic contributions, −TΔS is visualized. Blue regions are
favorable water interaction sites (−0.05 kcal/mol/Å3), and green regions are favorable chloroform interaction
sites (0.05 kcal/mol/Å3). (F) toluene,
(Y) p-cresol, (A) 9-methyl-adenine,
and (T) 1-methyl-thymine.For F, both solvents show enthalpically favored regions
on top and bottom of the ring. For chloroform, this region is farther
away than for the water molecules, which is in concordance with the
size of chloroform and the stronger electrostatic interactions with
water. However, from the free energy plot, we see that water is never
significantly more favorable than chloroform, indicating that either
its entropy is lower or the enthalpic contributions are higher.For Y, a similar picture can be observed, as there
are again two favorable areas around the pi interaction sites. Again,
these locations are farther away for chloroform and closer for water.
However, in the free energy plot (Figure ), these interactions are more favorable
in chloroform. Also, at the hydrogen bonding site, strong enthalpic
interactions are visible in water (top Figure ), which come at the price of a significantly
higher entropic loss (bottom Figure ). These entropic penalties are not as significant
as the enthalpic interactions in these spots, resulting in a more
favorable free energy in water at the hydrogen bonding sites.In A, we again see the larger distance for chloroform
than for water. For water, the enthalpic contribution is more favorably
close to the molecule, whereas chloroform has its interaction hot
spots a little farther away from A. In general, the enthalpic
interactions are pronounced very strongly close to the plane of the
ring, where the interactions with the nitrogen atoms and the possible
hydrogen bonds are visible. The most significant contribution in chloroform
seems to be above the ring, where again interactions with the pi cloud
are visible. In general, the surface for the favorable water interactions
is much clearer and more pronounced, when compared to the more rugged
surface in chloroform. The stronger enthalpic interactions persist
even into the free energy, making A more hydrophilic
than the previous molecules.Last, T shows three very strong enthalpic interaction
positions. Two are located close to the oxygen atom and one is close
to the exposed nitrogen atom. Like A, the favorable contributions
in chloroform are farther away and the surface is again more uneven,
indicating the higher enthalpic preference for water. Despite the
unfavorable entropic interactions, resulting from the increased ordering
of the water molecules around the hydrophilic atoms, the free energy
stays favorable for water in almost all regions. Again, the position
just above the ring remains the only exception.
Discussion
A reliable prediction of log P values between
water and chloroform is of high interest especially for biological
systems, where it finds application in the estimation of membrane
permeability. In a multitude of experimental and computational studies,
the relation of membrane permeability and a compound’s hydrophobicity
has been thoroughly investigated.[4,76] While several
of these studies, particularly computational investigations, already
consider partition coefficients between water and chloroform, the
typical experimental approach to determine a compound’s hydrophobicity
is its log POW, the partition coefficient
between water and octanol. We have previously shown that a solute’s
hydrophobicity relates to GIST solvation free energies.[40,44] These water-based approaches to GIST however approximate the hydrophobicity
of a compound as the difference in solvation free energies and vacuum.
Here, we extend this description as the difference between solvation
free energies in water and chloroform.In a first step, we calculated the hydration free energy using
different water models and the GAFF2 parameters. For all the investigated
water models, that is, TIP3P, TIP4P, TIP5P, OPC, OPC3, and SPC/E,
we find an extremely high correlation with the experimental partition
coefficient, in concordance with our previous studies.[40,44] The outstanding correlation with the experimental partition coefficient
between water and chloroform shows that for similar sized compounds,
the hydration free energy already correlates remarkably well with
the partition coefficient. Furthermore, a lowest correlation of 0.978
is observed for TIP5P. However, this correlation is still remarkably
high. These findings are reassuring as they indicate that our previously
postulated relation of hydrophobicity and GIST solvation free energies
in proteins can also be extended to small molecule systems. We want
to emphasize that as a proof of principle, we chose rather small molecules,
which do not differ much in size and functionalities, for this study,
as these compounds are ideally suited for the GIST calculations. However,
we surmise that molecules with a similar chloroform/water partition
coefficient, which differ strongly in size, would yield very different
hydration free energies. This should simply occur because a larger
molecule has more interactions with the water around it. However,
the same reasoning would also apply to the solvation free energy in
chloroform. Therefore, we hypothesize that the contributions of the
apolar solvent are necessary to predict the partition coefficients
of molecules of varying size and polarity.When compared to the few available experimental hydration free
energies, we can observe that TIP3P closely reproduces the experimental
data, with an rmse of about 2.5 kcal/mol. The other water models perform
worse in this regard, with rmses ranging up to almost 9 kcal/mol.
This is interesting, as most other water models are considered better
than TIP3P in many aspects.[77] Additionally,
the correlation coefficient of the other water models, with TIP5P
being the lone exception, is higher than that of TIP3P. We surmise
that one possible reason for this is that the GAFF parameters were
optimized in the context of the TIP3P water model. The more advanced
water models are parameterized to model pure water properties correctly,
which does not necessarily increase the reliability of hydration free
energy calculations.The trend of solvation free energy in chloroform is rather similar
to the trend found by Wolf and Groenhof[50] when considering the relative differences. Interestingly, all absolute
values are about 2 kcal/mol off. This could be due to an incompletely
converged nearest-neighbor estimate of the entropy, whose influence
is enhanced by the additional voxels we used due to the large cutoff.
However, we deemed it necessary to include the second solvation shell
in this study, as it is very pronounced in chloroform. Another source
for the difference in the solvation free energies between GIST and
TI calculations could be the approximation when calculating the entropy
term. In the current GIST implementations, only the two-body term
of the entropy is considered. We also performed TI calculations on
the TIP3P water model and chloroform, which are gathered in the Supporting Information. Our results show an almost
constant offset in the solvation free energies. However, smaller fluctuations
can still be observed, even when shifting the values by this offset.In general, we observe the poorest correlation between solvation
free energy and partition coefficient in chloroform. However, as this
correlation is still above 0.9, we surmise that also for chloroform,
the employed methodology works very well. A possible explanation for
the decreased correlation of chloroform could be that the applied
force field parameters describing the small molecules were derived
with the aim to represent the molecule in water. The used RESP charges
and GAFF/GAFF2 parameter combination are developed to account for
self-polarization in water. However, self-polarization in chloroform
is significantly smaller than in water, and therefore, the parameters
are overestimating the interaction with chloroform. To get similar
accuracy for chloroform, it might be necessary to refit parameters
to account for the change in polarity.[78,79] Additionally,
we tested various water models but used only one chloroform model.
We want to note that other chloroform models might improve the results
even further.The calculation of the partition coefficients between water and
chloroform not only preserved the high correlation (Figure ) but also reproduced the experimental
values almost numerically. This is prominent for the TIP3P water model,
where an rmse of only 0.6 and a correlation of 0.96 were found. Furthermore,
the calculated values align almost perfectly with the experimentally
determined ones, which is visible in Figure from the parameters of the linear fit. The
slope of this linear fit is almost exactly 1, and the axis intercept
is approximately 0. Other water models have similarly low rmse values,
while also maintaining a high correlation, that is, TIP4P (rmse: 1.3, R: 0.94) and SPC/E (rmse: 1.1, R: 0.95).
Additionally, the linear fit in Figure still shows a slope of almost 1 for these water models,
but they show a parallel displacement from the ideal diagonal. It
is intriguing that the highest correlation, lowest rmse, and best
linear fit are found for the TIP3P model, as TIP3P is nowadays generally
considered to be a subpar water model.[77] All other models overestimate the likelihood to find the compound
in chloroform. As described above, this might be caused by an overpolarization
of the solute in chloroform with GAFF2/RESP parameters. Hence, this
systematic overpolarization in the chloroform phase is likely the
reason for the parallel displacement of the differences in solvation
free energies, which is visible from Figure . Another possible explanation to the effect
could be favorable error compensation between the different solvent
models, that is, chloroform and TIP3P. In addition, GIGIST and GIST
treat the electrostatic interactions via the nearest image convention,
which could lead to larger offsets in better water models.Interestingly, the TI calculations did not reproduce the experimental
reference values of the partition coefficient as closely as the GIST
calculations. Because the TI calculations produce the exact solvation
free energy for a given force field, the reason for this discrepancy
might be due to a favorable error compensation between the approximation
of GIST, using only the two body terms for the entropy, and the force
field. This could hint toward an inaccurate description of the higher-order
entropy terms of the solvent phase in the used force fields.GIST allows for the localization and analysis of thermodynamic
properties around the molecule of interest, which has been shown to
be a valuable tool in analyzing water properties around biomolecules.[38,40,43,51] Here, we analyze differences in the local thermodynamic properties
of the different compounds solvated in water and chloroform. One of
the most interesting features found in this study is the preference
of the pi cloud, above and below an aromatic moiety, for chloroform
and not water. Water is preferred over chloroform on atoms, which
are traditionally considered as hydrophilic, such as nitrogen and
oxygen.Furthermore, our method allows for the analysis of entropic and
enthalpic contributions to the free energy. In all investigated compounds,
we were able to clearly see that an enthalpically favorable interaction
site is accompanied by a sizeable entropic penalty. This highlights
the concept of enthalpy–entropy compensation, which states
that a very favorable enthalpic interaction is accompanied by an unfavorable
entropic penalty.[80] To achieve a favorable
free energy, the enthalpic interaction must overcome the entropic
cost. In our examples, this is the case for the hydrophilic atoms,
such as nitrogen or oxygen. However, interactions that are also accompanied
by a large entropic cost and that are not favorable are often more
preferred in chloroform. One example is the pi cloud of the aromatic
rings, where a weak interaction is always visible between water and
the ring. The restricted space for the water molecules leads to a
high entropic penalty, which is much smaller in chloroform, and therefore,
chloroform is favored above the pi cloud. This is again visible for F, which is the most hydrophobic molecule in this set. It
shows no significant favorable interactions in water, and interestingly,
the area of the pi cloud is heavily favored in chloroform. This is
in line with a previous work from our group,[47] which showed that the solvation of an aromatic moiety counteracts
its ability to exhibit pi-stacking interactions, which is much more
pronounced in heterocycles with multiple hydrophilic atoms.Interestingly, in terms of enthalpy, all compounds would favor
water over chloroform. This is somewhat surprising, as on the one
hand, water has stronger interactions with the compound and therefore
a higher energetic contribution of the solute–solvent interaction
energy. On the other hand, a water molecule also has high interactions
in the bulk, in the form of hydrogen bonds, to other water molecules
in its vicinity. However, entropically, all compounds favor chloroform
over water. This can easily be explained by entropy–enthalpy
compensation. While water is bound more strongly to hydrophilic groups,
an entropic penalty for this strong interaction occurs because of
the introduced order. This leads to generally more favorable entropic
contributions for chloroform, compared to water. However, even for
the solvation in pure chloroform, an entropic penalty occurs, which
is just weaker than the one in water.The calculation speedup, which is provided by the GPU-accelerated
version of GIST, facilitates highly efficient analysis of trajectories,
even for large molecular systems. We already showed the speedup of
these calculations for water boxes of increasing size.[40] The bottleneck for the calculations of this
study is the simulation preceding GIST analysis, which—depending
on the size of the solute and the solvent box—can take multiple
hours to complete. The analysis itself can be performed within minutes
for a single molecule. Hence, while GIST is probably not the fastest
approach to estimate partition coefficients of small molecules, it
represents a reliable method which can also be applied to large biomolecular
systems.
Conclusions
We showed that our new GIGIST implementation can effectively compute
thermodynamic properties of water and chloroform as input solvents.
Additionally, we evaluate the reliability of this new feature in GIGIST
by calculating the differences in solvation free energies and thus
partition coefficients between water and chloroform.In summary, we were able to calculate the partition coefficients
between chloroform and water with a surprising accuracy, even more
so as the TI calculations do not reproduce these values closely. We
surmise that a favorable error compensation between the two-body approximation
and the force field plays a vital role. Interestingly, the TIP3P water
model performs best in this analysis. This might be due to the design
of force fields, which are usually developed in the context of the
TIP3P water model.Furthermore, the presented algorithm can localize the major contributors
to the difference in solvation free energies. We use this advantage
to highlight positions around the molecules where chloroform is largely
favored over water, which is in all cases at the position of the pi
clouds. Additionally, we found that the favorable positions for water
are mostly close to hydrophilic groups. Not surprisingly, we found
that enthalpically favored positions around the molecule come at an
entropic cost. In general, water shows stronger enthalpic interactions
with the solutes but also a higher entropic penalty.The GIGIST implementation of GIST allows for localized and global
calculations of solvation free energies in water and chloroform. We
surmise that the presented algorithm and method provide a robust framework
for the development of advanced tools to predict membrane permeabilities
or biomolecular hydrophobicities.
Authors: Andrew T Bockus; Katrina W Lexa; Cameron R Pye; Amit S Kalgutkar; Jarret W Gardner; Kathryn C R Hund; William M Hewitt; Joshua A Schwochert; Emerson Glassey; David A Price; Alan M Mathiowetz; Spiros Liras; Matthew P Jacobson; R Scott Lokey Journal: J Med Chem Date: 2015-05-20 Impact factor: 7.446
Authors: Franz Waibl; Johannes Kraml; Monica L Fernández-Quintero; Johannes R Loeffler; Klaus R Liedl Journal: J Comput Aided Mol Des Date: 2022-01-15 Impact factor: 4.179
Authors: Anna S Kamenik; Johannes Kraml; Florian Hofer; Franz Waibl; Patrick K Quoika; Ursula Kahler; Michael Schauperl; Klaus R Liedl Journal: J Chem Inf Model Date: 2020-06-29 Impact factor: 6.162