We present a simple optimization strategy for incorporating experimental dielectric response information on neat liquids in classical molecular models of alcohol. Using this strategy, we determine simple and transferable hydroxyl modulation rules that, when applied to an existing molecular parameter set, result in a newly dielectric corrected (DC) parameter set. We applied these rules to the general Amber force field (GAFF) to form an initial set of GAFF-DC parameters, and we found this to lead to significant improvement in the calculated dielectric constant and hydration free energy values for a wide variety of small molecule alcohol models. Tests of the GAFF-DC parameters in the SAMPL4 blind prediction event for hydration show these changes improve agreement with experiment. Surprisingly, these simple modifications also outperform detailed quantum mechanical electric field calculations using a self-consistent reaction field environment coupling term. This work provides a potential benchmark for future developments in methods for representing condensed-phase environments in electronic structure calculations.
We present a simple optimization strategy for incorporating experimental dielectric response information on neat liquids in classical molecular models of alcohol. Using this strategy, we determine simple and transferable hydroxyl modulation rules that, when applied to an existing molecular parameter set, result in a newly dielectric corrected (DC) parameter set. We applied these rules to the general Amber force field (GAFF) to form an initial set of GAFF-DC parameters, and we found this to lead to significant improvement in the calculated dielectric constant and hydration free energy values for a wide variety of small molecule alcohol models. Tests of the GAFF-DC parameters in the SAMPL4 blind prediction event for hydration show these changes improve agreement with experiment. Surprisingly, these simple modifications also outperform detailed quantum mechanical electric field calculations using a self-consistent reaction field environment coupling term. This work provides a potential benchmark for future developments in methods for representing condensed-phase environments in electronic structure calculations.
Computer simulations
exist as a way to connect theories of the
microscopic world to experimental results. They help validate our
understanding of microscopic phenomena and propose new directions
for investigation. Such simulations employ molecular models, and the
development of molecular models and their component parameters has
a long history.[1−7] One of the primary motivations in their continued development is
the increased ability of a molecular model to quantitatively reproduce
and/or predict experimental properties of interest.Our depiction
of molecules in simulations has been grounded by
several experimental observables. These include densities, enthalpies,
liquid structure, molecular geometry, phase coexistence boundaries,
and more.[8−12] In the case of water, anomalous properties such as the temperature
of maximum density provide key motivation.[13−15] Whether the
simulation is classical fixed-charge, polarizable, or even quantum
mechanical (QM), models used often are unable to reproduce all the
observables of interest and, thus, must strike a balance among fitting
the selected properties, avoiding too many poorly defined parameters,
and keeping the computational cost low. Generalized models, such as
OPLS-AA,[3] GAFF,[6] and CGenFF,[7] typically focus on neat
liquid density and heats of vaporization as experimental targets for
optimization of Lennard–Jones (LJ) parameters coupled with
partial charges assigned using fits to HF/6-31G* electrostatic potentials.
While models made from these parameter sets can be used to study neat
liquids, one of the main intents is for the representation of molecules
in other environments, such as a ligand interacting with a protein
or a solute transferring between different chemical phases. When using
such a model outside of a neat liquid, it is understandable that one
may observe a systematic error in a new property of interest. In the
case of molecular transfer, researchers have explored a variety of
approaches to reduce such systematic errors, and these generally amount
to fitting the models to get the answer desired,[5] sometimes coupled to hydrated environment QM calculations,[16] and other times through statistical means by
introducing additional water-mixing parameters to the LJ potential
terms.[17] For researchers interested in
simultaneous reproduction of experimental observables in multiple
disparate environments, which is sometimes considered impossible for
classical fixed-charge models,[5,18] more complex polarizable
models and corrections have shown promise.[19−22]We are interested in both
the neat liquid and air-to-water transfer
properties of molecules with hydroxyl moieties. Recent studies on
the hydration of alcohols using common force field parameters have
shown systematic errors in ΔGhyd[17,23−28] and other thermodynamic quantities, in particular the static dielectric
constant, ε(0).[29] Are systematic
errors in ε(0) and ΔGhyd coupled
to one another in some way? It has been suggested that functional
groups in typical force fields are underpolarized to some degree,[16,18,21,23,25,27−30] though this should not be too surprising given that partial charges
are typically fit to reproduce a gas-phase HF/6-31G* electrostatic
potential, despite this charge model’s fortuitous propensity
for overpolarizing the gas-phase electric field.[2,6,31,32] For solution-phase
simulations, it would be preferable to have an accurate degree of
polarization for the condensed-phase fluid encoded within the atomic
partial charges of the molecule while maintaining the typical neat
liquid density and enthalpy of vaporization (ΔHvap). Recent studies have begun using the experimentally
measured ε(0) of a neat liquid as a fitting target for systematically
scaling the degree of polarization of classical fixed-charge models.[33,34] This provides an orthogonal approach to the practice of setting
partial charges with a QM electrostatic potential and optimizing LJ
parameters to fit experimental observables. As it relies on experimental
quantities to govern the magnitude of the partial charges, it can
potentially provide an independent assessment of the quality of a
given QM calculation used to derive partial charges. Here, we extend
such work with the goal of addressing the following questions: (1)
is it possible to simultaneously capture the density, ΔHvap, and ε(0) for simple alcohol solvents;
(2) how transferable are the resulting hydroxyl parameters to other
more complex alcohols; and (3) if an alcohol model reproduces the
neat liquid ε(0), how does that affect the hydration of that
alcohol as a solute?
Methods
General Simulation Protocol
In this work, both extended
molecular dynamics simulations and alchemical free energy calculations
were used to calculate neat liquid thermodynamic properties and air-to-water
transfer free energies for a series of 41 small molecules containing
hydroxyl moieties for which both experimental quantities were available.
Simulations were performed using GROMACS 4.5.5[35−37] for the neat
liquid molecular dynamics and a development version of GROMACS 4.6
for alchemical free energy calculations on these alcohol systems.
Hydration free energies were computed using Thermodynamic Integration
(TI), following a combination of previous protocols.[24,25,38] We note deviations from previous
protocols below.Initial conformations for starting molecules
were taken from structures provided in previous studies,[25] excepting the case of methanol subject to the
neat liquid dielectric optimization, which was constructed and minimized
using the structure editor in the UCSF Chimera visualization system.[39] The GAFF small molecule parameters[6] and AM1-BCC partial charges[40,41] were assigned to each structure using the Antechamber package (Amber
11 version).[42] After setup, the resulting
structures and topologies were converted to the GROMACS format using
the ACPYPE python script.[43] For neat liquid
calculations, cubic simulation boxes of 256 like small molecules were
constructed with GROMACS utilities and subject to sets of 150 ps of
constant-pressure and constant-temperature equilibration until the
target state-point was achieved. Again, methanol in the neat liquid
optimization was an exception in needing a large cubic simulation
box containing 450 molecules in order to accommodate box size fluctuations
along the density optimization pathways. For hydration free energy
calculations, single copies of the small molecules were solvated in
rhombic-dodecahedral TIP3P[10] water boxes
with at least 1.2 nm of space between any solute atom and a box edge.For all neat liquid calculations, a set of five 50 ns production
simulations were performed for each molecule with a unique topology
in order to obtain reported ε(0) values and associated standard
errors. These calculations were performed at 1 atm and the target
temperature from the experimental dielectric constant determination,
usually 298.15 K unless otherwise noted. Specific temperatures are
listed in the associated Supporting Information. A time step of 2 fs was used with the leapfrog integration algorithm
for the equations of motion, and hydrogen atom bond lengths were constrained
using the LINCS algorithm.[44] Isotropic
pressure and temperature coupling were accomplished using the Parrinello–Rahman
barostat[45] and Nose–Hoover thermostat[46,47] with time constants of 10 and 1 ps, respectively. Smooth particle-mesh
Ewald[48] with a real-space cutoff of 1.2
Å, spline order of 4, and energy tolerance of 10–5 was used for long-ranged electrostatics corrections. LJ interactions
were switched off smoothly between 8 and 10 Å, and long-range
energy and pressure corrections were applied to these interactions
as implemented in GROMACS.[49]Hydration
free energy calculations were performed over a series
of separate λ windows as described in previous work,[24] with the primary difference being that these
simulations were performed at constant pressure as in the neat liquid
calculations, but with the Langevin thermostat as in previous work.
Performing the calculations with the Parrinello–Rahman barostat
was found to result in numerically equivalent results to previous
protocols, while eliminating the need for an extensive constant pressure
equilibration and average volume rescaling at the beginning of each
λ window. For our set of 41 alcohols, TI was used to determine
both the electrostatic and nonpolar components of solvation rather
than the Bennett acceptance ratio (BAR) for technical reasons in the
development version of GROMACS. For the SAMPL4 calculations, our standard
BAR protocol was restored with the use of GROMACS 4.6.2 beta 2. The
electrostatic component was calculated from the free energy of turning
on the solute partial charges in water, less the free energy of the
same charging process in vacuum. The nonpolar component was calculated
from the free energy for decoupling the uncharged solute LJ interactions
from the surrounding water as in previous work.[24] This overall protocol was validated against previous calculations
on these 41 small hydroxyl containing solutes[25] and found to be reliable and equivalently accurate.
Optimization
Process for Hydroxyl Groups
Starting with
a general model for a solute one would use in a typical alchemical
free energy calculation, an iterative optimization process was developed
to alter the molecular topology in order to improve agreement with
dielectric properties of the neat liquid. This process is a variation
of previous work[33] with a focus placed
specifically on small molecules containing hydroxyl groups. The simplest
of such solutes is methanol, and Figure 1 shows
an example optimization flow process for methanol.
Figure 1
Illustration of the iterative
optimization process for a methanol
model using experimentally measured data. First, the size (LJ σ)
is scaled to match experimental density for the neat liquid. This
is followed by charge-scaling, to match the experimental dielectric
constant, and well-depth (LJ ε) scaling, to match the experimental
ΔHvap. The resulting model is tested
in a final set of simulations to see how well it matches these three
properties, and it is either accepted or returned to the optimization
pipeline for further refinement.
Illustration of the iterative
optimization process for a methanol
model using experimentally measured data. First, the size (LJ σ)
is scaled to match experimental density for the neat liquid. This
is followed by charge-scaling, to match the experimental dielectric
constant, and well-depth (LJ ε) scaling, to match the experimental
ΔHvap. The resulting model is tested
in a final set of simulations to see how well it matches these three
properties, and it is either accepted or returned to the optimization
pipeline for further refinement.The goal of this process is to retain or improve the density
and
ΔHvap agreement that general molecular
models are developed upon, while simultaneously improving the agreement
with experimental ε(0) values. For simplicity, a three-stage
procedure was employed with a linear interpolation in each stage used
to target optimal values for these properties. After employing an
initial 1 ns simulation to determine the starting model’s density,
the hydroxyloxygen σ value was scaled up or down by 5% if the
density was either greater than or less than the target density window
(791.4 kg/m3 to ±1.0 kg/m3) respectively.
The density of a 1 ns simulation of the resulting model was calculated,
and a linear interpolation was performed between these densities to
determine a σ value that should correspond to the model having
the target density. This process was repeated until the model consistently
reported the experimental density. This led to the second stage process
of optimizing the ε(0) value (33 to ±4), calculated using
the standard dipole fluctuation formulas,[50] using similar linear interpolations over 15 ns simulations. In order
to maintain charge neutrality, all the partial charges were scaled
by ±5%, effectively scaling the dipole moment of the methanol
model in order to raise or lower the ε(0) value. This is illustrated
in Figure 1 as a saturation of atom and bond
color on the methanol molecule.The third stage involved linear
interpolation of ΔHvap values to
estimate an optimal LJ ε
well depth for the hydroxyloxygen atom. The ΔHvap was calculated from the difference in per-molecule
total energies from 10 ns single methanol gas phase and 500 ps liquid
phase simulations,Epol is the energetic
term corresponding to the work required to polarize the methanol molecule
from the gas phase dipole moment to the liquid phase dipole moment,[14]Here, μgas and αgas were 1.7 D
and 3.29 Å3 respectively for
gaseous methanol.[51] To determine μliq from a flexible, yet fixed-charge methanol model, we averaged
individual μliq values of each molecule from the
last frame of the liquid simulation. Note that, as the neat liquid
optimization system contains 450 alcohol molecules, the Eliq for the model is able to converge to ±0.1 kcal/mol
over this relatively short optimization stage. As in the other stages,
this process was iterated until the model reported ΔHvap within the desired tolerance (8.95 kcal/mol
to ±0.59 kcal/mol). It should be noted that we only use ΔHvap in this optimization of methanol and do
not calculate it for other solutes later in this study. Experimental
ΔHvap, μgas, and
αgas properties are not available for all subsequent
molecules investigated, making such comparisons problematic.After this series of parameter adjustments, a final set of simulations
was performed to check if all of these neat liquid properties were
within the specified error tolerances for each of their state points.
If any of the three properties degraded due to correlations between
the parameters and properties not directly associated in the optimization
procedure, the series of stages was iterated until the set of properties
converged. It should be noted that the experimental values need not
be at the same state point. In the case of methanol, the ΔHvap value used was measured at 298.15 K, while
the density and ε(0) were measured at 293.15 K.[51] The adjusted topology will simply describe a classical
molecular model that captures these experimental quantities at these
differing state points. This topology will also not be the only set
of parameters to capture the target property values, just the first
set found that satisfies the criteria.
Application of Optimized
Hydroxyl Parameters to New Model Molecules
While the ε(0)
optimization procedure could adjust a classical
molecular model to reproduce experimental properties for an entire
set of molecules, it is not particularly efficient. In our tests developing
an optimization approach, we found that this simple linear approach
usually only required three to four ε(0) calculations to converge.
As converging on ε(0) is the slow step of the process, we actually
found the simple line-search strategy[33] to be more efficient than grid-based approaches that required more
ε(0) calculations to map out parameter space. One could possibly
employ an even more efficient search strategy such as that used in
the ForceBalance tool,[52] but optimization
would still require a computationally intensive search for each molecule
of interest or a combined optimization over a set of small molecules.
Rather than pursue such a specialized route, we chose to test if the
knowledge gained from the hydroxyl optimization for methanol was transferrable
to other small molecules containing hydroxyl moieties.Illustration of the application
of optimized methanol polarization
to a (A) primary alcohol and (B) phenolic alcohol. The partial charges
on all labeled atoms are scaled by the scaling factor from the methanol
optimization with the exception of the NC atoms. In the primary alcohol,
the single NC atom acts as a neutralization site. In the phenolic
alcohol, the charge difference due to scaling of the hydroxyl group
is split over the two NC atoms.To apply the methanol hydroxyl optimization in a general
manner,
we assume that the hydrogen bonding between hydroxyls in neat methanol
is representative of hydroxylhydrogen bonding in other systems. Additionally,
we assume that intramolecular perturbations of the electric field
about a given hydroxyl group that would alter this behavior are treated
by the semiempirical or quantum mechanical approach used in the setting
of the partial charges for the model. Given these assumptions, we
adopt new LJ σ and ε values, taken from the methanol optimization,
for each hydroxyloxygen atom. For the partial charges, we scale all
hydroxyl group atoms by the amount observed for methanol, as shown
in Figure 2. Any non-hydrogen atoms bonded
to the α carbon are then treated as neutralization sites (the
NC site in Figure 2A) to compensate for any
charge difference introduced by this charge scaling. If there are
multiple non-hydrogen atoms bonded to the α carbon (Figure 2B), the charge difference between the original and
polarized set of hydroxyl atoms is distributed evenly among these
bonded atoms.
Figure 2
Illustration of the application
of optimized methanol polarization
to a (A) primary alcohol and (B) phenolic alcohol. The partial charges
on all labeled atoms are scaled by the scaling factor from the methanol
optimization with the exception of the NC atoms. In the primary alcohol,
the single NC atom acts as a neutralization site. In the phenolic
alcohol, the charge difference due to scaling of the hydroxyl group
is split over the two NC atoms.
Results and Discussion
Dielectric corrected methanol
provides general hydroxyl optimization
rules
We applied the described optimization protocol to alter
the force field parameters for the hydroxyloxygen in a methanol molecule
using the GAFF small molecule parameters and AM1-BCC partial charges.
The results of this procedure are shown in Tables 1 and 2.
Table 1
Calculated
Properties for the Original
and Altered Methanol Models Alongside Experimental Values
ρa (kg/m3)
ΔHvapb (kcal/mol)
ε(0)a
starting CH3OH
791.6
10.95
20.7
final CH3OH
792.4
8.96
34.6
experiment
791.4
8.95
33.0
At 293.15 K.
At 298.15 K.
Table 2
LJ Parameters and Charge Scaling Values
for the Hydroxyl Oxygen in the Initial and Altered Methanol Models
σ (Å)
ε (kcal/mol)
q scaling
starting CH3OH
3.066 47
0.210 40
1.000 00
final CH3OH
3.219 90
0.202 07
1.209 05
At 293.15 K.At 298.15 K.Table 1 shows how the properties for the
methanol models successfully balance within the predefined experimental
constraints. The density is sitting just within our predefined tolerance
(1 kg/m3) of the experimental value while the other properties
sit well within the kBT window for the ΔHvap and Δε(0)
= 4 window. The biggest change is seen for the ε(0) value in
moving a quantity that has a 40% discrepancy down to a less than 5%
difference. To accomplish this, the dipole moment of the methanol
model needed an ∼20% increase in magnitude (see Table 2). These modifications are similar to recently proposed
modifications derived from matching Kirkwood–Buff integrals,
though not quite of the same magnitude,[53] and also qualitatively similar to recent changes in the GROMOS force
field for alcohols.[54] Our changes indicate
that the condensed-phase polarization of methanol is underaccounted
for in the AM1-BCC charge model and the HF/6-31G* basis-set electrostatic
potential upon which it is fit,[41] even
given this basis set’s 10% to 15% overpolarization relative
to the gas phase expectation that is considered fortuitous for condensed-phase
simulations.[2,31,41] The experimental liquid-phase dipole moment for methanol is currently
unknown, so while it is expected to be greater in magnitude than the
gas-phase dipole moment, it is unclear if the optimized liquid state
electric field is a truer representation of that in real methanol
than the field used in other models.We cannot expect other
properties, such as the density, to remain
static in the face of such a large polarization of the liquid-state
methanol molecule. Thus, the hydroxyloxygen σ expands to counter
the increase in density that occurs with such a polarization. This
increase in size simultaneously decreases the strength of alcohol
model pair interactions and increases the molecular volume. While
this change in oxygen atom size acts to counter the increased polarization,
this does not mean that the structure of the liquid remains static.
Figure 3 shows the gOO(r) for methanol before and after the optimization
process, alongside a similar curve from neutron-diffraction studies.[55,56] We see an approximately 0.05 Å shift of peak maxima to longer
distances with the new parameters, in poorer agreement with experimentally
supported structural data. We also see that the peaks and troughs
are amplified, indicating an increase in the structuring of the liquid,
in better agreement with the experimentally supported structural data.
These small positive and negative changes, coupled with the fact that
the liquid structure cannot be unambiguously determined from experiments
without using model fits, make it difficult for us to state if the
liquid structure is better or worse relative to experiment. We can
state that while the liquid density remains the same, the new parameters
act to increase the structure of the liquid. If further changes in
the structure are desired, one could consider including target data
along with condensed-phase geometry optimizations in the fitting process.[57,58]
Figure 3
Oxygen–oxygen
radial distribution functions for the starting
GAFF/AM1-BCC methanol model (orange dashes), the
final dielectric corrected (DC) model (blue), and
neutron-diffraction studies (black).[55,56] The topology adjustments in the final DC model make the liquid more
structured, but they do little to the radial location of the existing
methanol structure of the starting model.
Oxygen–oxygen
radial distribution functions for the starting
GAFF/AM1-BCC methanol model (orange dashes), the
final dielectric corrected (DC) model (blue), and
neutron-diffraction studies (black).[55,56] The topology adjustments in the final DC model make the liquid more
structured, but they do little to the radial location of the existing
methanol structure of the starting model.In addition to liquid structure, experimental diffusion constant
data are available for methanol at standard conditions, 2.28 ×
10–5 cm2/s.[59] We computed the neat liquid diffusion constant for this methanol
model before and after the optimization process. The original model
has a diffusion constant of (3.8 ± 0.2) × 10–5 cm2/s, which is considerably faster than experiment.
This fast dynamic behavior is quite similar to that seen with the
OPLS-AA force field.[60] After optimization,
the diffusion constant drops to (1.8 ± 0.2) × 10–5 cm2/s, slower than, but considerably closer to, the experimental
value. This change is similar to that seen for the evolution of SPC
to SPC/E water, where the additional polarization in SPC/E lowered
the diffusion constant closer to the experimental value.[61,62] This improvement comes despite dynamical performance not being considered
in the optimization process. As suggested for the model liquid structure,
using the diffusion constant, possibly in place of ΔHvap, could be an alternate optimization strategy
to further the agreement in dynamical properties of general alcohols.The scaling values listed here were used as general rules in the
development of a simple tool for converting such hydroxyl groups in
GROMACS topologies into similar dielectric corrected (DC) forms. The
source code for perl and python versions of this utility is provided
in the Supporting Information for interested
readers.
New hydroxyls improve dielectric properties of other alcohols
The optimization improves the dielectric properties of methanol
directly but is difficult to apply prospectively to new molecules.
If improvement of the ε(0) is desired for models of other molecules
with hydroxyl groups, it would be advantageous to not need to undergo
such a process for each new molecule. In many cases, such an effort
would be impossible due to the lack of relevant experimental data.
This is a key concern for researchers wanting to use new molecular
models as a step in a computational chemical process for pharmaceutical
development, or in any other discovery setting. We therefore decided
to test how transferrable these adjustments are to other more complex
small molecules. This set, listed in the Supporting
Information, consists of methanol and 40 additional molecules
for which the experimental ε(0) and ΔGhyd are known.Figure 4A and 4B show the before and after
application of DC hydroxyl modifications results for the calculation
of the neat liquid ε(0) values at the experimentally measured
temperatures. One striking feature of the original general alcohol
models is the tight correlation but shallow slope for the calculated
ε(0) versus experimental measurements. Regression analysis shows
the model ε(0) to typically be about two-thirds of the experimental
value. Applying the DCmethanol modifications to all hydroxyl groups
corrects this systematic underprediction, while not degrading the
neat liquid densities.
Figure 4
Scatterplots of the calculated ε(0) values versus
experimental
quantities for a set of 41 small molecules using (A) the GAFF small
molecule parameters and AM1-BCC partial charges and (B) those same
models after applying the methanol dielectric correction to all hydroxyl
groups. For the GAFF/AM1-BCC models, the data are well-correlated
(R2 = 0.91) to an underpredicting
slope of 0.65(3). The DC models show a similar correlation (R2 = 0.89) but have a slope of 1.07(6), indicating
improved agreement with experiment.
Scatterplots of the calculated ε(0) values versus
experimental
quantities for a set of 41 small molecules using (A) the GAFF small
molecule parameters and AM1-BCC partial charges and (B) those same
models after applying the methanol dielectric correction to all hydroxyl
groups. For the GAFF/AM1-BCC models, the data are well-correlated
(R2 = 0.91) to an underpredicting
slope of 0.65(3). The DC models show a similar correlation (R2 = 0.89) but have a slope of 1.07(6), indicating
improved agreement with experiment.The general trend for the change in ε(0) values when
applying
the DCmethanol modifications was an approximately 50% increase, closing
the gap between the calculations and experimental measurements. There
were a few exceptions, the most glaring were cyclohexanol and cycloheptanol.
In these neat liquids, the ε(0) changed in the opposite manner,
an average 50% decrease. It should be noted that the densities for
the starting GAFF/AM1-BCC model liquids were already 10–20
kg/m3 less than experiment, and these values do not get
“corrected” by the hydroxyl modifications. This indicates
a potential issue may reside in the cycloalkane angle and/or torsion
parameters, altering the liquid-state dipole orientation distributions
in a manner inconsistent with the trends seen for other small molecules.
Regardless, such occurrences were the exception rather than the norm,
as only 4 of the 41 molecules showed a statistically significant decay
of ε(0) away from the experimental value, with the noted cycloalcohol
pair being the worst offenders. The majority of these alcohols showed
dramatic enhancement, indicating that the DCmethanol modifications
are surprisingly transferable for the purpose of improving model ε(0)
values.
ΔGhyd trends couple with ε(0)
One of the primary motivations behind improving the dielectric
properties of model solutes is the potential for improvement in force
field depictions of such solutes in other applications such as hydration
and binding free energy calculations. Solvation (and desolvation)
of ligand molecules is a critical aspect of ligand binding processes,
and accurate predictions of ΔGhyd are necessary for molecular modeling to be a useful contributor
to pharmaceutical design and development. Transfer processes are governed
by the dielectric permittivity of the mediums in implicit molecular
models, and we expect an experimentally relevant depiction of the
solute and environment interactions will be necessary for explicit
models to be quantitative. The dielectric correction process does
not use ΔGhyd information for optimization,
but it does target the quality of the model by polarizing hydroxyl
groups to a level relevant to a condensed-phase hydrogen-bonding environment.
Here, we test the DC hydroxyl modifications by comparing alchemical
hydration free energy calculations on the original GAFF/AM1-BCC and
modified forms. When performing ΔGhyd calculations on a polarized model, it has been suggested that a
work term for the polarization of the molecule (Epol) be included in final result similar to how it is
applied in the ΔHvap calculation.[27,63] Through personal correspondence with the suggesting authors, we
were advised not to include such a term when using
the AM1-BCC partial charges. Here, we follow this general advice;
however, because we are systematically polarizing the hydroxyl groups
beyond the AM1-BCC level, we include an additive per-hydroxyl
Epol term for polarizing methanol from the AM1-BCC
partial charge level to that at the end of the optimization, 0.409
kcal/mol per-hydroxyl. This term is taken to be additive with the
number of hydroxyl groups because we are only modifying the parameters
of hydroxyl groups–we leave the remaining model parameters
untouched. Note that the magnitude of this potential Epol term is less than 1 kT, the chosen
tolerance for the ΔHvap optimization.
This indicates that the new parameters are a valid fit to the computed
ΔHvap whether or not the Epol term is included. The Supporting Information contains results with and without the Epol term. We observe a net benefit whether we
apply the Epol term or not, though we
see the best performance using this correction. This work does not
resolve questions on whether or not such a term should be applied
in general,[24,27,63,64] but it shows how a simplified consideration
of this term can influence hydration free energy calculations. Plots
and error analysis of all these cases are provided in the Supporting Information for readers interested
in additional comparison clarifications.Figure 5A and 5B show the results of ΔGhyd calculations before and after the application
of the DC hydroxyl modifications. Figure 5A
highlights a key concern in hydration calculations of alcohols using
GAFF and AM1-BCC partial charges. There is a nearly 1.5 kcal/mol systematic
underestimation of the ΔGhyd. When
applying the DC hydroxyl modifications, this systematic error disappears,
and we are left with accurate calculation results that agree extremely
well with experiment. These results indicate that the neat liquid
methanol environment from the optimization appears to be a good proxy
for an aqueous hydrogen-bonding environment and quantitatively better
than the essentially gas-phase environment in an AM1-BCC calculation.
While we only have two solutes in this set with multiple hydroxyl
groups (1,2-ethandiol and 1,3-propanediol), the DC hydroxyl modifications
also appear to avoid potential issues related to nonadditivity of
local polarization, taking ΔGhyd values that are nearly 2 and 3 kcal/mol too unfavorable compared
to values that are within 0.5 kcal/mol of experiments.
Figure 5
Scatterplots comparing the calculated to experimental
ΔGhyd for 41 alcohols depicted using
the (A) GAFF
and AM1-BCC parameters and (B) the DC hydroxyl modified forms. The
GAFF/AM1-BCC mean signed error (MSE) of 1.43 kcal/mol is essentially
eliminated (MSE of −0.07 kcal/mol) when using the DC hydroxyl
modifications, and the root-mean-square error also drops from nearly
3kBT (1.52 kcal/mol)
down to under kBT (0.45
kcal/mol). The correlation of the data with experiment improves slightly
as well, going from an R2 of 0.87 to an R2 of 0.91 when using DC hydroxyls.
Environment
is important at both the classical and quantum levels
As
the methanolic environment provided by the DC hydroxyl modifications
is beneficial for both neat liquid ε(0) and ΔGhyd values calculated for general alcohols, we are interested
in determining if the new LJ parameters for the hydroxyloxygen atom
could be used in conjunction with a QM method that better treats the
condensed-phase environment’s effect on the solute electric
field and resulting partial charges. A better accounting of the condensed-phase
environment could result in a truer representation of the electrostatics
of the solute molecule. Ideally, we could set the partial charges
for a given solute containing hydroxyl groups by fitting to a QM electrostatic
potential calculated using a self-consistent reaction field (SCRF)
or similar environment term, rather than using the modified AM1-BCC
partial charges. To test this, we applied partial charges from MP2/cc-pTVZ
SCRF calculations[24] to a small subset of
phenolic and linear alcohol solutes and compared calculated ε(0)
and ΔGhyd values with similar results
from GAFF/AM1-BCC and the DC modified models. The results of these
calculations are shown in Figure 6A and 6B.
Figure 6
Comparison
of calculated (A) ε(0) and (B) ΔGhyd with experiment when using GAFF/AM1-BCC
(red squares), the DC hydroxyl form (blue
circles), and MP2/cc-pTVZ SCRF with the DC hydroxyl oxygen
LJ parameters (green diamonds). By using the new
LJ parameters, the neat liquid densities for methanol, ethanol, propan-1-ol,
phenol, o-cresol, and p-cresol do
not degrade when using a condensed-phase quantum calculation. The
resulting ε(0) and ΔGhyd are
still not quite to the accuracy level of the simple approach outlined
in this work.
Scatterplots comparing the calculated to experimental
ΔGhyd for 41 alcohols depicted using
the (A) GAFF
and AM1-BCC parameters and (B) the DC hydroxyl modified forms. The
GAFF/AM1-BCC mean signed error (MSE) of 1.43 kcal/mol is essentially
eliminated (MSE of −0.07 kcal/mol) when using the DC hydroxyl
modifications, and the root-mean-square error also drops from nearly
3kBT (1.52 kcal/mol)
down to under kBT (0.45
kcal/mol). The correlation of the data with experiment improves slightly
as well, going from an R2 of 0.87 to an R2 of 0.91 when using DC hydroxyls.It is apparent from the improved ε(0) values
that the use
of an SCRF polarizes the alcohols more than the semiempirical AM1-BCC
approach. Still, the more complex QM method does not result in dielectric
constants as accurate as our simple application of the polarization
changes from a neat liquid optimization of methanol. Similarly, the
hydration calculation shows improvement when using the more detailed
QM method, but agreement with experiment is still not quite to the
level of that from our DC hydroxyl modifications. While this is somewhat
disappointing from the perspective of the higher-level QM approach,
it does indicate that using the SCRF improves both dielectric and
hydration properties. Additionally, the neat liquid optimization rules
appear to provide a potential target benchmark for future development
of environment terms in QM electric field calculations.Comparison
of calculated (A) ε(0) and (B) ΔGhyd with experiment when using GAFF/AM1-BCC
(red squares), the DC hydroxyl form (blue
circles), and MP2/cc-pTVZ SCRF with the DC hydroxyloxygen
LJ parameters (green diamonds). By using the new
LJ parameters, the neat liquid densities for methanol, ethanol, propan-1-ol,
phenol, o-cresol, and p-cresol do
not degrade when using a condensed-phase quantum calculation. The
resulting ε(0) and ΔGhyd are
still not quite to the accuracy level of the simple approach outlined
in this work.
Blind predictions in the
SAMPL4 experiment show general improved
agreement with experiment
Our approach of correcting dielectric
properties proved successful for a moderately sized set of small molecules,
but our real goal is an approach which is transferable and predictive.
Thus, it would be helpful to have blind studies to evaluate if this
DC hydroxyl modification can be used in predictive situations. To
this end, we contributed ΔGhyd results
both without and with the DC modifications to the SAMPL4 hydration
prediction event.[65−67] The results of these contributions are summarized
in Table 3. For the actual SAMPL4 event, the Epol contribution to ΔGhyd was not included, but the results with this term are
included here for completeness. The reported errors in Table 3 were computed as the standard deviation in the
mean over 1000 bootstrap trials, with experimental values resampled
with added Gaussian noise drawn from a distribution with a standard
deviation equal to the experimental uncertainty.
Table 3
Analysis of Contributed Hydration
Predictions Using Versions of GAFF/AM1-BCC Parameters in TIP3P Water
for the 18 Hydroxyl Containing Solutes in the SAMPL4 Experiment
force field
RMSE (kcal/mol)
MSE (kcal/mol)
AUE (kcal/mol)
unmodified
1.77 ± 0.37
0.80 ± 0.39
1.40 ± 0.27
DC, no Epol
1.38 ± 0.25
–0.65 ± 0.30
1.02 ± 0.22
DC
1.51 ± 0.29
–0.15 ± 0.35
1.12 ± 0.24
Of the 47 small molecules in the full SAMPL4 set, 18 contained
at least one hydroxyl group. Since there are a diverse set of functional
groups represented in the SAMPL molecule sets, we would not expect
the application of these DC hydroxyl modifications to result in perfect
agreement with experimental values. However, the ∼0.3 kcal/mol
improvement shown in Table 3 in hydroxyl containing
solutes shown does shift the overall set of predictions closer to
experiment by 0.1 to 0.2 kcal/mol. We ran a paired t-test on GAFF/AM1-BCC versus GAFF/AM1-BCC DC and found that the latter
is significantly better (p = 0.008). It is interesting
to note from the Mean Signed Errors (MSE) that the DC hydroxyl modifications
without the Epol term improve the overall
results by countering the systematically hydrophobic nature of the
general force field, possibly skewing the predictions too hydrophilic.
The inclusion of the Epol term tempers
this effect, yet still results in similar overall prediction quality.While the aim of the SAMPL experiment is to inform rather than
act as a competition between groups, we note that the overall performance
of this simple force field + hydroxyl adjustments submission was regularly
among the top three predictions across the various performance metrics.[67] These results indicate that further force field
improvements using this DC strategy applied to other functional groups
could be an effective route toward more accurate predictions of hydration
from theoretical methods that rely upon the quality of small molecule
force field parameters.
Conclusions
We have shown a method
to incorporate experimental dielectric response
information on neat liquids in classical molecular models of alcohol.
By optimizing the LJ parameters and polarization of methanol to capture
the density, ε(0), and ΔHvap, we were able to determine simple hydroxyl modulation rules that
can be applied in a general manner to other alcohols. We tested how
these DC hydroxyl modifications affect the ε(0) for 41 small
molecule alcohols, and we found that these changes are transferable
and lead to substantially improved model property agreement with experiment.
We applied these modifications in the air-to-water transfer of these
molecules as well as in a blind prediction event and found that these
changes improve agreement with experiment by eliminating an observed
systematic hydrophobic bias in the hydration of GAFF/AM1-BCC alcohols.
These simple modifications outperform detailed QM calculations using
an SCRF environment coupling term and provide a potential benchmark
for future developments in QM calculation methods.These findings
make for several clear observations. First, the
relative simplicity and general nature of this effort to correct perceived
flaws in existing classical molecular mechanics force fields indicates
that there is room for improvement in the LJ and partial charge representations
of standard chemical groups. Here, we employed an optimization strategy
that utilized accepted experimentally measured thermodynamic properties
as target quantities. This work on the simplest of alcohols resulted
in a transferable approach that simultaneously improved dielectric
and hydration properties of more complex alcohols.Additionally,
the improvement in ΔGhyd results
indicates that the methanolic environment from
the hydroxyl optimization is a good analogue for the aqueous hydrogen-bonding
condensed-phase environment. Using such an environment when optimizing
LJ and partial charge parameters helps resolve systematic errors in
hydroxyl group hydration seen when using accepted classical force
fields. We suspect a similar improvement in hydration could occur
for other polar functional groups, such as amines and carbonyls, following
optimization, as their polar neat liquid environments might also be
reasonable analogues to polar aqueous environments.While the
optimization process applied here is specific to the
originating force field, it appears to be general for a given charge
model basis. We have applied a similar optimization to models using
the OPLS-UA force field for alcohols, which are fit to a slightly
less detailed basis set,[57] and arrived
at similar end-point alterations to those seen here. This indicates
that further development in condensed-phase QM electric field calculation
methodology, possibly in conjunction with experimentally grounded
efforts similar to those presented here, can potentially lead to even
more quantitative and insightful future classical molecular simulations.
Authors: Bruno A C Horta; Patrick F J Fuchs; Wilfred F van Gunsteren; Philippe H Hünenberger Journal: J Chem Theory Comput Date: 2011-03-25 Impact factor: 6.006
Authors: Shuai Liu; Shannon Cao; Kevin Hoang; Kayla L Young; Andrew S Paluch; David L Mobley Journal: J Chem Theory Comput Date: 2016-03-02 Impact factor: 6.006
Authors: David L Mobley; Caitlin C Bannan; Andrea Rizzi; Christopher I Bayly; John D Chodera; Victoria T Lim; Nathan M Lim; Kyle A Beauchamp; David R Slochower; Michael R Shirts; Michael K Gilson; Peter K Eastman Journal: J Chem Theory Comput Date: 2018-10-30 Impact factor: 6.006
Authors: Kyle A Beauchamp; Julie M Behr; Ariën S Rustenburg; Christopher I Bayly; Kenneth Kroenlein; John D Chodera Journal: J Phys Chem B Date: 2015-09-29 Impact factor: 2.991