Literature DB >> 24702668

A fixed-charge model for alcohol polarization in the condensed phase, and its role in small molecule hydration.

Christopher J Fennell¹, Karisa L Wymer, David L Mobley.

Abstract

We present a simple optimization strategy for incorporating experimental dielectric response information on neat liquids in classical molecular models of alcohol. Using this strategy, we determine simple and transferable hydroxyl modulation rules that, when applied to an existing molecular parameter set, result in a newly dielectric corrected (DC) parameter set. We applied these rules to the general Amber force field (GAFF) to form an initial set of GAFF-DC parameters, and we found this to lead to significant improvement in the calculated dielectric constant and hydration free energy values for a wide variety of small molecule alcohol models. Tests of the GAFF-DC parameters in the SAMPL4 blind prediction event for hydration show these changes improve agreement with experiment. Surprisingly, these simple modifications also outperform detailed quantum mechanical electric field calculations using a self-consistent reaction field environment coupling term. This work provides a potential benchmark for future developments in methods for representing condensed-phase environments in electronic structure calculations.

Entities: Chemical Disease Species

Mesh：

Substances：

Year: 2014 PMID： 24702668 PMCID： PMC4064691 DOI： 10.1021/jp411529h

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 2.991

Introduction

Computer simulations exist as a way to connect theories of the microscopic world to experimental results. They help validate our understanding of microscopic phenomena and propose new directions for investigation. Such simulations employ molecular models, and the development of molecular models and their component parameters has a long history.[1−7] One of the primary motivations in their continued development is the increased ability of a molecular model to quantitatively reproduce and/or predict experimental properties of interest. Our depiction of molecules in simulations has been grounded by several experimental observables. These include densities, enthalpies, liquid structure, molecular geometry, phase coexistence boundaries, and more.[8−12] In the case of water, anomalous properties such as the temperature of maximum density provide key motivation.[13−15] Whether the simulation is classical fixed-charge, polarizable, or even quantum mechanical (QM), models used often are unable to reproduce all the observables of interest and, thus, must strike a balance among fitting the selected properties, avoiding too many poorly defined parameters, and keeping the computational cost low. Generalized models, such as OPLS-AA,[3] GAFF,[6] and CGenFF,[7] typically focus on neat liquid density and heats of vaporization as experimental targets for optimization of Lennard–Jones (LJ) parameters coupled with partial charges assigned using fits to HF/6-31G* electrostatic potentials. While models made from these parameter sets can be used to study neat liquids, one of the main intents is for the representation of molecules in other environments, such as a ligand interacting with a protein or a solute transferring between different chemical phases. When using such a model outside of a neat liquid, it is understandable that one may observe a systematic error in a new property of interest. In the case of molecular transfer, researchers have explored a variety of approaches to reduce such systematic errors, and these generally amount to fitting the models to get the answer desired,[5] sometimes coupled to hydrated environment QM calculations,[16] and other times through statistical means by introducing additional water-mixing parameters to the LJ potential terms.[17] For researchers interested in simultaneous reproduction of experimental observables in multiple disparate environments, which is sometimes considered impossible for classical fixed-charge models,[5,18] more complex polarizable models and corrections have shown promise.[19−22] We are interested in both the neat liquid and air-to-water transfer properties of molecules with hydroxyl moieties. Recent studies on the hydration of alcohols using common force field parameters have shown systematic errors in ΔGhyd[17,23−28] and other thermodynamic quantities, in particular the static dielectric constant, ε(0).[29] Are systematic errors in ε(0) and ΔGhyd coupled to one another in some way? It has been suggested that functional groups in typical force fields are underpolarized to some degree,[16,18,21,23,25,27−30] though this should not be too surprising given that partial charges are typically fit to reproduce a gas-phase HF/6-31G* electrostatic potential, despite this charge model’s fortuitous propensity for overpolarizing the gas-phase electric field.[2,6,31,32] For solution-phase simulations, it would be preferable to have an accurate degree of polarization for the condensed-phase fluid encoded within the atomic partial charges of the molecule while maintaining the typical neat liquid density and enthalpy of vaporization (ΔHvap). Recent studies have begun using the experimentally measured ε(0) of a neat liquid as a fitting target for systematically scaling the degree of polarization of classical fixed-charge models.[33,34] This provides an orthogonal approach to the practice of setting partial charges with a QM electrostatic potential and optimizing LJ parameters to fit experimental observables. As it relies on experimental quantities to govern the magnitude of the partial charges, it can potentially provide an independent assessment of the quality of a given QM calculation used to derive partial charges. Here, we extend such work with the goal of addressing the following questions: (1) is it possible to simultaneously capture the density, ΔHvap, and ε(0) for simple alcohol solvents; (2) how transferable are the resulting hydroxyl parameters to other more complex alcohols; and (3) if an alcohol model reproduces the neat liquid ε(0), how does that affect the hydration of that alcohol as a solute?

Methods

General Simulation Protocol

In this work, both extended molecular dynamics simulations and alchemical free energy calculations were used to calculate neat liquid thermodynamic properties and air-to-water transfer free energies for a series of 41 small molecules containing hydroxyl moieties for which both experimental quantities were available. Simulations were performed using GROMACS 4.5.5[35−37] for the neat liquid molecular dynamics and a development version of GROMACS 4.6 for alchemical free energy calculations on these alcohol systems. Hydration free energies were computed using Thermodynamic Integration (TI), following a combination of previous protocols.[24,25,38] We note deviations from previous protocols below. Initial conformations for starting molecules were taken from structures provided in previous studies,[25] excepting the case of methanol subject to the neat liquid dielectric optimization, which was constructed and minimized using the structure editor in the UCSF Chimera visualization system.[39] The GAFF small molecule parameters[6] and AM1-BCC partial charges[40,41] were assigned to each structure using the Antechamber package (Amber 11 version).[42] After setup, the resulting structures and topologies were converted to the GROMACS format using the ACPYPE python script.[43] For neat liquid calculations, cubic simulation boxes of 256 like small molecules were constructed with GROMACS utilities and subject to sets of 150 ps of constant-pressure and constant-temperature equilibration until the target state-point was achieved. Again, methanol in the neat liquid optimization was an exception in needing a large cubic simulation box containing 450 molecules in order to accommodate box size fluctuations along the density optimization pathways. For hydration free energy calculations, single copies of the small molecules were solvated in rhombic-dodecahedral TIP3P[10] water boxes with at least 1.2 nm of space between any solute atom and a box edge. For all neat liquid calculations, a set of five 50 ns production simulations were performed for each molecule with a unique topology in order to obtain reported ε(0) values and associated standard errors. These calculations were performed at 1 atm and the target temperature from the experimental dielectric constant determination, usually 298.15 K unless otherwise noted. Specific temperatures are listed in the associated Supporting Information. A time step of 2 fs was used with the leapfrog integration algorithm for the equations of motion, and hydrogen atom bond lengths were constrained using the LINCS algorithm.[44] Isotropic pressure and temperature coupling were accomplished using the Parrinello–Rahman barostat[45] and Nose–Hoover thermostat[46,47] with time constants of 10 and 1 ps, respectively. Smooth particle-mesh Ewald[48] with a real-space cutoff of 1.2 Å, spline order of 4, and energy tolerance of 10–5 was used for long-ranged electrostatics corrections. LJ interactions were switched off smoothly between 8 and 10 Å, and long-range energy and pressure corrections were applied to these interactions as implemented in GROMACS.[49] Hydration free energy calculations were performed over a series of separate λ windows as described in previous work,[24] with the primary difference being that these simulations were performed at constant pressure as in the neat liquid calculations, but with the Langevin thermostat as in previous work. Performing the calculations with the Parrinello–Rahman barostat was found to result in numerically equivalent results to previous protocols, while eliminating the need for an extensive constant pressure equilibration and average volume rescaling at the beginning of each λ window. For our set of 41 alcohols, TI was used to determine both the electrostatic and nonpolar components of solvation rather than the Bennett acceptance ratio (BAR) for technical reasons in the development version of GROMACS. For the SAMPL4 calculations, our standard BAR protocol was restored with the use of GROMACS 4.6.2 beta 2. The electrostatic component was calculated from the free energy of turning on the solute partial charges in water, less the free energy of the same charging process in vacuum. The nonpolar component was calculated from the free energy for decoupling the uncharged solute LJ interactions from the surrounding water as in previous work.[24] This overall protocol was validated against previous calculations on these 41 small hydroxyl containing solutes[25] and found to be reliable and equivalently accurate.

Optimization Process for Hydroxyl Groups

Starting with a general model for a solute one would use in a typical alchemical free energy calculation, an iterative optimization process was developed to alter the molecular topology in order to improve agreement with dielectric properties of the neat liquid. This process is a variation of previous work[33] with a focus placed specifically on small molecules containing hydroxyl groups. The simplest of such solutes is methanol, and Figure 1 shows an example optimization flow process for methanol.

Figure 1

Illustration of the iterative optimization process for a methanol model using experimentally measured data. First, the size (LJ σ) is scaled to match experimental density for the neat liquid. This is followed by charge-scaling, to match the experimental dielectric constant, and well-depth (LJ ε) scaling, to match the experimental ΔHvap. The resulting model is tested in a final set of simulations to see how well it matches these three properties, and it is either accepted or returned to the optimization pipeline for further refinement. The goal of this process is to retain or improve the density and ΔHvap agreement that general molecular models are developed upon, while simultaneously improving the agreement with experimental ε(0) values. For simplicity, a three-stage procedure was employed with a linear interpolation in each stage used to target optimal values for these properties. After employing an initial 1 ns simulation to determine the starting model’s density, the hydroxyl oxygen σ value was scaled up or down by 5% if the density was either greater than or less than the target density window (791.4 kg/m3 to ±1.0 kg/m3) respectively. The density of a 1 ns simulation of the resulting model was calculated, and a linear interpolation was performed between these densities to determine a σ value that should correspond to the model having the target density. This process was repeated until the model consistently reported the experimental density. This led to the second stage process of optimizing the ε(0) value (33 to ±4), calculated using the standard dipole fluctuation formulas,[50] using similar linear interpolations over 15 ns simulations. In order to maintain charge neutrality, all the partial charges were scaled by ±5%, effectively scaling the dipole moment of the methanol model in order to raise or lower the ε(0) value. This is illustrated in Figure 1 as a saturation of atom and bond color on the methanol molecule. The third stage involved linear interpolation of ΔHvap values to estimate an optimal LJ ε well depth for the hydroxyl oxygen atom. The ΔHvap was calculated from the difference in per-molecule total energies from 10 ns single methanol gas phase and 500 ps liquid phase simulations,Epol is the energetic term corresponding to the work required to polarize the methanol molecule from the gas phase dipole moment to the liquid phase dipole moment,[14]Here, μgas and αgas were 1.7 D and 3.29 Å3 respectively for gaseous methanol.[51] To determine μliq from a flexible, yet fixed-charge methanol model, we averaged individual μliq values of each molecule from the last frame of the liquid simulation. Note that, as the neat liquid optimization system contains 450 alcohol molecules, the Eliq for the model is able to converge to ±0.1 kcal/mol over this relatively short optimization stage. As in the other stages, this process was iterated until the model reported ΔHvap within the desired tolerance (8.95 kcal/mol to ±0.59 kcal/mol). It should be noted that we only use ΔHvap in this optimization of methanol and do not calculate it for other solutes later in this study. Experimental ΔHvap, μgas, and αgas properties are not available for all subsequent molecules investigated, making such comparisons problematic. After this series of parameter adjustments, a final set of simulations was performed to check if all of these neat liquid properties were within the specified error tolerances for each of their state points. If any of the three properties degraded due to correlations between the parameters and properties not directly associated in the optimization procedure, the series of stages was iterated until the set of properties converged. It should be noted that the experimental values need not be at the same state point. In the case of methanol, the ΔHvap value used was measured at 298.15 K, while the density and ε(0) were measured at 293.15 K.[51] The adjusted topology will simply describe a classical molecular model that captures these experimental quantities at these differing state points. This topology will also not be the only set of parameters to capture the target property values, just the first set found that satisfies the criteria.

Application of Optimized Hydroxyl Parameters to New Model Molecules

While the ε(0) optimization procedure could adjust a classical molecular model to reproduce experimental properties for an entire set of molecules, it is not particularly efficient. In our tests developing an optimization approach, we found that this simple linear approach usually only required three to four ε(0) calculations to converge. As converging on ε(0) is the slow step of the process, we actually found the simple line-search strategy[33] to be more efficient than grid-based approaches that required more ε(0) calculations to map out parameter space. One could possibly employ an even more efficient search strategy such as that used in the ForceBalance tool,[52] but optimization would still require a computationally intensive search for each molecule of interest or a combined optimization over a set of small molecules. Rather than pursue such a specialized route, we chose to test if the knowledge gained from the hydroxyl optimization for methanol was transferrable to other small molecules containing hydroxyl moieties. Illustration of the application of optimized methanol polarization to a (A) primary alcohol and (B) phenolic alcohol. The partial charges on all labeled atoms are scaled by the scaling factor from the methanol optimization with the exception of the NC atoms. In the primary alcohol, the single NC atom acts as a neutralization site. In the phenolic alcohol, the charge difference due to scaling of the hydroxyl group is split over the two NC atoms. To apply the methanol hydroxyl optimization in a general manner, we assume that the hydrogen bonding between hydroxyls in neat methanol is representative of hydroxyl hydrogen bonding in other systems. Additionally, we assume that intramolecular perturbations of the electric field about a given hydroxyl group that would alter this behavior are treated by the semiempirical or quantum mechanical approach used in the setting of the partial charges for the model. Given these assumptions, we adopt new LJ σ and ε values, taken from the methanol optimization, for each hydroxyl oxygen atom. For the partial charges, we scale all hydroxyl group atoms by the amount observed for methanol, as shown in Figure 2. Any non-hydrogen atoms bonded to the α carbon are then treated as neutralization sites (the NC site in Figure 2A) to compensate for any charge difference introduced by this charge scaling. If there are multiple non-hydrogen atoms bonded to the α carbon (Figure 2B), the charge difference between the original and polarized set of hydroxyl atoms is distributed evenly among these bonded atoms.

Figure 2

Illustration of the application of optimized methanol polarization to a (A) primary alcohol and (B) phenolic alcohol. The partial charges on all labeled atoms are scaled by the scaling factor from the methanol optimization with the exception of the NC atoms. In the primary alcohol, the single NC atom acts as a neutralization site. In the phenolic alcohol, the charge difference due to scaling of the hydroxyl group is split over the two NC atoms.

Results and Discussion

Dielectric corrected methanol provides general hydroxyl optimization rules

We applied the described optimization protocol to alter the force field parameters for the hydroxyl oxygen in a methanol molecule using the GAFF small molecule parameters and AM1-BCC partial charges. The results of this procedure are shown in Tables 1 and 2.

Table 1

Calculated Properties for the Original and Altered Methanol Models Alongside Experimental Values

	ρa (kg/m³)	ΔH_vapb (kcal/mol)	ε(0)a
starting CH₃OH	791.6	10.95	20.7
final CH₃OH	792.4	8.96	34.6
experiment	791.4	8.95	33.0

At 293.15 K.

At 298.15 K.

Table 2

LJ Parameters and Charge Scaling Values for the Hydroxyl Oxygen in the Initial and Altered Methanol Models

	σ (Å)	ε (kcal/mol)	q scaling
starting CH₃OH	3.066 47	0.210 40	1.000 00
final CH₃OH	3.219 90	0.202 07	1.209 05

At 293.15 K. At 298.15 K. Table 1 shows how the properties for the methanol models successfully balance within the predefined experimental constraints. The density is sitting just within our predefined tolerance (1 kg/m3) of the experimental value while the other properties sit well within the kBT window for the ΔHvap and Δε(0) = 4 window. The biggest change is seen for the ε(0) value in moving a quantity that has a 40% discrepancy down to a less than 5% difference. To accomplish this, the dipole moment of the methanol model needed an ∼20% increase in magnitude (see Table 2). These modifications are similar to recently proposed modifications derived from matching Kirkwood–Buff integrals, though not quite of the same magnitude,[53] and also qualitatively similar to recent changes in the GROMOS force field for alcohols.[54] Our changes indicate that the condensed-phase polarization of methanol is underaccounted for in the AM1-BCC charge model and the HF/6-31G* basis-set electrostatic potential upon which it is fit,[41] even given this basis set’s 10% to 15% overpolarization relative to the gas phase expectation that is considered fortuitous for condensed-phase simulations.[2,31,41] The experimental liquid-phase dipole moment for methanol is currently unknown, so while it is expected to be greater in magnitude than the gas-phase dipole moment, it is unclear if the optimized liquid state electric field is a truer representation of that in real methanol than the field used in other models. We cannot expect other properties, such as the density, to remain static in the face of such a large polarization of the liquid-state methanol molecule. Thus, the hydroxyl oxygen σ expands to counter the increase in density that occurs with such a polarization. This increase in size simultaneously decreases the strength of alcohol model pair interactions and increases the molecular volume. While this change in oxygen atom size acts to counter the increased polarization, this does not mean that the structure of the liquid remains static. Figure 3 shows the gOO(r) for methanol before and after the optimization process, alongside a similar curve from neutron-diffraction studies.[55,56] We see an approximately 0.05 Å shift of peak maxima to longer distances with the new parameters, in poorer agreement with experimentally supported structural data. We also see that the peaks and troughs are amplified, indicating an increase in the structuring of the liquid, in better agreement with the experimentally supported structural data. These small positive and negative changes, coupled with the fact that the liquid structure cannot be unambiguously determined from experiments without using model fits, make it difficult for us to state if the liquid structure is better or worse relative to experiment. We can state that while the liquid density remains the same, the new parameters act to increase the structure of the liquid. If further changes in the structure are desired, one could consider including target data along with condensed-phase geometry optimizations in the fitting process.[57,58]

Figure 3

Oxygen–oxygen radial distribution functions for the starting GAFF/AM1-BCC methanol model (orange dashes), the final dielectric corrected (DC) model (blue), and neutron-diffraction studies (black).[55,56] The topology adjustments in the final DC model make the liquid more structured, but they do little to the radial location of the existing methanol structure of the starting model. In addition to liquid structure, experimental diffusion constant data are available for methanol at standard conditions, 2.28 × 10–5 cm2/s.[59] We computed the neat liquid diffusion constant for this methanol model before and after the optimization process. The original model has a diffusion constant of (3.8 ± 0.2) × 10–5 cm2/s, which is considerably faster than experiment. This fast dynamic behavior is quite similar to that seen with the OPLS-AA force field.[60] After optimization, the diffusion constant drops to (1.8 ± 0.2) × 10–5 cm2/s, slower than, but considerably closer to, the experimental value. This change is similar to that seen for the evolution of SPC to SPC/E water, where the additional polarization in SPC/E lowered the diffusion constant closer to the experimental value.[61,62] This improvement comes despite dynamical performance not being considered in the optimization process. As suggested for the model liquid structure, using the diffusion constant, possibly in place of ΔHvap, could be an alternate optimization strategy to further the agreement in dynamical properties of general alcohols. The scaling values listed here were used as general rules in the development of a simple tool for converting such hydroxyl groups in GROMACS topologies into similar dielectric corrected (DC) forms. The source code for perl and python versions of this utility is provided in the Supporting Information for interested readers.

New hydroxyls improve dielectric properties of other alcohols

The optimization improves the dielectric properties of methanol directly but is difficult to apply prospectively to new molecules. If improvement of the ε(0) is desired for models of other molecules with hydroxyl groups, it would be advantageous to not need to undergo such a process for each new molecule. In many cases, such an effort would be impossible due to the lack of relevant experimental data. This is a key concern for researchers wanting to use new molecular models as a step in a computational chemical process for pharmaceutical development, or in any other discovery setting. We therefore decided to test how transferrable these adjustments are to other more complex small molecules. This set, listed in the Supporting Information, consists of methanol and 40 additional molecules for which the experimental ε(0) and ΔGhyd are known. Figure 4A and 4B show the before and after application of DC hydroxyl modifications results for the calculation of the neat liquid ε(0) values at the experimentally measured temperatures. One striking feature of the original general alcohol models is the tight correlation but shallow slope for the calculated ε(0) versus experimental measurements. Regression analysis shows the model ε(0) to typically be about two-thirds of the experimental value. Applying the DC methanol modifications to all hydroxyl groups corrects this systematic underprediction, while not degrading the neat liquid densities.

Figure 4

Scatterplots of the calculated ε(0) values versus experimental quantities for a set of 41 small molecules using (A) the GAFF small molecule parameters and AM1-BCC partial charges and (B) those same models after applying the methanol dielectric correction to all hydroxyl groups. For the GAFF/AM1-BCC models, the data are well-correlated (R2 = 0.91) to an underpredicting slope of 0.65(3). The DC models show a similar correlation (R2 = 0.89) but have a slope of 1.07(6), indicating improved agreement with experiment. The general trend for the change in ε(0) values when applying the DC methanol modifications was an approximately 50% increase, closing the gap between the calculations and experimental measurements. There were a few exceptions, the most glaring were cyclohexanol and cycloheptanol. In these neat liquids, the ε(0) changed in the opposite manner, an average 50% decrease. It should be noted that the densities for the starting GAFF/AM1-BCC model liquids were already 10–20 kg/m3 less than experiment, and these values do not get “corrected” by the hydroxyl modifications. This indicates a potential issue may reside in the cycloalkane angle and/or torsion parameters, altering the liquid-state dipole orientation distributions in a manner inconsistent with the trends seen for other small molecules. Regardless, such occurrences were the exception rather than the norm, as only 4 of the 41 molecules showed a statistically significant decay of ε(0) away from the experimental value, with the noted cycloalcohol pair being the worst offenders. The majority of these alcohols showed dramatic enhancement, indicating that the DC methanol modifications are surprisingly transferable for the purpose of improving model ε(0) values.

ΔGhyd trends couple with ε(0)

One of the primary motivations behind improving the dielectric properties of model solutes is the potential for improvement in force field depictions of such solutes in other applications such as hydration and binding free energy calculations. Solvation (and desolvation) of ligand molecules is a critical aspect of ligand binding processes, and accurate predictions of ΔGhyd are necessary for molecular modeling to be a useful contributor to pharmaceutical design and development. Transfer processes are governed by the dielectric permittivity of the mediums in implicit molecular models, and we expect an experimentally relevant depiction of the solute and environment interactions will be necessary for explicit models to be quantitative. The dielectric correction process does not use ΔGhyd information for optimization, but it does target the quality of the model by polarizing hydroxyl groups to a level relevant to a condensed-phase hydrogen-bonding environment. Here, we test the DC hydroxyl modifications by comparing alchemical hydration free energy calculations on the original GAFF/AM1-BCC and modified forms. When performing ΔGhyd calculations on a polarized model, it has been suggested that a work term for the polarization of the molecule (Epol) be included in final result similar to how it is applied in the ΔHvap calculation.[27,63] Through personal correspondence with the suggesting authors, we were advised not to include such a term when using the AM1-BCC partial charges. Here, we follow this general advice; however, because we are systematically polarizing the hydroxyl groups beyond the AM1-BCC level, we include an additive per-hydroxyl Epol term for polarizing methanol from the AM1-BCC partial charge level to that at the end of the optimization, 0.409 kcal/mol per-hydroxyl. This term is taken to be additive with the number of hydroxyl groups because we are only modifying the parameters of hydroxyl groups–we leave the remaining model parameters untouched. Note that the magnitude of this potential Epol term is less than 1 kT, the chosen tolerance for the ΔHvap optimization. This indicates that the new parameters are a valid fit to the computed ΔHvap whether or not the Epol term is included. The Supporting Information contains results with and without the Epol term. We observe a net benefit whether we apply the Epol term or not, though we see the best performance using this correction. This work does not resolve questions on whether or not such a term should be applied in general,[24,27,63,64] but it shows how a simplified consideration of this term can influence hydration free energy calculations. Plots and error analysis of all these cases are provided in the Supporting Information for readers interested in additional comparison clarifications. Figure 5A and 5B show the results of ΔGhyd calculations before and after the application of the DC hydroxyl modifications. Figure 5A highlights a key concern in hydration calculations of alcohols using GAFF and AM1-BCC partial charges. There is a nearly 1.5 kcal/mol systematic underestimation of the ΔGhyd. When applying the DC hydroxyl modifications, this systematic error disappears, and we are left with accurate calculation results that agree extremely well with experiment. These results indicate that the neat liquid methanol environment from the optimization appears to be a good proxy for an aqueous hydrogen-bonding environment and quantitatively better than the essentially gas-phase environment in an AM1-BCC calculation. While we only have two solutes in this set with multiple hydroxyl groups (1,2-ethandiol and 1,3-propanediol), the DC hydroxyl modifications also appear to avoid potential issues related to nonadditivity of local polarization, taking ΔGhyd values that are nearly 2 and 3 kcal/mol too unfavorable compared to values that are within 0.5 kcal/mol of experiments.

Figure 5

Environment is important at both the classical and quantum levels

As the methanolic environment provided by the DC hydroxyl modifications is beneficial for both neat liquid ε(0) and ΔGhyd values calculated for general alcohols, we are interested in determining if the new LJ parameters for the hydroxyl oxygen atom could be used in conjunction with a QM method that better treats the condensed-phase environment’s effect on the solute electric field and resulting partial charges. A better accounting of the condensed-phase environment could result in a truer representation of the electrostatics of the solute molecule. Ideally, we could set the partial charges for a given solute containing hydroxyl groups by fitting to a QM electrostatic potential calculated using a self-consistent reaction field (SCRF) or similar environment term, rather than using the modified AM1-BCC partial charges. To test this, we applied partial charges from MP2/cc-pTVZ SCRF calculations[24] to a small subset of phenolic and linear alcohol solutes and compared calculated ε(0) and ΔGhyd values with similar results from GAFF/AM1-BCC and the DC modified models. The results of these calculations are shown in Figure 6A and 6B.

Figure 6

Comparison of calculated (A) ε(0) and (B) ΔGhyd with experiment when using GAFF/AM1-BCC (red squares), the DC hydroxyl form (blue circles), and MP2/cc-pTVZ SCRF with the DC hydroxyl oxygen LJ parameters (green diamonds). By using the new LJ parameters, the neat liquid densities for methanol, ethanol, propan-1-ol, phenol, o-cresol, and p-cresol do not degrade when using a condensed-phase quantum calculation. The resulting ε(0) and ΔGhyd are still not quite to the accuracy level of the simple approach outlined in this work.

Scatterplots comparing the calculated to experimental ΔGhyd for 41 alcohols depicted using the (A) GAFF and AM1-BCC parameters and (B) the DC hydroxyl modified forms. The GAFF/AM1-BCC mean signed error (MSE) of 1.43 kcal/mol is essentially eliminated (MSE of −0.07 kcal/mol) when using the DC hydroxyl modifications, and the root-mean-square error also drops from nearly 3kBT (1.52 kcal/mol) down to under kBT (0.45 kcal/mol). The correlation of the data with experiment improves slightly as well, going from an R2 of 0.87 to an R2 of 0.91 when using DC hydroxyls. It is apparent from the improved ε(0) values that the use of an SCRF polarizes the alcohols more than the semiempirical AM1-BCC approach. Still, the more complex QM method does not result in dielectric constants as accurate as our simple application of the polarization changes from a neat liquid optimization of methanol. Similarly, the hydration calculation shows improvement when using the more detailed QM method, but agreement with experiment is still not quite to the level of that from our DC hydroxyl modifications. While this is somewhat disappointing from the perspective of the higher-level QM approach, it does indicate that using the SCRF improves both dielectric and hydration properties. Additionally, the neat liquid optimization rules appear to provide a potential target benchmark for future development of environment terms in QM electric field calculations. Comparison of calculated (A) ε(0) and (B) ΔGhyd with experiment when using GAFF/AM1-BCC (red squares), the DC hydroxyl form (blue circles), and MP2/cc-pTVZ SCRF with the DC hydroxyl oxygen LJ parameters (green diamonds). By using the new LJ parameters, the neat liquid densities for methanol, ethanol, propan-1-ol, phenol, o-cresol, and p-cresol do not degrade when using a condensed-phase quantum calculation. The resulting ε(0) and ΔGhyd are still not quite to the accuracy level of the simple approach outlined in this work.

Blind predictions in the SAMPL4 experiment show general improved agreement with experiment

Our approach of correcting dielectric properties proved successful for a moderately sized set of small molecules, but our real goal is an approach which is transferable and predictive. Thus, it would be helpful to have blind studies to evaluate if this DC hydroxyl modification can be used in predictive situations. To this end, we contributed ΔGhyd results both without and with the DC modifications to the SAMPL4 hydration prediction event.[65−67] The results of these contributions are summarized in Table 3. For the actual SAMPL4 event, the Epol contribution to ΔGhyd was not included, but the results with this term are included here for completeness. The reported errors in Table 3 were computed as the standard deviation in the mean over 1000 bootstrap trials, with experimental values resampled with added Gaussian noise drawn from a distribution with a standard deviation equal to the experimental uncertainty.

Table 3

Analysis of Contributed Hydration Predictions Using Versions of GAFF/AM1-BCC Parameters in TIP3P Water for the 18 Hydroxyl Containing Solutes in the SAMPL4 Experiment

force field	RMSE (kcal/mol)	MSE (kcal/mol)	AUE (kcal/mol)
unmodified	1.77 ± 0.37	0.80 ± 0.39	1.40 ± 0.27
DC, no E_pol	1.38 ± 0.25	–0.65 ± 0.30	1.02 ± 0.22
DC	1.51 ± 0.29	–0.15 ± 0.35	1.12 ± 0.24

Of the 47 small molecules in the full SAMPL4 set, 18 contained at least one hydroxyl group. Since there are a diverse set of functional groups represented in the SAMPL molecule sets, we would not expect the application of these DC hydroxyl modifications to result in perfect agreement with experimental values. However, the ∼0.3 kcal/mol improvement shown in Table 3 in hydroxyl containing solutes shown does shift the overall set of predictions closer to experiment by 0.1 to 0.2 kcal/mol. We ran a paired t-test on GAFF/AM1-BCC versus GAFF/AM1-BCC DC and found that the latter is significantly better (p = 0.008). It is interesting to note from the Mean Signed Errors (MSE) that the DC hydroxyl modifications without the Epol term improve the overall results by countering the systematically hydrophobic nature of the general force field, possibly skewing the predictions too hydrophilic. The inclusion of the Epol term tempers this effect, yet still results in similar overall prediction quality. While the aim of the SAMPL experiment is to inform rather than act as a competition between groups, we note that the overall performance of this simple force field + hydroxyl adjustments submission was regularly among the top three predictions across the various performance metrics.[67] These results indicate that further force field improvements using this DC strategy applied to other functional groups could be an effective route toward more accurate predictions of hydration from theoretical methods that rely upon the quality of small molecule force field parameters.

Conclusions

We have shown a method to incorporate experimental dielectric response information on neat liquids in classical molecular models of alcohol. By optimizing the LJ parameters and polarization of methanol to capture the density, ε(0), and ΔHvap, we were able to determine simple hydroxyl modulation rules that can be applied in a general manner to other alcohols. We tested how these DC hydroxyl modifications affect the ε(0) for 41 small molecule alcohols, and we found that these changes are transferable and lead to substantially improved model property agreement with experiment. We applied these modifications in the air-to-water transfer of these molecules as well as in a blind prediction event and found that these changes improve agreement with experiment by eliminating an observed systematic hydrophobic bias in the hydration of GAFF/AM1-BCC alcohols. These simple modifications outperform detailed QM calculations using an SCRF environment coupling term and provide a potential benchmark for future developments in QM calculation methods. These findings make for several clear observations. First, the relative simplicity and general nature of this effort to correct perceived flaws in existing classical molecular mechanics force fields indicates that there is room for improvement in the LJ and partial charge representations of standard chemical groups. Here, we employed an optimization strategy that utilized accepted experimentally measured thermodynamic properties as target quantities. This work on the simplest of alcohols resulted in a transferable approach that simultaneously improved dielectric and hydration properties of more complex alcohols. Additionally, the improvement in ΔGhyd results indicates that the methanolic environment from the hydroxyl optimization is a good analogue for the aqueous hydrogen-bonding condensed-phase environment. Using such an environment when optimizing LJ and partial charge parameters helps resolve systematic errors in hydroxyl group hydration seen when using accepted classical force fields. We suspect a similar improvement in hydration could occur for other polar functional groups, such as amines and carbonyls, following optimization, as their polar neat liquid environments might also be reasonable analogues to polar aqueous environments. While the optimization process applied here is specific to the originating force field, it appears to be general for a given charge model basis. We have applied a similar optimization to models using the OPLS-UA force field for alcohols, which are fit to a slightly less detailed basis set,[57] and arrived at similar end-point alterations to those seen here. This indicates that further development in condensed-phase QM electric field calculation methodology, possibly in conjunction with experimentally grounded efforts similar to those presented here, can potentially lead to even more quantitative and insightful future classical molecular simulations.

39 in total

1. Calculation of the water-cyclohexane transfer free energies of neutral amino acid side-chain analogs using the OPLS all-atom force field.

Authors: Justin L MacCallum; D Peter Tieleman
Journal: J Comput Chem Date: 2003-11-30 Impact factor: 3.376

2. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation.

Authors: Araz Jakalian; David B Jack; Christopher I Bayly
Journal: J Comput Chem Date: 2002-12 Impact factor: 3.376

3. Development and testing of a general amber force field.

Authors: Junmei Wang; Romain M Wolf; James W Caldwell; Peter A Kollman; David A Case
Journal: J Comput Chem Date: 2004-07-15 Impact factor: 3.376

4. Molecular dynamics simulations of liquid methanol and methanol-water mixtures with polarizable models.

Authors: Haibo Yu; Daan P Geerke; Haiyan Liu; Wilfred F van Gunsteren
Journal: J Comput Chem Date: 2006-10 Impact factor: 3.376

5. Automatic atom type and bond type perception in molecular mechanical calculations.

Authors: Junmei Wang; Wei Wang; Peter A Kollman; David A Case
Journal: J Mol Graph Model Date: 2006-02-03 Impact factor: 2.518

6. Electronic continuum model for molecular dynamics simulations.

Authors: I V Leontyev; A A Stuchebrukhov
Journal: J Chem Phys Date: 2009-02-28 Impact factor: 3.488

7. A blind challenge for computational solvation free energies: introduction and overview.

Authors: J Peter Guthrie
Journal: J Phys Chem B Date: 2009-04-09 Impact factor: 2.991

Review 8. Blind prediction of solvation free energies from the SAMPL4 challenge.

Authors: David L Mobley; Karisa L Wymer; Nathan M Lim; J Peter Guthrie
Journal: J Comput Aided Mol Des Date: 2014-03-11 Impact factor: 3.686

9. Derivation of fixed partial charges for amino acids accommodating a specific water model and implicit polarization.

Authors: David S Cerutti; Julia E Rice; William C Swope; David A Case
Journal: J Phys Chem B Date: 2013-02-18 Impact factor: 2.991

10. New Interaction Parameters for Oxygen Compounds in the GROMOS Force Field: Improved Pure-Liquid and Solvation Properties for Alcohols, Ethers, Aldehydes, Ketones, Carboxylic Acids, and Esters.

Authors: Bruno A C Horta; Patrick F J Fuchs; Wilfred F van Gunsteren; Philippe H Hünenberger
Journal: J Chem Theory Comput Date: 2011-03-25 Impact factor: 6.006

21 in total

1. Calculating Partition Coefficients of Small Molecules in Octanol/Water and Cyclohexane/Water.

Authors: Caitlin C Bannan; Gaetano Calabró; Daisy Y Kyu; David L Mobley
Journal: J Chem Theory Comput Date: 2016-08-01 Impact factor: 6.006

2. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation.

Authors: Andrew S Paluch; Sreeja Parameswaran; Shuai Liu; Anasuya Kolavennu; David L Mobley
Journal: J Chem Phys Date: 2015-01-28 Impact factor: 3.488

3. Testing the semi-explicit assembly model of aqueous solvation in the SAMPL4 challenge.

Authors: Libo Li; Ken A Dill; Christopher J Fennell
Journal: J Comput Aided Mol Des Date: 2014-01-29 Impact factor: 3.686

Review 4. Blind prediction of solvation free energies from the SAMPL4 challenge.

Authors: David L Mobley; Karisa L Wymer; Nathan M Lim; J Peter Guthrie
Journal: J Comput Aided Mol Des Date: 2014-03-11 Impact factor: 3.686

5. Applicability of a thermodynamic cycle approach for a force field parametrization targeting non-aqueous solvation free energies.

Authors: Andreas Mecklenfeld; Gabriele Raabe
Journal: J Comput Aided Mol Des Date: 2019-11-28 Impact factor: 3.686

6. Using MD Simulations To Calculate How Solvents Modulate Solubility.

Authors: Shuai Liu; Shannon Cao; Kevin Hoang; Kayla L Young; Andrew S Paluch; David L Mobley
Journal: J Chem Theory Comput Date: 2016-03-02 Impact factor: 6.006

7. FreeSolv: a database of experimental and calculated hydration free energies, with input files.

Authors: David L Mobley; J Peter Guthrie
Journal: J Comput Aided Mol Des Date: 2014-06-14 Impact factor: 3.686

8. Escaping Atom Types in Force Fields Using Direct Chemical Perception.

Authors: David L Mobley; Caitlin C Bannan; Andrea Rizzi; Christopher I Bayly; John D Chodera; Victoria T Lim; Nathan M Lim; Kyle A Beauchamp; David R Slochower; Michael R Shirts; Michael K Gilson; Peter K Eastman
Journal: J Chem Theory Comput Date: 2018-10-30 Impact factor: 6.006

9. Evaluation of CM5 Charges for Nonaqueous Condensed-Phase Modeling.

Authors: Leela S Dodda; Jonah Z Vilseck; Kara J Cutrona; William L Jorgensen
Journal: J Chem Theory Comput Date: 2015-08-27 Impact factor: 6.006

10. Toward Automated Benchmarking of Atomistic Force Fields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive.

Authors: Kyle A Beauchamp; Julie M Behr; Ariën S Rustenburg; Christopher I Bayly; Kenneth Kroenlein; John D Chodera
Journal: J Phys Chem B Date: 2015-09-29 Impact factor: 2.991