Literature DB >> 32715202

A Simple Method for Including Polarization Effects in Solvation Free Energy Calculations When Using Fixed-Charge Force Fields: Alchemically Polarized Charges.

Abstract

The incorporation of polarizability in classical force-field molecular simulations is an ongoing area of research. We focus here on its application to hydration free energy simulations of organic molecules. In contrast to computationally complex approaches involving the development of explicitly polarizable force fields, we present herein a simple methodology for incorporating polarization into such simulations using standard fixed-charge force fields, which we call the alchemically polarized charges (APolQ) method. APolQ employs a standard classical alchemical free energy change simulation to calculate the free energy difference between a fully polarized solute particle in a condensed phase and its unpolarized state in a vacuum. APolQ can in principle be applied to any microscopically homogeneous system (e.g., pure or mixed solvents). We applied APolQ to hydration free energy data for a test set of 45 neutral solute molecules in the FreeSolv database and compared results obtained using three different water models (SPC/E, TIP3P, and OPC3) and using minimal basis iterative Stockholder (MBIS) and restrained electrostatic potential (RESP) partial charge methodologies. In comparison with AM1-BCC, we found that APolQ outperforms it for the test set. Despite our method using default GAFF parameters, the MBIS partial charges yield absolute average deviations 1.5-1.9 kJ mol-1 lower than using AM1 bond charge correction (AM1-BCC). We conjecture that this method can be further improved by fitting the Lennard-Jones and torsional parameters to partial charges derived using MBIS or RESP methodologies.

Entities: Chemical Disease Species

Year: 2020 PMID： 32715202 PMCID： PMC7376688 DOI： 10.1021/acsomega.0c01148

Source DB: PubMed Journal: ACS Omega ISSN： 2470-1343

Introduction

Free energy calculations are important in many scientific and engineering research areas, including chemical reaction and phase equilibria,[1−4] solubility,[5−7] hydration free energies,[8−10] and pharmaceutical drug design.[11−14] Molecular mechanics (MM) models are frequently used for atomistic simulations of these phenomena using either Molecular Dynamics (MD) or Monte Carlo (MC) algorithms. MM models use force fields (FFs) to describe the interactions between molecules, which incorporate parameters for bonds, angles, and torsions that are generated using the electronic structure (ES) methodology and subsequently fitted to data with mathematically simple functions. Intermolecular interactions typically involve both Lennard-Jones (LJ) and electrostatic (Coulombic) interactions. The electrostatic interactions are traditionally modeled using fixed atom-centered partial charges. Although fixed-charge FFs give reasonable results for several properties, they are often problematic for the calculations of the free energy changes between dissimilar environments, such as hydration free energies, ΔGhyd.[15,16] In this paper, we focus on the use of fixed-charge atomistic FFs for the calculation of ΔGhyd. Due to their intrinsic limitations,[17] we do not consider approaches using implicit solvent methods for its direct calculation. We also do not consider explicitly polarizable atomistic FFs, which add additional parameters to the model.[18−26] Despite the added computational burden of their determination[27,28] such FFs currently do not always give better predictions than fixed–charge FFs,[29] particularly in the case of free energy calculations.[30−32] ΔGhyd calculations for fixed-charge FFs are performed by alchemically decoupling (coupling) a solute molecule from (to) its solution environment to (from) its vacuum state. This is typically performed by separately decoupling the Coulombic and LJ contributions via a sequence of λ windows. For neutral molecules, in most cases, the electrostatic contribution is the major contribution to the hydration free energy. This is even more strongly the case for ions, where the electrostatic contribution is often an order of magnitude larger than the LJ contribution. It is therefore of great importance to develop methods that can better calculate the Coulombic contribution. One approach to incorporate polarizability when using fixed-charge FFs is to include some of its effects into the partial charges of the FF itself. For the general Amber force field (GAFF) approach,[33] the default partial charges are either restrained electrostatic potential (RESP)[34] at the HF/6-31G* level or the semi-empirical AM1 bond charge correction (AM1-BCC),[35] which is designed to yield similar results to RESP partial charges, but are faster to calculate and less sensitive to geometry. These choices indirectly incorporate polarization effects to mimic a condensed phase environment. Another approach has been to account for polarization by way of a correction to the fixed-charge FF calculation.[36−41] Whereas polarization is indirectly included in the partial charges calculated for GAFF FFs, due to its choice of ab initio theory, it is not specific to any particular solvent environment. A recent work of Zhou et al.(42) has shown, based on comparisons predicted with accurate/experimental molecular dipole moments in a vacuum, that HF/6-31G* exhibits inconsistencies (ranging from under- to overpolarization by as much as 35%). The best hydration free energy results are typically obtained by using AM1-BCC. This is likely due to the ad hoc and extensive empirical data fitting of its parameters; this also serves, however, as its greatest limitation. It is not possible to systematically improve the electrostatic interactions when using AM1-BCC without refitting the entire database of bond charge corrections. While AM1-BCC is the current “gold standard” for GAFF, which is used to compare alternative partial charge methods,[36,39,40] its results are, in our opinion, of only modest quality for solvation free energies. The fitting of the LJ parameters to experimental data can potentially compensate for an incorrect treatment of polarizability. Whereas some of the experimental data used for fitting procedures consists of hydration free energy results, the parameters are usually fitted to a broad spectrum of thermodynamic properties. In any event, the requirement for a property-specific FF is unsatisfactory in principle. Our goal is to address the deficiencies of existing polarization strategies in the context of simulations of hydration free energies, ΔGhyd, and to test our approach for a spectrum of molecule types in the FreeSolv database.[10] We recently proposed a method for explicitly incorporating polarization into a fixed-charge FF free energy calculation, called the OTFP (on-the-fly-polarization) method.[43] This approach decouples the electric field generated by the solvent’s MM partial charges by scaling the MM partial charges in a QM/MM snapshot proportionally to the Coulombic decoupling parameter λcoul. This allows polarity in the solute partial charges to evolve from being completely unpolarized in a vacuum to fully polarized in solution. We also showed how the cost of self-polarization of the solute could be calculated by employing double-sided perturbation techniques. In this paper, we propose a new method that uses a polarizable continuum model (PCM)[44] approach to calculate the partial charges of the solute FF. This is considerably faster than the OTFP approach and is very simple to implement. Only two solute partial charge calculations are required prior to a simulation: a vacuum charge set and a polarized charge set. The resulting relatively low computational burden makes it computationally feasible to add diffuse functions to the QM basis set used. We refer to our proposed methodology for using these charge sets in the course of the Coulomb decoupling/coupling procedure in the alchemical change calculation of the APolQ (alchemically polarized charge) polarization methodology. The overall computational cost of this method is similar to that required for a standard fixed-charge simulation, and its modularity allows different ES calculations and partial charge partitioning schemes to be easily implemented as future improvements evolve with respect to these aspects. APolQ is applicable only to solutions with a microscopically homogeneous environment, i.e., consisting of pure solvents and homogeneous mixtures of different compositions. This is the assumed context in the subsequent sections of this paper for its description and application to the large class of systems consisting of molecules of up to moderate size. It excludes, for example, solution environments that are microscopically inhomogeneous and/or for which dielectric constants are poorly defined, an example of which is the interior of a protein molecule. Such environments can be treated either by explicitly polarizable force fields or potentially by our OTFP method.[43] We test our methodology on 45 different solute molecules that range from small polar molecules to large drug-like molecules. We consider minimal basis iterative Stockholder (MBIS) and RESP partial charges. For MBIS charges, we show results using SPC/E,[45] TIP3P,[46] and OPC3[47] water FFs. For RESP, we only show results using TIP3P water because this is the default solvent for which GAFF parameters are optimized. We also compare our results with those computed using the traditional AM1-BCC fixed-charge method. The paper is organized as follows. We first summarize our FF generation procedures, followed by a description of the MD simulation protocols used. We summarize our methodology for calculating and transitioning from vacuum to solution partial charges and accounting for the cost of self-polarization. We then describe and discuss our results followed by conclusions and recommendations.

Simulation Methodology

The hydration (solvation) free energy is the molar free energy change when a single solute molecule is transferred from a vacuum (ideal gas state) into pure (water) solvent, where both the gas and the solvent are at unit molarity:where ρsolv(P, T) is the solvent density at the specified values of T and P and μsoluteres, NVT; ∞[T, ρsolv(P, T)] is the residual chemical potential of the solute at infinite dilution in the solvent, which we refer to as the intrinsic solvation free energy. Setting the chemical potentials of both states equal to each other using the constant density form of the chemical potential[43] leads to the experimentally used form of the ΔGhyd as given by This form assumes that solute concentrations are low enough for Henry’s law to apply in the solution and that the partial pressure of the solute, Pvap, is sufficiently low to be treated as an ideal gas. Our proposed approach is to calculate by mutating the partial charges on the solute molecule from their vacuum-derived values to implicitly polarized partial charges in the solution as the solute is coupled to the solvent. Similarly, as in our previous work,[43] which accounted for polarization explicitly and on-the-fly, we account for the cost of self-polarization in polarizing the solute by sampling the potential energy differences between states using double-sided perturbations in each Coulomb window using different charge sets in the forward and reverse perturbations. The implicitly polarized free energy algorithm for calculating ΔGhyd of the solute involves simple modifications to standard procedures.[48] In the following two sections, we first describe how we generate FF parameters and partial charges in a vacuum and in implicit solvent. We then describe the approach for calculating the effects of the solute polarization on ΔGhyd.

Force-Field Parameterization

We used the electronic structure software Spartan’18[49] to screen all conformers of each solute in vacuum, and identify their 20 lowest energy conformers using the semi-empirical PM6. We then screened for the lowest energy conformer using the HF/6-31G* level of theory. Finally, for small molecules (less than 7 C/N/O atoms), the lowest energy conformer was further optimized at the MP2/aug-cc-pVTZ level of theory, and for larger molecules, the ωB97X-D/aug-cc-pVTZ level was used. For larger molecules, the time and memory requirements for MP2 were severe, which is why we opted to use the range-separated, dispersion-corrected hybrid-generalized gradient approximation (GGA) density functional theory (DFT) model, ωB97X-D. Intramolecular parameters were generated for GAFF(2.11) using the AmberMD tool package Antechamber,[50] which was also used to generate the RESP and AM1-BCC partial charges. HORTON 2.1.1[51] was used for generating minimum basis iterative Stockholder (MBIS) charges using an ultrafine grid. Gaussian16[44] was used for all electronic structure calculations for the solute electron densities with the exception of AM1-BCC, which used Antechamber’s program, sqm. We then converted the topology files to GROMACS files using the python program ACPYPE.[52] Antechamber’s built-in use of OpenEye software[53] automatically averages equivalent atom-type partial charges. We used a python script to replicate the averaging procedure used in Antechamber since this is not available in the HORTON program.

Alchemically Polarized Charges

In the alchemical free energy changes, the solute molecule’s interactions with its environment are decoupled over a series of λ windows {1 = λ0, λ1, ..., λ = 0}. We incorporate polarization in the fixed charge simulations by using fixed partial charges within each window with magnitudes calculated according to Similar to the OTFP methodology, the APolQ method uses We note that this is not the same as performing a mutation, which would use different charges at each window’s endpoint in eq or 5. An undesirable consequence of using different partial charges in either eq or 5 is that the difference in Hamiltonian’s between two states must also account for the change in the intramolecular contribution to the Hamiltonian, rather than requiring only the difference in intermolecular contributions. Our method both avoids this extra step and simultaneously accounts for the cost of self-polarization. As an example of the difference between a conventional alchemical free energy calculation and that of our APolQ method, we compare the equations used to calculate the free energy perturbation according to a simple forward Zwanzig perturbation expression. For a linearly scaled Coulombic free energy contribution, this iswhere A is the Helmholtz free energy and 0 is the set of solute partial charges, which are unchanged in all windows. N is the number of Coulomb windows, is the potential energy, and denotes the ensemble average of the quantity x in window k. In the APolQ method, the solute partial charges change across the Coulombic windows, and the free energy is calculated according towhere both the scaling parameter λ and the partial charges, , depend on the Coulomb window. For a perturbation between any two states (windows), the same set of partial charges is used in both, and only the λ parameter changes. For example, when calculating the free energy change for two given states a and b, the forward direction perturbation will use the set of charges for state a in both state points. The perturbation in the reverse direction will necessarily use the partial charges for state b in both states. The average of these two perturbations has been shown previously[43] to be equivalent to doing two perturbations characterized by both using the same partial charges, which are the average of the two sets. The relevance to the cost of self-polarization is discussed in Section . Electron density calculations are performed in a vacuum with a high level of theory and a large basis set to avoid intrinsic overpolarization of the solute often associated with HF/6-31G* calculations. We also calculate the partial charges with the same basis set in the presence of the solvent via the use of a PCM (with default Gaussian 16 parameters) to implicitly generate the effects of a solvent on the electron density. We use MP2 for all calculations and considered the two basis sets aug-cc-pVTZ and cc-pVTZ based on their good results for polarizabilities and dipole moments, as reported by Hickey et al.(54) For the PCM calculations, we used the respective static dielectric constants of the TIP3P, SPC/E, and OPC3 water models (ε0 = 94.0,[47,55] ε0 = 70.7,[56,57] and ε0 = 78.4[47]). We note that the seemingly large differences in dielectric constants has very little effect on the partial charges produced and seems to differ only in the partial charge’s third or fourth decimal place. The relative insensitivity of solvation free energy to dielectric constants in the range of water models has been noted previously by Fennell et al.(57) We used the experimental (default) value for the dynamic dielectric constant of water available in Gaussian 16.[44]Figure is a schematic description of our method, which also compares it to a typical GAFF free energy simulation; the latter technically requires two corrections to its endpoints, which are not required in our method.

Figure 1

Hydration free energy as the difference between a solute with polarized partial charges in solution and unpolarized partial charges in a vacuum. This example uses MP2/aug-cc-pVTZ for both partial charge calculations; however, both theory and basis set can be changed.

Cost of Self-Polarization

There is a cost incurred when polarizing a solute molecule which, according to linear response theory, is equal to half the potential energy of interaction between the solvent and solute.[58,59] It is possible to account for this cost of polarizing the solute molecule in solution by using half-polarized partial charges. This is done in the IPolQ-Mod method,[60] which averages RESP partial charges calculated in a vacuum and implicit solvent and uses the same partial charges in all windows. Since using half-polarized partial charges accounts for the cost of self-polarization[38,59,61] (as in, e.g., IPolQ-Mod) and double-sided perturbations using APolQ are equivalent to using half-polarized partial charges for the given perturbations, this means that APolQ accounts for the cost of self-polarization in the free energy perturbations. Importantly, however, unlike actually using half-polarized charges, which is the approach of IPolQ-Mod, both endpoints of the perturbation are appropriately polarized for their particular environments. A previous work by Jia has noted that incorrect polarization intrinsic to IPolQ-Mod methods in the end states of the alchemical free energy change leads to incorrect Coulombic forces and energies between the solute and solvent.[40] There are additional consequences since all 1–4 and greater intramolecular Coulombic interactions are also incorrectly polarized in all but perhaps one window when using the same set of partial charges in all windows. Over the course of all windows, when using the APolQ method, the solute interacts first with strictly vacuum derived partial charges in the vacuum phase, retaining vacuum charges as LJ interactions are coupled to the solution environment. As the Coulomb interactions are then coupled, the partial charges increase their polarity until the solute exhibits completely polarized partial charges in the fully coupled solution phase. Double-sided perturbations account for the cost of self-polarization, and the transition from vacuum to polarized partial charges accounts implicitly for the work involved in organizing the solvent around the fully polarized solute molecule. A common double-sided perturbation method is the Bennett acceptance ratio (BAR)[62] method. In the following, we used the multistate BAR method (MBAR) and have shown previously[43] that MBAR equivalently accounts for self-polarization similar to BAR and the simpler Zwanzig perturbation methods.

Systems Studied and Simulation Details

We considered the SPC/E, TIP3P, and OPC3 water models for all simulations employing MBIS-derived partial charges. We used the TIP3P water model in the cases of RESP and AM1-BCC calculations. All calculations were performed at T = 298.15 K and P = 1 bar. We performed calculations for 45 molecules containing C/H/N/O atoms of varying complexity from the FreeSolv database, as shown in Figure . Twenty-eight molecules were previously studied using our OTFP method.[43] We added to this test set four linear alcohols and four alkylamines, in addition to several small cyclic molecules containing nitrogen or oxygen. For the molecules chosen, 23% of FreeSolv’s calculations using GAFF(1.7) and AM1-BCC partial charges give results within the experimental error, the same percentage shown by AM1-BCC over the entire FreeSolv database.

Figure 2

Molecules studied.

Molecules studied. All ΔGhyd calculations were performed in the NVT ensemble. We first determined the density for the OPC3 water model by following the protocol used in our previous study[43] for the SPC/E and TIP3P water models. This entailed performing 10 independent NPT simulations at T = 298.15 K and P = 1 bar for 2000 water molecules. For each run, we used the GROMACS (version 2016.3) “insert-molecules” function[63] to insert 2000 OPC3 water molecules into a box of length 4.3 nm. The system’s energy was then minimized using the steepest descent algorithm, terminating when a maximum force of 100 kJ mol–1 nm–1 or when 20,000 steps were reached. In all cases, the minimization converged prior to the maximum number of steps being reached. Initial velocities were then assigned from a Maxwell–Boltzmann distribution at the set simulation temperature. A short 1 ns NVT ensemble run equilibrated the system using a time step of 2 fs and a stochastic Langevin leapfrog integrator.[64] Long-range electrostatic interactions were handled using the particle mesh Ewald (PME) method with a real space cutoff of 1.2 nm, tolerance of 10–6, order of 12, and Fourier spacing of 0.1 nm. A Lennard-Jones cutoff distance of 1.2 nm was employed with energy and pressure tail corrections and a neighbor list with a cutoff of 1.2 nm. The neighbor list was updated every 10 steps. We then performed an NPT ensemble equilibration for 10 ns using a Berendsen thermostat with a thermo-coupling parameter of 0.1 and a Berendsen barostat with a pressure-coupling parameter of 2.0, compressibility of 4.5 × 10–5, and reference pressure of 1 bar. We then performed an NPT ensemble production run for 15 ns using a stochastic Langevin leapfrog integrator with a thermo-couple of 1.0 and a Parrinello–Rahman barostat with a pressure couple of 5.0, compressibility of 4.5 × 10–5, and reference pressure of 1 bar. The average cubic box lengths were determined to be 3.9133 nm (TIP3P), 3.9240 nm (SPC/E), and 3.9168 nm (OPC3). ΔGhyd calculations were initiated by assigning unique initial random number seeds for each λ window. Positions of the molecules were then randomly assigned, and energy minimization followed the same procedure as in the above procedure for the NPT simulations. A 1 ns NVT ensemble equilibration then followed, and initial velocities were uniquely assigned according to the Maxwell–Boltzmann distribution. Again, the same procedures were used as in the NVT simulation previously described. Equilibration was then followed by a 2 ns production run, which used, as its initial configuration, the final configuration of the equilibration stage. The protocol for the production stage was the same as for the equilibration stage. Free energy calculations were evaluated using the multistate Benedict acceptance ratio (MBAR)[65] as implemented in the pymbar/alchemical analysis software.[66] First, the solute’s Coulombic interactions were decoupled from its solution environment followed by decoupling of the LJ interactions using λcoulomb = [0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0] and λLJ = [0.0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.5 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1.0]. We used a soft-core Lennard-Jones potential with GROMACS parameters sc-alpha = 0.5, sc-r-power = 6, and sc-power = 1. Potential energies of all λ windows were sampled every 100 steps for post-processing with MBAR.

Results and Discussion

We performed APolQ calculations for ΔGhyd using MBIS partial charges for all three water models and RESP using the TIP3P water model. We used partial charges derived from electron densities calculated using the MP2 theory in conjunction with cc-pVTZ and with aug-cc-pVTZ basis sets. We also performed “classical” ΔGhyd calculations using the AM1-BCC partial charge methods with TIP3P for comparison purposes since we used GAFF(2.11) and the results in FreeSolv are for GAFF(1.7). The results are shown as parity plots in comparison with experiments in Figures and 4. Deviations from experiment and statistical measures of the agreement of the predicted and experimental results are shown in the figures and are summarized in Table .

Figure 3

Figure 4

Parity plots comparing calculated and experimental results for ΔGhyd at the MP2/aug-cc-pVTZ level of theory at T = 298.15 K and P = 1 bar for the 45 molecules of this study. Panel (a) shows results using MBIS partial charges with the TIP3P water model. Panel (b) shows results using MBIS partial charges and the SPC/E water model. Panel (c) shows results using MBIS partial charges with the OPC3 water model. Panel (d) shows results using RESP partial charges and the TIP3P water model. Blue stars indicate results whose absolute error e satisfies , where σexpt and σsim are the respective experimental and simulation uncertainties; red squares indicate outliers (e>5σexpt); and circles indicate results satisfying . Experimental error bars are shown, and simulation error bars (calculated from MBAR) are within the symbol sizes.

Table 1

Comparison of ΔGhyd Results (in kJ mol–1) for the 45 Species Studied in Comparison with Experimental Valuesa

charge method	basis	water model	RMSD	AAD	AD	ρ_s
MBIS	cc-pVTZ	TIP3P	6.2(4.6,7.7)	4.5(3.3,5.8)	2.1(0.3,3.9)	0.78(0.56,0.92)
MBIS	aug-cc-pVTZ	TIP3P	6.4(4.7,8.1)	4.5(3.3,5.9)	–0.8(−2.8,1.1)	0.77(0.55,0.92)
MBIS	cc-pVTZ	SPC/E	6.3(4.8,7.8)	4.8(3.6,6.1)	2.9(1.2,4.6)	0.81(0.62,0.94)
MBIS	aug-cc-pVTZ	SPC/E	6.5(4.6,8.2)	4.5(3.2,6.0)	–0.5(−2.5,1.4)	0.82(0.61,0.94)
MBIS	cc-pVTZ	OPC3	6.2(4.7,7.6)	4.7(3.5,5.9)	2.7(1.1,4.4)	0.81(0.61,0.93)
MBIS	aug-cc-pVTZ	OPC3	6.3(4.5,8.0)	4.4(3.1,5.8)	–0.0(−2.0,1.9)	0.81(0.62,0.93)

RESP	cc-pVTZ	TIP3P	7.5(5.1,9.6)	5.3(3.9,7.0)	4.2(2.3,6.1)	0.63(0.33,0.83)
RESP	aug-cc-pVTZ	TIP3P	7.4(5.5,9.2)	5.5(4.2,7.1)	2.4(0.3,4.6)	0.63(0.34,0.82)

OTFP
MBIS	cc-pVTZ	SPC/E	6.5(4.5,8.3)	4.6(3.3,6.1)	–0.3(−2.3,1.7)	0.84(0.67,0.94)

AM1-BCC
this work		TIP3P	7.6(6.1,9.1)	6.3(5.0,7.6)	3.0(0.9,5.2)	0.81(0.63,0.91)
FreeSolv[10]		TIP3P	7.3(5.7,9.0)	5.9(4.6,7.2)	2.2(0.1,4.3)	0.94(0.88,0.97)

Riquelme[36] (MBIS)	BLYP/def2-TZVP	SPCE	9.9(7.8,12.0)	7.7(5.9,9.6)	2.7(−0.3,5.6)	0.74(0.51,0.87)

All results use the MP2 theory to calculate the electron densities. On-the-fly-polarization (OTFP) is our previous method. RMSD is the root mean square deviation, AAD is the average absolute deviation (prediction – experiment), AD is the average deviation, and the Spearman rank correlation coefficient (ρ)[70] is a statistical measure comparing simulation and experimental results ΔGhyd values.[70] The confidence intervals (CI) are described in the text.

Parity plots comparing calculated and experimental results for ΔGhyd at the MP2/cc-pVTZ level of theory at T = 298.15 K and P = 1 bar for the 45 molecules of this study. Panel (a) shows results using MBIS partial charges with the TIP3P water model. Panel (b) shows results using MBIS partial charges and the SPC/E water model. Panel (c) shows results using MBIS partial charges with the OPC3 water model. Panel (d) shows results using RESP partial charges and the TIP3P water model. Blue stars indicate results whose absolute error e satisfies , where σexpt and σsim are the respective experimental and simulation uncertainties; red squares indicate outliers (e>5σexpt); and circles indicate results satisfying . Experimental error bars are shown, and simulation error bars (calculated from MBAR) are within the symbol sizes. Parity plots comparing calculated and experimental results for ΔGhyd at the MP2/aug-cc-pVTZ level of theory at T = 298.15 K and P = 1 bar for the 45 molecules of this study. Panel (a) shows results using MBIS partial charges with the TIP3P water model. Panel (b) shows results using MBIS partial charges and the SPC/E water model. Panel (c) shows results using MBIS partial charges with the OPC3 water model. Panel (d) shows results using RESP partial charges and the TIP3P water model. Blue stars indicate results whose absolute error e satisfies , where σexpt and σsim are the respective experimental and simulation uncertainties; red squares indicate outliers (e>5σexpt); and circles indicate results satisfying . Experimental error bars are shown, and simulation error bars (calculated from MBAR) are within the symbol sizes. All results use the MP2 theory to calculate the electron densities. On-the-fly-polarization (OTFP) is our previous method. RMSD is the root mean square deviation, AAD is the average absolute deviation (prediction – experiment), AD is the average deviation, and the Spearman rank correlation coefficient (ρ)[70] is a statistical measure comparing simulation and experimental results ΔGhyd values.[70] The confidence intervals (CI) are described in the text. The quality of the agreement with experiment of a theoretical approach can be assessed by various empirical and statistical measures. All raw simulation results can be found in the Supporting Information, which enables alternative measures to be calculated. For instance, we do not include the common Pearson R2 statistic due to its well-known deficiencies[67,68] in comparing predicted results with the experiment. Several measures are shown in the figures and are summarized in Table . The 95% CIs for RMSD, absolute average deviation (AAD), and Spearman ρs are obtained by means of empirical bootstrapping using 104 resamples with replacement (e.g., Chernick[69]). The confidence intervals (CI) for the AD were obtained using the Student t distribution at the 95% level.

Effect of the Basis Set

Figures and 4 show parity plots against experimental data using the cc-pVTZ and aug-cc-pVTZ basis sets, respectively, for all water models. We first remark that the RMSD and AAD values for both charge methods are very similar. However, the AD values and their corresponding 95% confidence intervals (CIs) differentiate between the effects of the two basis sets. If zero is not contained within the CI, the statistical inference is that the hypothesis that predicted and experimental data are in agreement for the method is rejected at the 95% confidence level, which is evidence that the procedures underlying the FF generation yield an overall unsatisfactory prediction of the experimental values. The results in the table indicate that for all three water models and both charge methods, all methods using the cc-pVTZ basis set are unsatisfactory. However, for MBIS charges, the addition of diffuse functions in the aug-cc-pVTZ basis set yields AD CIs containing zero. This is not the case for RESP charges, whose AD CIs using both basis sets do not contain zero.

Comparison with the OTFP Method

We recently developed a method that explicitly accounts for polarization during a MD simulation which we termed on-the-fly-polarization (OTFP). The test set used to validate the OTFP methodology includes 28 of the molecules shown here. The OTFP method using MBIS partial charges partitioned from an electron density calculated at the MP2/cc-pVTZ level of theory and basis set produced the best results. We have also calculated the 17 additional molecules using this approach in the present study and are also shown in Table . The OTFP results are very similar to the APolQ methods, which also use MBIS. Interestingly, despite using the cc-pVTZ basis set, the OTFP AD and its CI is similar to the MBIS APolQ methods that use diffuse functions with cc-pVTZ. The RMSD, AAD, and ρ for the MBIS OTFP and MBIS APolQ methods are nearly exactly the same. The computational advantage in using the APolQ method over the OTFP approach is that, apart from the additional requirement for calculating the solvent dielectric constant, only two partial charge calculations are required for APolQ, whereas OTFP requires multiple such calculations in each Coulomb window. We finally note that both APolQ and OTFP can in principle be applied to pure and mixed solvents and solutions at finite solute concentrations, although only OTFP is capable of handling non-homogeneous environments.

Comparison with Post-Simulation Self-Polarization Calculations

Methods to correct fixed-charge simulation free energy calculations have been under investigation for several years.[36−41] Here, we compare results with one such method that also used MBIS partial charges. In the method of Riquelme et al., the cost of self-polarization was calculated post-simulation and added onto the ΔGhyd calculated using standard alchemical free energy change with molecular dynamics. We show their results in the final row of Table . We note that they used partial charges derived from an electron density calculated using density functional theory with the generalized gradient approximation (GGA) exchange-correlation functional (BLYP), with a triple ζ basis set that did not include additional diffuse functions. For the 45 molecules used in this test set, our proposed method, as well as OTFP, both of which implicitly include the cost of both self-polarization and reorganization of the solvent due to changing polarity, has a lower AAD by approximately 2 kJ mol–1. The AD of Riquelme et al. was highly positive, along with all other methods that were either empirically parameterized (AM1-BCC) or calculated an electron density without using a basis set that had diffuse functions added. It is therefore possible that the Riquelme et al. results may be improved by the addition of diffuse functions. As mentioned above, accounting for the cost of polarizing a molecule is important, but not the only contribution to free energy when polarizing a molecule. The APolQ method, while accounting for the cost of self-polarization, does so as an average over many solute conformations with the solvent correspondingly adjusting around the solute. In doing so, APolQ also avoids overpolarized 1–4 and larger intramolecular interactions in windows where there should be no (or reduced) polarization in the charges. Accounting for the cost of self-polarization for a single geometry post-simulation does not account for the overpolarized intramolecular 1–4 and greater interactions in all windows and geometries of the solute during the course of the simulations. The APolQ method accounts for polarity in a way, which keeps intramolecular forces as well as intermolecular Coulombic forces and energies appropriately polarized during alchemical free energy simulations and from our results appears to improve predictions significantly.

Effect of the Water Model and Comparisons with AM1-BCC

Table and Figures and 4 indicate that all APolQ results yield a lower AAD and more predictions within experimental uncertainty () than those shown by AM1-BCC. All APolQ results using MBIS partial charges show lower AAD, AD values closer to 0, and higher ρ than APolQ using RESP, although their CIs overlap. Of the MBIS results, predictions using the OPC3 water model and aug-cc-pVTZ basis set have the smallest AAD and the best AD (closest to zero) of all results. Shown in Figure are the MBIS results for all three water models using partial charges derived from electron densities calculated with diffuse functions added to the basis sets (aug-cc-pVTZ). All three water models correlate well with each other, but TIP3P and SPC/E are closer to each other than either is with OPC3. The Spearman rank coefficients ρ = 0.97 indicate strong correlation of both the OPC3 and SPC/E results with those of TIP3P. The finding that all water models give similar results is an indication of the method’s robustness because APolQ is able to perform well for different water models without requiring reoptimization of the underlying GAFF parameters.

Figure 5

Parity plots comparing water models SPC/E and OPC3 with TIP3P for ΔGhyd. Dashed red line marks 2 kJ mol–1 horizontal and vertical offset from the parity line.

Influence of the Molecular Structure

Figure shows a comparison of the AAD results for a selection of structural features of the solute molecules. We compare small, large, cyclic, small cyclics, rotatable (non-cyclic), nitrogen-containing, nitrogen in a cyclic ring, oxygen-containing, alkylamines, and linear alcohols. With the exception of RESP, all methods struggle with nitrogen-containing cyclic rings. MBIS partial charge methods struggle with small cyclic rings, and to a lesser degree, RESP does as well. AM1-BCC has the lowest AAD for small cyclic rings. It is not surprising that fixed atom-centered charge FFs struggle with small cyclics because either off-center partial charges or atom-centered multipoles are needed to allow the dipole to leave the plane. (It has also been shown by Swope et al.(37) that, for benzene, the quadrupole moment contributes more to polarization than does the dipole moment. This likely also occurs for other small cyclic molecules.) Since such approaches require modification of the original fixed-charge FF, they are outside of the scope of this work.

Figure 6

ΔGhyd AAD comparison plot. Error bars are calculated using percentile bootstrapping at the 95% significance level.

ΔGhyd AAD comparison plot. Error bars are calculated using percentile bootstrapping at the 95% significance level. AM1-BCC is the only partial charge method to yield the same approximate AAD for total AAD, small molecules, large molecules, and both cyclic and non-cyclic molecules. For all other methods, small and non-cyclic molecules show lower AAD values than the total AAD, as opposed to large and cyclic molecules that show larger AAD values than the total AAD. MBIS methods using TIP3P and OPC3 water models yield approximately equal or lower AAD than AM1-BCC for large and cyclic molecules. All methods show AAD values lower than those of AM1-BCC for oxygen-containing molecules. Of the MBIS methods, the TIP3P water model using the aug-cc-pVTZ basis set in the electron density calculations yields the lowest AAD for small cyclic molecules. MBIS methods also perform consistently well for linear alcohols and alkylamines.

Non-Coulombic FF Parameters

The performance of FFs in the prediction of ΔGhyd depends on the Coulombic contribution and the LJ and intramolecular contributions. In this work, for the GAFF(2.1), we have modified the Coulombic contribution by incorporating a polarization algorithm in conjunction with the partial charge calculations, while leaving the LJ and intramolecular parameters unchanged. These parameters were originally developed by empirical modifications to fit experimental data and QM calculations on the basis of the partial charges being RESP HF/6-31G* and/or AM1-BCC. When using different partial charges, these parameters are no longer optimal and should be refitted. Optimizing the LJ, and in particular the torsion parameters, will likely lead to additional improvements in our ΔGhyd predictions. This has recently been shown to be the case for RESP partial charges.[71] Lennard-Jones parameter optimization is an active area of research.[72,73]

Conclusions and Recommendations

We have developed a novel approach (APolQ) for implementing polarization in the calculation of the solvation free energy, which transitions a solute molecule from having unpolarized partial charges in the decoupled (vacuum) state to polarized partial charges in the fully coupled (solution) state. The APolQ approach for implementing polarization retains the computationally efficient fixed-charge structure of the solute and requires only two partial charge calculations (in a vacuum and implicit solvent) prior to performing a standard MM simulation. The vacuum calculation need only be performed once and is used in all solvents. The only additional calculation required for each solvent is the calculation of the implicitly polarized partial charges. This is thus a simply implemented methodology that is transferable to any solvent. We have applied the APolQ method to the calculation of hydration free energies of 45 organic solute molecules of various complexities using MBIS and RESP partial charges. We have also applied our methodology to SPC/E, TIP3P, and OPC3 water models. All MBIS methods using diffuse functions in the basis set (aug-cc-pVTZ) contain zero within the 95% confidence interval of the average deviation (AD) from experiment, whereas AM1-BCC and APolQ RESP and MBIS using cc-pVTZ do not. In the latter cases, this indicates rejection of the hypothesis that the predictions agree with the experimental values. We have demonstrated that APolQ in conjunction with partial charges calculated at a high level of ES theory/basis set yields superior results to the standard AM1-BCC approach, despite the fact that the latter uses fully optimized LJ and intramolecular parameters. The APolQ results can potentially be further improved by refitting these parameters. APolQ can in principle be applied to any homogeneous system (e.g., pure or mixed solvents) and such solutions at finite solute concentrations, requiring only the solvent’s dielectric constant (from experimental data or simulation). The APolQ algorithm is modular, allowing it to be easily improved with more sophisticated ES methods and partial charge partitioning methodologies. It is both simple and fundamentally more rigorous than the commonly used approach of keeping partial charges fixed in all alchemical windows.

49 in total

1. Development and testing of a general amber force field.

Authors: Junmei Wang; Romain M Wolf; James W Caldwell; Peter A Kollman; David A Case
Journal: J Comput Chem Date: 2004-07-15 Impact factor: 3.376

2. Prediction of hydration free energies for the SAMPL4 data set with the AMOEBA polarizable force field.

Authors: Francesco Manzoni; Pär Söderhjelm
Journal: J Comput Aided Mol Des Date: 2014-03-01 Impact factor: 3.686

3. Benchmarking quantum chemical methods for the calculation of molecular dipole moments and polarizabilities.

Authors: A Leif Hickey; Christopher N Rowley
Journal: J Phys Chem A Date: 2014-05-09 Impact factor: 2.781

4. Solubility of KF and NaCl in water by molecular simulation.

Authors: E Sanz; C Vega
Journal: J Chem Phys Date: 2007-01-07 Impact factor: 3.488

5. Computer-Aided Drug Design Methods.

Authors: Wenbo Yu; Alexander D MacKerell
Journal: Methods Mol Biol Date: 2017

6. Evaluating Force-Field London Dispersion Coefficients Using the Exchange-Hole Dipole Moment Model.

Authors: Mohamad Mohebifar; Erin R Johnson; Christopher N Rowley
Journal: J Chem Theory Comput Date: 2017-11-30 Impact factor: 6.006

7. Small molecule hydration free energies in explicit solvent: An extensive test of fixed-charge atomistic simulations.

Authors: David L Mobley; Christopher I Bayly; Matthew D Cooper; Michael R Shirts; Ken A Dill
Journal: J Chem Theory Comput Date: 2009-02-10 Impact factor: 6.006

8. The Use of Anisotropic Potentials in Modeling Water and Free Energies of Hydration.

Authors: Panagiotis G Karamertzanis; Paolo Raiteri; Amparo Galindo
Journal: J Chem Theory Comput Date: 2010-05-11 Impact factor: 6.006

9. Calculation of the free energy of polarization: quantifying the effect of explicitly treating electronic polarization on the transferability of force-field parameters.

Authors: Daan P Geerke; Wilfred F van Gunsteren
Journal: J Phys Chem B Date: 2007-05-18 Impact factor: 2.991

10. Deriving a Polarizable Force Field for Biomolecular Building Blocks with Minimal Empirical Calibration.

Authors: Koen M Visscher; Daan P Geerke
Journal: J Phys Chem B Date: 2020-02-19 Impact factor: 2.991