Literature DB >> 32073849

Deriving a Polarizable Force Field for Biomolecular Building Blocks with Minimal Empirical Calibration.

Abstract

Force field parametrization involves a complex set of linked optimization problems, with the goal of describing complex molecular interactions by using simple classical potential-energy functions that model Coulomb interactions, dispersion, and exchange repulsion. These functions comprise a set of atomic (and molecular) parameters and together with the bonded terms they constitute the molecular mechanics force field. Traditionally, many of these parameters have been fitted in a calibration approach in which experimental measures for thermodynamic and other relevant properties of small-molecule compounds are used for fitting and validation. As these approaches are laborious and time-consuming and represent an underdetermined optimization problem, we study methods to fit and derive an increasing number of parameters directly from electronic structure calculations, in order to greatly reduce possible parameter space for the remaining free parameters. In the current work we investigate a polarizable model with a higher order dispersion term for use in biomolecular simulation. Results for 49 biochemically relevant molecules are presented including updated parameters for hydrocarbon side chains. We show that our recently presented set of QM/MM derived atomic polarizabilities can be used in direct conjunction with partial charges and a higher order dispersion model that are quantum-mechanically determined, to freeze nearly all (i.e., 132 out of 138) nonbonded parameters to their quantum determined values.

Entities: Chemical Disease Gene

Mesh：

Substances：
Hydrocarbons

Year: 2020 PMID： 32073849 PMCID： PMC7061328 DOI： 10.1021/acs.jpcb.9b10903

Source DB: PubMed Journal: J Phys Chem B ISSN： 1520-5207 Impact factor: 2.991

Introduction

The corner stone of many currently available molecular simulation approaches comprises the underlying models that describe and predict physicochemical properties and behavior of the systems of interest.[1,2] Molecular-mechanics (MM) based force fields are used to model electronic interactions implicitly by using simple but powerful functions that attempt to effectively describe nonbonded interactions such as dispersion, Coulomb interactions, and exchange repulsion.[3] The parameters in such a force field can be either derived from electronic structure calculations, linking their values directly to behavior on the electron level, or they can be calibrated in an optimization approach.[4−6] For many currently available force fields, a mixed strategy has been used in which only part of the parameters are directly derived from quantum-mechanical (QM) calculations and others are fitted or scaled on the basis of thermodynamic properties of, e.g., small-molecule analogues.[4,5,7] Even with current computing resources, force field calibration remains laborious, relying on manual calibration[7] and/or (semi)automated methods that perform genetic[8] or gradient descent optimization strategies.[9] Although state of the art and of clear value to the research field, nearly all automated methods may well have problems in finding physically relevant minima, given the complexity of the underlying optimization problem. Recently, this has motivated others and us to propose strategies and report attempts to directly derive force field parameter sets from QM electronic structure calculations,[10−14] allowing us to derive near ab initio force fields for molecules. Such force fields could in principle be directly used in molecular dynamics simulations (or equivalent force field based methods).[10,12−16] Additionally, we see a role for such ab initio parametrization to serve as seeding points for automated calibration, providing chemically sensible starting points that could be refined from that point. Work by Cole et al.[10] illustrated well that Coulomb and van der Waals nonbonded parameters can be directly obtained from electronic structure using Atoms In Molecules (AIM) approaches to decompose molecular density contributions into atomic ones.[17] As noted recently by Mohebifar et al.,[11] deriving estimates for van der Waals parameters from QM as a starting point for calibration can be especially relevant because these QM estimates indicate that current force fields sample a different part of parameter space and systematically overestimate dispersion C6 constants. Properly covering van der Waals parameter space is of particular interest because the recently increased interest in simulating disordered proteins indicates that balancing solute–solute and solute–water van der Waals interactions may be of vital importance to correctly describe correct folding states.[18−20] We build onto these works and, considering that higher order dispersion can contribute up to 40% of dispersion energies,[15,21] we recently proposed a dispersion potential that not only includes a C6 but also a higher order C8 term.[14,16] Use of this potential was verified in a simulation study on a series of hydrocarbons, for which we directly derived bonded and nonbonded force field parameters from QM.[14] To determine atomic C6 and C8 coefficients, we used Exchange Hole Dipole Moment (XDM)[22] calculations on Iterative-Hirshfeld (Hirshfeld-I)[23] partitioned molecular densities. In this manner atomic dispersion pair coefficients (C6 and C8) can directly be derived.[11,22] In combination with a C6 and C8 potential, we found that compared to using a C12 function to mimic exchange repulsion,[3] employing a C11 repulsive potential instead gave improved results for alkanes.[14]C11 or other types of pair constants can be directly derived from the zero-point van der Waals energies of corresponding atom pairs, as defined by the effective radii of the atoms of interest. To estimate these radii, we used in our alkane study a preoptimized polarizability based scaling function.[24] This effectively amounts to the Tkatchenko–Scheffler approach,[25] which scales reference atomic radii by the ratio between the partitioned atomic volumes and corresponding atomic reference volumes, similar to the method employed by Cole et al.[10] Recent work of Kooi et al.[26] illustrates that higher order dispersion coefficients (up to C10) can be closely approximated without density distortion. This indicates that the modeling of electronic polarization and dispersion effects can be decoupled. Thus, a force field in which density distortion is explicitly included through (atomic and/or off-site) polarizabilities may well use dispersion terms with van der Waals constants that are directly determined from quantum calculation. We recently showed that effective atomic polarizabilities for molecular simulation can be efficiently fitted from combined QM/MM calculations on solutes with explicit polarizing solvent shells.[16,27,28] Hence, explicitly including electronic polarization effects into a force field does not necessarily increase the number of free parameters and may even reduce it. In addition, it can greatly facilitate parametrization of effective Coulomb potentials because charge fitting can be done in vacuo and may not need to be corrected for missing self-polarization and electron structure distortion effects as described by others.[29] The aim of the current work is to obtain a starting point for a force field for use in protein simulation that includes both explicit polarization effects and higher order dispersion. We present such a set of force field parameters for a series of amino-acid small-molecule analogues. Our force field comprises 138 nonbonded parameters, of which 132 constants (i.e., all dispersion and electrostatic parameters) are quantum-mechanically determined. The remaining parameters were calibrated on the basis of experimental data for 49 small-molecule analogues. Thus, the largest part of the force field constants (including those for the bond-stretch and angle-bending terms) are estimated in an ab initio manner, allowing us to solve a well-determined optimization problem for the remaining (6) parameters. These are effectively defined by the van der Waals radii of free atoms, which are scaled by an AIM-derived volume ratio term.[10] The obtained force field parameter set can be directly used in classical molecular dynamics (MD) simulation that includes higher order dispersion and electronic polarization effects, or alternatively it can be used as starting point for refinement with a fully automated force field optimization method.

Methods

Electrostatic Model

In this work we employ the Charge-On-Spring (COS)[30−32] polarizable model to explicitly include electronic polarization during MD simulations. The charged massless COS particles are attached with a spring to their polarizable center, which is often a heavy atom of the molecular system. Induced atomic dipole moments (μ⃗) are thus expressed as the displacement (Δr⃗) of the COS particles from their associated polarizable center, given a static charge qpol (of −8.0 e in the current work) on the COS (which is counterbalanced by adding −qpol to the charge of its associated polarizable center). Note that the COS displacement will depend on qpol, the local electric field E⃗ at polarizable center i, and its (isotropic) polarizability α, eq . Local electric fields are evaluated between the simulation steps in an iterative manner (self-consistent optimization) as the polarization of the environment around an atom directly influences these electric fields. Values for these polarizabilities were previously determined by us for all chemical moieties and atom types of interest in this work (Table S3), using a QM/MM-based fitting approach.[28] Because of their mutual dependencies, the displacements Δr⃗ are determined via self-consistent field (SCF) optimization until convergence.[32] The E⃗’s are computed using simple Coulomb point potentials that sum over the set of static charges and charges on the COS particles (q’s at distance r from atom i), eq . These routines are build into the GROMOS molecular dynamics simulation package, using isotropic polarizabilities for all polarizable sites.[32,33] To be fully self-consistent, electric fields have to be calculated on the positions of the COS particles;[34] however, this is more expensive in terms of the number of pair interactions to calculate, and therefore, we always obtain an estimate at the location of the polarizable center. All COS charges are assigned a value of −8.0 e as we found that a sufficiently large charge is required to minimize the inherent error in this approximation.[34]

Determining Dispersion and Exchange Repulsion Parameters

To compute atomic dispersion parameters, we made use of a similar small-molecule compound library as utilized in our recent work on QM/MM fitting of atomic polarizabilities for amino-acid small-molecule analogues, Table . Initial molecular structures were downloaded from the Automated Topology Builder (ATB, all geometry optimized at the B3LYP/6-31G* level of theory).[35] Using NWChem version 6.8, these structures were further optimized at the B3LYP/aug-cc-pVTZ level of theory and the resulting wave files were stored.[36−41] These were subsequently converted with MOLDEN2AIM, resulting in wfx wave function files that could be read by postg for post processing.[42−44] Electron densities were partitioned into atomic contributions using the iterative Hirshfeld method in Horton version 2.1,[14,23,45,46] writing out the Hirshfeld weights to file for each atom. All subsequent XDM dispersion calculations were performed with an in-house adapted version of postg, which allows for reading in of integration grids and accompanying Hirshfeld weights.[14] The resulting pair coefficients (C6 and C8) on the diagonal of the coefficient matrix were subsequently stored and averaged over so-called van der Waals types. Off-diagonal elements of the pair coefficient matrix are currently not used, as their application would require calculations where each van der Waals type is present, which is simply unfeasible. Instead, heterogeneous pair coefficients (C6 and C8 in eq , with r the distance between an atom pair i and j) were obtained using the same combination rules as specified in our previous study on liquid alkanes.[14]

Table 1

Small Molecules (Classified by Their Chemical Moiety; One per Molecule) As Used in This Work for Evaluating Our Force Field Parameter Set in Terms of Reproducing Experimental Values of Properties in Molecular Dynamics Simulationa

compound		class	ϵ₀	T(ΔH_vap)	T(ρ)
propane	x	alkane	1.80	231	231
butane	x	alkane	1.77	273	273
pentane		alkane	1.84	298	293
hexane	x	alkane	1.88	298	298
heptane		alkane	1.91	298	298
octane		alkane	1.95	298	298
nonane		alkane	1.96	298	298
isobutane	x	alkane	1.83	261	261
isopentane		alkane	1.85	298	293
3-methylpentane		alkane	1.89	298	298
3-ethylpentane		alkane	1.94	298	298
cyclohexane		alkane	2.02	298	298
methanol	x	alcohol	33.50	298	298
ethanol		alcohol	24.00	298	298
propanol	x	alcohol	20.00	298	298
butanol		alcohol	17.70	298	298
pentanol		alcohol	15.10	298	298
hexanol	x	alcohol	13.00	298	298
heptanol		alcohol	11.50	298	298
octanol		alcohol	10.10	298	298
2-propanol	x	alcohol	19.10	298	298
2-butanol		alcohol	16.70	298	298
2-pentanol		alcohol	13.80	298	298
3-pentanol		alcohol	13.40	298	298
acetic acid	x	carboxylic acid	6.20	298	298
propanoic acid	x	carboxylic acid	3.40	298	298
butanoic acid		carboxylic acid	2.90	298	298
acetaldehyde	x	aldehyde	21.10	298	298
propionaldehyde	x	aldehyde	18.40	298	298
butyraldehyde		aldehyde	13.40	298	298
ethyl acetate	x	ester	6.00	298	298
methyl propanoate	x	ester	6.00	298	298
propyl acetate		ester	5.60	298	298
butyl acetate		ester	5.10	298	298
methoxymethane	x	ether	6.20	254	254
ethoxyethane	x	ether	4.20	298	298
1-methoxypropane		ether	3.70	298	298
propan-2-one	x	ketone	20.80	298	298
butan-2-one	x	ketone	17.70	298	298
pentan-2-one		ketone	15.40	298	298
pentan-3-one		ketone	16.60	298	298
hexan-2-one		ketone	14.50	298	298
hexan-3-one		ketone	10.75	298	298
ethanamine	x	amine	8.70	298	298
propan-1-amine		amine	5.10	298	298
butan-1-amine	x	amine	4.90	298	298
N-ethylethanamine		amine	3.90	298	298
N,N-dimethylmethanamine		amine	2.44	298	298
N,N-diethylethanamine	x	amine	2.40	298	298

All molecules used in initial training of rfree parameters used in eq are marked with a cross. Simulation temperatures in Kelvin for both the determination of liquid densities (T(ρ)) and heats of vaporization (T(ΔHvap)) are indicated. The dielectric constant used for the homogeneous medium in the reaction field treatment is listed as ϵ0. As part of the pairwise van der Waals potential-energy term Uvdw, we use a C11 repulsive potential as previously proposed by us,[14] which introduces higher order dispersion in combination with a C11 repulsive term (eq ). We introduce higher order dispersion in a van der Waals potential instead of using alternative functions that were recently proposed as a possible improvement for the description of Lennard-Jones interactions,[47] because use of eq allows us to directly derive dispersion parameters from QM calculations and makes a straightforward incorporation of higher order dispersion possible in GROMOS. The values for the C11 constants cannot directly be derived from QM calculations. We infer their values using van der Waals radii. As the zero-energy distance for the van der Waals interaction between a pair of atoms is defined by the sum of their effective van der Waals radii, the definition of these van der Waals radii automatically yields the C11 coefficients.[16] We find the atomic radii ratom by scaling a free parameter called rfree, which represents the radius of a free atom in vacuo with a given volume Vfree, eq .[10] The atomic volumes Vatom in eq are derived from the AIM partitioning, and the free volumes come from the single atom database that is required for Hirshfeld partitioning.[23,45,46] The C11 constants are subsequently obtained by solving for the zero-point energy in Uvdw, eq . This means that C11 parameters are implicit, as they are directly derived from radii and dispersion constants. We calibrated the exchange repulsion interactions in this work by optimizing values for rfree that together with our quantum determined bonded, electrostatic and dispersion force constants best describe the thermodynamic properties of the model compounds of interest.

Electrostatic Potential (ESP) Fitting of Static Atomic Charges

The class of force field in this paper is an explicitly polarized force field. Therefore, when determining static partial charges for the molecules of interest, we do not account for polarization by an environment of choice (e.g., water for biomolecules) and all charges are determined in a vacuum environment. We do this in an automated fashion for all molecules in Table , for which single-point calculations with the Amsterdam Density Functional (ADF)[48] software at the B3LYP/QZ4P level of theory in the gas phase, together with the associated grid-based ESP, are already available for our QM/MM molecule data set used to determine atomic polarizabilities as summarized in Table S3, which are lower than their gas-phase equivalent values by a factor that is consistent with other works in literature.[28] Intramolecular polarization is intrinsically encoded into the static charge parameters. For our sets of point charges we found that not only accuracy is important but also consistency in fitting them, to make sure that each molecule representing a certain class of functional groups behaves identical. Therefore, as in our QM/MM study to determine atomic polarizabilities,[28] we used here a consensus ESP fitting approach in which (atomic) sites that should have chemical equivalence in terms of charge distribution (defined by local neighborhood graphs that can be automated using graph theory algorithms[49]) are constrained to be assigned the same partial charge. In this manner we fit charges that have an overall optimal fit in their class of molecules, instead of averaging results from single molecule calculations.

Bonded Parameters

Optimal covalent bond and angle configurations were determined from the final geometry optimized configurations at the B3LYP/aug-cc-pVTZ level of theory using NWCHEM 6.8, by averaging values for similar bonds and angles within a class of molecules. To determine harmonic force constants for both the bond-stretch and angle-bend functions we utilized a modified Seminario approach,[50] in which a QM Hessian calculation at the same B3LYP/aug-cc-pVTZ level of theory is used to compute a Hessian matrix projection, effectively fitting all force constants at the same time. The final sets of covalent bond and angle parameters are summarized in Tables S1 and S2 of the Supporting Information, respectively. Dihedral parameters were transferred from the generated Automated Topology Builder (ATB) starting topologies, effectively adhering to GROMOS 54A7 style dihedrals. As it is currently unclear how 1–4 neighboring interactions should be handled in general for our C6–C8–C11 dispersion and exchange repulsion potential, we treated these using the traditional C6 and C12 types, making the choice for GROMOS torsion force constants valid.

MD Simulations

All molecular dynamics simulations were performed using GROMOS11 md++ version 1.4.1,[33] modified for the use of a higher order dispersion potential with C6 and C8 terms in combination with a C11 exchange repulsion term. Starting from the ATB downloaded atomic coordinates,[35] 1024 molecules were randomly packed into cubic boxes using the GROMOS++ tool ran_box.[51] To resolve any steric clashes, each molecular box was energy minimized using steepest-descent minimization. Initial velocities were picked using a Maxwell–Boltzmann distribution at 50 K and the boxes were slowly thermalized in five 20 ps NVT simulations toward the target temperature. After a NpT equilibration of 2 ns under production conditions, the systems were simulated for 5 ns, which was used for analysis. Newton’s equations of motion were integrated using a leapfrog integrator with a time step of 2 fs. Coordinates and energies were stored at 1 and 0.2 ps resolution, respectively. The temperature was maintained for each molecule at the temperature listed in Table using weak coupling with a Berendsen thermostat and a coupling constant of 0.1 ps.[52] In the cases where density and heat of vaporization reference data were not acquired at the same temperature, two simulations were ran at both temperatures, respectively. Pressure was maintained at 1 atm using a Berendsen barostat with a coupling constant of 0.5 ps.[52] For each system in this study the isothermal compressibility was set to 4.575 × 10–4 (kJ mol–1 nm–3)−1. All bonds were constrained using the SHAKE algorithm with a relative geometrical tolerance of 10–4.[53] Nonbonded interactions were treated using a twin-range cutoff scheme where interactions within 0.8 nm were evaluated every time step and interactions between 0.8 and 1.4 nm were evaluated every five time steps. In essence these longer range interactions are thus frozen for 4 steps. Long-range electrostatic interactions were treated using a reaction-field approach, in which the medium outside the interaction cutoff was treated as a homogeneous dielectric medium with the dielectric constants set to the values listed in Table .[54] Center of mass motion was removed in a linear manner every 1000 steps. Gas-phase energies were acquired in 20 ns runs, while the simulation temperature was maintained using stochastic dynamics with a friction coefficient of 24 ps–1.[55]

Results and Discussion

The general parametrization strategy in this work is to fix as many parameters as possible to quantum mechanical (QM) determined values in order to reduce the number of free parameters in the force field. As in our previous parametrization studies on alkanes[14] and water[16] the goal is to fix both electrostatic and dispersion parameters (as well as the bonded force constants) by computing and/or fitting them from electronic structure calculations. In this way we are only left with a few general parameters describing exchange repulsion for the elements used in this work (i.e., the rfree’s) and possibly some corrections to the general model. As the class of force fields we are interested in is polarizable force fields, a first step in parameter assignment involves the atomic polarizabilities that enter the COS model. For the functional groups in this work we have published a comprehensive list of atom types with associated polarizabilities that can directly be used in our force field.[28] The first aim is now to define dispersion parameters for these atom types, which are listed together with their polarizabilities in Table S3. We do this by utilizing electronic structure calculations for each query molecule at the B3LYP/aug-cc-pVTZ level of theory and subsequent partitioning of the resulting electron densities with an iterative Hirshfeld protocol.[14] Exchange-Hole Dipole Moment (XDM) calculations then directly yield C6 and C8 dispersion constants for atomic pairs.[22,56] The off-diagonal elements of the pair coefficient matrix are deterimed using geometrical combination rules (eq ) to form heterogeneous dispersion parameters for a given atom pair i,j, identical to the form used in the GROMOS force fields. The resulting dispersion constants per atom type are listed in Table , in which both the average determined C6 and C8 values together with their standard deviations are listed. The dispersion coefficients for these types are well resolved for the oxygen functional groups and we find larger spreads in computed values in the nitrogen and hydrocarbon parameters. During the iterative Hirshfeld partitioning there is still a varying amount of density attributed to the C and N atom types. This is not surprising as with their higher coordinatation number, the nitrogens and carbons have a relatively larger degree of variety in local electronic environment than the oxygen atoms. Nevertheless, our results indicate that a relatively small number of dispersion types can be sufficient to describe dispersion interactions in our model.

Table 2

Dispersion Parameters C6 and C8 in a.u. As Obtained from Our Exchange-Hole Dipole Moment (XDM) Analysisa

vdW type	C₆ (avg)	C₆ (stdev)	C₈ (avg)	C₈ (stdev)	r_atom
C_H1	16.88	0.36	594.4	13.6	3.175
C_H1,O	13.46	0.24	444.9	9.4	3.024
C_H2	26.38	3.31	1161.9	223.3	3.402
C_H2,N	18.55	0.56	654	25	3.156
C_H2,O	17.2	0.35	592.1	14.2	3.099
C_H3	35.56	3.34	1664.4	206.4	3.534
C_H3,N	28.37	1.52	1220.5	109.1	3.383
C_H3,O	23.3	1.02	889.5	61.9	3.231
C=O	11.02	0.22	318.9	7.8	2.835
C=O,O	8.11	0.12	212.6	4	2.683
C=O,OAC	7.56	0.11	193	3.8	2.646
H_C	1.83	0.16	37.2	3.9	2.305
H_C,al	2.42	0.05	49.9	1.4	2.381
H_NN	1.07	0.05	22.1	1.4	2.098
H_O	0.73	0.01	15.3	0.3	1.965
H_OAC	0.59	0	12.1	0.1	1.890
NN0	17.07	0.83	602.8	48.7	2.608
NN1	28.22	1.18	1218.6	78.9	2.835
NN2	37.96	0.42	1647.9	25.2	2.929
O=	20.11	0.19	567.9	8.3	2.665
OA	22.09	0.36	690.4	14.1	2.872
OAC	21.93	0.24	674.1	11.2	3.194
O=ac	22.14	0.19	646.3	10.3	2.702
O=s	22.26	0.14	655.7	3.7	2.721
OES	16.2	0.19	471.7	8.3	2.627
OET	15.37	0.22	443.9	9	2.589

The presented values are averages (avg) over the full set of 49 molecules (and their standard deviations, stdev) that are obtained after classifying each atomic site as van der Waals (vdW) type. Effective van der Waals radii ratom (bohr) of atoms in molecules are derived by scaling the (calibrated) radius rfree of the isolated element in vacuo with the ratio between the average atomic volume Vatom (of the vdW type) and the free atomic volume, Vfree, eq . Note that dispersion and repulsion parameters for H_NN, H_O, and H_OAC were set to zero in the final model; their C6 and C8 parameters were added to the atoms they are attached to by using eq , effectively also increasing C11 for these heavy atoms via eq . Charge type descriptions can be found in Table S3 of the Supporting Information. Note that an additional dispersion type was fitted for H_C in aldehydes (indicated by ‘al’). In order to compute the accompanying exchange repulsion parameters, atomic partition volumes as determined from our AIM calculations were averaged per atom type. These were subsequently used as input for Vatom in a radius scaling procedure (eq ) to determine effective van der Waals radii ratom for all atom types. The effective hard sphere radii rfree of isolated atoms cannot be directly determined quantum mechanically. However, their estimates are needed to derive ratom and, by using eq , C11 values for use in our dispersion-repulsion potential. As a result, we treat the rfree parameters as adjustable, effectively leaving a few required free parameters in the model. This is acceptable as any repulsive potential used in molecular mechanics is a loose approximation for Pauli repulsion effects, which are hard to capture in simple pairwise atomic functions. Hence, it is unclear how to define a more strictly theoretical approach to derive values for these radii. Before calibrating the rfree parameters in our model, static partial charges were (directly) derived from fitting to DFT determined gas-phase molecular electrostatic potentials (ESPs), thus resulting in a charge set that represents the Coulomb potential stemming from an unperturbed electron density. These ESP fits are not done on an individual basis, but a consensus approach is used in which molecular charge distributions are fitted that best represent the ESPs of all compounds considered, under the constraint that for every individual atom type a single partial-charge value is obtained. As a consequence, we derive charge types in this work that are closely linked to the local chemical environment of the atom; cf. Table S4. Note that we only considered influences from the environment up to 1 or 2 covalent bonds away, thereby reducing the number of charge types in the force field when compared to our choice made in our work on hydrocarbons. The reason is that we want to eliminate intramolecular class variations stemming from small but significant differences in the charge set, creating a more homogeneous response. The thus determined set of atomic charges is presented in Table . Even though no neutrality constraints were enforced during fitting, charge sums for the full set of considered molecules were close to zero with a maximum absolute deviation of 0.054 e. The remainder charges generated on the molecules were smeared out evenly over the molecules, neutralizing them to zero.

Table 3

Partial Charges (in e) Classified per Type of Charge Sitea

charge type	partial charge	charge type	partial charge
C_H1	–0.049	H_C2,OES	0.011
C_H1,OA	0.205	H_C2,OET	0.026
C_H2	–0.057	H_C3	0.029
C_H2,NN	0.061	H_C3,NN	0.030
C_H2,OA	0.173	H_C3,OA	–0.017
C_H2,OES	0.290	H_C3,OES	–0.004
C_H2,OET	0.101	H_C3,OET	0.039
C_H3	–0.085	H_NN1	0.264
C_H3,NN	0.044	H_NN2	0.264
C_H3,OA	0.290	H_OA	0.328
C_H3,OES	0.319	H_OAC	0.410
C_H3,OET	0.046	NN0	–0.389
C=O	0.483	NN1	–0.522
C=O,O	0.733	NN2	–0.659
C=Oal	0.494	O=	–0.483
C=O,OAC	0.747	O=,al	–0.466
H_C,al	–0.050	O=ac	–0.567
H_C1	0.050	O=s	–0.560
H_C1,OA	0.028	OA	–0.561
H_C2	0.030	OAC	–0.609
H_C2,NN	0.035	OES	–0.473
H_C2,OA	0.001	OET	–0.321

Values are determined using a consensus least-squares fitting approach to best reproduce molecular electrostatic potential (ESP) grids determined at the B3LYP/aug-cc-pVTZ level of theory for the 49 query compounds in vacuum. Charge type descriptions can be found in Table S4 of the Supporting Information. Having fixed most dispersion and charge parameters as well as the bonded parameters (Supporting Tables S1 and S2), there are in principle only four free parameters left to be calibrated for the set of molecules of interest. These are the free atomic radii of the four elements occurring in the compound set (C, O, N, H), which are needed for scaling toward the effective van der Waals radii ratom (using atomic volumes as determined from our AIM calculations and averaged per atom van der Waals type) and to obtain values for the exchange repulsion parameters C11. As there is no satisfying method to determine these free atomic radii a priori, we resorted to solving these as a general optimization problem in which the free parameters are calibrated in a supervised manner. For that purpose we optimized the densities and heats of vaporization of a training series of compounds marked in Table , which are small-molecule analogues for a series of amino-acid side chains and closely related chemical moieties. Traditionally, ρ and ΔHvap have been used to optimize and/or evaluate force field performance as they can be measured accurately in experiment, and their reproduction in simulation indicates a good balance between repulsive and attractive interactions and a well balanced force field. While being of high importance when evaluating force fields, solvation free energies are not considered in this work yet as higher order dispersion is currently not available in any of the free energy routines of molecular dynamics codes. As a target for our optimization problem, we have set a maximal average deviation from experiment of 25 kg m–3 for ρ and 2.5 kJ mol–1 for ΔHvap (kT). A force-field parameter set with such performance can later be used as a starting point for refinement in automated approaches. To reach our target, we started supervised calibration of single rfree parameters for hydrogen, carbon, oxygen, and nitrogen. However, by restricting in this way the number of free parameters to four, we did not manage to reach our goal and consistently obtained parameter sets that underestimated ρ and ΔHvap by twice our target error values or more (data not shown). We pinned this down to difficulties in describing hydrogen bonding in simulation, which is a nontrivial problem for classical force fields as hydrogen bonds involve a degree of electron density overlap. Therefore, when hydrogen bonding occurs, the classical description of exchange repulsion (i.e., with an exponential repulsive function) is not effective. We solved this in a similar manner as GROMOS and other force fields, by setting dispersion and repulsive parameters for polar hydrogens to zero. Because hydrogens contribute significantly to the dispersion energy, their contributions have to be redistributed over the attached polar groups. For every polar hydrogen atom type i, we perform an exact redistribution by summation of the square roots of the (AIM-determined) dispersion constants of i and its neighboring heavy atom type j, to find values that give the exact same dispersion power (eq ). C6, and C8, are then assigned to the involved heavy atom type, thereby effectively increasing its C11 value as well via eq . In particular for compounds with hydroxyl groups, we found that introducing C6, and C8, and setting the dispersion coefficients for the involved hydrogens to zero yielded too dense liquids after optimizing rfree for the remaining three elements. Instead of performing a redistribution of repulsive parameters, we instead opted to reparametrize the repulsive power for these hydrogen bonding groups, effectively extending the number of free parameters in our model to six (i.e., by introducing a rfree parameter for nitrogen and oxygen atoms that can donate hydrogen bonds). In this way we could train our force field to obtain a final model for which the performance over all 49 compounds considered in this work is shown in Figure .

Figure 1

Performance of the calibrated model in terms of reproducing experimental values (exp) for pure-liquid densities ρ (a) and heats of vaporization ΔHvap (b) for the 49 compounds considered in this work (Table ). The solid line at the diagonal indicates perfect reproduction of experiment by simulation (sim). Dashed lines represent deviations of ±25 kg m–3 for ρ and ±2.5 kJ mol–1 for ΔHvap. For both the computed densities (Figure a) and heats of vaporization (Figure b), the optimization targets are marked by a dashed line at a deviation from experiment of 25 kg m–3 and 2.5 kJ mol–1, respectively. We find that our highly constrained model is performing well. Notably, 63% of the molecules fall within the optimization target for the density and 90% for the heats of vaporization, suggesting that the underlying energetics are well captured. On average, the densities deviate 2.5% from their experimental value, while heats of vaporization are on average within 3% from experiment. Overall, we find a root-mean-square deviation from experiment (RMSD) of 23.5 kg m–3 and 1.63 kJ mol–1 for density and heat of vaporization, indicating that the force field is well behaved and shows comparable performance to previous force field optimization; see Table for a comparison with results from the 2016H66 reoptimization of the GROMOS-based nonpolarizable force field and from parametrization studies on the CHARMM Drude-Oscillator polarizable models. Considering our simple static charge model (without offsite charges or explicit lone pairs, and using fully isotropic polarizabilities and a very limited set of free parameters), our force field already compares well with the current state of the art. Looking at the results per class of compounds (Table ), we find RMSDs in the ranges 13.2–40.4 kg m–3 for densities and 0.68–2.84 kJ mol–1 for heats of vaporization. The largest deviations are found for the acids included in this set, but we should consider that this class of compounds has the largest absolute values for both density and heat of vaporization.

Table 4

Pure-Liquid Performance Subdivided into Results per Class of Moleculesa

	this work			2016H66[7]		CHARMM Drude[57−59]
class	N	RMSD (ρ)	RMSD (ΔH_vap)	RMSD (ρ)	RMSD (ΔH_vap)	RMSD (ρ)	RMSD (ΔH_vap)
ketones	6	29.8	1.38	13.7	1.3
acids	3	40.4	2.84	19.6	4.4
alcohols	12	13.2	2.16	34.0	3.3	6.34	1.98
aldehydes	3	16.2	0.68	10.0	2.6
alkanes	12	15.9	1.04			4.02	0.34
amines	6	25.2	1.03	50.2	4.4
esters	4	36.0	1.98	25.8	1.9
ethers	3	25.3	0.70	23.3	0.7	8.37	1.07
overall	49	23.5	1.63

Performance for both density (ρ, in kg m–3) and heat of vaporization (ΔHvap, in kJ mol–1) are given in terms of root-mean-square deviations (RMSDs) when results from simulation are compared with experiment. N is the number of molecules per class of compounds. In addition to results obtained in this work, data taken from refs (7) and (57−59) are given as well for the 2016H66 and CHARMM Drude-Oscillator parameter sets, respectively.

Conclusions

In this work we presented a higher order dispersion model for the use in explicitly polarized molecular dynamics simulation. By separating the induction and dispersion terms, we generated a force field that can use pure dispersion (up to C8) inputs. The parameters were directly obtained from quantum-mechanical (or QM/MM) calculations, and only the exchange repulsion parameters had to be empirically calibrated. We did this on the basis of pure-liquid experimental data for thermodynamic properties of amino-acid side-chain analogues. In total six free parameters were calibrated against a set of liquid densities and heats of vaporization for 49 molecules, including 37 polar and 12 apolar compounds. Overall, we find a well behaving force field with root-mean-square deviations from experimental data of 23.5 kg m–3 and 1.63 kJ mol–1 for pure-liquid densities and heats of vaporization, respectively.

41 in total

1. A Simple Method for Including Polarization Effects in Solvation Free Energy Calculations When Using Fixed-Charge Force Fields: Alchemically Polarized Charges.

Authors: Braden D Kelly; William R Smith
Journal: ACS Omega Date: 2020-07-07

Review 2. Progress in Simulation Studies of Insulin Structure and Function.

Authors: Biswajit Gorai; Harish Vashisth
Journal: Front Endocrinol (Lausanne) Date: 2022-06-20 Impact factor: 6.055

2 in total

Deriving a Polarizable Force Field for Biomolecular Building Blocks with Minimal Empirical Calibration.

Introduction

Methods

Electrostatic Model

Determining Dispersion and Exchange Repulsion Parameters

Electrostatic Potential (ESP) Fitting of Static Atomic Charges

Bonded Parameters

MD Simulations

Results and Discussion

Conclusions

1. Optimized Slater-type basis sets for the elements 1-118.

2. Exchange-hole dipole moment and the dispersion interaction revisited.

3. Accurate molecular van der Waals interactions from ground-state electron density and free-atom reference data.

4. Phase-Transferable Force Field for Alkali Halides.

5. Polarizable Drude Model with s-Type Gaussian or Slater Charge Density for General Molecular Mechanics Force Fields.

6. Testing and validation of the Automated Topology Builder (ATB) version 2.0: prediction of hydration free enthalpies.

Review 7. Fixed-Charge Atomistic Force Fields for Molecular Dynamics Simulations in the Condensed Phase: An Overview.

8. Water dispersion interactions strongly influence simulated structural properties of disordered protein states.

9. A QM/MM Derived Polarizable Water Model for Molecular Simulation.

10. Automated partial atomic charge assignment for drug-like molecules: a fast knapsack approach.

1. A Simple Method for Including Polarization Effects in Solvation Free Energy Calculations When Using Fixed-Charge Force Fields: Alchemically Polarized Charges.

Review 2. Progress in Simulation Studies of Insulin Structure and Function.