Literature DB >> 22241968

Force Field Benchmark of Organic Liquids: Density, Enthalpy of Vaporization, Heat Capacities, Surface Tension, Isothermal Compressibility, Volumetric Expansion Coefficient, and Dielectric Constant.

Carl Caleman, Paul J van Maaren, Minyan Hong, Jochen S Hub, Luciano T Costa, David van der Spoel.   

Abstract

The chemical composition of small organic molecules is often very similar to amino acid side chains or the bases in nucleic acids, and hence there is no a priori reason why a molecular mechanics force field could not describe both organic liquids and biomolecules with a single parameter set. Here, we devise a benchmark for force fields in order to test the ability of existing force fields to reproduce some key properties of organic liquids, namely, the density, enthalpy of vaporization, the surface tension, the heat capacity at constant volume and pressure, the isothermal compressibility, the volumetric expansion coefficient, and the static dielectric constant. Well over 1200 experimental measurements were used for comparison to the simulations of 146 organic liquids. Novel polynomial interpolations of the dielectric constant (32 molecules), heat capacity at constant pressure (three molecules), and the isothermal compressibility (53 molecules) as a function of the temperature have been made, based on experimental data, in order to be able to compare simulation results to them. To compute the heat capacities, we applied the two phase thermodynamics method (Lin et al. J. Chem. Phys.2003, 119, 11792), which allows one to compute thermodynamic properties on the basis of the density of states as derived from the velocity autocorrelation function. The method is implemented in a new utility within the GROMACS molecular simulation package, named g_dos, and a detailed exposé of the underlying equations is presented. The purpose of this work is to establish the state of the art of two popular force fields, OPLS/AA (all-atom optimized potential for liquid simulation) and GAFF (generalized Amber force field), to find common bottlenecks, i.e., particularly difficult molecules, and to serve as a reference point for future force field development. To make for a fair playing field, all molecules were evaluated with the same parameter settings, such as thermostats and barostats, treatment of electrostatic interactions, and system size (1000 molecules). The densities and enthalpy of vaporization from an independent data set based on simulations using the CHARMM General Force Field (CGenFF) presented by Vanommeslaeghe et al. (J. Comput. Chem.2010, 31, 671) are included for comparison. We find that, overall, the OPLS/AA force field performs somewhat better than GAFF, but there are significant issues with reproduction of the surface tension and dielectric constants for both force fields.

Entities:  

Year:  2011        PMID: 22241968      PMCID: PMC3254193          DOI: 10.1021/ct200731v

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.006


Introduction

Parameters in most force fields have been derived incrementally, that is, building on previous work by adding support for different chemical moieties in a sequential fashion. While the focus of many force fields is on biomolecules, the chemical basis lies in organic molecules. Of the major force fields available today OPLS/AA (optimized parameters for liquid simulations, all atoms) is one of the few that “specializes” in simple liquids.[1] The generalized Amber force field (GAFF) was introduced recently[2] (together with the Antechamber set of programs[3]) to aid in the derivation of force field parameters for small molecules that are often involved in binding to biomolecules. Accurate parameters are crucial for predicting, for instance, the Gibbs energy of ligand binding, a key property in drug design.[4] The GAFF parameters for small molecules are intended to be combined with the Amber force field[5] although there are studies of proteins using GAFF parameters as well.[6] A critical component in force field development is generation of partial charges. The method for deriving partial charges by optimizing their values to reproduce the electrostatic potential (ESP) was introduced in the 1980s by Kollman et al.[7,8] The electron density taken from a quantum chemistry calculation, together with the nuclear charges, generates an electrostatic potential around the molecule. Typically, the set of partial charges for a molecule, for use in force field calculations, is determined by minimizing the (square) difference between the ESP generated by the partial charges and the ESP generated by the quantum chemistry calculation. A set of partial charges (or indeed any atom-centered set of spherically distributed charges) can never completely reproduce the ESP due to the fact that electron density is not completely spherically symmetric around the nuclei (for instance, due to p and higher orbitals). A further issue is due to the fact that the fitting points are highly correlated, and hence atoms far from the ESP data points (e.g., the buried carbon in isobutanol) may end up being a sink for the fit[9,10] and get arbitrary values. An ad hoc refinement of the ESP method to overcome this problem is the restrained ESP (RESP) method.[11] The RESP method does the same fit, however with an added penalty on the absolute value of the charge. The RESP method is an integral part of the Antechamber package,[2,3] which relies on either quantum calculations or empirical methods, such as AM1-BCC,[12,13] to provide the partial charges. Mobley et al. tested the performance of GAFF parameters for Gibbs energies of hydration using two different water models.[14,15] They paid particular attention to the way the partial charges were determined and found that the final results are related to the level of theory used, something that was corroborated by Wallin et al., who did a similar study of charge schemes for ligand binding.[16] The CM1 charge model for OPLS/AA,[17] used in the study of Wallin et al.,[16] performs well,[18,19] although some degradation for conformational energetics is expected. The differences are generally considered to be minor.[1] There are some drawbacks with these studies however. First, they involve complex systems, where a subset of the parameters was changed and the “quality” of the charges evaluated on the basis of a single number, the free energy, hereby ignoring the interdependency between Lennard-Jones parameters and point charges. Second, free energy calculations depend critically on the amount of the sampling that was used, although it is possible to ascertain that the errors due to sampling are small.[20] In order to test the validity of force field, it would be good to take one step back and evaluate the performance for simple systems first, in order to avoid systematic errors due to water model and/or protein force fields. A recent review by Jorgensen and Tirado-Rives provides further background information on the topic of force field development.[1] To assess the state of the art of GAFF and OPLS/AA force fields, we provide a comprehensive benchmark of the liquid properties of molecules in each of the GAFF and OPLS/AA force fields. Previous simulations of mixtures of alcohol and water[21,22] using the OPLS/AA force field showed that many properties of the pure liquids are reproduced faithfully, but the heat of mixing and the density of mixing are slightly, but significantly, off. Similar comparisons of force fields for water models are numerous in the literature (see for example, refs (23−29)), while for organic liquids there are some papers by Kaminski and Jorgensen,[30,31] and a recent paper by Wang and Tingjun,[32] which we discuss in the Discussion section. Liquid properties are usually known experimentally with high accuracy, and their calculation is most often straightforward. Rather, the time goes into the preparation and equilibration of the systems. A total of 146 molecular liquids was prepared and simulated using these force fields in the GROMACS molecular simulation package,[33−36] and from these molecular dynamics simulations, we extract the density ρ (from constant pressure simulations), the enthalpy of vaporization ΔHvap, the heat capacities at constant pressure cP and volume cV, the volumetric expansion coefficient αP, the isothermal compressibility κT, the surface tension γ, and the static dielectric constant ε(0). Although, in principle, more observables could be computed, this set includes the most important thermodynamic properties of the liquids, including temperature derivatives of energy and volume. The intention of this work is to supply a large number of tests for further force field development. To this end, the topologies and structures have been made available on a dedicated Web site at http://virtualchemistry.org, while the simulation parameters are available as Supporting Information to this paper. These topologies and structure files may be useful for simulations of biomolecules in organic liquids as well. The recently presented all atom CHARMM general force field (CGenFF)[37] would be an equally well suited candidate for inclusion in this comparison, but we have chosen to limit our simulations to two force fields only. However, to allow the reader to compare OPLS/AA and GAFF to a similar study based on CGenFF, we have included results on density and enthalpy of vaporization from that paper.[37]

Methods

Energy Function

Most force fields use the same functional form for the intermolecular part of the interaction function, based on the Coulomb potential and the Lennard-Jones potential:where r is the distance between two atoms i and j, q and q are the partial charges on the atoms, ε0 is the permittivity of vacuum, σ is the van der Waals radius, and ε is the well-depth for this atom pair. In most force fields, the parameters σ and ε are derived from the atomic values σ and ε using a simple equation (the combination rule). Suffice to say that we have applied the standard combination rules for GAFF (Lorentz–Berthelot[38]) and for OPLS/AA (σ = (σσ)1/2 and ε = (εε)1/2[39]) in this work.

Molecule Selection and Preparation

A set of organic molecules was selected for which both enthalpy of vaporization and density are known at room temperature. Models for these molecules were built using either PRODRG[40] or Molden.[41] These molecules were optimized using the Gaussian 03 suite of programs[42] at the Hartree–Fock level with the 6-311G** basis set.[43−47]

OPLS/AA Topologies

The OpenBabel (http://openbabel.org) code was used to extract a coordinate file including connectivity information from the Gaussian output files, and this file was used to generate an initial topology using the GROMACS tools[35] for the OPLS/AA force field.[1,39] The topologies were checked manually for correctness before using them, making sure that the total charge of the molecule is zero, and also that the atom types were correct. For molecules containing linear groups (e.g., nitriles), a virtual site construction was added to the topologies preserving the moment of inertia and the total mass, in order to keep the groups perfectly linear.[48]

GAFF Topologies

For the simulations where GAFF[2] was used, the Antechamber software[2,3] was employed to generate the topologies from the coordinate files (which were generated as explained above). Gaussian 03[42] at the Hartee–Fock level with the 6-311G** basis set[43−47] (as provided by the Basis Set Exchange Web site[49,50]) and Merz–Singh–Kollman (MK) scheme[7] were used the to determine the partial charges in Gaussian. This particular basis set was used because it is very similar to the 6/31G* basis set,[51] which is the default for GAFF, while simultaneously supporting a larger number of elements (e.g., I). The MK radius for I is not implemented in Antechamber, we used RI = 2.15 Å. The amb2gmx.pl script[52] was used to convert the AMBER topologies into the GROMACS format (this script is available online at http://ffamber.cnsm.csulb.edu/). The final partial charges were calculated using the RESP method[11] as implemented in Antechamber, and we manually checked that the charges were sane. Note that RESP can be used with any QM method producing electrostatics, not just with HF/6-311G**. No modifications for linear group were made for the GAFF topologies, where the Antechamber software[3] generates a near-linear angle term instead.

Liquid Simulation Box Preparation

To generate liquid simulation boxes, we first made a 2 × 2 × 2 nm3 box containing a single molecule. From 125 such single molecule boxes, we built up a 10 × 10 × 10 nm3 box. These boxes were simulated under high pressure (100 bar) to force the molecules into the liquid phase, and finally we let the systems relax under normal pressure (1 bar) to reach an equilibrated system. For the equilibration simulations, we used Berendsen’s coupling algorithm[53] because of its efficient relaxation properties.[34] To generate our final simulation boxes, we stacked 2 × 2 × 2 of the 125 molecule boxes and ran an additional equilibration simulation. The absolute drift in total energy was automatically checked in the equilibration and production simulations, and the simulations were continued until the drift was below 0.5 J/mol/ns per degree of freedom, which is a very strict criterion but which is necessary to accurately compute fluctuation properties.

Simulation Parameters

The GROMACS suite of programs was used for all simulations.[33−36] Following previous simulations of alcohol water mixtures[21,22] using the OPLS/AA force field,[1,39] we employed a 1.1 nm cutoff for Lennard-Jones interactions and the same distance as the switching distance for the particle mesh Ewald (PME) algorithm for computing Coulomb interactions.[54,55] Although the OPLS/AA force field was not developed for use with PME, extensive studies on water models[56] and proteins in water[57] have shown that correspondence of simulation results with experimental data improves considerably when long-range interactions are taken into account explicitly—irrespective of the force field used. Analytic corrections to pressure and potential energies were made to compensate for the truncation of the Lennard-Jones interactions.[38] In the production simulations, we used the Nosé–Hoover algorithm for temperature coupling,[58,59] in order to provide correct fluctuations, which is necessary to compute fluctuation properties. A time constant for coupling of 1 ps (corresponding to a mass parameter Q of 7.6 ps at room temperature) was used, which is in the range of time scales for intermolecular collisions, as recommended by Holian et al.[60] For production simulations at constant pressure, the Parrinello–Rahman pressure coupling[61] algorithm was used with compressibility set to 5 × 10–5 bar–1 and a time constant of 5 ps. The temperatures of the simulations were selected to fit the experimental data available. In most simulations, the bonds were constrained using the LINCS algorithm[62,63] for all molecules, applying two iterations in order to obtain good energy conservation. Periodic boundary conditions were used in all liquid phase simulations. Four types of production run simulations were performed according to Table 1. The density of states (DOS) production simulations were performed under constant volume conditions, but they were preceded by equilibration simulations under NPT (without constraints) in order to obtain the equilibrium density at P = 1 bar for the subsequent DOS simulations. In the DOS simulations, slightly stricter energy conservation parameters were used: a neighbor list buffer of 0.3 nm, combined with a switched Lennard-Jones and short-range electrostatics term (1.0–1.1 nm), see reference (56) for a description of the functional form.
Table 1

Simulation Characteristics for the Different Simulation Types

namelength# moleculesensembleconstraintselectrostatics
LIQ10 ns1000NPTall bondsPME
GAS100 ns1NVTall bondsall interactions
SURF10 ns1000NVTall bondsPME
DOS100 ps1000NVTnonePME
The GAS simulations were done using a stochastic dynamics (SD) integrator, which adds a friction and a noise term to Newton’s equation of motion:where m is the mass of atom i, ξ is a friction constant, and ρ(t) is a noise process withwhere kB is Boltzmann’s constant, T is the temperature, δ(s) is the Dirac δ function, and δ is the Kronecker δ function. A leapfrog algorithm adapted for SD simulations[64] was used to integrate eq 2. When 1/ξ is large compared to the time scales present in the system, SD functions like molecular dynamics with stochastic temperature-coupling. One of the benefits with SD as compared to MD is that when simulating a system in a vacuum there is no accumulation of errors for the overall translational and rotational degrees of freedom, making sampling of different configuration states more accurate. SURF and LIQ simulations were done using a conventional MD leapfrog integrator.[65] To enable replication of our simulations and detailed scrutiny of the data, we provide all simulation parameters for each type of run, as well as starting structures and topologies. These files, in GROMACS format, are available for downloading at http://virtualchemistry.org. To ensure that our liquid systems did not freeze during the simulations, we monitored the changes in diffusion constant ΔD as derived from the mean square displacement during the simulations, defined asThe subscript “begin” means the value is an average over the 1000–1500 ps of the simulation, and “end” means over 8500–9000 ps. |ΔD| is close to zero for most simulations, indicating that D is approximately the same in the beginning and at the end of the simulation. We also verified that D > 0 for all simulations. For the simulations where |ΔD| ≥ 0.5, we ensured that the systems indeed were not frozen, by inspecting the full mean square displacement curve and the trajectory of the simulations. In the Supporting Information (Figure S1), we show ΔD for all of the liquid simulations.

Analysis

The density ρ in a constant pressure simulation follows trivially from the mass M of the system divided by the volume V:The enthalpy of vaporization can be computed fromwhere Eintra is the intramolecular energy in either the gas (g) phase or the liquid (l) phase and Einter represents the intermolecular energy of the system. In practice, we can simply evaluateρ was determined from LIQ simulations and ΔHvap from LIQ and GAS simulations. The SURF simulations were done using liquid boxes, the size of which in the z direction was extended by a factor of 3, generating a simulation box with two liquid–vacuum interfaces. The surface tension γ then follows fromwhere P is the pressure component in direction n and L is the length of the box in the z direction (perpendicular to the surfaces). Static dielectric constants ε(0) were computed on the basis of the fluctuations of the total dipole moment M of the simulation box[66,67] in the LIQ simulations:where V is the volume of the simulation box. Errors were estimated by block-averaging over 10 blocks of 1 ns. In order to verify the validity of eq 9, we computed the autocorrelation time τM of the total dipole moment M in the simulation boxes (from the integral of the autocorrelation function). In order for fluctuations to be well-defined, τM should be at least an order of magnitude shorter than the simulation length. Henceforth, we omitted the dielectric constants for those systems where τM was longer than 1 ns. For those systems where this was the case, longer simulations of 50 ns were performed, in most cases without any improvement. The fluctuation properties αP (the volumetric thermal expansion coefficient) and κT (the isothermal compressibility) are computed from the LIQ simulations according to[38]where H is the enthalpy and δ indicates the fluctuations, andThese two properties can be related to the difference between heat capacities at constant pressure and constant volume throughwhere V is the molecular volume. We can take advantage of this relation in two ways, first by computing αP and κT from our simulations and then computing the constant pressure heat capacity based on the constant volume heat capacity. By using experimental data for αP and κT, we can also establish “experimental” constant volume heat capacities, which are difficult to measure directly. In this work, we have done both, as detailed in the Results and Discussion sections. The classical—that is, without any quantum corrections—heat capacity cPclass can be obtained from the fluctuations in the enthalpy:[38]Although this is straightforward to calculate, the numbers obtained in this manner are a factor of 2 too high (Table 2). Therefore, we have determined the heat capacities cP and cV on the basis of the two phase thermodynamic method[68−70] (described in the Supporting Information), which is based on the convolution of the density of states with a weighting function based on quantum harmonic oscillators, as introduced originally by Berens et al.[71] The final expression yielding the heat capacity cV isDoSgas and DoSsolid denote the density of states in a gas and a solid, Wgas(ν) and Wsolid(ν) are weighting factors for the same, and cP can be obtained by combining eq 12 and eq 14. For all details and a complete derivation, we refer the reader to the Supporting Information.
Table 2

Statistics of a Linear Fit of Calculated to Experimental Values According to y = ax + ba

force fieldNabRMSD% dev.R2
ρ (g/l)
GAFF2350.9658.582.9497%
OPLS/AA2350.9820.940.4299%
CGenFF[37]1111.03–36.026.0299%
OPLS/AA[70]91.01–24.045.3496%
ΔHvap(kJ/mol)
GAFF2311.070.810.61783%
OPLS/AA2310.963.46.51189%
CGenFF[37]950.942.44.7784%
γ (10–3 N/m)
GAFF1550.750.98.62370%
OPLS/AA1550.97–5.57.32289%
ε(0)
GAFF1630.270.415.73555%
OPLS/AA1760.160.715.94355%
αP (10–3/K)
GAFF2210.900.30.32467%
OPLS/AA2210.910.30.32175%
OPLS/AA[70]90.530.80.74239%
κT (1/GPa)
GAFF1030.660.00.32774%
OPLS/AA1030.760.10.31985%
OPLS/AA[70]80.930.01.15984%
cP(J/mol K)
GAFF1301.08–30.919.81098%
OPLS/AA1321.10–30.218.21097%
OPLS/AA[70]90.943.510.4794%
cV(J/mol K)
GAFF721.02–17.618.81097%
OPLS/AA721.04–17.918.3995%
OPLS/AA[70]81.01–5.410.8795%
cPclass(J/mol K)
GAFF2141.77–21.6148.37787%
OPLS/AA2141.98–52.8147.07393%

Uncertainties in the simulation results are used as weights in the fit. The number of (experimental) data points N is given for each property. Root mean square deviation (RMSD) from experimental values, average relative deviation in percent, and the correlation coefficient R2 are given. OPLS/AA results from ref (70) and CGenFF results from ref (37) (using the so called CHARMM generalized force field) are also listed for comparison.

Uncertainties in the simulation results are used as weights in the fit. The number of (experimental) data points N is given for each property. Root mean square deviation (RMSD) from experimental values, average relative deviation in percent, and the correlation coefficient R2 are given. OPLS/AA results from ref (70) and CGenFF results from ref (37) (using the so called CHARMM generalized force field) are also listed for comparison. The properties investigated fall into two categories: those that follow directly from the ensemble average of a property (energy, pressure, volume) and those based on fluctuations (heat capacities, compressibility, and expansion coefficient). For the first category, error estimates were based on a block averaging procedure that automatically takes the autocorrelation of the property under investigation into account.[72] Properties like potential energy and density usually have relatively short autocorrelation times. The surface tension fluctuates significantly but also has a short autocorrelation time. For the second category, we have used a different approach when estimating the error. By dividing the entire simulation trajectory into nine, in time, equally long parts, we get nine values for each property, from which we can estimate the total error. In the case of cV, we used five blocks of 20 ps for error estimation instead. We calculated cP on the basis of eq 12 and estimated the error δcP from the errors in cV (δcV), αP (δαP), and κT (δκT) asor, expressed in Δc (eq 12):

Results

Correlations between experimental data and simulations for observables and derived quantities are plotted in Figures 1–8. The statistics for linear fits to the data (ycalcd = ayexptl + b) are given in Table 2 for each of the observables and the two force fields, plus similar data from refs (37) and (70). To identify which specific molecule generated a certain value in the figures, we refer to Tables S2–S10 in the Supporting Information. An overview of the names of the molecules, their formula, molecular weight, CAS number, and ChemSpider ID is given in Table S1 (Supporting Information). For many molecules, results at different temperatures were generated, and hence the number of data points may be larger than the number of molecules. For densities, heats of vaporization, surface tensions, and dielectric constants, some of the experimental values were generated from analytical functions of temperature based on experimental data, the parameters of which are given in the Handbook of Chemistry and Physics,[73] the Landolt-Bornstein database,[74] and Yaws’ book on Thermophysical Properties of Chemicals and Hydrocarbons.[75] In addition, we parametrized the dielectric constant, heat capacity at constant pressure, and isothermal compressibility as a function of the temperature for some molecules (see below).
Figure 1

Correlation between densities (ρ) calculated by MD simulation using GAFF, OPLS/AA, CGenFF, and experimental results. The CGenFF data were adopted from Vanommeslaeghe et al.[37] and are based on a different (but similar) set of molecules, including 111 molecules. For a full list of the CGenFF data, we refer to the reference and the supplemental files therein.

Figure 8

Correlation between measured heat capacity at constant volume (cV) and computed using the GAFF and the OPLS/AA force fields based on the density of states method, which includes quantum corrections.

Correlation between densities (ρ) calculated by MD simulation using GAFF, OPLS/AA, CGenFF, and experimental results. The CGenFF data were adopted from Vanommeslaeghe et al.[37] and are based on a different (but similar) set of molecules, including 111 molecules. For a full list of the CGenFF data, we refer to the reference and the supplemental files therein. Correlation between enthalpy of vaporization (ΔHvap) calculated using GAFF, OPLS/AA, CGenFF, and experimental results. The CGenFF data were adopted from Vanommeslaeghe et al.[37] and are based on a different (but similar) set of molecules, including 95 molecules. For a full list of the CGenFF data, we refer to the reference and the supplemental files therein. Correlation between surface tension (γ) calculated using the GAFF and the OPLS/AA force fields and experimental results. Correlation between dielectric constant (ε(0)) calculated using the GAFF and the OPLS/AA force fields and experimental results. Note the logarithmic axes. Correlation between volumetric expansion coefficient (αP) calculated using the GAFF and the OPLS/AA force fields and experimental results. Correlation between isothermal compressibility (κT) calculated using the GAFF and the OPLS/AA force fields and experimental results. Correlation between measured heat capacity at constant pressure (cP) and computed using the GAFF and the OPLS/AA force fields based on either the density of states (DoS) method, which includes quantum corrections and a Δc correction based on simulations, or based on a purely classical treatment (cPclass, Class.). Correlation between measured heat capacity at constant volume (cV) and computed using the GAFF and the OPLS/AA force fields based on the density of states method, which includes quantum corrections.

Statistics

In the following, we discuss general trends in all properties first; outliers are described separately below. A comparison of the values in Table 2 shows that OPLS/AA is slightly better than GAFF at reproducing experimental data for most observables, with both lower RMSD and higher correlation coefficients R2.

Density

The density ρ (Figure 1, Table S2) of virtually all liquids is reproduced very well, with R2 = 97% (GAFF) and 99% (OPLS/AA) (Table 2). For GAFF, the densities are systematically slightly underestimated (a = 0.96), while for OPLS/AA, a = 0.98, very close to 1, and both have an R2 close to 100%. In a recent publication, Vanommeslaeghe et al. presented the CHARMM general force field (CGenFF).[37] They calculated densities for a set of 111 drug-like molecules, using boxes of 216 molecules. Their reported densities are also very accurate with a = 1.03 and R2 = 99%, see Figure 1 and Table S2.

Enthalpy of Vaporization

ΔHvap (Figure 2, Table S4) correlates very well with experimental data in most cases, with R2 = 83% (GAFF) and 89% (OPLS/AA) (Table 2). The GAFF overestimates ΔHvap with slope a = 1.07, while OPLS/AA underestimates a slightly at 0.96. These deviations cannot just be attributed to a small number of outliers, as may be evident from Figure 2. Vanommeslaeghe et al.[37] calculated enthalpy of vaporization for a set of 95 small molecules. Like for OPLS/AA, ΔHvap is underestimated in CGenFF calculations with a slope of a = 0.94. The correlation between experiments and simulation is similar to the two force fields studied here, R2 = 84%. The CGenFF data set is based on a comparable but different set of molecules than what has been analyzed here (37 molecules overlap between the two studies). To simplify a comparison between OPLS/AA, GAFF, and CGenFF, we have listed the CGenFF ΔHvap values from the study by Vanommeslaeghe et al. next to OPLS/AA and GAFF values in Table S3, and we have plotted them in Figure 2.
Figure 2

Correlation between enthalpy of vaporization (ΔHvap) calculated using GAFF, OPLS/AA, CGenFF, and experimental results. The CGenFF data were adopted from Vanommeslaeghe et al.[37] and are based on a different (but similar) set of molecules, including 95 molecules. For a full list of the CGenFF data, we refer to the reference and the supplemental files therein.

Surface Tension

The surface tension γ (Figure 3, Table S4) seems to be underestimated systematically in both force fields with slope a = 0.75 (GAFF) and 0.97 (OPLS/AA, Table 2). The interactions between molecules on the surface are not sufficiently strong, a well know problem with nonpolarizable force fields.[21,25,76] The values are spread around the diagonal for both GAFF (R2 = 70%) and OPLS/AA (R2 = 89%), and here again OPLS/AA performs slightly better than GAFF.
Figure 3

Correlation between surface tension (γ) calculated using the GAFF and the OPLS/AA force fields and experimental results.

Dielectric Constant

For 32 molecules, a novel parametrization of the temperature dependence of the dielectric constant was made on the basis of experimental values predominantly from the Landolt-Bornstein database.[77] The parametrization is to a polynomial of second or third order (as is used in the Handbook of Chemistry and Physics[73]), and the resulting coefficients are given in Table 3. Interpolations of these polynomials were used in order to compare the simulations to experimental data, and the fits are presented in Figure S3 of the Supporting Information.
Table 3

Parameterization of Temperature Dependence of Dielectric Constants in a Polynomial Form ε(0) = A + BT + CT2 + DT3, Which Is the Same Form Used in the Handbook of Chemistry and Physics[73]a

moleculeNχ2TminTmaxABCD
bromomethane120.7194.60275.7052.59–2.812e–014.565e–040
methanol921.0175.62337.75226.69–1.319e+002.937e–03–2.359e–06
1,1,1,2,2-pentachloroethane90.0245.15338.1513.81–5.527e–027.186e–050
1,1,2,2-tetrachloroethane140.2231.15318.1571.61–3.630e–015.010e–040
1,2-dibromoethane390.1288.15353.1510.31–3.114e–024.200e–050
1,1-dichloroethane80.2288.15323.1536.77–1.300e–011.361e–040
2-chloroethanol303.1263.15401.75105.36–3.245e–013.619e–055.019e–07
ethanamide70.3358.15448.20–200.551.551e+00–2.239e–030
methyldisulfanylmethane60.0293.15323.1553.55–2.539e–013.571e–040
2-aminoethanol70.4283.65298.15166.68–7.576e–011.018e–030
1,3-dioxolan-2-one240.5309.46364.15223.34–4.560e–019.143e–050
1,3-dioxolane310.2175.93303.1540.61–2.507e–016.323e–04–5.695e–07
dimethoxymethane50.0170.65298.152.59–9.298e–043.847e–060
ethylsulfanylethane60.1293.15323.1511.68–1.994e–020.000e+000
2-methylpropan-2-amine40.0291.15303.15294.70–1.887e+003.060e–030
thiophene140.1252.65333.152.325.071e–03–1.232e–050
furan310.2198.15303.156.69–2.044e–022.644e–050
pentane-2,4-dione92.0291.15323.15–532.573.658e+00–5.982e–030
3-methylpyridine61.0293.15333.0035.54–9.303e–024.307e–050
benzenethiol60.3293.15358.155.72–7.033e–037.362e–060
(E)-hex-2-ene60.0157.00295.002.43–1.132e–03–1.372e–060
1-methoxy-2-(2-methoxyethoxy)ethane50.0298.15333.1532.07–1.359e–011.766e–040
diethyl propanedioate70.2293.15343.1519.98–5.034e–023.345e–050
2,4,6-trimethylpyridine100.1293.15358.1516.67–3.036e–022.361e–060
triethyl phosphate60.1294.15333.15–1.591.317e–01–2.780e–040
phenylmethanol260.2278.15363.15105.48–5.130e–016.802e–040
tetrahydrothiophene 1,1-dioxide570.4300.75398.15488.81–3.732e+001.055e–02–1.017e–05
2,4,6-trimethylpyridine100.1293.15358.1516.67–3.036e–022.361e–060
dimethoxymethane50.0170.65298.152.92–4.106e–031.126e–050
1,3-dichloropropane50.1298.15333.15–61.394.818e–01–8.107e–040
methylsulfanylmethane60.2273.30310.4823.41–8.896e–021.076e–040
1,2-ethanedithiol30.0293.15333.1511.23–1.350e–020.000e+000

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details.

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details. The dielectric constant ε(0) (Figure 4, Table S5) appears to be the most difficult property to reproduce in our simulations, with slopes a < 0.5 and R2 ≤ 60% for both force fields (Table 2). Apart from lacking explicit polarization, limited sampling (1000 molecules for 10 ns were used in all cases) may be one of the causes; another contributing factor is the high viscosity of molecules containing alcohol or amine groups, further aggravated by the fact that some of these molecules were simulated at temperatures close to the melting temperature.
Figure 4

Correlation between dielectric constant (ε(0)) calculated using the GAFF and the OPLS/AA force fields and experimental results. Note the logarithmic axes.

Some liquids have extremely large dielectric constants, e.g., methanamide (ε(0) = 109) and N-methylformamide (ε(0) = 190). For these molecules, GAFF predicts 41 and 14, respectively, while OPLS/AA predicts 51 and 19. Xie et al. report a simulated dielectric constant of 200 for N-methylformamide, using a polarizable model, with only 256 molecules and 1 ns of simulation, but the authors state that “The dielectric constants have only been averaged for 1 ns of simulation time, and they are almost certain not yet converged.” Indeed, Whitfield et al. had previously concluded that very long times (50 ns) may be needed to obtain converged dielectric constants of molecules like N-methylacetamide because they tend to form long linear chains.[78] Such chains can in periodic simulation systems become “infinite”, which may contribute to the long relaxation time. It should be noted, however, that for most molecules in our study, the values are well converged, as witnessed by small error bars. Deviations from experimental results are therefore due predominantly to a lack of polarization and too low mobility of molecules. Interestingly, GAFF is somewhat better at predicting ε(0) than OPLS/AA (Table 2), most likely because the partial charges are somewhat higher for most molecules.

Volumetric Expansion Coefficient

The volumetric expansion coefficient αP is plotted in Figure 5 and tabulated in Table S6. The slope of the correlation plots is slightly less than 1 for both GAFF (a = 0.9) and OPLS/AA (a = 0.91), and there is a large spread around the y = x line for both OPLS/AA and GAFF with a RMSD of 0.3/GPa in both cases.
Figure 5

Correlation between volumetric expansion coefficient (αP) calculated using the GAFF and the OPLS/AA force fields and experimental results.

Isothermal Compressibility

For 53 molecules, an interpolation of experimental values of the isothermal compressibility κT as a function of the temperature was performed (Table 4 and Figure S3). The simulated κT’s are plotted versus the experimental values in Figure 6 and tabulated in Table S8. Like for αP, the spread in numbers is large, and the slope of the correlation plots is significantly less than 1 (GAFF, 0.66; OPLS/AA, 0.76, Table 2). In general, it seems that fluctuation properties are more difficult to predict than simple linear averages. Although we applied a very strict convergence criterion for the total energy of 0.5 J/mol/ns per degree of freedom, it may be that even longer equilibration times and production simulations are needed.
Table 4

Parameterization of Temperature Dependence of Isothermal Compressibility Constants in a Polynomial Form κT = A + BT + CT2a

moleculeNχ2TminTmaxABC
dichloromethane30.000293.15303.15–1.709e+011.144e–01–1.800e–04
methanamide50.008288.15323.151.352e–019.161e–040
nitromethane40.020298.15323.15–1.253e+006.666e–030
methanol240.014213.15333.151.004e+00–6.791e–032.557e–05
acetonitrile50.000298.15318.153.174e+00–2.209e–025.114e–05
1,1,2,2-tetrachloroethane20.000293.15303.15–4.962e–013.900e–030
1,1,2-trichloroethane70.002288.15318.15–7.213e–014.937e–030
bromoethane50.010273.15323.159.748e+00–6.685e–021.287e–04
N-methylformamide40.011288.15313.156.378e–031.968e–030
nitroethane30.015298.15323.15–9.873e–016.004e–030
ethanol160.007203.15363.151.280e+00–8.946e–032.857e–05
methylsulfinylmethane70.030293.15353.155.206e–01–3.136e–031.052e–05
2-aminoethanol60.000278.15333.157.273e–01–4.276e–031.051e–05
1,3-dichloropropane60.000283.15323.156.932e–01–4.785e–031.678e–05
propan-2-one100.010293.15328.15–3.053e+001.468e–020
methyl acetate80.012293.15328.15–2.562e+001.249e–020
1,3-dioxolane20.000293.15313.15–1.317e+006.960e–030
1-bromopropane70.003288.15318.15–1.264e+008.037e–030
N,N-dimethylformamide180.018288.15333.201.748e+00–1.073e–022.367e–05
1-nitropropane30.004298.15323.15–1.111e+006.420e–030
2-nitropropane30.020298.15323.15–1.060e+006.604e–030
1,4-dichlorobutane50.004288.15318.15–8.725e–015.246e–030
propane-1,2,3-triol190.003293.15473.158.358e–01–4.323e–037.862e–06
propan-1-amine60.036293.15323.15–2.469e+001.238e–020
N,N-dimethylacetamide50.015288.15318.15–5.890e–014.142e–030
butan-1-ol150.021293.15393.151.307e+00–8.833e–032.543e–05
N-ethylethanamine50.002298.15318.157.548e+00–5.536e–021.188e–04
butan-1-amine80.003298.15328.152.330e+00–1.702e–024.371e–05
ethyl acetate90.012298.15350.305.084e+00–3.567e–027.598e–05
oxolane50.001278.15323.15–9.434e–014.999e–034.886e–06
1-bromobutane120.000298.15333.152.650e+00–1.860e–024.413e–05
1-chlorobutane100.029293.15318.15–2.399e+001.205e–020
pentanenitrile50.005283.15323.158.811e–01–7.004e–032.429e–05
ethyl propanoate150.022278.15338.156.964e–01–7.128e–032.882e–05
2-methylbutan-2-ol20.000293.15298.15–1.495e+008.600e–030
pentan-1-ol80.010293.15333.153.158e+00–2.044e–024.292e–05
pentan-3-ol100.003293.15368.154.952e+00–3.315e–026.587e–05
nitrobenzene50.009298.15323.15–3.337e–012.832e–030
cyclohexanone50.021298.15308.15–9.399e–015.421e–030
hexan-2-one80.022278.15338.15–1.451e+008.315e–030
1-methoxy-2-(2-methoxyethoxy)ethane60.001298.15318.15–8.794e–015.105e–030
N,N-diethylethanamine80.006298.15328.154.400e+00–3.405e–028.064e–05
N-propan-2-ylpropan-2-amine70.001298.15328.159.459e+00–6.732e–021.357e–04
methoxybenzene50.043298.15338.15–1.520e+007.287e–030
3-methylphenol60.041298.15413.151.744e+00–1.029e–022.104e–05
toluene500.006288.15333.152.342e+00–1.627e–023.853e–05
diethyl propanedioate70.000298.15328.152.164e+00–1.397e–023.048e–05
heptan-2-one20.000293.15298.15–8.915e–016.200e–030
ethylbenzene70.008293.15333.152.524e+00–1.652e–023.683e–05
1,2-dimethylbenzene100.022273.15417.50–2.914e–011.846e–036.429e–06
octan-1-ol160.033293.15413.152.242e+00–1.449e–023.206e–05
quinoline20.000333.15373.15–5.477e–013.320e–030
(1-methylethyl)benzene30.003293.15298.15–6.340e–015.110e–030

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details.

Figure 6

Correlation between isothermal compressibility (κT) calculated using the GAFF and the OPLS/AA force fields and experimental results.

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details.

Heat Capacities

For three molecules, an interpolation of experimental numbers is presented in Table 5 and Figure S4. The heat capacity is a difficult property to calculate due to significant quantum effects. The simple eq 7 produces numbers (cPclass) that are twice too high (Table 2). Since the energy taken up by vibrations in a classical harmonic oscillator is much higher than for a quantum harmonic oscillator at the same frequency, the cPclass values are too high. Introducing quantum corrections, in the manner proposed by Berens et al.,[71] on which the two phase thermodynamics (2PT) method[68−70] is based, presupposes that the frequencies in the classical simulation are correct: this is often the case since most force constants have been derived from spectroscopic experiments. It should be noted that there is no a priori reason to assume that the intermolecular degrees of freedom behave harmonically, as they are determined by Coulomb and van der Waals interactions. Despite these theoretical shortcomings, the 2PT method produces reasonable results for cP (see Figure 7, Table 2, and Table S8)—much closer to experiment than cPclass on any account. In order to compute cP, it is necessary to add a correction Δc (eq 12) to the heat capacity at constant volume cV that is produced by the density of states analysis. Δc is underestimated by classical force field calculations; however, cP still is estimated reasonably, with a = 1.08 for GAFF and a = 1.02 for OPLS/AA with correlation coefficients R2 = 98% and 97%, respectively. If we compare just cV from our simulations (i.e., without adding in Δc) and subtract the experimental Δc from the measured cP, we find a very good correlation (GAFF, a = 1.02, R2 = 97%; OPLS/AA, a = 1.04, R2 = 95%), see Figure 8 and Table 2. Although correlation between experimental results and calculations can by no means validate the underlying theoretical model, it nevertheless indicates that the results are meaningful, because we have approximately 70 experimental cV values to which to compare. Indeed, although the root-mean-square deviation (RMSD) from experimental results is similar for cV and cP, the fit to experimental results is much better (slope a close to 1) for both OPLS/AA and GAFF. The DOS simulations were performed without constraints, and the heat capacities depend directly on the intra- and intermolecular vibrations. Deviations from the experimental heat capacities could therefore indicate problems with the force constants for intramolecular motions.
Table 5

Parameterization of Temperature Dependence of Heat Capacity at Constant Pressure in a Polynomial Form cP = A + BTa

moleculeNχ2TminTmaxAB
1,3-dioxolane90.187288.15328.154.371e+012.613e–01
1,2,3,4-tetrafluorobenzene410.145235.47319.791.158e+022.491e–01
1,2,3,5-tetrafluorobenzene250.343229.32311.181.186e+022.400e–01

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details.

Figure 7

Correlation between measured heat capacity at constant pressure (cP) and computed using the GAFF and the OPLS/AA force fields based on either the density of states (DoS) method, which includes quantum corrections and a Δc correction based on simulations, or based on a purely classical treatment (cPclass, Class.).

Tmin and Tmax (K) indicate the validity range of the parameterization. N indicates the number of points in the fit; χ2 is the root mean square deviation. See the Supporting Information for details.

Outliers Per Force Field

Table 6 shows how the molecular models of the individual molecules perform relative to the force field as a whole. The average relative deviations in σ and averaged over 1–8 data points (depending on the availability of experimental data) signals how well the force field performs for each molecule. The properties used were density, enthalpy of vaporization, surface tension, dielectric constant, volumetric expansion coefficient, isothermal compressibility, and the heat capacity at constant volume.
Table 6

Average Relative Deviation (σ) from Experimental Values, in Brackets, the Number of Observablesa

nameCGenFFGAFFOPLS/AA
1. chloroform 2.1(6)3.0(7)
2. dichloro(fluoro)methane 1.0(4)1.3(4)
3. dibromomethane 2.9(6)1.7(7)
4. dichloromethane 1.7(7)3.6(7)
5. methanal 0.3(4)0.3(4)
6. methanoic acid 4.5(6)2.6(7)
7. bromomethane 1.4(3)0.4(3)
8. methanamide0.0(1)1.2(7)0.4(6)
9. nitromethane 2.0(7)0.8(7)
10. methanol0.0(2)0.8(7)0.8(7)
11. 1,1,1,2,2-pentachloroethane 0.5(4)0.8(4)
12. 1,1,2,2-tetrachloroethane 1.7(7)1.7(7)
13. 1,1-dichloroethene 1.7(4)0.8(4)
14. 1,1,2-trichloroethane 1.2(7)0.9(7)
15. acetonitrile0.0(1)1.1(7)2.2(7)
16. 1,2-dibromoethane 2.6(7)4.0(7)
17. 1,1-dichloroethane0.0(1)0.7(7)1.7(7)
18. 1,2-dichloroethane 1.6(7)1.2(7)
19. methyl formate 0.9(4)0.8(5)
20. bromoethane0.0(1)2.2(7)0.6(7)
21. chloroethane0.0(1)0.8(5)1.3(5)
22. 2-chloroethanol 0.4(4)0.5(4)
23. ethanamide 0.2(4)0.8(5)
24. N-methylformamide 1.4(7)1.4(7)
25. nitroethane 1.5(7)0.7(7)
26. methoxymethane 0.5(5)1.3(5)
27. ethanol0.0(2)1.0(7)0.7(6)
28. 1,2-ethanedithiol 0.6(3)0.1(3)
29. methyldisulfanylmethane0.1(2)1.2(5)1.6(5)
30. methylsulfinylmethane0.1(1)1.0(7)0.6(7)
31. methylsulfanylmethane 1.4(5)1.2(5)
32. 2-aminoethanol 1.2(5)1.3(6)
33. ethane-1,2-diamine 1.2(7)1.9(7)
34. prop-2-enenitrile 1.0(5)1.2(5)
35. 1,3-dioxolan-2-one 0.5(5)0.2(4)
36. propanenitrile 1.1(7)1.9(7)
37. 1,2-dibromopropane 1.1(5)0.6(4)
38. 1,3-dichloropropane 0.9(7)1.0(7)
39. (2R)-2-methyloxirane 0.0(2)0.1(2)
40. propan-2-one0.0(2)1.0(7)0.7(7)
41. methyl acetate0.0(2)1.3(7)0.9(7)
42. 1,3-dioxolane0.0(1)1.2(4)0.6(4)
43. 2-iodopropane 0.7(5)1.1(5)
44. 1-bromopropane 1.3(7)0.6(7)
45. N,N-dimethylformamide 0.7(6)0.5(6)
46. N-methylacetamide0.0(1)0.4(4)0.2(4)
47. 1-nitropropane 1.6(7)1.2(7)
48. 2-nitropropane 1.6(7)0.9(7)
49. dimethoxymethane 0.8(5)0.9(5)
50. propane-1,2,3-triol 1.3(6)0.8(6)
51. propan-1-amine 1.1(7)1.5(7)
52. propan-2-amine 0.7(5)0.6(4)
53. 2-methylpropane0.0(1)0.8(5)1.1(5)
54. ethylsulfanylethane 0.6(5)0.7(5)
55. butane-1-thiol 0.9(5)0.5(5)
56. butan-1-ol 1.1(7)0.9(7)
57. 2-methylpropan-2-ol 0.4(2)0.1(2)
58. butane-1,4-diol 0.9(6)0.4(6)
59. (2-hydroxyethoxy)ethan-2-ol 1.2(4)1.1(5)
60. N-ethylethanamine 1.1(7)1.2(7)
61. butan-1-amine 1.1(7)0.9(7)
62. 2-methylpropan-2-amine 1.0(5)0.8(5)
63. 2-(2-hydroxyethylamino)ethanol 0.5(4)0.4(4)
64. pyrimidine0.0(2)0.7(4)0.6(4)
65. furan0.2(2)1.9(5)1.9(5)
66. thiophene0.0(2)0.7(4)0.3(5)
67. 1H-pyrrole0.1(1)1.3(7)1.1(7)
68. ethenyl acetate 0.5(4)0.8(4)
69. oxolan-2-one 0.3(3)0.3(4)
70. acetyl acetate 1.2(4)1.2(4)
71. 1,4-dichlorobutane 0.6(7)0.8(7)
72. oxolane 0.6(6)1.3(7)
73. ethoxyethene 0.3(3)0.2(3)
74. ethyl acetate0.0(2)1.2(7)1.1(7)
75. tetrahydrothiophene 1,1-dioxide 0.8(4)0.9(4)
76. thiolane 0.5(4)0.4(4)
77. 1-bromobutane 1.1(7)0.7(7)
78. 1-chlorobutane 1.4(7)1.8(7)
79. pyrrolidine0.1(1)1.3(7)1.3(7)
80. N,N-dimethylacetamide 1.0(7)0.9(7)
81. morpholine 0.8(5)0.9(5)
82. pyridine0.1(2)0.6(6)0.9(7)
83. cyclopentanone 0.8(5)0.6(5)
84. 1-cyclopropylethanone 0.2(2)0.1(2)
85. pentane-2,4-dione 0.9(5)1.3(5)
86. methyl 2-methylprop-2-enoate 0.8(5)3.6(5)
87. pentanenitrile 0.6(6)1.6(7)
88. ethyl propanoate 1.3(7)1.1(7)
89. diethyl carbonate 2.1(7)0.7(6)
90. pentan-1-ol 1.0(7)0.9(7)
91. pentan-3-ol 1.0(7)1.1(7)
92. 2-methylbutan-2-ol 1.1(5)0.5(5)
93. pentane-1,5-diol 0.8(6)0.6(6)
94. pentan-3-amine 0.5(4)0.6(4)
95. 1,2,3,4-tetrafluorobenzene 0.2(2)0.1(2)
96. 1,2,3,5-tetrafluorobenzene 0.2(2)0.1(2)
97. 1,3-difluorobenzene0.2(2)0.7(4)1.3(5)
98. 1,2-difluorobenzene 0.7(4)1.0(5)
99. fluorobenzene0.1(2)1.6(7)0.5(6)
100. nitrobenzene0.0(2)1.1(7)1.1(7)
101. 2-chloroaniline 0.9(4)0.6(4)
102. phenol 0.8(4)0.9(5)
103. benzenethiol 1.4(5)1.3(5)
104. 2-methylpyridine 0.3(4)0.9(5)
105. 3-methylpyridine0.1(2)0.8(5)0.6(5)
106. 4-methylpyridine0.0(2)1.1(7)0.4(6)
107. cyclohexanone 1.0(7)0.9(7)
108. (E)-hex-2-ene0.0(2)0.0(2)0.0(2)
109. hexan-2-one 0.8(6)0.9(7)
110. 2,4,6-trimethyl-1,3,5-trioxane 1.6(4)1.0(4)
111. cyclohexanamine 0.8(5)0.7(5)
112. 2-propan-2-yloxypropane 3.3(7)0.9(7)
113. 1-methoxy-2-(2-methoxyethoxy)ethane 1.5(7)1.1(7)
114. triethyl phosphate 2.8(6)2.2(6)
115. N,N-diethylethanamine 1.2(7)1.0(7)
116. N-propan-2-ylpropan-2-amine 0.8(6)0.6(6)
117. trifluoromethylbenzene 0.8(5)0.5(4)
118. benzonitrile 1.0(5)1.0(5)
119. benzaldehyde0.2(2)5.7(7)3.7(6)
120. toluene0.1(2)1.6(7)1.3(7)
121. methoxybenzene0.1(2)1.2(7)1.1(7)
122. phenylmethanol 1.0(5)0.8(5)
123. 2-methylphenol 0.9(5)0.8(5)
124. 3-methylphenol 1.0(5)0.9(5)
125. 4-methylphenol0.1(1)1.2(5)0.7(5)
126. diethyl propanedioate 1.1(4)0.8(4)
127. 2,4-dimethylpentan-3-one 0.6(4)0.4(4)
128. heptan-2-one 1.1(7)0.7(7)
129. ethenylbenzene 1.2(5)1.1(5)
130. 1-phenylethanone 1.0(7)1.1(7)
131. methyl benzoate 0.9(7)1.0(7)
132. methyl 2-hydroxybenzoate 1.1(5)0.4(4)
133. ethylbenzene0.1(2)1.4(7)1.1(7)
134. 1,2-dimethylbenzene0.1(1)1.7(7)1.0(7)
135. 1,2-dimethoxybenzene 0.4(4)0.6(5)
136. 2,4,6-trimethylpyridine 0.9(5)1.0(5)
137. octan-1-ol 0.8(6)1.7(7)
138. 1-butoxybutane 0.7(4)1.0(5)
139. N-butylbutan-1-amine 0.9(7)0.8(7)
140. isoquinoline0.0(1)0.7(4)1.3(4)
141. quinoline0.1(2)1.1(7)1.2(7)
142. (1-methylethyl)benzene0.1(2)1.0(6)0.7(6)
143. 1,2,4-trimethylbenzene0.1(1)1.2(6)1.0(6)
144. 2,6-dimethylheptan-4-one 1.0(5)0.9(5)
145. 1-chloronaphthalene 0.5(6)1.3(7)
146. phenoxybenzene 0.6(4)1.1(5)

Average relative deviation larger than 1σ is printed in bold, larger than 1.5σ in bold italic.

Average relative deviation larger than 1σ is printed in bold, larger than 1.5σ in bold italic. Some types of molecules are problematic in both of the force fields considered here. Small molecules containing more than one Cl or Br atom generally have both density and enthalpy of vaporization values that deviate significantly from experimental reference. This is not the case for molecules containing only one of these atoms, or molecules where there is a spacer (e.g., a CH2 group) between them. It could therefore be that the differences are caused by overlapping atoms. By introducing a new atom type of Br and Cl for the case where there are two such atoms next to each other on the carbon chain, these problems might be resolved. The density and enthalpy of vaporization of methanoic acid (formic acid) were particularly hard to reproduce, as was noted previously by Jedlovsky and Turi, who constructed a specific potential for this molecule.[79] The main feature responsible for the improved model in this case was a higher charge (≈ 0.1e) on the C–H atom than is used in either OPLS/AA (0) or GAFF (0.04). Methanoic acid forms very strong linear chains, which are difficult to break. This leads to long correlation times for the system dipoles and to dielectric constants that are far from the experimental values (Table S5). Benzaldehyde and furan are also problematic in both force fields. Even if they both generate decent densities and enthalpies of vaporization, the other properties (surface tension, dielectric constant, and thermal expansion coefficient) are far from the experimental values. Molecules containing a nitro group (specially nitromethane, 1-nitropropane, and 2-nitropropane) stick out as a problematic group in GAFF. The charges on nitro groups are high, leading to high density and enthalpy of vaporization. The standard OPLS/AA parametrization of alcohols has been reported to perform poorly for octan-1-ol. MacCallum and Tieleman[80] therefore derived a specific united atom potential of the molecule where they used modified charges on the headgroup. The OPLS/AA parametrization investigated here gives both too high a density and too high an enthalpy of vaporization, and therefore the other properties investigated for this molecules also deviate from experimental results. Methyl-2-methylprop-2-enaote shows similar problems, and this could probably be corrected in a similar way. It should be noted that, compared to GAFF, the charges on the headgroup in these two molecules are relatively high in OPLS/AA.

Discussion

The development of force fields for molecular simulation is critically dependent on the availability of good reference data, preferably from experimental sources. All force fields, be they empirical, purely derived from quantum-mechanics, or a combination of the two, will eventually have to face the test of comparing predicted to measured values. There is a large amount of literature on force field testing for proteins and peptides,[57,81−87] nucleic acids,[88−91] carbohydrates,[92] specific organic molecules or protein fragments,[20,26,93−97] and ions,[98−101] to list but a few. In addition, there are indirect force field tests, for instance of the binding energy in protein–ligand complexes,[16,102] protein structure prediction,[103] or of force-field-based docking codes.[104−106] It is interesting to mention the industrial fluid properties simulation challenges, which are stimulating modelers to predict properties of liquids by any means, including molecular simulation.[107,108] Here, we have introduced a benchmark set of 146 liquids in order to assess two popular all atom force fields, OPLS/AA and GAFF, and to set a standard for future force fields. For comparison, we have included an independent density and enthalpy of vaporization data set computed using CGenFF, based on a similar set of molecules.[37] Calculated density, enthalpy of vaporization, heat capacities, surface tension, dielectric constants, volumetric expansion coefficients, and isothermal compressibility from the two force fields are compared to experimental values. Indeed the benchmark is quite revealing, in that systematic deviations can be found and rationalized. The knowledge about such deviations will hopefully be useful for further development of the force fields. To a first approximation, molecular vibrations can be described as quantum harmonic oscillators.[109] Classical harmonic oscillators do not describe the properties of quantum harmonic oscillators, which makes it necessary to implement corrections in computing for instance heat capacities. The two phase thermodynamics method employed here for estimating cP and cV relies on the force constants of the force field used, and on the effective frequencies in the simulations. The density of states obtained from the velocity autocorrelation is convoluted by a weighting function derived from the partition function for a quantum harmonic oscillator in order to obtain a heat capacity for a corresponding quantum liquid. If a force field would allow one to directly reproduce the “correct” density of states, one could use the much simpler fluctuation formulas, as described by Allen and Tildesley;[38] however, heat capacities computed in this manner overestimate the experimental values by about 100% for OPLS/AA and GAFF (Table S10). Going beyond the harmonic approximation should therefore be considered by force field developers. Despite efforts in the context of the MMF94 force field[110] and the MM3-MM4 family of force fields,[111−113] this has not been widely adopted in the biomolecular simulation community, although the polarizable AMOEBA force field[114] does feature anharmonic bond and angle potentials as well. In principle, it should be advantageous to use for instance Car–Parrinello molecular dynamics,[115] in order to more faithfully represent a liquid than is possible in a classical simulation. This was attempted by Kuo et al. for water.[116] They find a large scatter in cP values due to limited sampling, but also a systematic deviation from the experimental value. Obviously, the computational bottleneck that would be introduced by CPMD or related methods will remain difficult to surmount for the immediate future, and therefore force-field-based methods remain necessary. Nevertheless, it is encouraging that there is a trend to use molecular dynamics simulations based on density functional theory codes to study vibrational properties of biomolecular systems beyond the harmonic approximation.[117−119] The dielectric constant seems to be the hardest nut to crack. Nonpolarizable force fields (such as GAFF and OPLS/AA) are known to have difficulties in reproducing the dielectric constant and to some extent also the surface tension. In the case of water, for which a large number of force fields have been developed, there are several studies that describe this (for a review, see, for example, Guillot[25]). Improving the dielectric function often turns out to be done at the cost of the enthalpy of vaporization and the free energy of solvation—properties that may be more important to reproduce in biomolecular simulations. In addition to systematic problems, like sampling or the lack of polarization in our simulations,[120] the temperature dependence of the dielectric constant provides both a challenge and an opportunity for future force field development. For most molecules, the temperature dependence is very strong, because molecular motion is the largest factor contributing to ε(0). In his review of water models, Guillot has pointed out that the relation between dielectric constant and other properties is complex, and hence it can be used to test and validate force fields, but not likely as a target for force field optimization.[25] The benchmark we present here allows one to pinpoint systematic errors in force fields due to the fact that most chemical moieties are represented more than once. The overall performance of GAFF is surprisingly good, seeing that the parameter development was not aimed at liquids. The results from the OPLS/AA force field are slightly better than GAFF, obviously due to the fact that OPLS/AA was parametrized for liquids. The CHARMM generalized force field seems to be even slightly better, at least for density and enthalpy of vaporization.[37] It is reassuring for applications of force field calculations beyond liquids that the parameters in most cases are reasonable; however, the results presented here also show that blind faith in force fields is not warranted in all cases. In Table 2, we list the root-mean-square deviation, as well as the average relative deviation, of the calculated values from the experimental, for each property we have analyzed. Even if our set of molecules is limited to 146, these numbers give a measurement of how well the properties are reproduced in the two force fields, at least for molecules similar to the set presented here. Wang and Tingjun have recently reported a similar force field test of 71 organic molecules based on the GAFF and OPLS/AA force fields.[32] They report densities and enthalpies of vaporization for these molecules and find small deviations from experimental results that are comparable to our numbers. It is encouraging to note that these authors were able to improve the correspondence to experimental numbers by tuning the Lennard-Jones parameters of some of the atom types. How this affects the other properties that we have studied here, in particular, the dielectric constant and the surface tension, remains to be determined, but it is likely that just tweaking the Lennard-Jones parameters is not sufficient to cure the significant and systematic deviations observed for those properties. Mobley et al. have performed free energy of solvation (ΔGhyd) benchmarks, reporting a RMS error from experimental numbers of 5.2 kJ/mol for more than 500 molecules.[121,122] This number is comparable to the RMSD of 6.5 kJ/mol we computed for ΔHvap for OPLS/AA (10.6 kJ/mol for GAFF and 4.7 kJ/mol for CGenFF[37]). Since both numbers are to a large extent determined by the intermolecular energies, we can conclude that the RMS error in intermolecular energies for (“small”) organic molecules is 5–6 kJ/mol using state of the art simulations and nonpolarizable force fields. It should be noted that this result may be biased by the choice of test set, as has been shown in the context of the SAMPL contest where hydration energies were to be predicted.[123,124] It was found here that larger molecules with multiple functional groups have similar deviations from the experimental hydration energy—errors up to 10 kJ/mol.[123] It seems plausible that part of this error is due to the simple nonpolarizable water model used, however, since the enthalpy of vaporization is approximately additive (which can be seen by plotting ΔHvap for, e.g., alkanes as a function of the number of carbons), the error per functional group should still be relatively low, less than 5 kJ/mol for most groups. In the present work, we studied pure liquids only, providing a simpler test set than what has been used in previous studies. Further tests on pure liquids and liquid mixtures should provide a more detailed understanding of the predictive power of force field calculations. At the same time, systematic methods for force field development[23,125] could be used for the improvement of classical force fields.
  59 in total

1.  Distinguishing native conformations of proteins from decoys with an effective free energy estimator based on the OPLS all-atom force field and the Surface Generalized Born solvent model.

Authors:  Anthony K Felts; Emilio Gallicchio; Anders Wallqvist; Ronald M Levy
Journal:  Proteins       Date:  2002-08-01

2.  Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation.

Authors:  Araz Jakalian; David B Jack; Christopher I Bayly
Journal:  J Comput Chem       Date:  2002-12       Impact factor: 3.376

3.  PRODRG: a tool for high-throughput crystallography of protein-ligand complexes.

Authors:  Alexander W Schüttelkopf; Daan M F van Aalten
Journal:  Acta Crystallogr D Biol Crystallogr       Date:  2004-07-21

4.  Basis set exchange: a community database for computational sciences.

Authors:  Karen L Schuchardt; Brett T Didier; Todd Elsethagen; Lisong Sun; Vidhya Gurumoorthi; Jared Chase; Jun Li; Theresa L Windus
Journal:  J Chem Inf Model       Date:  2007-04-12       Impact factor: 4.956

5.  Alanine polypeptide structural fingerprints at room temperature: what can be gained from non-harmonic Car-Parrinello molecular dynamics simulations.

Authors:  M-P Gaigeot
Journal:  J Phys Chem A       Date:  2008-12-25       Impact factor: 2.781

6.  Massively parallel computation of absolute binding free energy with well-equilibrated states.

Authors:  Hideaki Fujitani; Yoshiaki Tanida; Azuma Matsuura
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2009-02-26

7.  Thermodynamics of hydrogen bonding in hydrophilic and hydrophobic media.

Authors:  David van der Spoel; Paul J van Maaren; Per Larsson; Nicusor Tîmneanu
Journal:  J Phys Chem B       Date:  2006-03-09       Impact factor: 2.991

8.  Application of Molecular Dynamics Simulations in Molecular Property Prediction I: Density and Heat of Vaporization.

Authors:  Junmei Wang; Hou Tingjun
Journal:  J Chem Theory Comput       Date:  2011-07-12       Impact factor: 6.006

9.  Balance of Attraction and Repulsion in Nucleic-Acid Base Stacking: CCSD(T)/Complete-Basis-Set-Limit Calculations on Uracil Dimer and a Comparison with the Force-Field Description.

Authors:  Claudio A Morgado; Petr Jurečka; Daniel Svozil; Pavel Hobza; Jiří Šponer
Journal:  J Chem Theory Comput       Date:  2009-04-29       Impact factor: 6.006

Review 10.  Hydrogen bonding and pi-stacking: how reliable are force fields? A critical evaluation of force field descriptions of nonbonded interactions.

Authors:  Robert S Paton; Jonathan M Goodman
Journal:  J Chem Inf Model       Date:  2009-04       Impact factor: 4.956

View more
  62 in total

1.  Osmotic Pressure Simulations of Amino Acids and Peptides Highlight Potential Routes to Protein Force Field Parameterization.

Authors:  Mark S Miller; Wesley K Lay; Adrian H Elcock
Journal:  J Phys Chem B       Date:  2016-04-21       Impact factor: 2.991

2.  Toward Improved Force-Field Accuracy through Sensitivity Analysis of Host-Guest Binding Thermodynamics.

Authors:  Jian Yin; Andrew T Fenley; Niel M Henriksen; Michael K Gilson
Journal:  J Phys Chem B       Date:  2015-08-05       Impact factor: 2.991

3.  GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit.

Authors:  Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl
Journal:  Bioinformatics       Date:  2013-02-13       Impact factor: 6.937

4.  Measuring electrostatic fields in both hydrogen-bonding and non-hydrogen-bonding environments using carbonyl vibrational probes.

Authors:  Stephen D Fried; Sayan Bagchi; Steven G Boxer
Journal:  J Am Chem Soc       Date:  2013-07-18       Impact factor: 15.419

5.  Chiral recognition of liquid phase dimers from gamma-valerolactone racemic mixture.

Authors:  Felippe M Colombari; André F de Moura; Luiz Carlos G Freitas
Journal:  J Mol Model       Date:  2018-07-26       Impact factor: 1.810

6.  Escaping Atom Types in Force Fields Using Direct Chemical Perception.

Authors:  David L Mobley; Caitlin C Bannan; Andrea Rizzi; Christopher I Bayly; John D Chodera; Victoria T Lim; Nathan M Lim; Kyle A Beauchamp; David R Slochower; Michael R Shirts; Michael K Gilson; Peter K Eastman
Journal:  J Chem Theory Comput       Date:  2018-10-30       Impact factor: 6.006

7.  Evaluation of CM5 Charges for Nonaqueous Condensed-Phase Modeling.

Authors:  Leela S Dodda; Jonah Z Vilseck; Kara J Cutrona; William L Jorgensen
Journal:  J Chem Theory Comput       Date:  2015-08-27       Impact factor: 6.006

8.  Toward Automated Benchmarking of Atomistic Force Fields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive.

Authors:  Kyle A Beauchamp; Julie M Behr; Ariën S Rustenburg; Christopher I Bayly; Kenneth Kroenlein; John D Chodera
Journal:  J Phys Chem B       Date:  2015-09-29       Impact factor: 2.991

9.  Olanzapine crystal symmetry originates in preformed centrosymmetric solute dimers.

Authors:  Monika Warzecha; Lakshmanji Verma; Blair F Johnston; Jeremy C Palmer; Alastair J Florence; Peter G Vekilov
Journal:  Nat Chem       Date:  2020-09-23       Impact factor: 24.427

10.  Vibrational Stark Effects of Carbonyl Probes Applied to Reinterpret IR and Raman Data for Enzyme Inhibitors in Terms of Electric Fields at the Active Site.

Authors:  Samuel H Schneider; Steven G Boxer
Journal:  J Phys Chem B       Date:  2016-08-31       Impact factor: 2.991

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.