Leonid Pereyaslavets1, Ganesh Kamath2, Boris Fain3, Oleg Butin2, Alexey Illarionov2, Michael Olevanov2,4, Igor Kurnikov2, Serzhan Sakipov2, Igor Leontyev2, Ekaterina Voronina2,4, Tyler Gannon2, Grzegorz Nawrocki2, Mikhail Darkhovskiy2, Ilya Ivahnenko2, Alexander Kostikov2, Jessica Scaranto5, Maria G Kurnikova5, Suvo Banik6,7, Henry Chan6,7, Michael G Sternberg6, Subramanian K R S Sankaranarayanan6,7, Brad Crawford8, Jeffrey Potoff8, Michael Levitt9, Roger D Kornberg9.
Abstract
The main goal of molecular simulation is to accurately predict experimental observables of molecular systems. Another long-standing goal is to devise models for arbitrary neutral organic molecules with little or no reliance on experimental data. While separately these goals have been met to various degrees, for an arbitrary system of molecules they have not been achieved simultaneously. For biophysical ensembles that exist at room temperature and pressure, and where the entropic contributions are on par with interaction strengths, it is the free energies that are both most important and most difficult to predict. We compute the free energies of solvation for a diverse set of neutral organic compounds using a polarizable force field fitted entirely to ab initio calculations. The mean absolute errors (MAE) of hydration, cyclohexane solvation, and corresponding partition coefficients are 0.2 kcal/mol, 0.3 kcal/mol and 0.22 log units, i.e. within chemical accuracy. The model (ARROW FF) is multipolar and polarizable, and its accompanying simulation stack includes nuclear quantum effects (NQE). The simulation tools' computational efficiency is on par with current state-of-the-art packages. The construction of a wide-coverage molecular modelling toolset from first principles, together with its excellent predictive ability in the liquid phase, is a major advance in biomolecular simulation.
Year: 2022 PMID: 35058472 PMCID: PMC8776904 DOI: 10.1038/s41467-022-28041-0
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 QM: FF energies’ correspondences and deviations.
a FF vs. QM energies for all the dimers in our training sets. The functional form reproduces the lower-energy conformations very well and is designed to permit a larger error in the less important high-energy regions of high electron overlap. b The distribution of errors for our training dimer sets. The MAEs are 0.17 kcal/mol for all dimers, 0.19 kcal/mol for dimers with water (total number of dimers = 36,309), and 0.16 kcal/mol for dimers with alkanes (total number of dimers = 25,986). A specific system (ethanol-water) provides a more detailed illustration of model energies and their correspondence with QM. c, d Dissociation curves for the primary (c) and secondary (d) minima of the ethanol-water dimer. QM energies are solid lines and FF values are filled circles. The colors designate the energy components: electrostatics (ES), exchange-repulsion (EX), dispersion (DS) and induction (IND). The agreement for the total energy and for each component is excellent. e–g Error distributions for the ethanol-water dimer: e is analogous to a; f is a difference plot offering a more detailed view and is projected onto (g) the error distribution. The MAE for this system is 0.08 kcal/mol.
Fig. 2 Properties of the ARROW water model.
a The non-additive many-body error for water multimers vs. their total QM intermolecular energy. All the many-body errors are below 0.5 kcal/mol or 1% of total energy, and below 3% of the many-body contributions. b The radial distribution function (RDF) for the O–O distance in water. The classical MD RDF (dotted green) is over-structured compared with the experimental curve; including NQE (solid green) brings the structure of ARROW H2O into excellent agreement with experiment.
Neat properties and hydration/solvation of water and cyclohexane.
| H2O | Density (g/cc) | Hvap (kcal/mol) | Hydration (kcal/mol) | Self-solvation (kcal/mol) |
|---|---|---|---|---|
| expt | 0.997 | 10.51 | −6.30 | −6.30 |
| ARROW FF (MD) | 1.027 | 11.98 | −6.81 | −6.81 |
| ARROW FF (PIMD8) | 1.027 | 10.63 | −6.13 | −6.13 |
Predictions were performed by classical simulations and with inclusion of NQE. All numbers are in good agreement with the experimental values, with PIMD simulations being significantly closer than the classical MD ones. The self-solvation of water is a succinct measure of model accuracy and we recommend its determination for all water models. For water the self-solvation and hydration are obviously identical.
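The comparison in the table can be made explicit by tabulating each method's absolute deviation from experiment. The following is a minimal sketch, not part of the original work: the numbers are copied from the water rows above, while the dictionary layout and the helper name `absolute_errors` are illustrative assumptions.

```python
# Water values copied from the table above; the data layout is an assumption.
EXPERIMENT = {"density": 0.997, "hvap": 10.51, "hydration": -6.30}

MODELS = {
    "ARROW FF (MD)": {"density": 1.027, "hvap": 11.98, "hydration": -6.81},
    "ARROW FF (PIMD8)": {"density": 1.027, "hvap": 10.63, "hydration": -6.13},
}


def absolute_errors(predicted, reference):
    """Return |predicted - reference| for every property in `reference`."""
    return {name: abs(predicted[name] - reference[name]) for name in reference}


# Absolute deviation from experiment, per method and per property.
abs_errors = {
    label: absolute_errors(values, EXPERIMENT) for label, values in MODELS.items()
}

for label, errors in abs_errors.items():
    summary = ", ".join(f"{name}: {err:.2f}" for name, err in errors.items())
    print(f"{label} -> {summary}")
```

With these values, the deviation in Hvap drops from 1.47 to 0.12 kcal/mol and the deviation in hydration free energy from 0.51 to 0.17 kcal/mol when NQE are included, while the density deviation (0.03 g/cc) is unchanged.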
Fig. 3 ARROW force field solvation predictions.
a Predicted vs. experimental free energy of hydration for a diverse set of compounds. The straight line is a line of perfect agreement between experimental and theoretical values, and the gray bar is the range of chemical accuracy. The predicted and experimental free energies of hydration for neutral amino acid analogs are inset. b Predicted vs. experimental free energy of solvation in cyclohexane. The molecules here are a subset of those in a because only those with experimental values for CHEX solvation can be included. c H2O/CHEX partition coefficient for the same set as b. The free energy predictions are well within chemical accuracy.
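A partition coefficient in log units can be related to the two solvation free energies through the standard thermodynamic relation log P = (ΔG_hyd − ΔG_chex) / (RT ln 10). The sketch below assumes that relation, a temperature of 298.15 K, and a sign convention in which a positive value means the solute prefers cyclohexane; none of these is stated in the captions here, so the paper's own convention may differ. The function names and the example free energies are hypothetical.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol K)


def kcal_per_log_unit(temperature=298.15):
    """Free energy (kcal/mol) corresponding to one log10 unit: RT ln 10."""
    return GAS_CONSTANT * temperature * math.log(10.0)


def log_partition_coefficient(dg_hydration, dg_cyclohexane, temperature=298.15):
    """log10 P (cyclohexane/water) from solvation free energies in kcal/mol.

    Assumed convention: positive means the solute prefers cyclohexane.
    """
    return (dg_hydration - dg_cyclohexane) / kcal_per_log_unit(temperature)


# Hypothetical solute, for illustration only (not taken from the paper).
example_log_p = log_partition_coefficient(dg_hydration=-5.0, dg_cyclohexane=-3.0)
print(f"RT ln 10 at 298.15 K: {kcal_per_log_unit():.3f} kcal/mol")
print(f"example log P: {example_log_p:.2f}")
```

Under these assumptions one log unit corresponds to roughly 1.36 kcal/mol, so an error of 0.22 log units is equivalent to about 0.3 kcal/mol in the transfer free energy.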
Fig. 4 NQE effect and comparison of hydration predictions.
a A visual comparison of the PIMD8 and classical MD hydration predictions against experimental values. The inclusion of NQE systematically improves the predictions and decreases the overall error (MAE) from 0.78 to 0.2 kcal/mol. b A comparison of the hydration free energies with those from state-of-the-art wide-coverage force fields. The molecules shown include the major functional groups that have been parametrized by all three models and are therefore a subset of those shown in Fig. 3a. The errors (MAE) for GAFF, AMOEBA, and ARROW FF are 0.88, 0.76, and 0.22 kcal/mol, respectively.