Hadrián Montes-Campos, Jesús Carrete, Sebastian Bichelmaier, Luis M. Varela, Georg K. H. Madsen.
Abstract
We present NeuralIL, a model for the potential energy of an ionic liquid that accurately reproduces first-principles results with orders-of-magnitude savings in computational cost. Built on the basis of a multilayer perceptron and spherical Bessel descriptors of the atomic environments, NeuralIL is implemented in such a way as to be fully automatically differentiable. It can thus be trained on ab initio forces instead of just energies, to make the most out of the available data, and can efficiently predict arbitrary derivatives of the potential energy. Using ethylammonium nitrate as the test system, we obtain out-of-sample accuracies better than 2 meV atom⁻¹ (<0.05 kcal mol⁻¹) in the energies and 70 meV Å⁻¹ in the forces. We show that encoding the element-specific density in the spherical Bessel descriptors is key to achieving this. Harnessing the information provided by the forces drastically reduces the amount of atomic configurations required to train a neural network force field based on atom-centered descriptors. We choose the Swish-1 activation function and discuss the role of this choice in keeping the neural network differentiable. Furthermore, the possibility of training on small data sets allows for an ensemble-learning approach to the detection of extrapolation. Finally, we find that a separate treatment of long-range interactions is not required to achieve a high-quality representation of the potential energy surface of these dense ionic systems.
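Because the model is end-to-end differentiable, the ab initio forces can enter the loss directly as the negative gradient of the predicted energy with respect to the atomic positions. A minimal JAX sketch of that idea follows; the toy energy function, the weighting scheme, and all names are illustrative placeholders, not NeuralIL's actual code:

```python
import jax
import jax.numpy as jnp


def model_energy(params, positions):
    # Hypothetical toy stand-in for the real pipeline
    # (descriptors -> embedding -> multilayer perceptron).
    return jnp.sum(jnp.tanh(positions @ params))


def loss(params, positions, e_ref, f_ref, force_weight=0.99):
    """Mix energy and force errors; forces come from reverse-mode AD."""
    e = model_energy(params, positions)
    # Forces are the negative gradient of the energy w.r.t. positions.
    f = -jax.grad(model_energy, argnums=1)(params, positions)
    n = positions.shape[0]
    return ((1.0 - force_weight) * jnp.abs(e - e_ref) / n
            + force_weight * jnp.mean(jnp.abs(f - f_ref)))


positions = jnp.array([[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]])
params = jnp.array([0.1, 0.2, 0.3])
# JAX differentiates through the inner force computation as well,
# so the whole loss can be minimized with standard gradient descent.
grads = jax.grad(loss)(params, positions, 0.0, jnp.zeros_like(positions))
```

Every force component contributes a training signal (3 × natoms numbers per configuration versus one energy), which is why far fewer configurations suffice than with energy-only training.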
Year: 2021 PMID: 34941253 PMCID: PMC8757435 DOI: 10.1021/acs.jcim.1c01380
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956
Figure 1. Examples of basis functions for the spherical Bessel descriptors. (top) Radial components of all basis functions for nmax = 5. (bottom) Angular components of two basis functions represented in the half-plane ϕ = 0.
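The radial components in the top panel are built from spherical Bessel functions of the first kind. As an illustrative check (not the article's code; the actual descriptors use linear combinations of these functions chosen to vanish at the cutoff radius), SciPy can evaluate them directly against their closed forms:

```python
import numpy as np
from scipy.special import spherical_jn

# Spherical Bessel functions of the first kind, j_n(r).
r = np.linspace(0.1, 6.0, 200)
j0 = spherical_jn(0, r)  # closed form: sin(r) / r
j1 = spherical_jn(1, r)  # closed form: sin(r)/r**2 - cos(r)/r
```

The descriptor basis rescales the argument so that each radial function is supported inside the cutoff sphere around the central atom.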
Figure 2. Global schematic representation of the ML model, including the calculation of descriptors, the embedding, and the NN. natoms, np, and nemb are the number of atoms, the number of descriptors [see eq ], and the dimension of the embedding, respectively. The diagram at the bottom illustrates, schematically, how reverse-mode automatic differentiation computes the forces; α is a shorthand index that runs over all descriptors for all atoms in the system. A cross-hatch fill represents full all-to-all connectivity between adjacent layers.
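The forward pass the caption describes (per-atom descriptors, an element embedding concatenated to them, a dense network shared by all atoms, and a permutation-invariant sum over per-atom energies) can be sketched in plain NumPy. All sizes and weights below are made-up placeholders, not the published architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n_atoms, n_desc, n_emb = 8, 12, 4

descriptors = rng.normal(size=(n_atoms, n_desc))  # per-atom environment descriptors
species = rng.integers(0, 3, size=n_atoms)        # element index of each atom
embed_table = rng.normal(size=(3, n_emb))         # learnable element embedding


def swish1(z):
    return z / (1.0 + np.exp(-z))  # z * sigmoid(z)


# Concatenate the element embedding with the descriptors and feed the
# result through a small dense network shared by all atoms.
x = np.concatenate([embed_table[species], descriptors], axis=1)
W1 = rng.normal(size=(n_desc + n_emb, 16)); b1 = rng.normal(size=16)
W2 = rng.normal(size=(16, 1)); b2 = rng.normal(size=1)

per_atom_energy = (swish1(x @ W1 + b1) @ W2 + b2).ravel()
total_energy = per_atom_energy.sum()  # invariant under atom permutations
```

Summing identical per-atom contributions is what makes the total energy invariant under relabeling of the atoms, and reverse-mode differentiation of this scalar with respect to the positions (through the descriptors) yields all forces in one backward pass.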
Figure 3. Predicted vs reference forces for the NeuralIL model over the training (blue) and validation (green) sets.
Table 1. Mean Absolute Errors in the Energies and the Forces Achieved with Several Kinds of Models over the Validation Set^a

| model | energy MAE (meV atom⁻¹) | force MAE (meV Å⁻¹) |
|---|---|---|
|  | 1.86 | 65.6 |
|  | 2.26 | 65.7 |
| χ | 11.8 | 167 |
|  | 16.9 | 171 |
|  | 7.42 | 109 |
|  | 1.63 | 559 |
|  | 3.10 | 71.3 |
|  | 12.0 | 91.0 |
|  | 1.93 | 60.8 |
|  | 856 | 1970 |
|  | 2.87 | 67.2 |
^a See Table 2 for a short description of each model, or see the main text for a more extended discussion.
Table 2. Summary of the Differences with Respect to NeuralIL of All Models Discussed in This Article and Listed in Table 1, for Quick Reference
Figure 4. Ensemble predictions of the projections of the N–O force (in the anion, top panel) and the C–N force (in the cation, bottom panel) on the segment joining both atoms, extracted from 18 instances of NeuralIL, each trained on a random sample containing 50% of the training data. The gray area spans a single standard deviation above and below the ensemble average. Also depicted: the main NeuralIL model, the OPLS-AA value of the same force, and the ground truth for all the NN models, i.e., the forces extracted from a GPAW DFT calculation. The bottom part of each panel shows a frequency density plot of the training data for the corresponding distance. The vertical dotted lines mark the minimum and maximum values found in the training set.
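The ensemble-spread approach to extrapolation detection can be reproduced on a one-dimensional toy problem. The polynomial surrogate below only loosely mirrors the figure's setup (it keeps the 18 members and the 50% subsampling, but everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 200)
y_train = np.sin(2.0 * np.pi * x_train) + rng.normal(scale=0.05, size=x_train.size)

x_query = np.linspace(-0.5, 1.5, 101)  # deliberately extends beyond the training range
preds = []
for _ in range(18):
    # Each ensemble member sees a random 50% subsample of the training data.
    idx = rng.choice(x_train.size, size=x_train.size // 2, replace=False)
    coeffs = np.polyfit(x_train[idx], y_train[idx], deg=7)
    preds.append(np.polyval(coeffs, x_query))
preds = np.array(preds)

# Inside the training domain the members agree; outside it they diverge,
# so a large standard deviation flags extrapolation.
spread = preds.std(axis=0)
```

This is the qualitative behavior the gray band in the figure shows: the spread stays small between the dotted lines and grows rapidly beyond them.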
Figure 5. Comparison of the predicted projections of the C–N force on a C–N bond from NeuralIL and from a model identical in all respects except that it uses SELU instead of Swish-1 as the activation function and therefore does not require the use of LayerNorm. The vertical dotted lines mark the minimum and maximum values found in the training set. (inset) First derivatives of the two activation functions.
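The inset's point, that Swish-1 has a smooth first derivative while SELU's first derivative jumps at the origin, can be checked numerically. This uses the standard published SELU constants and is not code from the article:

```python
import numpy as np


def swish1(x):
    return x / (1.0 + np.exp(-x))  # x * sigmoid(x), smooth everywhere


def selu(x, alpha=1.6732632423, scale=1.0507009873):
    # Piecewise definition: scale * x for x > 0, scale * alpha * (e^x - 1) otherwise.
    return scale * np.where(x > 0.0, x, alpha * np.expm1(x))


h = 1e-6
# One-sided finite-difference derivatives at the origin.
d_selu = ((selu(0.0) - selu(-h)) / h, (selu(h) - selu(0.0)) / h)
d_swish = ((swish1(0.0) - swish1(-h)) / h, (swish1(h) - swish1(0.0)) / h)
```

Since the forces are first derivatives of the network's output, a jump in the activation's derivative shows up directly as a kink in the predicted forces, which is why a smooth activation matters for a force field trained and evaluated through automatic differentiation.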