François Zielinski1,2, Peter I Maxwell1,2, Timothy L Fletcher1,2, Stuart J Davie1,2, Nicodemo Di Pasquale1,2, Salvatore Cardamone1,2, Matthew J L Mills3,4, Paul L A Popelier5,6.
Abstract
The geometry optimization of a water molecule is presented with a novel type of energy function called FFLUX, which bypasses the traditional bonded potentials. Instead, the machine learning method kriging is trained on topologically partitioned (IQA) atomic energies so that it can predict these energies for a previously unseen molecular geometry. A rigorous proof of concept demonstrates that FFLUX's architecture is suitable for geometry optimization. Accurate kriging models optimize 2000 distorted geometries to within 0.28 kJ mol−1 of the corresponding ab initio energy, and 50% of those to within 0.05 kJ mol−1. Kriging models are robust enough to optimize the molecular geometry to sub-noise accuracy even when two-thirds of the geometric inputs are outside the training range of that model. Finally, the individual components of the potential energy are analyzed: in contrast to standard force fields, chemical intuition is reflected in the independent behavior of the three energy terms (intra-atomic, electrostatic and exchange).
Year: 2017 PMID: 28993674 PMCID: PMC5634454 DOI: 10.1038/s41598-017-12600-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flowchart of FFLUX’s training (first four steps) and execution (DL_POLY), detailing the programs involved and summaries of their tasks.
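As a rough illustration of the train-then-predict split in the flowchart, the following minimal sketch fits a kriging model to a handful of energies and evaluates it at an unseen geometry. It is not FFLUX's implementation: the Gaussian correlation function, the fixed `theta`, the sample-mean estimate of the constant mean, the `nugget` and the one-dimensional harmonic toy energy are all assumptions chosen for brevity.

```python
import numpy as np

def kernel(xa, xb, theta):
    """Gaussian correlation exp(-theta * squared distance) between two point sets."""
    d2 = ((xa[:, None, :] - xb[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-theta * d2)

def train(x, y, theta=100.0, nugget=1e-10):
    """Return the constant mean and the kriging weights for training data (x, y)."""
    mu = y.mean()  # assumption: the sample mean stands in for a fitted constant mean
    k = kernel(x, x, theta) + nugget * np.eye(len(x))  # small nugget for stability
    return mu, np.linalg.solve(k, y - mu)

def predict(x_new, x, mu, weights, theta=100.0):
    """Kriging prediction: mean plus correlation-weighted training residuals."""
    return mu + kernel(x_new, x, theta) @ weights

# Hypothetical toy data: a harmonic stretch energy around r0 = 0.96 (arbitrary units)
r_train = np.array([[0.86], [0.91], [0.96], [1.01], [1.06]])
e_train = 400.0 * (r_train[:, 0] - 0.96) ** 2

mu, w = train(r_train, e_train)

# Geometry that was not part of the training set
r_new = np.array([[0.985]])
e_pred = predict(r_new, r_train, mu, w)[0]
e_true = 400.0 * (0.985 - 0.96) ** 2
print(f"predicted {e_pred:.3f}, reference {e_true:.3f}")
```

In a production kriging model the hyperparameters would be optimized rather than fixed, and the inputs would be the geometric features of each atom rather than a single distance.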
Figure 2. S-curves for the 100, 300, 500, T500 and TE500 water models described using the three energies given in Eqn. 1 (intra-atomic, electrostatic and exchange). The label “T” stands for the tighter scrubbing threshold of 0.00005 Hartrees, while “TE” stands for this tight model using single total atomic energies.
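For readers unfamiliar with S-curves, the sketch below shows one way to build such a curve, assuming it plots the percentage of test geometries whose absolute prediction error is at or below a given value. The function name `s_curve` and the four sample energies are hypothetical.

```python
import numpy as np

def s_curve(predicted, reference):
    """Return sorted absolute errors and the cumulative percentage of test points."""
    abs_err = np.sort(np.abs(np.asarray(predicted) - np.asarray(reference)))
    percentile = 100.0 * np.arange(1, len(abs_err) + 1) / len(abs_err)
    return abs_err, percentile

# Hypothetical predicted and reference energies for four test geometries
errors, percent = s_curve([1.0, 2.5, 2.9, 4.2], [1.1, 2.0, 3.0, 4.0])
for err, pct in zip(errors, percent):
    print(f"{pct:5.1f}% of geometries have an error <= {err:.2f}")
```

Plotting `percent` against `errors` (usually on a logarithmic error axis) gives the characteristic S shape; a curve further to the left indicates a more accurate model.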
Statistical analysis of the performance of the five water kriging models.
| Measure | Model 100 | Model 300 | Model 500 | Model T500 | Model TE500 |
|---|---|---|---|---|---|
| Test Set Energy Range | 194.8 | 199.6 | 199.6 | 179.2 | 179.2 |
| Training Set Energy Range | 181.0 | 197.5 | 199.4 | 188.7 | 188.7 |
| Maximum Error | 16.6 | 0.8 | 0.6 | 0.9 | 0.3 |
| Mean Absolute Error (MAE) | 1.00 | 0.10 | 0.06 | 0.07 | 0.10 |
| Prediction % Error | 0.51 | 0.05 | 0.03 | 0.04 | 0.05 |
All energies are in kJ mol−1.
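The “Prediction % Error” row is not defined in this excerpt. The tabulated values are consistent with the MAE expressed as a percentage of the test set energy range, and the short check below recomputes the row under that assumption. The function name is hypothetical; the numbers are copied from the table.

```python
# (MAE, test set energy range) in kJ mol-1, copied from the table above
table = {
    "100": (1.00, 194.8),
    "300": (0.10, 199.6),
    "500": (0.06, 199.6),
    "T500": (0.07, 179.2),
    "TE500": (0.10, 179.2),
}

def prediction_percent_error(mae, test_range):
    """Assumed definition: MAE as a percentage of the test set energy range."""
    return 100.0 * mae / test_range

recomputed = {
    model: round(prediction_percent_error(mae, rng), 2)
    for model, (mae, rng) in table.items()
}
for model, value in recomputed.items():
    print(model, value)
```

With the rounded MAE values, the first four models reproduce the tabulated percentages; TE500 comes out at 0.06 rather than 0.05, a difference that is within the rounding of the two-decimal MAE.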
Figure 3. SP1 (left, +15.05 kJ mol−1), SP2 (middle, +47.97 kJ mol−1) and SP3 (right, +126.18 kJ mol−1) water geometries. Bond distances are in Å, and bond angles in degrees.
SP1, SP2 and SP3 water optimization results for each model (100, 300, 500, T500 and TE500).
| Model: | 100 | | 300 | | 500 | | T500 | | TE500 | |
|---|---|---|---|---|---|---|---|---|---|---|
| QM Energy | −199 620.00 | | | | | | | | | |
| SP1 | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE |
| SP Energy | −199 604.95 | 15.05 | −199 604.95 | 15.05 | −199 604.95 | 15.05 | −199 604.95 | 15.05 | −199 604.95 | 15.05 |
| Set 1 | −199 619.89 | 0.11 | −199 619.88 | 0.12 | −199 619.99 | 0.01 | −199 619.96 | 0.04 | −199 619.86 | 0.14 |
| Set 2 | −199 619.89 | 0.11 | −199 619.88 | 0.12 | −199 619.99 | 0.01 | −199 619.91 | 0.08 | −199 619.86 | 0.14 |
| Set 3 | −199 621.42 | −1.42 | −199 619.86 | 0.14 | −199 619.96 | 0.03 | −199 619.93 | 0.06 | −199 619.81 | 0.19 |
| Set 4 | −199 621.42 | −1.42 | −199 619.86 | 0.14 | −199 091.34ᵃ | — | −199 619.93 | 0.06 | −199 619.81 | 0.19 |
| SP2 | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE |
| SP Energy | −199 572.03 | 47.97 | −199 572.03 | 47.97 | −199 572.03 | 47.97 | −199 572.03 | 47.97 | −199 572.03 | 47.97 |
| Set 1 | −199 621.60 | −1.60 | −199 619.87 | 0.13 | −199 619.99 | 0.01 | −199 620.00 | 0.00 | −199 619.86 | 0.14 |
| Set 2 | −199 620.19 | −0.20 | −199 619.87 | 0.13 | −199 619.99 | 0.01 | −199 619.96 | 0.04 | −199 619.86 | 0.14 |
| Set 3 | −199 621.40 | −1.40 | −199 619.84 | 0.15 | −199 619.72 | 0.28 | −199 619.82 | 0.17 | −199 619.79 | 0.20 |
| Set 4 | −199 621.40 | −1.40 | −199 619.84 | 0.15 | −199 619.72 | 0.28 | −199 619.82 | 0.17 | −199 619.79 | 0.20 |
| SP3 | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE | Energy/kJ mol−1 | ΔE |
| SP Energy | −199 493.81 | 126.18 | −199 493.81 | 126.18 | −199 493.81 | 126.18 | −199 493.81 | 126.18 | −199 493.81 | 126.18 |
| Set 1 | −199 621.58 | −1.58 | −199 619.99 | 0.00 | −199 620.06 | −0.06 | −199 619.97 | 0.03 | −199 619.86 | 0.14 |
| Set 2 | −199 621.58 | −1.58 | −199 619.88 | 0.12 | −199 619.99 | 0.01 | −199 619.95 | 0.04 | −199 619.86 | 0.14 |
| Set 3 | −199 620.52 | −0.53 | −199 619.85 | 0.14 | −199 619.87 | 0.12 | −199 619.83 | 0.16 | −199 619.80 | 0.20 |
| Set 4 | −199 620.52 | −0.53 | −199 619.85 | 0.14 | −199 619.87 | 0.12 | −199 619.83 | 0.16 | −199 619.80 | 0.20 |
ΔE is the energy difference between the molecule’s optimized energy and its QM energy. Set 1: 0 K run for 5000 steps with a time step of 1 fs; Set 2: same as Set 1 but with a time step of 0.5 fs; Set 3: conjugate gradient (CG) with a time step of 1 fs and a convergence threshold of 10⁻⁵ Å; Set 4: same as Set 3 but with a convergence threshold of 10⁻⁶ Å.
ᵃThe minimum was never reached for reasons described in the main text (“fourth observation”).
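The ΔE columns can be recomputed from the tabulated energies. The sketch below does this for two entries, assuming ΔE is the optimized energy minus the QM energy; the helper `parse_energy` is hypothetical and only exists to handle the typographic minus sign and the digit-group spacing used in the table.

```python
def parse_energy(text):
    """Convert a tabulated energy such as '−199 619.89' to a float."""
    cleaned = "".join(text.split())  # drop spaces used as thousands separators
    cleaned = cleaned.replace("\u2212", "-").replace(",", "")
    return float(cleaned)

qm_energy = parse_energy("−199 620.00")

def delta_e(optimized):
    """Optimized energy minus QM energy, in kJ mol-1, rounded like the table."""
    return round(parse_energy(optimized) - qm_energy, 2)

# Model 100, SP1, Set 1 and model T500, SP2, Set 1 (values copied from the table)
print(delta_e("−199 619.89"))
print(delta_e("−199 620.00"))
```

A negative ΔE, as seen for several 100-model runs, means the model's optimized energy lies below the QM minimum.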
Water’s optimized geometrical data from each starting point (SP1, SP2 and SP3) using the five models with parameter Set 1 throughout. Optimized values are reported relative to the QM values, i.e. bond distances and angles are plotted as “relative data” bars where red indicates a lower value and blue a higher value. The magnitude of each bar is marked by its length, normalized using all resulting bond distances across all three SPs. The largest bar (red, SP3, 100 model) is set to one unit of length. The angles are treated similarly, with the unit length bar being “blue, SP1, 100 model”.
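The “relative data” bar construction described in this caption can be sketched as follows, assuming each bar is the signed deviation from the QM value divided by the largest absolute deviation in the set. The function name and the sample bond lengths are hypothetical.

```python
def relative_bars(values, qm_value):
    """Return (normalized length, colour) per value; red = below QM, blue = above."""
    deviations = [v - qm_value for v in values]
    longest = max(abs(d) for d in deviations)
    bars = []
    for d in deviations:
        length = d / longest if longest > 0 else 0.0
        colour = "red" if d < 0 else "blue" if d > 0 else None
        bars.append((length, colour))
    return bars

# Hypothetical optimized O-H distances (in Å) against a hypothetical QM value
bars = relative_bars([0.94, 0.97, 0.96], qm_value=0.96)
for length, colour in bars:
    print(f"{length:+.2f} {colour}")
```

The largest deviation always maps to a bar of unit length, so bars are comparable within one normalization set but not between distances and angles, which are normalized separately.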
Figure 4. T500 molecular model geometry optimization trajectories from the SP1 (blue), SP2 (red) and SP3 (green) starting points: (a) Set 1 (0 K and 1 fs timestep), truncated at 500 steps, where the energy fluctuation is <0.0001 kJ mol−1, and (b) Set 3 (CG and 1 fs timestep), with no truncation. The x-axis marks the timestep number. In the left panels, the y-axes denote the molecular energy; in the right panels, the y-axes denote ΔE (current energy − previous energy). All energies are in kJ mol−1.
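The right-hand panels plot the step-to-step energy change, and the 0 K trajectory is cut where that change falls below 0.0001 kJ mol−1. A minimal sketch of locating such a point in an energy trace is shown below; the function name and the sample trajectory are hypothetical, and a practical criterion would likely require the change to stay small over several consecutive steps.

```python
def first_quiet_step(energies, threshold=1e-4):
    """Return the first step whose energy change from the previous step is below threshold."""
    for step in range(1, len(energies)):
        if abs(energies[step] - energies[step - 1]) < threshold:
            return step
    return None  # the trace never settled within the threshold

# Hypothetical energy trace relative to the minimum, in kJ mol-1
trace = [10.0, 1.0, 0.1, 0.01, 0.00995, 0.00994]
print(first_quiet_step(trace))
```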
Optimized geometrical data for each of the four SP-OUT runs for the most energetically stable parameter set (Set 1). All runs are completed using the T500 model. Optimized values are reported relative to the QM values, i.e. bond distances and angles are plotted as “relative data” bars where red indicates a lower value and blue a higher value. The magnitude of each bar is marked by its length, normalized using all resulting bond distances across all three SPs. The largest bar (blue, SP-OUT4, O1-H3) is set to one unit of length. The angles are treated similarly, with the unit length bar being “red, SP-OUT1”. Values outside the training range are highlighted in yellow and excluded from the data bar calculations.
Optimization results from starting points (SP) generated outside (OUT) the training set energy range (called “SP-OUT 1” to “SP-OUT 4”), using the T500 water model.
| | T500 - Outside 1 (SP-OUT 1) | | | T500 - Outside 2 (SP-OUT 2) | | |
|---|---|---|---|---|---|---|
| QM Energy | −199 620.00 | | | −199 620.00 | | |
| | Energy/kJ mol−1 | ΔE/kJ mol−1 | Steps | Energy/kJ mol−1 | ΔE/kJ mol−1 | Steps |
| SP Energy | −199 424.75 | 195.24 (6.53 above the training range) | | −199 418.84 | 201.15 (12.45 above the training range) | |
| Set 1 | −199 619.89 | 0.10 | 5000 | −199 619.95 | 0.05 | 5000 |
| Set 2 | −199 619.86 | 0.14 | 5000 | −199 619.95 | 0.05 | 5000 |
| Set 3 | −199 619.86 | 0.13 | 360 | −199 619.85 | 0.14 | 350 |
| Set 4 | −199 619.86 | 0.13 | 360 | −199 619.86 | 0.13 | 435 |
| | T500 - Outside 3 (SP-OUT 3) | | | T500 - Outside 4 (SP-OUT 4) | | |
| | Energy/kJ mol−1 | ΔE/kJ mol−1 | Steps | Energy/kJ mol−1 | ΔE/kJ mol−1 | Steps |
| SP Energy | −199 319.50 | 300.50 (111.79 above the training range) | | −199 027.78 | 592.21 (403.5 above the training range) | |
| Set 1 | −199 619.94 | 0.05 | 5000 | −199 482.19 | 137.81 | 5000 |
| Set 2 | −199 619.95 | 0.04 | 5000 | −199 482.19 | 137.81 | 5000 |
| Set 3 | −199 619.96 | 0.04 | 563 | −199 085.71 | 534.29 | 942 |
| Set 4 | −199 619.96 | 0.04 | 563 | −199 085.71 | 534.29 | 942 |
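The second number quoted with each starting-point ΔE (6.53, 12.45, 111.79 and 403.5) appears to be the amount by which that ΔE exceeds the T500 training set energy range of 188.7 kJ mol−1 given in the statistics table. The check below tests that reading; it is an inference from the tabulated values, not a definition stated in this excerpt.

```python
TRAINING_RANGE = 188.7  # T500 training set energy range in kJ mol-1 (statistics table)

# name: (starting-point ΔE, second number quoted alongside it), both in kJ mol-1
sp_out = {
    "SP-OUT 1": (195.24, 6.53),
    "SP-OUT 2": (201.15, 12.45),
    "SP-OUT 3": (300.50, 111.79),
    "SP-OUT 4": (592.21, 403.5),
}

excess = {
    name: round(delta_e - TRAINING_RANGE, 2)
    for name, (delta_e, _) in sp_out.items()
}
for name, (_, quoted) in sp_out.items():
    print(f"{name}: recomputed {excess[name]:.2f}, quoted {quoted}")
```

The recomputed and quoted values agree to within 0.01 kJ mol−1, which is the precision lost by rounding the training range to one decimal place.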
Figure 5. Performance of the T500 model using the 0 K optimization: (a) aggregated plot of the molecular energy evolution in time for each of the 2000 starting geometries considered (runs are coloured from dark to light blue to allow tracking); (b) magnified energy evolution between the 900th and 1000th timesteps; (c) distribution of energies at the 1000th timestep, relative to the ab initio energy.
Figure 6. Single-energy optimized water geometries using the individual intra-atomic, electrostatic and exchange energies. The initial geometry is the QM minimum, and the optimizations are performed using the T500 model with parameter Set 1.
Geometrical data for the single-energy optimized runs, using Set 1 with the T500 model. Optimized values are reported relative to the QM values, i.e. as [Resulting Feature – QM], and bond distances and angles are plotted as “relative data” bars where red indicates a lower value and blue a higher value. The magnitude of each bar is marked by its length, normalized using all resulting bond distances across all three SPs. The largest bar (red, [Formula: see text], O1-H2) is set to one unit of length. The angles are treated similarly, with the unit length bar being “blue, [Formula: see text]”.