Hong Zhi Li, Li Hong Hu, Wei Tao, Ting Gao, Hui Li, Ying Hua Lu, Zhong Min Su.
Abstract
A DFT-SOFM-RBFNN method is proposed to improve the accuracy of DFT calculations of Y-NO (Y = C, N, O, S) homolysis bond dissociation energies (BDEs) by combining density functional theory (DFT) with artificial intelligence/machine learning methods, namely self-organizing feature mapping neural networks (SOFMNN) and radial basis function neural networks (RBFNN). A descriptor refinement step comprising SOFMNN clustering analysis and correlation analysis is implemented. The SOFMNN clustering analysis classifies the descriptors into groups, and the representative descriptor of each group is selected as a neural network input according to its closeness to the experimental values, as judged by correlation analysis. This newly introduced step avoids redundant descriptors and intuitively biased choices of descriptors. Using RBFNN calculation with the selected descriptors, chemical accuracy (≤1 kcal·mol−1) is achieved for all 92 organic Y-NO homolysis BDEs calculated by DFT-B3LYP, and the mean absolute deviations (MADs) of the B3LYP/6-31G(d) and B3LYP/STO-3G methods are reduced from 4.45 and 10.53 kcal·mol−1 to 0.15 and 0.18 kcal·mol−1, respectively. The improved results for the minimal basis set STO-3G reach the same accuracy as those for 6-31G(d), so B3LYP calculation with the minimal basis set is recommended to minimize the computational cost and to extend the method to large molecular systems. Further extrapolation tests are performed with six molecules (two containing Si-NO bonds and two containing fluorine), and the accuracy of the tests is within 1 kcal·mol−1. This study shows that DFT-SOFM-RBFNN is an efficient and highly accurate method for Y-NO homolysis BDEs. The method may be used as a tool to design new NO carrier molecules.
Keywords: Y-NO bond; density functional theory; homolysis bond dissociation energies; radial basis function neural network; self-organizing feature mapping neural network
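As a rough illustration of the RBFNN stage described in the abstract, the following is a minimal sketch of a Gaussian radial basis function network fitted by exact interpolation. It is not the authors' implementation: the function names (`fit_rbf`, `predict_rbf`), the choice of one Gaussian unit per training sample, the `width` and `ridge` parameters, and the toy data are all assumptions made for the example.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def gaussian(r2, width):
    """Gaussian radial basis function of a squared distance r2."""
    return math.exp(-r2 / (2.0 * width ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_rbf(X, y, width=1.0, ridge=1e-8):
    """Fit output weights with one Gaussian unit centred on each sample.

    `ridge` is a tiny diagonal term added for numerical stability
    (an assumption of this sketch, not taken from the source).
    """
    Phi = [[gaussian(sq_dist(a, b), width) for b in X] for a in X]
    for i in range(len(X)):
        Phi[i][i] += ridge
    return solve(Phi, y)

def predict_rbf(X_train, weights, x, width=1.0):
    """Network output: weighted sum of Gaussian units evaluated at x."""
    return sum(w * gaussian(sq_dist(c, x), width)
               for w, c in zip(weights, X_train))

# Toy data (hypothetical): one descriptor per sample, arbitrary targets.
X_toy = [[0.0], [1.0], [2.0], [3.0]]
y_toy = [1.0, 0.5, -0.2, 0.3]
w_toy = fit_rbf(X_toy, y_toy)
fitted = [predict_rbf(X_toy, w_toy, x) for x in X_toy]
```

In the setting of the abstract, the inputs would be the descriptors selected by the refinement step, and the target would be the experimental BDE; the exact architecture and training procedure used in the study are not given here.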
Year: 2012 PMID: 22942689 PMCID: PMC3430220 DOI: 10.3390/ijms13078051
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. The structure of the self-organizing feature mapping neural network (SOFMNN).
Figure 2. The structure of the radial basis function neural network (RBFNN).
Deviations between experimental and calculated values of 92 organic molecules from different methods, reported in kcal·mol−1.
| No. | B3LYP/6-31G(d) | B3LYP/STO-3G | DFT-RBFNN 6-31G(d) | DFT-RBFNN STO-3G | DFT-SOFM-RBFNN 6-31G(d) | DFT-SOFM-RBFNN STO-3G |
|---|---|---|---|---|---|---|
| 1 | −17.17 | −6.89 | −0.12 | −1.84 | −0.04 | −1.18 |
| 2 | −7.88 | 2.66 | 0.46 | 0.12 | 0.38 | 0.22 |
| 3 | −9.31 | 0.85 | −0.48 | −0.65 | −0.38 | −0.58 |
| 4 | −9.29 | 1.27 | −0.03 | −0.02 | −0.01 | −0.01 |
| 5 | −9.77 | 0.14 | 0.00 | −0.01 | 0.00 | 0.00 |
| 6 | −9.13 | 1.04 | −0.40 | −0.53 | −0.34 | −0.46 |
| 7 | −9.01 | 1.11 | 0.05 | −0.01 | 0.03 | 0.01 |
| 8 | −12.53 | 0.28 | −0.03 | 0.00 | −0.01 | 0.00 |
| 9 | −13.13 | −3.06 | 0.00 | 0.00 | 0.00 | 0.00 |
| 10 | −10.9 | −0.51 | −0.01 | −0.01 | 0.00 | 0.00 |
| 11 | 2.16 | 12.31 | 0.07 | 0.02 | 0.04 | 0.01 |
| 12 | 2.70 | 13.23 | 0.58 | 0.81 | 0.55 | 0.68 |
| 13 | 1.72 | 12.17 | −0.34 | 0.42 | −0.34 | 0.23 |
| 14 | −0.39 | 10.26 | −0.10 | 0.03 | −0.06 | 0.01 |
| 15 | −1.56 | 10.1 | 0.00 | 0.01 | 0.00 | 0.01 |
| 16 | 1.69 | 11.63 | 0.00 | 0.01 | 0.00 | 0.00 |
| 17 | 2.00 | 12.39 | −0.20 | 0.25 | −0.23 | 0.13 |
| 18 | −8.37 | 2.73 | −0.16 | 0.05 | −0.06 | 0.03 |
| 19 | −7.30 | 4.12 | −0.28 | −0.02 | −0.21 | −0.01 |
| 20 | −6.93 | 4.16 | −0.22 | −0.47 | −0.21 | −0.41 |
| 21 | −7.68 | 3.96 | 0.29 | 0.01 | 0.27 | 0.00 |
| 22 | −10.58 | 0.56 | 0.00 | 0.00 | 0.00 | 0.00 |
| 23 | −2.11 | 8.33 | 0.01 | −0.93 | 0.06 | −0.75 |
| 24 | 3.45 | 12.33 | 0.35 | 0.67 | 0.19 | 0.45 |
| 25 | −8.07 | 3.05 | −0.53 | −0.21 | −0.51 | −0.18 |
| 26 | −7.90 | 3.23 | 0.28 | 0.18 | 0.29 | 0.17 |
| 27 | −8.60 | 2.58 | −0.42 | −0.01 | −0.38 | −0.01 |
| 28 | −8.22 | 4.07 | 0.01 | 0.00 | 0.00 | 0.00 |
| 29 | −4.97 | 6.77 | 0.00 | 0.00 | 0.00 | 0.00 |
| 30 | 1.87 | −11.2 | 0.00 | 0.02 | 0.00 | 0.01 |
| 31 | 1.97 | −11.27 | −0.05 | 0.00 | −0.04 | 0.00 |
| 32 | 0.33 | −12.53 | −0.01 | −0.03 | 0.00 | −0.02 |
| 33 | 1.91 | −6.79 | 0.04 | −0.03 | 0.03 | −0.03 |
| 34 | 0.74 | −11.6 | 0.00 | 0.00 | 0.00 | 0.00 |
| 35 | 1.92 | −10.83 | 0.18 | 0.01 | 0.15 | 0.01 |
| 36 | 0.62 | −14 | −0.18 | 0.00 | −0.15 | 0.00 |
| 37 | 1.16 | 10.52 | 0.00 | 0.00 | 0.00 | 0.00 |
| 38 | 0.76 | 11.2 | 0.14 | 0.12 | 0.10 | 0.10 |
| 39 | 0.29 | 11.06 | −0.05 | −0.09 | −0.07 | −0.08 |
| 40 | −0.36 | 10.68 | −0.06 | −0.39 | −0.05 | −0.36 |
| 41 | −0.41 | 11.52 | 0.00 | 0.00 | 0.00 | 0.00 |
| 42 | −0.04 | 11.72 | 0.02 | 0.40 | 0.01 | 0.37 |
| 43 | −0.26 | 10.28 | 0.04 | −0.05 | 0.04 | −0.03 |
| 44 | −1.14 | 11.08 | 1.01 | 0.95 | 0.92 | 0.84 |
| 45 | −0.97 | 9.89 | 0.00 | 0.00 | 0.00 | 0.00 |
| 46 | 0.03 | 12.03 | 0.00 | 0.00 | 0.00 | 0.00 |
| 47 | 0.87 | 10.84 | 0.02 | 0.04 | 0.01 | 0.02 |
| 48 | −1.67 | 8.65 | 0.00 | 0.00 | 0.00 | 0.00 |
| 49 | −3.41 | 8.59 | −0.01 | −0.03 | 0.00 | −0.02 |
| 50 | 7.47 | −0.71 | −0.01 | 0.01 | 0.01 | 0.01 |
| 51 | 5.60 | −0.55 | 0.00 | 0.00 | 0.00 | 0.00 |
| 52 | 7.03 | −1.38 | 0.03 | 0.00 | 0.01 | 0.00 |
| 53 | 6.33 | −2.14 | −0.01 | −0.01 | −0.01 | −0.01 |
| 54 | −2.62 | 15.71 | 0.00 | 0.00 | 0.00 | 0.00 |
| 55 | −2.88 | 15.23 | 0.12 | 0.28 | 0.08 | 0.25 |
| 56 | −3.88 | 14.1 | −0.12 | −0.28 | −0.08 | −0.25 |
| 57 | −3.89 | 13.76 | 0.00 | −0.01 | 0.00 | −0.01 |
| 58 | −7.57 | 9.35 | 0.00 | 0.00 | 0.00 | 0.00 |
| 59 | −4.88 | 12.76 | 1.26 | 1.19 | 1.20 | 1.14 |
| 60 | −7.33 | 9.84 | −1.20 | −1.15 | −1.16 | −1.12 |
| 61 | −6.90 | 10.9 | 0.17 | 0.26 | 0.20 | 0.28 |
| 62 | 6.39 | 18.5 | 0.00 | 0.00 | 0.00 | 0.00 |
| 63 | 4.12 | 17.94 | 0.00 | 0.38 | 0.00 | 0.35 |
| 64 | −9.96 | 16.41 | 0.00 | −0.37 | 0.00 | −0.34 |
| 65 | 4.19 | 15.06 | 0.00 | −0.01 | 0.00 | −0.01 |
| 66 | 0.55 | 14.42 | 0.00 | 0.00 | 0.00 | 0.00 |
| 67 | −3.51 | 19.3 | −0.60 | −0.52 | −0.47 | −0.43 |
| 68 | −2.46 | 21.15 | −0.93 | −0.93 | −0.85 | −0.90 |
| 69 | 0.27 | 22.96 | 0.51 | 0.57 | 0.44 | 0.54 |
| 70 | 0.05 | 22.7 | 0.07 | 0.50 | 0.04 | 0.47 |
| 71 | 2.43 | 22.6 | 0.19 | 0.18 | 0.16 | 0.14 |
| 72 | 0.20 | 19.63 | 0.01 | 0.00 | 0.00 | 0.00 |
| 73 | −0.88 | 20.53 | −0.16 | −0.52 | −0.09 | −0.48 |
| 74 | 7.91 | 19.5 | 0.02 | 0.03 | 0.01 | 0.02 |
| 75 | −0.36 | 22.56 | 0.38 | 0.39 | 0.39 | 0.40 |
| 76 | 2.96 | 21.38 | 0.00 | 0.00 | 0.00 | 0.00 |
| 77 | 1.69 | 22.06 | 0.83 | 0.53 | 0.61 | 0.43 |
| 78 | 2.77 | 21.23 | 0.00 | 0.01 | 0.00 | 0.01 |
| 79 | 2.52 | 20.27 | 0.21 | 0.00 | 0.13 | 0.00 |
| 80 | 0.84 | 19.65 | 0.01 | −0.01 | 0.00 | −0.01 |
| 81 | 1.17 | 21.22 | 0.00 | 0.00 | 0.00 | 0.00 |
| 82 | 0.68 | 20.49 | −0.21 | 0.00 | −0.13 | 0.00 |
| 83 | −2.03 | 16.73 | −0.27 | −0.57 | −0.26 | −0.56 |
| 84 | −0.24 | 18.15 | 0.27 | 0.57 | 0.26 | 0.56 |
| 85 | −7.63 | 2.33 | −0.04 | 0.02 | −0.03 | 0.02 |
| 86 | −4.58 | 6.59 | 0.00 | 0.00 | 0.00 | 0.00 |
| 87 | −7.16 | 5.16 | 0.48 | 0.16 | 0.36 | 0.12 |
| 88 | −8.00 | 2.5 | 0.02 | 0.10 | 0.01 | 0.07 |
| 89 | −3.70 | 11.26 | 0.00 | 0.00 | 0.00 | 0.00 |
| 90 | −10.85 | 0.62 | −0.49 | −0.26 | −0.37 | −0.18 |
| 91 | −8.77 | 5.98 | −0.16 | −0.17 | −0.13 | −0.13 |
| 92 | −8.61 | 1.34 | 0.00 | 0.00 | 0.00 | 0.00 |
The molecules belong to the test set.
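The mean absolute deviation (MAD) quoted in the abstract is the average of the absolute values of per-molecule deviations such as those tabulated above. The sketch below shows the arithmetic on rows 1 to 5 only, copied from the table as an illustrative subset; the function name is an assumption, and the subset result is not the 92-molecule MAD reported by the authors.

```python
def mean_absolute_deviation(deviations):
    """Mean of |deviation| over a list of signed deviations (kcal/mol)."""
    return sum(abs(d) for d in deviations) / len(deviations)

# Rows 1-5 of the deviation table, used here as a small illustrative subset.
b3lyp_631gd = [-17.17, -7.88, -9.31, -9.29, -9.77]
sofm_rbfnn_631gd = [-0.04, 0.38, -0.38, -0.01, 0.00]

mad_raw = mean_absolute_deviation(b3lyp_631gd)
mad_corrected = mean_absolute_deviation(sofm_rbfnn_631gd)
print(f"B3LYP/6-31G(d): {mad_raw:.2f}  DFT-SOFM-RBFNN: {mad_corrected:.2f}")
```

For this five-row subset the two values come out at roughly 10.68 and 0.16 kcal·mol−1; applying the same function to the full columns would be the way to compare against the MADs stated in the abstract.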
Figure 3. (a) The topology structure of the competitive layer; (b) Distances of neighbor neurons.
SOFMNN clustering analysis results for twelve molecular descriptors.
| DFT | Training Steps | ΔHhomo | QY | QN | QO | NX | μ | α | EHOMO-1 | EHOMO | ELUMO | ELUMO+1 | ΔE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B3LYP/6-31G(d) | 10 | 24 | 1 | 1 | 1 | 24 | 4 | 24 | 1 | 1 | 1 | 1 | 1 |
| B3LYP/6-31G(d) | 30 | 5 | 13 | 13 | 13 | 24 | 19 | 24 | 13 | 13 | 13 | 13 | 13 |
| B3LYP/6-31G(d) | 50 | 4 | 12 | 6 | 12 | 1 | 21 | 1 | 12 | 12 | 12 | 12 | 12 |
| B3LYP/6-31G(d) | 100 | 19 | 12 | 10 | 12 | 3 | 22 | 1 | 12 | 12 | 11 | 11 | 10 |
| B3LYP/6-31G(d) | 200 | 16 | 1 | 8 | 1 | 11 | 19 | 24 | 1 | 1 | 2 | 2 | 8 |
| B3LYP/6-31G(d) | 500 | 16 | 13 | 1 | 19 | 12 | 8 | 24 | 19 | 19 | 20 | 20 | 1 |
| B3LYP/6-31G(d) | 1000 | 16 | 13 | 20 | 13 | 23 | 2 | 24 | 13 | 13 | 14 | 14 | 20 |
| B3LYP/STO-3G | 10 | 2 | 1 | 1 | 1 | 24 | 1 | 24 | 1 | 1 | 1 | 1 | 1 |
| B3LYP/STO-3G | 30 | 23 | 1 | 7 | 1 | 24 | 5 | 24 | 1 | 1 | 1 | 2 | 7 |
| B3LYP/STO-3G | 50 | 21 | 1 | 1 | 1 | 6 | 13 | 12 | 1 | 1 | 1 | 1 | 1 |
| B3LYP/STO-3G | 100 | 21 | 7 | 19 | 7 | 24 | 3 | 12 | 7 | 7 | 14 | 19 | 19 |
| B3LYP/STO-3G | 200 | 5 | 7 | 19 | 1 | 24 | 3 | 22 | 1 | 1 | 13 | 14 | 15 |
| B3LYP/STO-3G | 500 | 4 | 16 | 19 | 21 | 24 | 8 | 12 | 21 | 21 | 20 | 19 | 13 |
| B3LYP/STO-3G | 1000 | 10 | 13 | 15 | 19 | 24 | 2 | 12 | 19 | 19 | 20 | 15 | 21 |
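To read the clustering table as descriptor groups, one can collect the descriptors that share the same entry in a given row. The sketch below does this for the B3LYP/6-31G(d) row at 1000 training steps. It assumes that each table entry is the index of the competitive-layer neuron to which the descriptor was assigned, which the table itself does not state; the function name and descriptor spellings are likewise chosen for the example.

```python
from collections import defaultdict

# Descriptor names in the column order of the clustering table.
DESCRIPTORS = ["ΔHhomo", "QY", "QN", "QO", "NX", "μ", "α",
               "EHOMO-1", "EHOMO", "ELUMO", "ELUMO+1", "ΔE"]

def group_by_neuron(labels, names=DESCRIPTORS):
    """Group descriptor names by the cluster label assigned to each."""
    groups = defaultdict(list)
    for name, neuron in zip(names, labels):
        groups[neuron].append(name)
    return dict(groups)

# B3LYP/6-31G(d), 1000 training steps (copied from the table).
labels_1000 = [16, 13, 20, 13, 23, 2, 24, 13, 13, 14, 14, 20]
groups = group_by_neuron(labels_1000)

for neuron, members in sorted(groups.items()):
    print(neuron, members)
```

Under that assumption, the row splits the twelve descriptors into seven groups; according to the abstract, one representative per group would then be chosen by correlation with the experimental values.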
Figure 4. Histograms of the deviations between the calculated homolysis BDEs and the experimental values for 92 organic molecules: (a) B3LYP/6-31G(d); (b) B3LYP/6-31G(d)-RBFNN; (c) B3LYP/6-31G(d)-SOFM-RBFNN; (d–f) the corresponding deviations when the basis set is changed from 6-31G(d) to STO-3G.
The extrapolation test for the DFT-SOFM-RBFNN method (kcal·mol−1).
| No. | Structures | Expt. | B3LYP/6-31G(d) | DFT-SOFM-RBFNN 6-31G(d) | B3LYP/STO-3G | DFT-SOFM-RBFNN STO-3G |
|---|---|---|---|---|---|---|
| 1 | | 31.6 | 29.02 | 30.49 | 48.10 | 30.56 |
| 2 | | 41.1 | 38.55 | 40.95 | 49.4 | 40.59 |
| 3 | | 39.9 | 37.67 | 39.90 | 32.47 | 39.90 |
| 4 | | 50.5 | 50.34 | 50.48 | 60.6 | 51.12 |
| 5 | | 37.8 | 27.04 | 37.85 | 48.8 | 37.98 |
| 6 | | 44.8 | 34.65 | 44.76 | 50.58 | 44.62 |
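The extrapolation results can be summarized by the absolute error of each method against the experimental value. The sketch below does this for the two 6-31G(d) columns, using the numbers as they appear in the table above; the variable and function names are assumptions made for the example.

```python
# Values copied from the extrapolation table (kcal/mol), rows 1-6.
expt = [31.6, 41.1, 39.9, 50.5, 37.8, 44.8]
b3lyp_631gd = [29.02, 38.55, 37.67, 50.34, 27.04, 34.65]
sofm_631gd = [30.49, 40.95, 39.90, 50.48, 37.85, 44.76]

def abs_errors(calc, ref):
    """Absolute error |calc - ref| for each molecule."""
    return [abs(c - r) for c, r in zip(calc, ref)]

def summarize(calc, ref):
    """Return (mean absolute error, largest absolute error)."""
    errs = abs_errors(calc, ref)
    return sum(errs) / len(errs), max(errs)

mad_b3lyp, max_b3lyp = summarize(b3lyp_631gd, expt)
mad_sofm, max_sofm = summarize(sofm_631gd, expt)
print(f"B3LYP/6-31G(d):  MAD {mad_b3lyp:.2f}, max {max_b3lyp:.2f}")
print(f"DFT-SOFM-RBFNN:  MAD {mad_sofm:.2f}, max {max_sofm:.2f}")
```

The same two helper functions apply unchanged to the STO-3G columns.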
The deviations of calculation methods (kcal·mol−1).
| No. | DFT-SOFM-RBFNN | M06-2X/6-311+G(2d,p) | M06-2X/6-311+G(2d,p) (PCM) | B3LYP/6-31G(d) |
|---|---|---|---|---|
| 39 | −0.1 | 3.6 | 2.4 | 0.29 |
| 59 | 1.2 | 1.5 | 0.8 | −4.9 |
| 76 | 0.0 | 4.2 | 4.2 | 3.0 |
| 91 | −0.1 | −2.2 | −4.1 | −8.7 |
DFT-SOFM-RBFNN is based on B3LYP/6-31G(d) calculations.