Badri Narayanan1,2, Paul C Redfern2, Rajeev S Assary2, Larry A Curtiss2.
Abstract
The energies of the 133 000 molecules in the GDB-9 database have been calculated at the G4MP2 level of theory and used to calculate their enthalpies of formation. This database contains organic molecules having nine or fewer atoms of carbon, nitrogen, oxygen, and fluorine, as well as hydrogen atoms. The accuracy of the G4MP2 energies was investigated on a subset of 459 of the molecules having experimental enthalpies of formation with small uncertainties. On this subset the G4MP2 enthalpies of formation have a mean absolute deviation from experiment of 0.79 kcal mol⁻¹, which is similar to the accuracy previously reported for the smaller G3/05 test set. An error analysis of the theoretical enthalpies of formation of the 459 molecules is presented in terms of the size and type of the molecules. Three density functionals (B3LYP, ωB97X-D, M06-2X) were also assessed on the same 459 molecules for comparison with the G4MP2 results. The G4MP2 energies for the 133 K molecules provide a database that can be used to calculate accurate reaction energies as well as to assess new or existing experimental enthalpies of formation. Several examples are given of types of reactions that can be predicted using the G4MP2 database of energies. The G4MP2 energies of the GDB-9 molecules will also be useful in future investigations of applications of machine learning to quantum chemical data.
Year: 2019 PMID: 31489167 PMCID: PMC6713865 DOI: 10.1039/c9sc02834j
Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.825
Distribution of molecules in the GDB-9 database: the number of molecules containing each number of non-hydrogen (heavy) atoms (left two columns), and the number of molecules of each prominent type, grouped by constituent elements (right two columns)
| Number of heavy atoms | Number of molecules | Constituent elements of molecule | Number of molecules |
| 1 | 3 (CH4, H2O, NH3) | HCON | 66 573 |
| 2 | 5 | HCO | 45 601 |
| 3 | 9 | HCN | 14 092 |
| 4 | 31 | HC | 4849 |
| 5 | 129 | HCOFN | 1061 |
| 6 | 615 | HCFN | 734 |
| 7 | 3171 | HCOF | 244 |
| 8 | 18 205 | HCF | 90 |
| 9 | 111 128 | | |
Mean absolute deviations (MAD, in kcal mol⁻¹) from experiment for the Pedley test set for G4MP2 and DFT methods
| Molecule type | G4MP2 | B3LYP | M06-2X | ωB97X-D |
| Hydrocarbons (175) | 0.68 (0.63) | 2.77 | 3.06 | 1.35 |
| Substituted hydrocarbons (284) | 0.86 (0.83) | 4.74 | 2.51 | 2.16 |
| Total (459) | 0.79 (0.77) | 3.99 | 2.71 | 1.85 |
Molecule type column: number of molecules given in parentheses.
G4MP2 column: MAD for the G3/05 test set (ref. 25) given in parentheses. The G3/05 test set has 38 hydrocarbons, 100 substituted hydrocarbons, and 138 molecules in total, 92 of which are in common with the Pedley test set.
The B3LYP energies were calculated with the 6-31G(2df,p) basis at the B3LYP/6-31G(2df,p) geometry; the M06-2X and ωB97X-D energies were calculated with the 6-311+G(3df,2p) basis at the B3LYP/6-31G(2df,p) geometry. The zero-point energies used for the density functional results are unscaled ones from B3LYP/6-31G(2df,p).
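The "Total (459)" row can be cross-checked from the two subgroup rows, since a MAD over the pooled set is the count-weighted average of the subgroup MADs. The following is a minimal sketch of that arithmetic using only the numbers in the table above; the function and variable names are illustrative and not taken from the paper.

```python
# Subgroup sizes and MADs (kcal/mol) copied from the table above.
counts = {"hydrocarbons": 175, "substituted": 284}
subgroup_mad = {
    "G4MP2":   {"hydrocarbons": 0.68, "substituted": 0.86},
    "B3LYP":   {"hydrocarbons": 2.77, "substituted": 4.74},
    "M06-2X":  {"hydrocarbons": 3.06, "substituted": 2.51},
    "wB97X-D": {"hydrocarbons": 1.35, "substituted": 2.16},
}
# "Total (459)" row as reported in the table.
reported_total = {"G4MP2": 0.79, "B3LYP": 3.99, "M06-2X": 2.71, "wB97X-D": 1.85}


def pooled_mad(per_group, group_counts):
    """Count-weighted average of subgroup MADs."""
    n_total = sum(group_counts.values())
    weighted = sum(per_group[g] * group_counts[g] for g in group_counts)
    return weighted / n_total


pooled = {method: pooled_mad(values, counts) for method, values in subgroup_mad.items()}

for method, value in pooled.items():
    print(f"{method:8s} recomputed {value:.2f}  reported {reported_total[method]:.2f}")
```

Because the subgroup values are themselves rounded to two decimals, the recomputed totals can differ from the reported ones in the last digit; agreement to within about 0.01 kcal mol⁻¹ is what this check should be expected to show.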
Fig. 1. Mean absolute deviations (MAD) of G4MP2 and three DFT methods for the Pedley test set of 459 molecules as a function of the number of heavy atoms.
Fig. 2. Mean absolute deviations (MAD) per electron pair of G4MP2 and three DFT methods for the Pedley test set of 459 molecules as a function of the number of heavy atoms.
Fig. 3. Atomization energy as a function of standard enthalpy of formation at 298 K for (a) the 133 K molecules in the GDB-9 dataset, and (b) the 459 molecules in the selected Pedley test set. In each panel, frequency distributions of the atomization energy and the standard enthalpy of formation among the molecules are shown at the top and right margins, respectively.
Fig. 4. Standard enthalpy of formation from G4MP2 calculations of the 133 K organic molecules, classified into groups by atom type.
Fig. 5. Standard enthalpy of formation of CHO-type molecules as a function of the number of oxygen atoms, as obtained from G4MP2 calculations.
Examples of reaction energies (in kcal mol⁻¹) derived from the G4MP2 energies
| Alcohol oxidation |
| ||
| R1 = H | R2 = H | Δ | |
| R1 = H | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = C3H7 | Δ | |
| R1 = C2H5 | R2 = C2H5 | Δ | |
| Alkane oxidation |
| ||
| R = H | Δ | ||
| R = CH3 | Δ | ||
| R = C2H5 | Δ | ||
| R = C3H7 | Δ | ||
| R = C4H9 | Δ | ||
| Ether hydrolysis |
| ||
| R1 = CH3 | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = C2H5 | Δ | |
| R1 = C2H5 | R2 = C5H11 | Δ | |
| R1 = C3H7 | R2 = C4H9 | Δ | |
| R1 = C4H9 | R2 = C4H9 | Δ | |
| Hydrogenolysis |
| ||
| R1 = H | R2 = CH3 | Δ | |
| R1 = H | R2 = C6H13 | Δ | |
| R1 = CH3 | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = C3H7 | Δ | |
| R1 = C2H5 | R2 = C2H5 | Δ | |
| Carbonyl reduction |
| ||
| R1 = H | R2 = H | Δ | |
| R1 = H | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = CH3 | Δ | |
| R1 = CH3 | R2 = C3H7 | Δ | |
| R1 = C2H5 | R2 = C2H5 | Δ | |
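A reaction energy of the kind tabulated above is the stoichiometry-weighted sum of product energies minus that of the reactants. The sketch below illustrates this bookkeeping under stated assumptions: the species labels and energies are placeholders invented for illustration (they are not values from the G4MP2 database), energies are assumed to be in hartree, and the function name `reaction_energy` is hypothetical.

```python
# Standard hartree -> kcal/mol conversion factor.
HARTREE_TO_KCAL_PER_MOL = 627.5095


def reaction_energy(energies, reactants, products):
    """Return sum(products) - sum(reactants) in kcal/mol.

    energies:  dict mapping species label -> total energy in hartree
    reactants: dict mapping species label -> stoichiometric coefficient
    products:  dict mapping species label -> stoichiometric coefficient
    """
    def side_total(side):
        return sum(coeff * energies[label] for label, coeff in side.items())

    delta_hartree = side_total(products) - side_total(reactants)
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


# Placeholder energies in hartree: NOT taken from the G4MP2 database.
placeholder_energies = {"A": -1.0, "B": -2.0, "C": -3.5}

# Hypothetical reaction A + B -> C.
delta = reaction_energy(
    placeholder_energies,
    reactants={"A": 1, "B": 1},
    products={"C": 1},
)
print(f"Reaction energy: {delta:.2f} kcal/mol")
```

To reproduce an entry of the table, the placeholder dictionary would be replaced by the database energies of the actual reactants and products, all taken at the same level of theory and the same temperature convention so that the difference is meaningful.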