Peter C St John, Yanfei Guan, Yeonjoon Kim, Brian D Etz, Seonah Kim, Robert S Paton.
Abstract
The stabilities of radicals play a central role in determining the thermodynamics and kinetics of many reactions in organic chemistry. In this data descriptor, we provide consistent and validated quantum chemical calculations for over 200,000 organic radical species and 40,000 associated closed-shell molecules containing C, H, N and O atoms. These data consist of optimized 3D geometries, enthalpies, Gibbs free energies, vibrational frequencies, Mulliken charges and spin densities calculated at the M06-2X/def2-TZVP level of theory, which was previously found to have a favorable trade-off between experimental accuracy and computational efficiency. We expect these data to be useful in the further development of machine learning techniques to predict reaction pathways, bond strengths, and other phenomena closely related to organic radical chemistry.
Year: 2020 PMID: 32694541 PMCID: PMC7374734 DOI: 10.1038/s41597-020-00588-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Overview of the calculation pipeline and associated software. On each worker, closed-shell molecules and radicals (in SMILES format) are pulled from a central database. Once a calculation completes, the optimized, validated 3D geometry is stored in the database and a new molecule is started.
Description of the associated data fields, their formats, and units.
| Data Field | Description |
|---|---|
| SMILES | String representation of the 2D connectivity of the molecule. Radicals are denoted using the bracket notation. |
| Enthalpy | Molecular enthalpies, specified to six decimal places. In Hartree. |
| FreeEnergy | Gibbs free energy at standard temperature (298.15 K) and pressure (1 atm). In Hartree. |
| SCFEnergy | Total SCF energy (electronic + nuclear). In Hartree. |
| AtomCharges | Mulliken atomic charges, one for each atom. The values are formatted as a python list, beginning and ending with brackets and separated with commas. Values correspond to the atom order as given in the 3D coordinates. |
| AtomSpins | Atomic spin densities (for radicals only). In the same format as AtomCharges. |
| VibFreqs | Vibrational frequencies in wavenumbers (cm−1). Formatted as a python list of length 3N-6 (or 3N-5 for linear molecules) |
| RotConstants | Rotational constants (GHz). A formatted python list of length 3. |
| IRIntensity | Infrared intensities (km/mol). In the same format as VibFreqs. |
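The list-valued fields in the table above are described as Python-style list literals, so they can be parsed without custom string handling. The following is a minimal sketch, assuming each field arrives as a string; the `record` dictionary and every number in it are made up for illustration and are not taken from the dataset.

```python
import ast

# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474

# Hypothetical record with made-up values, laid out like the fields above.
record = {
    "SMILES": "[CH3]",
    "Enthalpy": "-39.792343",
    "AtomCharges": "[-0.3, 0.1, 0.1, 0.1]",
    "AtomSpins": "[1.1, -0.03, -0.03, -0.04]",
    "VibFreqs": "[500.0, 1400.0, 1400.0, 3100.0, 3280.0, 3280.0]",
}


def parse_list_field(text):
    """Parse a field stored as a Python list literal into a list of floats."""
    values = ast.literal_eval(text)  # safe: evaluates literals only
    return [float(v) for v in values]


def expected_mode_count(n_atoms, linear=False):
    """Number of vibrational modes: 3N-6, or 3N-5 for linear molecules."""
    return 3 * n_atoms - (5 if linear else 6)


charges = parse_list_field(record["AtomCharges"])
spins = parse_list_field(record["AtomSpins"])
freqs = parse_list_field(record["VibFreqs"])

# One charge per atom, so the charge list gives the atom count.
n_atoms = len(charges)
modes_ok = len(freqs) == expected_mode_count(n_atoms)

# Enthalpy is stored in Hartree; convert to kcal/mol for comparison with BDEs.
enthalpy_kcal = float(record["Enthalpy"]) * HARTREE_TO_KCAL_PER_MOL

# Largest atomic spin density in the radical.
max_spin = max(spins)

print(n_atoms, modes_ok, max_spin)
```

`ast.literal_eval` is used instead of `eval` because it only accepts literals, which is the safer choice for data read from a file.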
Number of optimized closed-shell molecules and radicals by number of heavy atoms.
| # Heavy Atoms | Molecules | Radicals |
|---|---|---|
| 0 | 0 | 1 |
| 1 | 3 | 4 |
| 2 | 11 | 17 |
| 3 | 50 | 89 |
| 4 | 167 | 404 |
| 5 | 485 | 1867 |
| 6 | 1326 | 6570 |
| 7 | 3452 | 19931 |
| 8 | 7573 | 46163 |
| 9 | 13594 | 86499 |
| 10 | 16615 | 84818 |
Distribution of the 246,363 radicals by location of the unpaired electron.
| Element | Primary | Secondary | Tertiary |
|---|---|---|---|
| C | 56,067 | 121,369 | 28,135 |
| N | 11,349 | 14,048 | |
| O | 15,354 | | |
Primary, secondary, and tertiary refer to atoms having 1, 2, or 3 non-hydrogen neighbors, respectively.
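The classification rule above can be sketched in a few lines. This is an illustration only: the function name and the representation of an atom's neighbors as a list of element symbols are assumptions, not the authors' implementation.

```python
def substitution_class(neighbor_elements):
    """Classify a radical center by its number of non-hydrogen neighbors.

    `neighbor_elements` is a list of element symbols for the atoms bonded
    to the radical center (a hypothetical input format).
    """
    heavy = sum(1 for element in neighbor_elements if element != "H")
    names = {1: "primary", 2: "secondary", 3: "tertiary"}
    # Centers with zero heavy neighbors (e.g. the methyl radical) fall
    # outside the three classes defined in the text.
    return names.get(heavy, "unclassified")


# Ethyl radical center (CH2 bonded to one carbon and two hydrogens).
print(substitution_class(["C", "H", "H"]))
# Isopropyl radical center (CH bonded to two carbons and one hydrogen).
print(substitution_class(["C", "C", "H"]))
# tert-Butyl radical center (bonded to three carbons).
print(substitution_class(["C", "C", "C"]))
```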
Characterization of carbon-centered radicals by neighboring substituents.
| Name | SMARTS | Count |
|---|---|---|
| Allylic | [#6;X3v3+0]-[#6]=[#6X3] | 16,229 |
| Propargylic | [#6;X3v3+0]-[#6]#[#6] | 1,887 |
| Benzylic | [#6;X3v3+0]-[c] | 8,286 |
| α- to π-acceptor | [#6;X3v3+0]-[C,N]=,#[N,O] | 18,758 |
| α- to lone-pair | [#6;X3v3+0]-[O,N] | 55,136 |
| Captodative | [#6;X3v3+0](-[O,N])-[C,N]=,#[N,O] | 4,386 |
Fig. 2 Bond strength versus bond length for several common single bonds. Bond dissociation enthalpies are inversely correlated with bond lengths for carbon-containing bonds, but less so for other species.
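The bond dissociation enthalpies in Fig. 2 are not defined in this section. Under the usual definition, assumed here, the BDE of a bond A-B is the enthalpy of the two radical fragments minus the enthalpy of the parent molecule. The sketch below applies that definition to enthalpies in Hartree; the three enthalpy values are made up for illustration and do not come from the dataset.

```python
# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_PER_MOL = 627.509474


def bond_dissociation_enthalpy(h_parent, h_fragment_a, h_fragment_b):
    """BDE in kcal/mol for A-B -> A. + B., from enthalpies in Hartree.

    Assumes the conventional definition:
    BDE = H(A.) + H(B.) - H(A-B)
    """
    delta_hartree = h_fragment_a + h_fragment_b - h_parent
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


# Made-up enthalpies (Hartree) for a parent molecule and its two radicals.
h_parent = -79.700000
h_radical_a = -39.780000
h_radical_b = -39.780000

bde = bond_dissociation_enthalpy(h_parent, h_radical_a, h_radical_b)
print(f"BDE = {bde:.2f} kcal/mol")
```

All three enthalpies must come from the same level of theory for the difference to be meaningful, which is one reason a consistently computed dataset is useful for this kind of calculation.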
Fig. 3 Radical stabilization and enthalpy validation. (a) BDE versus maximum atom spin density. Maximum spin density is calculated across all atoms in the resulting radical, with lower values indicating a more even distribution of electron spin across the atoms. (b) Distribution of calculated minus predicted enthalpies (in kcal mol−1) following a linear model of atomic composition. The shaded grey region indicates the interquartile range. Vertical grey dashed lines indicate the thresholds used for outlier detection, defined as 3 interquartile ranges below the first quartile or above the third quartile.
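The outlier thresholds described in the Fig. 3 caption can be sketched as follows. This covers only the thresholding step: the linear model of atomic composition is not reproduced, the residuals are made-up numbers, and the quartile estimator (the default of Python's `statistics.quantiles`) is an assumption, since the section does not say which one was used.

```python
import statistics


def quartile_outlier_bounds(residuals, k=3.0):
    """Return (lower, upper) thresholds placed k quartile ranges outside
    the first and third quartiles of the residuals."""
    q1, _median, q3 = statistics.quantiles(residuals, n=4)
    quartile_range = q3 - q1
    return q1 - k * quartile_range, q3 + k * quartile_range


# Made-up residuals (calculated minus predicted enthalpy, kcal/mol).
residuals = [-1.0, -0.5, -0.2, 0.0, 0.1, 0.3, 0.6, 1.0, 25.0]

lower, upper = quartile_outlier_bounds(residuals)
outliers = [r for r in residuals if r < lower or r > upper]
print(lower, upper, outliers)
```

Quartile-based thresholds are less sensitive to the outliers themselves than thresholds built from the mean and standard deviation, which suits a validation step meant to flag failed calculations.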
| Measurement(s) | SMILES string • Standard Transformed Enthalpy Change • Standard Gibbs Free Energy Change • vibrational frequency • Charge • spin density • infrared spectrum • organic radical • closed-shell molecules |
| Technology Type(s) | computational modeling technique |
| Factor Type(s) | molecular structure |