
Detection of multi-reference character imbalances enables a transfer learning approach for virtual high throughput screening with coupled cluster accuracy at DFT cost.

Chenru Duan^1,2, Daniel B K Chu^1, Aditya Nandy^1,2, Heather J Kulik^1.

Abstract

Appropriately identifying and treating molecules and materials with significant multi-reference (MR) character is crucial for achieving high data fidelity in virtual high-throughput screening (VHTS). Despite development of numerous MR diagnostics, the extent to which a single value of such a diagnostic indicates the MR effect on a chemical property prediction is not well established. We evaluate MR diagnostics for over 10 000 transition-metal complexes (TMCs) and compare to those for organic molecules. We observe that only some MR diagnostics are transferable from one chemical space to another. By studying the influence of MR character on chemical properties (i.e., MR effect) that involve multiple potential energy surfaces (i.e., adiabatic spin splitting, ΔEH–L, and ionization potential, IP), we show that differences in MR character are more important than the cumulative degree of MR character in predicting the magnitude of an MR effect. Motivated by this observation, we build transfer learning models to predict CCSD(T)-level adiabatic ΔEH–L and IP from lower levels of theory. By combining these models with uncertainty quantification and multi-level modeling, we introduce a multi-pronged strategy that accelerates data acquisition by at least a factor of three while achieving coupled cluster accuracy (i.e., to within 1 kcal mol−1 MAE) for robust VHTS. This journal is © The Royal Society of Chemistry.


Year: 2022 PMID: 35655882 PMCID: PMC9067623 DOI: 10.1039/d2sc00393g

Source DB: PubMed Journal: Chem Sci ISSN: 2041-6520 Impact factor: 9.969

Introduction

Approximate density functional theory (DFT) has become an indispensable workhorse in virtual high-throughput screening (VHTS)[1-8] and machine learning (ML)-accelerated chemical discovery[9-18] due to its balanced trade-off in computational cost and accuracy. However, DFT can fail prominently for many of the most promising VHTS targets (e.g., open-shell radicals, transition-metal-containing systems, and strained bonds in transition states).[19-23] These systems may have strong multi-reference (MR) character due to near-degenerate orbitals,[24] which cannot be accurately accounted for in DFT due to its single-reference (SR) description of the wavefunction.[25] Although benchmarking studies[26-28] can be used to identify the best density functional approximation (DFA) to yield accurate energetic properties for a chosen class of material, the choice of DFA depends strongly on the system of interest and cannot be determined a priori in VHTS where most materials have yet to be characterized.[29,30] Moreover, an imbalanced treatment of systems that have weak or strong MR character can be expected to undermine the data fidelity and bias the candidate materials recommended by chemical discovery efforts.[31] To quantify the degree of MR character, researchers have devised many MR diagnostics[24,32-42] based on different properties (e.g., occupations or atomization energies) and levels of theory. These MR diagnostics often disagree with each other,[24,43] with the diagnostics derived from DFT being less predictive than those derived from wavefunction theory (WFT).[44] Data-driven methods have augmented conventional approaches[45-49] for making system-specific decisions associated with carrying out quantum chemical calculations. For example, Jeong et al.[50] demonstrated an ML protocol that performs automated selection of active spaces for bond dissociation of main group diatomic molecules, alleviating the computational cost. 
We recently introduced a semi-supervised learning approach to MR classification based on the consensus of 15 MR diagnostics that outperforms the traditional cutoff-based approach (i.e., from a single diagnostic) and is transferable to systems of larger sizes and unseen chemical composition.[51] Potentially strong MR character still poses challenges for VHTS. The applicability of MR diagnostics and associated cutoffs to both organic and transition-metal-containing systems is not clear because most studies focus solely on organic systems. A notable exception is work from Wilson and co-workers[52,53] that demonstrated that coupled cluster (CC)-based diagnostics require larger cutoffs on transition-metal complexes (TMCs). In addition, while most studies focus on the MR character of a single structure, most chemical properties of interest involve multiple structures and/or electronic states. How the MR character of multiple related structures influences a property prediction (i.e., MR effect[24,54]) is not well understood. Although MR diagnostics and tools for method selection[55,56] have been developed, they have yet to be adapted to improve data quality in VHTS. In this work, we demonstrate the lack of transferability of many MR diagnostics from one chemical space to another. Use of the most robust diagnostics shows that imbalances in MR character are more important than cumulative MR character for properties that depend on multiple electronic states (e.g., adiabatic spin splitting or ionization potential). Motivated by these observations, we train transfer learning models to predict CCSD(T)-level properties from inputs including calculations carried out at lower levels of theory (i.e., DFT) and MR diagnostics. 
We further introduce uncertainty quantification and multi-level modeling into our workflow to apply the transfer learning predictions only when the model has high confidence, accelerating data acquisition while achieving coupled cluster accuracy (i.e., within 1 kcal mol−1 of CCSD(T)) in VHTS.

Results and discussion

Limits of MR diagnostic transferability

To evaluate trends in MR character for more than 10 000 model TMCs (described next), we used the percentage of correlation energy recovered by CCSD relative to CCSD(T) (i.e., %Ecorr[(T)]) as a figure of merit for measuring the MR character of a system.[44] A smaller %Ecorr[(T)] suggests stronger MR character because CCSD is insufficient to recover the correlation energy. We previously showed[44] that %Ecorr[(T)] is system size insensitive and correlates well with %Ecorr[T] (i.e., from comparison to full CCSDT, see Computational details and ESI Fig. S1†). Over our data set consisting of low-spin (LS), intermediate-spin (IS), and high-spin (HS) complexes, we observe a trend of decreasing MR character with increasing number of unpaired electrons (i.e., LS > IS > HS) (Fig. 1). This observation is consistent with both expectations and our prior work[57] and is due to the increased number of accessible configuration state functions in the LS state. Complexes with stronger-field ligands (e.g., CO) generally exhibit higher MR character (Fig. 1). For example, we observe decreasing %Ecorr[(T)] for complexes with increasing ligand field strength from H2O to NH3 to CO (Fig. 1). This increased MR character can be attributed to the more covalent metal–organic bonding character for complexes with stronger ligand fields. Consequently, when we substitute a 2p metal-coordinating atom with a 3p element from the same group (e.g., NH3 to PH3), both the ligand field strength and the MR character of the complex increase (Fig. 1 and ESI Fig. S2†). In prior work,[58] Feldt et al. used increased metal–helium bond lengths to weaken effective ligand fields. Here, we indeed find that the effective ligand fields decrease as the metal–helium bond lengths increase (ESI Fig. S3†). 
Concomitant with decreases in effective ligand field as the M–He bond is elongated, we observe decreases in MR character, although the trend is weak over the full data set and there are some MR compounds for all M–He distances (Fig. 1). This trend is in agreement with the observations from spectrochemical series ligands.
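As a minimal sketch of the figure of merit used above, and assuming that the correlation energies of CCSD and CCSD(T) are both measured relative to the same Hartree–Fock reference energy, %Ecorr[(T)] can be evaluated as follows. The function name and the example energies are hypothetical and serve only as an illustration.

```python
def percent_ecorr_t(e_hf: float, e_ccsd: float, e_ccsd_t: float) -> float:
    """Percentage of the CCSD(T) correlation energy recovered by CCSD.

    All energies are total electronic energies in the same unit (e.g., hartree).
    """
    corr_ccsd = e_ccsd - e_hf        # CCSD correlation energy
    corr_ccsd_t = e_ccsd_t - e_hf    # CCSD(T) correlation energy
    if corr_ccsd_t == 0.0:
        raise ValueError("CCSD(T) correlation energy is zero")
    return 100.0 * corr_ccsd / corr_ccsd_t


# Hypothetical total energies (hartree), not taken from the data set.
example = percent_ecorr_t(e_hf=-100.0, e_ccsd=-100.95, e_ccsd_t=-101.0)
```

In this illustration CCSD recovers about 95% of the CCSD(T) correlation energy; smaller values would indicate stronger MR character in the sense used in the text.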

Fig. 1

Distribution of %Ecorr[(T)] for more than 10 000 TMCs categorized by their spin states (top, blue for high-spin, gray for intermediate-spin, and red for low-spin), ligands (middle, orange for PH3, gray for CO, blue for NH3, and red for H2O), and metal–helium distances (d(M–He), bottom, with decreasing opacity from 1.0 to 0.4 as d(M–He) increases from 1.5 Å to 2.7 Å). Four representative TMCs with LS Fe(ii) and 1.9 Å d(M–He) with different ligands in a cis configuration are shown. Their corresponding %Ecorr[(T)] values are shown with a colored tick on the x-axis. All atoms are colored as follows: brown for Fe, gray for C, blue for N, red for O, orange for P, white for H, and green for He.

Next, we investigated the linear correlations between pairs of MR diagnostics (ESI Table S1†). Consistent with our prior observations on equilibrium and distorted organic molecules,[44] the correlation coefficients are generally low between pairs of diagnostics obtained from different levels of theory (ESI Fig. S4–S5†). As was also observed for organic molecules,[44] WFT-based MR diagnostics generally have better linear and rank-order correlations with %Ecorr[(T)] in comparison to those derived from DFT (ESI Fig. S6–S7†). Exceptions to this are the fractional occupation-based diagnostics (i.e., Matito's degree of nondynamical correlation, IND[B3LYP],[40,41] and the ratio of nondynamical to total correlation, rND[B3LYP][43]) that are readily obtained at DFT cost (ESI Table S1†). Their low cost has motivated use of DFT-based diagnostics in VHTS[57] where MR detection at low cost is needed to avoid computational bottlenecks and can be used to identify “DFT-safe” islands in VHTS.[57] Although over the present set, IND[B3LYP][40,41] and rND[B3LYP][43] yield the best linear and rank-order correlation with %Ecorr[(T)], their performance on organic molecules was significantly poorer[44] (ESI Fig. S6–S7†). This suggests that WFT-based diagnostics are more predictive of whether a system has strong MR character. A closely related question is to what extent MR diagnostics and associated cutoffs are transferable from one chemical space to another. We compare the relationship between %Ecorr[(T)] and MR diagnostics for TMCs to that for equilibrium or stretched organic molecules. We observe divergent behavior for representative MR diagnostics across these two sets. Organic molecules and TMCs have distinct T1 diagnostic vs. %Ecorr[(T)] distributions (Fig. 2). We therefore conclude that the T1 diagnostic is not a transferable metric for measuring MR character, because organic molecules and TMCs have different ranges of the T1 diagnostic for the same value of %Ecorr[(T)]. 
This lack of transferability across compounds supports previous arguments for distinct cutoff values for the T1 diagnostic when it is used for organic molecules or inorganic complexes.[52] Distinct distributions for organic molecules and TMCs are generally observed for all DFT- and CC-based diagnostics and %Ecorr[(T)] (ESI Fig. S8†).

Fig. 2

2D histogram for %Ecorr[(T)] vs. T1 (top) and %Ecorr[(T)] vs. nHOMO[MP2] (bottom) for more than 10 000 TMCs in this work (blue) and for the 12 500 equilibrium or distorted organic molecules in our prior work[44] (red). The relative density of systems within a specific bin is represented by the opacity of the coloring.

Overall, the MP2- and CASSCF-based diagnostics all have a greater degree of overlap with respect to %Ecorr[(T)] for organic molecules and TMCs, suggesting their greater transferability (ESI Fig. S9†). The one low-cost diagnostic for which organic molecules and TMCs have overlapping values at the same %Ecorr[(T)] is the MP2-based nHOMO[MP2] diagnostic. Because nHOMO[MP2] evaluation is not overly computationally demanding, this analysis highlights its potential use as a relatively low-cost, transferable metric for MR character determination (Fig. 2). Surprisingly, the DFT-based IND and rND diagnostics, although also motivated from the occupation of virtual orbitals upon electron excitation, are not transferable across chemical spaces (ESI Fig. S8†). The lack of transferability of IND and rND diagnostics may arise from the different degrees of accuracy of DFT for organic molecules and TMCs. We find that the distinct behavior of diagnostics across the two types of molecules is not due to their difference in size; invariant 2D distributions of %Ecorr[(T)] vs. MR diagnostics are observed between subsets of organic molecules and TMCs grouped by size (ESI Fig. S10 and S11†). We previously showed that MR classification is more robust when multiple diagnostics are used rather than a single diagnostic and cutoff,[51] motivating the use of not just a single diagnostic such as the reasonably performing nHOMO[MP2] but a range of WFT-based diagnostics to more robustly predict different regimes of MR effect. To bridge the gap in performance between low-cost DFT-based diagnostics and computationally demanding WFT-based diagnostics, we trained ANN models to predict the WFT-based diagnostics and %Ecorr[(T)] using DFT-based diagnostics and Coulomb-decay revised autocorrelations (CD-RACs),[44] a set of graph-based descriptors that encode 3D geometric information of TMCs as inputs (see Computational details). 
With this approach, we predict WFT-based diagnostics for TMCs with similar accuracies to predictions for organic molecules,[44] despite the poor linear correlations between DFT- and WFT-based diagnostics (Fig. 3, ESI Fig. S4–S5 and Table S2†). In addition, we predict %Ecorr[(T)] particularly well from the combination of DFT-based diagnostics and CD-RACs with a Pearson's r of 0.95 and an MAE of 0.21% (i.e., scaled MAE = 0.015). Given the relatively poor linear correlation between individual MR diagnostics and %Ecorr[(T)] for the TMCs, the accurate prediction of %Ecorr[(T)] highlights the utility of our model in practical VHTS (ESI Fig. S6†).

Fig. 3

(left) Scaled MAE for WFT-based diagnostics and %Ecorr[(T)] for the TMCs (blue) and organic structures (red) on the set-aside test data. The mean scaled MAE for all WFT-based diagnostics and %Ecorr[(T)] is also shown, with the error bar representing a standard deviation. The scaled MAE is not shown for %Ecorr[(T)] on the organic space, because %Ecorr[(T)] is not an ML model target property in ref. 44. (middle) Predicted vs. actual %Ecorr[(T)] on the set-aside 20% test data points (i.e., from more than 10 000 TMCs) colored by kernel density estimation (KDE) density values, as indicated by the inset color bar. A black dashed parity line is also shown. (right) Distributions of absolute test errors for %Ecorr[(T)] (unitless, bins of 0.1) with the MAE annotated as a green vertical bar and the cumulative count shown in blue according to the axis on the right.

Cancellation of error in MR effect

Numerous efforts[24,38,40,41,43,52] have focused on quantifying the MR character of a single structure and the MR effect on energy evaluation using single-reference methods. However, most property predictions in chemistry are determined from the relative energy of multiple geometric and/or electronic structures, potentially leading to cancellation of error. Here, we investigate whether the MR effect between multiple structures tends to accumulate or cancel for representative properties. We studied the adiabatic HS-to-LS splitting, ΔEH–L, which we obtain from the relative electronic energies of two spin states of the same compound in their respective optimized geometries. We also compute the adiabatic ionization potential, IP, which we compute as the electronic energy difference between a molecule before and after electron removal, including any reorganization of the oxidized species. For both properties, we observe that differences in MR character are more important than the total degree of MR character because the MR effect cancels when calculating properties involving multiple structures. The error for ΔEH–L obtained with CCSD in comparison to CCSD(T), i.e., |ΔΔEH–L[CCSD–CCSD(T)]|, correlates well (Pearson's r = 0.92) with the absolute difference of %Ecorr[(T)] of the two structures (Fig. 4). If we instead attempt to predict CCSD errors from the total MR character summed over both structures, we obtain a much poorer correlation (Pearson's r = −0.52, Fig. 4). To probe why high MR character does not always lead to high MR effect, we considered representative compounds. For the example of cis Cr(ii)(NS−)2He4, the ΔΔEH–L[CCSD–CCSD(T)] is small at 7.2 kcal mol−1, although both the LS and HS structures have significant and comparable MR character (LS %Ecorr[(T)] = 89.2, HS %Ecorr[(T)] = 91.1). Another complex, Co(iii)(NH2−)He5, has a relatively small amount of MR character, as judged by the sum of two spin states (LS %Ecorr[(T)] = 94.9, HS %Ecorr[(T)] = 98.4). 
However, the imbalance in MR character (i.e., %Ecorr[(T)]) for the two spin states leads to a large ΔΔEH–L[CCSD-CCSD(T)] (18.2 kcal mol−1).
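The comparison between the difference and the sum of %Ecorr[(T)] can be sketched as below. This is only an illustration under stated assumptions: the helper names are hypothetical, and the two data points are the LS and HS values quoted in the text for Cr(ii)(NS−)2He4 and Co(iii)(NH2−)He5, not the full data set.

```python
import numpy as np


def mr_features(pct_ls, pct_hs):
    """Return (|difference|, sum) of %Ecorr[(T)] for the LS and HS states."""
    pct_ls = np.asarray(pct_ls, dtype=float)
    pct_hs = np.asarray(pct_hs, dtype=float)
    return np.abs(pct_ls - pct_hs), pct_ls + pct_hs


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


# LS and HS %Ecorr[(T)] values quoted in the text for the two example complexes.
diff, total = mr_features([89.2, 94.9], [91.1, 98.4])
# diff is about [1.9, 3.5] and total is about [180.3, 193.3]
```

Over a full data set, `pearson_r(diff, abs_error)` and `pearson_r(total, abs_error)` would give the two correlations compared in Fig. 4, where `abs_error` stands for |ΔΔEH–L[CCSD–CCSD(T)]|. For the two examples, the Co complex has the larger imbalance (3.5 vs. 1.9) despite having less total MR character, in line with its larger CCSD error.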

Fig. 4

The absolute difference in adiabatic spin-splitting energy between CCSD and CCSD(T), i.e., |ΔΔEH–L[CCSD-CCSD(T)]|, vs. the absolute difference (top) and the sum (bottom) of %Ecorr[(T)] of the two spin states. Points are colored by kernel density estimation (KDE) density values, as indicated by the inset color bar. A black dashed best-fit line is shown along with the Pearson correlation coefficient. Cr(ii)(NS−)2He4 is shown as a representative example of how comparable MR character in the two spin states cancels, leading to a small MR effect on the property prediction. Atoms are colored as follows: purple for Cr, blue for N, yellow for S, and green for He.

The relationship between differences in MR character and errors that are indicative of MR effect also applies to DFT errors, albeit more weakly. Choosing B3LYP as a representative functional, its error with respect to the CCSD(T) ΔEH–L, i.e., |ΔΔEH–L[B3LYP-CCSD(T)]|, shows a moderate correlation with the absolute difference of %Ecorr[(T)] (Pearson's r = 0.45), which is stronger than that observed for the sum of %Ecorr[(T)] (Pearson's r = −0.11) (ESI Fig. S12†). Observations of better correlations of property errors to MR character differences than to total MR character also hold when evaluating adiabatic IP (ESI Fig. S13†).

Transfer learning to improve prediction accuracy

Because high MR character in one structure or electronic state does not necessarily lead to large DFT (or single-reference WFT) errors for property evaluations, strategies are needed to predict and correct errors for a property of interest rather than relying on MR character to serve as a proxy for high property uncertainty. We previously developed an approach[44] to predict the degree of MR character of a single structure at low cost (i.e., with DFT-level diagnostics and CD-RAC descriptors), which we now extend to property prediction (ESI Table S3†). Here, we demonstrate a transfer learning approach with ANN models to predict the CCSD(T) adiabatic ΔEH–L and IP from CD-RACs and information obtained from DFT calculations, including the sums and differences of the six DFT-based MR diagnostics and DFT-evaluated ΔEH–L and IP from four density functionals used in evaluating the MR diagnostics (i.e., BLYP, B3LYP, PBE, and PBE0, see Methods and ESI Table S3†). These trained ANN transfer learning models accurately predict the CCSD(T) result at DFT cost (Fig. 5 and ESI Fig. S14†). With this model, we obtain a mean absolute error (MAE) of 2.8 kcal mol−1 for ΔEH–L that is three-fold lower than the error obtained from using the B3LYP hybrid functional (Fig. 5). We observe similar behavior for the IP, where the transfer learning MAE of 0.14 eV is one third of the error obtained using the B3LYP hybrid functional (ESI Fig. S14†).

Fig. 5

Distributions of absolute errors for ΔEH–L predicted with DFT using B3LYP (red) and transfer learning models (gray) on the set-aside test data, with the cumulative count shown according to the axis on the right (top). The MAEs are shown as vertical bars at 10.2 kcal mol−1 for DFT and 2.8 kcal mol−1 for transfer learning. The MAE of the multi-pronged strategy of transfer learning, uncertainty quantification, and multi-theory modeling vs. the fraction of CCSD(T) calculations required (bottom). In all cases, the CCSD(T) result is treated as the reference against which MAEs are evaluated.

In addition, our transfer learning ANN model can be systematically improved by using WFT-based (i.e., MP2, CCSD) diagnostics that are more predictive of strong correlation but still lower cost to compute than CCSD(T) (ESI Table S4†). For example, by including MP2-based diagnostics (i.e., nHOMO[MP2] and nLUMO[MP2]) and ΔEH–L (or IP) computed by MP2, we lower the MAE of ΔEH–L to 2.2 kcal mol−1 and IP to 0.12 eV. These MAEs are further reduced to 0.4 kcal mol−1 for ΔEH–L and 0.06 eV for IP if we include CCSD-based diagnostics and CCSD-computed ΔEH–L and IP. More interestingly, we see these large improvements of the transfer learning model performance even though the MP2- or CCSD-evaluated ΔEH–L and IP do not show significant improvements over DFT in comparison to the CCSD(T) reference (ESI Table S4†). This observation suggests our transfer learning models do learn from these WFT-based diagnostics to better predict ΔEH–L and IP computed by CCSD(T). In addition, we achieve comparable performance to other transfer learning approaches[59,60] demonstrated on organic molecules in terms of scaled MAEs on set-aside test data. By focusing on properties involving multiple geometric and/or electronic structures, we are able to take advantage of error cancellation in MR effects, in comparison to previous transfer learning efforts that treat each electronic state and structure separately in predicting properties such as the correlation energy. Although the transfer learning models demonstrated good overall performance in the prediction of CCSD(T)-level properties, we next investigated whether we could identify specific compounds with large model uncertainty where errors might also be expected to be large. In such cases, a transfer learning correction may lead to large errors that would motivate carrying out the full CCSD(T) calculation instead. 
To quantify uncertainty, we used the distance in latent space developed in our previous work[61] as an uncertainty quantification (UQ) metric on our transfer learning models to select complexes likely to benefit from explicit CCSD(T) calculations. In this approach, we selectively perform CCSD(T) calculations on the TMCs that have the largest distance to training data in latent space, and thus highest model uncertainty, but use transfer learning predictions for the others at DFT cost. If we carry out CCSD(T) on the 30% of the data with the highest uncertainty, we reduce errors by a factor of three and achieve coupled cluster accuracy (i.e., 1 kcal mol−1 MAE) for the prediction of the CCSD(T) ΔEH–L value (Fig. 5). Given that the computational cost of the DFT calculations is negligible relative to CCSD(T) calculations for the moderately sized TMCs in our dataset, we achieve a three-fold acceleration in data acquisition compared to an all-CCSD(T) approach while maintaining close-to-CCSD(T) accuracy. If we aim for an MAE of 1.5 kcal mol−1 and accept transfer learning predictions on points with higher uncertainty, we reduce the number of complexes that require WFT calculations to only 19% (i.e., five-fold speedup). Similar speedups are observed for the prediction of adiabatic IP. We only need to carry out the 40% of the CCSD(T) calculations with the largest ML model uncertainty to achieve an MAE of 0.042 eV (i.e., 1 kcal mol−1). The percentage of CCSD(T) calculations carried out can be further reduced to 30% if we aim for an MAE of 0.065 eV (i.e., 1.5 kcal mol−1, ESI Fig. S14†). This strategy of performing CCSD(T) calculations on high-uncertainty points shows significant improvement over the previous strategy[61] where we would avoid making a prediction on these points. 
For example, if we only withheld predictions on uncertain points, we would have to discard 74% of the complexes for ΔEH–L and 90% of the complexes for IP transfer learning to retain the CCSD(T) accuracy of 1 kcal mol−1 on the points for which a prediction is still made (ESI Fig. S15†). Thus, this multi-pronged strategy of transfer learning, ML model UQ, and multi-level modeling accelerates data acquisition while maintaining high overall data fidelity for chemical discovery.
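The uncertainty-based triage described above can be sketched as follows. This is a minimal illustration rather than the workflow used in this work: the function name and toy numbers are hypothetical, the uncertainty array stands in for the latent-space distance, and points sent to CCSD(T) are assumed to carry zero error because CCSD(T) is the reference.

```python
import numpy as np


def triaged_mae(y_ref, y_pred, uncertainty, fraction_ref):
    """MAE when the most uncertain points are recomputed at the reference level.

    y_ref        : reference (e.g., CCSD(T)) property values
    y_pred       : transfer learning predictions
    uncertainty  : per-point model uncertainty (larger = less confident)
    fraction_ref : fraction of points sent to the reference method
    """
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    n_ref = int(round(fraction_ref * y_ref.size))
    order = np.argsort(uncertainty)[::-1]      # most uncertain first
    abs_err = np.abs(y_pred - y_ref)
    abs_err[order[:n_ref]] = 0.0               # explicit reference calculation
    return float(abs_err.mean())


# Hypothetical toy data: the largest error coincides with the largest uncertainty.
mae_all_ml = triaged_mae([0, 0, 0, 0], [1, 2, 3, 6], [0.1, 0.2, 0.3, 0.9], 0.0)
mae_triaged = triaged_mae([0, 0, 0, 0], [1, 2, 3, 6], [0.1, 0.2, 0.3, 0.9], 0.25)
```

Scanning `fraction_ref` from 0 to 1 gives a curve of the kind shown in the bottom panel of Fig. 5. If the DFT cost is treated as negligible, the speedup relative to an all-CCSD(T) campaign is roughly 1/`fraction_ref`, i.e., about three-fold at 30% and five-fold at 19%.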

Conclusions

In conclusion, we studied trends in MR character for over 10 000 TMCs. Over this set, we observed that complexes with fewer unpaired d electrons (i.e., LS) and stronger ligand fields have more significant MR character. Taking both organic molecules and TMCs into consideration, we showed that DFT- and CC-based diagnostics (e.g., T1 diagnostic) have distinct relationships with %Ecorr[(T)] for the two classes of molecules, thus limiting their transferability. In contrast, MP2- and CASSCF-based diagnostics have more consistent relationships with %Ecorr[(T)] for organic molecules and TMCs, demonstrating greater transferability. Therefore, we built ML models to predict these computationally demanding and transferable WFT-based diagnostics and %Ecorr[(T)] from less costly DFT-based diagnostics. We obtained excellent accuracy to directly predict %Ecorr[(T)] (i.e., MAE = 0.21), demonstrating the potential of these models for use in VHTS. Motivated by the fact that most chemical properties are determined from the relative energies of multiple geometric and/or electronic structures, we investigated the effect of MR character on two properties that depend on multiple optimized geometries, the adiabatic spin splitting (ΔEH–L) and ionization potential (IP). We observed that differences in MR character are more important than the cumulative degree of MR character, suggesting that cancellation of MR effects outweighs their accumulation. As a result, strong MR character in a single structure does not necessarily lead to large DFT errors. Motivated by this observation, we built two ML models to predict the CCSD(T) adiabatic ΔEH–L and IP via a transfer learning approach. This approach demonstrated a three-fold reduction in errors compared to using B3LYP on both properties. Finally, we introduced UQ and multi-level modeling into our workflow in which we carried out CCSD(T) calculations on the most uncertain points and used transfer learning predictions on the others. 
We demonstrated that this multi-pronged strategy accelerates data acquisition by a factor of three while maintaining high overall data fidelity (i.e., 1 kcal mol−1 CCSD(T) accuracy) for chemical discovery. We emphasize that this great performance is observed on energetic properties of a chemical system and further investigations are required on more complex electronic structure properties such as the wavefunction. However, we would expect our multi-pronged strategy to be general and have similar accuracy if data from higher-level theory (e.g., phaseless auxiliary field quantum Monte-Carlo[62]) and experiments are provided as the reference. We anticipate our observations on the cancellation of MR effects in property evaluations and our multi-pronged strategy to overcome cost-accuracy trade-off limitations in VHTS to be broadly applicable for challenging transition-metal compound spaces.

Computational details

Data sets

Mononuclear octahedral transition-metal complexes (TMCs) with Cr, Mn, Fe, and Co in +2 and +3 oxidation states were studied in up to three spin states, i.e., high, intermediate, and low, as follows: quintet, triplet, and singlet for d6 Co(iii)/Fe(ii) and d4 Mn(iii)/Cr(ii); sextet, quartet, and doublet for d5 Fe(iii)/Mn(ii), and quartet and doublet for d3 Cr(iii) and d7 Co(ii) (ESI Table S5†). We used monodentate ligands from both the spectrochemical series[63] and our prior OHLDB set[64] (ESI Table S6†). To restrict the system size, we employed He atoms as four to six of the six ligands. For the remaining non-He ligands, we considered both cis and trans symmetry, and we varied the metal–He distance to mimic ligand field strength differences while all other metal–ligand distances were freely optimized (ESI Table S7†).
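The spin states listed above follow from the d-electron count of each metal ion. The lookup below is a small sketch of that bookkeeping; the names are hypothetical, and labeling the quartet and doublet of the d3 and d7 ions as "HS" and "LS" is an assumption made here for convenience.

```python
# Spin multiplicities (2S + 1) studied for each d-electron count, as listed in the text.
SPIN_STATES = {
    3: {"HS": 4, "LS": 2},            # Cr(iii)
    4: {"HS": 5, "IS": 3, "LS": 1},   # Mn(iii), Cr(ii)
    5: {"HS": 6, "IS": 4, "LS": 2},   # Fe(iii), Mn(ii)
    6: {"HS": 5, "IS": 3, "LS": 1},   # Co(iii), Fe(ii)
    7: {"HS": 4, "LS": 2},            # Co(ii)
}

# Periodic-table group numbers of the metals considered.
GROUP = {"Cr": 6, "Mn": 7, "Fe": 8, "Co": 9}


def d_count(metal: str, oxidation_state: int) -> int:
    """Formal d-electron count of a first-row transition-metal ion."""
    return GROUP[metal] - oxidation_state


def multiplicities(metal: str, oxidation_state: int) -> dict:
    """Spin multiplicities considered for the given metal and oxidation state."""
    return SPIN_STATES[d_count(metal, oxidation_state)]


def unpaired_electrons(multiplicity: int) -> int:
    """Number of unpaired electrons implied by a spin multiplicity."""
    return multiplicity - 1
```

For instance, `multiplicities("Fe", 2)` returns the quintet, triplet, and singlet of d6 Fe(ii), corresponding to four, two, and zero unpaired electrons.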

DFT geometry optimizations

DFT geometry optimizations with the B3LYP[65-67] global hybrid functional were carried out using a developer version of graphical-processing unit (GPU)-accelerated electronic structure code TeraChem.[68-70] The LANL2DZ effective core potential[71] basis set was used for metals and the 6-31G* basis for all other atoms. Singlet spin states were calculated with the spin-restricted formalism while all other calculations were carried out in a spin-unrestricted formalism. In all DFT geometry optimizations, level shifting[72] of 0.25 Ha on all virtual orbitals was employed. Initial geometries were assembled by molSimplify[73,74] and optimized using the L-BFGS algorithm in translation rotation internal coordinates (TRIC)[75] to the default tolerances of 4.5 × 10⁻⁴ hartree per bohr for the maximum gradient and 1 × 10⁻⁶ hartree for the energy change between steps. During the optimization, the positions of the metal and He atoms were fixed to maintain the target metal–He distances and angles. Geometry checks[76,77] were applied to eliminate optimized structures that deviated from the expected octahedral shape following previously established metrics[76,77] without modification (ESI Table S7†).

MR diagnostic calculations

Following our prior studies,[44,51] we calculated 14 MR diagnostics[24,32-41] using ORCA 4.0.2.1 (ref. 78,79) with the cc-pVTZ basis set on the metals as well as P and S elements and the cc-pVDZ basis set on all other atoms (ESI Table S1†). To evaluate the MR character, the restricted open-shell formalism was used in all DFT and Hartree–Fock (HF) calculations. We chose the restricted open-shell formalism because it was observed[24,52] that unrestricted formalism can recover some MR effects in open-shell systems and thus lead to smaller MR diagnostics. We converged a B3LYP calculation and used it to initialize both DFT calculations with other density functionals (i.e., BLYP, B1LYP, PBE, and PBE0) and the HF calculations. This ensured we converged a consistent electronic state over multiple calculations and also saved computational time. The converged HF wavefunction was then used for MP2 and CCSD(T) calculations. Finally, the MP2 natural orbitals were used to set up a CASSCF calculation with active spaces of 10, 12, and 14 orbitals (ESI Fig. S16†). All MR diagnostics were computed using the default parameters in ORCA (ESI Table S8†). During the computation of total atomization energy (TAE)-based diagnostics, we assumed heterolytic dissociation for the metal–ligand bond (i.e., the oxidation state of the metal does not change) and homolytic dissociation for the atoms in the ligands, where each individual atom kept its formal charge (ESI Table S6†). We chose the percentage of correlation energy recovered by CCSD compared to CCSD(T) (i.e., %Ecorr[(T)]) as the figure of merit, as we observed good correspondence of %Ecorr[(T)] and %Ecorr[T] in both equilibrium and distorted organic molecules in our previous work.[44] We also tested it for complexes with six helium atoms as ligands in this work and found excellent agreement between %Ecorr[(T)] and %Ecorr[T] (ESI Fig. S1†). 
If all 14 MR diagnostics could not be successfully computed (e.g., due to lack of SCF convergence or one of the calculations exceeding the allowed wall time), we removed the single TMC structure (i.e., at a single M–He bond length) from the dataset (ESI Table S9†). A few (274, ca. 2%) CCSD(T) calculations resulted in significantly different perturbative triples corrections among TMCs with different metal–He distances but same chemical composition, potentially due to the CCSD wavefunctions converging to different electronic states. We removed those cases by the Grubbs outlier test[80] and Z-score test by comparing the perturbative triples corrections obtained using TMCs with the same chemical composition and ligand symmetry but different metal–He distances (ESI Fig. S17 and Tables S10–S11†). We also removed TMCs where the standard deviation of the leading weight of the CASSCF wavefunction, C₀², obtained with the three active spaces (i.e., with 10, 12, and 14 active orbitals) was larger than 0.1 (334, ca. 3%), which indicated that a final active space of 14 orbitals was not sufficient (ESI Fig. S18 and Table S12†).
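The Z-score part of the outlier screening can be illustrated with the short sketch below. It is not the procedure used to build the data set: the threshold of 2.0, the function name, and the triples corrections are all hypothetical, and the Grubbs test is not shown.

```python
import numpy as np


def zscore_outliers(values, threshold=2.0):
    """Flag values whose sample Z-score magnitude exceeds `threshold`.

    `values` would hold, e.g., the perturbative triples corrections of TMCs
    that share composition and ligand symmetry but differ in metal-He distance.
    """
    values = np.asarray(values, dtype=float)
    if values.size < 3:
        return np.zeros(values.shape, dtype=bool)
    std = values.std(ddof=1)                   # sample standard deviation
    if std == 0.0:
        return np.zeros(values.shape, dtype=bool)
    return np.abs(values - values.mean()) / std > threshold


# Hypothetical triples corrections (hartree) for one complex at seven M-He distances.
triples = [-0.010, -0.011, -0.012, -0.011, -0.010, -0.011, -0.050]
mask = zscore_outliers(triples)
```

Here only the last entry is flagged. With few points per group the attainable Z-score is bounded, so in practice the threshold has to be chosen with the group size in mind.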

ML models

As in prior work,[44,51] we use Coulomb-decay revised auto-correlations (CD-RACs)[44] as descriptors for all of our machine learning models. CD-RACs are sums of products and differences of five atom-wise heuristic properties (i.e., topology, identity, electronegativity, covalent radius, and nuclear charge) on the 2D molecular graph divided by the pairwise atomic distance. This incorporation of the pairwise distance imparts 3D geometric information to graph-based RACs[81] to distinguish TMCs with the same chemical composition but different metal–He distances. We chose CD-RACs as descriptors because RACs have been previously demonstrated to provide good performance in predicting equilibrium properties of TMCs[77,82] and CD-RACs have shown superior performance in predicting MR diagnostics for both equilibrium and non-equilibrium geometries of organic molecules in comparison to several alternatives.[44] As motivated previously,[81] we apply a maximum bond depth of three and eliminate constant RACs (ESI Text S1†). For properties that involve two structures (i.e., adiabatic spin splitting and ionization potential), the CD-RACs of the two geometries were concatenated (ESI Table S3†). For all artificial neural network (ANN) models, the hyperparameters were selected using HyperOpt[83] with 200 evaluations, using a random 80/20 train/test split, with 20% of the training data (i.e., 16% overall) used as the validation set (ESI Table S13†). All ANN models were trained using Keras[84] with TensorFlow[85] as a backend. All models were trained with the Adam optimizer for up to 2000 epochs, using dropout, batch normalization, and early stopping to avoid over-fitting.
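The data partition described above (a random 80/20 train/test split, followed by holding out 20% of the training portion for validation) can be sketched as below. The function name, the fixed seed, and the sample count of 1000 are assumptions made for illustration; the original work does not specify how the shuffle was implemented.

```python
import random


def split_indices(n_samples, test_frac=0.2, val_frac_of_train=0.2, seed=0):
    """Return (train, val, test) index lists for a random two-stage split.

    First, test_frac of all samples is set aside as the test set.
    Then val_frac_of_train of the remaining samples becomes the
    validation set, i.e., 0.8 * 0.2 = 16% of the full dataset.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    n_test = round(n_samples * test_frac)
    test = indices[:n_test]
    remaining = indices[n_test:]

    n_val = round(len(remaining) * val_frac_of_train)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 640 160 200
```

With this partition, 64% of the data is used for fitting, 16% for validation (e.g., to trigger early stopping and to score hyperparameter candidates), and 20% is reserved for testing.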

Data availability

The datasets supporting this article have been uploaded as part of the ESI.† The machine learning models are available on Zenodo at the URL https://zenodo.org/record/5851432#.YeHOoS-B1pQ.

Author contributions

C. D., D. B. K. C., and H. J. K.: conceptualization, methodology. C. D. and D. B. K. C.: data curation, writing – original draft preparation. C. D., D. B. K. C., and H. J. K.: visualization, investigation. H. J. K.: supervision. C. D., D. B. K. C., A. N., and H. J. K.: writing – reviewing and editing.

Conflicts of interest

The authors declare no competing financial interest.

54 in total

1. Navigating Transition-Metal Chemical Space: Artificial Intelligence for First-Principles Design.

Authors: Jon Paul Janet; Chenru Duan; Aditya Nandy; Fang Liu; Heather J Kulik
Journal: Acc Chem Res Date: 2021-01-22 Impact factor: 22.384

2. Development of the Colle-Salvetti correlation-energy formula into a functional of the electron density.

Authors:
Journal: Phys Rev B Condens Matter Date: 1988-01-15

3. Perspective: Kohn-Sham density functional theory descending a staircase.

Authors: Haoyu S Yu; Shaohong L Li; Donald G Truhlar
Journal: J Chem Phys Date: 2016-10-07 Impact factor: 3.488

4. Rational Density Functional Selection Using Game Theory.

Authors: Suzanne McAnanama-Brereton; Mark P Waller
Journal: J Chem Inf Model Date: 2017-12-19 Impact factor: 4.956

5. Automation of Active Space Selection for Multireference Methods via Machine Learning on Chemical Bond Dissociation.

Authors: WooSeok Jeong; Samuel J Stoneburner; Daniel King; Ruye Li; Andrew Walker; Roland Lindh; Laura Gagliardi
Journal: J Chem Theory Comput Date: 2020-03-02 Impact factor: 6.006

6. Efficient Computational Screening of Organic Polymer Photovoltaics.

Authors: Ilana Y Kanal; Steven G Owens; Jonathon S Bechtel; Geoffrey R Hutchison
Journal: J Phys Chem Lett Date: 2013-04-29 Impact factor: 6.475

7. Communication: An adaptive configuration interaction approach for strongly correlated electrons with tunable accuracy.

Authors: Jeffrey B Schriber; Francesco A Evangelista
Journal: J Chem Phys Date: 2016-04-28 Impact factor: 3.488

8. Geometry optimization made simple with translation and rotation coordinates.

Authors: Lee-Ping Wang; Chenchen Song
Journal: J Chem Phys Date: 2016-06-07 Impact factor: 4.304

9. Approaching coupled cluster accuracy with a general-purpose neural network potential through transfer learning.

Authors: Justin S Smith; Benjamin T Nebgen; Roman Zubatyuk; Nicholas Lubbers; Christian Devereux; Kipton Barros; Sergei Tretiak; Olexandr Isayev; Adrian E Roitberg
Journal: Nat Commun Date: 2019-07-01 Impact factor: 14.919