Literature DB >> 36112354

Large-Scale Benchmarking of Multireference Vertical-Excitation Calculations via Automated Active-Space Selection.

Daniel S King¹, Matthew R Hermes¹, Donald G Truhlar², Laura Gagliardi³.

Abstract

We have calculated state-averaged complete-active-space self-consistent-field (SA-CASSCF), multiconfiguration pair-density functional theory (MC-PDFT), hybrid MC-PDFT (HMC-PDFT), and n-electron valence state second-order perturbation theory (NEVPT2) excitation energies with the approximate pair coefficient (APC) automated active-space selection scheme for the QUESTDB benchmark database of 542 vertical excitation energies. We eliminated poor active spaces (20-40% of calculations) by applying a threshold to the SA-CASSCF absolute error. With the remaining calculations, we find that NEVPT2 performance is significantly impacted by the size of the basis set the wave functions are converged in, regardless of the quality of their description, which is a problem absent in MC-PDFT. Additionally, we find that HMC-PDFT is a significant improvement over MC-PDFT with the translated PBE (tPBE) density functional and that it performs about as well as NEVPT2 and second-order coupled cluster on a set of 373 excitations in the QUESTDB database. We optimized the percentage of SA-CASSCF energy to include in HMC-PDFT when using the tPBE on-top functional, and we find the 25% value used in tPBE0 to be optimal. This work is by far the largest benchmarking of MC-PDFT and HMC-PDFT to date, and the data produced in this work are useful as a validation of HMC-PDFT and of the APC active-space selection scheme. We have made all the wave functions produced in this work (orbitals and CI vectors) available to the public and encourage the community to utilize this data as a tool in the development of further multireference model chemistries.

Entities: Chemical

Year: 2022 PMID： 36112354 PMCID： PMC9558375 DOI： 10.1021/acs.jctc.2c00630

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.578

Introduction

The accurate treatment of excited states is critical for understanding photochemical phenomena,[1−7] and it has been a long-standing goal of the electronic structure community.[8−19] Although treating excited states is difficult in general, it is particularly challenging when single-determinant methods such as Hartree–Fock or Kohn–Sham density functional theory provide a poor reference state for predicting excited states. This can occur either because the excited states vary greatly from the ground state (e.g., double excitations[20]) or because the ground state itself is not well described (e.g., strongly correlated systems[21−24]). One can overcome these deficiencies by using multiple-determinant reference states, and the methods that take this approach are called multireference methods. The most popular multireference method is the complete active-space self-consistent field (CASSCF) method,[25] which expresses approximate wave functions in the space of all possible configurations of electrons in an “active space” of orbitals and electrons. These wave functions can then serve as references for perturbation theories such as MC-QDPT,[26,27] CASPT2,[28,29] and NEVPT2.[30,31] Alternatively, quantitative accuracy can be achieved by using a nonclassical-energy functional applied to the converged wave function in multiconfiguration nonclassical functional theory (MC-NCFT).[32−36] The total energy is then a sum of the classical portion of the CASSCF energy and the nonclassical energy from the functional. The most common form of MC-NCFT utilizes nonclassical-energy functionals obtained by translating Kohn–Sham exchange–correlation functionals for use with multiconfigurational wave functions via the on-top pair density and is called multiconfiguration pair-density functional theory (MC-PDFT). The translated PBE functional (tPBE) has been used as the functional in the majority of MC-PDFT calculations to date. The nonclassical energy from a density functional can be mixed with the nonclassical part of the CASSCF energy to form a “hybrid” nonclassical functional; for example, using a 0.75:0.25 mixture of tPBE and CASSCF nonclassical energies yields the tPBE0 functional.[37] Difficulties encountered in all such post-CASSCF methods are making the active space large enough and well balanced enough to converge to a qualitatively accurate description of the underlying wave function(s). The results can depend significantly on the size and nature of the active space and the initial orbital guess.[38] Moreover, in many occasions, the orbitals will change character during their optimization. For these reasons, such calculations often require expert human guidance to carefully choose and monitor the active space size and composition. Although CASSCF has been used since the 1980s,[25] the prospect of automated active space selection has only received significant attention within the last decade or so.[39−58] Recently, we published the ranked-orbital approach to select active spaces and the approximate pair coefficient (APC) approximation for low-cost estimates of the orbital entropies used in the ranking.[59] This automated scheme, inspired by the entropy-driven approach of Stein and Reiher,[42] allows for the flexible selection of active-space size with a hierarchy of levels (max(8,8), max(10,10), max(12,12)...) reminiscent of the CI level sequence (CISD, CISDT, CISDTQ, ...). Recently, Jacquemin and co-workers published the QUESTDB benchmark data set of 542 vertical excitation energies on a diverse set of small and midsize main-group molecules, calculated via a variety of high-level wave function methods in the aug-cc-pVTZ[60,61] basis.[20,62−66] In the present paper, we have undertaken the automated calculation of these excitation energies with SA-CASSCF, NEVPT2, and MC-PDFT using the APC-ranked-orbital active-space selection scheme. To benchmark and analyze the performance of various multireference methods on this diverse set of excitations, we eliminate poor active spaces (20–40% of calculations) by setting an error threshold on the SA-CASSCF excitation energy because that has previously been shown to be good way to judge the quality of the active space.[49] By analyzing results across different active space and basis set size choices, we find different trends in the performance of MC-PDFT and NEVPT2 where the performance of NEVPT2 is overly dependent on the basis set in which the underlying wave function is converged. Additionally, we are able to produce the first large-scale and robust comparison of MC-PDFT to other single-reference methods such as CC2 and find the CASSCF mixing parameter of 0.25 used in tPBE0 to be optimal. We have made all the wave functions converged in this work available to the public via publication of the converged orbitals and CI vectors and encourage others to use these data in the development of further multireference model chemistries.

Methods

Data Overview

The data we have examined can be found in the QUESTDB data set,[65] which consists of 542 vertical excitations of small and midsize main-group molecules (molecules with 1–10 non-hydrogenic atoms). Of these excitations, 491 are from singlet ground states and 51 are from doublet ground states. Every excitation in the QUESTDB data set is specified by its spatial and spin symmetries, and benchmark values are reported as “theoretical best estimates” (TBEs) calculated with a variety of high-level methods with the aug-cc-pVTZ[60,61] basis. These TBEs have been used in this work to judge the errors of all computed excitation energies, even those obtained with a different basis set.

Active-Space Selection

To obtain orbitals for the active-space selection scheme, we started with a restricted Hartree–Fock singlet wave function for closed-shell molecules and a restricted open-shell Hartree–Fock doublet wave function for doublet molecules, as calculated using PySCF.[67] The molecular point group was reduced to the highest available symmetry implemented for the PySCF SA-CASSCF solver: C2, C2, C, or D2. The APC-ranked-orbital active-space selection scheme[36,59] starts with a set of candidate localized orbitals, ranks them by their approximated orbital entropies, and then eliminates orbitals starting from the lowest-entropy orbitals (those with the highest entropies are considered to be the most important) until the active space size reaches a predetermined maximum number of configuration state functions. We next describe the generation of candidate orbitals, then the ranking scheme, and finally the maximum-size criteria. Following previous work,[36] up to 23 lowest-energy virtual orbitals of the Hartree–Fock calculation were selected, and orbitals within this subset were grouped by symmetry and Boys-localized[68] within each symmetry. Likewise, up to 23 highest-energy doubly occupied orbitals were also grouped by symmetry and Boys-localized within each symmetry. These two sets of localized orbitals (and the one singly occupied orbital, when present) were then considered as candidates for the active space. Next, we describe how we ranked the localized orbitals. In the originally published APC ranking scheme, given a set of doubly occupied candidate orbitals i and virtual orbitals a, the APC matrix C was calculated aswhere K is the diagonal virtual element of the exchange matrix and F and F are diagonal virtual and doubly occupied elements of the Fock matrix in the MO basis. The entropies of doubly occupied orbitals i are then calculated asand those of virtual orbitals a as Finally, any singly occupied orbitals are assigned the maximum entropy value from the set of doubly occupied and virtual orbital entropies ({S}, {S}). Note that the removal of a virtual orbital from consideration affects all doubly occupied entropies and vice-versa. This method is inexpensive because it uses only easily calculated diagonal Hartree–Fock matrix elements (Supporting Information Section S1.3). However, in the present work, we have found that in larger molecules (with >350 aug-cc-pVTZ basis functions), the APC entropies tend to overestimate the interaction of some virtual orbitals with the doubly occupied orbitals, artificially inflating the entropies of all doubly occupied orbitals and causing the selection of highly imbalanced active spaces (Supporting Information Figure S1). To overcome this issue, we propose an algorithmic extension of APC in which high-entropy virtual orbitals are removed from consideration when calculating entropies and then assigned the maximum entropy value (i.e., treated in the same way as singly occupied orbitals). The algorithm takes the following steps: Provide the sets of candidate doubly occupied/singly occupied/virtual orbitals ({L}, {L}, {L}). Calculate entropies ({S}, {S}, {S}) = APC({L}, {L}, {L}) and then remove the highest-entropy virtual orbital from L and put it in L. Repeat N times. Return ({S}, {S}, {S}). The algorithm above has a single parameter N (APC-N), which is the number of times the highest-entropy virtual orbital is removed. In this work, we have found a good value of N to be 2 (a scheme we refer to as APC-2), and we find that using APC-2 entropies results in more balanced active spaces and lower SA-CASSCF error than APC (Supporting Information Figure S2). The APC-2 entropies are then used to rank and select the orbitals for the active space by dropping the lowest-entropy orbitals until the active space size is lower than some maximum active space size. However, we note that the current scheme should be improved for the treatment of orbitals with degenerate entropies (Supporting Information Section S6.4). In more detail, the active space size for a given set of Norb active orbitals containing Nelec active electrons is calculated via the equation[69]where Nα + Nβ = Nelec and Nα = Nβ for even Nelec and Nα = Nβ + 1 for odd Nelec. The maximum active space size NCSFMax is set via a specification of a maximum number of active electrons and orbitals (NelecMax, NorbMax) whose size is calculated via eq ; this maximum active-space choice is notated as max(NelecMax, NorbMax). In this work, we calculate results at three choices of max(NelecMax, NorbMax): max(8,8) (NCSFMax = 1764), max(10,10) (NCSFMax = 19404), and max(12,12) (NCSFMax = 226512). Following this specification, all orbitals are selected and then the lowest-entropy orbital is successively dropped until the size of the active space calculated via eq is less than or equal to NCSF. As an example, we guide the reader through choosing a max(4,4) active space (NCSFMax = 20) from a set of orbitals with occupancies and entropies {(n, S)} In this case, the active space is selected as follows: (2,0.05),(2,0.5),(2,0.9),(1,0.9),(0,1.2),(0,0.2),(0,0.1) | (7,7) NCSF = 784 (2,0.5),(2,0.9),(1,0.9),(0,1.2),(0,0.2),(0,0.1) | (5,6) NCSF = 210 (2,0.5),(2,0.9),(1,0.9),(0,1.2),(0,0.2) | (5,5) NCSF = 75 (2,0.5),(2,0.9),(1,0.9),(0,1.2) | (5,4) NCSF = 20 with a resulting selected (5,4) active space.

Calculation of the Excitation Energies

Calculations of excited-state wave functions were carried out by state-averaged CASSCF, averaging over the ground state and the minimum necessary number of excited states of the symmetry specified by QUESTDB. For example, in a C2 molecule (e.g., water), if the symmetry of the excited state under consideration is specified to be 1A2 with no lower 1A2 excitations present, then the state averaging was done evenly over the 1A1 ground state and the 1A2 excited state. For higher 1A2 excitations, however, the state averaging included an additional 1A2 state for each 1A2 excitation lower in energy (again with weights for state averaging being the same for all states averaged). Standard convergence parameters were employed, and for a few poor active-space choices (0.4% of cases), the calculations failed to converge. Because the highest available point groups supported by the SA-CASSCF solver in PySCF have lower symmetry than those specified by QUESTDB for single atoms and diatomics, the point groups sometimes had to be reduced to the highest-symmetry subgroup. Additionally, the labeling of different irreps is sometimes a choice of axis convention, such as between B1 and B2 in C2 or between B1g/B2g/B3g and B1u/B2u/B3u in D2; we have done our best to match the irrep we think was used in QUESTDB. Calculations were done with the highest M allowed by the spin symmetry (e.g., if an excited state has S = 1, then for 8 electrons in the active space the active space would have 5 α and 3 β electrons). The tPBE and NEVPT2 energies of the converged SA-CASSCF states were then calculated using the implementations of these methods in PySCF. Our implementation of MC-PDFT within PySCF is currently available in the mrh repository.[70] Additionally, tPBE0 energies were calculated by averaging the SA-CASSCF and tPBE energies[37] The only implementation of NEVPT2 currently in PySCF is strongly contracted NEVPT2 (SC-NEVPT2),[31] and our NEVPT2 calculations use this. In order to maintain a consistent labeling, the excited state to be compared to the QUESTDB excitation energy was chosen to be the state highest in energy as judged by tPBE. Although this is not a fail-proof scheme in terms of isolating the “same” QUESTDB state of the specified symmetry due to root flipping, we have found it to be satisfactory for our work as the converged QUESTDB wave functions are unavailable and labels such as “n → π*” are ambiguous non-observables. However, because the present work shows that tPBE0 is more accurate than tPBE, we suggest ordering the states by tPBE0 in future work.

Method Timing

All converged CASSCF wave functions (orbitals and CI vectors) were saved to disk at the end of the calculation. Timings for tPBE and NEVPT2 calculations were achieved by loading in the converged CASSCF wave functions, computing the relevant quantity (the tPBE nonclassical energy or the NEVPT2 perturbative correction) and then saving the results. The amount of resources requested for each calculation was determined by an empirically derived formula dependent on the number of aug-cc-pVTZ basis functions in the underlying molecule (Supporting Information Section S8.2), and so, timings between the tPBE and NEVPT2 implementations available in PySCF can be fairly compared (although we note that methodologies can always be further optimized).

Plotting

Figures were made in Python using matplotlib as enhanced by Pandas[71,72] and Seaborn.[73] Seaborn calculates 95% confidence intervals for the mean values reported in plots by bootstrapping the mean value over 1000 random samplings of the underlying data.[74−76]

Results

Eliminating Poor Active Spaces

We calculated excitation energies for all 542 vertical excitations listed in the QUESTDB database with six combinations of active space and basis set: four involving max(12,12) APC-2 active spaces with decreasing basis size (aug-cc-pVTZ,[60,61] jun-cc-pVTZ,[77] cc-pVTZ,[78,79] and cc-pVDZ[78,79]) and two involving jun-cc-pVTZ with decreasing active space size [max(10,10) and max(8,8)]. We will refer to these combinations throughout the paper as Aug(12,12), Jun(12,12), TZ(12,12), DZ(12,12), Jun(10,10), and Jun(8,8). Figure shows the mean absolute errors of SA-CASSCF, tPBE, tPBE0, and NEVPT2 that we obtain for all wave functions converged at each combination of active space and basis set tested in this work. Adding together the number of converged calculations at each active space and basis set shown at the bottom of Figure yields 3237 calculations. As expected, we find that the SA-CASSCF error increases when we move from a larger to a smaller basis set with a given active-space scheme or when we move from a larger active space to a smaller one with a given basis set. However, in order to reasonably evaluate the accuracies of these methods, we need to eliminate results whose error is driven mainly by poorly chosen active spaces. To analyze only cases with reasonable active spaces, we set a threshold T on the SA-CASSCF error of 1.1 eV (TSA-CASSCF = 1.1 eV). That is, we consider that the APC scheme has produced a good active space if the error in the SA-CASSCF excitation energy is less than 1.1 eV.

Figure 1

Comparison of the mean absolute errors of SA-CASSCF, tPBE, tPBE0, and NEVPT2 across different active space and basis set sizes for all converged calculations. The number of converged excitations with each combination of active space and basis is shown below each column, and 95% confidence intervals for each mean are shown in black. Figure shows the performance of SA-CASSCF, tPBE, tPBE0, and NEVPT2 at different active space and basis set sizes after using the 1.1 eV SA-CASSCF error cutoff to eliminate poor active-space choices. As expected, instead of observing an increasingly poor performance for SA-CASSCF excitations as active space and basis set size is decreased, we instead see a consistent error of roughly 0.39 ± 0.03 eV with an increasing amount of excitations excluded by TSA-CASSCF = 1.1 eV. The number of excluded excitations roughly doubles from 19.2% at Aug(12,12) to 39.6% at DZ(12,12) with decreasing basis size and to 32.4% at Jun(8,8) with decreasing active space size. We note the very small increase of eight excluded excitations upon moving from Jun(12,12) to Aug(12,12), highlighting the very efficient nature of the jun basis set.[77] Of course, with a better automatic active-space selection scheme, one would observe an increased amount of excitations included at each active space and basis set size, but the error will remain fairly consistent.

Figure 2

Comparison of the mean absolute errors of SA-CASSCF, tPBE, tPBE0, and NEVPT2 excitations across different active space and basis set sizes included by TSA-CASSCF = 1.1 eV. The number of excitations included in this analysis for each combination of active space and basis set is shown below each group of bars, and 95% confidence intervals for each mean are shown in black. As we found for SA-CASSCF, we find that tPBE (0.25 ± 0.02 eV) and tPBE0 (0.20 ± 0.02 eV) maintain relatively consistent errors across different active spaces and basis set sizes when the 1.1 eV SA-CASSCF error threshold is applied. This is an intuitive result, as the accuracy of MC-PDFT is primarily contingent on the quality of the SA-CASSCF density and on-top density and on the quality of the on-top functional; if one eliminates the poor active spaces, then the functional (correlation) error may dominate, and this is approximately independent of the active space and basis set. In contrast, NEVPT2 shows quantitatively worse results as the basis set is decreased even as the wave function remains qualitatively well described. This makes sense because the power of NEVPT2 to change the SA-CASSCF energy stems from its perturber states, which are less capable of describing dynamic correlation within a smaller basis because the smaller basis set cannot represent the virtual-orbital space as well.[80−82] Figure shows that the difference between the tPBE0 excitation energy and the SA-CASSCF excitation energy remains fairly consistent across active spaces and basis sets, but there is a significant drop in the NEVPT2 correction when moving from aug-cc-pVTZ to cc-pVDZ, resulting in increased NEVPT2 error. Figure combined with Figure shows clearly how the NEVPT2 results degrade in quality with decreasing size of the basis set, while the performance of tPBE0 remains consistent. As the basis set is decreased in size, the mean absolute change to the SA-CASSCF excitation energy decreases for NEVPT2, while remaining constant for tPBE. These results provide a plausible explanation of the discrepancy in mean absolute error found for SC-NEVPT2 between the study of Schapiro et al.[83] (0.23 eV) and the more recent study of Sarkar et al.[84] (0.15 eV). They imply that it is due to the fact that the Sarkar study used the aug-cc-pVTZ basis, while the Schapiro study employed the cc-pVTZ basis. However, our results point to this being caused by a poorer performance of NEVPT2 with the smaller basis set and not due to a poorer zeroth-order description of the underlying wave functions.

Figure 3

Mean absolute changes to the SA-CASSCF excitation energy made by tPBE0 and NEVPT2 across different active space and basis set calculations included by TSA-CASSCF = 1.1 eV.

Mean absolute changes to the SA-CASSCF excitation energy made by tPBE0 and NEVPT2 across different active space and basis set calculations included by TSA-CASSCF = 1.1 eV. Further discussion of the error threshold is given in the Supporting Information, which shows the 1.1 eV SA-CASSCF error threshold to be optimal (albeit imperfect) for isolating subsets of automated wave function calculations that reproduce results curated by hand.[84] Additionally, we analyze alternative error thresholds on the NEVPT2 and tPBE0 error. However, an error criterion cannot be used when a benchmark excitation energy or experimental excitation energy is not available. Nevertheless, when an accurate value is not available, one can still use this criterion (although with somewhat less reliably) by comparing to one’s best estimate rather than to an accurate value. Clearly, if one’s best estimate is good, this will work as well as comparing to an accurate value. Finally, one might wonder what one can do to fix the active space if a calculation goes poorly. Of course, increasing the size of the active space via NCSFMax is a worthwhile option to explore if affordable, and it is clearly seen in Figure how this significantly increases the success rate of the selection algorithm. However, following our previous work,[59] we also recommend experimenting with different orbital localization schemes for initializing the ranked-orbital selection as this can be a low-cost way to converge to a reasonable result.

Comparison to Single-Reference Methods

Data Overview

In the QUESTDB database,[65] excitations from many methods are only reported for the 491 excitations from closed-shell (S0) molecules, and, due to double excitations and strongly mixed states, results from most methods are only available for about 460 of these excitations (Supporting Information Figure S15). Our Aug(12,12) results comprise 436 excitations included by TSA-CASSCF = 1.1 eV, 399 of which come from closed-shell molecules. Combining all methods and leaving out STEOM-CCSD, CCSDR(3), and CCSDT-3 for which there is significantly less available data (Supporting Information Figure S15), there are a total of 373 excitations consistently available for comparison with SA-CASSCF, tPBE0, NEVPT2, and 12 other methods in the QUESTDB database. Unlike Jacquemin and co-workers, we have not limited ourselves to comparisons based on “safe”[65,66] excitations, and this includes 29 excitations that would otherwise have been excluded (out of a total of 57 unsafe excitations in the total set of 542). Figure shows the mean absolute and signed errors of SA-CASSCF, tPBE, tPBE0, and NEVPT2 in comparison with 12 other methods in the QUESTDB database on the set of 373 excitations. First considering the mean absolute errors, we find that both NEVPT2 (0.18 eV) and tPBE0 (0.19 eV) have accuracy on par with CC2 (0.15 eV), with tPBE lagging significantly behind (0.24 eV). However, we note that the errors we report here for tPBE, tPBE0, and NEVPT2 are likely slightly overestimated, as our CASSCF error threshold is imperfect and fails to eliminate all cases with poor active-space choices (as discussed in the Supporting Information). Furthermore, this consistently available data set excludes all double excitations, for which the performance of the multireference methods is far superior (as discussed below).

Figure 4

Comparison of the mean signed and unsigned errors of various methods on the 373 Aug(12,12) excitations included by TSA-CASSCF = 1.1 eV error threshold. The 95% confidence intervals are shown in black. Left: Mean absolute errors. Right: Mean signed errors. Nevertheless, trends in the signed errors are particularly interesting, with all but four methods (ADC(3), ADC(2.5), tPBE, and tPBE0) overestimating excitations; this implies biased relative overstabilization of the ground state for most of the methods. One can clearly see how tPBE0 benefits from balancing the treatment of exchange and correlation, with SA-CASSCF overestimating excitations by 0.18 eV and tPBE underestimating by 0.08 eV, such that tPBE0 has nearly zero mean signed error. We note that the same good balance seems to occur in ADC(2.5),[85] which averages ADC(3) and ADC(2). The left of Figure shows the mean absolute error of different methods on excitations classified as double excitations for all methods with any calculated double excitations in the QUESTDB database; our automated approach was able to converge results within the 1.1 eV SA-CASSCF error threshold for 16/23 (70%) of the double excitations that have TBEs available, which is only slightly lower than the overall un-dropped-out fraction of 436/542 (80%) in the Aug(12,12) calculations. In keeping with the usual recommendation to use multireference methods for this class of excitation, we find that multireference methods are the only methods that perform consistently well on double excitations.

Figure 5

Left: Comparison of the mean absolute error of different methods on the entire subset of 23 double excitations in the QUESTDB data set. The amount of excitations available for each method (with SA-CASSCF, tPBE, tPBE0, and NEVPT2 included via a 1.1 eV SA-CASSCF error threshold) is marked under each bar. Right: Comparison of the mean absolute errors of various methods on the 165 Aug(12,12) excitations included by TSA-CASSCF = 1.1 eV with high multireference character (Max[M(ψGS),M(ψES)] > 0.14) and data available for every method shown, where M is the M diagnostic[40] of the corresponding wave function. 95% confidence intervals are shown in black. However, double excitations are not the only category of excitations, which are a challenge for single-reference approaches. To quantify the multireference character of single excitations, we have calculated the M diagnostic[40] of the ground and excited state Aug(12,12) wave functions included by the 1.1 eV SA-CASSCF error threshold and use the maximum of these values, MMax = Max[M(ψGS),M(ψES)]. Doing so, we find that the lowest MMax calculated for a double excitation is 0.14 (Supporting Information Table S12) and use this threshold as a classifier for identifying highly multireference single excitations; it happens to fall at slightly above the 50th percentile in the MMax distribution (Supporting Information Figure S20). The right of Figure shows a comparison of the mean absolute errors of SA-CASSCF, tPBE, tPBE0, and NEVPT2 to 12 single-reference methods on the 165 excitations with MMax > 0.14 included by TSA-CASSCF = 1.1 eV and data available for every method shown. Compared to Figure , we find that the performance of nearly all single-reference methods deteriorates significantly by about 0.05–0.07 eV when we consider only this high-MMax subset; this brings the performance of CCSD into line with tPBE0. In summary, we find tPBE0 and NEVPT2 to perform competitively on single excitations when compared to single-reference methods (Figures and 5) and to be the only methods capable of reasonably describing double excitations (Figure ). As such, we recommend tPBE0 and NEVPT2 as robust methods for calculating all classes of vertical excitations, although the active-space selection scheme may sometimes fail.

Performance by Excitation Type

Figure shows the errors classified by excitation type. In line with the Sarkar study,[84] we find that NEVPT2 is more accurate for triplet excitations than singlet excitations, and tPBE and tPBE0 follow the same trend. The figure shows that, with the exception of Rydberg states, tPBE0 has better performance than tPBE for every excitation category, and therefore, we recommend the use of tPBE0 rather than tPBE for calculating excitation energies of valence excitations. We also recommend tPBE0 for calculating a spectrum containing both valence and Rydberg excitations since the performance of the two methods is very similar (on average) for Rydberg states.

Figure 6

Mean absolute errors (in eV) of tPBE, tPBE0, and NEVPT2 Aug(12,12) calculations on various types of S0 excitations included by the threshold TSA-CASSCF = 1.1 eV.

Method Timing: tPBE0 Versus NEVPT2

Figure shows the average time consumed by the calculation of the NEVPT2 perturbative correction at different active space/basis set sizes and compares these timings to those for the calculation of the tPBE on-top energy by the methodology in Section . We find that at the normal grid size (grids_level = 3 in PySCF), tPBE is on average 114× less expensive than NEVPT2 for the large max(12,12) active spaces. This is because—as is well known—the cost of NEVPT2 scales very poorly with the size of the active space, while the cost of tPBE0 remains independent of that. Furthermore, the memory required for NEVPT2 also increases with active space size. It is around the max(12,12) active space size that the compute time for the perturbative correction begins to exceed the compute time of the underlying SA-CASSCF step, while the compute time of tPBE remains low.[86] For smaller active spaces such as max(8,8), the cost of NEVPT2 is comparable to that of tPBE and tPBE0.

Figure 7

Comparison of the mean compute times for the post-SCF portion of tPBE calculations with various grid specifications and for the post-SCF portion of NEVPT2 calculations with various active spaces and basis set sizes on the set of 533 excitations that were converged with all active spaces and basis sets. The costs of the SA-CASSCF portions of the calculations were removed from these comparisons by caching the converged wave functions. Keeping the cost down is important in many applications. Figure shows that the dependence of MC-PDFT compute times on grid size is a significant consideration; we observe a roughly 10× increase in cost from grids_level = 3 (3.4 s) to grids_level = 9 (30.8 s). Our studies find that the standard grids_level = 3 in PySCF is sufficient for excitations such as those we have calculated because we only see a significant change between the maximum and default grid size for a single excitation (Supporting Information Figure S24). Therefore, we recommend standard grid sizes for most applications involving state-averaged MC-PDFT.

Method Timing: tPBE0 Versus CC2 and CCSD

In an effort to give greater context to the standing of tPBE0 as a method for calculating vertical excitations outside of cases where multireference methods are absolutely needed (such as double excitations), we compare the timings of complete CASSCF + tPBE0 calculations to those of CC2 and CCSD. Figure shows the comparison of such timings for tPBE0 (including RHF convergence, Boys orbital localization, active-space selection, CASSCF optimization, and computation of the tPBE0 nonclassical energy) and CC2 and CCSD as computed in Psi4[87] for two excitations in QUESTDB. All calculations were given the same amount of computational resources as outlined in Supporting Information Section S8.2. We have chosen to show timings and accuracies for both a “medium-sized” excitation (pyrazine-x2, with 368 aug-cc-pVTZ basis functions) and a “large-sized” excitation (aza-naphthalene-x1, with 552 aug-cc-pVTZ basis functions). Additionally, Figure shows timings and accuracies for tPBE0 at all six of the active space and basis set combinations explored in this work.

Figure 8

Comparison of timings and accuracy between tPBE0 at the six active space/basis set combinations explored in this work and CC2 and CCSD in the aug-cc-pVTZ basis. Timings for tPBE0 include the steps of RHF convergence, Boys orbital localization, active-space selection, CASSCF optimization, and computation of the tPBE0 nonclassical energy. Timings for CC2 and CCSD were computed in the aug-cc-pVTZ basis using their implementation in Psi4[87] and were confirmed to reproduce the Jacquemin results. Focusing first on the aug-cc-pVTZ calculations, one can see that tPBE0 takes a comparable amount of time compared to CC2 and CCSD, both for pyrazine and aza-naphthalene. However, in both of these cases, costs can be cut significantly, while maintaining accuracy by decreasing active space and basis set size. As demonstrated by Figure , through a judicious choice of active space and basis set, tPBE0 has the potential to be much less expensive than comparative single-reference approaches, while achieving similar accuracy or better. For aza-naphthalene-x1, tPBE0 is about 16× as fast as CC2 at Jun(8,8) and about 72× as fast at DZ(12,12). The speedup one can obtain tends to be greater when considering larger systems. However, the idealized (albeit real) case shown for aza-naphthalene-x1 is far from general. First, one can only reduce basis set and active space size so far before one’s results become highly inaccurate with tPBE0, and the point at which this happens is highly excitation-dependent and somewhat dependent on the active-space selection scheme. Second, the timing behavior of CASSCF + tPBE0 is not always as well behaved: CASSCF optimization relies on a highly nonconvex and nonlinear optimization process, which may not conform to expected timing trends. An example of this can be seen in the pyrazine-x2 timings in Figure , where tPBE0@DZ(12,12) takes significantly more time than tPBE0@TZ(12,12). Further taking into account differences between implementations, we present Figure only to give readers a rough sense of timings for tPBE0 with respect to comparably accurate single-reference methods on different system sizes. Additionally, we attempted to compute timings for CC3 for these two excitations: the pyrazine-x2 result was computed in 1765 min (29 h) and aza-naphthalene-x1 was not able to finish within the 36 h time limit allowed by the resources available for these calculations. Finally, we note that CCSD also includes an iterative step, but a study of the convergence issues in CCSD is beyond our scope.

Optimizing the Mixing Parameter in Hybrid tPBE

A major motivation of this work was to generate data for benchmarking and improving MC-PDFT. As a first use of our data to optimize MC-PDFT functionals, we have investigated the optimal mixing parameter λ for hybrid tPBE (htPBE, for which the energy is given by λESA-CASSCF + (1 – λ)EtPBE) over the Aug(12,12) database. We have chosen this set of excitations because it is likely to have the smallest amount of poor active spaces erroneously included by the TSA-CASSCF = 1.1 eV error threshold. In other words, we expect this set of excitations to have the largest percentage of well-chosen active spaces. Figure shows the optimization of λ on the Aug(12,12) set of included excitations. Delightfully, we find that λ = 0.25—the same parameter used in tPBE0—is optimal for this set of excitations, in agreement with the much smaller study previously conducted on the EE27 database.[37] Therefore, we recommend using tPBE0 for excitation energies in the general case and especially for excitations similar to those in the QUESTDB data set. Optimizing the parameter over all active spaces and basis sets results in only a slightly shifted value of λ = 0.3, which appears to be offset mostly by the greater number of poor active spaces included in the Jun(8,8) excitation energies (Supporting Information Figure S22); using the more robust tPBE0 error threshold (discussed in the Supporting Information) removes this discrepancy (Supporting Information Figure S23). This suggests that a higher value of λ may be optimal for cases in which wave function error dominates.

Figure 9

Mean absolute errors of different mixing parameters λ in energies computed by htPBE for the 436 Aug(12,12) excitations included with TSA-CASSCF = 1.1 eV. The optimal value of λ = 0.25 (the same as in tPBE0) is marked with a dashed green line.

Conclusions and Future Work

The work presented here is the largest application to date of automated multireference calculations on a broad range of molecules. The generation of 3237 multireference excitation energies has allowed us to gain insights into how to eliminate poorly chosen active spaces and has identified trends in the performance of MC-PDFT and NEVPT2. This work has been possible only through the careful work of Loos, Jacquemin, and co-workers in compiling the QUESTDB data set[20,62,64−66,88] and the recent work of Sarkar et al.,[84] which has enabled us to compare our automatically generated results to hand-selected active-space calculations. We see this initial publication as laying the groundwork for several future applications related to MC-PDFT and high-throughput multireference calculations including Using the generated data to train and test novel functionals for MC-NCFT, representing a continuation of our initial work that used carbene singlet–triplet excitation energies to train machine-learned functionals.[36] Improving the active-space selection scheme. Our finding that error thresholds can be used to determine the fraction of poor wave functions in the calculated excitation energies can be used as a measure to benchmark the effectiveness of different active-space selection schemes. Determining if a selected active space is well chosen without reference to the underlying benchmark values. For specific active spaces and basis sets, there appears to be promise in looking at differences between different methods (Supporting Information Figure S13), but a method that is generalizable across active spaces and basis sets has yet to be found. Additionally, we expect that the wave functions converged in this work will be of interest for the development of different post-CASSCF methods such as multireference adiabatic connection (AC)[89] and algebraic diagrammatic construction (ADC).[90] For this reason, we are making all 3237 converged wave functions freely available for public use. We hope that this data will be useful to the electronic structure community both for comparing to the results published here and for developing and testing their own methods. In summary, we have carried out the largest benchmarking of SA-CASSCF and MC-PDFT to date. This was accomplished by means of an automatic active-space selection scheme and use of a SA-CASSCF error threshold to eliminate poor active-space choices. On a set of 373 aug-cc-pVTZ excitation energies, we find that tPBE0 and NEVPT2 perform with accuracy similar to CC2, while tPBE lags behind. However, the accuracy of NEVPT2 degrades with basis set size even as the quality of the underlying density and on-top pair density appear to remain the same. As expected, we find that tPBE0 is orders of magnitude less expensive than NEVPT2 for larger active spaces, and we recommend its use for the calculation of a broad range of excitation energies, including double excitations.

55 in total

1. Multiconfiguration Pair-Density Functional Theory Is as Accurate as CASPT2 for Electronic Excitation.

Authors: Chad E Hoyer; Soumen Ghosh; Donald G Truhlar; Laura Gagliardi
Journal: J Phys Chem Lett Date: 2016-01-26 Impact factor: 6.475

2. Exact and approximate adiabatic connection formulae for the correlation energy in multireference ground and excited states.

Authors: Katarzyna Pernal
Journal: J Chem Phys Date: 2018-11-28 Impact factor: 3.488

3. Progress and challenges in the calculation of electronic excited states.

Authors: Leticia González; Daniel Escudero; Luis Serrano-Andrés
Journal: Chemphyschem Date: 2011-09-16 Impact factor: 3.102

4. Multireference Approaches for Excited States of Molecules.

Authors: Hans Lischka; Dana Nachtigallová; Adélia J A Aquino; Péter G Szalay; Felix Plasser; Francisco B C Machado; Mario Barbatti
Journal: Chem Rev Date: 2018-07-24 Impact factor: 60.622

5. Multiconfiguration Pair-Density Functional Theory for Iron Porphyrin with CAS, RAS, and DMRG Active Spaces.

Authors: Chen Zhou; Laura Gagliardi; Donald G Truhlar
Journal: J Phys Chem A Date: 2019-04-04 Impact factor: 2.781

6. Automated Selection of Active Orbital Spaces.

Authors: Christopher J Stein; Markus Reiher
Journal: J Chem Theory Comput Date: 2016-03-21 Impact factor: 6.006

7. Psi4 1.1: An Open-Source Electronic Structure Program Emphasizing Automation, Advanced Libraries, and Interoperability.

Authors: Robert M Parrish; Lori A Burns; Daniel G A Smith; Andrew C Simmonett; A Eugene DePrince; Edward G Hohenstein; Uğur Bozkaya; Alexander Yu Sokolov; Roberto Di Remigio; Ryan M Richard; Jérôme F Gonthier; Andrew M James; Harley R McAlexander; Ashutosh Kumar; Masaaki Saitow; Xiao Wang; Benjamin P Pritchard; Prakash Verma; Henry F Schaefer; Konrad Patkowski; Rollin A King; Edward F Valeev; Francesco A Evangelista; Justin M Turney; T Daniel Crawford; C David Sherrill
Journal: J Chem Theory Comput Date: 2017-06-06 Impact factor: 6.006

8. Perspectives on Basis Sets Beautiful: Seasonal Plantings of Diffuse Basis Functions.

Authors: Ewa Papajak; Jingjing Zheng; Xuefei Xu; Hannah R Leverentz; Donald G Truhlar
Journal: J Chem Theory Comput Date: 2011-08-31 Impact factor: 6.006

9. Reference Energies for Double Excitations.

Authors: Pierre-François Loos; Martial Boggio-Pasqua; Anthony Scemama; Michel Caffarel; Denis Jacquemin
Journal: J Chem Theory Comput Date: 2019-02-13 Impact factor: 6.006