Literature DB >> 30653322

Assessment of Initial Guesses for Self-Consistent Field Calculations. Superposition of Atomic Potentials: Simple yet Efficient.

Susi Lehtola1.   

Abstract

Electronic structure calculations, such as in the Hartree-Fock or Kohn-Sham density functional approach, require an initial guess for the molecular orbitals. The quality of the initial guess has a significant impact on the speed of convergence of the self-consistent field (SCF) procedure. Popular choices for the initial guess include the one-electron guess from the core Hamiltonian, the extended Hückel method, and the superposition of atomic densities (SAD). Here, we discuss alternative guesses obtained from the superposition of atomic potentials (SAP), which is easily implementable even in real-space calculations. We also discuss a variant of SAD which produces guess orbitals by purification of the density matrix that could also be used in real-space calculations, as well as a parameter-free variant of the extended Hückel method, which resembles the SAP method and is easy to implement on top of existing SAD infrastructure. The performance of the core Hamiltonian, the SAD, and the SAP guesses as well as the extended Hückel variant is assessed in nonrelativistic calculations on a data set of 259 molecules ranging from the first to the fourth periods by projecting the guess orbitals onto precomputed, converged SCF solutions in single- to triple-ζ basis sets. It is shown that the proposed SAP guess is the best guess on average. The extended Hückel guess offers a good alternative, with less scatter in accuracy.

Entities:  

Year:  2019        PMID: 30653322      PMCID: PMC6727215          DOI: 10.1021/acs.jctc.8b01089

Source DB:  PubMed          Journal:  J Chem Theory Comput        ISSN: 1549-9618            Impact factor:   6.006


Introduction

Quantum chemical calculations are used in several applications to determine single-point energies or molecular properties of systems of interest. The level of theory can range from mean-field Hartree–Fock (HF) or Kohn–Sham (KS) density functional theory[1,2] (DFT) to high-level ab initio methods, such as multiconfigurational (MC) self-consistent field (SCF) theory,[3] coupled-cluster (CC) theory,[4] or the density matrix renormalization group (DMRG) method.[5] In each of these approaches, the energy can be written in terms of a reference set of orbitals. Solving the electronic structure is then tantamount to minimizing the energy with respect to the reference orbitals. High-level ab initio methods are invariably initialized with HF or KS orbitals. As HF produces by definition the best possible single-configurational wave function, it often offers a reasonable starting point for the treatment of correlation effects. Conversely, KS typically produces good orbitals in cases where HF is not a good starting point, such as for transition metal complexes. Because of this, for the present purpose it is sufficient to restrict the discussion to the HF and KS levels of theory. Although the HF and KS theories are mathematically simpler than high level ab initio methods such as MC-SCF or DMRG-SCF, the minimization of the corresponding energy functional is still a difficult nonlinear optimization problem that has been tackled with dozens of robust methods that cannot be satisfactorily reviewed here due to length constraints. Regardless of the method used to optimize the orbitals, an initial guess is necessary. Orbital optimization is usually the simpler, the closer the initial guess is to the converged result. However, despite its pronounced importance, the choice of initial orbitals has attracted surprisingly little interest in the literature.[6−13] (Note that although the optimization problem can also be reformulated only in terms of density matrices in the case of HF and KS theory,[14] this has no implications for the present study, as the two approaches are equivalent.) As the HF/KS potential is density dependent, the simplest sensible orbital guess (ignoring the trivial random orbital guess) is obtained by minimizing the density independent part of the functional. By employing the variational principle, it can be seen that in an orthonormal basis set {|i⟩}, this task is equivalent to finding the lowest eigenpairs of the matrix of the core Hamiltonianwhere T̂ is the (single-determinant) kinetic energy and V̂nuc is the nuclear attraction operator. Correspondingly, the guess from eq is known as the core or one-electron guess. If only one nucleus is present in the system, eq is the hydrogenic Hamiltonian, and the core guess yields hydrogenic orbitals. Now, as the core guess neglects all interactions between the electrons, it does not take into account the significant screening of the nuclear charge by the core electrons, thereby wasting considerable effort to converge the shell structure of atoms. Furthermore, as the initial guess does not reproduce the true energy ordering of atomic orbitals of s, p, d, and f symmetry, the procedure may produce a molecular guess that does not have the symmetry of the ground state solution. This may lead to the SCF algorithm requiring many more iterations to find the ground state, or possibly even to the SCF procedure converging onto a higher lying solution or a saddle-point, as, e.g., we have recently shown for fully numerical calculations on diatomic molecules.[15] Even worse, when applied to large systems composed of a diversity of elements, the core guess will tend to crowd the heaviest atoms with a large surplus of electrons, leaving the other atoms in highly ionized states, because the hydrogenic orbital energies scale as the square of the atomic number, ϵ ∝ −Z2, while the number of occupied orbitals per atom only scales as nocc ∝ Z. This is not all, however. Hydrogenic orbitals are a famously poor choice for single atoms as well, as the orbitals quickly become too diffuse, thereby missing the important structure in the core and valence regions,[16] further highlighting the significant shortcomings of the core guess both near and far from the nucleus. (Note that while hydrogenic orbitals do not form a complete basis set and have to be supplemented with continuum orbitals in order to achieve good results,[17,18] this is not a problem for the core guess, as the eigendecomposition of eq does not change the dimension of the basis.) The generalized Wolfsberg–Helmholz (GWH) approximation[19] is used in ref (20) as an alternative guess to the core Hamiltonian, and it is the default guess for open-shell systems in ref (21). In the GWH guess, the off-diagonal elements of the Hamiltonian are approximated aswhere the parameter K typically has the value K = 1.75, H and H are diagonal elements of the core Hamiltonian, and S is the overlap of basis functions i and j. Although in some cases the GWH modification of the core guess yields better results than the core guess itself, it no longer yields an exact solution for one-electron systems. All of the problems with the core guess and its GWH modification can be avoided by the use of the superposition of atomic densities (SAD) guess,[10,22] which employs converged atomic density matrices at each nucleus in the system. As SAD has the correct shell structure, it typically reproduces orbital energy orderings as well. Indeed, SAD is used as the default guess in most popular quantum chemistry packages, such as Gaussian,[23]Molpro,[24]Orca,[25]Psi4,[21]PySCF,[26] and Q-Chem.[20] An underappreciated feature of the SAD guess is that it allows for pursuing different charge states for a system as well as ionic vs nonionic solutions by manually assigning the charge states of the individual nuclei in the system; unfortunately, this is only possible in some implementations.[10,27] Thus, in most cases the SAD density matrix is charge neutral, meaning it may not match the actual charge state of the system. Furthermore, the density matrix produced by SAD is nonidempotent and does not correspond to a single-determinant wave function, which results in a nonvariational energy. In fact, SAD yields a nonidempotent density matrix even for single atoms, as the guess typically uses either configuration-averaged densities[10,22] or calculations with fractionally occupied orbitals.[28] In addition, the SAD density matrix is spin-restricted, meaning that it may also represent a different spin state than the one targeted in the calculation. The solution to the problems caused by the nonidempotency and the possibly incorrect spin and charge state of the SAD density matrix is simple. In the procedure of ref (10), a spin-restricted Fock matrix build is performed with the initial guess density, which is then diagonalized to yield a set of guess orbitals that are then used to start the wanted type of calculation; this is the most commonly used approach. (Some implementations of SAD use the nonidempotent closed-shell density matrix for the first step of the SCF calculation, reporting nonvariational energies for the first iteration.) Instead of a usual Fock build, guess orbitals can also be constructed from the Harris functional[29,30] that does not require the guess density to be idempotent; this is the approach chosen by Gaussian.[23] Alternatively to the diagonalization of a Fock matrix built from the SAD density, guess orbitals could also be obtained from a SAD guess by diagonalizing its density matrix to obtain natural orbitals. We are not aware of the this guess that we call SADNO having been explicitly considered previously in the literature. SADNO has been available in Erkale(27,31) and Q-Chem(20) for some time, implemented in both programs by the present author. However, as SADNO arises spontaneously in linear-scaling approaches that employ density matrix purification methods,[32−38] it may have been used implicitly in previous work. Another guess that has been widely used in the past is the extended Hückel method.[39] In the extended Hückel method, orbitals are obtained by diagonalizing an effective one-particle Hamiltonian, the diagonal of which consists of approximate valence state ionization potentials (IPs), H = −IP, whereas the off-diagonal is estimated using the GWH rule (eq ). Traditionally, a minimal set of Slater functions is used as the basis set, which is often replaced in Gaussian basis codes with STO-3G.[40] Semiempirical calculations such as the CNDO[41] or INDO[42] models can also be used instead of the extended Hückel guess. However, as the traditional formulation of the extended Hückel method, like the CNDO and INDO models, only operates within a minimal valence basis set, the accuracy of these three methods may thereby be quite limited, which is presumably why they have been largely replaced with the SAD guess. Still, an implementation of the extended Hückel method for real-space calculations has been described recently with good results.[12] However, as ref (12) only considered a SAD guess formed of exponential model atomic densities instead of ab initio atomic density matrices, it is possible that the performance of the SAD guess was underestimated. As the original formulation of the extended Hückel method only describes valence orbitals, core orbitals were added in ref (12) by inserting Slater orbitals with exponents estimated from Slater’s screening rules. However, instead of relying on pretabulated IPs and minimal Slater orbital basis sets as in the original formulation of the extended Hückel method, Norman and Jensen proposed a variant in ref (43) in which the basis functions and the diagonal elements of the Hamiltonian are adopted as pretabulated Gaussian expansions of occupied atomic HF orbitals and their orbital energies, respectively. Note that this approach is in line with Hoffmann’s original proposal,[39] since HF orbital energies are approximations to the ionization potential according to Koopmans’ rule.[44] In the present work, we employ an extension of Norman and Jensen’s approach for the extended Hückel guess. However, instead of relying on pretabulated orbitals and orbital energies as Norman and Jensen do, in the present work the atomic orbitals and orbital energies are calculated directly in the used basis set, employing the same machinery as is used for the SAD guess. The present Hückel approach is easy to implement, completely parameter-free, requires no orbital projections, and is directly applicable to both all-electron and effective core potential calculations. In addition to the aforementioned approaches, calculations may be initialized recursively by reading in a converged density computed in a smaller basis set. Such a procedure has recently been advocated for real-space calculations;[13] the use of confinement potentials has also been found to help SCF convergence in the case of extended real-space basis sets.[45] In some cases it is also possible to decompose the system into either single molecules or chemically meaningful molecular fragments,[28,46,47] and “glue” the orbitals together to form a good guess density for the original calculation. However, the problem of the proper choice of the guess orbitals is not solved by either of these approaches, but rather just moved to the additional calculation(s) in the smaller basis set, or delegated to the isolated molecular fragments. Having reviewed existing methodologies, what else could be done? In the present work, we study the superposition of atomic potentials (SAP) guess, which can be used within both atomic orbital as well as real-space basis set approaches. We will describe how to generate the atomic potentials used in the SAP guess for all of the chemically relevant part of the periodic table, and how to implement the guess efficiently in nonrelativistic or scalar-relativistic molecular calculations. The SAP guess is extensively benchmarked against the core Hamiltonian guess and its GWH modification, two variants of the SAD guess (SAD and SADNO) as well as the extended Hückel guess variant that were outlined above. The nonrelativistic benchmark calculations comprise 259 molecules consisting of first to fourth period elements, employing a variety of basis sets. The organization of the manuscript is the following. Next, in the Theory section, we will present the theory behind the SAP approach. Then, in the Methods section, we describe the benchmark data set, the SCF calculations, and the guess assessment. The Computational Details section outlines how the atomic potentials were calculated and how the SAP guess can be efficiently implemented in molecular calculations using nonrelativistic or scalar-relativistic approaches. Then, in the Results section, we will present extensive benchmarks of the core, GWH, extended Hückel, SAD, SADNO, GSZ, and SAP guesses employing various potentials. Finally, the article concludes in a brief Summary and Discussion section. Atomic units are used throughout the manuscript.

Theory

As molecules are formed from atoms, in which a sometimes overwhelming fraction of electrons—the core states—are but spectators in chemistry, an atom-focused guess indeed makes the most sense, as it is simple, and as it yields the correct zeroth order solution. As such, there are two ways in which the aim of an atomic guess could be realized. First, the target could be to reproduce the atomic orbitals or the atomic electron density itself, as is done in the SAD guess. In calculations employing linear combination of atomic orbitals (LCAO) basis sets, it is trivial to perform atomic calculations at the beginning of a molecular calculation, because the atomic basis sets are small–especially since no polarization functions are necessary in the atomic calculations. However, on-the-fly atomic calculations are intractable for molecular calculations employing real-space methods, as performing the atomic calculation in the three-dimensional molecular basis set is inefficient. Thus, pretabulated atomic orbitals should be used instead, requiring projections of the N atomic orbitals onto the molecular grid, followed by a construction of the density matrix in the real-space basis set. We wish to point out here that although ref (12) argued that the SAD guess only solves half the problem for real-space electronic structure calculations by yielding only a guess density but no guess orbitals, the approach we have outlined here can be used to produce suitable guess orbitals for real-space calculations. Namely, after projection of the numerical atomic orbital SAD guess, molecular orbitals could be obtained with the SADNO scheme by employing, e.g., a pivoted Cholesky decomposition of the SAD density matrix which is feasible even for large systems.[48] Second, an atomic guess could be reproduced from a potential that yields the correct atomic electron density. This kind of a guess would be equivalent to SAD in the case of noninteracting closed-shell atoms, as the ground state of the atomic potential by definition yields the orbitals for the atom. However, the use of a superposition of atomic potentials (SAP) in a system of interacting atoms might produce a better guess than that produced by the SAD method, because the guess density will already be guided by interatomic interactions. This can be illustrated by the following argument. Given electron densities {n()} on atoms {A} that generate potentials {v()}, respectively, the total energy of the total system is given bywhere we have approximated going from eq to eq that the potential is linear in the density, as is the case for the Coulomb and exact exchange potentials. The SAD guess corresponds to the separate minimizations of the terms in the first sum, whereas the SAP guess minimizes the total energy including the interatomic interactions, thus yielding an improved guess density. However, as SAP neglects the nonlinear terms in eq , the SAP guess may be worse than SAD if the SAP density deforms a lot from the atomic limit. Compared to the many versions of the SAD guess or alternatives such as the extended Hückel guess, the SAP guess is exceedingly simple to implement. First, the atomic potentials for the SAP guess can be easily obtained from calculations near the basis set limit, as we have done in the present work; this step does not need to be replicated, as nonrelativistic and scalar-relativistic atomic potentials for 1 ≤ Z ≤ 102 are available in the Supporting Information. Second, the formation of the SAP potential at any point involves but a simple summation over the tabulated atomic potentials, which can be truncated within a finite range. As a similarly local potential is also used in the simplest variant of DFT, i.e., the local spin density approximation (LDA), existing DFT programs can be easily tailored for the formation of the SAP guess. The analogy to DFT further shows that the SAP potential matrix can be formed in linear scaling time in large systems.[49,50] Like the SADNO guess we have proposed above, the SAP guess should be especially powerful for real-space implementations, as in addition to producing a suitably close guess density, it can also be used to produce a starting guess for the orbital eigenvectors, for instance by solving its eigenstates in a small basis of numerical atomic orbitals, or by iterative refinement of the SAP orbitals to finer meshes. The SAP potential can be reformulated by replacing the bare nuclear attraction potential in the Hamiltonianwhere the sum runs over all atoms A in the system, with a screened versionwhere the effective charge in eq can be trivially obtained from the radial potential VSAP(r) produced by an atomic calculation as The representation of eq is extremely appealing, as it removes any possible divergences of the potential at the nucleus: as a consequence of eq , the numerical range of Zeff(r) is limited and the function is smooth, making it easy to represent numerically on a grid. Indeed, the canonical numerical representation for the orbitals in atomic electronic structure programs is[51]which leads to the use of potentials rV(r) for the radial functions P(r). If Zeff(r) only described the classical Coulomb potential, it would be a monotonically decreasing function, going from Zeff(0) = Z at the nucleus to Zeff(∞) = 0. But, in order to be exact for atoms, quantum mechanical effects need to be included in VSAP(r), meaning that Zeff may not be monotonic. However, the limit Zeff(0) = Z still holds even in the presence of exchange and correlation, and the function is overall decreasing. The present SAP approach is not fully novel, as somewhat reminiscent approaches have been suggested earlier in the literature. It was recognized already in the 1930s by Zener and Slater that the core electrons screen the nuclear charge non-negligibly, and better approximate wave functions can be obtained by employing an effective, shell-dependent screened nuclear charge[52−54]where s is the screening constant for the shell n. Next, the use of a radially screened nuclear potential for obtaining approximate atomic orbitals for phenomenological studies was studied by Green and co-workers in the late 1960s and early 1970s.[55,56] In contrast to Zener and Slater’s rules, the approach used by Green et al. only has an implicit shell dependence through the Schrödinger equation: shells with l > 0 experience a smaller nuclear charge, since the l(l + 1)/r2 centripetal term[51] in the kinetic energy prevents them from seeing the less-screened regions close to the nucleus; this is also true for SAP. The Green–Sellin–Zachor (GSZ) expression for the screened nuclear charge is given by[55]where d is a nucleus specific parameter. Values for d for Z ∈ [2,103] have been fitted to nonrelativistic HF orbital energies;[55,56] hydrogen is unaffected by eq . Slightly more refined GSZ-type potentials in which also H is a free parameter have been obtained by minimization of the HF energy of the wave function produced by the guess, reproducing good agreement with numerical HF energies;[57−60] unfortunately, these potentials are not available for all the chemically relevant parts of the periodic table. Although Green and co-workers found the orbitals produced by the GSZ potential to be close to the converged HF solutions, yielding good agreement with experiment both for atoms[55−68] as well as diatomic molecules,[69−71] the GSZ approach appears to be all but forgotten. The GSZ approach is available in the diatomic finite-difference HF program,[72,73] while the use of GSZ-inspired potentials for optimized effective potential calculations has been studied by Theophilou and Glushkov.[74,75] At long-range, the GSZ potential (eq ) has the asymptotic value ZGSZ(∞) = 1. Equivalently, far away, any SAP potential should behave as −1/r. However, approximate exchange-correlation potentials have an incorrect asymptotic form: potentials derived from exchange-correlation energies decay in an exponential fashion,[76] meaning that DFT potentials will yield Zeff(∞) = 0. This behavior is illustrated in Figure , which shows the effective nuclear charges given by eq for the SAP approach from a nonrelativistic spin-restricted calculation with the BP86[77,78] functional (see Computational Details section), and for the GSZ approximation. Effective charges are also shown in Figure for a Coulomb-only screened nucleus, based on the converged BP86 electron density. Comparison of parts a and b of Figure shows that while the classical screening of the Coulomb charge results in a rapid decay of the effective charge, the inclusion of exchange and correlation effects makes the atom much more attractive at chemically relevant distances.
Figure 1

Effective charges for the noble gas atoms, computed using the BP86 functional[77,78] or given by the GSZ approach[55] (eq ). Note the logarithmic scale.

Effective charges for the noble gas atoms, computed using the BP86 functional[77,78] or given by the GSZ approach[55] (eq ). Note the logarithmic scale. More recently, apparently unaware of the work by Green et al., Amat and Cardó-Dorca[8] suggested building guess orbitals from an effective HF potential given by a prefitted,[79] spherically symmetric density representing the atomic shell structure, motivated by the so-called atomic shell approximation (ASA).[80] (A similar approach for building guess orbitals from extended Hückel calculations was suggested by Norman and Jensen.[43]) ASA potentials have been fit for H–Ar[79] and Sc–Kr[8] in the 6-311G basis set. In turn, the DIRAC program[81] has employed a screened nuclear charge expressed as a Gaussian expansion constructed from Zener and Slater’s rules to obtain more accurate guess orbitals for systems containing heavy atoms ever since its first release in 2004. Another approach was suggested by Nazari and Whitten,[82] who optimized effective Gaussian potentials for H, C, N, O, and F using a set of six molecules. The potentials were then benchmarked for a test set of 20 molecules. Nazari and Whitten’s results were promising, showing that the model potentials are transferable between different molecules to some extent. The method has been recently extended to Ti, Fe, and Ni, as well as functional group specific potentials.[83] However, as it is well-known that orbital optimization in HF and DFT can be reformulated as a problem of finding the right optimized effective potential,[84] it is difficult to estimate how well and how easily the results of refs (82) and[83] can be generalized to the rest of the periodic table, or even how the method generalizes beyond the nonstandard[85] “‘double-ζ”’ and “‘triple-ζ”’ basis sets of ‘‘near Hartree–Fock atomic orbitals’’ used in the study. As machine learning will likely soon be able to predict accurate electron densities in a cost-efficient fashion,[86] it will thereby likely also yield excellent initial guesses for SCF calculations. In the mean time, the present work yields suitably accurate starting guesses for the whole of the periodic table. The present work differs significantly from those of Green and co-workers,[55] Amat and Cardó-Dorca,[8] and Nazari and Whitten,[82] as follows: the form of our atomic potential is not restricted to a fixed analytic form as in refs (8, 55, and 82) but is instead determined numerically in a tabulated form; we employ unoptimized atomic potentials constructed using fully numerical calculations on atoms, not effective potentials optimized for a molecular training set as in ref (82) or potentials fitted to reproduce atomic calculations in a specific basis set as in ref (55); unlike refs (8, 55, and 82) we present a set of potentials for the whole chemically relevant periodic table (H–No, i.e., 1 ≤ Z ≤ 102), in both nonrelativistic and scalar relativistic variants, enabling practical calculations to be performed on any system; unlike refs (8 and 82), the atomic potentials are extensively benchmarked with calculations on entirely unbiased systems, as even the basis sets used to generate the atomic potentials and those used in the molecular applications are fundamentally different.

Methods

Molecular Data Set

In the present work, we study the 183 nonmultireference molecules from the high-level W4–17 test set of first- and second-row molecules,[87] which we furthermore augment with a data set composed of 50 transition metal complexes from refs (10 and 88) (referred to as TMC), as well as 28 complexes containing third or fourth period elements from the MOR41 database of single-reference systems.[89] Although the entries ED15, PR01, and PR02 in MOR41 are also included in TMC as Ni(C3H5)2, Cr(CO)6, and Fe(CO)5, respectively, the existence of these duplicates should not significantly affect our results as they represent only a small fraction of the database, and the geometries for the molecules are also slightly different. In contrast, Cr(C6H6)(CO)3,CrO2F2, Fe(CO)5, VF5, and VOF3 were excluded from ref (88), as these molecules also exist in ref (10). Moreover, as only two molecules in the collection are charged, CrO42– and Co(NH3)63+ (both from ref (10)), they are omitted from the analysis due to insufficient representation. The data set of the present study thus contains 259 charge-neutral molecules in total, 222 of which are singlets, and the remaining 37 are nonsinglets.

SCF Calculations

Nonrelativistic HF and revTPSSh[90−92] wave functions were calculated for all the molecules using a development version of Q-Chem,[20] employing wave function stability analysis and a (99,590) integration grid for the exchange-correlation functional. A 10–5 basis set linear dependence threshold, a 10–12 integrals screening threshold, and a 10–6 SCF convergence criterion was used. To investigate the impact of the basis set on the performance of the guesses, calculations were performed in the minimal STO-3G basis,[40] as well as the recently published single- to triple-ζ-level pcseg-0, pcseg-1, and aug-pcseg-2 basis sets.[93,94] The motivation for this range of basis sets is that the aug-pc-2 basis set[95,96] that is the parent of aug-pcseg-2 has been recently found to be a good choice for reasonably accurate calculations at the DFT level of theory.[97]

Guess Assessment

The quality of a starting guess is determined by how close the orbitals it yields are to the true ground state solution. Thus, the various initial guesses are assessed in the present work by calculating the projection of the guess orbitals onto the converged SCF wave function aswhere i and j are molecular orbitals, the sums run over the Noccσ occupied orbitals of spin σ, guessσ and SCFσ are the guess and SCF density matrices, and is the overlap matrix. While the examination of SCF convergence characteristics has been used e.g. in the study by van Lenthe et al.,[10] it is nontrivial to discern between the effects of the initial guess and that of the SCF algorithm in such an approach. In contrast, the projection Qσ yields an unambiguous appraisal of initial guesses, 0 ≤ Qσ ≤ Noccσ; it is also continuous rather than discrete like the number of SCF iterations, yielding a more fine-grained ranking. Furthermore, as only a single SCF calculation is necessary on each system in a Qσ based approach, it is possible to explore many kinds of initial guesses cost-efficiently with the present methodology. Because the SAD density matrix is nonidempotent, Qσ is not a fully reliable estimator for the accuracy of the SAD guess. As the largest natural orbital occupation numbers in SAD may be many times larger than one, they may artificially inflate the value of Qσ while representing an unphysical density. Moreover, fractional occupation of the valence orbitals is a well-known trick to aid SCF convergence;[98,99] in this respect the nonidempotency of SAD can actually be helpful, although fractional occupations can be formed for other guesses as well. Furthermore, as the SAD density matrix is typically chosen spin-restricted and charge neutral, the projections Qσ are not reliable estimates of the resulting SCF convergence for nonsinglet molecules, and/or for charged molecules that were excluded from the present study due to the scarcity of reference geometries. However, Qσis a reliable estimator for the SADNO guess. As it extracts natural orbitals from the SAD density matrix, the SADNO guess is able to form idempotent density matrices, as well as to adapt to charged as well as spin-polarized systems. As will be seen below, the SADNO guess consistently yields better Qσ values than SAD. To better be able to compare the performance of the guesses, the projections in both spin channels are condensed into a single criterion. Because it is clear that the number of electrons missed by a guess will scale proportionally to system size, the criterion should be intensive rather than extensive, lest the largest systems dominate the analysis entirely. Thus, we choose the fraction f of electron density covered, (worst) 0 ≤ f ≤ 1 (best),as the guess ranking criterion. The various initial guesses are formed and assessed with the freely available Erkale program[27,31] by reading in the basis sets and SCF wave functions from the Q-Chem output. The following guesses are studied: The core Hamiltonian guess, denoted as CORE. The GWH guess, i.e. the GWH modification of the core Hamiltonian. The SAD guess. In the present work, the atomic densities are formed in Erkale with HF employing spin-averaged, fractional orbital occupations of the valence shells. The extended Hückel guess, denoted as HUCKEL, with the atomic orbitals and eigenvalues taken from calculations analogous to the ones in the SAD guess. The GSZ potential.[55] The SADNO guess, where the orbitals are obtained by diagonalizing the SAD density matrix. The SAP guess, with the LDA-X, CAP-X and CHA-X potentials, described below in the Generation of Atomic Potentials section. As was mentioned in the Introduction, guesses 1–3 are commonly used approaches, whereas guess 4 employs a parameter-free, easily implementable variant of the extended Hückel guess that has not been previously considered to our knowledge, guess 5 has not been previously considered beyond diatomic molecules, and guesses 6 and 7 have not been previously considered at all in the literature.

Computational Details

Generation of Atomic Potentials

The atomic potentials are generated by KS-DFT calculations with an all-electron atomic program employing spherical symmetry (i.e., fractional occupations) that is available as a part of the Gpaw program package[100,101] and which uses the Libxc library[102] to evaluate the exchange-correlation functionals. The atomic program produces self-consistent solutions of the radial Kohn–Sham equations. The atomic calculation is initialized with a solution in a large Gaussian basis set, after which the solution is further refined by a finite difference calculation on a radial grid. With a simple modification, the atomic program was made to save the converged radial potentials to disk. Default settings for the convergence criteria (density converged to 10–6) were employed, while a radial grid two times larger than the default (4000 instead of 2000 points) was used, with the practical infinity set at the default value of 50 bohr. Initial experimentation with various functionals in Libxc revealed that the best results were obtained from exchange-only calculations, and that the best exchange functionals were the local spin-density approximation (LDA-X),[103,104] the ‘‘correct asymptotic potential’’ (CAP-X),[105] as well as the Chachiyo[106] (CHA-X) generalized gradient exchange functionals. Self-consistent atomic potentials were generated for these three functionals at the nonrelativistic and scalar-relativistic levels of theory. The atomic calculations were spin-unrestricted, and the SAP potential was generated by averaging over the two spin channels of the converged potential. Next, as visual examination of the potentials generated by GPAW revealed significant numerical noise far away from the nucleus, the potentials were smoothed by forcing them to decay exponentially far away from the nucleus aswith the onset r0 = 8a0 and the decay parameter γ = 4a0. The GSZ potential[55] was implemented by pretabulating its values in the same format as the potentials obtained from Gpaw, thus allowing maximal code reuse.

Implementation of SAP

The SAP potential matrixwhere VSAP() is the effective potential at arising from atom A and the sum runs over all the atoms in the system, is calculated in Erkale using Becke’s polyatomic integration scheme[107]which kills off any possible nuclear cusps in the potential. In analogy to the SCF calculations, a (99,590) grid, i.e. 99 radial and 590 angular points, is used for the SAP integrals (eq ). The SAP potential in eq is calculated using eq , in which linear interpolation is used for the pretabulated effective charges ZSAP(r).

Results

The statistics on the accuracy of the guesses described in the Guess Assessment section, assessed on HF/aug-pcseg-2 and HF/pcseg-0 wave functions, is shown in Table ; the full set of results is available in the Supporting Information. The analysis in Table has been performed separately for the 222 neutral singlet molecules, and for the 39 neutral nonsinglet molecules of the present study. Table also shows data for projections of wave functions calculated at a different level of theory to study the accuracy of the commonly used approach of reading in converged densities from another calculation.
Table 1

Statistics for the 222 Neutral Singlet Molecules and the 37 Neutral Non-Singlet Molecules at the HF/aug-pcseg-2 and HF/pcseg-0 Levels of Theory

aug-pcseg-2singlets
nonsinglets
 singlets
nonsinglets
pcseg-0
guessmin fmean fmin fmean f min fmean fmin fmean fguess
GWH0.0000.4430.2850.450 0.4050.5870.4580.558GWH
CORE0.4350.5850.4170.610 0.5230.6800.5570.662CORE
SAD0.7000.9010.7300.864 0.7110.9080.7390.871SAD
GSZ0.7260.9260.8020.939 0.7520.9350.8090.947GSZ
SADNO0.7010.9640.7150.946 0.7580.9730.8610.959SADNO
HUCKEL0.9100.9700.8470.955 0.9500.9790.8680.974HUCKEL
LDA-X0.8930.9740.8490.969 0.8980.9790.9010.964LDA-X
CAP-X0.8960.9740.8980.973 0.9010.9790.8510.975CAP-X
CHA-X0.8920.9760.9200.974 0.8970.9800.8430.976CHA-X
           
HF/STO-3G0.9200.9840.8930.977 0.9170.9850.8990.983HF/STO-3G
HF/pcseg-00.9320.9960.9290.991 0.9740.9980.9760.997revTPSSh/pcseg-0
revTPSSh/aug-pcseg-20.9770.9990.9310.992      
HF/pcseg-10.9630.9990.9940.999      
As is shown by the large projection of the HF/aug-pcseg-2 and revTPSSh/aug-pcseg-2 wave functions, the level of theory used to study the accuracy of the initial guesses in a Qσ based approach does not matter. After all, it is well-known that HF and KS orbitals are typically very similar (other than in pathological multireference cases such as Cr2); analogous results can be found in the literature.[108,109] We have thus shown that the level of theory used is not important for the present approach: HF and KS references yield similar results. Despite the resemblance of the orbitals at convergence, differences in the SCF convergence characteristics of HF theory and DFT between different initial guesses can likely be found. This is due to the differing nature of the HF and KS potentials. The potential is orbital-dependent in HF, whereas in DFT all orbitals experience the same potential; however, the Taylor expansion of the KS energy is more complicated than that of HF which is quadratic in the density. Thus, the differences in the convergence speed of SCF calculations started from two guesses of a similar accuracy will be dominated by the SCF acceleration technique, of which there are many as stated in the beginning of the Introduction, and which are known to behave differently even when started from the same guess. The results in Table support the well-known procedure of reading in an SCF solution from a smaller basis set: unsurprisingly, a guess consisting of a converged SCF solution yields better results than any of the ad hoc guesses considered in the present work. On the basis of its excellent coverage, we can recommend the single-ζ pcseg-0 basis as a guess basis for calculations in larger basis sets. However, as discussed in the Introduction, an initial guess is still necessary in the small basis set. Now, we continue by studying the performance of the ad hoc guesses in the aug-pcseg-2 and pcseg-0 basis sets. The high quality of the SAP guess is demonstrated by the high f values reproduced by all the three atomic potentials chosen for the present work. The guess rankings in decreasing accuracy are for aug-pcseg-2 as follows. Singlets: mean f, CHA-X, CAP-X, LDA-X, HUCKEL, SADNO, GSZ, SAD, CORE, and GWH; min f, HUCKEL, CAP-X, LDA-X, CHA-X, GSZ, SADNO, SAD, CORE, and GWH. Nonsinglets: mean f, CAP-X, CHA-X, LDA-X, HUCKEL, SADNO, GSZ, SAD, CORE, and GWH; min f, CAP-X, CHA-X, LDA-X, HUCKEL, GSZ, (SAD,) SADNO, CORE, and GWH. For pcseg-0, they are as follows. Singlets: mean f, CHA-X, CAP-X, HUCKEL, LDA-X, SADNO, GSZ, SAD, CORE, GWH; min f, HUCKEL, CAP-X, LDA-X, CHA-X, SADNO, GSZ, SAD, CORE, GWH. Nonsinglets: mean f, CHA-X, LDA-X, CAP-X, HUCKEL, SADNO, GSZ, (SAD,) CORE, GWH; min f, HUCKEL, CAP-X, SADNO, LDA-X, CHA-X, GSZ, (SAD,) CORE, GWH. This shows that on average, the SAP guess yields the best starting point for calculations. The guess performances in the pcseg-0 and aug-pcseg-2 basis sets are also similar, underlining the quality of the proposed approaches. The extended Hückel variant described in the present work is also a good performer, especially in the small pcseg-0 basis. The extended Hückel guess can be seen as an approximation to SAP: in the version described in the present manuscript, the atomic Hückel Hamiltonian coincides with the atomic Fock operator that is diagonal in the Hückel basis; note that the single-center atomic orbitals are orthonormal, S = δ. However, the Hückel guess approximates the interatomic elements of the Hamiltonian with the generalized Wolfsberg–Helmholz rule (eq ). Furthermore, as the Hückel guess is limited to a minimal basis (even though the basis functions themselves can constitute the exact solution to the free atom), it yields a spectrum consisting mostly of zeros for the virtual orbitals. In contrast, the SAP guess yields a full spectrum for the virtual space also in a large basis set. However, it is likely exactly the limitation to the minimal basis that allows the Hückel guess to work so well: in contrast to the other ad hoc guesses considered here, the Hückel guess only mixes low-lying states for the individual atoms, which means it can never stray very far from a physical solution. The differences between the top performers can be studied in more detail with the scatter plots shown in Figure . The GSZ potential, which has but a single parameter per atom, exhibits a rapidly decaying accuracy with increasing system size. The SAD guess, represented here through the SADNO guess, generally offers a good starting point for calculations, with some notable outliers in the case of small systems. The extended Hückel variant is an improvement over SADNO, although the scatter plots for the two methods share striking similarities; after all, both guesses employ the same atomic calculations.
Figure 2

Guess accuracy scatter plots for the HF/aug-pcseg-2 singlet wave functions. Legend: nonmultireference part of W4–17 (circles), transition metal complexes from refs (10 and 88) (diamonds), and MOR41 (squares). The raw data are available in the Supporting Information.

Guess accuracy scatter plots for the HF/aug-pcseg-2 singlet wave functions. Legend: nonmultireference part of W4–17 (circles), transition metal complexes from refs (10 and 88) (diamonds), and MOR41 (squares). The raw data are available in the Supporting Information. The three SAP methods are also strikingly similar to each other. Although the SAP results show considerably more scatter than the SADNO or the extended Hückel guesses, SAP yields a more accurate initial guess—f values closer to 1—for a large number of molecules.

Summary and Discussion

We have discussed an alternative method for obtaining an initial guess for self-consistent field calculations that is based on the superposition of atomic potentials (SAP), which is equivalent to the commonly used superposition of atomic densities (SAD) approach in the case of systems of noninteracting closed-shell atoms, for which both guesses are exact. In the case of either open-shell atoms or molecules, neither the SAP nor the SAD guess is exact. However, in contrast to SAD, the guess formed by SAP also includes chemical interactions between the atoms in a molecule in a linearized approximation. The SAP approach can straightforwardly be implemented in programs employing a linear combination of atomic orbitals, or a real-space basis set. Once the SAP approach has been implemented, the choice of the employed atomic potentials can be left to the user, greatly facilitating future studies on better atomic potentials. Improvements on the accuracy of the SAP approach could be obtained, e.g., by manually specifying the charge states of the individual atoms in the system, as in some implementations of SAD. We have implemented the SAP approach in the freely available Erkale program,[27,31] which we have used for guess assessment by projecting the guess orbitals onto the ground state wave functions of a data set consisting of 259 molecules comprised of first to fourth period elements. At variance to the assessment of initial guesses by comparison of the resulting SCF convergence, the results of which are highly dependent on the used SCF algorithm and the assessment is further complicated by the possibility of convergence to saddle point solutions or different minima, the presently used projection approach yields an unambiguous accuracy score for any guess, and also has a low computational cost that allows benchmarking a wide variety of guesses. In addition to SAP, we have discussed, implemented and assessed a variant of SAD we call SADNO that produces guess orbitals by purification of the nonidempotent SAD guess density matrix, which does not appear to have been previously considered in the literature, as well as pointed out and demonstrated that an extended Hückel guess can be easily implemented on top of a pre-existing SAD solver, based on the procedure of ref (43). The SAP guess was shown to yield excellent guess wave functions in combination with the Chachiyo generalized gradient exchange functional;[106] almost as good results could also be obtained with the CAP[105] and LDA[103,104] exchange functionals. On average, the SAP guess was best. However, there was more scatter in the accuracy of SAP than in that of SADNO or the Hückel guess. The accuracy of the SAP guess might be improved by forming the atomic potentials at a better level of theory; for instance, effective potential calculations[110,111] could be pursued in future work. The good results of the parameter-free extended Hückel guess variant were explained through its connection to the SAP approach, as well as through its minimal-basis structure that prevents it from yielding very good or very bad performance. While its overall accuracy in the present data set was not as good as that of SAP, its accuracy is remarkably stable. Because it is an improvement over SAD and because it is extremely easy to implement on top of pre-existing SAD code, we can recommend the extended Hückel variant described in the present work as a default choice. While the present work considered only all-electron calculations at the nonrelativistic level of theory, the approaches discussed in the present work are readily applicable to scalar-relativistic calculations, and they also can be straightforwardly extended to calculations employing effective core potentials. In the case of SAP, this would likely entail the removal of the contributions from the core electrons to the SAP potential, as the core electrons are already included in the effective core potential. The original motivation and driver of the present work was to develop accurate yet easily implementable guesses for real-space approaches.[15,51] The SAP guess offers such an approach: suitable guess orbitals can be easily obtained from a superposition of atomic potentials, which are but simple scalar radial functions. Alternatively, the SAD guess based on projection of pretabulated numerical orbitals could be used to produce a guess density. If molecular orbital coefficients are also needed, then the SADNO approach could be used to obtain them from the SAD density. Finally, if pretabulated atomic orbitals and orbital energies are already available for a SAD approach, the extended Hückel variant studied in the present work following Norman and Jensen’s suggestion[43] can also be easily implemented, again yielding molecular orbital coefficients and a likely improved accuracy over SAD.
  4 in total

1.  Psi4 1.4: Open-source software for high-throughput quantum chemistry.

Authors:  Daniel G A Smith; Lori A Burns; Andrew C Simmonett; Robert M Parrish; Matthew C Schieber; Raimondas Galvelis; Peter Kraus; Holger Kruse; Roberto Di Remigio; Asem Alenaizan; Andrew M James; Susi Lehtola; Jonathon P Misiewicz; Maximilian Scheurer; Robert A Shaw; Jeffrey B Schriber; Yi Xie; Zachary L Glick; Dominic A Sirianni; Joseph Senan O'Brien; Jonathan M Waldrop; Ashutosh Kumar; Edward G Hohenstein; Benjamin P Pritchard; Bernard R Brooks; Henry F Schaefer; Alexander Yu Sokolov; Konrad Patkowski; A Eugene DePrince; Uğur Bozkaya; Rollin A King; Francesco A Evangelista; Justin M Turney; T Daniel Crawford; C David Sherrill
Journal:  J Chem Phys       Date:  2020-05-14       Impact factor: 3.488

2.  SPAHM: the spectrum of approximated Hamiltonian matrices representations.

Authors:  Alberto Fabrizio; Ksenia R Briling; Clemence Corminboeuf
Journal:  Digit Discov       Date:  2022-04-04

3.  Simple and Accurate Exchange Energy for Density Functional Theory.

Authors:  Teepanis Chachiyo; Hathaithip Chachiyo
Journal:  Molecules       Date:  2020-07-31       Impact factor: 4.411

4.  An Overview of Self-Consistent Field Calculations Within Finite Basis Sets.

Authors:  Susi Lehtola; Frank Blockhuys; Christian Van Alsenoy
Journal:  Molecules       Date:  2020-03-08       Impact factor: 4.411

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.