Literature DB >> 35735354

How Reliable Are Modern Density Functional Approximations to Simulate Vibrational Spectroscopies?

Sebastian P Sitkiewicz1,2, Robert Zaleśny3, Eloy Ramos-Cordoba1,2, Josep M Luis4, Eduard Matito1,5.   

Abstract

We show that properties of molecules with low-frequency modes calculated with density functional approximations (DFAs) suffer from spurious oscillations along the nuclear displacement coordinate due to numerical integration errors. Occasionally, the problem can be alleviated using extensive integration grids that compromise the favorable cost-accuracy ratio of DFAs. Since spurious oscillations are difficult to predict or identify, DFAs are exposed to severe performance errors in IR and Raman intensities and frequencies or vibrational contributions to any molecular property. Using Fourier spectral analysis and digital signal processing techniques, we identify and quantify the error due to these oscillations for 45 widely used DFAs. LC-BLYP and BH&H are revealed as the only functionals showing robustness against the spurious oscillations of various energy, dipole moment, and polarizability derivatives with respect to a nuclear displacement coordinate. Given the ubiquitous nature of molecules with low-frequency modes, we warrant caution in using modern DFAs to simulate vibrational spectroscopies.

Entities:  

Mesh:

Year:  2022        PMID: 35735354      PMCID: PMC9251762          DOI: 10.1021/acs.jpclett.2c01278

Source DB:  PubMed          Journal:  J Phys Chem Lett        ISSN: 1948-7185            Impact factor:   6.888


Given its effectiveness, low cost, and conceptual simplicity, Kohn–Sham density functional theory has become the true workhorse of modern computational chemistry.[1−3] Since the exact density functional is unknown, a race for constructing the most accurate and versatile density functional approximation (DFA) is still on. The three most significant challenges of modern DFAs,[4,5] the delocalization error,[6] dispersion forces,[7] and strong correlation,[2] occupy the agenda of DFA developers, who have lately put forward various approximations that, to some extent, succeeded in overcoming these limitations.[8,9] The increase of accuracy often comes hand-in-hand with the mathematical complexity of modern DFAs. Unlike most wave function methods, the computation of electronic energies from DFAs requires the numerical integration of the exchange and correlation functionals, controlled by the size and type of the numerical integration grid. Johnson and co-workers showed that spurious oscillations appear in the potential energy surface (PES) of dissociating dispersion-bounded systems calculated with meta generalized gradient approximations (meta-GGAs) using small integration grids.[10,11] The origin of these oscillations was attributed to an insufficient grid sampling of the midpoints in between the dispersion-bonded atoms, where the kinetic energy density has greatly enhanced values.[11] Other authors have identified similar spurious oscillations on other molecules stabilized by London dispersion forces.[5,10−20] However, thus far, the problem seemed to be limited to the computation of PES of dispersion complexes with meta-GGAs and solved by the use of reasonably large grids [such as (250,590), see below]. In the present work, we show that the spurious oscillations that arise from the sizable numerical integration errors of DFAs are neither a problem limited to meta-GGAs and the PES of dispersion compounds nor can be solved using the largest predefined grids available in most computational packages. An earlier study by some of the present authors reported enormous errors in anharmonic vibrational corrections to electric properties using some DFAs.[21] A subsequent in-depth analysis of these errors motivated the present Letter. As it will be demonstrated in this work, these spurious oscillations occur in molecules with low-frequency vibrational modes, affect various molecular properties, and vary with the grid size, making most DFA predictions strongly grid-size dependent. The errors derived from this grid-dependency problem are sometimes huge and can be easily spotted,[21] whereas in many other cases are disguised as performance errors attributed to the approximate nature of the functional. In fact, the oscillations significantly affect energy derivatives with respect to nuclear displacements and can occur in all kinds of DFAs. Avoiding them is not always possible and requires grid sizes that compromise the cost-efficiency of DFAs. As a result, the calculation of fundamental properties related to the vibrational motion of molecular systems poses a greater challenge to DFAs than it was initially anticipated. First of all, let us show that the problem affects various energy derivatives and it cannot always be solved increasing the numerical grid size. Figure illustrates the spurious oscillations of various energy derivatives with respect to the interatomic distance in Ar2 calculated with three popular DFAs. Although these oscillations were not evident from the potential energy curve of Ar2,[11,19,20] they compromise the calculation of accurate force constants and higher-order derivatives, for which the oscillations intensify, leading to great discrepancies in response properties that involve derivatives with respect to the nuclear coordinates. In some cases, at the expense of a large computational cost, the problem can be partially alleviated if grids much larger than those predefined in standard packages are employed. However, for some DFAs (see M06-2X and ωB97X), the oscillations persist after increasing the grid size, and obtaining an accurate response property is out of reach even with extremely large integration grids. Since one cannot either anticipate whether there is a grid size sufficient to avoid the spurious oscillations or which DFAs will present this problem, one is exposed to serious performance errors in the calculation of various response properties such as IR and Raman intensities and frequencies and zero-point vibrational averages or pure vibrational contributions to any molecular property. The specific goals of this work are (i) bring attention to this shortcoming of many modern DFAs, (ii) put forward a method to easily identify and quantify the error due to these spurious oscillations, (iii) do an exploration of the grid-size dependency of the most popular DFAs, and (iv) identify which functionals can be safely used to calculate various response properties.
Figure 1

Grid-related spurious oscillations affecting the derivatives of total energy with respect to interatomic separation R of Ar2, dE/dR (m = 2–4). Top-row panels show the results of DFAs combined with the (99,590) grid. On the bottom row panels, curves of d4E/dR4 obtained with various integration grids are shown. In all plots, the black solid curve represents the CCSD(T) results and the vertical lines mark the equilibrium distance. All units are a.u.

Grid-related spurious oscillations affecting the derivatives of total energy with respect to interatomic separation R of Ar2, dE/dR (m = 2–4). Top-row panels show the results of DFAs combined with the (99,590) grid. On the bottom row panels, curves of d4E/dR4 obtained with various integration grids are shown. In all plots, the black solid curve represents the CCSD(T) results and the vertical lines mark the equilibrium distance. All units are a.u. We will analyze the derivatives of the total energy, dE/dξ (m = 1–4), the dipole moment, dμ/dξ (m = 1–3), and the static polarizability, dα/dξ (m = 1–2), with respect to the nuclear displacement coordinate (ξ) of dispersion-bonded complexes (Ar2 and He2), hydrogen-bonded (HCN···HF, HCN···HCl, CO···HF, and N2···HF), and halogen-bonded (HCN···BrH and HCN···BrF) complexes. We investigated 45 DFAs including generalized gradient approximations (GGAs), meta-GGAs, global hybrids, range-separated DFAs, and double hybrids (see the figures below for the full list). In all cases, the aug-cc-pVTZ basis set was used. We tested various integration grids: (99,590) (i.e., unpruned SG3[22] and UltraFine[23]), (250,974) (i.e., unpruned SuperFineGrid[23]), (500,974), and (750,974), where (Nr, NΩ) indicates Nr radial shells and NΩ angular points per atom. The unpruned (250,974) grid is already larger than most of the largest predefined grids in standard computational packages (see Table S3). The cost of the calculations multiply by four going from (99,590) to (250,974) and triple from (250,974) to (750,974). Hence, the calculation with the latter grid increases by more than 1 order of magnitude the cost of most calculations performed nowadays. For diatomic molecules, we scanned the properties along the interatomic bond distance, whereas in polyatomic systems, we represent the nuclear displacements (ξ) with the first-order field-induced coordinate (FIC), χ1,. The first-order FIC is a linear combination of normal modes, which effectively represents an overall response of the equilibrium geometry to an external electric field (see eq S1 for further details).[24] We find two difficulties to establish a spurious-oscillation-free reference that we can use to quantify the distortion of the property along the nuclear displacement. First, we cannot directly compare the DFA and another computational method because the differences could be either due to the spurious oscillations or to the methods’ performance. Hence, the reference must be based on the same DFA. However, we have also established (see Figure ) that we cannot blindly trust a very large grid. To circumvent these problems, we will benefit from the fact that the spurious oscillations depicted in Figure show regular patterns. We have designed an algorithm (sketched in Figure , see section 1.2 of the Supporting Information for further details) that employs Fourier spectral analysis and digital signal processing to identify and quantify the oscillations of a property measured with a given DFA, PDFA. The Fourier spectra of PDFA and Pref, an integration-error-free reference, usually obtained from ab initio calculations such as HF or CCSD, are compared to identify spurious oscillations. The latter are removed from PDFA, giving rise to PfiltDFA, which can now be compared against PDFA. In order to quantify how the oscillations affect the property, we calculate the root-mean-square (RMS) of a set of points (close to the equilibrium geometry) along the property profile. The ratio between the oscillation and the property RMSs gives the relative root-mean-square error (RRMSE),Large RRMSE values indicate significant spurious oscillations and warn about the presence of potentially large errors in various molecular properties. RRMSE is a convenient measure of the spurious oscillations because it only requires the computation of the derivative profile at the DFA level and a cost-efficient ab initio reference such as HF. dE/dξ, dμ/dξ, and dα/dξ enter the expression of various vibrational spectroscopic quantities; however, it is difficult to anticipate the effect of the spurious oscillations on these molecular properties because they depend on various derivatives in quite different ways. In Table S5, we review the connection between these derivatives and various molecular properties. Since we are quantifying the effect of the spurious oscillations on the derivative profile with RRMSE, we can also use it to predict how large the error in the properties might be. To this end, in Table S16, we collect RRMSE values and relative errors for various vibrational spectroscopic properties of N2·HF. The numbers show that the relative errors in the property might be as large as twice the RRMSE value. For instance, 17% RRMSE on dα/dξ gives a relative error on the harmonic Raman intensity of 38%; 64% RRMSE on d2α/dξ2 gives a relative error on the anharmonic correction to the Raman intensity of 163%. Certain properties depend on more than one derivative, and the RRMSE of all the derivatives need to be considered. For instance, RRMSEs of 16% and 665% for d3E/dξ3 and d4E/dξ4, respectively, must be considered to explain the 234% error on the calculation of IR anharmonic corrections to the intermolecular stretching mode of N2·HF using ωB97X with the (250,974) grid.
Figure 2

Flowchart of the algorithm to quantify the spurious oscillations. PDFA is the property profile calculated with the target DFA. Pref is a integration-error-free property profile, usually obtained from an ab initio wave function method. PfiltDFA is the property profile calculated with the target DFA after filtering the spurious oscillations.

Flowchart of the algorithm to quantify the spurious oscillations. PDFA is the property profile calculated with the target DFA. Pref is a integration-error-free property profile, usually obtained from an ab initio wave function method. PfiltDFA is the property profile calculated with the target DFA after filtering the spurious oscillations. In Figure , for a representative set of 11 DFAs, we collect the maximum RRMSE values within the set of studied systems (see Figures S9–S17 for the full list of DFAs). RRMSE can reach thousands of percents and, as expected, the largest errors correspond to high-order derivatives. The (99,590) grid, which is equal or larger than the largest predefined grid in QCHEM 5.3[22] or ORCA 5.0.1,[25] gives maximum RRMSE values that exceed 10%, 40%, and 500%, respectively, for d2E/dξ2, d3E/dξ3, and d4E/dξ4. The latter is a clear indication that (99,590) is definitely an insufficient grid to compute vibrational spectroscopic data of molecules with at least a low-frequency vibrational mode. For the (250,974) grid, the maximum values of RRMSE for d4E/dξ4 are greater than 160% for all DFAs, except for BH&H (18%) and LC-BLYP (11%). The largest grid, (750,974), reduces the maximum RRMSE significantly for the first and second energy derivatives but some popular DFAs, such as ωB97X, SCAN, or M06-2X, give extremely large errors for the third and fourth derivatives; in fact, even with the largest grid, only a handful of functionals gives a maximum RRMSE below 50% for d4E/dξ4.
Figure 3

Maximum values of RRMSE for the derivatives of total energy within the set of all studied molecular complexes.

Maximum values of RRMSE for the derivatives of total energy within the set of all studied molecular complexes. We set RRMSE below 10% as the criterion to define a safe calculation (thorough numerical tests indicate that property errors due to spurious oscillations are usually below 20% for such RRMSE values). Modern DFAs with the (99,590) integration grid, which is the default choice in many computational packages,[22,23] cannot be safely used to study the IR or Raman spectroscopy of molecules involving floppy motions (see Figure S18). In Figure , we collect the maximum order of the derivatives which can be safely calculated using a wide range of DFAs employing the (250,974) grid. Derivatives up to the fourth, third, and second order along the nuclear displacement were considered for the energy and the main symmetry axis component of the dipole and first polarizability, respectively. The molecules tested are divided into three groups: hydrogen-bonded (A), halogen-bonded (B), and dispersion-bonded (C) molecules. Hydrogen-bonded complexes show the smallest spurious oscillations, although they are still considerably large. Halogen-bonded molecules exhibit much more significant oscillations, whereas dispersion-bonded systems are the species most affected by this problem. We have classified the DFAs into five rungs according to their overall stability across various molecular properties (see section S1.3 of the Supporting Information for further details about this classification). Within the same group of molecules, the magnitude of the spurious oscillations increases in the following order: dE/dξ, dμ/dξ, and dα/dξ. One should take into account that molecules in group C are centrosymmetric; hence, dμ/dξ were not tested for the most challenging molecules of the set.
Figure 4

Maximum order of derivatives that can be safely obtained (i.e., RRMSE ≤ 10%) with the (250,974) grid. Groups A, B, and C, correspond to hydrogen-, halogen-, and dispersion-bonded complexes, respectively. Data not available for ωB97X-V, ωB97M-V, and B97M-V, for which static polarizabilities are not implemented. Figures S18 and S19 collect the same information for grids (99,590) and (750,974).

Maximum order of derivatives that can be safely obtained (i.e., RRMSE ≤ 10%) with the (250,974) grid. Groups A, B, and C, correspond to hydrogen-, halogen-, and dispersion-bonded complexes, respectively. Data not available for ωB97X-V, ωB97M-V, and B97M-V, for which static polarizabilities are not implemented. Figures S18 and S19 collect the same information for grids (99,590) and (750,974). Rung I includes grid-robust DFAs, namely, BH&H and LC-BLYP. These DFAs provide safe results for almost all derivatives using the (250,974) grid. BH&H is the only functional in our set which does not employ a GGA or meta-GGA exchange. In practice, BH&H is seldom employed, because it is not accurate for most chemical purposes. (This DFA is not even considered in two of the most extensive recent benchmark studies.[5,26]) Conversely, LC-BLYP, has been extensively used to study molecular properties,[21,27−29] and, hence, it is considered the safest choice among the DFAs covered in this study. DFAs in Rung II, PBE, CAM-B3LYP, and BH&HLYP, suffer from some grid dependency. They can provide stable anharmonic vibrational corrections for the hydrogen- and halogen-bonded complexes and harmonic contributions for the dispersion-bonded systems. On the contrary, they should not be employed to calculate anharmonic contributions of dispersion-bonded systems. Rung III includes DFAs suffering from grid-dependent oscillations for all the studied systems. Widely used DFAs, such as B3LYP, PBE0, MN15, and ωB97X-V, belong to this group. The DFAs in this rung cannot provide stable force constants or the Raman intensities (see dα/dξ) of dispersion-bonded systems. Rung IV encloses DFAs which are highly grid dependent. Most of these DFAs encounter serious difficulties to study molecules with low-frequency vibrations, and within this group of molecules, they are limited to the basic anharmonic corrections to frequencies and IR intensities (see d3E/dξ3 and d2μ/dξ2) of hydrogen- and halogen-bonded complexes. We advise to use these DFAs with a great caution, as they may require giant integration grids to avoid spurious oscillations (see Figure S19). DFAs with the highest grid dependency belong to Rung V. This group consists mostly of DFAs from the B97, M06, SCAN, and N12 families, including very popular DFAs such as M06-2X and ωB97X. Within the set of molecules exhibiting low-frequency vibrations, most of Rung V DFAs are limited to the calculation of harmonic vibrational properties of hydrogen- and halogen-bonded complexes. Some DFAs, namely, B97, M06-L, M06-HF, N12, and ωB97XD, cannot provide stable energy or polarizability gradients for the halogen-bonded systems. Grids smaller than (250,974) seriously compromise the calculation of energy gradients even for hydrogen-bonded complexes (see Figure S18). A much larger grid, such as (750,974), increases the stability of some DFAs in Rung V, but even with this enormous grid, these DFAs cannot be trusted to calculate harmonic frequencies of dispersion-bonded complexes (see Figure S19). In summary, we have uncovered an important limitation of most modern density functional approximations, which might suffer from spurious oscillations of molecular properties along the nuclear displacement coordinate. The oscillations are due to numerical integration errors, which can be alleviated using large integration grids that compromise the favorable cost-accuracy ratio of DFAs. In some cases, the spurious oscillations cannot be avoided even using enormous grids such as (750,974). Using Fourier spectral analysis and digital signal processing techniques we can easily identify and quantify these oscillations. We have used an algorithm to classify forty-five popular DFAs into five groups. Among widely employed DFAs, only LC-BLYP shows robustness against spurious oscillations of various molecular properties examined along a nuclear displacement coordinate. Hardly a handful of DFAs is safe to compute basic anharmonic corrections to the vibrational properties of molecules with low-frequency modes. Long-range corrected DFAs (LC-DFAs) reduce the amount of GGA exchange, prone to numerical integration errors, at large interelectronic separations. Hence, LC-DFAs are less sensitive to spurious oscillations than their uncorrected counterparts. The latter statement is supported by the different performance of LC-BLYP (CAM-B3LYP) and BLYP and (B3LYP). All the molecules presented in this Letter are noncovalently bonded complexes, which especially suffer from the presence of spurious oscillations. However, the latter are not limited to this kind of molecules and could be identified in all sorts of molecular systems. In section 4 of the Supporting Information, we show that molecules like H2O2, H2S2, the allyl anion, butadiene, cyclobutadiene, benzene, naphthalene, and phenanthrene also exhibit spurious oscillations for various energy and dipole derivatives over certain normal modes. All the evidence indicates that the oscillations are a universal feature of modern DFAs that can affect molecules with low-frequency vibrational modes. Given the ubiquitous nature of the latter, the grid-dependency problem of modern DFAs constitutes a challenge to the development of new approximations that calls for the inclusion of molecular properties and the study of spurious oscillations in the construction of new DFAs.
  21 in total

1.  Challenges for density functional theory.

Authors:  Aron J Cohen; Paula Mori-Sánchez; Weitao Yang
Journal:  Chem Rev       Date:  2011-12-22       Impact factor: 60.622

2.  Assessment of DFT and DFT-D for Potential Energy Surfaces of Rare Gas Trimers-Implementation and Analysis of Functionals and Extrapolation Procedures.

Authors:  Roberto Peverati; Marina Macrina; Kim K Baldridge
Journal:  J Chem Theory Comput       Date:  2010-07-13       Impact factor: 6.006

3.  Efficient density-functional theory integrations by locally augmented radial grids.

Authors:  Jürgen Gräfenstein; Dieter Cremer
Journal:  J Chem Phys       Date:  2007-10-28       Impact factor: 3.488

4.  Insights into current limitations of density functional theory.

Authors:  Aron J Cohen; Paula Mori-Sánchez; Weitao Yang
Journal:  Science       Date:  2008-08-08       Impact factor: 47.728

5.  Density functionals for static, dynamical, and strong correlation.

Authors:  Axel D Becke
Journal:  J Chem Phys       Date:  2013-02-21       Impact factor: 3.488

6.  A look at the density functional theory zoo with the advanced GMTKN55 database for general main group thermochemistry, kinetics and noncovalent interactions.

Authors:  Lars Goerigk; Andreas Hansen; Christoph Bauer; Stephan Ehrlich; Asim Najibi; Stefan Grimme
Journal:  Phys Chem Chem Phys       Date:  2017-12-13       Impact factor: 3.676

7.  A new tuned range-separated density functional for the accurate calculation of second hyperpolarizabilities.

Authors:  Pau Besalú-Sala; Sebastian P Sitkiewicz; Pedro Salvador; Eduard Matito; Josep M Luis
Journal:  Phys Chem Chem Phys       Date:  2020-05-22       Impact factor: 3.676

8.  Can Density Functional Theory Be Trusted for High-Order Electric Properties? The Case of Hydrogen-Bonded Complexes.

Authors:  Robert Zaleśny; Miroslav Medved'; Sebastian P Sitkiewicz; Eduard Matito; Josep M Luis
Journal:  J Chem Theory Comput       Date:  2019-05-31       Impact factor: 6.006

9.  Analysis of hydrogen-bond interaction potentials from the electron density: integration of noncovalent interaction regions.

Authors:  Julia Contreras-García; Weitao Yang; Erin R Johnson
Journal:  J Phys Chem A       Date:  2011-07-25       Impact factor: 2.781

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.