Sarai Dery Folkestad1, Henrik Koch1,2. 1. Department of Chemistry, Norwegian University of Science and Technology, Trondheim N-7491, Norway. 2. Scuola Normale Superiore, Piazza dei Cavaleri 7, Pisa 56126, Italy.
Abstract
We present an implementation of equation-of-motion oscillator strengths for the multilevel CCSD (MLCCSD) model where CCS is used as the lower level method (CCS/CCSD). In this model, the double excitations of the cluster operator are restricted to an active orbital space, whereas the single excitations are unrestricted. Calculated nitrogen K-edge spectra of adenosine, adenosine triphosphate (ATP), and an ATP-water system are used to demonstrate the performance of the model. Projected atomic orbitals (PAOs) are used to partition the virtual space into active and inactive orbital sets. Cholesky decomposition of the Hartree-Fock density is used to partition the occupied orbitals. This Cholesky-PAO partitioning is cheap, scaling as O(N3), and is suitable for the calculation of core excitations, which are localized in character. By restricting the single excitations of the cluster operator to the active space, as well as the double excitations, the CCSD-in-HF model is obtained. A comparison of the two models-MLCCSD and CCSD-in-HF-is presented for the core excitation spectra of the adenosine and ATP systems.
We present an implementation of equation-of-motion oscillator strengths for the multilevel CCSD (MLCCSD) model where CCS is used as the lower level method (CCS/CCSD). In this model, the double excitations of the cluster operator are restricted to an active orbital space, whereas the single excitations are unrestricted. Calculated nitrogen K-edge spectra of adenosine, adenosine triphosphate (ATP), and an ATP-water system are used to demonstrate the performance of the model. Projected atomic orbitals (PAOs) are used to partition the virtual space into active and inactive orbital sets. Cholesky decomposition of the Hartree-Fock density is used to partition the occupied orbitals. This Cholesky-PAO partitioning is cheap, scaling as O(N3), and is suitable for the calculation of core excitations, which are localized in character. By restricting the single excitations of the cluster operator to the active space, as well as the double excitations, the CCSD-in-HF model is obtained. A comparison of the two models-MLCCSD and CCSD-in-HF-is presented for the core excitation spectra of the adenosine and ATP systems.
The multilevel coupled cluster (MLCC)
approach can be used to calculate
excitation energies of molecular systems that are too large for the
standard coupled cluster models. In MLCC, the higher order excitations
that are included in the cluster operator are restricted to an active
orbital space. One can view the approach as applying a higher level
of coupled cluster theory to the active orbitals. The MLCC approach
was introduced by Myhre et al.[1−3] but is similar to the active space
approach, which has resulted from the multireference coupled cluster
method of Oliphant and Adamowicz et al.[4−7]In the multilevel coupled cluster
singles and doubles (MLCCSD)
model,[1,2] CCSD[8] is applied
to the active orbital space. Coupled cluster singles (CCS) and/or
singles and perturbative doubles[9] (CC2)
are used for the inactive orbital space. With carefully selected active
orbitals, excitation energies of CCSD quality are obtained. When the
active orbital space is enlarged, the MLCCSD excitation energies converge
smoothly toward the CCSD excitation energies. The MLCCSD oscillator
strengths, within the coupled cluster response formalism,[10,11] were reported in ref (12). However, these proof-of-concept calculations did not exploit the
computational reductions offered by the multilevel framework and were
performed using a standard CCSD code.Recently, we have reformulated
and implemented the MLCCSD ground
and excited state equations.[13] We have
found it sufficient to use CCS as the lower level model to obtain
accurate valence excitation energies. This CCS/CCSD model is cheaper
and simpler than the CC2/CCSD and CCS/CC2/CCSD models. Moreover, the
CCS/CCSD model is compatible with properties derived within the equation-of-motion[14−17] (EOM) framework as well as with the coupled cluster response theory.
This is because the only modification with respect to CCSD is a restriction
of the double part of the cluster operator to the active orbital space.
The CCS/CCSD model has been used to calculate valence excitation energies
for a system with more than 50 second row atoms and the computational
scaling approaches that of CCS for sufficiently large inactive spaces.[18] In this paper, we present an implementation
of EOM oscillator strengths[19] for the CCS/CCSD
model.The success of an MLCC calculation relies heavily on
the choice
of the active orbital space. Two strategies are used to obtain the
active orbitals: either information from a cheaper electronic structure
model is used or localized (or semilocalized) orbitals in a subregion
of the molecular system defines the active orbital space. The success
of the first strategy relies on the accuracy of the cheaper electronic
structure model. The use of correlated natural transition orbitals[13,20] (CNTOs) to determine the active space is an example of such an approach.
The CNTOs are similar to natural transition orbitals[21,22] (NTOs), which are extensively used for analysis[23,24] and in reduced cost methods for excited states,[25−29] but are defined by using excitation vectors that
are parameterized with both single and double substitutions with respect
to the reference determinant.When an electronic excitation
is localized in a region of the molecule,
localized or semilocalized Hartree–Fock orbitals can be used
to determine the active space. Cholesky orbitals have been used in
MLCC calculations for both core and valence excitation energies.[2,3,12] Occupied Cholesky orbitals can
be obtained through a partial, limited Cholesky decomposition of the
idempotent Hartree–Fock density in the atomic orbital (AO)
basis.[30,31] Virtual Cholesky orbitals can be obtained
in the same way by considering the virtual Hartree–Fock density.
This localization scheme is non-iterative and has cubic scaling with
respect to the system size. Another option to determine the active
virtual orbitals, which can be used in conjunction with occupied Cholesky
orbitals, are the projected atomic orbitals (PAOs). PAOs have been
used extensively in reduced cost electronic structure methods.[32−38] The construction of PAOs is also a non-iterative procedure with
cubic scaling.In MLCCSD, the double excitations of the cluster
operator are restricted
to the active orbital space. By also restricting the single excitations,
we obtain a reduced space CCSD approach. There are several reduced
space coupled cluster approaches, such as the frozen core approximation,
the frozen natural orbital approaches,[39−44] and the LoFEx[28,29] and CorNFLEx[45] methods. The LoFEx and CorNFLEx methods are specialized
for the calculation of accurate excitation energies. A truncated set
of molecular orbitals (MOs) is determined by considering the dominant
NTOs or CNTOs, obtained from a cheaper electronic structure method,
and localized orbitals that overlap with these dominant NTOs/CNTOs.
The reduced orbital space is increased until the excitation energy
is converged to within a predefined threshold. As NTOs/CNTOs from
a single excited state are used to determine the MOs that enter the
coupled cluster calculation, LoFEx and CorNFLEx are state specific
methods; the reduced space differs depending on the excited state.In this work, we consider a reduced space CCSD approach (CCSD-in-HF)
where Cholesky occupied orbitals and PAOs are used to obtain the active
orbital space for a region of interest. Several excited states can
be treated using the same truncated set of molecular orbitals, as
long as the excitation processes are located in the region of interest.
Preliminary studies using CC-in-HF to describe valence excitations
have been reported.[18,31,46] An iterative procedure where the active space is increased and the
excitation energies recomputed until convergence, as is done in LoFEx/CorNFLEx,
is possible but has not yet been implemented.In near edge X-ray
absorption fine structure (NEXAFS) spectroscopy,[47] a core electron is excited. Since the binding
energy of a core electron is unique to a given atomic number, specific
energy ranges correspond to the K-edge NEXAFS spectrum for the different
atoms. The excitation energies are sensitive to the environment of
the core excited atom and NEXAFS spectra can be used to probe the
local environment. Because of the strong interaction between the core
hole and the excited electron, core excitation processes are generally
localized in character.With the development of the liquid microjet
technique, studies
of solutions and liquids with NEXAFS can be performed routinely. For
reviews of the liquid microjet technique in soft X-ray spectroscopies,
we refer the reader to refs (48) and (49). Proper interpretation of NEXAFS spectra relies on accurate theoretical
modeling. While it can be challenging to accurately model the NEXAFS
spectra of small molecules in vacuo, it is significantly
more complicated for complex systems such as solutions and liquids.The coupled cluster hierarchy of models can be used to accurately
calculate core excitations, for instance by use of the core–valence
separation[50] (CVS) approach of Coriani
and Koch.[51,52] Typical errors of CCSD core excitation energies,
obtained within the CVS approximation, are on the order of 1 eV. The
errors can be significantly reduced by including triple excitations.[53−55] Intensities can be obtained from coupled cluster linear response
theory or from EOM coupled cluster theory. Myhre et al.[12] calculated the MLCCSD NEXAFS spectra at the
carbon and oxygen edge for ethanal, propenal, and butanal, demonstrating
excellent agreement with the CCSD spectra. Their work showed that
the multilevel coupled cluster models, using localized orbitals to
determine the active space, is appropriate for the description of
core excitation processes. While illustrating the usefulness of the
MLCCSD model, this implementation was, as mentioned previously, not
optimal and calculations on larger systems have not yet been performed.In this paper, we consider the MLCCSD and CCSD-in-HFnitrogen K-edge
spectra of adenine, adenosine, adenosine triphosphate (ATP), and an
ATP–water system. As core excitation processes are spatially
localized, the orbital space can be partitioned using occupied Cholesky
orbitals and PAOs. For valence excitations, which are generally more
delocalized in character, orbital selection can be more challenging
and active spaces determined from CNTOs are often preferable. An exception
is when the excitation of interest is localized in some known region
of the system, e.g., in solvent–solute systems. The CCSD-in-HF
approach relies on the selection of an active region and to treat
delocalized valence excitations, a reduced space approach like LoFEx/CorNFLEx
is more appropriate.With the calculations presented in this
paper, we outline a procedure
to obtain accurate NEXAFS spectra for larger molecular systems, liquids,
or solutions. First, a model system is used to determine the basis
set and to ensure that the active space of the MLCCSD and CCSD-in-HF
calculations is suitable for accurate treatment of the core excitations.
Here, we use adenine and adenosine for this purpose. Afterward, the
MLCCSD and CCSD-in-HF calculations are performed on the full system,
i.e., ATP and the ATP–water system. The systems were selected
because experimental spectra are available[56,57] and because ATP (C10H16N5O13P3) is large enough that the full CCSD NEXAFS
spectra is computationally expensive to generate. Another theoretical
NEXAFS study on adenine and ATP, both in vacuum and in aqueous solution,
has been performed at the DFT level of theory, using polarizable density
embedding to describe the solvent.[58] In
that study, a series of representative geometries for ATP solvated
in water were considered; this is likely necessary in order to accurately
describe the NEXAFS spectra of a solute. Our mission in the present
study is not an accurate description of the experiment but rather
to establish the performance of the EOM-MLCCSD and EOM-CCSD-in-HF
implementations and their usefulness for modeling the NEXAFS spectra
of complex systems.
Theory
The coupled cluster wave
function is given bywhere X is
the cluster operator, |HF⟩ is the Hartree–Fock reference, xμ is the cluster amplitude, and τμ is the excitation operator. The standard models within
the coupled cluster hierarchy are obtained by restricting X to include excitation operators up to a certain order.
The cluster amplitudes are determined through the projected coupled
cluster equations,and the
energy is obtained fromwhere H̅ = exp (–T)H exp(T) andis the non-relativistic electronic
Hamiltonian operator in terms of the singlet operators E.[59] The g = (pq|rs) are
the electron repulsion integrals in the Mulliken notation.
Equation-of-Motion
Coupled Cluster Theory
In the equation-of-motion
(EOM) coupled cluster framework,[14−17] a general state is expressed
aswhere |μ⟩ = τμ|HF⟩, and is obtained as the right eigenvectors of
the similarity transformed Hamiltonian,Here, is the Jacobian matrix, with elements Aμ = ⟨μ|[H̅, τ]|HF⟩
and η = ⟨HF|[H̅, τ]|HF⟩,
and we have assumed that the ground state amplitudes have been determined
from eq . As is not Hermitian, the left eigenvectors
differ from the right eigenvectors. We have the EOM coupled cluster
left statesand we require that the left
and right eigenvectors satisfy the biorthonormalization criterionThe ground state solutions, 0 and 0, are given
bywhere is the ground state multipliers determined fromThe
excited state solutions, and for k > 0, are given bywhere and r are left and right eigenvectors of , respectively, and r0 = – · . The eigenvalues of are the excitation energies, ω.Oscillator strengths for transitions between the ground and
the kth excited state are given bywhere μα = ∑ μαE is the
α component of dipole operator.[19,60]
Multilevel
CCSD
In the multilevel coupled cluster theory,
we restrict the higher order excitations of the cluster operator to
an active orbital space. In the two-level CCS/CCSD approach, the cluster
operator assumes the formThe single excitation
operator,includes single
excitations
in the entire orbital space, that is, the summation indices A and I label general (active and inactive)
virtual and occupied orbitals, respectively. The double excitation
operator T2 is restricted to the active
orbital space,where the summation indices a, b and i, j label active virtual and occupied orbitals, respectively.The MLCCSD ground state equations for the two-level CCS/CCSD model
arewhere Ĥ is the X1-transformed
Hamiltonian and
the doubles projection space is associated with T2. These equations are equivalent to the standard CCSD
ground state equations, except for the restriction of the T2 operator to the active space.Properties
of this MLCCSD model can be obtained within the EOM
framework. The excited states (|k⟩, ⟨k|) are constructed by solving the eigenvalue equations
of the MLCCSD (CCS/CCSD) Jacobian matrix,whereNote that MLCCSD assumes
the same form as CCSD except
for restriction of the operator T2 and
the corresponding projection space. Core excited states can be obtained
using the core–valence separation (CVS) approach of Coriani
and Koch.51,52 In this approach, the non-zero elements
of the excitation vectors have at least one occupied index belonging
to the excited core orbital. It can be implemented as a projection51,52 or by implementing the linear transformation by the CVS
Jacobian matrix directly.[61,62] Once the left and right
states are constructed and the multipliers are determined from eq (, then the MLCCSD oscillator
strengths can be calculated according to eq (.
Partitioning the Orbital Space
The
first step of any
multilevel coupled cluster calculation is to partition the molecular
orbitals into the active and inactive orbital sets. The canonical
Hartree–Fock orbitals are not suitable to determine the active
space. If the property of interest is spatially localized in the molecular
system, such as core excitations or excitations in a target molecule
in a solvent, localized orbitals can be used.For the occupied
space, there are many widely used iterative localization procedures,
such as the Boys,[63] Pipek–Mezey,[64] and Edmiston–Ruedenberg[65] procedures. In this work, we use the semilocalized Cholesky
orbitals described in refs (30, 31), which can be obtained in a non-iterative procedure. A set of active
atoms are selected and the idempotent Hartree–Fock density,is Cholesky decomposed in a specialized procedure
where the pivoting elements are restricted to correspond to AOs on
the active atoms. The decomposition procedure ends when all “active”
diagonals fall below a given threshold. After the decomposition, the
Cholesky factors are the orbital coefficients of the active occupied
orbitals, :The density of the inactive space, , can be fully Cholesky decomposed
to
yield the inactive occupied orbitals.The iterative localization
procedures, which are extensively used
for the occupied space, can also be applied to the virtual space.
However, convergence for the virtual space is more challenging and
the use of sophisticated level-shift and trust-radius solvers are
often necessary.[66] The projected atomic
orbitals (PAOs), is an alternative to such iterative localization
procedures for the virtual space. To construct PAOs in an active region
of the molecular system, the occupied orbitals are projected out of
the AOs centered on the active atoms. The orbital coefficient matrix
for the active virtual PAOs iswhere ′
is rectangular and contains the columns of the AO
overlap matrix that correspond to AOs centered on the active atoms.
These orbitals are non-orthogonal and linearly dependent. The Löwdin
canonical orthonormalization procedure[67] can be used to obtain a set of orthonormal active virtual orbitals.
Linear dependence in the full set of AOs should be removed before
the PAO construction. The inactive virtual orbitals are obtained in
a similar way. The active virtual orbitals, as well as the occupied
orbitals, are projected out of the full set of AOs. The resulting
orbitals are orthonormalized.After the orbitals have been partitioned,
we block diagonalize
the occupied–occupied and virtual–virtual Fock matrices
such that the active–active and inactive–inactive blocks
become diagonal. This is achieved by rotating among the active orbitals
and among the inactive orbitals separately. This semicanonical basis
is used throughout the MLCCSD calculation as this significantly improves
convergence.
Reduced Space CCSD
In MLCCSD, the
double excitations
included in the cluster operator are restricted to an active space.
In the reduced space CCSD-in-HF approach, the single excitations are
also restricted to the active space. The active orbitals are determined
from occupied Cholesky orbitals and PAOs. The inactive orbitals contribute
through the Fock matrix,where p and q are general active space indices, i denotes
an active occupied orbital, and the index denotes
an inactive occupied orbital. The
CCSD-in-HF calculation is performed as a standard CCSD calculation
but with a truncated MO basis consisting of the active orbitals (NMO < NAO) and
with the effective Fock matrix of eq .
Results and Discussion
As a test
study for the EOM-MLCCSD and EOM-CCSD-in-HF implementations,
we consider the nitrogen core excitations of adenine, adenosine, adenosine
triphosphate (ATP) in vacuo, and ATP with 12 water
molecules, see Figure . These systems are chosen because of their biological importance
and the availability of experimental studies.[56,57] In particular, experimental NEXAFS spectra at the nitrogen and carbon
edge of adenosine triphosphate in aqueous solution has been reported.[57] Our goal in this paper is not to perform an
accurate application study but rather to demonstrate the performance,
in terms of accuracy and cost, of the MLCCSD and CCSD-in-HF methods.
This has dictated our choice of basis sets and the number of computed
states. Furthermore, in order to properly describe the effects of
solvents, one should sample the spectra at several representative
geometries, e.g., obtained from a molecular dynamics simulation. Bulk
solvent should also be included in the system, for instance, by using
the QM/MM framework,[68−70] polarizable continuum model,[71,72] or by treating all water molecules at the Hartree–Fock level
of theory. In general, triple excitations (CC3[73]) are needed to obtain quantitative, unshifted NEXAFS spectra;
this is demonstrated for adenine. However, shifted CCSD spectra can
be useful for qualitative interpretation of experiments.
Figure 1
Adenine, adenosine,
adenosine triphosphate (ATP), and ATP with
12 water molecules.
Adenine, adenosine,
adenosine triphosphate (ATP), and ATP with
12 water molecules.All geometries, except
the ATP–water and methylamine–water
geometries, are obtained at the B3LYP/aug-cc-pVDZ level using the
NWChem[74] software. The ATP–water
and methylamine–water geometries were built using the Avogadro
software package.[75] All geometries are
available from ref (76). Visualization of the molecular systems is done using the Chimera
software package.[77]The EOM-MLCCSD
oscillator strengths were implemented in a development
branch of the eT program,[46] and all calculations
are performed with eT. The following thresholds have been used: For
the ground state, we used a threshold of 10–6 on
|Ω| and on the residual of the multiplier equations.
For the excited states, a threshold of 10–4 was
used for the residual and 10–6 on the change in
the excitation energies. The electron repulsion integrals are Cholesky
decomposed, and the decomposition threshold is 10–8 for adenine and 10–6 for the remaining calculations.
Generally, a decomposition threshold of 10–3 is
sufficient for accurate excitation energies. All timings were performed
on two Intel Xeon Gold 6138 processors, using 40 threads, and 360
GB of memory was available in all calculations. In the EOM-MLCCSD
and EOM-CCSD-HF calculations, we use Cholesky-PAOs to partition the
orbital space and the “adenine part” of adenosine and
ATP is considered active.
Adenine
In order to select the basis
set for the larger
systems, we start by considering the EOM-CCSDnitrogen K-edge spectrum
of adenine for different combinations of Dunning correlation consistent
basis sets.[79,80] The results are given in Figure . Generally, we use
a larger basis set on the nitrogen atoms than on the carbon and hydrogen
atoms. From the spectra with aug-cc-pVDZ and aug-cc-pCVDZ/aug-cc-pVDZ,
we observe that additional core functions on the nitrogen atoms shifts
the spectrum but that the overall shape is unchanged. From the two
lower panels, we see that the shift, with respect to the experimental
value of 399.5 eV for the first peak,[56] is significantly reduced by using triple-zeta rather than double-zeta
on the nitrogen atoms. Although the finer details might differ, the
main features of the spectrum are the same for all basis set combinations.
As a compromise between accuracy and cost, we use the aug-cc-pVTZ/cc-pVDZ
basis sets for the remaining calculations.
Figure 2
Adenine NEXAFS nitrogen
edge calculated at the CCSD level of theory
with combinations of the Dunning correlation consistent basis sets.
When a larger basis set is used on the nitrogen atoms, we use the
notation: basis-on-nitrogen/basis-on-other-atoms.
Lorentzian broadening with 0.3 eV FWHM has been applied. The first
peak has been shifted to the experimental value 399.5 eV, as reported
in ref (56). Six roots
have been calculated for each nitrogen atom.
Adenine NEXAFS nitrogen
edge calculated at the CCSD level of theory
with combinations of the Dunning correlation consistent basis sets.
When a larger basis set is used on the nitrogen atoms, we use the
notation: basis-on-nitrogen/basis-on-other-atoms.
Lorentzian broadening with 0.3 eV FWHM has been applied. The first
peak has been shifted to the experimental value 399.5 eV, as reported
in ref (56). Six roots
have been calculated for each nitrogen atom.In Figure , the
EOM-CCSD/aug-cc-pVTZ/cc-pVDZ and EOM-CC3/aug-cc-pVTZ/cc-pVDZ[62] are compared to the experimental spectrum from
ref (56). The inclusion
of triple excitations significantly improves the computed spectrum
for the second feature at approximately 402 eV. Furthermore, more
than six roots per nitrogen atom are necessary in order to describe
the spectrum from 403 eV. To accurately describe Rydberg states, additional
diffuse basis functions are likely needed.
Figure 3
Adenine NEXAFS nitrogen
edge calculated at the CCSD and CC3 level
of theory with the aug-cc-pVTZ/cc-pVDZ basis set. The larger basis
set is used on the nitrogen atoms. The first peak has been shifted
to the experimental value 399.5 eV, as reported in ref (56). Six roots have been calculated
for each nitrogen atom. Lorentzian broadening with 0.3 eV FWHM has
been applied. Experimental data collected from ref (56) using WebPlotDigitizer.[78]
Adenine NEXAFS nitrogen
edge calculated at the CCSD and CC3 level
of theory with the aug-cc-pVTZ/cc-pVDZ basis set. The larger basis
set is used on the nitrogen atoms. The first peak has been shifted
to the experimental value 399.5 eV, as reported in ref (56). Six roots have been calculated
for each nitrogen atom. Lorentzian broadening with 0.3 eV FWHM has
been applied. Experimental data collected from ref (56) using WebPlotDigitizer.[78]
Adenosine
To illustrate
the performance of EOM-MLCCSD
and EOM-CCSD-in-HF compared to EOM-CCSD for core excitations and oscillator
strengths, we consider the nitrogen edge NEXAFS spectrum of adenosine
(conformer 1 in Figure ) calculated using the aug-cc-pVTZ/cc-pVDZ basis. The results are
given in Figure , Table , and Table . In the MLCCSD and CCSD-in-HF
calculations for conformer 1, there are 40 active occupied orbitals
and 301 active virtual orbitals, and there are 30 inactive occupied
orbitals and 120 inactive virtual orbitals.
Figure 4
Adenosine conformers
1 and 2 with labels corresponding to Tables and 2.
Figure 5
Nitrogen K-edge NEXAFS spectra of two conformers of Adenosine calculated
at the EOM-MLCCSD/aug-cc-pVTZ/cc-pVDZ and EOM-CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ
levels of theory. Six roots have been calculated for each nitrogen
atom. Lorentzian broadening with 0.3 eV FWHM has been applied.
Table 1
MLCCSD/aug-cc-pVTZ/cc-pVDZ Core Excitations
out of the Nitrogen 1s Orbitals for Adenosine (Conformer 1)a
nitrogen
ω1 (Δω1)
ω2 (Δω2)
ω3 (Δω3)
ω4 (Δω4)
ω5 (Δω5)
ω6 (Δω6)
N1
403.848 (0.011)
405.529 (0.041)
406.465
(0.130)
406.836 (0.207)
406.958 (0.120)
407.200 (0.219)
N2
403.269
(0.004)
403.521 (0.007)
404.461 (0.004)
405.509 (0.006)
405.871 (0.012)
405.956 (0.009)
N3
400.815
(0.004)
404.119 (0.031)
404.302 (0.013)
404.821 (0.020)
404.990 (0.068)
405.186 (0.053)
N4
401.021
(<0.001)
402.853 (0.001)
404.069 (0.026)
404.806 (0.022)
405.062 (0.046)
405.224 (0.028)
N5
401.177
(<0.001)
402.044 (<0.001)
403.727
(0.004)
404.425 (0.006)
404.735 (0.013)
404.888 (0.007)
N1†
403.845 (0.008)
405.503 (0.015)
406.407 (0.072)
406.755 (0.126)
406.915 (0.077)
407.104 (0.123)
Excitation energies, ω, and errors
with respect to CCSD, Δω,
are given in eV. † = nearest neighboring
carbon on ribose included in active space.
Table 2
CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ Core
Excitations out of the Nitrogen 1s Orbitals for Adenosine (Conformer
1)a
nitrogen
ω1 (Δω1)
ω2 (Δω2)
ω3 (Δω3)
ω4 (Δω4)
ω5 (Δω5)
ω6 (Δω6)
N1
403.850 (0.013)
405.526 (0.038)
406.474
(0.140)
406.854 (0.224)
406.961 (0.123)
407.232 (0.251)
N2
403.269
(0.004)
403.522 (0.008)
404.461 (0.004)
405.508 (0.004)
405.874 (0.016)
405.957 (0.010)
N3
400.815
(0.004)
404.119 (0.031)
404.300 (0.011)
404.818 (0.018)
404.998 (0.076)
405.187 (0.054)
N4
401.021
(<0.001)
402.855 (0.002)
404.074 (0.032)
404.807 (0.023)
405.068 (0.052)
405.226 (0.030)
N5
401.178
(0.001)
402.045 (0.001)
403.726 (0.003)
404.426 (0.007)
404.736 (0.014)
404.888 (0.008)
Excitation energies, ω, and errors with respect to CCSD, Δω, are given in eV.
Adenosine conformers
1 and 2 with labels corresponding to Tables and 2.Nitrogen K-edge NEXAFS spectra of two conformers of Adenosine calculated
at the EOM-MLCCSD/aug-cc-pVTZ/cc-pVDZ and EOM-CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ
levels of theory. Six roots have been calculated for each nitrogen
atom. Lorentzian broadening with 0.3 eV FWHM has been applied.Excitation energies, ω, and errors
with respect to CCSD, Δω,
are given in eV. † = nearest neighboring
carbon on ribose included in active space.Excitation energies, ω, and errors with respect to CCSD, Δω, are given in eV.For nitrogen atoms 2–5 (see Table ), the calculated MLCCSD and
CCSD-in-HF excitation
energies for adenosine has an error with respect to CCSD of less than
0.1 eV. This is well within the expected error of CCSD for K-edge
core excitations. The errors are generally larger for nitrogen atom
1 (N1). This is because not all nearest neighbours of N1 are defined
as active. The errors for N1 can be reduced by including the neighboring
carbon on the ribose into the set of active atoms, as seen from Table . As seen from Figure and Tables and 2, the errors of MLCCSD and CCSD-in-HF, compared to full CCSD, are
small: the spectra coincide for all but the low intensity peaks at
around 407 eV. The MLCCSD and CCSD-in-HF spectra are almost indistinguishable,
with small differences observed for the high energy excitations only.Wall times are given in Table . We report average timings from the calculations of
the nitrogen atoms for the cluster amplitudes (t) and the average time to transform by the Jacobian matrix, and (t and t). Although the active
space is quite large in these calculations, the computational savings
are significant with approximately a factor of five for t, t, and t compared to full CCSD.
Specialized active spaces for each of the five MLCCSD and CCSD-in-HF
calculations can be used where we include only the neighbors of the
nitrogen atom being excited. This can significantly reduce the cost
but will probably lead to increased errors with respect to the full
CCSD calculation.
Table 3
Average Wall Times to Solve for the
Ground State Equations, t, and Average
Time for Transformations by (t) and (t) of Adenosine (Conformer 1)
model
tgs [ min ]
tA [
min ]
tAT [ min ]
CCSD-in-HF
19
2
3
MLCCSD
22
2
3
CCSD
128
11
14
The time to converge the excited
states varies greatly because
the number of iterations required to reach convergence varies for
the different nitrogen atoms. To give a perspective on the computational
savings achieved in these MLCCSD calculations, we compare the calculation
time of the cheapest CCSD calculation (N2) with the most expensive
MLCCSD calculation (N4); the cheapest CCSD calculation used 9 days
and 16 h, whereas the most expensive MLCCSD calculation used 1 day
and 8 h. There are two contributing factors to the savings with MLCCSD:
the savings due to the reduced cost of constructing the Ω vector and performing the linear transformation by and , and the reduction of IO in the MLCCSD calculations.
The IO is reduced because the reduced space of the Davidson procedure
can be stored in memory for MLCCSD, whereas this is not possible for
CCSD with 360 GB of memory available.In Figure , we
also present the EOM-MLCCSD/aug-cc-pVTZ/cc-pVDZ and EOM-CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ
nitrogen K-edge spectra for conformer 2 of adenosine. For conformer
2, there are 42 active occupied orbitals and 303 active virtual orbitals,
and there are 28 inactive occupied orbitals and 118 inactive virtual
orbitals. We observe a change in the spectrum, compared to conformer
1, resulting from the change in the environment of the nitrogen atoms.
Again, the MLCCSD and CCSD-in-HF models give very similar results,
as seen for conformer 1.
ATP and ATP–Water
The EOM-MLCCSD
and EOM-CCSD-in-HF
methods can be used to treat systems for which full EOM-CCSD is too
expensive. The EOM-MLCCSD/aug-cc-pVTZ/cc-pVDZ and EOM-CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ
nitrogen K-edge spectrum of ATP is given Figure . There are 44 active occupied orbitals and
303 active virtual orbitals and 86 inactive occupied orbitals and
253 inactive virtual orbitals. The most expensive of the five MLCCSD
calculations for this system completed in 2 days and 23 h. As for
adenosine, the MLCCSD and CCSD-in-HF spectra coincide for all but
the high energy excitations around 407 eV.
Figure 6
Nitrogen K-edge NEXAFS
spectra calculated at the MLCCSD/aug-cc-pVTZ/cc-pVDZ
and CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ level for ATP. Six roots have been
calculated for each nitrogen atom. Lorentzian broadening with 0.3
eV FWHM has been applied.
Nitrogen K-edge NEXAFS
spectra calculated at the MLCCSD/aug-cc-pVTZ/cc-pVDZ
and CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ level for ATP. Six roots have been
calculated for each nitrogen atom. Lorentzian broadening with 0.3
eV FWHM has been applied.In Figure , we
present the nitrogen K-edge spectrum of the ATP–water system
(see Figure ). In
this system, there are 58 active occupied and 310 active virtual orbitals
and 132 inactive occupied and 474 inactive virtual orbitals. The MLCCSD
and CCSD-in-HF results coincide well, but some differences are observed
for higher excitation energies. This is likely because MLCCSD, with
CCS on the whole system, offers an improved description with more
diffuse core excited states. Including additional Rydberg functions
on the active atoms could be important for the description of these
states in both methods. The MLCCSD calculations are more expensive
than the CCSD-in-HF calculations; timings for transformations by and are given in Table .
Figure 7
Nitrogen K-edge NEXAFS spectra calculated at
the MLCCSD/aug-cc-pVTZ/cc-pVDZ
and CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ level for ATP and 12 water molecules.
Six roots have been calculated for each nitrogen atom. Lorentzian
broadening with 0.3 eV FWHM has been applied.
Table 4
MLCCSD and CCSD-in-HF Calculations
on the ATP–Water System Using the aug-cc-pVTZ/cc-pVDZ Basis
and Average Wall Times to Transformations by (t) and (t)
model
tA[ min ]
tAT[ min ]
CCSD-in-HF
5
7
MLCCSD
33
37
Nitrogen K-edge NEXAFS spectra calculated at
the MLCCSD/aug-cc-pVTZ/cc-pVDZ
and CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ level for ATP and 12 water molecules.
Six roots have been calculated for each nitrogen atom. Lorentzian
broadening with 0.3 eV FWHM has been applied.With the calculation on
ATP and the ATP–water system, we demonstrate that the MLCCSD
and CCSD-in-HF approaches can be used to treat sizable molecules and
to include solvent effects explicitly. To fully capture the effects
of the solvent, the calculations must be performed on several representative
geometries. Additionally, one should increase the number of water
molecules included in the calculation, treating most water molecules
at a lower level of theory. For such a study, the CCSD-in-HF approach
is preferable since accuracy is comparable to MLCCSD, but the cost
is significantly lower.The MLCCSD and CCSD-in-HF models can
be used in a QM/MM framework.
In Figure , we compare
the MLCCSD and CCSD-in-HF spectra for the ATP–water system
to the QM/MM approach where the QM region (ATP) is treated with either
CCSD-in-HF or MLCCSD and the MM region (the 12 water molecules) is
treated with electrostatic embedding.[81] There is good agreement between the pure QM and the QM/MM spectra
for the peaks in the range of 401–405 eV. For the remaining
peaks of higher energy, the differences are significant.
Figure 8
Nitrogen K-edge
NEXAFS spectra calculated at the MLCCSD/aug-cc-pVTZ/cc-pVDZ,
MLCCSD/MM/aug-cc-pVTZ/cc-pVDZ, CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ, and
CCSD-in-HF/MM/aug-cc-pVTZ/cc-pVDZ levels for ATP and 12 water molecules.
Six roots have been calculated for each nitrogen atom. Lorentzian
broadening with 0.3 eV FWHM has been applied.
Nitrogen K-edge
NEXAFS spectra calculated at the MLCCSD/aug-cc-pVTZ/cc-pVDZ,
MLCCSD/MM/aug-cc-pVTZ/cc-pVDZ, CCSD-in-HF/aug-cc-pVTZ/cc-pVDZ, and
CCSD-in-HF/MM/aug-cc-pVTZ/cc-pVDZ levels for ATP and 12 water molecules.
Six roots have been calculated for each nitrogen atom. Lorentzian
broadening with 0.3 eV FWHM has been applied.In order to assess the quality of the MLCCSD and CCSD-in-HF approaches
compared to the QM/MM approach with electrostatic embedding, we consider
the nitrogen K-edge spectrum of methylamine and eight water molecules
(see Figure ) The
results are given in Figure . The active region, and the QM region of the QM/MM calculation,
is methylamine. The MLCCSD, CCSD-in-HF and CCSD/MM spectra are shifted
such that the most intense peak is aligned with the CCSD spectrum.
Contrary to the CCSD/MM spectrum, the MLCCSD and CCSD-in-HF spectra
capture the features of the CCSD spectrum. MLCCSD performs better
than CCSD-in-HF, but not significantly.
Figure 9
Methylamine and eight
water molecules.
Figure 10
Nitrogen K-edge of methylamine–water
system. Comparison
of CCSD, MLCCSD, CCSD-in-HF, and CCSD/MM using the aug-cc-pVTZ basis
on the nitrogen atom and the cc-pVDZ basis on the remaining atoms.
The MLCCSD, CCSD-in-HF, and CCSD/MM spectra are shifted so that the
most intense peak is aligned with the corresponding CCSD peak. Lorentzian
broadening with 0.3 eV FWHM has been applied.
Methylamine and eight
water molecules.Nitrogen K-edge of methylamine–water
system. Comparison
of CCSD, MLCCSD, CCSD-in-HF, and CCSD/MM using the aug-cc-pVTZ basis
on the nitrogen atom and the cc-pVDZ basis on the remaining atoms.
The MLCCSD, CCSD-in-HF, and CCSD/MM spectra are shifted so that the
most intense peak is aligned with the corresponding CCSD peak. Lorentzian
broadening with 0.3 eV FWHM has been applied.
Concluding Remarks
We have presented an implementation of
EOM-MLCCSD oscillator strengths
for the two level CCS/CCSD model. The model can be used to simulate
UV/visible and NEXAFS spectroscopies with CCSD quality at a significantly
reduced cost, given that an adequate active orbital space is employed.
In this paper, we have partitioned the orbital space by using occupied
Cholesky orbitals and PAOs localized in a subregion of the molecular
system. This orbital selection procedure is suitable for localized
excitation processes such as core excitations. The CCS/CCSD model,
and a reduced space CCSD model (CCSD-in-HF), has been applied to the
nitrogen K-edge spectra of adenosine, adenosine triphosphate (ATP),
and an ATP–water system. With these calculations, we have demonstrated
that MLCCSD and CCSD-in-HF are useful for the accurate modeling of
the NEXAFS spectra of complex molecular systems. Our results indicate
that CCSD-in-HF may be the preferable approach to treat low lying
core excitations, as the method has lower costs at no significant
loss of accuracy. The MLCCSD approach, or a combined MLCCSD-in-HF
approach, may be preferable for calculations with smaller active spaces
and for more delocalized core excitation processes. Furthermore, MLCCSD
can be used to assess the quality of CCSD-in-HF calculations. We believe
that both models, together with a geometry sampling from, e.g., a
molecular dynamics simulation, will provide a useful theoretical tool
for the interpretation of experimental NEXAFS spectra of solvents
and liquids.
Authors: Marta L Vidal; Xintian Feng; Evgeny Epifanovsky; Anna I Krylov; Sonia Coriani Journal: J Chem Theory Comput Date: 2019-05-01 Impact factor: 6.006
Authors: Daniel N Kelly; Craig P Schwartz; Janel S Uejio; Andrew M Duffin; Alice H England; Richard J Saykally Journal: J Chem Phys Date: 2010-09-14 Impact factor: 3.488
Authors: Marcus D Hanwell; Donald E Curtis; David C Lonie; Tim Vandermeersch; Eva Zurek; Geoffrey R Hutchison Journal: J Cheminform Date: 2012-08-13 Impact factor: 5.514