Eli Pollak1, Rocco Martinazzo2. 1. Chemical and Biological Physics Department, Weizmann Institute of Science, 76100 Rehovot, Israel. 2. Department of Chemistry, University of Milan and Institute of Molecular Science and Technologies (ISTM), Consiglio Nazionale delle Ricerche (CNR), I-20133 Milan, Italy.
Abstract
As of the writing of this paper, lower bounds are not a staple of quantum chemistry computations and for good reason. All previous attempts at applying lower bound theory to Coulombic systems led to lower bounds whose quality was inferior to the Ritz upper bounds so that their added value was minimal. Even our recent improvements upon Temple's lower bound theory were limited to Lanczos basis sets and these are not available to atoms and molecules due to the Coulomb singularity. In the present paper, we overcome these problems by deriving a rather simple eigenvalue equation whose roots, under appropriate conditions, give lower bounds which are competitive with the Ritz upper bounds. The input for the theory is the Ritz eigenvalues and their variances; there is no need to compute the full matrix of the squared Hamiltonian. Along the way, we present a Cauchy-Schwartz inequality which underlies many aspects of lower bound theory. We also show that within the matrix Hamiltonian theory used here, the methods of Lehmann and our recent self-consistent lower bound theory (J. Chem. Phys. 2020, 115, 244110) are identical. Examples include implementation to the hydrogen and helium atoms.
As of the writing of this paper, lower bounds are not a staple of quantum chemistry computations and for good reason. All previous attempts at applying lower bound theory to Coulombic systems led to lower bounds whose quality was inferior to the Ritz upper bounds so that their added value was minimal. Even our recent improvements upon Temple's lower bound theory were limited to Lanczos basis sets and these are not available to atoms and molecules due to the Coulomb singularity. In the present paper, we overcome these problems by deriving a rather simple eigenvalue equation whose roots, under appropriate conditions, give lower bounds which are competitive with the Ritz upper bounds. The input for the theory is the Ritz eigenvalues and their variances; there is no need to compute the full matrix of the squared Hamiltonian. Along the way, we present a Cauchy-Schwartz inequality which underlies many aspects of lower bound theory. We also show that within the matrix Hamiltonian theory used here, the methods of Lehmann and our recent self-consistent lower bound theory (J. Chem. Phys. 2020, 115, 244110) are identical. Examples include implementation to the hydrogen and helium atoms.
One of the outstanding challenges especially in ab initio quantum
chemistry is obtaining lower bounds to atomic and molecular energies,
which are as accurate as the upper bounds obtained with the Courant–Fischer
theorem from the Ritz variational method.[1] Lower bound methods abound, starting with Temple’s seminal
expression derived in 1928.[2] Landmarks
in the derivation of lower bounds are Weinstein’s lower bound
of 1934[3,4] and Lehmann’s optimization of Temple’s
lower bound presented in 1949–50.[5,6] Especially
Lehmann’s expression has turned out to be quite accurate in
different settings, however not so for Coulombic systems, as exemplified
by computations on the He[7−10] and Li[11] atoms.In the past few years, we have presented an improvement of Temple’s
formula for lower bounds of eigenvalues of Hermitian operators Ĥ(12−15) and like Lehmann’s theory it can become as accurate as the
Ritz upper bound estimates.[16] However,
these results were derived through explicit use of a Lanczos basis
set,[17] which depends on a Krylov space,[18] in which one creates basis vectors by repeated
application of the Hamiltonian operator. This does not work for unscreened
Coulomb potentials[19] because the Coulomb
singularity causes the third and higher moments of the Hamiltonian
to diverge.In this paper, we provide an answer to this challenge.
Assuming
an L dimensional subspace of the Hilbert space (as
is standard in the Ritz variational theory), we present a lower bound
expression, which depends only on the Ritz eigenvalues and associated
variances. In contrast to Lehmann’s method, there is no need
to compute the matrix representing Ĥ2 in the chosen space. The theory utilizes Lehmann’s approach
as well as our recent results. To distinguish the present theory from
previous ones, we will refer to it as the Pollak–Martinazzo
(PM) lower bound theory using the abbreviation PM theory.In Section , we
review the known lower bound theories and their simplification when
using Lanczos basis sets. Especially for Lehmann’s theory,
we derive a new and simplified expression of the Lehmann eigenvalues,
which replaces the necessity of knowing the full Ĥ2 matrix with the need to know only the variances of the
Ritz eigenstates which, in turn, thanks to the Lanczos construct,
are available from the Hamiltonian matrix only. We then construct
in Section an L + 1-dimensional auxiliary Hamiltonian matrix whose diagonal
elements are the L Ritz eigenvalues. The additional
dimension is taken to be such that the L + 1 dimensional
matrix is parametrically dependent on one of its eigenvalues, which
can be set at will, and in practice is set formally to one of the
true (unknown) eigenvalues of the Hamiltonian operator. In this construct,
the standard deviations associated with the Ritz eigenvalues turn
out to be coupling elements, which couple the approximate eigenfunctions
associated with the Ritz eigenvalues to the exact eigenfunction associated
with the chosen exact eigenvalue. In Sections and 3.3, we proceed
to show that the resulting eigenvalue expression is consistent with
both the Lehmann and our previous self-consistent lower bound theory
when applied to the L + 1-dimensional auxiliary problem.
The reader interested only in the new lower bound theory and its implementation
can skip these at a first reading.To apply the theory in practice,
it is necessary to determine a
lower bound to one of the roots of the eigenvalue expression. We show
that this is readily obtained by considering how the roots of the
equation change with increasing dimensionality. As a result, there
is a parallel to the so-called Lehmann pole[20] such that the Lehmann pole, which is employed in the “standard”
Lehmann lower bound theory, turns out to be also the pole needed to
construct lower bounds through the new lower bound expression. It
cannot be over stressed that the “PM” theory derived
in this paper depends only on the Ritz eigenvalues and their associated
variances. Eigenfunctions associated with the Ritz eigenvalues are
only needed for computation of the associated variances but there
is no need to compute the full Ĥ2 matrix.In Section , we
apply the resulting theory to the He atom using a scaled Schrödinger
basis set[25] and to the hydrogen atom using
a Gauss Hermite basis set. The resulting lower bounds are superior
to estimates based on Temple’s expression and the “standard”
Lehmann theory, which is not only less accurate but also considerably
more expensive since it is based on computation of the L dimensional matrix of Ĥ2. We
end with a discussion of the advantages, future challenges, and possible
pitfalls in application of the new method to more complex atoms and
molecules.
Short Review of Previous Lower Bound Theories
Framework
We start with a Hamiltonian
operator, whose eigenvalues and eigenstates are denoted asThe eigenvalues are ordered in ascending
order, that is, if j ≤ k,
then ε ≤ ε. The ground state is given the index 1 (rather than
0) to simplify the notation later on. We also assume the existence
of a known orthonormal basis set |Ψ⟩ = 1, 2, ... such that the Hamiltonian operator may be represented
exactly asandTo simplify, we will assume that all functions and associated
overlaps
are real; however, this is not essential; the important property is
that the operator under study is Hermitian. As in any practical computation,
one never has the full Hamiltonian matrix (except for special cases)
but rather its representation in a finite basis set, say the first L states spanning a space . Henceforth,
in order to simplify notation,
we will not indicate the dimensionality L and denote
with the projector
onto and with that onto
its orthogonal complement.The Hamiltonian projected onto the
finite basis set isand
we assume that we know how to diagonalize
this Hamiltonian in such
that it has eigenvalues λ and normalized
eigenfunctions |Φ⟩With each state, we also define a standard
deviation σThe overlap squared
of the jth eigenfunction in
the projected space with the exact kth eigenfunction
is denoted asFor future reference, we note
that the variance may be rewritten
as
Weinstein
and Temple Lower Bound Expressions
Underlying the derivation
of many lower bounds is the following
Cauchy–Schwartz inequalitywhere Q̂ is a projector.
For example, choosinginserting it into the Cauchy–Schwartz
inequality and rearranging gives the inequalityAssuming that a ≥ 1/2 gives the Weinstein lower
boundAs discussed
in ref (15), this assumption
is somewhat less restrictive than the accepted
condition for the validity of the Weinstein lower bound,[4] which is that the Ritz eigenvalue λ is the closest one to the true eigenvalue
ε, that is,To derive the Temple lower bound, we
define with each eigenstate
in the projected space a “residual energy” λ̅ such thatWith this definition, the residual energy may be expressed in terms
of the overlaps and exact eigenvalues aswhere δ is the Kronecker delta.
It is a matter of straightforward algebra
to show from eq thatInserting
this identity into eq and rearranging, one finds the Temple lower bound
expressionIn the form of eq , the previous unknown overlap a has been replaced by the as yet unknown
residual energy λ̅. However,
we have gained something. Consider
the ground-state residual energy. For any k ≥
2, we have, through the initial ordering of the eigenvalues, the property
ε ≥ ε2 so
thatThe Temple
lower bound for the ground state now takes the well-known
formwhere ε2–, which must be greater than λ1, is defined as a
lower bound to the first excited-state energy. This lower bound may
be obtained through a variety of lower bound methods such as the Weinstein,[3] Bazley,[21,22] Miller,[23] and Marmorino[24] methods.
Introduction of the residual energy made it possible to obtain a practical
calculable form of Temple’s lower bound formula.If the
Ritz eigenvalues converge to the exact energies when increasing
the dimensionality of , so
will the Weinstein and Temple lower
bounds since the variance vanishes for exact eigenstates. However,
the convergence will be much slower than the Ritz convergence due
especially to the variances. As noted in ref (8), the integrand of a diagonal
matrix element of the Hamiltonian squared is always positive so that
all errors in the approximate wavefunction add up. This is not the
case for the Hamiltonian itself, where positive and negative errors
tend to cancel each other out, leading to more rapid convergence.
It is this slow convergence of the Weinstein and Temple lower bounds,
as exemplified by precise computations on the He atom,[8] which has hindered their usage.
Self-Consistent
Lower Bound Theory
Instead of using the crude separation
of the Hilbert space as in eq , one may use the projector onto the orthogonal
complement to and insert
it into the Cauchy–Schwartz
inequality (eq ). Upon
rearranging, this leads to an improved lower bound inequalityThis result is superior to Temple’s
lower bound since the expression in square brackets is always greater
than unity. However, the overlaps a are unknown. As shown in our previous papers,[14,15] the key to turning this result into a practical one is the use of
a Lanczos basis set so that the Hamiltonian has the formThis implies that the “complementary
part” of the
Hamiltonian takes the formso that, for example, from eq one findsUsing the identityone finds the needed
relationInserting this into eq gives a practical improved
lower bound expressionOne also notes that the same considerations lead to an improved
Weinstein lower bound. Assuming as before that a ≥ 1/2 but using the projection operator one readily
finds thatwhere the second line is valid only when using
a Lanczos basis set.The implementation of these results and
their improved convergence
properties have been discussed in some detail in refs (14) and (15). The central drawback
is that the expression is derived by using the Lanczos basis set,
which does not exist for Coulombic systems due to the Coulomb singularity.
It is this challenge which is addressed in this paper.
Lehmann Theory
The Temple lower bound
as expressed in eq is based on a particular choice of a basis function, namely, the
eigenfunction of the Ritz eigenvalue. Lehmann noticed that one may
choose a better linear combination of states in the space by solving
a generalized eigenproblem in
this space. The Lehmann equation is, in a form that suits best our
purposes,where κ is the Lehmann eigenvalue and is
the associated Lehmann eigenfunction.
The parameter ρ is known as the Lehmann pole and can be any
real number but a Ritz eigenvalue for the above eigenproblem to be
well-defined. However, for eq to provide lower bounds τ = κ + ρ to the
first L* ≤ L lowest eigenvalues
(as is customarily needed in quantum chemistry calculations), the
sample space must be “good
enough” such
that λ ≤ ε holds, and ρ must be limited by the condition
λ ≤ ρ ≤ ε. Only under such circumstances will eq deliver L* negative Lehmann eigenvalues and these are lower bounds to the
first L* eigenvalues of Ĥ. In practice, then, L* is the highest state for
which the inequality λ ≤
ε holds and ρ is a lower
bound to ε.To understand
the lower bound property of the Lehmann bounds (the τ’s
obtained from the Lehmann eigenvalues according to κ + ρ),
it is useful to introducefor arbitrary and to notice that the Lehmann equation
amounts to the stationary condition of an ordinary Rayleigh–Ritz
quotient involving the resolvent G(ρ) = (Ĥ – ρÎ)−1. Specifically, for |y⟩ arbitrary
in the space ,
we haveif and only if |y⟩
≡ |Y⟩ = (Ĥ –
ρÎ)|Ω⟩ where |Ω⟩
satisfies eq , and
in turnwhere κ–1 is the corresponding
quotient (a Ritz eigenvalue of G(ρ)). Then,
the Courant–Fischer theorem guarantees that the negative values
κ–1 are upper bounds to the exact eigenvalues
(ε – ρ)−1 of G(ρ) from above for ε lower than ρ; if the negative κ’s
are sorted in order of decreasing magnitude, |κ| ≤ |κ| ≤ |κ1|, then τ = ρ + κn is a lower bound to the (L* – n + 1)th eigenvalue left of
ρ, that is the lower bounds are ordered as τ < ε.To see the connection with Temple’s
lower bound, one multiplies eq with the bra ⟨Ω| to find thatwithThe Ritz variational theorem which
underlies the Lehmann construct,
as in eq , shows that
the Lehmann eigenfunction is the function that maximizes Temple’s
lower bound.To summarize thus far, the Lehmann method builds
on the matrices
of Ĥ2 and Ĥ in the space; diagonalization
of eq gives the lower
bound eigenvalues.
The condition that ρ ≤ ε implies that one needs knowledge of a non-trivial lower
bound to the state ε, this could
be a Weinstein- or a Bazley-related lower bound.Interestingly,
when using a Lanczos basis, one does not need to
know the full Ĥ2 matrix in the
projected space but only the variances σ2 associated with the respective Ritz eigenvalues.
To see this, one multiplies eq by the bra ⟨Φ|
to findso thatMultiplying by ⟨Ψ|Φ⟩
givesRearranging and summing
over all k from 1 to L gives an
eigenvalue equation, valid for the Lanczos constructand one notes expressly that the
variances
may be obtained from eq , that is, all the information is in matrix elements of the Hamiltonian
only. The challenge then is to obtain similar results also in the
case of Coulombic potentials where the Lanczos construct is not possible.
Lower Bounds for Coulombic Systems
Hamiltonian Matrix Construct for Lower Bounds
The “Achilles
heel” in the simplifications presented
in the previous section is the need to create a Lanczos basis with
the full Hamiltonian. Of course, for a finite Hamiltonian matrix representation,
any power of the matrix is well defined and does not diverge. As before,
in the projected space , we
assume that the L-dimensional
Hamiltonian matrix is diagonal, with known Ritz eigenvalues and associated
variances. At this point, we do not discuss how these variances are
computed. We then expand the diagonal Hamiltonian matrix with one
additional row and column such that it takes the formwhere ε is for the time being
a parameter.
Notice that, for the sake of clarity, we use a simplified notation
for ; henceforth,
it is understood to have the
dimension (L + 1) × (L + 1)
where .
In this “auxiliary” Hamiltonian
matrix, the standard deviations σ couple the “Ritz states” to the added new state and,
because of the Cauchy interlacing theorem,[26] its eigenvalues x (k = 0, 1,..., L) are interlaced by the
λ’s, that is, x ≤ λ ≤ x. Among these L + 1 new eigenvalues, one will
be the energy ε. To see this, we note that the eigenvalue equation
isThe expression in the square brackets
has to vanish and this implies thatand clearly one solution is x = ε. The other L eigenvalues[1]x, j = 1, ..., L are the solutions of the
remaining polynomial equationNotice the interesting symmetry: if x is an eigenvalue of other than ε, then ε
is an
eigenvalue of other than x.As we shall show below, eq lies at the heart of PM theory.
Lower estimates on the x poles will lead under suitable
conditions to lower bounds to the eigenvalue under consideration.
For example, when interested in the ground-state energy ε1, we find that one should expect that the first root x1 ≥ε2 so that a lower
bound on the first excited-state energy ε2 will give
a lower bound to the ground-state energy, provided, of course, that
the Ritz eigenvalue for the ground state is lower than the lower bound
for the first excited state. Note the formal similarity between eq and the Lanczos-based
Lehmann equation for the lower bound derived in eq . It is in this sense that PM theory generalizes
Lehmann theory without the need to use a Lanczos basis set.The eigenvalue eq may also be rewritten asfrom which one finds thator in other words the eigenvalues
of are increasing functions of the
energy
parameter ε.Examination of eq might suggest that this monotonicity is
only in intervals since
one encounters an infinity as either x or ε
go through a Ritz eigenvalue. In reality, there is no discontinuity
as ε comes close to a Ritz eigenvalue; the same will happen
to all the roots x of eq except one. The result is that the roots can be arranged
to define functions x((ε) that are continuous on the whole real axis except for a
single pole singularity at a Ritz eigenvalue λ and which are monotonically increasing in each connected sub-domain
(−∞, λ), (λ, +∞). This property is discussed
further in detail in the Appendix where
it is shown that the above-mentioned singularity is harmless for the
method described below.Monotonocity has far-reaching implications.
Let us set ε
= ε1 and consider the limiting situation where the
Ritz eigenvalues λ approach the
exact eigenenergies ε and have
thus vanishingly small variances σ2. We then choose the lowest L eigenvalues
to construct the space
and the auxiliary matrix. In this
limit, the matrix has L eigenvalues matching
the Ritz values and one eigenvalue which diverges to +∞ since
the (L + 1)th diagonal entry of the auxiliary
matrix causes to diverge.
Having chosen one of the eigenvalues
of to be ε1, then,
necessarily,
all other roots of the eigenvalue eq are such that x(ε1) → λ. Moreover, this occurs from below because of the interlacing
theorem (λ ≤ x(ε1) ≤ λ). In addition, since with increasing accuracy
each λ tends to ε from above the same holds for the roots x(ε1), that
is, ε ≤ x(ε1) in this limit.Now, we can revert the argument. Suppose that the calculation is
sufficiently converged such that λ ≤ ε and that we know
a rough lower bound (yet greater than λ) to the (k + 1)th exact energy, call it ε–. We know that there
exists an ε such that has ε– among its eigenvalues x(ε). Indeed, thanks
to the symmetry
of eq , such an ε
is just the lowest eigenvalue of and—this is the key point—since x(ε) is monotonically increasing, ε is guaranteed
to be left of (i.e., lower than) ε1. In other words,
we have managed to convert a lower bound to the (k + 1)th energy into a lower bound to the ground-state energy. This
is further shown in Figure for the case k = 1. The question of whether
this “map” produces a tight (hence useful) bound will
be addressed numerically below, where it will be shown to be indeed
the case.
Figure 1
Diagram showing the use of the auxiliary matrix . The top part (blue) of the figure
shows
the spectrum of the matrix when the energy parameter ε = ε1 and the bottom (in green) when the energy parameter is chosen
as a lower bound to the first excited-state energy (ε = ε2–). Red dots indicate the positions of the
exact energy levels ε and black
vertical bars that of the Ritz eigenvalues λ. Note that x1(ε2–) is a lower bound to ε1 and x1(ε1) is an upper bound to
ε2.
Diagram showing the use of the auxiliary matrix . The top part (blue) of the figure
shows
the spectrum of the matrix when the energy parameter ε = ε1 and the bottom (in green) when the energy parameter is chosen
as a lower bound to the first excited-state energy (ε = ε2–). Red dots indicate the positions of the
exact energy levels ε and black
vertical bars that of the Ritz eigenvalues λ. Note that x1(ε2–) is a lower bound to ε1 and x1(ε1) is an upper bound to
ε2.The condition x(ε1) >
ε deserves some further
comments since it is the key to obtaining lower bounds to the ground
state. For definiteness, let us focus on the case k = 1 considered in the applications of Section . As shown in the Appendix, it is the degree of convergence of the ground-state Ritz eigenvalue
that determines the closeness of x1 to
λ2 irrespective of whether or not λ2 is close to ε2. Hence, if λ1 is
reasonably close to ε1, then x1 should be close enough to λ2 to guarantee
that x1 > ε2. As demonstrated
for the computations of lower bounds for the He and H atoms, this
property is verifiable by considering the dependence of x1 on the dimensionality L of the computation.
If, and this is the typical case, it is a monotonically decreasing
function of the dimensionality, then due to the limit that ultimately
λ2 → ε2, the eigenvalue x1 is guaranteed to lie above ε2. The fact that ε1 is not known is not critical
since the property of a monotonically decreasing value of x1 with increasing dimensionality will hold for
a range of ε values close to ε1. It is this
added property that distinguishes PM theory from Lehmann theory when
the Lanczos construction is not exploited or not possible. When the
latter is used, the ordering x1(ε1) > ε2 always holds and the two theories
become equivalent to each other. In fact, if we had x1(ε1) < ε2, we could
choose a pole ρ in eq larger than x1 and yet below
ε2 (x1 ≤ ρ
≤ ε2) and bound in this way the ground state
from below. However, this is clearly impossible by virtue of monotonicity
(eq ) since ε1 < y1(ρ) where y1 is the inverse function of x1.These same considerations can be generalized
to excited states.
For the sake of clarity, let us focus on the first excited state ε2 and set ε = ε2 in the auxiliary matrix.
In the limiting situation considered above, the eigenvalues x(ε2) approach
the corresponding Ritz values from below, but now, due to the interleaving
theorem and our ordering of the eigenvalues, x1(ε2) ≤ λ1 and x(ε2) ≤
λ for 2 < k ≤ L. The lowest root—x1(ε2) is a lower bound to the ground
state since, according to the above, we know that ε2 < x1(ε1) and we
have to move x1 leftward to match ε2. The remaining eigenvalues, on the other hand, are upper
bounds to the states higher than ε2 for the very
same reasons given above. Hence, even a crude lower bound ε– can be converted into
a lower bound to ε2–. As is usual
when using a finite basis set, the quality of the Ritz eigenvalues
deteriorates as one “goes up the eigenvalue ladder”.
There will be some value L* above which the interleaving
property of the Ritz and exact eigenvalues is no longer valid. However,
the upper bound quality of the roots of the auxiliary Hamiltonian
remains up to L*, that is, x ≥ ε. In practice, then, one can use the highest index (k + 1 = L*) for which ε ≥ λ holds and use a lower bound ε– (yet such that ε– ≥ λ) to obtain lower bounds to all the lower lying states.
This is the analogue of the pole in Lehmann theory.The practical
implementation of the PM theory parallels the practical
implementation of the Lehmann theory. One starts with a valid lower
bound to the Lehmann pole. Then, one computes all roots of eq which are below the
lower bound to the Lehmann pole and these will be lower bounds to
the respective eigenvalues.Apart from the increased lower bound
accuracy obtained through
the PM method, we note that the computational expense may be lower
than the effort involved in computing the “standard”
Lehmann lower bound. For the Lehmann equation, one needs the full Ĥ2 matrix. For the PM method, one only
needs the variances associated with the Ritz eigenfunctions. This
implies that if |Φ⟩ is an
eigenfunction, one can compute directly diagonal matrix elements of
the sort ⟨Φ|Ĥ2|Φ⟩ and there
is no need to first compute the Ĥ2 matrix.In the next section, we will give a numerical example
which shows
that the present theory gives improved lower bounds for the excited
states as compared to any other existing method.
Lehmann Eigenvalue Equation for the Hamiltonian
of Equation
The result of the previous subsection has the flavor of the Lehmann
lower bound and now we will show that indeed it is identical to it
provided that one uses in the Lehmann eq the matrix Hamiltonian (eq ) instead of the full Hamiltonian Ĥ. For this purpose, it is expedient to introduce
an auxiliary vector
|Ψ⊥⟩ (which is only required to be
orthogonal to the |Φ⟩ vectors,
for k = 1, ..., L) and interpret
the matrix as the representation of an operator K̂(ε) acting in the enlarged, L + 1-dimensional space built with the |Φ⟩’s and |Ψ⊥⟩.
We then repeat the derivation of the Lehmann lower bound expression
starting from the eigenvalue eq , replacing the full Hamiltonian Ĥ with the Hamiltonian of eq . By definitionandThe Lehmann eq remains as before. Multiplying
it by the
bra ⟨Φ| and rearranging
givesWe then use the notationso thatInserting this into left-hand
side of eq leads
to the eigenvalue equationComparing
with eq we identify
κ + ρ with ε and ρ with x. This Lehmann equation is exact for the Hamiltonian K̂(ε) and is valid also for Coulombic systems.
In other words, the PM method is identical to the Lehmann method for
systems with Lanczos basis functions as may be inferred by comparing
the PM equation and the Lehmann eigenvalue equation for Lanczos systems
as in eq . If however
one uses the Lehmann eq for systems that cannot use the Lanczos construct such as Coulombic
systems, then the PM method gives superior results as shall be exemplified
below. However, the PM method is identical to Lehmann’s equation
provided that one uses it with the matrix Hamiltonian rather than
the full Hamiltonian.This then complements the previous proof
of how one obtains lower
bounds. Let us suppose that, as before, L* is the
highest state for which the interleaving property of the Ritz and
exact eigenvalues holds. We then use ε– as the Lehmann pole. The lower bound property
of the Lehmann equation remains valid so that we know thatFinally, before closing this subsection, let
us mention two further
results that are instrumental to the next one. First, we notice that
in the enlarged space, the eigenvector of the auxiliary Hamiltonian K̂(ε) with eigenvalue ε takes the simple
formas can be readily verified by inspection.
Second, with some straightforward algebra, it is possible to recast
the solutions for the eigenvalues other than ε asand to
write the associated eigenvectors in
a formthat
closely parallels eq . We will make use of these expressions in
the next section where we shed light on the relationship between eq and our recent findings.[14−16]
Improved Self-Consistent Lower Bound Theory
Using the Hamiltonian of Equation
We may now also show that our previous improvements
of Temple’s theory as described in refs[14,16] and eq are also identical to eq . Choosing the parameter
ε to be the k-th eigenvalue of the Hamiltonian
(Ĥ), the eigenfunction of the Hamiltonian K̂(ε) associated with this eigenvalue is denoted
as |ε⟩ and the eigenvectors
associated with the remaining roots of eq as |x⟩ (see eqs and 3.16, respectively). Similar to the
development in Section of this paper, we may rewrite each Ritz eigenvalue aswhere
the residual energy of the Hamiltonian K̂(ε)
becomesWe then note that for the
extended
Hilbert space of the Hamiltonian K̂(ε),
we have the identitybut equivalentlyso that with our constructIn view
of eqs and 3.20, we derive the analogue of the Lanczos
relation of eq and in
view of eqPutting this all into eq and rearranging, we
get the identityand
this has the same form as eq without invoking a Lanczos basis
set. To turn this into a practical expression, it is necessary to
estimate the residual energy as defined in eq , which differs from the residual energy
as defined in eq .
For this purpose, one needs lower bounds to the roots x and this could follow the same procedure
as above using the improved Weinstein lower bounds, which now may
be also derived by assuming as in Weinstein theory that ⟨ε|Φ⟩2 ≥ 1/2 to find thatAlternatively, one could use
the Bazley-related lower bounds to
the corresponding eigenvalues of the Hamiltonian.To summarize
this section, we have two main results. The first
and most important one is the practical one. Given the “Lehmann
pole”, we obtain lower bounds to eigenvalues without the need
to compute the full Ĥ2 matrix;
all one needs are diagonal elements of it. Second, we have demonstrated
the identity of the Lehmann lower bound expression with the self-consistent
Temple lower bound expression and both are in principle exact. It
remains to show that this methodology gives lower bounds for Coulombic
systems that are superior to those obtained through the “standard”
Lehmann methods. This is demonstrated in the next section.
Applications
He Atom Basis Set
To demonstrate
the practicality of the theory, we consider first the ground state
of the He atom. Our initial normalized function will bewith r1, r2 the distances of the electrons from the nucleus
and the variational parameter α was chosen in all the computations
as the value α = 27/16 which minimizes ⟨Ψ1|Ĥ|Ψ1⟩. The basis
set was constructed using the scaled Schrödinger approach of
Nakatsuji.[25] The scaling function (with r12 the distance between the electrons) was chosen
to beas this was the easiest one to manipulate
and compute using Maple. The Hamiltonian isand following Hylleraas,[27] the kinetic
energy operator isThe volume integral isIf as is often the case
thatso that the volume integral may be simplified
toAs already mentioned, the basis set is constructed using the scaled
Schrödinger equation. Thus, the second normalized function
will bewithThe third function
will then bewithand one continues in this fashion to build
up the basis set. Note explicitly that although constructed similarly
to the Lanczos algorithm, this procedure does not lead to a tridiagonal
representation of the Hamiltonian.In our computation, due to
our use of Maple, we were limited to
small dimensionality. Even for the seven-dimensional computation,
the second lowest eigenvalue is higher than ε3 so
that the lower bound computation is limited to the ground state. The
same holds true of course for the Lehmann lower bound where even at L = 7, one finds only one negative eigenvalue which gives
the lower bound to the ground-state energy.
Lower
Bounds to the Ground-State Energy of
He
The lower-bound property is based on the observation that
the solutions of eq are monotonically decreasing functions of the dimensionality. This
is shown for the specific case of the Helium atom with our chosen
basis set in Figure where we plot the dependence of x1(ε1) on the dimensionality of the basis set as it changes from
3 to 7. The monotonic decrease is the same when changing the argument
in the vicinity of the ground-state energy. One does not need to know
the exact ground-state energy to ascertain that the eigenvalue decreases
with increasing dimensionality. As described in Section , it is the observation that the eigenvalue x1 is “trapped” between the Ritz
eigenvalue λ2 and the exact eigenstate ε2, which allows us to replace x1 in eq with ε2 or a lower bound to it such that the resulting lowest root
of eq will be a lower
bound to the ground-state energy.
Figure 2
Dependence of the eigenvalue x1(ε1) on the dimensionality L of the basis set
used for computation of the Helium atom energies (solid red line).
Shown also is the second Ritz eigenvalue λ2 as a
function of L (long dashed purple line); it is always
larger than x1(ε1) and
both slowly converge toward the energy ε2 (shown
as the horizontal dashed–dotted blue line) from above. This
demonstrates that one may use the energy ε2 as a
lower bound to x1(ε1)
and thus obtain lower bounds to the ground-state energy.
Dependence of the eigenvalue x1(ε1) on the dimensionality L of the basis set
used for computation of the Helium atom energies (solid red line).
Shown also is the second Ritz eigenvalue λ2 as a
function of L (long dashed purple line); it is always
larger than x1(ε1) and
both slowly converge toward the energy ε2 (shown
as the horizontal dashed–dotted blue line) from above. This
demonstrates that one may use the energy ε2 as a
lower bound to x1(ε1)
and thus obtain lower bounds to the ground-state energy.To test the new theory and compare it with Lehmann, we use
the
following strategy. For the “standard” Lehmann computation,
we choose the best possible Lehmann pole: ε2 = −2.14597405.
This is of course an idealization, typically if one knows one state,
one knows the other and certainly if ε2 is known
then so is ε1. In a “realistic” scenario,
the Lehmann pole for the first excited state would be given by a Weinstein-
or Bazley-type lower bound, but for the sake of understanding the
new theory without adding in other sources of approximate values,
we make this choice. Similarly, the value of ε2 was
used to obtain the Temple lower bound as well as the PM lower bound
derived from x1(ε2).
The resulting lower bounds as well as the Ritz upper bounds are shown
in Figure . One notes
the essential improvement of the PM lower bound (solid blue line),
which becomes competitive with the Ritz upper bound when the dimensionality
reaches 7. This is further exemplified in Figure where the gap ratio of the lower bound to
the upper bound is
plotted as a function of dimensionality
for the Lehmann (upper blue dashed line) and PM (lower solid red line)
lower bounds. At L = 7, the PM gap ratio is 1.03.
Figure 3
Lower
bounds for the ground-state energy of the Helium atom as
functions of the dimensionality of the basis set. The lowest (brown)
long dashed line is the Temple lower bound, the dashed (orange) line
is the Lehmann lower bound, and the solid blue line is the present
PM lower bound. The black dotted line is the exact ground-state energy
and the upper dashed–dotted (green) line is the Ritz eigenvalue
for the ground state. Note the essential improvement of the lower
bound obtained using the present PM theory.
Figure 4
Gap ratios
for the Lehmann and PM lower bounds for the Helium atom.
Note that the PM lower bounds are roughly as accurate as the Ritz
upper bounds, while the Lehmann lower bounds do not come close.
Lower
bounds for the ground-state energy of the Helium atom as
functions of the dimensionality of the basis set. The lowest (brown)
long dashed line is the Temple lower bound, the dashed (orange) line
is the Lehmann lower bound, and the solid blue line is the present
PM lower bound. The black dotted line is the exact ground-state energy
and the upper dashed–dotted (green) line is the Ritz eigenvalue
for the ground state. Note the essential improvement of the lower
bound obtained using the present PM theory.Gap ratios
for the Lehmann and PM lower bounds for the Helium atom.
Note that the PM lower bounds are roughly as accurate as the Ritz
upper bounds, while the Lehmann lower bounds do not come close.One notes that in the range 3 ≤ L ≤
7 the Ritz upper bound and the Temple and Lehmann lower bounds hardly
change. The basis set we chose is not optimal in this sense, but critically,
as the dimension is increased, the Ritz upper bounds to the excited
states improve significantly, as may be seen for the first excited
state in Figure .
It is this improvement in the Ritz eigenvalues and variances of the
excited states which leads to the significant improvement of the PM
lower bound with dimensionality as shown in Figure .
H Atom
The energy
levels of the hydrogen
atom are known analytically, yet it serves as a good “playground”
for studying lower bounds for this simplest of Coulombic systems.
The Hamiltonian for the hydrogen atom is in atomic units (r is the electron proton distance)The ground-state wavefunction isand the ground-state energy
isTo test the lower bound expressions,
we used the normalized antisymmetric harmonic oscillator basis setsuch thatand H((r) is the kth order Hermite
polynomial. This allows us to readily set up the Hamiltonian and Hamiltonian
squared matrices and so test the various lower bound theories.As in the case of the Helium atom, we show in Figure that x1(ε1) is a monotonically decreasing function
of the dimensionality, ultimately going down to ε2. Here, we plot both the difference between the second Ritz eigenvalue
and x1(ε1) (top, brown
long dashed line) and the difference between x1(ε1) and the second excited-state energy
(−1/8) (bottom red solid line) as functions of the dimensionality.
The dotted–dashed blue line is 0, which should be the limit
of both lines with increasing dimensionality. For all values, the
pole x1(ε) is in between the Ritz
eigenvalue λ2 and the exact energy ε2.
Figure 5
Dependence of the eigenvalue x1(ε1) on the dimensionality L of the computation
for the Hydrogen atom using a Gauss Hermite basis set. The upper dashed
(purple) line shows the difference λ2 – x1(ε1) between the Ritz upper
bound for the second state and the eigenvalue x1(ε1) while the lower (red) solid line shows
the difference x1(ε1)
– ε2 between the eigenvalue x1(ε1) and the exact second-state energy
ε2. Both lines are always positive, demonstrating
that the second-state energy is indeed a lower bound to the eigenvalue x1(ε1).
Dependence of the eigenvalue x1(ε1) on the dimensionality L of the computation
for the Hydrogen atom using a Gauss Hermite basis set. The upper dashed
(purple) line shows the difference λ2 – x1(ε1) between the Ritz upper
bound for the second state and the eigenvalue x1(ε1) while the lower (red) solid line shows
the difference x1(ε1)
– ε2 between the eigenvalue x1(ε1) and the exact second-state energy
ε2. Both lines are always positive, demonstrating
that the second-state energy is indeed a lower bound to the eigenvalue x1(ε1).Then, we compute as a function of L the Temple,
Lehmann, and PM lower bounds, in all of them using ε2 as the Lehmann pole energy. The results are shown in Figure and one notices that PM theory
is again superior to the other lower bounds. In Figure , we plot the gap ratios for the Lehmann
and PM lower bounds; at its worst, the PM gap ratio is ca. 23 and
one sees that it improves significantly with the increasing dimensionality
of the basis set, reaching 3.05 when L = 70.
Figure 6
Lower bounds
for the ground-state energy of the hydrogen atom as
functions of the dimensionality of the basis set. The top dashed–dotted
(blue) line is the Ritz upper bound, the (black) horizontal dotted
line shows the exact ground-state eigenvalue, the solid (red) line
is the present PM lower bound, the long dashed (brown) line is the
Lehmann lower bound, and the dashed (green) line is the Temple lower
bound. Note the superiority of the PM lower bound. The various bounds
were computed for L = 3, 4, ..., 14, 15 and then
for L = 20, 25, 30, 35, 40, 45, 50, 60, 70. The small
oscillations at the lower dimensionality are a reflection of the basis
set chosen. Adding in a new even function (k even
in eq ) improves the
Ritz upper bounds more than adding another odd function.
Figure 7
Gap ratios for the Lehmann and PM lower bounds for the hydrogen
atom. The upper dotted line is for the Lehmann lower bound and the
lower dashed line is the PM lower bound gap ratio. Note that the gap
ratios of the PM lower bounds are substantially lower than those of
the Lehmann lower bounds. At L = 70, the error in
the Ritz upper bound is 0.00033, while the PM gap ratio is 3.05 and
the Lehmann gap ratio is 6.93.
Lower bounds
for the ground-state energy of the hydrogen atom as
functions of the dimensionality of the basis set. The top dashed–dotted
(blue) line is the Ritz upper bound, the (black) horizontal dotted
line shows the exact ground-state eigenvalue, the solid (red) line
is the present PM lower bound, the long dashed (brown) line is the
Lehmann lower bound, and the dashed (green) line is the Temple lower
bound. Note the superiority of the PM lower bound. The various bounds
were computed for L = 3, 4, ..., 14, 15 and then
for L = 20, 25, 30, 35, 40, 45, 50, 60, 70. The small
oscillations at the lower dimensionality are a reflection of the basis
set chosen. Adding in a new even function (k even
in eq ) improves the
Ritz upper bounds more than adding another odd function.Gap ratios for the Lehmann and PM lower bounds for the hydrogen
atom. The upper dotted line is for the Lehmann lower bound and the
lower dashed line is the PM lower bound gap ratio. Note that the gap
ratios of the PM lower bounds are substantially lower than those of
the Lehmann lower bounds. At L = 70, the error in
the Ritz upper bound is 0.00033, while the PM gap ratio is 3.05 and
the Lehmann gap ratio is 6.93.With our choice of basis set, for L ≥ 12
the second eigenvalue has the property that ε2 ≤
λ2 ≤ ε3 so that from L = 12, using ε3 as the “Lehmann
pole”, one will get lower bounds for the ground and first excited
state. This is shown in Figure where the two horizontal lines are the ground and first excited-state
energies while the lower dashed line is the PM lower bound to the
ground state and the solid red line the PM lower bound to the second
state. Comparing with Figure , one notes that the ground-state lower bound here is not
as good as the one obtained with ε2 as the Lehmann
pole. The reason is quite clear; the Ritz eigenvalue λ2 converges more rapidly than λ3 so that the lower
bound to x2(ε1) given
by ε3 is worse than the lower bound of ε2 and compared to x1(ε2). However, as may be seen from the plot, one is getting a
rather “decent” lower bound for the second state −0.1314734
at L = 70 as compared to −0.125. Not as good
as the Ritz upper bound (−0.124108), the gap ratio at L = 70 is 36.45, but the PM lower bound for the first excited
state is much more accurate than the Lehmann lower bound, which is
−0.166806 with a gap ratio of 235.4 under the same conditions.
Figure 8
Lower
bounds for the first excited state of the Hydrogen atom using
ε3 = −1/18 as the Lehmann pole. The horizontal
lines show the numerically exact first and second eigenvalues (−1/2,
−1/8), the dashed lower (orange) line is the PM lower bound
for the ground state, and the solid (red) line is the PM lower bound
for the second state. The (blue) dashed–dotted line shows the
second Ritz eigenvalue, which bounds the second state from above.
As noted in the text, the lower bound shown here for the second state
is superior to the same obtained from the “standard”
Lehmann theory.
Lower
bounds for the first excited state of the Hydrogen atom using
ε3 = −1/18 as the Lehmann pole. The horizontal
lines show the numerically exact first and second eigenvalues (−1/2,
−1/8), the dashed lower (orange) line is the PM lower bound
for the ground state, and the solid (red) line is the PM lower bound
for the second state. The (blue) dashed–dotted line shows the
second Ritz eigenvalue, which bounds the second state from above.
As noted in the text, the lower bound shown here for the second state
is superior to the same obtained from the “standard”
Lehmann theory.
Discussion
This work presents significant progress in lower bound theoryA simple
polynomial equation has been
derived for the Lehmann lower bound based on the use of a Lanczos
basis set.Lower bound
theories were shown to
have their origin in a generalized Cauchy–Schwartz inequality.Using a finite Hamiltonian
representation
which is guaranteed to have one eigenvalue of the operator Hamiltonian,
the polynomial equation derived for the Lehmann lower bound based
on the Lanczos construct was generalized and shown to be valid even
when the Lanczos structure cannot be constructed as in Coulombic systems.The computational expense
was significantly
reduced since only Ritz eigenvalues and their associated standard
deviations are needed to construct the lower bounds instead of the
construction of the full matrix of Ĥ2 as in “standard” Lehmann theory.The same derivation showed that both
the Lehmann and the recent self-consistent lower bound method developed
by the authors[14,15] are within the present context
of the finite Hamiltonian construct, identical.The resulting PM theory was shown to
be robust for the hydrogen and helium atoms and superior to any of
the other lower bound theories. Lower bounds for Coulombic systems
were demonstrated to have accuracy similar in quality to the Ritz
upper bounds. The PM lower bound theory was implemented also for an
excited state.However, there remain
difficult challenges ahead. Obtaining a Ritz
upper bound is easy and straightforward. All that is needed is to
construct the Hamiltonian matrix and diagonalize it. Deriving lower
bounds is more complex. Even within the present simplified framework,
one still needs to verify that the x’s are monotonically decreasing functions
of the dimensionality of the basis set used and it is necessary to
compute the standard deviations associated with the Ritz eigenvalues.
This implies the need to compute not only the Ritz eigenvalues but
also their eigenfunctions. Perhaps, and this will be considered in
future computations, it is not necessary to obtain the full orthogonal
diagonalization matrix but only those of the first few dozen Ritz
states to construct the matrix Hamiltonian and still obtain “good”
lower bounds. But it will still be necessary to obtain the associated
eigenfunctions of these lowest lying states, increasing the numerical
cost of the computations.However, the real challenge is not
obtaining the eigenfunctions.
A critical element in the theory is obtaining an accurate lower bound
to the so-called “Lehmann pole”. In the present paper,
we took the easy road, by using the known excited-state energies for
the helium and hydrogen atoms. As stressed in the paper, the reason
we did this was to provide a fair and unbiased comparison of the different
lower bound theories. We saw that the present PM theory is superior
to any other, yet the quality of the lower bound depends critically
on the choice of the Lehmann pole. A “standard” methodology
would be to use the Weinstein lower bound. For Hubbard-like Hamiltonians,
we have shown[14] that this choice is sufficient
for obtaining tight lower bounds. The same is true for atoms as long
as one is considering the ground state. However, the rapid reduction
of the level spacing between excited states, as is the case in hydrogen,
helium, and lithium,[11] presents a serious
challenge. Although we have shown how to improve upon the Weinstein
lower bound, as for example in eq , the implementation is based on the assumption that
the diagonal overlap matrix element squared is greater than 1/2. The
challenge is to know when this assumption holds. In the “standard”
Weinstein theory, one must consider the corresponding Ritz eigenvalue
and know that it is the closest to the true eigenstate of the Hamiltonian
under consideration. This is especially difficult when eigenstates
come close together as for excited electronic states of atoms. Here,
one must show that the Ritz eigenvalue is closest to the eigenvalue
of the matrix Hamiltonian under consideration. These eigenvalues do
not necessarily bunch together and this is an advantage; however,
one does need to construct an objective criterion which would enable
knowing that the overlap condition is valid, and this is not a trivial
task.Another challenge is understanding the choice of the basis
set.
Even the Ritz upper bound depends on the choice of the basis set.
Different basis sets could give different Ritz upper bounds and PM
lower bounds. For example, even in the present application of the
theory to He, we made a specific choice of the exponent α in
the initial wavefunction. In principle, for a given dimension of the
basis set, one could vary α to minimize the Ritz eigenvalue
for the ground state. One could also consider maximizing the PM lower
bound for the ground state via variation of α. The two variations
need not give the same value of the parameter. Different values imply
different basis sets. The question of the “best” basis
set remains open to both analytical as well as numerical research.