We study the tree-tensor-network-state (TTNS) method with variable tensor orders for quantum chemistry. TTNS is a variational method to efficiently approximate complete active space (CAS) configuration interaction (CI) wave functions in a tensor product form. TTNS can be considered as a higher order generalization of the matrix product state (MPS) method. The MPS wave function is formulated as products of matrices in a multiparticle basis spanning a truncated Hilbert space of the original CAS-CI problem. These matrices belong to active orbitals organized in a one-dimensional array, while tensors in TTNS are defined upon a tree-like arrangement of the same orbitals. The tree-structure is advantageous since the distance between two arbitrary orbitals in the tree scales only logarithmically with the number of orbitals N, whereas the scaling is linear in the MPS array. It is found to be beneficial from the computational costs point of view to keep strongly correlated orbitals in close vicinity in both arrangements; therefore, the TTNS ansatz is better suited for multireference problems with numerous highly correlated orbitals. To exploit the advantages of TTNS a novel algorithm is designed to optimize the tree tensor network topology based on quantum information theory and entanglement. The superior performance of the TTNS method is illustrated on the ionic-neutral avoided crossing of LiF. It is also shown that the avoided crossing of LiF can be localized using only ground state properties, namely one-orbital entanglement.
We study the tree-tensor-network-state (TTNS) method with variable tensor orders for quantum chemistry. TTNS is a variational method to efficiently approximate complete active space (CAS) configuration interaction (CI) wave functions in a tensor product form. TTNS can be considered as a higher order generalization of the matrix product state (MPS) method. The MPS wave function is formulated as products of matrices in a multiparticle basis spanning a truncated Hilbert space of the original CAS-CI problem. These matrices belong to active orbitals organized in a one-dimensional array, while tensors in TTNS are defined upon a tree-like arrangement of the same orbitals. The tree-structure is advantageous since the distance between two arbitrary orbitals in the tree scales only logarithmically with the number of orbitals N, whereas the scaling is linear in the MPS array. It is found to be beneficial from the computational costs point of view to keep strongly correlated orbitals in close vicinity in both arrangements; therefore, the TTNS ansatz is better suited for multireference problems with numerous highly correlated orbitals. To exploit the advantages of TTNS a novel algorithm is designed to optimize the tree tensor network topology based on quantum information theory and entanglement. The superior performance of the TTNS method is illustrated on the ionic-neutral avoided crossing of LiF. It is also shown that the avoided crossing of LiF can be localized using only ground state properties, namely one-orbital entanglement.
It has been more than
a decade ago that the quantum chemistry version
of the density matrix renormalization group (QC-DMRG) method[1,2] has been applied to study the ionic-neutral curve crossing of LiF
in order to demonstrate that it provides a globally accurate description
of the system even if the wave function changes dramatically transversing
the avoided crossing.[3] In the following
years, various theoretical studies have been devoted to investigate
dissociation curves in diatomic molecules using QC-DMRG,[4−8] and by now the method has become a rival to the conventional multiconfiguration
wave function approaches.[9−11] Inclusion of the concepts of
entanglement from quantum information theory (QIT)[12−15] has paved the road for identifying
highly correlated molecular orbitals leading to an efficient construction
of active spaces[12,16] and for characterizing the various
types of correlation effects relevant for chemical bonding.[8,17]In the mean time, a reformulation of DMRG in terms of so-called
matrix product states (MPS)[18−21] has shown that it is only one special case in a much
more general set of methods: the so-called tensor network states (TNS),[19,22−32] which in certain cases is expected to even outperform QC-DMRG in
the near future.[33,34] A special form of TNS, the tree
tensor network states (TTNS) approach,[35−38] was first applied in quantum
chemistry by some of us[33] to present the
underlying theoretical background and scaling properties of the QC-TTNS
algorithm, while an efficient extension of the two-site QC-DMRG using
the tree-like topology has been applied recently to dendrimers.[34] In this latter work, a novel half-renormalization
scheme has also been introduced in order to reduce computational cost
related to the diagonalization of the effective Hamiltonian. The TTNS
approach also plays a fundamental role in hierarchical tensor decompositions,
recently developed for tensor product approximation,[31] see e.g. refs (29 and 30).Unlike models with translational symmetry studied usually
in condensed
matter physics, the orbital entanglement is nonconstant in the quantum
chemical applications. Therefore, the optimal arrangement of matrices
in MPS-based approaches has a tremendous effect on their performance
and on the required computational resources to reach a given accuracy.[12,15,39] Similar optimization strategies
to find the best tensor topology are also crucial in the case of the
TTNS algorithm.[15,34]In this paper, we discuss
the most general version of the one-site
QC-TTNS algorithm in which the local properties of the tensors can
be different for each orbital. By studying the ionic-neutral potential
curve crossing of the LiF we present an optimization strategy to set
up tensor topologies which reflect the structure of the entanglement
bonds between the molecular orbitals as the bond length between the
Li and F is stretched. This problem is complicated enough to show
that the optimization of the tree topology and the sweeping procedure
is far from trivial and requires much more care than in the case of
the QC-MPS. Therefore, we also compare the MPS(DMRG) and TTNS convergence
properties by changing the order of the tensors only. In our study,
we calculate excited states as well, i.e., the two lowest 1Σ+ states of LiF and the corresponding one- and
two-orbital entropy functions.[8,14] We use this example
to demonstrate that the TTNS wave function is very stable even when
the wave function changes dramatically transversing through an avoided
crossing. The localization of the avoided crossing in terms of orbital
entropy is also discussed.The setup of the paper is as follows.
In Section II we briefly describe the main steps
of the QC-TTNS algorithm
and the details of the numerical procedure used to determine the optimal
tensor arrangements. Section III contains the
numerical results and analysis of the observed trends of the numerical
error. The summary of our conclusions is presented in Section IV.
Numerical Procedure
Hamiltonian and Target States
In the
QC-DMRG and QC-TTNS applications, the electron–electron correlation
is taken into account by an iterative procedure that minimizes the
Rayleigh quotient corresponding to the Hamiltonian of the system:This Hamiltonian
determines the exact states
of the given molecule. In eq 1, c† and c create and annihilate an electron with spin
σ, respectively. T denotes the matrix elements of the one-particle Hamiltonian, which
is comprised of the kinetic energy and the external electric field
of the nuclei, and V stands for the matrix elements of the electron repulsion operator,
defined asThe matrix elements T and V are expressed in a molecular orbital
(MO) basis obtained by
CASSCF optimizations. The benchmark energies are computed with the
same set of MO’s. (For more details, see section Basis states.)In the present version of
our TTNS method non-Abelian symmetries[40−44] are not implemented yet, thus in order to specify
the eigenstates we have fixed the number of electrons with up and
down spins and shifted the triplet levels from the low lying spectrum
by adding a term ∑δ(S+S– + S–S+) with S+ = c†c and δ = 1 to the Hamiltonian given in eq 1. Nevertheless, we have also checked the total spin
of each state by calculating the expectation value of the S2 = ∑S–S+ + ∑SS + ∑S operator which is equal
to S(S + 1) in Hartree atomic units,
i.e., zero for a singlet state and two for a triplet state.In the MPS-based approaches, several eigenstates can be calculated
within a single calculation. Therefore, we have formed the reduced
density matrix of the target state, ρ, from the reduced density
matrices of the lowest n eigenstates as ρ =
∑γpγργ with γ = 1...n and pγ = 1/n at each bond length. In
the case of our QC-DMRG code the orbital spatial symmetry of the target
state can also be fixed[39] in which case
the first two lowest lying 1Σ+ states
can be calculated directly by using only n = 2 eigenstates
of the related subspace of the Hamiltonian. In the present version
of our TTNS method, however, orbital spatial symmetry is not implemented
yet. Therefore, we had to target the four lowest lying states with n = 4 due to the fact that there are two additional eigenstates
between the two lowest 1Σ+ states. For
more detailed descriptions of target states we refer to the original
works and reviews.[1,3,45]
Basis States
Atomic orbital (AO) basis
was adopted from the work of Bauschlicher and Langhoff[46] in order to match with previous DMRG computations.[3] The AO basis set of ref .[46] is suitable to describe
the ionic and covalent LiF states as well. It consists of 9s and 4p
functions contracted to 4s and 2p functions on the Li atom and 9s,
6p, and 1d functions contracted to 4s, 3p, and 1d on the F atom. For
more details of the AO basis set we refer to the original publication.[46]The two lowest 1Σ+ states of LiF around the equilibrium bond length can be qualitatively
described by the 1σ22σ23σ24σ21π4 and 1σ22σ23σ24σ15σ11π4 configurations.[46] For this reason, the MO basis was obtained by
CASSCF optimizations, with two active electrons on two active orbitals
(4σ and 5σ) (CAS(2,2)). MO’s were optimized simultaneously
for both 1Σ+ states. T and V matrix elements of eq 1 are expressed in this MO basis. CASSCF optimizations were carried
out with the GAMESS-US quantum chemistry package.[47]Orbitals 1σ, 2σ, and 3σ were kept
frozen in all
presented configurational interaction (CI), MPS(DMRG), and TTNS computations.
Six of the valence electrons were excited to all orbitals in the CI
calculation, which we use as reference to compare the TTNS results
to. Therefore, the active space in all CI, MPS(DMRG), and TTNS consists
of 6 electrons and 25 orbitals: CAS(6,25). A smaller active space
was also formulated for illustration purposes (Figure 11), that consists of 6 electrons and 18 orbitals: CAS(6,18).
In this active space, besides the above-mentioned three occupied σ
orbitals, the highest lying seven virtual orbitals were also kept
frozen. CI results were obtained by utilizing the determinant-based
full-configuration interaction (full-CI) program of Z. Rolik (Budapest),
which is based on the CI algorithm of Olsen et al.[48]C2 point
group symmetry constraints were assigned during this study.
Figure 11
The ground state energy for LiF CAS(6,18)
with D = 4 at r = 3.05 for three
different network topologies
using orbital dependent coordination numbers.
The QC-TTNS Method
In this section,
we present the brief overview of the most general QC-TTNS algorithm.
For the full description of the method we refer to the original work.[33] In our implementation, we allow tensors to have
orbital dependent coordination number, z, in contrast to the implementation of Nakatani et
al.[34] which is an efficient extension of
the DMRG method using fixed number of blocks. Our main motivation
is to develop an algorithm which reflects the entanglement structure
of the molecule under study as much as possible (see Figure 9). The computational cost in one step of the algorithm
scales as D+1 where D is the dimension
of the auxiliary space (in the DMRG community referred as block states),
while the long-range correlation deviates from the mean-field value
only polynomially with distance. Therefore, there is a trade-off between
entanglement localization and increased order of the tensors. The
construction of the optimal tensor network is thus a far more complex
task than it is in the case of the MPS based approaches.
Figure 9
Mutual information represented as a two-dimensional
weighted graph
for the LiF molecule at bond length (a) r = 3.05
and at (b) r = 13.7. Colors indicate different strengths
of I, and the symbols
label the irreducible representations of the molecular orbitals in
the C2 point group.
In
a full-CI treatment, a given eigenfunction of eq 1 can be written in a full tensor form aswhere Uα is
a tensor with order N, and |α⟩ represents basis states at molecular orbital i. In our study, a molecular orbital can be empty, singly
occupied with up or down spins, or doubly occupied, thus the dimension
of the local Hilbert space, d, is four. Since the
number of independent parameters in U scales exponentially
with N it is mandatory to approximate such high dimensional
tensor with a proper factorization in terms of lower dimensional tensors.
In the MPS representation, U describes a matrix network,
i.e., it emerges from contractions of a set of matrices {A1,...,A},
whereis a matrix at each vertex i of the network, with 2 virtual indices m,m of dimension D and one physical index α of dimension d, thusThe schematic plot of the matrix product state
(MPS) network is shown in Figure 1.
Figure 1
Schematic plot of the matrix product state
(MPS) algorithm. Each
node is represented by a tensor of order 2 and the vertical line the
physical index α.
A
natural extension of the MPS approach is to use higher order
tensors. In this work, we form a tree tensor network in which all
sites in the tree represent physical orbitals and in which entanglement
is transferred via the virtual bonds that connect the sites as shown
in Figure 2.
Figure 2
Schematic plot of a higher dimensional network, for example, the
tree tensor network state (TTNS) algorithm. Each node is represented
by a tensor of order z, where z is an orbital
dependent coordination number. The network supposed to reflect the
entanglement structure of the molecule as much as possible. The vertical
red lines denote physical indices α, i ∈{1,N}. Entanglement
is transferred via the virtual bonds that connect the orbitals shown
by black lines. The central node is indicated by red contour.
Schematic plot of the matrix product state
(MPS) algorithm. Each
node is represented by a tensor of order 2 and the vertical line the
physical index α.Schematic plot of a higher dimensional network, for example, the
tree tensor network state (TTNS) algorithm. Each node is represented
by a tensor of order z, where z is an orbital
dependent coordination number. The network supposed to reflect the
entanglement structure of the molecule as much as possible. The vertical
red lines denote physical indices α, i ∈{1,N}. Entanglement
is transferred via the virtual bonds that connect the orbitals shown
by black lines. The central node is indicated by red contour.Our motivation is to treat models
in which orbitals have varying
degrees of entanglement; positions closer to the center of the tree
should be better suited to represent more entangled sites. An additional
motivation is to take advantage of the property of the tree tensor
network ansatz that the long-range correlations differ from the mean-field
value polynomially with distance rather than exponentially with distance
as for MPS. In our algorithmic approach to optimize the tree tensor
network, we use tools similar to those used in refs (35), (36), (37), and (38) and optimize the network
site-by-site as in the DMRG. Therefore, Uα can describe
a tree tensor network, i.e., they emerge from contractions of a set
of tensors {A1,...,A}, whereis a tensor with z + 1 indices,
at each vertex i of the network according to Figure 3. Each tensor has z virtual indices m1...m of dimension D and one physical index α of dimension d, with z being the coordination number of that site. The coefficients Uα are obtained by contracting the virtual indices of the
tensors according to the scheme of a tree tensor network (see Figure 3). The structure of the network can be arbitrary,
and the coordination number can vary from site to site. The only condition
is that the network is bipartite, i.e., by cutting one bond, the network
separates into two disjoint parts. Therefore, the TTNS network does
not contain any loop (see Figure 3) which allows
an exact mathematical treatment.[29,31] For z = 2, the one-dimensional MPS-ansatz used in DMRG is recovered.
Figure 3
Top view
of the tree tensor network (TTNS) algorithm with fixed
coordination number, (a) z = 3 and (b) z = 4. The structure of the tensors is shown in (c) and (d). The bonds
indicate the virtual indices m1,...,m and the circle the physical
index α.
Top view
of the tree tensor network (TTNS) algorithm with fixed
coordination number, (a) z = 3 and (b) z = 4. The structure of the tensors is shown in (c) and (d). The bonds
indicate the virtual indices m1,...,m and the circle the physical
index α.Since entanglement is
transferred via the virtual bonds that connect
the sites, it is preferable to put strongly correlated sites close
together, i.e. to minimize the number of bonds between them. For z > 2 the number of virtual bonds required to connect
two
arbitrary orbitals scales logarithmically with the number of orbitals
α, whereas the scaling is linear in N for z = 2. This can be seen by considering a Cayley-tree of
depth Δ, as shown in Figure 3. The number
of sites in the tree isand thus, the
maximal distance between two
orbitals, 2Δ, scales logarithmically with N for z > 2. Because of this logarithmic scaling,
the expectation value of a long-range correlations differs from the
mean-field result by a quantity that scales polynomially with distance.
This contrasts the MPS ansatz (z = 2) that shows
an exponential decay of the difference with distance.[33]The TTNS algorithm consists in the variational optimization
of
the tensors A in such
a way that the energy is minimized (with the constraint that the norm
of the state remains constant). This is equivalent to optimizing the
functionalwhere Ψ
= Ψ(A1,...,A).
This functional is nonconvex with respect to all parameters {A1,...,A}. However, fixing all tensors A except A, due to the tensor network structure of the ansatz, it is
quadratic in the parameters A associated with one lattice site i. Because
of this, the optimal parameters A can simply be found by solving a generalized eigenvalue problem . For a bipartite network, it is always
possible to assume a gauge condition so that and thus reduce the generalized eigenvalue
problem to an ordinary one.[33] The concept
of the algorithm is to do this one-site optimization site-by-site
until convergence is reached. The challenge that remains is to calculate
the effective Hamiltonian of the eigenvalue problem. In
principle,
this is done by contracting all indices in the expression for the
expectation value ⟨Ψ|H|Ψ⟩
except those that connect to A. By interpreting the tensor A as a qD-dimensional vector A⃗, this expression can be written as . Since and , the
functional F attains
its minimum whenDue to the bipartite structure of the tensor
network, the calculation of can be performed efficiently,
i.e., on
a time that scales polynomially with N and D.(a) Separation of the state into z blocks
plus
the site under optimization, as described by eq 6. (b) Natural freedom in the tensor network: insertion of a matrices V and W fulfilling VW =
1 at one bond leaves the state invariant. The contraction of A with V forms the new tensor A′ on the left-hand side; the contraction of B with W forms the new tensor B′
at the right-hand side.This TTNS algorithm is similar to a DMRG calculation with z blocks instead of two, where a block consists of all of
the sites within one of the branches emerging from site i (see Figure 4(a)). The wave function is then
formed aswhere |ϕγ⟩ (m = 1,...,D) is the basis
in environment block γ (γ = 1,...,z)
and | φα ⟩ is the state of site i. Matrix is equal to the identity if the
basis |ϕ⟩ in each environment
block is orthonormal.
This can always be achieved, because of a gauge degree of freedom
in a TTNS:[21,33,49] it is possible to insert at any bond a resolution of the identity
1 = VW and contract the matrices V and W with the adjacent tensors A (see Figure 4b). This does not change the state but changes the tensors A and the basis states |ϕ⟩ of the environment blocks. It can be shown
that it is always possible to find gauge transformations that orthonormalize
the basis states of the environment blocks.[33]
Figure 4
(a) Separation of the state into z blocks
plus
the site under optimization, as described by eq 6. (b) Natural freedom in the tensor network: insertion of a matrices V and W fulfilling VW =
1 at one bond leaves the state invariant. The contraction of A with V forms the new tensor A′ on the left-hand side; the contraction of B with W forms the new tensor B′
at the right-hand side.
Formation
of the effective Hamiltonian with respect to the Fermionic
interaction h = c7†c15. Between sites 7 and 15 a chain of Z-matrices appears due to the Jordan-Wigner transformation.
The sites on which the interaction has support are marked in red.
Each open (filled) circle in the tensor network corresponds to the
contraction of the layered structure of tensors shown in (b).Thus, by assuring that the gauge
condition is always satisfied
in the course of the algorithm, the only term that must be calculated
is the effective Hamiltonian . This term is obtained by contracting
all
tensors except A in
the expectation value ⟨Ψ|H|Ψ⟩
as shown in Figure 6. Indices (α̃, m̃) form the row indices and (α,m) the column indices
of , such that is fulfilled. Hamiltonian H is a sum of O(N4) two-point
and four-point Fermionic interaction terms c†c and c†c†cc (σ,σ′ ∈ {↑,↓}), as defined
in eq 1. The Fermionic nature of the terms can
be handled by mapping them to nonlocal bosonic operators via Jordan-Wigner
transformations. Writing H = ∑h with h denoting the Jordan-Wigner
transformed two- and four-point interaction terms, the effective Hamiltonian transforms into a sum , whereDue to the structure (6) of the TTNS, each
effective Hamiltonian factorizes into a tensor product of z matriceswhere each matrix corresponds to the
matrix elements of h with respect to the basis
in environment block γ:Graphically, the
evaluation of ⟨ϕγ|h|ϕγ⟩ corresponds to the contraction
of a three-layered tensor
network according to the structure of the branch in block γ,
as depicted in Figure 5. This network can be contracted efficiently by starting from the
leaves and working in the inward direction.
Figure 6
Construction of the effective
Hamiltonian with respect to site i. (a) Ket- and
Bra-TTNS with tensor A singled out. A has
physical index α and connects with
its virtual indices m to the remaining network with tensors A1,...,Â,...,A indicated by the blue box.
(b) Expectation value ⟨Ψ|H|Ψ⟩
with respect to the TTNS written in (a). The effective Hamiltonian is obtained by contracting all
tensors
except A (the tensors
surrounded by the green polygon).
Figure 5
Formation
of the effective Hamiltonian with respect to the Fermionic
interaction h = c7†c15. Between sites 7 and 15 a chain of Z-matrices appears due to the Jordan-Wigner transformation.
The sites on which the interaction has support are marked in red.
Each open (filled) circle in the tensor network corresponds to the
contraction of the layered structure of tensors shown in (b).
Construction of the effective
Hamiltonian with respect to site i. (a) Ket- and
Bra-TTNS with tensor A singled out. A has
physical index α and connects with
its virtual indices m to the remaining network with tensors A1,...,Â,...,A indicated by the blue box.
(b) Expectation value ⟨Ψ|H|Ψ⟩
with respect to the TTNS written in (a). The effective Hamiltonian is obtained by contracting all
tensors
except A (the tensors
surrounded by the green polygon).With TTNS we can easily enforce the U(1)
symmetry
that is fulfilled by Hamiltonian (1), i.e. the conservation of the
number of particles. For this, the tree graph has to be made directed
(see Figure 4a), such that all sites (except
site i that is optimized) have z–1 incoming and one outgoing bond. Thus, each virtual index
of a tensor A is equipped
with the additional information on whether it is “incoming”
or “outgoing”. Each virtual index connecting to block
γ (γ = 1,...,z) is split into an index
tuple (mγ, nγ↑, nγ↓). Assuming that the index connecting to block γ
= 1 is the outgoing index, we require that n1↑ = n2↑ + ... +nz↑ + n↑ (α) and n1↓ = n2↓ + ... + n↓ + n↓ (α). n↑(α) (n↓(α)) denotes the number of electrons with spin up (down) corresponding
to the physical index α. Thus, for γ = 2,...,z, nγ↑ (nγ↓) counts the number
of up-electrons (down-electrons) within branch γ. The index n1↑ (n1↓), on the other hand, is equal to the number of electrons
with spin up (down) in all the branches plus the number of electrons
at site i.Formation of the effective Hamiltonian with respect to the Fermionic
interaction h = c7†c15 with particle number conservation
taken into
account. The sites on which the interaction has support are marked
in red. All branches marked by dotted lines and circles yield the
identity when contracted. The parity operator Z̃ is contracted to the virtual bond.Besides the ability of targeting a state with a specific
total
number of up- and down-electrons N↑ and N↓ utilization of symmetries
also gives a performance boost to the algorithm since the virtual
dimension effectively increases from D to D(N↑ + 1)(N↓ + 1), and even further if spatial symmetry is
also utilized. As an example, for the problem considered in the paper D = 4 would correspond to M = 64 DMRG block
states. Furthermore, if spatial symmetry is also exploited this increases
to M = 256 in the C2 point group. In addition, the inclusion of the particle-number
conservation also simplifies the treatment of the Fermionic nature
of the electrons. The main idea is depicted in Figure 7 for the interaction c7†c15: with an appropriately chosen numbering of the Fermions,
each sub-branch that has no Fermionic support either has only identities
acting on the sites or only matrices Z stemming from
the Jordan-Wigner transformation (Z = δ(−1)). The sub-branches including only identities simplify to the identity
because we work in a gauge in which the basis in each environment
block is orthonormal. As shown in ref (33), the Z operators can be “moved”
to the virtual bonds and, since Z2 = 1,
all of them except one cancel (see Figures 5 and 7). What remains is a sub-branch that
includes only identities, which reduces to the identity because of
the orthonormalization of the state. Thus, for a Fermionic two-site
interaction, it is sufficient to take into account the path connecting
the two sites. In this way, the treatment of long-range Fermionic
interactions is feasible with the same numerical effort as the treatment
of long-range spin interactions.
Figure 7
Formation of the effective Hamiltonian with respect to the Fermionic
interaction h = c7†c15 with particle number conservation
taken into
account. The sites on which the interaction has support are marked
in red. All branches marked by dotted lines and circles yield the
identity when contracted. The parity operator Z̃ is contracted to the virtual bond.
The numerical effort of the
algorithm has two major contributions.
On the one hand, the bond-dimension (tensor rank) D is crucial: the numerical effort for calculating one term of the
effective Hamiltonian by tensor contraction scales as D. On the other hand, this calculation
has to be performed for each term in the Hamiltonian, such that naively
a scaling N4D is expected. Fortunately, the same summation tricks
as described in refs (2 and 50) can be applied, such that the scaling reduces to N2D. Since O(N) iteration steps are required for convergence,
the overall time of the algorithm will scale as N3D. Therefore,
in general tensor ranks decrease by going from MPS to TTNS, but the
order of the tensors increases. However, the number of tensors with z = 1 lying on the boundary of the network increases exponentially
when larger and larger systems are considered, and there is an expected
crossover in CPU time between the full sweep of the MPS and TTNS.
It is worth mentioning that if D ≥ d, orbitals lying on the boundary are excluded automatically
from the optimization.
Network Optimization by Entanglement
Localization
The amount of contribution to the total correlation
energy of an
orbital can be detected qualitatively by the single-orbital entropy, si = −Trρ ln ρ where ρ is the reduced density matrix at orbital i. The two-orbital entropy is constructed similarly using
the reduced density matrix, ρ,
of a subsystem built from orbitals i and j, and the mutual information I = s – s – s describes how orbitals are
entangled with each other as they are embedded in the whole system.
For more detailed derivations we refer to the original papers.[8,12,14,15] Therefore, these quantities provide chemical information about the
system, especially about bond formation and nature of static and dynamic
correlation.[8,16,17,51] As an example, s and I are shown in Figures 8 and 9, respectively, for the equilibrium bond length r = 3.05 and at large separation r = 13.7. The total
quantum information encoded in the wave function is given by the sum
of the orbital entropy, i.e., Itot = ∑s, which is twice as large for the stretched geometry as compared
to the equilibrium case.
Figure 8
One orbital entropy profile for the LiF molecule
at bond length
(a) r = 3.05 and at (b) r = 13.7.
Symbols label the irreducible representations of the molecular orbitals
in the C2 point group.
One orbital entropy profile for the LiF molecule
at bond length
(a) r = 3.05 and at (b) r = 13.7.
Symbols label the irreducible representations of the molecular orbitals
in the C2 point group.Mutual information represented as a two-dimensional
weighted graph
for the LiF molecule at bond length (a) r = 3.05
and at (b) r = 13.7. Colors indicate different strengths
of I, and the symbols
label the irreducible representations of the molecular orbitals in
the C2 point group.It is clear from Figure 9that some orbitals are strongly
entangled with several other orbitals,
while some orbitals are entangled with only a few others and some
are almost disentangled from the system. Therefore, the obvious choice
is to allow the coordination number, z, to vary from orbital to orbital. In the following
analysis, however, we restrict ourselves to a fixed z = 3 case in order to allow a more direct
analysis when data are compared to the z = 2 MPS case.Since both DMRG and TTNS rely
on the systematic application of
the Schmidt-decomposition, the required computational resources to
reach a given error margin is determined by the amount of entanglement
in the system.[39] This can be manipulated
by changing the basis functions[33,52] or by changing the
tensor topology. For the latter case, the entanglement lengthshould be minimized in order to localize the
entanglement in the system, where d is the distance function between orbital i and j depending on the tensor topology, and η
is some exponent that we set to 1 or 2. For the one-dimensional case,
i.e., for DMRG and MPS d = |i – j|. For the tree
topology d can be computed
as the distance from the center to i, plus the distance
from the center to j, minus twice the distance from
the center to their lowest common ancestor. The lowest common ancestor
can be obtained within a linear preprocessing time O(N) and a constant query time using the Berkman’s
algorithm.[53] As an example, the optimized
tensor topologies at the equilibrium bond length r = 3.05 are shown in Figure 10 for the one-dimensional
topology and for the tree topology.
Figure 10
Optimization
of tensor topology by minimizing the entanglement
length in the system at the equilibrium bond length r = 3.05. (a) and (b) are for the one-dimensional MPS like topology
for the original ordering and for the optimized ordering, respectively.
(c) Shows the optimized topology on the tree (small dots indicate
not used grid points of the tree). The total quantum information Itot does not change, but the entanglement length
calculated with η = 1 and 2 indicated by Cost1 and
Cost2 drops significantly.
In practice, first we performed
a quick and fast DMRG full sweep
with a fixed small number of block states (M ≃
16,...,64) using the ordering of orbitals for which the T and V integral files were generated by the GAMESS-US
program in order to determine the one- and two-orbital entropy profiles
qualitatively. For all subsequent calculations we rendered orbitals
with descending orbital entropy values to form the Complete Active
Space CAS-vector that was used during the configuration interaction
based dynamic extended active space (CI-DEAS) procedure.[12] We also determined the mutual information, I, and the orbitals were reordered
by minimizing the entanglement length given by eq 7. In the one-dimensional case either the graph Laplacian can
be used or by other heuristic methods to reduce the spectral envelope
of I, i.e., to make
it as diagonally dominant as possible.[54,55] In the case
of the tree network, the optimization is less straightforward, but
as a rule of thumb we placed orbitals with largest entropy values
close to the center of the network while keeping orbitals with large I values close together (see
Figure 10(c)). In the subsequent step, accurate
DMRG (with optimized CAS vector) or MPS/TTNS calculations were performed.Optimization
of tensor topology by minimizing the entanglement
length in the system at the equilibrium bond length r = 3.05. (a) and (b) are for the one-dimensional MPS like topology
for the original ordering and for the optimized ordering, respectively.
(c) Shows the optimized topology on the tree (small dots indicate
not used grid points of the tree). The total quantum information Itot does not change, but the entanglement length
calculated with η = 1 and 2 indicated by Cost1 and
Cost2 drops significantly.
Numerical Results
In this work we
have used two codes. Our QC-DMRG program has been
developed for a long time, and it includes advanced features like
the dynamic block state selection (DBSS) approach,[13,39] the CI-DEAS procedure,[12] and the treatment
of orbital spatial symmetries which are not implemented yet in the
TTNS code. Therefore, the entropy functions were calculated by the
QC-DMRG code to provide initial data quickly for network optimizations.
Variable Tensor Orders and Convergence Properties
In
order to present a more systematic analysis on the separated
and combined effects of the network optimization with orbital dependent
coordination number and the permutation of the orbital ordering in
a given network, we calculated the ground state energy of CAS(6,18)
with D = 4 for three different network topologies
shown in Figure 11. Gradual
increase in the rate of convergence can be achieved by optimizing
the number of components with z = 3 together with the ordering of the orbitals in that given
network topology as it is shown in Figure 11(d).
Ground State and Excited States
The
rest of the analysis for a fair treatment is based on the TTNS code
alone using fixed z =
2 (MPS) and z = 3 (TTNS)
coordination number (except for the boundaries) while keeping all
other parameters of the algorithm the same.The potential energy
curve (PES) can be calculated for an a priory set error margin using
the DBSS procedure.[3] Therefore, we have
easily reproduced the full-CI energies up to 10–8 au in absolute error. In the following, however, we have used a
small fixed number of block states, i.e., fixed bond dimension, in
order to demonstrate the benefits underlying the TTNS geometry. The
four lowest lying eigenstates (γ = 1,...,4) are shown in Figure 12 calculated by the QC-DMRG method using M = 64 block states or alternatively by the TTNS approach
with z = 2 and D = 4. We have confirmed that each state is a singlet with S2 = 0.
Figure 12
The energy of the four lowest lying states as a function bond length.
The relative error of the ground
state (γ = 1) and the third
excited state (γ = 4), ΔEγ = |Eγ – EFCIγ|/EFCIγ, are
shown in Figure 13. The open symbols stand
for the MPS solution, while the filled symbols indicate the TTNS result.
It is clear from the figure that the accuracy of the energy dropped
for both states by at least an order of magnitude. For the ground
state close to the equilibrium bond length the change is almost 2
orders of magnitude. It is important to emphasize again that all parameters
of the calculations were kept fixed except that we used the optimized
one-dimensional MPS-like topologies or the two-dimensional optimized
TTNS-like topologies. The network topologies for both cases were optimized
for each bond length using the procedure outlined in Section II.
Figure 13
The relative error of the energy of the
two lowest lying 1Σ+ states as a function
of bond length using the
Tree-TNS with z = 2 and z = 3 with D = 4. For z = 2, the one-dimensional MPS-ansatz
used in DMRG is recovered.
In the MPS based DMRG method the matrices
of the one-dimensional
tensor network are optimized iteratively by traversing through the
network starting from the left boundary until the right boundary is
reached. In the following steps the same procedure is repeated but
in the reverse direction. This systematic optimization procedure is
called as sweeping. The relative error of the two 1Σ+ states as a function of iteration steps is shown in Figure 14 for the MPS case with z = 2 and D = 4. It can be seen in
the figure that the relative error of the ground state energy drops
quickly as a function of iteration steps, and it saturates after some
20 iterations. In contrast to this, the convergence of the third excited
state is somewhat slower. In addition, the method lost the target
state for certain interaction steps, i.e., for certain network configurations.
This is due to the very low bond dimension used in the test calculations.
When we used larger bond dimension or included point group symmetries,
this problem was fully eliminated.
Figure 14
The
relative error of the energy of the ground state and the third
excited state as a function of iteration step with z = 2 and D = 4 for
a few selected bond lengths.
The ground state energy for LiF CAS(6,18)
with D = 4 at r = 3.05 for three
different network topologies
using orbital dependent coordination numbers.The energy of the four lowest lying states as a function bond length.The relative error of the energy of the
two lowest lying 1Σ+ states as a function
of bond length using the
Tree-TNS with z = 2 and z = 3 with D = 4. For z = 2, the one-dimensional MPS-ansatz
used in DMRG is recovered.In the case of the tree-network, there is more freedom to
choose
the optimal sweeping procedure, i.e., to choose the optimal path through
which the network is traversed. In the present work, we have swept
through the network by going recursively back and forth through each
branch. Therefore, according to the labeling of the orbitals on the
lattice shown in Figure 15 one sweep goes through
the orbitals as 1 2 3 4 5 4 6 4 3 7 8 7 3 2 9 10 9 11 9 2 1 12 13
14 13 15 13 12 16 17 16 18 16 12 1 19 20 21 20 22 20 19 23 24 23 25
23 19. The main advantage of this path is that highly entangled orbitals
located close to the center of the network are optimized several times
in a full sweep. The relative error of the two 1Σ+ states as a function of iteration steps obtained by the TTNS
method with z = 3 and D = 4 is shown in Figure 16. It can
be seen that the relative error of the ground state energy reached
the saturation value of the MPS calculation (shown in Figure 14) after a few iteration steps. However, unlike
the MPS result the error dropped further for subsequent iteration
steps until a much lower saturation value was reached. The overall
improvement compared to the MPS case was almost 2 orders of magnitude.
Similar improvement was observed for the excited state, although,
due to the low bond dimension the target state was lost again for
certain interaction steps, i.e., for certain network configurations.
Figure 15
The figure shows how the tree network is traversed through in a
full sweep.
Figure 16
Similar to Figure 14 but for the TTNS method
with z = 3.
The
relative error of the energy of the ground state and the third
excited state as a function of iteration step with z = 2 and D = 4 for
a few selected bond lengths.The figure shows how the tree network is traversed through in a
full sweep.Similar to Figure 14 but for the TTNS method
with z = 3.
Locating Avoided Crossing by Entanglement
Since the one- and two-orbital entropy functions were calculated
to optimize network topologies, they can also be used to locate the
avoided crossing. For problems in condensed matter physics the one-
and two-orbital entropy functions and the block entropy are used to
locate quantum phase transitions.[56,57] For translationally
invariant systems the single-orbital entropy function is the same
for all sites, while in a chemical system it is orbital dependent.
Therefore, the behavior of Itot as a function
of bond length can be used to detect and locate transition points
where the wave function changes dramatically. In Figure 17Itot is shown as a
function of bond length. It can be seen that Itot has a cusp-like structure at r = 11.86
indicating the position of the avoided crossing.
Figure 17
Total quantum information
encoded in the wave function as a function
bond length. The cusp-like structure indicates the dramatic change
in the wave function.
Total quantum information
encoded in the wave function as a function
bond length. The cusp-like structure indicates the dramatic change
in the wave function.
Conclusion
The present paper has been
devoted to the application of the quantum-chemistry
tree tensor network state (QC-TTNS) method to calculate the potential
energy curve in the vicinity of the ionic–covalent avoided
crossing in LiF. We have discussed the main features of the most general
version of the QC-TTNS algorithm in which the local properties of
the tensors can be different for each orbital. The optimized tensor
topologies, which reflect the structure of the entanglement bonds
between the molecular orbitals, were determined by minimizing the
entanglement length in the system as the bond length between the Li
and F was stretched. In order to compare tensor topology effects only,
we have kept all parameters of the algorithms fixed and used a very
small bond dimension or alternatively a very small number of block
states. By comparing the MPS(DMRG) and TTNS convergence properties
we have demonstrated that the TTNS approach can converge to a significantly
lower energy. Although the tensor contraction scales as D, the TTNS topology offers a more
optimal network structure since the number of components with z = 1 lying on the boundary
scales exponentially with the system size. As a consequence we have
found that the relative error of the energy of the ground state as
well as the excited states can be improved by an order of magnitude
or more for the same value of D. This also indicates
that the MPS result can be reproduced with a significantly lower bond
dimension using the TTNS method.Our QC-TTNS approach seems
to be a promising direction reflected
by the stability and fast convergence of the new method even for systems
in which the wave function character changes as a function of bond
length, especially in the region of an avoided crossing. The underlying
benefits of TTNS is, however, far from fully exploited and the optimization
of the method is far more complicated. Due to the more advanced topology,
several optimization tasks and problems arise which do not have counterparts
in the MPS formulation.Extension of this work using more complex
systems and orbital dependent
coordination number is under progress.
Authors: Fabian M Faulstich; Mihály Máté; Andre Laestadius; Mihály András Csirik; Libor Veis; Andrej Antalik; Jiří Brabec; Reinhold Schneider; Jiří Pittner; Simen Kvaal; Örs Legeza Journal: J Chem Theory Comput Date: 2019-03-13 Impact factor: 6.006