Jacob N Sanders1, Xavier Andrade1, Alán Aspuru-Guzik1. 1. Department of Chemistry and Chemical Biology, Harvard University , 12 Oxford Street, Cambridge, Massachusetts 02138, United States.
Abstract
This article presents a new method to compute matrices from numerical simulations based on the ideas of sparse sampling and compressed sensing. The method is useful for problems where the determination of the entries of a matrix constitutes the computational bottleneck. We apply this new method to an important problem in computational chemistry: the determination of molecular vibrations from electronic structure calculations, where our results show that the overall scaling of the procedure can be improved in some cases. Moreover, our method provides a general framework for bootstrapping cheap low-accuracy calculations in order to reduce the required number of expensive high-accuracy calculations, resulting in a significant 3× speed-up in actual calculations.
This article presents a new method to compute matrices from numerical simulations based on the ideas of sparse sampling and compressed sensing. The method is useful for problems where the determination of the entries of a matrix constitutes the computational bottleneck. We apply this new method to an important problem in computational chemistry: the determination of molecular vibrations from electronic structure calculations, where our results show that the overall scaling of the procedure can be improved in some cases. Moreover, our method provides a general framework for bootstrapping cheap low-accuracy calculations in order to reduce the required number of expensive high-accuracy calculations, resulting in a significant 3× speed-up in actual calculations.
Matrices
are one of the most fundamental objects in the mathematical
description of nature, and as such they are ubiquitous in every area
of science. For example, they arise naturally in linear response theory
as the first term in a multidimensional Taylor series, encoding the
response of each component of the system to each component of the
stimulus. Hence, in many scientific applications, matrices contain
the essential information about the system being studied.Despite
their ubiquity, the calculation of matrices often requires
considerable computational effort. Returning to the linear response
theory example, it might be necessary to individually calculate the
response of every component of the system to every component of the
stimulus and, depending on the area of application, each individual
computation may itself be quite expensive. The overall expense stems
from the fact that evaluating a matrix of dimension N × M requires, in principle, the individual
evaluation of N × M elements.
But this does not always have to be the case.For example, if
we know a priori the eigenvectors
of an N × N diagonalizable matrix, then we can obtain the full matrix by only
calculating the N diagonal elements. Similarly, a
sparse matrix, which contains many zero elements, can be evaluated
by calculating only the nonzero elements, if we know in advance where
such elements are located. In this article, we present a general approach
that can produce a considerable reduction in the cost of constructing
a matrix in many scientific applications by substantially reducing
the number of elements that need to be calculated.The key numerical
procedure of our approach is a method to cheaply
recover sparse matrices with a cost that is essentially proportional
to the number of nonzero elements. The matrix reconstruction procedure
is based on the increasingly popular compressed sensing approach,[1−4] a state-of-the-art signal processing technique developed to minimize
the amount of data that needs to be measured to reconstruct a sparse
signal.Although the theory of compressed sensing is extensive
and well-developed,
the use of compressed sensing and sparse sampling methods in scientific
development has been dominated by experimental applications, including
multidimensional nuclear magnetic resonance,[5,6] super-resolution
microscopy,[7] and other applications in
spectroscopy and beyond.[8−15] However, compressed sensing is also becoming a tool for computational
applications.[16−22] In particular, in previous work we have shown that compressed sensing
can also be used to reduce the amount of computation in numerical
simulations.[16]In this article, we
apply compressed sensing to the problem of
computing matrices. This method has two key properties. First, the
cost of the procedure is quasi-linear with the size of the number
of nonzero elements in the matrix, without the need to know a priori the location of the nonzero elements. Second, the
reconstruction is exact. Furthermore, the utility of the method extends
beyond the computation of a priori sparse matrices.
In particular, the method suggests a new computing paradigm in which
one develops methods to find a basis in which the matrix is known
or suspected to be sparse, based on the characteristics and prior
knowledge of the matrix, and then afterward attempts to recover the
matrix at lower cost.To demonstrate the power of our approach,
we apply these ideas
to an important problem in quantum chemistry: the determination of
the vibrational modes of molecules from electronic structure methods.
These methods require the calculation of the matrix of the second
derivatives of the energy with respect to the nuclear displacements,
known as the force-constant or Hessian matrix. This matrix is routinely
obtained in numerical simulations by chemists and physicists, but
it is relatively expensive to compute when accurate quantum mechanical
methods are used.The search for more efficient methods of computing
Hessian matrices
has a long history. One of the earliest advances was the development
of theoretical methods to compute the derivatives of the matrix analytically,
rather than numerically, first in one of the simplest quantum mechanical
methods known as Hartree–Fock theory,[23,24] then in more sophisticated higher-accuracy correlated methods,[25−27] and later in modern density functional theory.[28] These analytical derivative techniques have been further
optimized by systematically exploiting molecular symmetries[29] and by organizing calculations more efficiently
by working directly in the so-called atomic orbital basis.[30,31] More recent developments have applied Davidson methods to compute
only those vibrational modes relevant to searching for the transition
state in a chemical reaction while sidestepping the computation of
a full Hessian matrix.[32,33] Finally, another research frontier
involves developing methods that scale favorably with system size
for computing energies[34] and response properties
such as molecular vibrations,[35,36] with the ultimate goal
of achieving practical techniques that exhibit linear scaling.Our approach exploits the sparsity of the Hessian matrix and cheap
auxiliary calculations to further improve the efficiency of computing
the vibrational modes of molecules; moreover, our approach is compatible
with and complementary to some of the aforementioned techniques. At
the same time, our method provides a general framework for bootstrapping
cheap low-accuracy calculations to reduce the required number of expensive
high-accuracy calculations, something which previously was not possible
to do in general.We begin by discussing how compressed sensing
makes it practical
to take a new approach for the calculation of matrices based on finding
strategies to make the matrix sparse. Next, we introduce the mathematical
foundations of the method of compressed sensing and apply them to
the problem of sparse matrix reconstruction. This is the numerical
tool that forms the foundation of our approach. Finally, we illustrate
these new ideas by applying them to the problem of obtaining molecular
vibrations from quantum mechanical simulations.
Finding a Sparse Description
of the Problem
The first step in our approach is to find
a representation for
the problem where the matrix to be calculated is expected to be sparse.
In general, finding this sparsifying basis is specific
to each problem and ranges from trivial to quite complex; it has to
do with the knowledge we have about the problem or what we expect
about its solution.Leveraging additional information about
a problem is an essential
concept in compressed sensing, but it is also a concept that is routinely
exploited in numerical simulations. For example, in quantum chemistry
it is customary to represent the orbitals of a molecule in a basis
formed by the orbitals of the atoms in the molecule,[37] which allows for an efficient and compact representation
and a controlled discretization error. This choice comes from the
notion that the electronic structure of the molecule is roughly described
by “patching together” the electronic structure of the
constituent atoms.An ideal basis in which to reconstruct a
matrix is the basis of
its eigenvectors, or eigenbasis, as this basis only
requires the evaluation of the diagonal elements to obtain the entire
matrix. Of course, finding the eigenbasis requires knowing the matrix
in the first place, so reconstructing a matrix in its eigenbasis is
not practically useful. However, in many cases it is possible to obtain
reasonable approximations to the eigenvectors (an idea which also
forms the basis of perturbation theory in quantum mechanics). The
approximate eigenbasis probably constitutes a good sparsifying basis
for many problems, as we expect the matrix to be diagonally dominant,
with a large fraction of the off-diagonal elements equal to zero or
at least small.Since the determination of an approximate eigenbasis
depends on
the specific problem at hand, a general prescription is difficult
to give. Nevertheless, a few general ideas could work in many situations.
For example, in iterative or propagative simulations, results from
previous iterations or steps could be used to generate a guess for
the next step. Alternatively, cheap low-accuracy methods can be used
to generate a guess for an approximate eigenbasis. In this case, the
procedure we propose provides a framework for bootstrapping the results
of a low-cost calculation in order to reduce the required number of
costly high-accuracy calculations. This last strategy is the one we
apply to the case of molecular vibrations.What makes looking
for a sparsifying basis attractive, even at
some computational cost and code-complexity overhead, are the properties
of the recovery method. First, the cost of recovering the matrix is
roughly proportional to its sparsity. Second, the reconstruction of
the matrix is always exact up to a desired precision; even if the
sparsifying basis is not a good one, we eventually converge to the
correct result. The penalty for a bad sparsifying basis is additional
computation, which in the worst case makes the calculation as costly
as if compressed sensing were not used at all. This feature implies
that the method will almost certainly offer some performance gain.There is one important qualification to this gain. For some matrices,
there is a preferred basis in which the matrix is cheaper to compute,
and the extra cost of computing its elements in a different basis
might offset the reduction in cost offered by compressed sensing.
Compressed
Sensing for Sparse Matrices
Once a sparse representation
for the matrix is known, the numerical
core of our method for the fast computation of matrices is the application
of compressed sensing to calculate sparse matrices without knowing a priori where the nonzero elements are located. Related
work has been presented in the field of compressive principal component
pursuit,[38−41] which focuses on reconstructing matrices that are the sum of a low-rank
component and a sparse component. Our work instead outlines a general
procedure for reconstructing any sparse matrix by measuring it in
a different basis.Suppose we wish to recover an N × N matrix A known to be sparse
in a particular
orthonormal basis {ψ} (for simplicity
we restrict ourselves to square matrices and orthonormal bases). Without
any prior knowledge of where the S nonzero elements
of A are located, it might appear that we need to calculate
all N2 elements, but this is not the case.In a different orthonormal basis {ϕ}, the matrix A has a second representation B given bywhere P is the orthogonal change-of-basis
matrix from the basis {ψ} to the
basis {ϕ}. Note that in general B is not sparse.If we regard A and B as N2-element vectors, it is easy to see
that the change-of-basis transformation from A to B given by eq 1 is linear. This fact enables us to use the machinery of compressed
sensing to reconstruct the full matrix A by measuring
only some of the entries of B.For a fully general matrix A, we would indeed need
to measure all the entries of B so that A could be recovered by inverting the linear system in eq 1. However, because the matrix A is sparse,
compressed sensing lets us do better. The idea is to measure only some of the entries of B, which means that
eq 1 is now underdetermined and admits many
possible solutions for A. From these many solutions for A, we simply select the sparsest one; it
has been proven in general[1−4] that this can be done in a highly computationally
efficient way by minimizing the sum of the absolute values of the
entries of A. The key insight of compressed sensing is
that, as more entries of B are measured, the sparsest
matrix A which satisfies the underdetermined system converges
to the true solution of the full system. Moreover, this convergence
typically happens long before the matrix B has been fully
sampled, particularly if the true matrix A is quite sparse.The compressed sensing reconstruction is done by solving the so-called
basis pursuit (BP) problem,[4,42]where the 1-norm
is considered
as a vector norm (∥A∥1 = ∑|A|), the change-of-basis
matrix P is known, and W is a set of
randomly measured entries in matrix B which are known.
We assume that we have some method of computing the entries of B, but that the method is expensive, and we would therefore
like to recover A while computing as few of them as possible.
This approach to matrix reconstruction is illustrated in Figure 1.
Figure 1
General scheme for the recovery of a sparse matrix A via compressed sensing. Rather than sampling A directly,
the key is to sample the matrix B, which corresponds
to A expressed in a different (known) basis. Recovery
of A from the undersampled entries of B proceeds
via compressed sensing by solving eq 2.
General scheme for the recovery of a sparse matrix A via compressed sensing. Rather than sampling A directly,
the key is to sample the matrix B, which corresponds
to A expressed in a different (known) basis. Recovery
of A from the undersampled entries of B proceeds
via compressed sensing by solving eq 2.The size of the set W, a number that we call M, is the number of matrix
elements of B that
are sampled. M determines the quality of the reconstruction
of A. From compressed sensing theory we can find a lower
bound for M as a function of the sparsity of A and the change-of-basis transformation.One important
requirement for compressed sensing is that the sparse
basis {ψ} for A and
the measurement basis {ϕ} for B should be incoherent, meaning that the
maximum overlap between any vector in {ψ} and any vector in {ϕ}should be as small as possible
(in general μ ranges from 1 to N1/2). Intuitively, this incoherence condition means that the change-of-basis
matrix P should thoroughly scramble the entries of A to generate B.It can be proven[3] that the number of
entries of B which must be measured in order to fully
recover A by solving the BP problem in eq 2 scales asThis scaling equation encapsulates
the important
aspect of compressed sensing: if a proper measurement basis is chosen,
the number of entries which must be measured scales linearly with
the sparsity of the matrix and only depends weakly on the full size
of the matrix. For the remainder of this paper, we always choose our
measurement basis vectors to be the discrete cosine transform (DCT)
of the sparse basis vectors, for which the parameter μ is equal
to 21/2. The DCT is a common transformation chosen for
compressed sensing because it is easy to implement, with fast and
readily available algorithms for its computation, and because it guarantees
that the sparse basis and the measurement basis are highly incoherent.
Intuitively, this ensures that each element from the measurement matrix B contains as much information as possible about the elements
of the sparse matrix A. A small value of μ such
as 21/2 cuts down on the number of entries of B which must be measured in order to fully recover the sparse matrix A. Of course, our method is independent of this choice and
could work with other basis transformations as well.In order
to study the numerical properties of the reconstruction
method, we performed a series of numerical experiments. We generated
100 × 100 matrices of varying sparsity with random values drawn
uniformly from the interval [−1,1] and placed in random locations
in the matrix. Matrix elements were then sampled in the DCT measurement
basis, and an attempt was made to recover the original sparse matrix
by solving the basis pursuit problem in eq 2.Figure 2 illustrates the percent of
matrix
elements that had to be sampled for accurate recovery of the sparse
matrix compared with other recovery approaches. If no prior knowledge
of a matrix is used for its recovery, then one simply measures each
entry; this is the current paradigm in many scientific applications.
If one knows exactly where the nonzeros in a sparse matrix are located,
one can simply measure those elements. Compressed sensing interpolates
between these two extremes: it provides a method for recovering a
sparse matrix when the locations of the nonzeros are not known in
advance. Although this lack of knowledge comes with a cost, the recovery
is still considerably cheaper than measuring the entire matrix.
Figure 2
Percent of
entries that must be sampled for accurate recovery of
a matrix as a function of sparsity. Comparison between compressed
sensing and two limiting cases: “no prior knowledge”
of sparsity and the “perfect oracle”, which reveals
where all nonzero entries are located. Each point on the compressed
sensing curve is an average of ten different randomizations. The accuracy
criterion is a relative error in the Frobenius norm smaller than 10–7.
Percent of
entries that must be sampled for accurate recovery of
a matrix as a function of sparsity. Comparison between compressed
sensing and two limiting cases: “no prior knowledge”
of sparsity and the “perfect oracle”, which reveals
where all nonzero entries are located. Each point on the compressed
sensing curve is an average of ten different randomizations. The accuracy
criterion is a relative error in the Frobenius norm smaller than 10–7.
Application: Molecular
Vibrations
Calculating the vibrations of a molecule, both
the frequencies
and normal modes, is one of the most ubiquitous tasks in computational
chemistry.[43] Integrated into nearly all
computational chemistry packages, including the Q-Chem package used
for this study,[44] molecular vibrations
are computed by theoretical and experimental chemists alike. Chemists
routinely optimize molecular geometries to find minimal energy conformations;
computing and confirming the positivity of all vibrational frequencies
is the standard method of assuring that a local minimum has been found.
Another common task is to find the transition state for a proposed
reaction: here it is also necessary to compute the vibrations to find
one mode with an imaginary frequency, confirming the existence of
a local maximum along the reaction coordinate.[45] Despite the centrality of molecular vibrations in computational
chemistry, it remains one of the most expensive computations routinely
performed by chemists.The core of the technique lies in calculating
the matrix of the
mass-weighted second derivatives of the energy with respect to the
atomic positions,whereis the ground-state energy of the
molecule, RA is coordinate i of atom A,
and MA is its mass. Hence, the Hessian
is a real 3N × 3N matrix where N is the number of
atoms in the molecule. When the molecule is in a minimum energy conformation,
the eigenvectors of the Hessian correspond to the vibrational modes
of the molecule, and the square root of the eigenvalues correspond
to the vibrational frequencies.[45]Our goal, therefore, is to understand how our approach can reduce
the cost of computing the Hessian matrix of a molecule. We achieve
this understanding in two complementary ways. First, for a moderately
sized molecule, we outline and perform the entire numerical procedure
to show in practice what kinds of speed-ups may be obtained. Second,
for large systems, we investigate the ability of compressed sensing
to improve how the cost of computing the Hessian scales with the number
of atoms.Calculating the Hessian requires a method for obtaining
the energy
of a given nuclear configuration. There exist many methods to choose
from, which offer a trade-off between accuracy and computational cost.
Molecular mechanics approaches, which model the interactions between
atoms via empirical potentials,[45] are computationally
cheap for systems of hundreds or thousands of atoms, while more accurate
and expensive methods explicitly model the electronic degrees of freedom
at some level of approximated quantum mechanics, such as methods based
on density functional theory (DFT)[46−48] or wave function methods.[37] We focus on these quantum mechanical approaches,
since there the computation time is dominated by the calculation of
the elements of the Hessian matrix, making it an ideal application
for our matrix-recovery method.To recover a quantum mechanical
Hessian efficiently with compressed
sensing, we need to find a basis in which the matrix is sparse. While
we might expect the Hessian to have some degree of sparsity in the
space of atomic Cartesian coordinates, especially for large molecules,
we have found that it is possible to find a better basis. The approach
we take is to use a basis of approximated eigenvectors generated by
a molecular mechanics computation, employing the common MM3 force
field,[49] which provides a cheap approximation
to the eigenvectors of the quantum mechanical Hessian.[50] This is illustrated in Figure 3 for the benzene molecule (C6H6). The
figure compares the quantum mechanical Hessian in the basis of atomic
Cartesian coordinates with the same matrix in the approximate eigenbasis
obtained via an auxiliary molecular mechanics computation. The matrix
in the molecular mechanics basis is much sparser, and is therefore
better suited to recovery via compressed sensing.
Figure 3
Quantum mechanical Hessian
of benzene in the basis of atomic Cartesian
coordinates (on the left) and in the basis of molecular mechanics
normal modes (on the right). Since the molecular mechanics normal
modes form approximate eigenvectors to the true quantum mechanical
Hessian, the matrix on the right is sparse (close to diagonal) and
therefore well-suited to recovery via compressed sensing.
Quantum mechanical Hessian
of benzene in the basis of atomic Cartesian
coordinates (on the left) and in the basis of molecular mechanics
normal modes (on the right). Since the molecular mechanics normal
modes form approximate eigenvectors to the true quantum mechanical
Hessian, the matrix on the right is sparse (close to diagonal) and
therefore well-suited to recovery via compressed sensing.The second derivatives of the energy required for
the Hessian,
eq 5, can be calculated either via finite differences,
generating what are known as numerical derivatives,
or using perturbation theory, generating so-called analytical derivatives.[51−54] A property of the calculations of the energy derivatives is that
the numerical cost does not depend on the direction they are calculated.
This can be readily seen in the case of finite differences, as the
cost of calculatingis essentially the same as computingAs discussed
previously, this ability to compute
matrix elements at a comparable cost in any desired basis is an essential
requirement of our method.A second property of both numerical
and analytical derivatives
that appears in variational quantum chemistry formalisms like DFT
or Hartree–Fock is that each calculation yields a full column
of the Hessian, rather than a single matrix element. Again, this is
easy to see in finite difference computations. We can write the second
derivative of the energy as a first derivative of the force,By the Hellman–Feynman theorem[55,56] and the appropriate correction for the Pulay forces,[51] a single energy calculation yields the forces
acting over all atoms, so the evaluation of eq 6 by finite differences for fixed A and i yields the derivatives for all values of B and j, a whole column of the Hessian. An equivalent result holds for analytic
derivatives obtained via perturbation theory.[53,54] Thus, our compressed sensing procedure for this particular application
focuses on measuring random columns of the quantum mechanical Hessian
rather than individual random entries.The full compressed sensing
procedure applied to the calculation
of a quantum mechanical Hessian is implemented as follows:Calculate approximate
vibrational modes
using molecular mechanics.Transform the approximate modes using
the DCT matrix.Randomly
select a few of the transformed
modes.Calculate energy
second derivatives
along these random modes to yield random columns of the quantum mechanical
Hessian.Apply compressed
sensing to rebuild
the full quantum mechanical Hessian in the basis of approximate vibrational
modes.Transform the
full quantum mechanical
Hessian back into the atomic coordinate basis.Diagonalize the quantum mechanical
Hessian to obtain the vibrational modes and frequencies.The optimal number of random modes can be selected iteratively,
repeating steps 3–7 adding more random modes each time until
convergence is reached. See the Supporting Information for details.Figure 4 illustrates the
results of applying
our Hessian recovery procedure to anthracene (C14H10), a moderately sized polyacene consisting of three linearly
fused benzene rings. The top panel illustrates the vibrational frequencies
obtained by the compressed sensing procedure outlined above for different
extents of undersampling of the true quantum mechanical Hessian. Even
sampling only 25% of the columns yields vibrational frequencies that
are close to the true quantum mechanical frequencies, and much closer
than the molecular mechanics frequencies. The middle panel illustrates
the error in the vibrational frequencies from the true quantum mechanical
frequencies. Sampling only 30% of the columns gives a maximum frequency
error of less than 3 cm–1, and sampling 35% of the
columns yields nearly exact recovery. The bottom panel illustrates
the error in the normal modes. Once again, sampling only 30% of the
columns gives accurate recovery of all vibrational normal modes to
within 1%. In short, our compressed sensing procedure applied to anthracene
reduces the number of expensive quantum mechanical computations by
a factor of 3. The additional cost of the molecular mechanics computation
and the compressed sensing procedure, which take a few seconds, is
negligible compared to the reduction in cost for the computation of
the Hessian which for anthracene takes on the order of hours.
Figure 4
Results of
applying our compressed sensing procedure to the vibrational
modes and frequencies of anthracene. (Top) Even by sampling only 25%
of the quantum mechanical Hessian, the vibrational frequencies obtained
via compressed sensing converge to those of the true quantum mechanical
Hessian. (Middle) Error in vibrational frequencies for different extents
of undersampling. When only 30% of the columns are sampled, the maximum
error in frequency is within 3 cm–1, and with 35%
sampling, the recovery is essentially exact. (Bottom) Error in vibrational
normal modes for different extents of undersampling on a logarithmic
scale; the error is calculated as one minus the overlap (dot product)
between the exact quantum mechanical normal mode and the normal mode
obtained via compressed sensing. Once 30% of the columns are sampled,
the normal modes are recovered to within 1% accuracy.
Results of
applying our compressed sensing procedure to the vibrational
modes and frequencies of anthracene. (Top) Even by sampling only 25%
of the quantum mechanical Hessian, the vibrational frequencies obtained
via compressed sensing converge to those of the true quantum mechanical
Hessian. (Middle) Error in vibrational frequencies for different extents
of undersampling. When only 30% of the columns are sampled, the maximum
error in frequency is within 3 cm–1, and with 35%
sampling, the recovery is essentially exact. (Bottom) Error in vibrational
normal modes for different extents of undersampling on a logarithmic
scale; the error is calculated as one minus the overlap (dot product)
between the exact quantum mechanical normal mode and the normal mode
obtained via compressed sensing. Once 30% of the columns are sampled,
the normal modes are recovered to within 1% accuracy.Having shown that our compressed sensing procedure
gives a 3×
speed-up for a moderately sized organic molecule, we now move to larger
systems and investigate how the cost of computing the Hessian scales
with the number of atoms. In the absence of compressed sensing, if
the entries of the Hessian must be calculated independently, the cost
of calculating the Hessian would scale as O(N2) × OE, where OE is the cost of computing the energy of a given nuclear
configuration (the cost of analytical and numerical derivatives usually
have the same scaling). For example, for a DFT-based calculation, OE is typically O(N3). However, since many quantum mechanical methods obtain the
Hessian one column at a time, only O(N) calculations are required, so the scaling is improved to O(N) × OE.How does compressed sensing alter this scaling?
From eq 4, the number of matrix elements needed
to recover
the Hessian via compressed sensing scales as O(S log N), where S is number
of nonzero elements in the Hessian, so the net scaling is O(S log N) × OE. By obtaining the Hessian one column
at a time, we expect the net scaling to improve to O(S/N log N) × OE. However, we should note that eq 4 is only valid in principle for the random sampling of elements,
and it is not necessarily valid for a random column sampling. This
scaling result illustrates the critical importance of recovering the
Hessian in a sparse basis, with S as small as possible.
So what is the smallest S that can reasonably be
achieved?For many large systems, the Hessian is already sparse
in the basis
of atomic Cartesian coordinates. Since the elements of the Hessian
are partial second derivatives of the energy with
respect to the positions of two atoms, only direct interactions between
the two atoms, with the positions of all other atoms held fixed, must
be taken into account. For most systems we expect that this direct
interaction has a finite range or decays strongly with distance. Note
that this does not preclude collective vibrational modes, which can
emerge as a result of “chaining together” direct interactions
between nearby atoms.If we assume that a system has a finite
range interaction between
atoms, and since each atom has an approximately constant number of
neighbors, irrespective of the total number of atoms in the molecule,
the number of nonzero elements in a single column of the Hessian should
be constant. Hence, for large molecules, the sparsity S of the Hessian would scale linearly with the number of atoms N. Putting this result into O(S/N log N) × OE yields a best-case scaling of O(log N) × OE, which is
a significant improvement over the original O(N) × OE in the absence of compressed sensing.To study the validity
of our scaling results we have performed
numerical calculations on a series of polyacene molecules, which are
aromatic compounds made of linearly fused benzene rings. For polyacenes
ranging from 1 to 15 rings, Figure 5 illustrates
the average number of nonzeros per column in the Hessian matrices
obtained via molecular mechanics and quantum mechanical calculations
in the basis of atomic coordinates. In the molecular mechanics Hessians,
the average sparsity per column approaches a constant value as the
size of the polyacene increases, consistent with each atom having
direct interaction with a constant number of other atoms.
Figure 5
Average sparsity
per column (S/3N) of molecular mechanics
and quantum mechanical Hessians in the basis
of atomic coordinates for the series of polyacenes. (An entry in the
Hessian is considered nonzero if it is greater than 10 (cm–1)2, 6 orders of magnitude smaller than the largest entry.)
In the molecular mechanics Hessians, the average sparsity per column
is roughly constant with the size of the molecule, because each atom
has a roughly constant number of neighbors regardless of the size
of the entire molecule.
Average sparsity
per column (S/3N) of molecular mechanics
and quantum mechanical Hessians in the basis
of atomic coordinates for the series of polyacenes. (An entry in the
Hessian is considered nonzero if it is greater than 10 (cm–1)2, 6 orders of magnitude smaller than the largest entry.)
In the molecular mechanics Hessians, the average sparsity per column
is roughly constant with the size of the molecule, because each atom
has a roughly constant number of neighbors regardless of the size
of the entire molecule.Since the molecular mechanics Hessians illustrate the best-case
scenario in which the sparsity S scales linearly
with the number of atoms N, we attempted to recover
these Hessians directly in the basis of atomic coordinates via the
compressed sensing procedure we have outlined by sampling columns
in the DCT basis. Figure 6 illustrates the
number of columns which must be sampled to recover the Hessians to
within a relative error of 10–3 as a function of
the size of the polyacene. Far fewer than the total number of columns
in the entire matrix need to be sampled. Even more attractive is the
fact that the number of columns grows quite slowly with the size of
the polyacene, consistent with the best-case O(log N) × OE scaling result obtained above.
This result indicates that our compressed sensing approach is especially
promising for the calculation of Hessian matrices for large systems.
For comparison, we also recovered the Hessians in their sparsest possible
basis, which is their own eigenbasis. This procedure is not practical
for actual calculation since it requires knowing the entire Hessian
beforehand, but it shows the best-case scenario and illustrates how
the compressed sensing procedure can be improved further if an appropriate
sparsifying transformation is known.
Figure 6
Number of columns which must be sampled
as a function of the number
of rings in the polyacene to achieve a relative Frobenius norm error
less than 10–3 in the recovered molecular mechanics
Hessian. Legend entries indicate the (sparse) recovery basis, and
columns are always sampled in the DCT basis with respect to the recovery
basis. (Relative error is measured by averaging over ten different
trials which sample different sets of random columns.)
Number of columns which must be sampled
as a function of the number
of rings in the polyacene to achieve a relative Frobenius norm error
less than 10–3 in the recovered molecular mechanics
Hessian. Legend entries indicate the (sparse) recovery basis, and
columns are always sampled in the DCT basis with respect to the recovery
basis. (Relative error is measured by averaging over ten different
trials which sample different sets of random columns.)While the recovery of molecular mechanics Hessians
provides a clear
illustration of the scaling of our compressed sensing procedure, molecular
mechanics matrix elements are not expensive to compute in comparison
with the rest of the linear algebra operations required to diagonalize
the Hessian. Hence, from a computational standpoint, the real challenge
is to apply our procedure to the computation of quantum mechanical
Hessians.As Figure 5 shows, the sparsity S of a quantum mechanical Hessian does not necessarily scale
linearly with the number of atoms N in the molecule.
Figure 7 illustrates the cost of recovering
the quantum mechanical Hessians of polyacenes using compressed sensing
in a variety of sparse bases. Recovering the Hessian in the atomic
coordinate basis already provides a considerable computational advantage
over directly computing the entire Hessian. In fact, this curve mirrors
the sparsity per column curve for quantum mechanical Hessians in Figure 5, consistent with our prediction that the number
of sampled columns scales as O(S/N log N) × OE. More significantly,
recovering the quantum mechanical Hessian in the molecular mechanics
basis provides a substantial advantage over recovery in the atomic
coordinates basis, reducing the number of columns which must be sampled
approximately by a factor of 2. This is consistent with the quantum
mechanical Hessian being sparser in the approximate eigenbasis of
molecular mechanics normal modes. Of course, nothing beats recovery
in the exact eigenbasis, which is as sparse as possible, but requires
knowing the exact Hessian in the first place.
Figure 7
Number of columns which
must be sampled as a function of the number
of rings in the polyacene to achieve a relative Frobenius norm error
less than 10–3 in the recovered quantum mechanical
Hessian. Legend entries indicate the (sparse) recovery basis, and
columns are always sampled in the DCT basis with respect to the recovery
basis. (Relative error is measured by averaging over ten different
trials which sample different sets of random columns.).
Number of columns which
must be sampled as a function of the number
of rings in the polyacene to achieve a relative Frobenius norm error
less than 10–3 in the recovered quantum mechanical
Hessian. Legend entries indicate the (sparse) recovery basis, and
columns are always sampled in the DCT basis with respect to the recovery
basis. (Relative error is measured by averaging over ten different
trials which sample different sets of random columns.).In short, the take-home message of Figure 7 is that using compressed sensing to recover a quantum
mechanical
Hessian in its basis of molecular mechanics normal modes is a practical
procedure which substantially reduces the computational cost of the
procedure.
Conclusions
We have presented a new approach for calculating
matrices. This
method is suitable for applications where the cost of computing each
matrix element is high in comparison to the cost of linear algebra
operations. Our approach leverages the power of compressed sensing
to avoid individually computing every matrix element, thereby achieving
substantial computational savings.When applied to molecular
vibrations of organic molecules, our
method results in accurate frequencies and normal modes with about
30% of the expensive quantum mechanical computations usually required,
which represents a quite significant 3× speed-up. Depending on
the sparsity of the Hessian, our method can also improve the overall
scaling of the computation. These computational savings could be further
improved by using more sophisticated compressed sensing approaches,
such as recovery algorithms based on belief propagation[57,58] which offer a recovery cost directly proportional to the sparsity
of the signal, and which could be easily integrated into our approach.Our method could also be applied to other common calculations in
computational chemistry, including the Fock matrix in electronic structure
or the Casida matrix in linear-response time-dependent DFT.[59] Nevertheless, our method is not restricted to
quantum chemistry and is applicable to many problems throughout the
physical sciences and beyond. The main requirement is an a
priori guess of a basis in which the matrix to be computed
is sparse. The optimal way to achieve this requirement is problem-dependent,
but as research into sparsifying transformations continues to develop,
we believe our method will enable considerable computational savings
in a wide array of scientific fields.The power of compressed
sensing comes from the fact that it optimizes
the amount of information obtained per measurement, and hence the
required number of measurements scales with the information content
being measured, or in this case calculated.[60] We believe compressed sensing can change computational chemistry
for the better by focusing attention on how to avoid redundant computations
and ensure that each computation delivers as much new information
about a system as possible. In broad terms, the key insight of compressed
sensing for computational chemistry is to calculate only what
should be calculated.In fact, a recent area of interest
in compressed sensing is the
development of dictionary learning methods that do not directly require
knowledge of a sparsifying basis, but instead generate it on-the-fly
based on the problem.[61,62] We believe that combining our
matrix recovery protocol with state-of-the-art dictionary learning
methods may eventually result in further progress toward the calculation
of scientific matrices. Beyond the problem of computing matrices,
our work demonstrates that compressed sensing can be integrated into
the core of computational simulations as a workhorse to reduce costs
by optimizing the information obtained from each computation.Finally, we introduced an effective method of bootstrapping low-accuracy
calculations to reduce the number of high-accuracy calculations that
need to be done, something which is not simple to do in quantum chemical
calculations. In this new paradigm, the role of expensive high-accuracy
methods is to correct the low-accuracy results, with a cost proportional
to the magnitude of the required correction, rather than recalculating
the results from scratch.
Computational Methods
The main computational
task required to implement our approach
is the solution of the 1 optimization problem in eq 2. From the many
algorithms available for this purpose,
we rely on the spectral projected gradient 1 (SPGL1) algorithm developed
by van den Berg and Friedlander[42] and their
freely available implementation.For all compressed sensing
calculations in this paper, the change-of-basis
matrix between the sparse basis and the measurement basis is given
by the DCT matrix whose elements are given bywith the first row multiplied by an extra
factor of 21/2 to guarantee orthogonality.For the
numerical calculations we avoid explicitly constructing
the Kronecker product of P with itself and instead perform
all matrix multiplications in the SPGL1 algorithm directly in terms
of P. This latter approach has much smaller memory requirements
and numerical costs, ensuring that the compressed sensing process
itself is rapid and not a bottleneck in our procedure. The condition PAPT = B is satisfied up to a relative
error of 10–7 in the Frobenius norm (vectorial 2-norm).In order to perform the undersampling required for our compressed
sensing calculations, first the complete Hessians were calculated,
then they were converted to the measurement basis, and finally they
were randomly sampled by column. Quantum mechanical Hessians were
obtained with the QChem 4.2[44] software
package, using density functional theory with the B3LYP exchange-correlation
functional[48] and the 6-31G* basis set.
Molecular mechanics Hessians were calculated using the MM3 force field[49] and the open-source package Tinker 6.2.
Authors: Mariya Doneva; Peter Börnert; Holger Eggers; Alfred Mertins; John Pauly; Michael Lustig Journal: Magn Reson Med Date: 2010-09-21 Impact factor: 4.668
Authors: Jacob N Sanders; Semion K Saikin; Sarah Mostame; Xavier Andrade; Julia R Widom; Andrew H Marcus; Alán Aspuru-Guzik Journal: J Phys Chem Lett Date: 2012-09-11 Impact factor: 6.475
Authors: A Shabani; R L Kosut; M Mohseni; H Rabitz; M A Broome; M P Almeida; A Fedrizzi; A G White Journal: Phys Rev Lett Date: 2011-03-07 Impact factor: 9.161