We propose a general approach to describe large amplitude motions (LAM) with multiple degrees of freedom (DOF) in molecules or reaction intermediates, which is useful for the computation of thermochemical or kinetic data. The kinetic part of the LAM Lagrangian is derived using a Z-matrix internal coordinate representation within a new numerical procedure. This derivation is exact for a classical system, and the uncertainties on the prediction of observable quantities largely arise from uncertainties on the LAM potential energy surface (PES) itself. In order to rigorously account for these uncertainties, we present an approach based on Bayesian theory to infer a parametrized physical model of the PES using ab initio calculations. This framework allows for quantification of uncertainties associated with a PES model as well as the forward propagation of these uncertainties to the quantity of interest. A selection and generalization of some treatments accounting for the coupling of the LAM with other internal or external DOF are also presented. Finally, we discuss and validate the approach with two applications: the calculation of the partition function of 1,3-butadiene and the calculation of the high-pressure reaction rate of the CH(3) + H → CH(4) recombination.
We propose a general approach to describe large amplitude motions (LAM) with multiple degrees of freedom (DOF) in molecules or reaction intermediates, which is useful for the computation of thermochemical or kinetic data. The kinetic part of the LAM Lagrangian is derived using a Z-matrix internal coordinate representation within a new numerical procedure. This derivation is exact for a classical system, and the uncertainties on the prediction of observable quantities largely arise from uncertainties on the LAM potential energy surface (PES) itself. In order to rigorously account for these uncertainties, we present an approach based on Bayesian theory to infer a parametrized physical model of the PES using ab initio calculations. This framework allows for quantification of uncertainties associated with a PES model as well as the forward propagation of these uncertainties to the quantity of interest. A selection and generalization of some treatments accounting for the coupling of the LAM with other internal or external DOF are also presented. Finally, we discuss and validate the approach with two applications: the calculation of the partition function of 1,3-butadiene and the calculation of the high-pressure reaction rate of the CH(3) + H → CH(4) recombination.
Quantum chemical methods, as implemented
in many software packages,[1,2] allow for routine calculations
of various properties of molecules.
Among them, thermodynamic and kinetic data are crucial for the understanding
and prediction of chemical processes.[3]A reliable prediction of such properties requires tackling two
main issues: First, it is necessary to calculate accurate molecular
energies at different geometries defining the energetic minima (stable
states) and saddle points (transition states, TS) on the potential
energy surface (PES). Modern ab initio methods are well-adapted and
efficient in addressing these issues. A second issue arises when one
wishes to compute temperature-dependent properties (entropies, heat
capacities, reaction rates) for which densities of states (DOS) and/or
partition functions need to be computed. This is usually achieved
using the simple rigid rotor harmonic oscillator (RRHO) approximation.
Indeed, most of the internal vibrations are of small amplitude and
are very well-described under this approximation. However, it has
been shown that these approximations fail when large amplitude motions
are involved,[4,5] and a rigorous treatment is often
required if an accurate prediction of statistical properties is required.[6−8] This is particularly the case when a torsional motion is present[7,9−11] or when a loose TS is involved in a dissociation/recombination
reaction.[12−14] Numerous studies have been dedicated to the development
of statistical methods in the context of one or the other application,
while, to our knowledge, none has been presented for the general case.
In cases of torsional motions, the proposed treatments are usually
derived from the one-dimensional hindered rotor (1DHR) model, originally
proposed by Pitzer et al.[4] Recent studies[7,9,11] show the importance of a rigorous
treatment, where quantum effects, multidimensionality, as well as
coupling with other internal or external motions can have a non-negligible
influence. In the context of dissociation reactions involving a loose
complex, two- to five-dimensional large amplitude motions need to
be described. Analytical expressions for the kinetic energy of two
rigid counterparts have already been derived in previous work,[12,13,15] and it has been shown that this
formalism gives a very good approximation of the reaction rate, as
long as the used PES is determined with a high level of accuracy.[12,16,17]Since the kinetic energy
operator has an analytic form, the main
and most critical issue involved in studies of large amplitude motion
is the determination of the PES.[18,19] Even though
the first-principle calculations have proven their ability to yield
accurate electronic energies, their computational cost does not generally
allow for on-the-fly calculations,[20] and
therefore, an appropriate interpolation method has to be used.[20] Three approaches can be distinguished: (i) local
(or weighted) interpolation,[19,21−23] (ii) global (or nonweighted) interpolation by a set of mathematical
functions,[16,24,25] and (iii)
global interpolation by a parametrized physical model.[26−29] The first two solutions are general in principle but tedious to
set up in practice when the dimension of the problem increases. The
practical issues involved, along with the large amount of ab initio
energy computations required, are important limitations to their applicability.
The third solution is a global interpolation by a parametrized physical
model of the interaction energy and is widely referred to as a ‘force
field’. This particular formalism is extensively used to study
the dynamics of large atomic systems.[30−32] Explicit introduction
of the physical contributions to the interaction energy allows for
a considerable decrease in the amount of information required to set
up the model. However, because this simplification can introduce large
model errors, it is of primary importance to estimate the confidence
expected on a final quantity of interest (QoI). Only very recently
has this issue been addressed by Cailliez and Pernot.[33] By using a Bayesian approach, they have calculated the
parameter uncertainties associated with a van der Waals type PES and
have evaluated their influence on the uncertainties for the thermodynamic
data. A major conclusion of their work is that the PES model error
is indeed a critical feature when one wants to predict thermodynamic
properties.The purpose of this paper is two-fold: First, we
present a general
procedure based on a Bayesian framework to infer a system-specific
force field. This procedure allows the computation of probability
density functions (PDFs) of the model parameters using a set of ab
initio data and also allows their propagatation to the desired QoI
(here, partition function or reaction rate). The other concern of
this paper is the presentation of a general approach for characterizing
the statistical properties of a large amplitude motion (LAM). The
restriction of previous methods to specific types of motion, largely
due to an ad hoc derivation of the kinetic energy of the system, is
overcome by introducing a new numerical method based on a functional Z-matrix, in which the DOF (DOFs) are written as functions
of generalized coordinates. This approach provides several advantages:
(i) the dynamical variables can be of any kind (bond length, bending
or dihedral angle, reaction path-like coordinate, etc.), (ii) the
geometry relaxation as well as the constraints are easily expressed
in the Z-matrix formalism, and (iii) the numerical
implementation is straightforward and robust. Other issues, such as
quantum effects and couplings between LAMs with the overall rotation
and the internal HO, are approximately taken into account using selected
methods from the literature.[7,11]To illustrate
the advantages and accuracies of the proposed methodology,
two calculation tests are presented: (i) the partition function of
1,3-butadiene, and (ii) the high-pressure limit of the CH3 + H → CH4 reaction rate. The presented methods
are parts of the C++ library openSOAMS,[34] currently under development.
Theory
The overall methodology proposed in this paper
is illustrated in
Figure 1. The particular steps that are emphasized
in this work are (i) the inference of a PES model (parametrized by
the vector θ) from a set of ab initio energies (‘PES’
box), (ii) the implementation of a Z-matrix coordinate
system for the computation of the kinetic function (‘Kinetic’
box), and (iii) the Lagrangian analysis of the LAM and PES uncertainty
propagation to the QoI (‘Stat. & UQ’ box). In the
canonical ensemble, the partition function is the central quantity
for computing thermodynamic data and rate constants. In order to compute
the partition function, we propose to split the total Hamiltonian
of a system containing a LAM into the following four contributions:
(i) a large amplitude motion contribution, from which overall rotation
and translation are rigorously removed, (ii) the small amplitude vibrations,
eventually considered loosely coupled to the LAM, (iii) the overall
rotation, loosely coupled to the LAM, and (iv) the overall translation,
rigorously independent of all the other modes. The total partition
function of the system is then trivially obtained by simple multiplication
of all the contributions:where T is the temperature
and QLM, QHOcoupled, Qrotcoupled and Qtrans are the partition function
of the contributions (i) to (iv), respectively. In this section, we
will address the calculation of QLM, QHOcoupled, and Qrotcoupled in detail, and the partition function
of the overall translation has been discussed extensively elsewhere.[35]
Figure 1
Flowchart of the thermal statistical computations in our
approach.
Acronyms and abbreviations: potential energy surface (PES); uncertainty
quantification (UQ); probability density function (PDF); thermodynamic
(TD); parameters (param.); translation (transl.); rotation (rot.);
statistic (stat.).
Flowchart of the thermal statistical computations in our
approach.
Acronyms and abbreviations: potential energy surface (PES); uncertainty
quantification (UQ); probability density function (PDF); thermodynamic
(TD); parameters (param.); translation (transl.); rotation (rot.);
statistic (stat.).
LAM
Let a LAM of a molecular system be described by
the positions R⃗at of the atomic
nuclei in a system of generalized coordinates q. The
Lagrangian of the system is writtenwhere V(q) is
the electronic PES and T(q,q̇) the kinetic energy, given by the relation:The matrix A̲ is called
the kinetic matrix, its elements are defined byThe theorem of Aston and Eidinoff[36] allows a computation of the LAM partition function
through a configurational integral involving the determinant of the
kinetic matrix and the PES:where kB is the
Boltzmann constant, T the temperature, n the dimension of q, and δ the symmetry number
of the motion (see the work of Fernández-Ramos et al.[37] for a thorough discussion). The term K(q) = |A̲(q)|, called the kinetic function here, completely describes the kinetic
part of the LAM from a statistical point of view. This integral is
restricted within configurational space and allows a simple numerical
procedure for its evaluation for routine applications. The partition
function defines the statistical properties of the LAM in the canonical
ensemble; specifically, the probability density that the system occupies
a particular position q in the classical formulation
is given byIn the case of one-dimensional motion,
the computation of the partition
function is improved by using quantum statistics. The method is similar
to the one presented by Reinisch et al.[38] An effective temperature-dependent kinetic constant is introduced
by the following relation:The Fourier grid Hamiltonian algorithm[39] is then employed to compute the eingenvalues
εeff(T) of the effective Hamiltonian defined
byThe partition function is finally obtained
by a direct count of
the eigenvalues εeff(T):
Calibration of the PES
The calibration of a model can
be viewed as the update of model parameters in order to get a better
respresentation of observations. This model can then be used to make
predictions of a quantity of interest for specific scenarios. When
calibrating a physico-mathematical model of the PES with respect to
ab initio data, uncertainties are associated with the “physical
model uncertainties”, arising from inadequacies of the physical
model (due to underlying assumptions and simplifications). As a result,
there are “model parameter uncertainties,” arising from
uncertain values of the model parameters. These uncertainties are
simultaneously quantified here through the solution of statistical
inverse problems based on a Bayesian approach, as illustrated by the
block ‘PES’ in Figure 1.
Check Model (In)adequacy
Let M designate
a stochastic model class.[40,41] A stochastic model M is specified by a set of uncertain parameters a, to which an additional uncertain parameter, variance (σtotal2), is included
as a measure of the total model error. Each stochastic model is then
specified by the set θ = a ∪ σtotal2 ∈ Ω
⊂ . One can use the data D to compute the posterior PDF p(θ|D,M) as defined by Bayes theorem:where c is a normalization
constant that makes the probability volume under the posterior PDF
equal to unity, p(D|θ
M) is the likelihood function, and p(θ|M) is the prior PDF for θ (always chosen to be uniform
in this paper). The likelihood function expresses the probability
of observing D based on the predictive PDF for the
system output given by the set of parameters θ in the model M. To compute the likelihood function, the assumption of
statistically independent error is used. We denote by D the j data point and by Y the model output computed for the same scenario
as D. We also consider
an additive error based on the assumption that the error is independent
from the value D:In this study we assume that there is no data
error, e.g., that the ab initio values D calculated at a particular level of theory are perfectly
converged with respect to all the numerical parameters of the quantum
method used. The error is modeled in this paper as a Gaussian deviate
with zero mean and variance σtotal2. Based on all these assumptions, the
likelihood function readswhere Nd is the
number of data points. The variance σtotal2 is treated as an unknown and thus needs
to be calibrated as well. Because no data error is assumed, the calibrated
variance is a measure of the model adequacy, and thus σtotal2 will be referred
to simply as σM2. The criterion for accepting a model as “not-invalidated”
(hereafter, we simply say “validated”) is subjective;
it requires a metric to compare the predicted quantities produced
by the calibration and the data used for the calibration. If the data
agree within the acceptable tolerance limit, the model (denoted as Mval) is then directly used for the forward problem:
the PDF model parameters are propagated to the quantity of interest.
This is illustrated in the right part of the block ‘Stat. &
UQ’ in Figure 1 and discussed in detail
below.
Uncertainty Propagation
One of the most important objectives
of performing the above analysis is to make robust predictions about
the QoI from a data set of the system of interest. Based on a candidate M, all the probabilistic information for the prediction
of a vector of quantities of interest Q is contained
in the posterior predictive PDF for M given by the
theorem of total probability:The above equation obtains the prediction p(Q|D,M)
of a vector of quantities of interest Q ∈ R by summing up the prediction p(Q|θ,D,M) of each model specified by θ ∈ Ω weighted by
its posterior probability p(θ|D,M)dθ. The evaluation of the multidimensional
integrals of eq 13 cannot usually be done analytically.
A common numerical approach often used is based on simulating samples
θ(, k = 1, 2,
..., K, (called posterior samples) from the posterior
PDF p(θ|D,M).
The posterior PDF p(θ|D,M) can be approximated by using these samples:We use in this paper the adaptive multilevel
stochastic simulation algorithm presented by Cheung and Beck[42] to generate the posterior samples. The integral
in eq 13 is then approximated bywhere Q( is a sample simulated from p(Q|θ(,D,M). The Monte Carlo simulation method[43] can be used to simulate this sample. If Q is
a deterministic function of q (i.e., Q = Q(θ)), then Q( = Q(θ().
Estimates for important statistical moments of Q conditioned
on M and D can be obtained using
the samples Q(, k = 1, ..., K. For instance, the posterior
mean is calculated as follows:The 95% confidence interval (CI), usually
taken to represent the confidence expected on a predicted outcome,
is the interval I defined by:In this paper, the mean and the 95%
CI on the predicted posterior
of Q are noted by and [Q], respectively. The library QUESO[44] is
used to solve the inverse problem and to compute the posteriors p(θ|D,M). The current
numerical methodology is very efficient and feasible for various engineering
applications (e.g. refs (45−49)).
Kinetic Function
The computation of the kinetic function
|A̲(q)| defined by eq 4 is achieved using the internal Z-matrix
coordinates. As illustrated in the kinetic box in Figure 1, the objective is to compute the absolute displacement of
the atoms with respect to variations in the generalized coordinates
from the atomic relative positions. The contribution of the overall
rotation and translation is separable through eq 1, and the kinetic function is calculated so that the total angular
(J⃗) and linear (P⃗) momenta associated to any variation dq are null.Let the internal coordinates of the Z-matrix be noted by the vector z of dimension N, where N = 3N – 5 for
a linear molecule or 3N – 6 otherwise (N being the number of atoms). The configurations of an n-dimensional motion parametrized by the generalized coordinates q will then be described by N functions z(q). The connectivity scheme of the Z-matrix along with the functions z(q) is referred to as a functional Z-matrix. It is the definition of the kinetic energy of the system.
The (relative) positions of the atoms obtained using the Z-matrix definition are denoted by Z⃗at(q), and their construction rule is straightforward:
the first three atoms are defining the orientation of the structure,
and the origin of the Cartesian coordinates coincides with the center
of gravity of the system. Given an initial structure R⃗at(q0) directly constructed from
the Z-matrix (e.g., R⃗at(q0) = Z⃗at(q0)), the ∂R⃗at/∂q terms in eq 4 are calculated by determining
the structure R⃗at(q + dq) generated
by any displacement dq and associated
to J⃗ = P⃗ = 0⃗.
While the condition P⃗ = 0⃗ is satisfied
by construction in the Z-matrix (because the center
of mass of the system is fixed at the origin), this is not the case
for the condition J⃗ = 0⃗. By consequence,
the structure R⃗at(q0 + dq) is related to Z⃗at(q0 + dq) by an overall rotation around its center
of gravity:where M̲rot is
a rotational matrix defined by the condition:The matrix M̲rot is optimized by
an iterative procedure, generalizing the one presented in our previous
paper[38] which was restricted to a fixed
axis of rotation. The iterative method is initiated by setting M̲rot = . At each step i, the angular momentum J⃗ is calculated using
the current M̲rot according toThe matrix M̲rot is then corrected at the
next iteration bywhere M̲rot is the matrix of rotation around
the axis J⃗ (current
overall angular momentum) with the counter rotational angle dθ defined bywhere dα is an elementary
angle (taken in this paper as 10–4 radian), J⃗ext is the angular momentum associated
with a rotation of dα around J⃗ of the structure M̲rotZ⃗at(q0 + dqα).This procedure leads to a progressive annihilation of J⃗ associated with a decrease of the kinetic
function. The procedure is stopped in this paper when the kinetic
function is converged within a relative factor of 10–5.
Coupled Motions
Flexible Rotor (FR) Partition Function
The presence
of the large amplitude motion leads to some coupling with the overall
rotation, and the usual RR approximation needs to be overcome. The
partition function is computed by considering a loose coupling between
the LAM and the overall rotation, as proposed by Vansteenkiste et
al.[11] The partition function of the flexible
rotor QFR(T) is based
on the rigid rotor (RR) expression,[50] calculated
at each configuration point of the LAM and averaged using the probability
density defined in relation (eq 6):where QRR(T,q) is the partition function of a RR[50] of the structure Z⃗at(q).
Flexible Oscillator (FO) Partition Function
As for
the overall rotation, the presence of the LAM may lead to some coupling
with small internal vibrations. Wong et al.[9] and Vansteenkiste et al.[11] proposed to
perform an integration similar to eq 23 by using
a HO approximation of the small amplitude vibration at each configuration
point. This procedure leads to an excellent approximation of the partition
function, however it requires the costly evaluation of the Hessian
over all the configurational space. Here, we propose to restrict the
integration to the stable configurations involved in the LAM. The
partition function of the coupled small amplitude vibration under
this assumption is given bywhere the sum is carried out on all the stable
configurations of the LAM, q is the associated configurational point, QHO(T,q) is the partition function of the small amplitude vibration
in the HO approximation[35] of the structure Z⃗at(q), and P(q,T) is the weighting factor defined by
Applications
1,3-Butadiene Partition Function
The first application
concerns the computation of the partition function of 1,3-butadiene.
The geometry of the molecule at the minimum of the PES is illustrated
in Figure 2 along with the atom numbering used
hereafter. The molecule contains a torsional motion defined by the
relative rotation of the two H2CH parts around the central
C5–C2 bond. Previous studies[9,11] have shown that it involves a highly asymmetric internal rotor associated
with a non-negligible geometry relaxation effect as well as a non-negligible
coupling between the torsion and the other DOF.
Figure 2
Illustration of the geometry
of 1,3-butadiene at the global minimum
of the PES. The large circles represent the carbon atoms, and the
small circles represent the hydrogen atoms. Bond lengths are in angstroms
and angles in degrees. The atom numbering used in the paper is indicated
in the circles.
Illustration of the geometry
of 1,3-butadiene at the global minimum
of the PES. The large circles represent the carbon atoms, and the
small circles represent the hydrogen atoms. Bond lengths are in angstroms
and angles in degrees. The atom numbering used in the paper is indicated
in the circles.
Electronic Structure Calculation
Ground-state electronic
energies are calculated using the ab initio methods UB3LYP[51]/6-31G(d)[52] (hereafter
referred to as DFT) for qualitative calculations, and RHF-CCSD(T)/aug-cc-pVDZ[53,54] (hereafter referred as CC) for accurate quantitative calculations.
Geometry optimizations and a normal-mode analysis are performed at
the UB3LYP/6-31G(d) level of theory. The torsional PES is obtained
by computing the relaxed geometries with respect to the dihedral C1C2C5C7 using a 10° step.
We use a local quadratic interpolation to obtain ab initio values
of the energy at other torsional angles. All the electronic calculations
have been performed using the GAMESS code.[2]The Table 1 presents the functional Z-matrix used to compute
the kinetic function. The functions f(θ) (i = 1, ..., 8) define
the relaxation of the DOF with respect to the generalized coordinate
θ. The DOFs contributing to less than 1% of the kinetic function
have been considered constant. The functions f(θ) are obtained by Fourier series
fit to the optimized internal coordinates and are presented in Appendix A. The kinetic function calculated using our
numerical approach is presented in Figure 3 (solid line) and compared to those reported by Vansteenkiste et
al.[11] and Wong et al.[9] A very good agreement is observed between the three studies,
the difference from the Eckart method at θ ≈ 180°
being most likely attributed to the different level of theory used
in the study of Wong et al. for geometry optimization (MP2). We also
present in Figure 3 the kinetic function computed
when the relaxation of the geometry with respect to θ is not
taken into account (f(θ) = 0 ∀i, dashed line). As it can
be seen, the kinetic function is strongly affected and gives values
in close agreement with those presented by Wong et al.[9] calculated using the Pitzer method.[55] Because the Pitzer values are also based on relaxed geometries,
this demonstrates that the Pitzer method actually fails to take into
account the dynamic influence of the relaxation on the kinetic energy.
Table 1
Functional Z-Matrix
Definition of the Torsional Motion of the 1,3-Butadienea
C1
C2
1
1.34
H3
1
1.09
2
121.8
H4
1
1.09
3
116.6
2
180 + f1 (θ)
C5
2
1.46 + f2 (θ)
1
1243 + f3 (θ)
3
180 + f4 (θ)
H6
2
1.09
1
119.4
3
f5 (q)
C7
5
1.34
2
124.3 + f6 (θ)
1
–180 + θ
H8
5
1.09
2
116.2
1
θ + f7 (θ)
H9
7
1.09
5
121.5
2
f8 (θ)
H10
7
1.09
5
121.8
2
180
+ f9 (θ)
The definition of the relaxation
functions f(θ)
is given in Appendix A. Lengths are in
angstroms and angles in degrees.
Figure 3
Comparison between the kinetic function calculated in this work
(lines) and the kinetic function obtained using the EHR model[11] and the Eckart[9] and
the Pitzer[55] methods (presented by Wong
et al.).[9] Solid line: flexible torsion
(f(θ) ≠
0); dashed line: rigid rotation (f(θ) = 0).
The definition of the relaxation
functions f(θ)
is given in Appendix A. Lengths are in
angstroms and angles in degrees.Comparison between the kinetic function calculated in this work
(lines) and the kinetic function obtained using the EHR model[11] and the Eckart[9] and
the Pitzer[55] methods (presented by Wong
et al.).[9] Solid line: flexible torsion
(f(θ) ≠
0); dashed line: rigid rotation (f(θ) = 0).The model used to describe the
PES of the torsional motion accounts for two physical contributions:
(i) the contribution of the intrinsic torsional energy of the C5–C2 sp2 chemical bond, and (ii)
the contribution of a repulsive part, representing the steric repulsion
between the atoms H9 and H4 during the torsion.
Using the definition of the repulsive part of a Lennard-Jones-type
potential, and a modified cosine function to account for the sp2 torsion energy, the PES is defined using the three parameters a0, a1, and a2 bywhere dvdW is
taken equal to 2.4 Å (sum of the van der Waals radii of the hydrogen
atoms), and g(θ) = 1 – a0(1 –
cos(2θ)). The function g adds DOF to model the bonding contribution of
the sp2 torsional energy, and one can verify that g(θ) = g(−θ)
as well as g(−180) = g(180) = g(0) = 1. The term (dvdW/|R⃗9(0) – R⃗4(0)|)9 in the equation ensures the condition V(0) = 0. The stochastic PES model M contains
four parameters to calibrate: the three involved in eq 26 (a0,a1,a2), and the variance (σ2).95% CI of the posterior PES using the data set DDFT with n = 5, 10, and 20 (dotted, dashed,
and solid lines, respectively), and ● represents the data set D19DFT.In order to study how the ab initio data inform
the PES parameters,
different data sets are used for the inference process. They are denoted
by DX where X ∈ {DFT,CC} stands for
the level used for the electronic energy calculations, and n is the number of values included (in uniform repartition
in [0;180°]). The prior PDFs, the 95% CIs based on the posterior
PDFs, and the posterior means are presented in Table 2.
Table 2
Definition of the Parameter’s
Prior and Description of the Parameter’s Posterior for Each
Data Seta
95% CI of the posterior PES using the data set DDFT with n = 5, 10, and 20 (dotted, dashed,
and solid lines, respectively), and ● represents the data set D19DFT.
We
now look at the posterior PDFs of model parameters obtained
using the data set DCC (n = 2, 3, 5) presented in the four last rows of Table 2. In this case, we calibrate model parameters except for σ2, which is set to be 1.25 × 10–3 kcal2/mol2. In other words, we assume that the model
error obtained by the DFT level of theory is close to the realistic
estimation of the true model error. As can be seen in Table 2, this assumption is reasonable since [V]5CC is already
very narrow (and actually converged) for all the parameters even when
five points are used. It is also shown that using two points (at 0°
and 180°) is not enough to infer a0 and a2 (posterior PDFs approximately
equal to the prior PDFs), while a third point at 90° allows a
considerable reduction of the uncertainty on a2. These results are illustrated in Figure 5, which presents the uncertainty domain of the posterior PES
for these three data sets. Comparing [V]5CCa posteriori with the D19CC data set demonstrates that the DFT level
of theory is fully able to estimate the absolute model error for this
case as the data points are almost perfectly encapsulated in [V]5CC (assuming that the CCSDT/aug-cc-pVDZ level of theory gives the exact
PES).
Figure 5
95% CI of the posterior PES using the data set DCC with n = 2, 3, and 5 (dotted, dashed, and
solid lines, respectively). The ○, ×, and △ represent
the corresponding data set, respectively, and ● represents
the data set D19CC.
95% CI of the posterior PES using the data set DCC with n = 2, 3, and 5 (dotted, dashed, and
solid lines, respectively). The ○, ×, and △ represent
the corresponding data set, respectively, and ● represents
the data set D19CC.95% CI of the posterior torsional partition function using
the
data set DDFT with n = 5,
10, and 20 (dotted, dashed, and solid lines, respectively), and ●
mean of the posterior torsional partition function using DDFT. Results are normalized by the HO partition function.
Forward Problem: Partition Function
The 95% CI of the
posterior partition function [QLAM]DFT using the DDFT data sets is presented in Figure 6. The results
are normalized to the partition function obtained using an HO approximation.
The mean value of the posterior partition function using D5DFT is also
presented. As for the PES results, the 95% CI becomes very narrow
as soon as 10 points are used, and the mean value < Q >5DFT is
already
almost converged. The fact that the [Q]X is centered around 1 at low temperature (T <
400 K) is important because it demonstrates that
the method presented here to treat the 1D LAM has a sufficient capability
to perfectly reproduce the harmonic approximation results in the low-temperature
limit.
Figure 6
95% CI of the posterior torsional partition function using
the
data set DDFT with n = 5,
10, and 20 (dotted, dashed, and solid lines, respectively), and ●
mean of the posterior torsional partition function using DDFT. Results are normalized by the HO partition function.
The total partition function of 1,3-butadiene is obtained
by multiplication of the LAM partition function by the FR and FO partition
functions. To compute the FHO partition function, we consider the
system constituted by two stable complexes, characterized by the normal-mode
frequencies at θ = 0° and 145° (the normal-mode frequencies
are presented in Appendix B). It should
be noted that the PES uncertainty also affects the FR and FO partition
functions through eqs 6 and 25. The 95% CI of the posterior partition function of 1,3-butadiene
using the D5CC (with σ2 kept fixed at 1.25 ×
10–3 kcal2/mol2) and D20DFT data sets are presented in Figure 7 and compared
to the results presented by Wong and Raman[56] (Pert model) and those presented by Vansteenkiste et al.[11] [extended hindered rotor (EHR) model]. The results
are normalized by the partition function obtained using a RRHO approximation.
We first compare our prediction using D5CC with the low-temperature
results presented by Wong and Raman[56] using
their Pert method. The comparison is meaningful because the same level
of theory has been used to compute the PES and also because the Pert
method used represents most likely the best model available in the
literature. However, this fully coupled quantum method is computationally
expensive, which explains why no results for temperatures higher than
500 K have been presented. Although our predictions slightly overestimate
the partition function, the behavior of the curves is very similar,
and the results stay very close to each other. The small overestimation
comes from either the simplification introduced by our treatment and/or
from differences in the RRHO reference partition function. We now
compare the results obtained by Vansteenkiste et al.[11] (EHR method) and our results using the D20DFT data
set. The same DFT method used to compute the PES is shared by the
two studies; however, Vansteenkiste et al. consider a larger basis
set in their work (6-311G(d,p)). The two PESs are nevertheless similar,
as reflected by the small difference of the torsional barrier height
(0.1 kcal/mol). As can be seen in the figure, the behavior of the
partition function curves is very similar, essentially differing by
a small translation factor. This factor seems to be an artifact involved
in the EHR method, as the partition function does not converge to
1 at the limit of low temperatures.
Figure 7
Comparison between the 95% CI of the posterior
partition function
of the butadiene using D20DFT (dashed line) and D5CC (solid
line, σ2 = 1.2510–3); ×, Pert
method;[56] and ●, EHR method.[11] Results normalized to the RRHO partition function.
Comparison between the 95% CI of the posterior
partition function
of the butadiene using D20DFT (dashed line) and D5CC (solid
line, σ2 = 1.2510–3); ×, Pert
method;[56] and ●, EHR method.[11] Results normalized to the RRHO partition function.The model 1D LAM method proposed in this paper
is able to account
for 1D torsional motion with an accuracy comparable to the best methods
presented until now. It is worth recalling that only 2 normal-mode
analyses have been realized here, while they have been conducted all
over the configurational space in the studies of Vansteenkiste et
al.[11] and Wong and Raman.[56] Also, this efficient formalism, in terms of computational
time, further allows it to be used in conjunction with an uncertainty
quantification algorithm.
Kinetic Rate of CH3 + H → CH4 in the High-Pressure Limit
The reaction rate of the CH3 + H recombination is computed at the high-pressure limit
using variational transition-state theory (VTST) with variational
reaction coordinate (VRC) and spherical dividing surfaces (DS) in
the canonical ensemble. In the present case, the DS is parametrized
using one pivot point p⃗ attached to the CH3 part, around which the approaching H* atom is allowed to
rotate. The optimal DS (noted DS*), defining the TS, is optimized
for every temperature in a way that it is associated to a minimal
partition function:[57,58]where s is the reaction coordinate
(separation p⃗–H*), and QDS(T) is the partition function
of the DS defined by s and p⃗. The high-pressure reaction rate is calculated using the standard
TST assumptions and is given by[50]where ℏ is Planck’s
constant, QCH and QH the partition functions of the methyl and
hydrogen radicals, respectively, and V* is the electronic
barrier height associated with the CH3 + H DS*. Over the
small amplitude motion involved in the system, the CH3 umbrella
motion changes quite substantially and may need to be included in
a FO partition function. However, Klippenstein et al.[16] have shown that its influence is negligible with at most
a 2% increase of the reaction rate for temperatures below 2400 K.
In this work, the small amplitude motions are then supposed uncoupled
to the reactional motion, and the reaction rate expression is simplified
towhere QDS*2D is the partition function
of the H* motion on the DS*, QDS*FR the effective overall rotation
partition function of the DS, and QCHtrans&rot the
partition function of the overall translational and rotational motion
of CH3.Ground-state electronic
energies are calculated using the CR-CC(2,3)[59,60]/aug-cc-pVDZ[53,54] level of theory. The CR-CC(2,3)
method is an improvement over the CCSD(T) approach to overcome its
deficiencies in describing systems involving biradical character.[61] Geometry optimizations are carried out at the
UB3LYP[51]/6-31G(d)[52] level of theory. All the electronic calculations have been performed
using the GAMESS code.[2]The kinetic energy of the DS is obtained
by defining the functional Z-matrix associated with
the H* 2D motion. Because of the C3 symmetry
of CH3, the DS is parametrized by two parameters. The first
parameter is the distance x between the pivot point p⃗, lying in the C3 axis
of symmetry of CH3, and the carbon atom, taken as the origin.
We denote from now on p⃗ = x. The second parameter is the distance s between
the approaching H* atom and the pivot point. The functional Z-matrix which describes the DS is presented in Table 3. The C3 symmetry is
imposed to the system, and the relaxation of the CH3 part
is taken into account to some extent by using the functions a(|*|) (interpolation between the optimized
values on the MEP).
Table 3
Z-Matrix Definition
of the Dividing Surface Parameterized by the Reaction Coordinate s and the Pivot Point xa
C
X
1
x
H0
1
1.09
2
H1
1
1.09
2
3
120
H2
1
1.09
2
4
120
H*
2
s
1
q0
3
q1
Lengths in angstroms, angles
in degrees.
Illustration of a typical bifaceted dividing surface of
the CH3 + H → CH4 reaction.It has been shown that a limitation of the spherical
dividing surfaces
is that they overaccount artificial contributions to the reaction
rate when multiple reaction paths are present.[12,13] To avoid this overestimation, a multifaceted dividing surface is
used, composed by the envelope of spherical DSs centered around a
reactive channel-specific pivot point. Considering the equivalence
of the two association channels for the H* addition on CH3, a typical dividing surface for the reaction is illustrated in Figure 8. To compute the DS partition function, the integration
domain defined in eq 5 has to be restricted
towhere q0min = arcsin(x/s) is the solid angle from X defining
the intersection of the two spherical DS. The overall symmetry number
of the irreducible integration domain is then 12. Since this number
takes into account the two reaction paths, no symmetry number has
to be considered in the overall rotational partition function of both
CH3 and the DS.
Figure 8
Illustration of a typical bifaceted dividing surface of
the CH3 + H → CH4 reaction.
Lengths in angstroms, angles
in degrees.The PES of the CH3/H* interaction accounts for three contributions: (i) the stretching
energy of the C–H* bond (noted Vstr), (ii) the bending energy of the HCH*
DOF (noted Vbend), and (iii) the steric
repulsion energy between the H and H*. The stretching energy Vstr is written as a Morse-like function of |*| with an origin of energy taken at the
products state:where De = 107.75
kcal/mol is the dissociation energy, req = 1.09 Å is the equilibrium bond length, and a0 and a1 are the first two
parameters of the PES model. Note that the function is defined only
for |*|>req. The
bending energy is modeled by a simple cosine function, with a barrier
height equal to the dissociation energy at the angle α = ± π/2:Finally the PES also accounts for the spherical
repulsion between the passive hydrogen atoms H and H*:where σHH = 2.4 Å with a2 and a3 being the
last two parameters, and H*0 is the corresponding position
of H* on the MEP (at the same C–H* separation). The term (σHH/|(|)) allows a cancellation of the steric interaction energy on the MEP.
The PES is the sum of these three contributions and is defined through
the four parameters a, i = 0, ..., 3:The geometries used to compute the
electronic energies are defined by the Z-matrix presented
in Table 3 with x = ε
= 10–5 (in order to preserve a nonambiguous definition
of the C3 axis). The 50 data points collected
are divided in two groups: (i) one is used to infer the PES model
(noted D*), and (ii) the second is used to validate
it for extrapolation (noted D). The data set D* contains 25 points in the relevant space of the PES for
reaction rate calculation and is constituted by:The D data set collects the energies of the
corresponding geometries of (ii) at q1 = 30° as well as 10 other points at s = 1.6
Å. The prior intervals, the posteriors means, and 95% CIs are
presented in Table 4, and the 4 posteriors p(a|D*,M) are presented in figure Figure 9. We confirm
here that the Morse potential (defined for a1 = 1) is not the most appropriate to describe the MEP, and
a value of a1 = 1.25 is the most probable
value. The mean value of a3 is surprisingly
low for a repulsion term: in a van der Waals force field, the exponent
of the repulsive part is usually taken between 9 and 12.
Table 4
Prior PDFs of the Parameters and 95%
CIs ([ ])a
a0
a1
a2
a3
σM2
prior
[0;10]
[0;10]
[0;10]
[0;10]
[0;10]
[ ]
[1.75;1.87]
[1.22;1.43]
[2.13;6.04]
[2.12;2.87]
[1.6;2.8]
< >
1.82
1.27
4.12
2.53
2.59
Based on the posterior PDFs,
and the posterior mean values (< . >). Units: a0, Å–; a1: no unit; and a2, a3, and σM, kcal/mol.
Figure 9
Posterior PDF, p(a|D*,M), (a–d)
for respectively i = 0, 1, 2, and 3.
10 points on the MEP (q0 = q1 = 0) from s = 2.0–3.8 Å.15 points out of the MEP defined
by the combination s ∈ {2.0, 2.4, 2.8 Å}
and q0 ∈ {20,40,60,80,100°}
with q1 = 0.Based on the posterior PDFs,
and the posterior mean values (< . >). Units: a0, Å–; a1: no unit; and a2, a3, and σM, kcal/mol.Posterior PDF, p(a|D*,M), (a–d)
for respectively i = 0, 1, 2, and 3.We compare in Figure 10 the
95% CI of the
posterior MEP ([V]) with the data points of D* (solid circles) and D (open circles)
included in the MEP. We also report in this figure the ab initio results
at the CASPT2/aug-cc-pVQZ,[17] full-CI/6-31G(d),[62] and CCSD(T)/6-31G(d)[17] level of theories presented in the literature. The uncertainty of
the PES model is under 2 kcal/mol for separations higher than 2.0
Å and coincidently encapsulates the results obtained at the full
CI and CASPT2 level of theory. As already reported,[17] the CCSD(T) level of theory is not able to properly estimate
the energy of the system for intermediate separation. The comparison
between the PES model and the direct ab initio data is shown in Figure 11 for out-of-MEP situations. The symbols still represent
the D* and D data sets respectively.
The approximation of the PES model is very satisfying. Even for extrapolated
values at low separation (s = 1.6 Å), the model
still performs well with the hindering domain being correctly predicted
within 10%.
Figure 10
Comparison between the predicted 95% CI of the MEP ([V]), the data sets D and D* and
the ab initio results presented in literature. Lines, [V]; ●, D*; and ○, D. Gray symbols, literature results: *, CCSD(T)/6-31G(d);[17] †+, CASPT2/aug-cc-pVQZ;[17] and ×, full CI/6-31G(d).[62] Origin of energy: V(4 Å,0,0).
Figure 11
Comparison between the predicted 95% CI of the PES ([V]) with the data sets D and D*.
Lines, [V]; filled symbols, D*;
and empty symbols, D. Origin of energy: V(4 Å,0,0).
Comparison between the predicted 95% CI of the MEP ([V]), the data sets D and D* and
the ab initio results presented in literature. Lines, [V]; ●, D*; and ○, D. Gray symbols, literature results: *, CCSD(T)/6-31G(d);[17] †+, CASPT2/aug-cc-pVQZ;[17] and ×, full CI/6-31G(d).[62] Origin of energy: V(4 Å,0,0).Comparison between the predicted 95% CI of the PES ([V]) with the data sets D and D*.
Lines, [V]; filled symbols, D*;
and empty symbols, D. Origin of energy: V(4 Å,0,0).
Forward Propagation to the High-Pressure Recombination Rate
The 95% CI ([k]) and the mean () of the posterior high-pressure reaction are presented in Figure 12 with the available experimental measurements[63−65] and the VTST-VRC theoretical predictions presented by Klippenstein
et al.[16] and Harding et al.[17] The experimental measures have been converted from the k0 values of the original experimental work in
the same way presented by Klippenstein et al.[16] The CIs predicted by our approach are in very good agreement with
the experimental values as well as with the theoretical calculations.
Our calculations are mainly limited by two methodological factors.
First, the representation of the dividing surface does not constitute
a perfect DS and is associated with an overestimation of the reaction
rate. The works of Klippenstein et al.[16] have shown that by using both VTST-VRC and a
direct dynamic method, the spherical DS for the CH3 + H
→ CH4 reaction is associated with a 9% overestimation,
approximately independent of the temperature. The doubly faceted DS
used here leads to an even lower recrossing factor. The second limitation
is the restriction of the statistical study to the canonical ensemble.
In other works,[12] the associated error
is evaluated at approximately 20%. The overestimation of the reaction
rate coming from the canonical analysis is probably compensated to
some extent by the PES model which predicts a slightly higher hindrance
effect at high separation (the ab initio data are close to the bottom
boundary of the CI at the s = 2.8 Å case in
Figure 11).
Figure 12
Comparison
of the high pressure CH3 + H recombination
rate predicted by this work (black lines, [k]; ■,
) with other theoretical calculations (gray
lines) and the experimental values (+,[64] *,[65] ×[63]). Gray lines, calculations using different representation of the
PES; full line, on the fly calculation;[17] dashed line, 4D mathematical fit;[16] and
dotted line, Hirst–Hase PES.
To finish this study, we
comment on the difference in the PES representation used here and
in the works presented by Harding et al.[17] and Klippenstein et al.[16] In the works
of Klippenstein et al.,[16] a combined Fourier
series/3D spline fitting procedure is used to obtain a 4D PES representation
(the umbrella motion is also explicitly considered). If the fit is
almost exact, it has been achieved using 798 ab initio calculations
in the 3D space analyzed here. Klippenstein et al. have also used
the analytical PES presented by Hirst–Hase to compute the reaction
rate. This model, in addition to having required the manual optimization
of almost 20 parameters, does not provide any indication of the associated
model error. In the study of Harding et al.,[17] an on-the-fly method is used to compute the PES. The number of quantum
calculations has not been presented, but assuming that ten points
are used to integrate the partition function along q0 and q1 and 10 points to
optimize ds and dx, this already
results in 10 000 ab initio calculations. We recall that our
model, even if associated with a 1.5 uncertainty factor on the reaction
rate, is based on 4 parameters and has been set using 25 ab initio
calculations for the calibration step (and 25 to examine the extrapolation
capabilities). The restriction to the canonical ensemble is another
issue which does not need more PES calculations to be overcome.Comparison
of the high pressure CH3 + H recombination
rate predicted by this work (black lines, [k]; ■,
) with other theoretical calculations (gray
lines) and the experimental values (+,[64] *,[65] ×[63]). Gray lines, calculations using different representation of the
PES; full line, on the fly calculation;[17] dashed line, 4D mathematical fit;[16] and
dotted line, Hirst–Hase PES.
Conclusions
We have presented in this work an original
approach for the computation
of statistical properties of molecular systems involving a large amplitude
motion. The objectives were to propose: (i) a simple and general procedure
to compute the kinetic energy of a LAM; (ii) a general procedure to
calibrate an analytical PES using ab initio data; and (iii) a rigorous
quantification of the uncertainty of the PES model and their propagation
to the QoI(s). Two typical and different test cases have been considered
for assessing the methodology: (i) the study of 1,3-butadiene, involving
nontrivial features, such as coupling of modes or a highly variable
kinetic function; and (ii) the study of the CH3 + H recombination,
for which a VTST-VRC approach is needed to compute an accurate reaction
rate.We proposed to compute the kinetic energy of a LAM based
on a functional Z-matrix formalism, e.g., a Z-matrix, in
which the internal DOFs are defined with respect to some generalized
coordinates. The results obtained are exact within numerical precision
and are in total agreement with previous exact methods. The particular
advantage that our approach offers is its practical convenience. Indeed,
for a typical LAM, the internal Z-matrix coordinates
naturally describe the configurational space of the motion and are
much more suited than Cartesian coordinates. Also, the possibility
to include ghost atoms in the Z-matrix to represent
a virtual pivot point is an additional important advantage when one
wishes to study dissociation reactions involving loose complexes.
Finally, the method applies equally regardless of the number of generalized
coordinates or their type (length, angle, and reaction path-like coordinate).Furthermore, the calibration of parameters for the analytical PES
from ab initio calculations has been achieved using Bayesian theory.
The two examples treated have allowed us to point out different and
interesting features of this approach. The most important one, which
is critical to the derivation of analytical models, is that the PES
model uncertainty is properly evaluated and can be propagated to the
QoI. We have also shown that, providing an adequate PES model is used,
narrowing the QoI CI needs significantly fewer data points than other
methods, which does not exploit any particular physical contribution
of the interaction energy. This is particularly true when the dimension
of the problem increases. For instance, in the study of the CH3 + H recombination rate, we were able to use only 25 data
points to calibrate the 3D PES in order to compute a reaction rate
within an uncertainty factor of 1.5 (coming from the PES model). This
corresponds to a typical reduction of one or two orders of magnitude
in the amount of data points needed with respect to previous studies.
However, it is clear that this approach is relevant when the condition
that a satisfactory model of the interaction energy is provided. Even
for complex force fields, we have shown here that for the two studied
applications, accurate models can be based on simple contributions:
Morse-type potentials for bond stretching, cosine-like functions for
bending and torsional motion, and steric repulsion for nonbonded atoms.
We believe that these types of potentials should hold for the majority
of torsional and simple bond-breaking context studies. Finally, for
the 1,3- butadiene application, we have discussed the possibility
of performing a dual-level inference process. While it is not possible
to generalize the results, it was shown that the B3LYP/6-31G(d) level
of theory was almost perfectly able to compute the true (or absolute)
model error, even if the PES is not accurately rendered at this level.
As a prior estimation of the model error allows to considerably reduce
the required amount of data needed to obtain a given accuracy on the
QoI, this property would be of particular interest for the calibration
of high-level PESs at a minimum computational cost.
Authors: Jingjing Zheng; Jeffrey R Gour; Jesse J Lutz; Marta Włoch; Piotr Piecuch; Donald G Truhlar Journal: J Chem Phys Date: 2008-01-28 Impact factor: 3.488