Semiclassical spectroscopy is a practical way to get an accurately approximate quantum description of spectral features starting from ab initio molecular dynamics simulations. The computational bottleneck for the method is represented by the cost of ab initio potential, gradient, and Hessian matrix estimates. This drawback is particularly severe for biological systems due to their unique complexity and large dimensionality. The main goal of this manuscript is to demonstrate that quantum dynamics and spectroscopy, at the level of semiclassical approximation, are doable even for sizable biological systems. To this end, we investigate the possibility of performing semiclassical spectroscopy simulations when ab initio calculations are replaced by computationally cheaper force field evaluations. Both polarizable (AMOEBABIO18) and nonpolarizable (AMBER14SB) force fields are tested. Calculations of some particular vibrational frequencies of four nucleosides, i.e., uridine, thymidine, deoxyguanosine, and adenosine, show that ab initio simulations are accurate and widely applicable. Conversely, simulations based on AMBER14SB are limited to harmonic approximations, but those relying on AMOEBABIO18 yield acceptable semiclassical values if the investigated conformation has been included in the force field parametrization. The main conclusion is that AMOEBABIO18 may provide a viable route to assist semiclassical spectroscopy in the study of large biological molecules for which an ab initio approach is not computationally affordable.
Semiclassical spectroscopy is a practical way to get an accurately approximate quantum description of spectral features starting from ab initio molecular dynamics simulations. The computational bottleneck for the method is represented by the cost of ab initio potential, gradient, and Hessian matrix estimates. This drawback is particularly severe for biological systems due to their unique complexity and large dimensionality. The main goal of this manuscript is to demonstrate that quantum dynamics and spectroscopy, at the level of semiclassical approximation, are doable even for sizable biological systems. To this end, we investigate the possibility of performing semiclassical spectroscopy simulations when ab initio calculations are replaced by computationally cheaper force field evaluations. Both polarizable (AMOEBABIO18) and nonpolarizable (AMBER14SB) force fields are tested. Calculations of some particular vibrational frequencies of four nucleosides, i.e., uridine, thymidine, deoxyguanosine, and adenosine, show that ab initio simulations are accurate and widely applicable. Conversely, simulations based on AMBER14SB are limited to harmonic approximations, but those relying on AMOEBABIO18 yield acceptable semiclassical values if the investigated conformation has been included in the force field parametrization. The main conclusion is that AMOEBABIO18 may provide a viable route to assist semiclassical spectroscopy in the study of large biological molecules for which an ab initio approach is not computationally affordable.
Semiclassical (SC) dynamics has recently demonstrated its important role in the field of
theoretical vibrational spectroscopy. Exploiting information coming from classical dynamics,
the SC approach provides zero point energies and the frequencies of quantum-mechanical
vibrational transitions through a Fourier transform of the quantum wavepacket survival
amplitude. Originating as a stationary phase approximation to the Feynman quantum
propagator,[1] SC dynamics became popular thanks to the initial value
representation (IVR) and the Herman–Kluk formulation of the semiclassical
propagator.[2−9] Then, starting from the early 2000s, a sequence of theoretical advances
has contributed to enlarge applicability and reliability of the theory. In 2003 Kaledin and
Miller proposed a time averaging filtering technique, labeled TA SCIVR, that alleviated the
convergence problem of the phase-space integral calculation.[10,11] Afterward, in 2009, the computational
cost of the semiclassical analysis was drastically decreased by the Multiple Coherent
formulation (MC SCIVR), which introduced a tailored choice of a single or few classical
trajectories to overcome the standard computationally expensive Monte Carlo
sampling.[12,13] Using
such developments, the semiclassical method demonstrated the capability to study small and
medium sized systems, up to the glycine molecule, efficiently and in full
dimensionality.[14,15]
Finally, a crucial leap forward has been performed as early as three years ago with the
Divide-and-Conquer technique (DC SCIVR).[16,17] It consists of an efficient recipe to partition the system
degrees of freedom, ensuring that the survival amplitude calculation leads to valuable
information also in case of high dimensional systems. Exploiting these advances, the
semiclassical theory has been successfully applied not only to the calculation of power
spectra of medium-size isolated molecules but also to the study of complex systems like
water clusters, the Zundel cation, molecules adsorbed on TiO2 surfaces, and
solvation models.[17−24] SC calculations can be
also performed to reproduce IR transition intensities,[25,26] while the most recent advances have focused on
fundamental physics aspects like zero-point energy leakage and deterministic chaos.
Specifically, it has been shown that SC calculations are free of zero-energy leakage at
least when a full sampling of the phase space is performed[27] and that the
influence of chaotic classical trajectories can be largely reduced and sometimes completely
avoided by adopting a preliminary adiabatic switching procedure to sample initial
conditions.[28]Semiclassical evaluation of the vibrational spectral density, i.e., calculation of SC power
spectra, requires a phase space analysis, based on a short trajectory, together with
calculation of the Hessian matrix of the potential energy along the dynamics. When a
precalculated Potential Energy Surface (PES) is not available, the simulations are performed
through Ab Initio Molecular Dynamics (AIMD), i.e., evaluating the potential
energy step by step using an ab initio method. Therefore, any semiclassical
approach is limited by the computational effort required and mostly due to the evaluation of
the Hessian matrix at all trajectory steps. More than one strategy has been proposed to
address this issue. Garashchuk and Light elaborated a method that approximates the Hessian
calculation by generating classical trajectories with initial conditions close to the main
reference trajectory.[29] Ceotto, Hase et al. proposed instead a compact
finite difference (CFD) method to approximate the Hessian calculation at a certain time step
by using the latest calculated one and extrapolating the new one.[30,31] Recently, we suggested the
possibility of creating a database of Hessian matrices during the dynamics that can be
exploited to avoid the calculation step by step in favor of a reuse of already calculated
Hessian matrices for similar geometries.[7] All these suggestions certainly
help alleviate the computational overhead, but ab initio calculations
remain a relevant time-consuming factor which becomes less and less manageable as the system
dimensionality increases. For this reason ab initio SC calculations are
basically restricted to the DFT level of theory, which can limit the accuracy of results in
certain instances but provides often quite satisfactorily estimates.[32]Within this context, we wonder if a more efficient method for calculating trajectories and
Hessians is viable. For instance, among all the available computational methods, classical
molecular dynamics performed through force fields is implemented by means of fast potential
energy calls. It is computationally cheap and commonly used to tackle huge biological
systems, like solvated protein, nanotubes, and DNA fragments.Since the release of the first versions of the most famous force fields, like AMBERff94,
CHARMM22, or OPLS-AA, during the 1980s–1990s, the success of such an approach has
been rapid and widespread.[33−36] In the following years, the growth in available
computational power, the advent of multicore-CPU, GPU and specialized hardware, and the
constant update of the potential energy function of each of these force fields contributed
to the improvement of simulation accuracy.[37−44] Together with
the advance of these pioneering versions, starting from the 2000s, a new class of force
fields has been proposed by the scientific community. In fact, in the aforementioned force
fields, the electrostatic term is described through fixed-point charge methods. To overcome
this limitation, the new approach includes an additional term that effectively describes
charge polarization. The resulting class of force fields is labeled
“polarizable”. A famous example is AMOEBA, recently versioned for proteins
(AMOEBAPRO13) and nucleic acids (AMOEBABIO18).[45−48]In this work we want to perform quantum dynamics simulations employing the AMBER and AMOEBA
force fields within the semiclassical approach, in comparison with the well established DFT
ab initio method. To reach such a goal we selected some biological
systems, specifically four nucleosides, for which a parametrization is available in both the
chosen force fields. Nucleosides are molecules made of a nucleobase condensed with a
five-membered furanose ring, i.e., ribose or deoxyribose. The importance of these molecules
lies in the fact that even minor modifications in their structure can lead to different
conformations, greatly affecting their biological functionality. Additionally, modified
nucleosides are of great interest because they are often employed as new
pharmaceuticals.[49,50]
To have a representative sample of such biomolecules, we chose to study a couple of
deoxy-nucleosides and a couple of nucleosides featuring the ribose sugar moiety, namely,
deoxyguanosine, thymidine, uridine, and adenosine. All these systems have been
experimentally studied in the gas phase in recent years. Specifically, thymidine, uridine,
and adenosine have been investigated in argon matrices by the Ivanov group, while a
comprehensive study of deoxyguanosine isolated and in mono- and dihydrated clusters has been
performed by the Saigusa group.[51−54] The presence of such experimental data gives us a
precise benchmark for our calculations, since an exact quantum theoretical estimation is out
of reach for molecular systems of this size.The paper is structured as follows: Theoretical and Computational
Details describes the theoretical and computational details for both the ab
initio and force field approaches here employed, and Results and
Discussion presents all the vibrational frequencies obtained and a discussion of
the results, while in Summary and Conclusions the conclusions are
listed together with possible future developments.
Theoretical and Computational Details
All DFT calculations were performed by means of the NWChem 6.6 suite of software.[55] We chose to adopt the B3LYP functional,[56] already
employed in other semiclassical works focused on biological systems,[15,19,21] and the 6-31G*
basis set. Force field calculations were implemented using two different software: Gromacs
5.0.4, in its double precision version, for AMBER simulations, and Tinker 8.6.1 for the
AMOEBA counterparts.[57,58]
The version of AMBER adopted is ff14SB while for AMOEBA we chose
AMOEBABIO18.[42,48] The
integration algorithms used in this work are the velocity-Verlet for NWChem simulations, the
“md-vv-avek”, which is a more accurate version of velocity-Verlet, for
Gromacs, and the Beeman integrator for Tinker. All the NVE trajectories were propagated for
a total of 0.6 ps. Specifically, we ran 2500 steps of 10 au (about 0.24 fs) each for the DFT
dynamics and 3000 steps of 0.20 fs each for the force field ones. Such a short total
propagation time is typical of an SC simulation, and it is necessary for capturing all
quantum-mechanical information within the survival amplitude calculation before the accuracy
of the SC propagator starts to deteriorate.As for the calculation of the Hessian matrices along the trajectory, we computed them step
by step in the case of force field simulations while we adopted the already mentioned
Hessian database strategy for DFT studies.[7] This approach allowed us to
save about 1 order of magnitude in computational time. Hessians were analytically computed
for calculations employing DFT or AMOEBA, while they were numerically estimated by means of
a finite difference approach in the case of simulations based on the AMBER force field. The
stability criterion for the monodromy matrix, required by the semiclassical method, has been
enforced by means of the well-established regularization strategy.[59] This
technique has been always applied choosing the threshold parameter in a way that the
regularization is performed for a minimal number of times, in order to minimize the loss of
accuracy.More information regarding the semiclassical formulation and the force field energy
functions is briefly reported hereafter.
Semiclassical DC-SCIVR Method
To clearly understand the working equation of the DC-SCIVR approach here employed we
start describing shortly the earlier MC-SCIVR formulation, according to which the
vibrational power spectrum for an N-dimensional system has the following
formulawhere (p(0), q(0)) are the
positions and momenta of the system degrees of freedom at the beginning of the trajectory,
T is the total simulation time,
S the instantaneous classical action at
time t, E the Fourier transform energy, and
ϕ the phase of the prefactor whose definition
isand
is the
quantum overlap between the coherent state and the reference state
|Ψ⟩.The coherent state with a Gaussian width matrix Γ has the following
formulationThe summation runs over a handful of trajectories
(N) selected according to the
MC-SCIVR recipe: the initial conditions should be such that the trajectory explores a
region of the phase space close in energy to the real quantum-mechanical vibrational
levels. This choice has its foundation in a crucial work by De Leon and Heller who
demonstrated that even a single trajectory can effectively lead to a correct quantum
eigenvalue estimate, if properly chosen.[60] The statement has been
confirmed by several semiclassical studies, remarkably also in the case of neutral
glycine.[15] In that work the power spectrum was obtained in two ways,
either by means of a single trajectory or by using one trajectory per signal. In the case
of a single trajectory calculation, the initial conditions were equilibrium positions and
velocities derived from the harmonic zero-point vibrational energy estimate of each normal
mode. In the case of the multitrajectory calculations, instead, an additional quantum of
excitation was given to the normal mode under consideration. This last strategy is called
a “refined” analysis, while the study made in a single trajectory is labeled
“ZPE” which stands for “Zero Point Energy”. Both these
approaches were employed in the present work too.The DC-SCIVR formula is similar to the already presented MC-SCIVR one, with the
difference that all involved quantities are projected onto appropriate
subspaces:where ∼ indicates projection onto an
M-dimensional subset, with M <
N.All terms are trivially separable except for the potential energy, for which an
ad hoc expression modeled on the separable case has been
proposed:In a few words, out of the full dimensional dynamics only information coming from a
subset of degrees of freedom is considered for the semiclassical analysis. In this way the
survival probability calculations return clear signals for the spectrum even for systems
made of a large number of degrees of freedom. Such an approach works correctly if the
subspace is a good approximation of an isolated system. For this reason more than one
strategy has been proposed to partition the totality of the normal modes composing the
whole system. Among the proposed methods, the least computationally expensive is the one
involving the average Hessian matrix. Following this strategy, the grouping of normal
modes is done according to the off diagonal elements of a single matrix obtained by
averaging all the Hessian matrices computed along the trajectory. Once the threshold value
is fixed, all combinations of normal modes that have off diagonal terms bigger than the
threshold value are deemed to interact significantly and hence enrolled in the same
subspace.[61]All spectra presented in this work have been calculated by means of eq
, and all subspaces have been determined by means of the
average Hessian matrix criterion.
Amber and Amoeba Potential Energy Function
The structure of the AMBER14SB potential energy function is simple, and it has been kept
in later versions almost unchanged with respect to the one published in 2000.[39] It is composed of four pair terms plus a specific component describing the
electrostatic contribution, which is based on the calculation of fixed charges obtained
with the restrained electrostatic potential (RESP) procedure.[39,62,63] During the
development of AMBER, the major changes have involved different reparameterizations based
on more accurate theoretical quantum mechanical calculations or wider and more precise
experimental databases. For example, AMBERff14SB, here employed and published in 2015,
overcomes some limitations of the former version in the description of the protein
backbones through MP2 calculations in vacuum and some empirical corrections based on
recent experiments.[42]The AMOEBA energy function presents five principal terms for short-range interactions:
bond stretchings, angle bendings, bond-angles cross terms, out-of-plane bendings, and
torsional rotations, plus three other terms for nonbonded van der Waals and electrostatic
contributions.[46] The polarization term is modeled through dipole and
quadrupole moments. Furthermore, a damping scheme for local polarization effects accounts
for a consistent treatment of intra- and intermolecular polarization. The major difference
with AMBER lies in the electrostatic description that in AMOEBA is evaluated by means of
dipole and quadrupole moments. This permits a more precise reproduction of the actual
electrostatic contribution to the potential as in the case of the directional hydrogen
bond interactions.We remark that both the Gromacs and Tinker software packages give the possibility to
change the functional form of some of the terms mentioned above. For example, it is
possible to choose the well-known Morse function to model the bond term. Indeed, in this
work, we adopt this choice for all our force field calculations, ensuring a more realistic
description for such a contribution.
Results and Discussion
Differently from the simpler nucleobases, the molecular structure of nucleosides is more
complex and flexible. For this reason nucleosides present a great variety of possible
conformations. The global minima adopted in this work are the ones described in the papers
by Ivanov. This is true for all molecules with the exception of deoxyguanosine, which is
reported in one paper by Saigusa.[51−54] In Figure the minimum geometries predicted by DFT B3LYP/6-31G* ab initio
calculations are reported. For brevity, we report here only the DFT minimum structures while
the AMBER and AMOEBA ones can be found in the SI. However, it is important to point out that they are very similar to the
DFT ones, with root-mean-square deviation (RMSD) values significantly below 1 Å, which
is commonly considered the upper limit for a good structural resemblance. Internal hydrogen
bonds are present in all the nucleosides here studied with the only exception of thymidine.
They are displayed in Figure as dashed black
lines.
Figure 1
Global minimum structures of uridine (a), thymidine (b), deoxyguanosine (c), and
adenosine (d) as predicted by DFT B3LYP ab initio calculations. Colors
stand for: oxygen (red); hydrogen (gray); carbon (light blue); and nitrogen (dark blue).
The positions of some relevant carbon atoms are labeled according to the standard
numbering, and all the internal hydrogen bonds are reported, together with the
corresponding distances, in angstroms.
Global minimum structures of uridine (a), thymidine (b), deoxyguanosine (c), and
adenosine (d) as predicted by DFT B3LYP ab initio calculations. Colors
stand for: oxygen (red); hydrogen (gray); carbon (light blue); and nitrogen (dark blue).
The positions of some relevant carbon atoms are labeled according to the standard
numbering, and all the internal hydrogen bonds are reported, together with the
corresponding distances, in angstroms.On the global minimum structures, after performing a minimization calculation, the first
method we applied to evaluate the vibrational frequencies was the simple harmonic
calculation. Results for all the levels of theory employed together with experimental data
are reported in Table S1 of the Supporting Information (SI). The difference between calculated and experimentally measured values is
instead pictorially represented in Figure . The
investigated portion of the vibrational spectra is the characteristic interval in the
mid-infrared that spans the interval from 3000 to 4000 cm–1. As already
mentioned in the Introduction, the experimental results come from
the analysis performed by the group of Ivanov, with the exception of deoxyguanosine.
Similarly to its corresponding nucleobase, this latter nucleoside exists in both ketonic and
enolic conformations due to tautomerism. Unfortunately, in both force fields here employed
the parametrized conformation is the ketonic one, while the experimental signals come from
the enolic structure, as clearly stated in the Saigusa work.[54]
Consequently, we did not have experimental data for the ketonic form, but we could still
reliably estimate it. In fact, for the OH stretching frequencies we maintained the frequency
values coming from the sugar moiety (5′OH and 3′OH), considering negligible
for these two OH stretchings the presence of a ketonic group instead of an enolic one in the
nucleobase ring. As for the other three frequencies (the NH and the NH2 symmetric
and antisymmetric stretches) we took instead the values reported in the work by Choi and
Miller for the ketonic form of the guanine molecules, once again ignoring the interaction
effect between the sugar and these three modes in the nucleobase ring moiety.[64] The rationale for these choices comes from a work by Nir et al., in which
the similarity between spectra of enolicguanosine and deoxyguanosine is highlighted.[65]
Figure 2
Differences (cm–1) between all calculated harmonic frequencies and
experimental values for each nucleoside and theoretical method.
Differences (cm–1) between all calculated harmonic frequencies and
experimental values for each nucleoside and theoretical method.By just looking at the harmonic results, we can already draw some important considerations.
As expected, the DFT harmonic estimates are usually higher than the experimental findings. A
standard procedure to fix this deviation consists in applying ad hoc
scaling factors to shift the harmonic predictions near the experimental bands. Conversely, a
method like our DC SCIVR can, by construction, account for the actual anharmonicity of the
system within a quantum mechanical framework. For this reason, DFT harmonic frequencies
higher than the experimental counterpart are suggesting a promising prediction after the
inclusion of the anharmonic contributions given by the semiclassical calculation. Such
consideration is also valid for the AMOEBABIO18 harmonic results. Even if almost all the
values are higher than the DFT ones, they are still promising for application of the
semiclassical procedure. An exception is the group of three modes of thymidine in the range
between 1350 and 1950 cm–1, namely, the C5–C6, C4–O, and
C2–O stretchings, whose frequencies are lower than the experimental ones. The same
reasoning cannot be applied to the AMBER14SB harmonic estimates. In some cases the harmonic
frequencies of AMBER14SB are already a good approximation to the real frequencies, while in
other instances the estimated frequencies are way too low.We now apply the DC-SCIVR technique employing the two force fields in addition to the B3LYP
DFT functional and compare results to the available experimental fundamentals of vibration.
As described in Theoretical and Computational Details, the
“refined” DC-SCIVR analysis requires running a trajectory per vibrational
mode, instead of deriving all spectral signals from a single “ZPE” trajectory.
In some cases, in the refined approach, it was also necessary to remove the initial kinetic
energy associated with a few modes that might induce internal rotations. This is mandatory
to avoid spurious signal splittings in the power spectrum. Owing to the size of the
molecules under study, the computational effort required by ab initio DFT
dynamics and Hessian matrix calculations has limited the possibility to apply the refined
procedure to each normal mode for all the molecules. Therefore, we adopted the
“ZPE” approach when performing the ab initio simulations,
limiting the refinement to a single normal mode per molecule, where the ZPE estimate was not
satisfactory. On the contrary, force field simulations are extremely cheap, permitting a
refined analysis for all the target vibrations of all the nucleosides.All the semiclassical spectra are displayed in Figures , 4, 5, 6, and 7, while the frequency values are listed in the SI. In Figure we report the three
peaks belonging to the lower IR frequency region of the thymidine spectrum, which is the
only molecule for which we found experimental data also in that spectral region
(1350–1950 cm–1).
Figure 3
Some DC-SCIVR fundamental frequencies of vibration calculated for the uridine
nucleoside with AMBER14SB, AMOEBABIO18, and the ab initio DFT B3LYP
functional. The experimental values are reported as vertical dashed lines.[51]
Figure 4
Same as Figure but for the thymidine
molecule. Experimental values are taken from ref (52).
Figure 5
Same as Figure but for the deoxyguanosine
molecule. Experimental values are taken from refs (54 and 64).
Figure 6
Same as Figure but for the adenosine
molecule. Experimental values are taken from ref (53).
Figure 7
Same as Figure , for the 1350–1950
cm–1 frequency region of thymidine. The experimental results come
from ref (52).
Some DC-SCIVR fundamental frequencies of vibration calculated for the uridinenucleoside with AMBER14SB, AMOEBABIO18, and the ab initio DFT B3LYP
functional. The experimental values are reported as vertical dashed lines.[51]Same as Figure but for the thymidine
molecule. Experimental values are taken from ref (52).Same as Figure but for the deoxyguanosine
molecule. Experimental values are taken from refs (54 and 64).Same as Figure but for the adenosine
molecule. Experimental values are taken from ref (53).Same as Figure , for the 1350–1950
cm–1 frequency region of thymidine. The experimental results come
from ref (52).From these figures we can conclude that we obtained a very good agreement between DFT
semiclassical spectra and experimental findings, even if the original harmonic estimates
were quite off the mark. The mean absolute error (MAE) calculated for each nucleoside is 40,
33, 25, and 26 cm–1 for uridine, thymidine, deoxyguanosine, and adenosine,
respectively. These are reasonable deviations for semiclassical simulations, given that the
basis set here employed, 6-31G*, is quite small. For example, Ivanov’s work on
thymidine presents a VPT2 calculation, performed with the same B3LYP functional but in
conjunction with the triple-ζ 6-311++G** basis set. Such a theoretical estimate leads
to a MAE equal to 7 cm–1. Unfortunately we could not employ that basis set
for the semiclassical calculations due to its computational overhead, and we had to settle
for a faster calculation but slightly lower accuracy.Moving to the AMOEBABIO18 results, we notice that the general agreement with the experiment
is quite good for all the investigated frequencies, with the exception of some OH stretching
modes. In particular, we obtained very large deviations from the dashed vertical
experimental sticks for the 2′OH stretching in uridine and in adenosine and for the
5′OH stretching in deoxyguanosine. The discrepancies are equal to 122, 285, and 360
cm–1, respectively. These modes are all involved in internal hydrogen
bond interactions. Our explanation for these significant deviations is that AMOEBABIO18 has
been probably parametrized on different nucleoside conformations, resulting in a set of atom
types that could not predict these internal hydrogen bond interactions. This consideration
is reasonable if we think that nucleosides are involved in the double helix formation, where
the sugar moiety is perpendicular to the nucleobase and interactions between these two
components are minimal. The original parametrization of the force field, hence, refers to
this conformation, which is the one biologically active, rather than the one investigated in
this paper. Indeed, when the internal hydrogen bond is formed within the sugar moiety, as,
for example, in the 3′OH stretching of uridine and adenosine, the agreement with the
experiment turns out to be acceptable (43 and 56 cm–1). A very similar
situation has already been detected in AMOEBA by Marx, Head-Gordon, and collaborators.
Specifically, in 2017, they published a work of comparison between AIMD and AMOEBA, studying
the THz spectra of solvated glycine and valine.[66] They noticed that the
zwitterionic form of glycine needed a reparametrization in order to correctly reproduce the
hydrogen bond network and hence the correct signal position and intensity in the THz
spectrum. Another example can be found in the SAMPL4 challenge event, which consisted of a
blind comparison between various theoretical methods on a set of experimental hydration free
energies. In that occasion, a series of papers highlighted that the worst performance of
AMOEBA with respect to GAFF (Generalized Amber Force Field) was due to the great sensitivity
of AMOEBA to the conformations used for the parametrization.[67−70] Both these examples indeed
confirm our AMOEBA semiclassical spectra interpretations. The semiclassical AMOEBABIO18 MAE,
calculated considering all the investigated signals, is 77 cm–1 for
uridine, 51 cm–1 for thymidine, 98 cm–1 for
deoxyguanosine, and 95 cm–1 for adenosine. If we remove the three
aforementioned erroneous estimates, related to the unparametrized hydrogen bonds, the MAEs
become 61, 51, 33, and 48 cm–1, respectively. The deviations from the
experiment are higher than those obtained with the DFT simulations, but they are still
acceptable, and most importantly the approach is promising for bigger molecular systems,
which cannot be treated with ab initio DFT.Conversely AMBER14SB results were, not surprisingly, largely inadequate. As expected, in
nearly all the cases in which the harmonic calculation was already a good estimate, the
semiclassical analysis deteriorated the accuracy of frequency evaluations. More precisely,
almost all harmonic frequencies are closer to the experimental value than the semiclassical
ones. This fact suggests to us that the AMBER force field parametrization was set to give
the best frequency at a harmonic level. For this reason, we do not encourage the reader to
employ advanced anharmonic methodologies with the AMBER force field.To complete the comparison between these three theoretical approaches, we look at the
computational effort, expressed in terms of CPU time. It was not surprising to ascertain
that the most accurate method required more computational resources than the others.
Specifically, DFT B3LYP/6-31G* simulations required about 50 h on 20 2.4 GHz cpus for the
0.6 ps trajectory. This is an average time for the variously sized nucleosides studied in
this work. Additionally, the Hessian matrices took about 30 min each to be computed, again
on 20 2.4 GHz cpus. This time had to be multiplied by the number of Hessian matrices
required, which thanks to the adoption of the Hessian database approach was reduced from
2500 to just around 250. A completely different picture was offered by both force fields.
The trajectory took a handful of seconds to be evolved, while 3000 Hessian matrices were
computed in less than an hour, employing a single CPU. Furthermore, although it is known
that AMOEBA is a bit more computationally expensive than AMBER, the difference could not be
appreciated at these molecular sizes. The huge advantage of AMOEBA in terms of CPU times
over DFT calculations at the cost of a moderate loss in accuracy opens up the route to the
semiclassical vibrational study of sizable biological systems.
Summary and Conclusions
In this paper quantum molecular dynamics simulations in semiclassical approximation for
AMBER14SB, AMOEBABIO18, and ab initio DFT have been performed for the
calculation of the vibrational frequencies of four nucleosides through the DC-SCIVR method.
The good agreement with experimental data obtained using DFT demonstrates that the DC-SCIVR
method is an adequate approach for medium sized systems and that the B3LYP functional may be
appropriate for studying biological systems in spite of the small basis set here employed.
AMBER14SB best estimates are harmonic ones, while application of the semiclassical recipe
worsens the prediction for almost all the simulated signals. Conversely, we obtained a
reasonable set of vibrational frequencies when AMOEBABIO18 was used for semiclassical
analysis. In fact, we achieved a comprehensive MAE of about 50 cm–1, with
the caveat that frequencies calculated for normal modes involving atoms parametrized for
different conformations must be neglected. In our opinion, this aspect represents the real
limitation of the AMOEBABIO18 force field: It is necessary to study molecular systems in
their biological active conformation, the one for which the force field has been correctly
parametrized. In this regard, our semiclassical method can be employed to validate new force
fields. Currently, geometrical parameters are the main terms of comparison with experiments
for assessing the quality of force fields. Here we propose an additional tool for force
field validation, which is based on an anharmonic spectroscopic comparison.The potential energy surface of the investigated nucleosides is characterized by many
low-energy conformers in addition to the global minimum one.[52] We have
presented semiclassical simulations based on a short-time dynamics (less than 1 ps long)
initiated at the global minimum. Adoption of a much longer dynamics is not a viable route in
semiclassical calculations not only because of computational costs but also because the
semiclassical propagator loses rather quickly its unitarity and ability to reproduce quantum
effects. Similarly to what we pointed out in our past study on glycine,[15]
some secondary conformers may be visited during the dynamics in spite of its short duration,
owing to the high energy of the trajectories (harmonic zero point energy or higher) compared
to the interconversion barriers. On the other side, in such a short time it is not possible
to sample the entire phase space including all conformers, and results may overweight the
contribution of the global minimum conformer. The necessity to sample a larger portion of
the phase space is certainly more compelling when experiments are performed at room
temperature. For the investigated nucleosides the benchmark experimental values have been
obtained at very low temperature (6–12 K)[51−54] and, even if we cannot
rule out that several low-energy conformers might have been populated in the experiment,
supported by results we deem that our semiclassical estimates derived from trajectories
started at the global minimum provide accurate comparisons to the experiments.Overall the study opens up the possibility to simulate the quantum dynamics and
spectroscopy of very large biomolecules by means of semiclassical techniques assisted by an
adequately parametrized force field. For instance, the negligible computational time
required for an AMOEBABIO18 DC-SCIVR simulation is promising for future investigations on
biological systems like couples of bases, single or double DNA strands, and solvated
biomolecules.