Asaminew H Aytenfisu1, Aleksandar Spasic1, Matthew G Seetin1, John Serafini1, David H Mathews2. 1. Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center , 601 Elmwood Avenue, Box 712, Rochester, New York 14642, United States. 2. Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center , 601 Elmwood Avenue, Box 712, Rochester, New York 14642, United States ; Department of Biostatistics and Computational Biology, University of Rochester Medical Center , 601 Elmwood Avenue, Box 712, Rochester, New York 14642, United States.
Abstract
Molecular mechanics with all-atom models was used to understand the conformational preference of tandem guanine-adenine (GA) noncanonical pairs in RNA. These tandem GA pairs play important roles in determining stability, flexibility, and structural dynamics of RNA tertiary structures. Previous solution structures showed that these tandem GA pairs adopt either imino (cis Watson-Crick/Watson-Crick A-G) or sheared (trans Hoogsteen/sugar edge A-G) conformations depending on the sequence and orientation of the adjacent closing base pairs. The solution structures (GCGGACGC)2 [Biochemistry, 1996, 35, 9677-9689] and (GCGGAUGC)2 [Biochemistry, 2007, 46, 1511-1522] demonstrate imino and sheared conformations for the two central GA pairs, respectively. These systems were studied using molecular dynamics and free energy change calculations for conformational changes, using umbrella sampling. For the structures to maintain their native conformations during molecular dynamics simulations, a modification to the standard Amber ff10 force field was required, which allowed the amino group of guanine to leave the plane of the base [J. Chem. Theory Comput., 2009, 5, 2088-2100] and form out-of-plane hydrogen bonds with a cross-strand cytosine or uracil. The requirement for this modification suggests the importance of out-of-plane hydrogen bonds in stabilizing the native structures. Free energy change calculations for each sequence demonstrated the correct conformational preference when the force field modification was used, but the extent of the preference is underestimated.
Molecular mechanics with all-atom models was used to understand the conformational preference of tandem guanine-adenine (GA) noncanonical pairs in RNA. These tandem GA pairs play important roles in determining stability, flexibility, and structural dynamics of RNA tertiary structures. Previous solution structures showed that these tandem GA pairs adopt either imino (cis Watson-Crick/Watson-Crick A-G) or sheared (trans Hoogsteen/sugar edge A-G) conformations depending on the sequence and orientation of the adjacent closing base pairs. The solution structures (GCGGACGC)2 [Biochemistry, 1996, 35, 9677-9689] and (GCGGAUGC)2 [Biochemistry, 2007, 46, 1511-1522] demonstrate imino and sheared conformations for the two central GA pairs, respectively. These systems were studied using molecular dynamics and free energy change calculations for conformational changes, using umbrella sampling. For the structures to maintain their native conformations during molecular dynamics simulations, a modification to the standard Amber ff10 force field was required, which allowed the amino group of guanine to leave the plane of the base [J. Chem. Theory Comput., 2009, 5, 2088-2100] and form out-of-plane hydrogen bonds with a cross-strand cytosine or uracil. The requirement for this modification suggests the importance of out-of-plane hydrogen bonds in stabilizing the native structures. Free energy change calculations for each sequence demonstrated the correct conformational preference when the force field modification was used, but the extent of the preference is underestimated.
RNA plays an important
role in biological processes, including
information storage, protein expression, catalysis, and regulation
of gene expression.[1−12] RNA functions can involve conformational changes.[13−20] Therefore, understanding its secondary and tertiary structure and
flexibility is important to understand its functions.RNA internal
loops are unpaired or noncanonically paired nucleotides
closed on both sides by canonical base pairs, GU, GC, or AU. These
loops provide one of the locations in RNA structures for tertiary
or quaternary contacts. One of these internal loop motifs is the tandem
guanine–adenine (GA) pair shown in Figure 1. Prior solution structures demonstrate that the conformational
preference depends on the sequence identity of the closing base pairs.[21,22] Table 1 summarizes these findings. When adjacent
to GC closing base pairs, an imino conformation (cis Watson–Crick/Watson–Crick
A-G) for the tandem GA pairs is observed.[22] When the C is changed to U in the closing pairs, making GU pairs,
a sheared conformation (trans Hoogsteen/sugar edge A-G) for the tandem
GA pairs is observed.[21] It is unclear what
interactions underlie this sequence-dependent conformational preference.
Figure 1
Hydrogen-bonding
pattern for two conformations of a GA pair.[66] (A) The cis Watson–Crick/Watson–Crick
(imino) interaction and (B) the trans-Hoogsteen/Sugar edge (sheared)
A-G interaction. In each panel, one of two base pairs in the tandem
pair is shown.
Table 1
Conformational
Preference of Tandem
GA pairs, Where Systems 1 and 4 (boldface) are Experimentally[21,22] Preferred Conformations, whereas Systems 2 and 3 Were Control Structures
Based on Systems 4 and 1, Respectivelya
The conformational
preference
is altered by changing closing GC pairs (system 1) with closing GU
pairs (system 4). Source is the protein data bank accession number.[21,22]
Hydrogen-bonding
pattern for two conformations of a GA pair.[66] (A) The cis Watson–Crick/Watson–Crick
(imino) interaction and (B) the trans-Hoogsteen/Sugar edge (sheared)
A-G interaction. In each panel, one of two base pairs in the tandem
pair is shown.The conformational
preference
is altered by changing closing GC pairs (system 1) with closing GU
pairs (system 4). Source is the protein data bank accession number.[21,22]Molecular dynamics (MD)
simulations can be used to study the structure
and dynamics of RNA and other biomolecules. Some of the most commonly
used software packages for all-atom MD are Amber, CHARMM, Gromacs,
and NAMD.[23−26] The accuracy of these packages depends on the set of equations and
parameters used to describe all the interactions in the system, called
the force field. Currently, the force fields most commonly used for
nucleic acids are the ones derived for the Amber and CHARMM packages,
although most force fields can be used in any MD package with modifications.
The latest version of the Amber force field for nucleic acids, distributed
with Amber, was named ff10 (and was unchanged in the newest ff12SB
force field).[27] This force field is based
on the ff94 force field of Cornell et al. and its subsequent modifications.[28] Initially, this force field was improved by
changing the sugar pucker and glycosidic torsions in force fields
ff98 and ff99.[29,30] Later, a new set of torsion parameters
called bsc0 was derived for α- and γ-torsions.[31] In 2010 and 2011, two new sets of glycosidic
torsions were derived for each base, namely Chi_YIL and ChiOL, respectively,
with ChiOL being incorporated into ff10.[32,33] Beyond ff10, an additional revision of the van der Waals interaction
strengths for the Amber RNA force field was recently reported.[34] This work uses the ff10 force field as the starting
point. The current CHARMM force field for nucleic acids, CHARMM36
is based on the CHARMM27 force field. Specifically, CHARMM36 improves
the description of nucleic acids by reparametrizing the torsion of
2′-hydroxyl group of RNAs and sugar pucker and also several
backbone torsions to better describe conformation equilibrium of DNAs.[35−37]In this work, MD was used to understand the interactions that
lead
to the observed conformational preference of GA pairs. One focus was
the role of the amino group in the G base. The GAimino pair has two
hydrogen bonds per GA pair but leaves the guanine amino unpaired.
Likewise, the GA sheared pair has two hydrogen bonds per GA pair,
but one of which involves the guanine amino group.[38] Previous high level quantum calculations for the amino
group were performed, and these showed that the amino groups do not
need to be confined to the plane of the base.[39−42] A subsequent molecular dynamics
study also found that the unpaired amino group in the imino pair stabilized
the interaction by forming a weak out-of-plane hydrogen bond. The
current Amber force field for RNA (referred to as ff10) prevents this
interaction by requiring the amino group to stay in the plane of the
base.[43] For this work, force field parameters
of Yildirim et al. (referred to as modified ff10) were used to model
this out-of-plane behavior.[38]Yildirim
et al. investigated the sequence-dependent stability of
GA pairs using thermodynamic integration.[38] They studied tandem GA pairs flanked by GC, CG, iGiC, or iCiG base
pairs, where iG and iC denote isoguanosine and iocytidine, which have
amino and carbonyl groups transposed relative to guanosine and cytidine.[38] They used four solution structures (PDB numbers 1MIS, 2O83, 2O81, and 1YFV),[22,44,45] and they used thermodynamic integration
to calculate free energy differences. These free energy differences
were compared to experiments using thermodynamic cycles. Their results
agreed with the experiments when they used the modified force field
for the unpaired amino group of guanine in the tandem GA pairs. Their
findings demonstrated that the out-of-plane hydrogen bonds were important
to improve the accuracy of the result for the imino conformation.[38]In this work, the conformational preference
for GA pairs was investigated
using molecular dynamics and two-dimensional umbrella sampling for
both GC and GU closing base pairs. The system with GC closing pairs
(1MIS) was previously studied as part of the free energy perturbation
studied by Yildirim et al.[38] The system
with GU closing pairs (2IRO) has not, to our knowledge, been studied
by MD.To examine conformational preference, control structures
were constructed
to represent conformations not observed (systems 2 and 3 in Table 1). The structures for both solution structures (systems
1 and 4 in Table 1) were maintained for unrestrained
molecular dynamics when the modified ff10 force field was used. Umbrella
sampling between sheared and imino conformations was performed along
reaction coordinates suggested by nudged elastic band calculations
of low potential energy conformational change pathways.[46,47] The free energy differences demonstrated that Amber can correctly
model the conformational preference of tandem GA pairs.
Methods
Force Fields
and Software
All simulations were performed
with the Amber 11 or Amber 12 simulation packages.[25,27] Two force fields were used for explicit solvent molecular dynamics
simulations with the TIP3P water model.[49] The first set was with Amber ff10, which is based on the Amber ff99
force field,[28−30] with corrections to the α- and γ-torsions[31] and corrections to the glycosidic bond torsions.[33] The second set of simulations was with a modified
ff10 that allows the exocyclic amine of guanine to leave the plane
of the base.[38,43] The modified ff10 was used for
guanines only in the loop regions of the structures. This modified
ff10 applies the corrections for the amino dihedrals[38] to ff10, and these corrections had previously only been
used in conjunction with ff99. The amino dihedral correction was assumed
to be orthogonal to the backbone and gylcosidic corrections. The excellent
performance of the modified ff10 force field, in terms of preserving
the native loop structures for multiple 1 μs duration simulations,
supports this assumption.
Model Structures
The starting structures
were solution
structures by Wu and Turner[22] (system 1)
and Tolbert et al.[21] (system 4), summarized
in Table 1. The 3′ dangling uracils
for system 4, present for the experiments,[21] were removed for this work. These are the preferred conformations,
and additionally two more control conformations, named system 2 and
system 3, were manually constructed. System 2 was generated by changing
the two uracils in the closing base pair from system 4 to cytosines
(by manually changing atom types). An unrestrained molecular dynamics
simulation using Amber ff10 for 10 ns was sufficient to bring the
two internal loop closing base pairs (GC) into the expected Watson–Crick
pairing pattern.System 3 was generated from system 1 by replacing
the two cytosines of system 1 with uracils (by manually changing atom
types). System 4 has GU closing base pairs with one hydrogen bond.
To make the closing base pair in system 3 replicate this pair, a targeted
molecular dynamics simulation on system 3, using a biasing potential
force constant of 1 kcal/mol and performed in explicit solvent, was
used on the two closing base pairs with the conformation of system
4 as the target. The simulation was 20 ps, and this was long enough
to achieve the change in GU conformation for both closing pairs. The
subsequent free molecular dynamics simulation showed a GU wobble interaction
was preferred, instead of the single hydrogen bond GU pair observed
in system 4.
Molecular Dynamics Setup
All four
molecules described
in Table 1 were neutralized with Na+ ions and then solvated with a TIP3P water box of equal length sides.
The box dimensions were set so that the RNA was no less than 10 Å
from the edge of the box. Finally 0.1 M NaCl was added based on ∼600
water molecules. This was 10 Na+ and 10 Cl– atoms, and this number of additional salt atoms was used for all
simulations, including the umbrella sampling simulations (below).
All simulations were performed with periodic boundary conditions using
Particle Mesh Ewald[50,51] and a 10 Å direct space
cutoff.
Explicit Solvent Simulations
All four structures were
energy minimized in two steps. During the first step, RNA was held
fixed with a restraint force of 500 kcal/(mol·Å2) and minimized for 2500 steps of steepest descent followed by a
2500 steps of conjugate gradient minimization. During the second step,
the whole system was minimized without any restraints for 2500 steps
of steepest descent followed by 2500 steps of conjugate gradient minimization.
The final energy-minimized structure was heated at constant volume
from 0 K to a final of 300 K over 50 ps. During this initial heating
stage, the RNA was held fixed in space with a harmonic potential of
10 kcal/(mol·Å2) for 50 ps. The final step before
production was to equilibrate at constant temperature of 300 K and
a pressure of 1 atm for 100 ps. For the simulations, SHAKE[52,53] was used to restrain bonds involving hydrogen atoms and the time
step was 2 fs.
Nudge Elastic Band (NEB)
An implicit
solvent NEB calculation
was used to search the conformational change pathway for each sequence.[54,55] The Generalized Born implicit solvent was used with a 0.1 M NaCl
concentration, a 25 Å cutoff for the solvation calculations and
a 99 Å cutoff for nonbonded interactions.[56] To identify starting structures, all snapshots from the
explicit solvent molecular dynamics were stripped of solvent, and
energy-minimized with Generalized Born implicit solvent. These structures
were energy minimized in two stages using the Amber ff10 force field.
First, a steepest descent minimization for 2500 steps was performed,
and this was followed by 2500 steps of conjugate gradient minimization.
The lowest potential energy image was then subsequently used in NEB
as an end point structure. For NEB, 18 images copied from the imino
conformation were followed by 18 images copied from the sheared conformation
to have 36 images in total. Those 36 images were initially heated
with a controlled increase of temperature from 0 to 300 K with an
NEB spring constant of 10 kcal/(mol·Å2) over
50 ps. This followed by constant temperature equilibration over 100
ps. A simulated annealing calculation (detailed in Supporting Information Table 1) was performed to search the
conformational change pathways.[54,55,57] During this pathway search, a time step of 1 fs and a Langevin dynamics
collision frequency of 1000 ps–1 were used. For
the NEB simulated annealing, the modified ff10 force field was used.
Umbrella Sampling
For each sequence described in Table 1, three umbrella sampling calculations were performed.
For system 1 and system 2, a total of 153 windows were used. For system
4 and system 3, 130 windows were used. Each window was sampled for
13 ns. A two-dimension reaction coordinate involving angles between
A13 (center mass of N3, C4, and C2), G4 (center mass of N1, C6, and
C5), and G4 (atom N3) and between A5 (center mass of N3, C4 and C5),
A5 (center mass of N1, C5, and C6), and G12 (atom N3) was chosen to
model the conformational change pathway. The MD setup was the same
as the unrestrained molecular dynamics setup explained in the above,
except a restraining harmonic potential was added during the second
step of minimization, and this restraining potential was kept for
the rest of the simulation time. This harmonic potential was an angle
restraint with a parabolic side with a force constant of 100 kcal/(mol·rad2). Coordinates were written to disk every 0.5 ps.
Free Energy
Difference Calculation
The Weighted Histogram
Analysis Method[58,59] (WHAM) was used to merge potential
of mean force (PMF) surfaces of individual windows into a single PMF
profile. This was done using the WHAM software package, version 2.0.7.[60] In the WHAM calculation, 1 ns of equilibration
was used, 12 ns of sampling for each umbrella simulation were used,
periodicity was off in both dimensions, and the bin size was one degree
in each dimension.Based on the angles observed in unrestrained
molecular dynamics, regions on the PMF were chosen to represent the
imino and sheared conformations. Then, the free energy difference
was calculated usingwhere the terms ΔGreactant→product is the free energy difference going
from the nonobserved structure to the native structure and pbin was probability of being inside a bin of
size 1 degree by 1 degree area. The sums were over all bins attributed
to the imino or sheared conformations.
General Analysis
Distance and root-mean-square deviation
(RMSD) analyses were calculated using the ptraj module in the AmberTools
package.[27] Nucleotides structures were
diagrammed with ChemDraw Ultra 12.0 and the PMF plot was generated
using Matlab.[61,62]
Results
Molecular Dynamics
and the Importance of Modified ff10
The initial molecular
dynamics simulations were run with the standard
Amber ff10 force field.[25] For system 1
and system 4, the native structures shown in Table 1, three independent simulations were run for 1 μs each.
Three independent trajectories were run for systems 2 and 3, where
system 2 was run for 1 μs each, but the system 3 simulations
were run for shorter durations. Figure 2 shows
the mass-weighted RMSD of the loop regions for the imino-paired systems,
including closing base pairs. For ff10, the plots show the native
loop structure (system 1) was consistently lost. The structure oscillated
between the native structure and a non-native structure, spending
a majority of the simulation time at the non-native structure.
Figure 2
Mass-weighted
root-mean-square deviations (RMSD) for the imino-paired
structures, system 1 (native structure, panel A) and system 3 (non-native
structure, panel B), for nucleotides in the internal loop and the
loop-closing base pairs as a function of simulation time. Blue and
red plots were the RMSD with ff10 and modified ff10 force fields,
respectively. For system 3, the non-native structure, the ff10 simulations
were stopped because of the evidence that the modified ff10 could
better model the native structure.
Mass-weighted
root-mean-square deviations (RMSD) for the imino-paired
structures, system 1 (native structure, panel A) and system 3 (non-native
structure, panel B), for nucleotides in the internal loop and the
loop-closing base pairs as a function of simulation time. Blue and
red plots were the RMSD with ff10 and modified ff10 force fields,
respectively. For system 3, the non-native structure, the ff10 simulations
were stopped because of the evidence that the modified ff10 could
better model the native structure.Previous work on iminoGA pairs had demonstrated the importance
of allowing amino groups to leave the plane of the base.[38] Using these modified ff10 parameters, a separate
set of simulations were run. For each system, shown in Table 1, a total of three independent simulations of one
microsecond duration were run. The native imino pairing structure
was preserved for the duration of the simulations (Figure 2). In addition, the structures of the sheared-paired
systems, 2 and 4, were maintained using the modified ff10 as shown
by RMSD (Supporting Information Figure S1). Because the native structures were more consistently maintained
for MD using modified ff10 than for ff10, MD simulations for the imino-paired
control structure (system 3) were completed only with modified ff10.In addition to following the simulations of structures by RMSD,
atomic distances were also followed as a function of time. One instructive
distance is that between C2 of adenine and O6 of guanine (Figure 3A), which is at about 4 Å in the solution structure.
For system 1, this distance demonstrated the differences in the conformations
visited by simulations run with ff10 and modified ff10. Figure 3B provides the distance as a function of time and
a histogram plot. For ff10, the distance fluctuated, demonstrating
both an alternative base pairing (at a distance of approximately 3
Å) and a loss of base pairing (at a distance of approximately
7 Å). For modified ff10, the distance was more consistent with
the imino pairing observed in the solution structure.
Figure 3
Distance plots of GA
pairs of system 1. The distance between C2
of adenine and O6 of guanine was followed as a function of time for
each of the GA pairs. (A) The top structure shows the imino pairing
as observed in the solution structure. In the solution structure,
this distance is 3.93 and 3.92 Å for the two pairs. The bottom
structure shows a structure sampled frequently in the simulations
using ff10. This structure corresponds to a distance of approximately
3 Å. (B) The top plots show the distance as a function of time
for all three simulations. The bottom plots show histograms for distances,
aggregated from all three simulations. The left plots are for ff10
and the right plots are for modified ff10. Blue and red colors are
for the distance between A13 (atom C2) and G4 (atom O6) and the distance
between A5 (atom C2) and G12 (atom O6), respectively. The histogram
plots show excellent convergence of the distance for the two pairs
because the two plots nearly superimpose. The histograms also show
clear differences between ff10 and modified ff10.
Distance plots of GA
pairs of system 1. The distance between C2
of adenine and O6 of guanine was followed as a function of time for
each of the GA pairs. (A) The top structure shows the imino pairing
as observed in the solution structure. In the solution structure,
this distance is 3.93 and 3.92 Å for the two pairs. The bottom
structure shows a structure sampled frequently in the simulations
using ff10. This structure corresponds to a distance of approximately
3 Å. (B) The top plots show the distance as a function of time
for all three simulations. The bottom plots show histograms for distances,
aggregated from all three simulations. The left plots are for ff10
and the right plots are for modified ff10. Blue and red colors are
for the distance between A13 (atom C2) and G4 (atom O6) and the distance
between A5 (atom C2) and G12 (atom O6), respectively. The histogram
plots show excellent convergence of the distance for the two pairs
because the two plots nearly superimpose. The histograms also show
clear differences between ff10 and modified ff10.
Preference of GU Conformation of System 3 and System 4
The
modified ff10 parameters were previously thought to stabilize
the imino pairing pattern by allowing the exocyclic amine to participate
in other interactions. In this work, the stability of system 4, the
sheared conformation, was also improved because modified ff10 stabilized
the closing GU base pairs. As shown in the top of Figure 4A, the solution structure has a single hydrogen
bond to the uracil O4 (called SHB4 by Tolbert et al.[21]). The loop RMSD also showed the switching from SHB4 to
GU wobble interaction as shown by Figure 4B
(right column) and Supporting Information Figure
S1 (right column). Simulations with ff10 (left of Figure 4B) often visited the wobble GU pair (bottom of Figure 4A), as demonstrated by the distance of uracil N3
to guanine O6. This pairing was also observed to a lesser extent in
the modified ff10 simulations (right of Figure 4B). In contrast, the non-native imino pairing (system 3) preferred
the GU wobble interaction (bottom of Figure 4A and Figure 4C).
Figure 4
Distance plots for closing
GU pairs. The guanine O6 and uracil
N3 distance was measured for both closing GU pairs in systems 3 and
4. (A) The top structure is the solution structure (system 4), which
has a distance of 9.11 Å and 9.02 Å for the two pairs, and
the bottom structure is a wobble GU, with a distance of approximately
3 Å. (B) The distance as a function of time for system 4, the
native sheared structure. (C) The distance as a function of time for
the imino control structure (system 3). In panels B and C, the ff10
force field is to the left and the modified ff10 force field is to
the right. Blue and red colors are for the distance between U14 (atom
N3) and 3G (atom O6) and the distance between U6 (atom N3) and G11
(atom O6), respectively.
Distance plots for closing
GU pairs. The guanine O6 and uracil
N3 distance was measured for both closing GU pairs in systems 3 and
4. (A) The top structure is the solution structure (system 4), which
has a distance of 9.11 Å and 9.02 Å for the two pairs, and
the bottom structure is a wobble GU, with a distance of approximately
3 Å. (B) The distance as a function of time for system 4, the
native sheared structure. (C) The distance as a function of time for
the imino control structure (system 3). In panels B and C, the ff10
force field is to the left and the modified ff10 force field is to
the right. Blue and red colors are for the distance between U14 (atom
N3) and 3G (atom O6) and the distance between U6 (atom N3) and G11
(atom O6), respectively.
Conformational Preference of Tandem GA Pairs
The simulations
provide insight into the atomic interactions that drive the conformational
preference of tandem RNA GA pairs. For GA pairs closed by either GC
or GU, the out-of-plane[38] interactions
between the guanine N2 and O2 of the adjacent paired uracil or cytosine
stabilized the imino conformation of the GA pair. For GC closure,
however, the additional stabilization is to a greater extent because
the average distance between the amino group of G and the O2 of the
pyrimidine in the closing pair is shorter when the pyrimidine in the
closing pair is C (Supporting Information Figure
S2A and B).The MD snapshots provide an atomic level
detail how the interaction of the guanine amino group was affected
by closing with either GC or GU pairs. In the case of GU closure (system
3), the O2 of U14 was involved in GU wobble interactions as shown
in Supporting Information Figure S2A and
the out-of-plane interaction distance between O2 and N2 was generally
longer than hydrogen bonding distance. Among all trajectories with
modified ff10, 96% of snapshots with GC closure (system 1, native
structure) had a distance between N2 of G4 and O2 of C14 of less than
4 Å. For GU closure (system 3, non-native), however, only 64%
of snapshots had an O2 of U14 to N2 of G4 distance of less than 4
Å. As shown in panel C of Supporting Information
Figure S2, in system 1 the distance between O2 of C14 and N2
of G3, a hydrogen bond for the Watson–Crick pair, is generally
less than 3 Å, while the out-of-plane interaction between O2
of C14 and N2 of G4 is simultaneously less than 3 Å. In contrast,
in panel D of Supporting Information Figure S2, system 3 shows a significant number of snapshots at hydrogen bonding
distance for O2 of U14 and N1 of G3, a hydrogen bond for the wobble
pair, but with a long distance between O2 of U14 and N2 of G4. Based
on these findings, it appears that a GA pair closed by GU pair prefers
the sheared interactions because the GU wobble interaction sequesters
the O2 of the U and reduces its out-of-plane interactions with the
amino of guanine.
Conformational Change Pathway Search
Reaction coordinates
were needed to compute the free energy difference between the imino
and sheared conformations. In order to find reasonable pathways, Nudged
Elastic Band (NEB) calculations were used. These calculations, detailed
in the Methods, were repeated five times for
each of the sequences.The NEB calculations suggested a two-dimensional
reaction coordinate (one dimension for each of the tandem pairs),
involving the angles between A13 (center of mass of N3, C4 and C2),
G4 (center of mass of N1, C6, and C5), and G4 (atom N3) and between
A5 (center of mass of N3, C4, and C5), G12 (center of mass of N1,
C5, and C6), and G12 (atom N3). On the plane defined by these reaction
coordinates (Figure 5), the imino conformation
is the right top corner centered at (100, 100) and the sheared conformation
is the left bottom corner centered at (20, 20). The NEB pathway involves
a stepwise movement of the pairs as shown in Figure 5 for sequence (GCGGACGC)2.[22] A similar path was also found for sequence (GCGGAUGC)2[21] as shown in Supporting
Information Figure S3. This path shows only one GA pair changing
conformation at a time. This was far from the diagonal of the reaction
coordinate, where both pairs would be moving through transition states
simultaneously.
Figure 5
NEB images and unrestrained molecular dynamics as a function
of
reaction coordinate for molecule (GCGGACGC)2.[22] Blue stars plot the NEB images for one calculation
and the dots are snapshots from the unrestrained molecular dynamic
simulations. The five NEB calculations showed a consistent path. Snapshots
were taken every 10 ps from three unrestrained molecular dynamics
calculations (shown in red, green, and blue) for imino (system 1)
and (shown in purple, orange and cyan) for sheared (system 2). The
reaction coordinate involves angles between A13 (center of mass of
N3, C4, and C2), G4 (center of mass of N1, C6, and C5), and G4 (atom
N3) on the x-axis and the angle between A5 (center
of mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6),
and G12 (atom N3) on the y-axis.
NEB images and unrestrained molecular dynamics as a function
of
reaction coordinate for molecule (GCGGACGC)2.[22] Blue stars plot the NEB images for one calculation
and the dots are snapshots from the unrestrained molecular dynamic
simulations. The five NEB calculations showed a consistent path. Snapshots
were taken every 10 ps from three unrestrained molecular dynamics
calculations (shown in red, green, and blue) for imino (system 1)
and (shown in purple, orange and cyan) for sheared (system 2). The
reaction coordinate involves angles between A13 (center of mass of
N3, C4, and C2), G4 (center of mass of N1, C6, and C5), and G4 (atom
N3) on the x-axis and the angle between A5 (center
of mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6),
and G12 (atom N3) on the y-axis.Umbrella sampling and WHAM were applied
to calculate the potential of mean force (PMF) profile along the reaction
coordinate. To calculate the PMF of the transition from system 2 to
system 1, a total of 153 windows were used. That transition was studied
with three separate calculations with 1 ns of equilibration, followed
by 12 ns of sampling in each window, for a total of approximately
6 μs aggregate simulation time. The location and number of windows
was selected in conjunction with the restraint force constant to ensure
sampling overlap of adjacent windows. Supporting
Information Figure S4 demonstrates the sampling overlap for
each of the three calculations. Additionally, 130 windows were used
to calculate the PMF of the transition from system 3 to system 4 with
12 ns of sampling after 1 ns of equilibration for each for a total
approximately 5 μs simulation time. Supporting
Information Figure S4 demonstrates the sampling overlap for
each of the three calculations.The PMF profiles (Figures 6 and 7 for GC closed loops
and GU closed loops, respectively) had two major minima located on
the top right corner, the imino conformation, and on the left bottom
corner, the sheared conformation. An additional fourth PMF profile
was also generated for each sequence by aggregating the data from
all three umbrella sampling simulations. Overall, the locations of
the global minima are similar for the individual simulations.
Figure 6
Potential of
mean force plots for the conformational change from
system 1 to system 2 calculated using umbrella sampling and WHAM.
Plots A, B, and C were the PMF corresponding to three independent
calculations, whereas plot D was the PMF calculated with the aggregate
data. The plots show the angle between A13 (center of mass of N3,
C4, and C2), G4 (center of mass of N1, C6, and C5) and G4 (atom N3)
on the x-axis and the angle between A5 (center of
mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6) and
G12 (atom N3) on the y-axis.
Figure 7
Potential of mean force plot for the conformational change from
system 4 to system 3 calculated using umbrella sampling and WHAM.
Plots A, B, and C were the PMF corresponding to three independent
simulations, whereas plot D was the PMF calculated from the aggregate
data. The plots show the angle between A13 (center of mass of N3,
C4, and C2), G4 (center of mass of N1, C6, and C5), and G4 (atom N3)
on the x-axis and the angle between A5 (center of
mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6), and
G12 (atom N3) on the y-axis.
Potential of
mean force plots for the conformational change from
system 1 to system 2 calculated using umbrella sampling and WHAM.
Plots A, B, and C were the PMF corresponding to three independent
calculations, whereas plot D was the PMF calculated with the aggregate
data. The plots show the angle between A13 (center of mass of N3,
C4, and C2), G4 (center of mass of N1, C6, and C5) and G4 (atom N3)
on the x-axis and the angle between A5 (center of
mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6) and
G12 (atom N3) on the y-axis.Potential of mean force plot for the conformational change from
system 4 to system 3 calculated using umbrella sampling and WHAM.
Plots A, B, and C were the PMF corresponding to three independent
simulations, whereas plot D was the PMF calculated from the aggregate
data. The plots show the angle between A13 (center of mass of N3,
C4, and C2), G4 (center of mass of N1, C6, and C5), and G4 (atom N3)
on the x-axis and the angle between A5 (center of
mass of N3, C4, and C5), G12 (center of mass of N1, C5, and C6), and
G12 (atom N3) on the y-axis.As detailed in the Methods, the unrestrained
MD suggested regions of the PMF to consider as representing the shared
or imino structures (Figure 5). The free energy
differences were then calculated between these areas on the plots.
To test convergence, the free energy difference between the two conformations
was generated using eq 1 as a function of time
for all the individual runs and for the average PMF at 1 ns intervals
(Figure 8). For all six umbrella sampling calculations,
12 ns of sampling (after 1 ns of equilibration) was sufficient so
that the PMFs were no longer changing as a function of sampling time.
Figure 8
Convergence
of free energy change. The points annotated with red,
green, and blue were three independent simulations and the black was
an average over all three simulations. The free energy difference
was calculated at 1 ns intervals, using data prior to that simulation
time. Left and right plots correspond to the conformational change
of system 2 to system 1 and system 3 to system 4, respectively.
Convergence
of free energy change. The points annotated with red,
green, and blue were three independent simulations and the black was
an average over all three simulations. The free energy difference
was calculated at 1 ns intervals, using data prior to that simulation
time. Left and right plots correspond to the conformational change
of system 2 to system 1 and system 3 to system 4, respectively.The average free energy difference,
using all the sampling data,
for the conformational change from system 2 (sheared) to system 1
(imino) was −0.16 ± 0.10 kcal/mol. For the conformational
change for system 3 (imino) to system 4 (sheared), the free energy
difference was −0.24 ± 0.11 kcal/mol. The direction of
the free energy difference was in agreement with the experimentally
determined conformational preference, although the magnitudes are
probably underestimating the extent of conformational preference because
no characteristics of the control structures are observed in the NMR
spectra.[21,22]
Discussion
In
this work, the conformational preference of tandem GA pairs
was studied with explicit solvent molecular dynamics. The native imino
structure, system 1, was not preserved during MD simulations with
ff10. A modified Amber force field, modified ff10,[38] was used and force field was shown to preserve the native
structure in MD and it was also capable of predicting the conformational
preference.Recent studies reported limitations in the available
Amber force
fields for RNA when modeling loop structures. ff99 with the parmbsc0
modification was shown to not correctly stabilize tetraloop structures,[63] an AA noncanonical pair[46] or the preferred conformations of GACC.[64] The recent ff10 force field, which includes revised torsion parameters
for α, γ, and χ, however, was shown to correctly
model tetraloop structures.[63] Also, ff99
with the parmbsc0 modification was shown to correctly model the relative
stability of helices, within the sampling error.[65]This work shows that molecular mechanics can also
correctly model
the conformational preference of tandem GA noncanonical pairs, once
a simple modification of the force field is applied.[38] This modified force field was implemented for both imino
and sheared conformations despite a previous recommendation to use
the parameters for GAimino interactions only. Apparently, the relaxation
of the improper dihedral that keeps the amino group atoms in the plane
of the base does not adversely affect the sheared interaction when
closed by GC pairs. This is also reinforced by unrestrained molecular
dynamics simulations on system 4 (Figure 4B).The extent of the conformational preference for the native structures
is underestimated for both sequences. As shown in Table 2, the free energy differences between the native structure
and the non-native, decoy structure are less than RT, although the decoys were not observed in solution. This suggests
that there are other interactions that are not yet correctly modeled
by the modified ff10 force field. This shallow conformational preference
also highlights the importance of the free energy calculations performed
here. In each sequence, unrestrained MD starting from the solution
structure does not sample the control (non-native) conformation because
the barrier between the states is larger than 6 kcal/mol (Figures 6 and 7). It would not be
apparent that the native conformations are understabilized by the
force field if the umbrella sampling, or other enhanced sampling,
was not performed.
Table 2
Free Energy Difference Calculated
from the PMF (Figures 6 and 7) Using Equation 1
system 2 ⇌ 1
ΔΔG (kcal/mol)
system 3 ⇌ 4
ΔΔG (kcal/mol)
simulation 1
–0.17
simulation 1
–0.16
simulation 2
–0.25
simulation 2
–0.19
simulation 3
0.06
simulation 3
–0.36
aggregatea
–0.16
aggregate
–0.23
averageb
–0.16 ± 0.10
average
–0.24 ± 0.11
ΔΔG calculated based on the aggregate data of all three simulations.
Mean ΔΔG calculated from the individual calculations. Errors were
the standard
deviation of three simulations.
ΔΔG calculated based on the aggregate data of all three simulations.Mean ΔΔG calculated from the individual calculations. Errors were
the standard
deviation of three simulations.This work shows the promise of currently available RNA force fields,
but it also shows the importance of continued development and testing
of force field parameters. The free energy calculations performed
here allow direct comparison between simulation and experiment and
provide a benchmark for force fields. This work also shows the importance
of broadening the set of RNA molecules used to test force field performance.
Authors: Gang Chen; Ryszard Kierzek; Ilyas Yildirim; Thomas R Krugh; Douglas H Turner; Scott D Kennedy Journal: J Phys Chem B Date: 2007-04-06 Impact factor: 2.991
Authors: B R Brooks; C L Brooks; A D Mackerell; L Nilsson; R J Petrella; B Roux; Y Won; G Archontis; C Bartels; S Boresch; A Caflisch; L Caves; Q Cui; A R Dinner; M Feig; S Fischer; J Gao; M Hodoscek; W Im; K Kuczera; T Lazaridis; J Ma; V Ovchinnikov; E Paci; R W Pastor; C B Post; J Z Pu; M Schaefer; B Tidor; R M Venable; H L Woodcock; X Wu; W Yang; D M York; M Karplus Journal: J Comput Chem Date: 2009-07-30 Impact factor: 3.376
Authors: Sander Pronk; Szilárd Páll; Roland Schulz; Per Larsson; Pär Bjelkmar; Rossen Apostolov; Michael R Shirts; Jeremy C Smith; Peter M Kasson; David van der Spoel; Berk Hess; Erik Lindahl Journal: Bioinformatics Date: 2013-02-13 Impact factor: 6.937
Authors: Ilyas Yildirim; Harry A Stern; Jiri Sponer; Nada Spackova; Douglas H Turner Journal: J Chem Theory Comput Date: 2009-07-02 Impact factor: 6.006
Authors: Joseph A Liberman; Krishna C Suddala; Asaminew Aytenfisu; Dalen Chan; Ivan A Belashov; Mohammad Salim; David H Mathews; Robert C Spitale; Nils G Walter; Joseph E Wedekind Journal: Proc Natl Acad Sci U S A Date: 2015-06-23 Impact factor: 11.205
Authors: Jiří Šponer; Giovanni Bussi; Miroslav Krepl; Pavel Banáš; Sandro Bottaro; Richard A Cunha; Alejandro Gil-Ley; Giovanni Pinamonti; Simón Poblete; Petr Jurečka; Nils G Walter; Michal Otyepka Journal: Chem Rev Date: 2018-01-03 Impact factor: 60.622
Authors: Kyle D Berger; Scott D Kennedy; Susan J Schroeder; Brent M Znosko; Hongying Sun; David H Mathews; Douglas H Turner Journal: Biochemistry Date: 2018-03-23 Impact factor: 3.162
Authors: Wilma K Olson; Shuxiang Li; Thomas Kaukonen; Andrew V Colasanti; Yurong Xin; Xiang-Jun Lu Journal: Biochemistry Date: 2019-05-08 Impact factor: 3.162
Authors: Javier Gutiérrez-Fernández; Malek Saleh; Martín Alcorlo; Alejandro Gómez-Mejía; David Pantoja-Uceda; Miguel A Treviño; Franziska Voß; Mohammed R Abdullah; Sergio Galán-Bartual; Jolien Seinen; Pedro A Sánchez-Murcia; Federico Gago; Marta Bruix; Sven Hammerschmidt; Juan A Hermoso Journal: Sci Rep Date: 2016-12-05 Impact factor: 4.379
Authors: Asaminew H Aytenfisu; Aleksandar Spasic; Alan Grossfield; Harry A Stern; David H Mathews Journal: J Chem Theory Comput Date: 2017-01-24 Impact factor: 6.006
Authors: Aleksandar Spasic; Scott D Kennedy; Laura Needham; Muthiah Manoharan; Ryszard Kierzek; Douglas H Turner; David H Mathews Journal: RNA Date: 2018-02-06 Impact factor: 4.942