Melek N Ucisik¹, Sharon Hammes-Schiffer¹
¹ Department of Chemistry, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois 61801-3364, United States.
Abstract
High-energy ultraviolet radiation damages DNA through the formation of cyclobutane pyrimidine dimers, which stall replication. When the lesion is a thymine-thymine dimer (TTD), human DNA polymerase η (Pol η) assists in resuming the replication process by inserting nucleotides opposite the damaged site. We performed extensive molecular dynamics (MD) simulations to investigate the structural and dynamical effects of four different Pol η complexes with or without a TTD and with either dATP or dGTP as the incoming base. No major differences in the overall structures and equilibrium dynamics were detected among the four systems, suggesting that the specificity of this enzyme is due predominantly to differences in local interactions in the binding regions. Analysis of the hydrogen-bonding interactions between the enzyme and the DNA and dNTP provided molecular-level insights. Specifically, the TTD was observed to engage in more hydrogen-bonding interactions with the enzyme than its undamaged counterpart of two normal thymines. The resulting greater rigidity and specific orientation of the TTD are consistent with the experimental observation of higher processivity and overall efficiency at TTD sites than at analogous sites with two normal thymines. The similarities between the systems containing dATP and dGTP are consistent with the experimental observation of relatively low fidelity with respect to the incoming base. Moreover, Q38 and R61, two strictly conserved amino acids across the Pol η family, were found to exhibit persistent hydrogen-bonding interactions with the TTD and cation-π interactions with the free base, respectively. Thus, these simulations provide molecular level insights into the basis for the selectivity and efficiency of this enzyme, as well as the roles of the two most strictly conserved residues.
The energy of the ultraviolet
radiation from the sun is high enough
to catalyze the formation of covalent bonds between adjacent pyrimidine
bases in DNA, resulting in cyclobutane pyrimidine dimers (CPDs).[1,2] Such CPDs constitute one of the most prevalent types of DNA damage
caused by exposure to sunlight.[3−6] This alteration of the pyrimidine nucleotides in
DNA leads to structural and chemical changes in the vicinity of the
CPD, modifying the Watson–Crick base pairing and base stacking.
These changes are not tolerated by the high-fidelity, high-processivity
DNA replication polymerases when these cells attempt to replicate,
thereby resulting in stalled replication forks.[6−9] For the replication to continue
and the genomic material to be correctly transferred to the new generation
of cells, the CPD lesions need to be either excised and replaced with
their undamaged counterparts or bypassed, a process by which the lesion
itself is not repaired, but the primer strand retains accurate genomic
information despite the damage in the template strand.[7,10−12] In the latter case, the Watson–Crick base
pairing occurs correctly against the distorted nucleotides comprising
the lesion, and the replication proceeds as usual. This process, denoted
translesion DNA synthesis, is performed by specialized DNA polymerases,
most of which are categorized as the Y-family DNA polymerases.[13−15] In humans, the Y-family encompasses four out of the 17 DNA polymerases:
η, ι, κ, and Rev1. Each of these enzymes has different
preferences in terms of the lesion and the incoming nucleotide that
would be incorporated opposite the damaged bases.[16−18]

The bypass of cyclobutane thymine–thymine dimers (TTDs)
is the specialty of DNA polymerase η, denoted Pol η, which
is encoded by the human POLH gene.[19−22] Mutations in this gene cause
the variant form of Xeroderma Pigmentosum, a condition characterized
by a deficiency in repairing sun-induced damage in skin, which in turn
leads to increased sensitivity to sunlight and a high susceptibility
to skin cancer.[23−27] Hence, the correct and complete functioning of Pol η is vital
for humans. It binds to TTD-containing DNA more strongly than to undamaged
DNA, and it exhibits higher accuracy and processivity when extending
the DNA primer opposite a TTD than opposite two normal thymines.[28,29]

Despite its critical role in alleviating the negative effects of
sun exposure, Pol η exhibits a few potentially disadvantageous
properties that are common to the entire class of Y-family DNA polymerases.
It incorporates incorrect bases frequently when operating on undamaged
DNA, which could have severe mutational consequences.[30−32] Furthermore, it has a lower processivity and a lower catalytic efficiency
than DNA replicases.[17] Thus, the use of
Pol η for DNA replication is strictly regulated, and it is utilized
only when the replication fork encounters a TTD.[21] In such cases, Pol η takes over the replication with
its open active site to accommodate the bulky CPDs.[19,21] It also exhibits activity against the intrastrand cross-links in
DNA that are induced by anticancer agents such as cisplatin, carboplatin,
gemcitabine, and oxaliplatin. The activity of Pol η provides
an opportunity for the cancer cells to proliferate further, rendering
such chemotherapy less potent.[33−37] This resistance to cisplatin treatment could be decreased or even
eradicated upon greater understanding of the structural and dynamical
properties of Pol η.

In this paper, we present comparative
molecular dynamics (MD) studies
conducted on four systems comprised of the catalytic domain of Pol
η, a DNA template/primer of six or eight base pairs with or
without a TTD, and a free deoxyribose nucleotide triphosphate (dNTP),
namely either dATP or dGTP.[19] The configurations
generated along microsecond MD trajectories are extensively analyzed
to elucidate the differences and similarities among the four systems.
In particular, we investigate the hydrogen-bonding patterns in the
region of the active site containing the incoming dNTP and the TTD
or the two consecutive, normal thymines (TT) at the same location.
These analyses provide insights into the structural and dynamical
properties of Pol η that could be relevant to its critical function.
Methods
We performed classical MD simulations on four systems comprised
of the catalytic domain of the enzyme Pol η, a DNA primer/template
of six or eight base pairs, and a free, incoming deoxyribose nucleotide
triphosphate (dNTP). The enzyme consists of 432 amino acid residues
and requires two Mg2+ ions for the nucleotidyl transfer
reaction.[38] The four systems studied are
as follows:
(1) Pol η, an enzyme-bound DNA primer/template
of eight base pairs with a TTD in its template strand, and a free
deoxyribose adenine triphosphate (dATP) base-paired with the 3′
thymine of the TTD. This system is denoted “TTD3′-A”
in our analysis, and the initial structure was obtained from the PDB
structure 3MR3[19] with dAMNPP modified to dATP.
(2) Pol η, an enzyme-bound DNA primer/template
of eight base pairs with a TTD in its template strand, and a free
deoxyribose guanine triphosphate (dGTP) base-paired with the 3′
thymine of the TTD. This system is termed “TTD3′-G”
in our analysis, and the initial structure was obtained from the PDB
structure 3MR3 with dAMNPP modified to dGTP to investigate the effects
of a purine different from adenine, which is the natural Watson–Crick
base-pair partner for thymine.
(3) Pol η, an enzyme-bound DNA primer/template
of six base pairs with no defect, and a free dATP base-paired with
a normal thymine followed by another one located where the TTD would
be. This system is denoted “N/A-A” in our analysis,
and the initial structure was obtained from the PDB structure 3MR2[19]
with dAMNPP modified to dATP.
(4) Pol η, an enzyme-bound DNA primer/template
of six base pairs with a TTD in its template strand, and a free dATP
base-paired with the 5′ thymine of the TTD. This system is
denoted “TTD5′-A” in our analysis, and the initial
structure was obtained from the PDB structure 3SI8[19] with dAMNPP
modified to dATP.
The complete DNA sequences and numbering are given in Table S1, and the entire complex for each system
is depicted in Figure S1. The differences
between these systems are highlighted in Figure 1.
Figure 1
Active sites of the systems TTD3′-A (A),
TTD3′-G
(B), N/A-A (C), and TTD5′-A (D). The incoming free nucleotide,
dATP or dGTP, is shown in ball-and-stick representation and labeled.
The TTD or consecutive normal thymines for N/A-A is displayed in licorice
representation with the 3′ and 5′ ends labeled.
These four systems were simulated
with the ff12SB force field[39−42] for the protein and the ff99bsc0 force field for
the DNA within
the AMBER14 suite of programs.[43] All of
the protein, nucleic acid, and solvent atoms were treated explicitly,
and the Mg2+ parameters were adopted from Allner et al.[44] The Mg2+ ions were free to move but
remained close to their original positions throughout the simulations.
The atomic charges for the TTD and the dNTP residues, with N defined
as A or G, were obtained using the restrained electrostatic potential
(RESP) method.[45,46] The details pertaining to our
RESP protocol and system preparation are provided in the SI. Each system was solvated with the TIP3P triangulated
water model[47] in a periodically replicated
truncated octahedral water box with sides that were at least 10 Å
from any solute atom. The systems were neutralized by the addition
of Na+ ions, and then additional Na+ and Cl– ions were added to bring the salt concentration to
∼125 mM. The charged amino acids were modeled according to
the protonation states obtained with the H++ protonation state server
at neutral pH.[48]

Initially, each system
was processed through an energy-minimization
protocol comprised of seven stages: the minimization of only the solvent
atoms and the counterions (stage 1), the minimization of the solute
hydrogen atoms (stage 2), the minimization of the side chains by gradually
decreasing the harmonic positional restraints acting on them (stages
3–6), and finally the energy minimization of the whole system
with no positional restraints (stage 7). A total of 63 000
energy-minimization steps were performed; the first 24 000
steps used the steepest descent method,[49] and the remaining 39 000 steps used the conjugate gradient
method for minimization.[49] Subsequent to
the energy minimization, a two-stage equilibration MD protocol was
followed. First, the system was heated slowly from 0 to 300 K over
200 ps of MD within the canonical ensemble (NVT) while maintaining
a weak harmonic restraint on the protein. Second, after the removal
of the harmonic restraints, an MD trajectory of 10 ns at 300 K was
propagated at a constant pressure of 1.0 bar within an isobaric, isothermal
ensemble (NPT) using Langevin dynamics.[50]

Periodic boundary conditions were utilized for the energy
minimizations
and MD simulations. The Particle Mesh Ewald (PME) method was employed
for long-range electrostatic interactions, and an 8 Å nonbonded
cutoff was applied to limit the direct space sum in PME.[51] The lengths of the covalent bonds involving
hydrogen were constrained with the SHAKE algorithm during the MD simulations.[52] The temperature of the systems was maintained
at 300 K, and the pressure was maintained at 1.0 bar with Langevin
dynamics with a collision frequency of 1.0 ps⁻¹.
A time step of 2 fs was used for all MD trajectories.

Three
independent, 1 μs MD trajectories were propagated within
the NPT ensemble for each system beginning with the structures obtained
from the second equilibration phase with random initial velocities
chosen according to a Maxwell–Boltzmann distribution. The MD
trajectories were analyzed to shed light on the structure and dynamics
of the Pol η enzyme and the nucleic acids bound to it, namely
the short DNA primer/template and the incoming dNTP. In this context,
the term dynamics refers to equilibrium motions and fluctuations.
Specifically, we performed a comprehensive analysis of key distances,
root-mean-square deviations (RMSDs), root-mean-square fluctuations
(RMSFs), radii of gyration, surface areas, nucleic acid flexibilities,
and cross-correlations. These analyses were performed with the cpptraj
utility of AmberTools13.[43]
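The protocol quantities quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only numbers stated in the text (time step, trajectory length, number of systems and replicas, minimization steps); the function and constant names are hypothetical and serve only this illustration.

```python
# Bookkeeping for the simulation protocol; every number comes from the Methods.

TIME_STEP_FS = 2.0             # MD time step (fs)
TRAJECTORY_LENGTH_US = 1.0     # length of each production trajectory (microseconds)
N_SYSTEMS = 4                  # TTD3'-A, TTD3'-G, N/A-A, TTD5'-A
N_REPLICAS = 3                 # independent trajectories per system
STEEPEST_DESCENT_STEPS = 24_000
CONJUGATE_GRADIENT_STEPS = 39_000

FS_PER_US = 1_000_000_000      # 1 microsecond = 1e9 femtoseconds


def steps_per_trajectory(length_us: float, dt_fs: float) -> int:
    """Number of integration steps needed to cover length_us with step dt_fs."""
    return round(length_us * FS_PER_US / dt_fs)


def total_sampling_us(n_systems: int, n_replicas: int, length_us: float) -> float:
    """Aggregate production sampling over all systems and replicas."""
    return n_systems * n_replicas * length_us


def total_minimization_steps() -> int:
    """Sum of the steepest descent and conjugate gradient steps."""
    return STEEPEST_DESCENT_STEPS + CONJUGATE_GRADIENT_STEPS


if __name__ == "__main__":
    print(steps_per_trajectory(TRAJECTORY_LENGTH_US, TIME_STEP_FS))
    print(total_sampling_us(N_SYSTEMS, N_REPLICAS, TRAJECTORY_LENGTH_US))
    print(total_minimization_steps())
```

With a 2 fs time step, each 1 μs trajectory corresponds to 5 × 10⁸ integration steps, and the 12 trajectories amount to 12 μs of aggregate sampling.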
Results and Discussion
Structure and Motion of the Enzyme
To evaluate the
structural stability over the MD trajectories, we examined the RMSDs
of the Cα atoms of the protein backbone. Figure 2 overlays the RMSD
profiles of the four systems using data from one MD trajectory per
system. The RMSD analysis indicates that these systems are not fully
equilibrated until ∼50 ns. Following this equilibration, the
RMSD fluctuations within a trajectory remain within 1 Å, indicating
that these are reasonably stable structures. The structural stability
of the N/A-A system is comparable to that of the other three systems,
suggesting that the presence of the TTD does not lead to extra stabilization
that could have been related to the function of this enzyme.
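To make the RMSD measure concrete, the following minimal Python sketch computes the RMSD of a set of Cα coordinates relative to a reference frame, together with the spread of the RMSD after a chosen equilibration time. It assumes that every frame has already been superimposed on the reference, and it is not the cpptraj implementation used for the analysis; the function names and the data layout are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(frame: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(frame) != len(reference) or len(frame) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    squared_sum = 0.0
    for atom, ref_atom in zip(frame, reference):
        squared_sum += sum((a - b) ** 2 for a, b in zip(atom, ref_atom))
    return math.sqrt(squared_sum / len(frame))


def rmsd_spread(values: Sequence[float], times_ns: Sequence[float],
                equilibration_ns: float = 50.0) -> float:
    """Maximum minus minimum RMSD over frames at or after the equilibration time."""
    kept = [v for v, t in zip(values, times_ns) if t >= equilibration_ns]
    if not kept:
        raise ValueError("no frames at or after the equilibration time")
    return max(kept) - min(kept)
```

The default of 50 ns for `equilibration_ns` mirrors the equilibration time noted above; any other cutoff can be passed explicitly.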
Figure 2
(top) RMSDs
of the Cα atoms for the systems TTD3′-A
(black), TTD3′-G (red), N/A-A (green), and TTD5′-A (blue)
obtained from one of the three independent trajectories for each system.
(bottom) Akima spline interpolations of the same data. The RMSDs obtained
from all trajectories are provided in Figure S4.
The RMSDs of Cα atoms obtained from the three
independent trajectories for each system are provided in Figure S4. The RMSD fluctuations within each
trajectory remain within ∼1.5 Å for all 12 independent
trajectories except for a jump observed in one of the TTD5′-A
trajectories and another jump observed at the end of one of the TTD3′-G
trajectories. The TTD5′-A trajectory jump can be traced to
the thumb domain opening up in that trajectory. Specifically, one
of the alpha helices contained within this domain becomes distorted,
changing the packing of the helices and causing that whole domain
to move slightly outward. The TTD3′-G trajectory jump can be
attributed to the movement of the whole thumb domain toward the finger
domain, a nearly opposite motion to what is observed for the TTD5′-A
jump.

To locate the flexible regions
of these four systems, we calculated
the RMSFs of the Cα atoms. Figure 3 features an overlay of the RMSFs from one
MD trajectory per system. The RMSFs for all three trajectories for
each of the four systems are provided in Figure S5. The greatest mobility is found at the loop regions in the
palm (residues 120–180) and little finger (residues 400–420)
domains along with the entire thumb domain (residues 240–310),
which consists of two longer α helices, one short α helix,
and loops. The calculated RMSF plots are generally in agreement with
the B-factors detected in the X-ray experiments for the TTD3′-A
structure, as shown in Figure 3. However, the mobility of the thumb region is not as pronounced
and the mobility of the little finger domain is more pronounced in
the experiments than in the MD simulations.
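The per-atom RMSF and its relation to an isotropic B-factor can be sketched as follows. This is a minimal Python illustration that assumes pre-aligned frames and uses the standard isotropic relation B = (8π²/3)·RMSF²; it is not the cpptraj routine used for the analysis, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsf(trajectory: Sequence[Frame]) -> List[float]:
    """Per-atom RMSF about the time-averaged position (frames pre-aligned)."""
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory is empty")
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean-square fluctuation of atom i about that position.
        msf = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msf))
    return result


def rmsf_to_bfactor(rmsf_value: float) -> float:
    """Isotropic B-factor equivalent, B = (8 * pi^2 / 3) * RMSF^2."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_value ** 2
```

Converting the simulated RMSFs with `rmsf_to_bfactor` places them on the same scale as crystallographic B-factors, which is one way to make a side-by-side comparison quantitative.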
Figure 3
Comparison of the RMSFs of the four systems from MD simulations
and the B-factor values from the crystal structure 3MR3, which we
used as the initial coordinates for the system TTD3′-A.
The open active
site of Pol η is believed to be one of the
potential reasons for its low fidelity.[15,20,21] We investigated the time evolution of the binding
surfaces of the DNA and the substrates, namely the dNTP molecules,
during the MD trajectories. The molecular surface areas for the residues
that directly participate in the nucleotidyl transfer reaction and
the residues that interact closely with these residues were calculated
using the linear combination of pairwise overlaps (LCPO) method by
Weiser et al. as a function of time.[53] Initially,
this analysis was performed for the set of residues identified in
ref [19]. According
to this previous work, the residues participating in the reaction
are D13, M14, D115, and E116, which directly coordinate the two catalytic
Mg2+ ions.[38] In addition, the
residues F18, Q38, Y39, I48, R61, S62, K86, L89, Y92, and R93 were
designated as residues interacting with the DNA and the dNTP around
the active site. The surface area data for these residues are shown
in Figures S6 and S7.

For a more
comprehensive analysis, the definition of active site
was modified to include all residues that are within 5 Å from
the dNTP nucleotide or the TTD lesion (Table S2). These residues were inferred from three frames extracted from
each of the TTD3′-A, TTD3′-G, and TTD5′-A MD
trajectories. The identified residues for each system were combined
to form a consensus active site, and the associated surface area was
calculated for all 12 independent MD trajectories. The results are
depicted in Figure 4. All of these trajectories initially exhibit a surface area value
of approximately 750 Å². The TTD3′-A, TTD3′-G,
and N/A-A trajectories oscillate about this value with minor fluctuations,
with TTD3′-A remaining the most consistent. Two of the N/A-A
trajectories exhibit a slightly more closed active site, which could
be attributed to the absence of the distortion created by the TTD
lesion. However, the difference is relatively small and is thus still
consistent with a rather open active site spacious enough to accommodate
distorted DNA.[15,21]
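The distance-based definition of the active site described above can be sketched in a few lines of Python. The example selects residues with any atom within 5 Å of the dNTP or TTD atoms in a frame and then merges the per-frame selections. Treating the consensus as the union of the selections is an assumption of this sketch, as are the function names and the dictionary-based data layout; the surface area calculation itself (LCPO) is not reproduced here.

```python
import math
from typing import Dict, Iterable, Sequence, Set, Tuple

Coord = Tuple[float, float, float]
CUTOFF_A = 5.0  # distance cutoff from the text, in angstroms


def residues_within_cutoff(protein_atoms: Dict[str, Sequence[Coord]],
                           ligand_atoms: Sequence[Coord],
                           cutoff: float = CUTOFF_A) -> Set[str]:
    """Residues with at least one atom within cutoff of any ligand atom."""
    selected = set()
    for residue, atoms in protein_atoms.items():
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            selected.add(residue)
    return selected


def consensus_site(per_frame_selections: Iterable[Set[str]]) -> Set[str]:
    """Combine per-frame selections into one residue set (union; an assumption)."""
    consensus: Set[str] = set()
    for selection in per_frame_selections:
        consensus |= selection
    return consensus
```

Using one fixed residue set for all trajectories ensures that the surface areas of the four systems are compared over the same atoms.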
Figure 4
Surface areas for the
consensus active site for all three independent
trajectories of the four systems, TTD3′-A (A), TTD3′-G
(B), N/A-A (C), and TTD5′-A (D), are shown as transparent lines,
whereas the Akima spline interpolations of these are shown as thicker,
opaque lines for better visibility of the trends. Black, red, and
green represent the profiles for three independent trajectories.
The greatest differences
across the three trajectories for the
same system are found in the TTD5′-A system, which corresponds
to insertion of the second A opposite the 5′-T of the TTD by
Pol η. For this system, the movement of the entire TTD deeper
into the active site could possibly open up the binding pocket. The
largest increase in the surface area is observed in the third trajectory
of the TTD5′-A system, consistent with the significant change
in the RMSD for this trajectory (Figure S4). Upon visual inspection, this increase in surface area appears
to be due to the motion of the finger domain.

After investigating the active
site surface area, we examined the
overall compactness of the enzyme. For this purpose, we calculated
the time evolution of the radius of gyration, as well as the largest
distance between any two protein atoms, for the MD trajectories. As
shown in Figure 5,
all four systems exhibit similar behavior with respect to these properties.
The analogous data for all trajectories are provided in Figure S8. The compactness of these systems remains
consistent throughout the trajectories with an approximate radius
of gyration of 25 Å. The maximum interatomic distance is less
meaningful than the radius of gyration because of the loop motions,
but even this quantity remains mostly unchanged during the trajectories
except for minor fluctuations.
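Both compactness measures are simple functions of the atomic coordinates of a single frame. The following minimal Python sketch computes the radius of gyration, optionally mass-weighted, and the largest interatomic distance by brute force. Whether the reported values were mass-weighted is not stated in the text, so the optional `masses` argument is an assumption of this sketch, and the function names are hypothetical.

```python
import itertools
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(coords: Sequence[Coord],
                       masses: Optional[Sequence[float]] = None) -> float:
    """Radius of gyration about the (optionally mass-weighted) center."""
    if not coords:
        raise ValueError("no coordinates given")
    weights = list(masses) if masses is not None else [1.0] * len(coords)
    total = sum(weights)
    center = [sum(w * c[k] for w, c in zip(weights, coords)) / total
              for k in range(3)]
    weighted_sq = sum(
        w * sum((c[k] - center[k]) ** 2 for k in range(3))
        for w, c in zip(weights, coords)
    )
    return math.sqrt(weighted_sq / total)


def max_pairwise_distance(coords: Sequence[Coord]) -> float:
    """Largest distance between any two atoms (O(N^2) brute force)."""
    return max((math.dist(a, b)
                for a, b in itertools.combinations(coords, 2)), default=0.0)
```

The maximum distance depends on only two atoms, so a single mobile loop can change it, whereas the radius of gyration averages over all atoms; this is why the latter is the more robust measure of compactness.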
Figure 5
Time evolution of the radius of gyration (lower curves) and the
largest distance between any two protein atoms (upper curves) for
the systems TTD3′-A (black), TTD3′-G (red), N/A-A (green),
and TTD5′-A (blue) obtained from one of the three independent
trajectories for each system. The analogous data obtained from all
trajectories are provided in Figure S8.
To explore the possibility of
correlated motions between distal
residues in Pol η, we calculated the cross-correlation maps
for all pairs of residues (Figure S9).
In this analysis, correlated motions identify residues moving in the
same direction, and anticorrelated motions identify residues moving
in the opposite direction. No obvious pattern of correlations or anticorrelations
was detected for these systems except for a few minor trends. Overall,
these cross-correlation maps do not show any distinct interrelationship
between motions across domains in this enzyme.
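A cross-correlation map of this kind is commonly built from the normalized covariance of atomic displacements, C_ij = ⟨Δr_i·Δr_j⟩ / (⟨Δr_i²⟩⟨Δr_j²⟩)^(1/2), where Δr_i is the displacement of atom i from its mean position. The following minimal Python sketch implements this definition for pre-aligned frames with one representative atom per residue. It is an illustration of the standard formula rather than the cpptraj implementation, and the function name is hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def cross_correlation(trajectory: Sequence[Frame]) -> List[List[float]]:
    """Normalized displacement cross-correlation matrix with values in [-1, 1]."""
    n_frames = len(trajectory)
    if n_frames == 0:
        raise ValueError("trajectory is empty")
    n_atoms = len(trajectory[0])
    means = [[sum(frame[i][k] for frame in trajectory) / n_frames
              for k in range(3)] for i in range(n_atoms)]
    # Displacement of every atom from its mean position, frame by frame.
    disp = [[[frame[i][k] - means[i][k] for k in range(3)]
             for i in range(n_atoms)] for frame in trajectory]

    def avg_dot(i: int, j: int) -> float:
        return sum(sum(d[i][k] * d[j][k] for k in range(3))
                   for d in disp) / n_frames

    msf = [avg_dot(i, i) for i in range(n_atoms)]
    matrix = []
    for i in range(n_atoms):
        row = []
        for j in range(n_atoms):
            norm = math.sqrt(msf[i] * msf[j])
            # An immobile atom has no defined correlation; report 0.0.
            row.append(avg_dot(i, j) / norm if norm > 0.0 else 0.0)
        matrix.append(row)
    return matrix
```

Values near +1 indicate motion in the same direction, values near −1 indicate motion in opposite directions, and values near 0 indicate no linear correlation.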
Structure and Motion of the Nucleic Acids
In this section,
we analyze the structure and motion of the short DNA primer/template
bound to Pol η and the free, incoming nucleotide, dNTP, where
N is either A or G. Prior to this analysis, we ensured that the DNA
strands across the four systems were superimposable. The set of base
pairs C-G, G-C, T-A, C-G, and A-T are common to the primer/templates
in all four systems, thereby allowing a comparison of their interactions
with each other and with the environment. The DNA primer/template
in 3SI8 has
the same overall shape in the enzyme crevice, but the base pair sequence
is structurally shifted upstream by one pair (Figure S10).
Geometric Configurations
To gain
a better understanding
of the geometric configurations of the DNA constructs bound to the
enzyme in the four systems, we analyzed the nucleic acid flexibility
parameters from the MD trajectories. The spatial arrangement of one
base with respect to another within a base pair was examined through
three rotational (buckle, propeller, opening) and three translational
(shear, stretch, stagger) intrabase pair parameters. This examination
showed that the base pairings were predominantly conserved, with the
base pairs maintaining planarity, throughout the trajectories for
each system.

Three translational (shift, slide, rise) and three
rotational (tilt, roll, twist) interbase pair parameters were analyzed
to obtain quantitative information about the spatial arrangement of
the consecutive base pairs and thus the overall DNA structure. The
slide and shift values were found to be small for all base pair steps
in all systems, implying that the B-conformation of DNA was preserved
at all times. The tilt and roll values were found to remain small
during the simulated time frames, ensuring a mostly parallel arrangement
of the base pairs throughout. Additionally, three translational (helical
X-displacement, helical Y-displacement, helical rise) and three rotational
(helical inclination, helical tip, helical twist) helix parameters
were extracted from the MD trajectories. The X- and Y-displacement
values were observed to be mostly around zero, and the rise values
appeared as narrow distributions centered at ∼3 Å. Inclination
and tip angles were usually around 0°, and twist values displayed
narrow distributions predominantly centered at around 35–40°.
All of these findings support the overall observation of conservation
of the B-form for the DNA strands bound to Pol η.

The major and minor groove
widths of the DNA strands bound to Pol
η were also examined. The time evolution of these widths is
depicted in Figure 6 for one independent trajectory per system, and the data for all
trajectories are provided in Figure S11. As depicted in Figure 6, these parameters remained extremely steady over the entire
trajectory for systems N/A-A and TTD5′-A but exhibited wider
fluctuations for systems TTD3′-A and TTD3′-G. These
changes suggest that the DNA may be more mobile in the TTD3′-A
and TTD3′-G systems.
Figure 6
Time evolution
of the major (black) and minor (red) groove widths
for one of the three independent trajectories of the systems TTD3′-A
(A), TTD3′-G (B), N/A-A (C), and TTD5′-A (D). The analogous
data obtained from all trajectories are provided in Figure S11.
Hydrogen Bonds
To characterize the
significant interactions
in these systems, we examined the number of hydrogen bonds formed
within the nucleic acid subsystem, namely within the subsystem comprised
of the DNA primer/template and the dNTP. To compare these numbers, Figures 7 and S12 depict histograms of the number of hydrogen
bonds within the nucleic acid subsystem. Figure 7 illustrates that the nucleic acid construct
in the TTD3′-A system forms the largest number of hydrogen
bonds, suggesting stronger interactions between the DNA primer/template
structure and the dNTP molecule. Interestingly, the smallest number
of hydrogen bonds is found in the N/A-A system without a TTD.
Figure 7
Histograms
depicting the number of hydrogen bonds formed within
the nucleic acids obtained from one of the three independent trajectories
of the systems TTD3′-A (black), TTD3′-G (red), N/A-A
(green), and TTD5′-A (blue). Hydrogen bonds were defined with
a donor–acceptor heavy-atom distance cutoff of 3.5 Å and
a donor–hydrogen–acceptor angle cutoff of 135°.
The analogous data obtained from all trajectories are provided in Figure S12.
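The geometric hydrogen-bond criterion stated in the caption above (donor–acceptor distance of at most 3.5 Å and donor–hydrogen–acceptor angle of at least 135°) can be written as a short test on three atomic positions. The following is a minimal Python sketch; the function names are hypothetical, and treating both cutoffs as inclusive is an assumption.

```python
import math
from typing import Tuple

Coord = Tuple[float, float, float]

DISTANCE_CUTOFF_A = 3.5   # donor-acceptor heavy-atom distance cutoff (angstroms)
ANGLE_CUTOFF_DEG = 135.0  # donor-hydrogen-acceptor angle cutoff (degrees)


def angle_deg(donor: Coord, hydrogen: Coord, acceptor: Coord) -> float:
    """Angle at the hydrogen between the H->donor and H->acceptor vectors."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("hydrogen coincides with donor or acceptor")
    cosine = sum(a * b for a, b in zip(v1, v2)) / (n1 * n2)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding errors
    return math.degrees(math.acos(cosine))


def is_hydrogen_bond(donor: Coord, hydrogen: Coord, acceptor: Coord,
                     distance_cutoff: float = DISTANCE_CUTOFF_A,
                     angle_cutoff: float = ANGLE_CUTOFF_DEG) -> bool:
    """True if both the distance and the angle criteria are satisfied."""
    return (math.dist(donor, acceptor) <= distance_cutoff
            and angle_deg(donor, hydrogen, acceptor) >= angle_cutoff)
```

Counting the donor–hydrogen–acceptor triplets that pass this test in each frame yields the per-frame hydrogen-bond numbers that enter histograms of this kind.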
The number of hydrogen bonds within the nucleic acid subsystem
remains the lowest for the N/A-A system across the independent MD
trajectories, as exhibited in the histograms given in Figure S12. Furthermore, all of the systems except
for the TTD3′-G system form the same number of hydrogen bonds
within the nucleic acid subsystem over the three independent trajectories,
as demonstrated by the overlapping peaks in Figure S12. The greatest fluctuations are associated with the only
system containing dGTP rather than dATP. This observation could potentially
be related to the substrate selectivity of Pol η.

In addition,
the number of hydrogen bonds formed between the nucleic
acids, i.e., the DNA primer/template or the dNTP, and the protein
was quantified, as depicted in Figure 8. In contrast to the analysis above, the N/A-A system
maintains almost the same number of hydrogen bonds between the DNA
components and the amino acid residues as the other systems, indicating
similar binding properties for the damaged and undamaged DNA to Pol
η. Moreover, Figure 8 illustrates that approximately 50–100 hydrogen bonds
are maintained between the nucleic acids and the protein during the
entire time of the long MD trajectories. These persistent hydrogen-bonding
interactions suggest that the DNA primer/template and dNTP are bound
strongly to the protein in a relatively specific location and orientation.
This observation could be related to the role of Pol η as a
molecular splint, which was pointed out in previous experimental studies.[19] In particular, the enzyme is able to keep the
damaged DNA template straight by accommodating TTD-induced distortions
with only minor perturbations to the torsional angles of neighboring
nucleotides, allowing the newly forming strand to preserve its B-form.
If the template DNA around the lesion becomes distorted, it might
not fit into the active site of Pol η and would either be left
damaged, therefore stalling replication, or would need to be excised
by nucleotide-excision repair if it were recognized as damaged.
Figure 8
Histograms depicting
the number of hydrogen bonds formed between
the nucleic acids and the protein obtained from all three independent
trajectories for the systems TTD3′-A (A), TTD3′-G (B),
N/A-A (C), and TTD5′-A (D). The different colors represent
independent trajectories.
Hydrogen-Bonding Networks
We also investigated the
details of the specific hydrogen-bonding network that holds the incoming
free nucleotide and the TTD in place. Our analysis indicates that
every lone pair with the potential of acting as a hydrogen bond acceptor
on the dNTP is in close proximity to a possible hydrogen bond donor
in the protein. This specific positioning of the dNTP with respect
to the lesion may be related to the overall effectiveness of the nucleotidyl
addition reaction in the case of TTD lesions.

For all four systems, the hydrogen bonds between dNTP and
residues
Y52, C16, F17, R55, and F18 are present in ∼70% or more of
all saved configurations (Figure S13).
The most common hydrogen bonds, which were detected in all trajectories,
involve one of the oxygen atoms of the terminal phosphate in dNTP
participating simultaneously in two hydrogen bonds. This oxygen atom
establishes a hydrogen bond with the backbone amide of C16 in 99.8%
of all saved configurations from all systems, with an average heavy-atom
distance of 2.93 Å and an average angle of 164°. This same
phosphate oxygen also exhibits a hydrogen bond with the side chain
hydroxyl group of Y52 in 96.5% of all saved configurations with an
average heavy-atom distance of 2.60 Å and an average angle of
167°. Previously, mutations of Y52 were observed to alter the
fidelity and efficiency of Pol η.[54,55] Similarly,
the hydrogen atom attached to the backbone nitrogen of F17 and the
guanidinium hydrogens of R55 form very persistent hydrogen bonds to
triphosphate oxygens of the dNTP. Previously, the R55A mutant of Pol
η was found to be completely inactive.[54] The hydrogen bond involving the hydrogen attached to the backbone
nitrogen of F18 and the deoxyribose O3′ is also observed in
more than 75% of all saved configurations. In addition, hydrogen bonds
were observed between the oxygens of the triphosphate of dNTP and
the side chain amino hydrogens of K231, and occasionally between the
N7 of dNTP and R61. The residues involved in these hydrogen-bonding
interactions are depicted in Figure 9.
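The occupancies, distances, and angles quoted above come from a geometric test applied to every saved configuration. The following is a minimal sketch of such a test, not the analysis protocol used in this work: the cutoffs (3.5 Å heavy-atom distance, 135° donor-hydrogen-acceptor angle), the function names, and the toy coordinates are all assumptions chosen for illustration.

```python
import math

def angle_deg(donor, hydrogen, acceptor):
    """Return the donor-hydrogen-acceptor angle in degrees."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def is_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=135.0):
    """Geometric criterion; both cutoffs are illustrative assumptions."""
    heavy_atom_dist = math.dist(donor, acceptor)
    return (heavy_atom_dist <= max_dist
            and angle_deg(donor, hydrogen, acceptor) >= min_angle)

def occupancy(frames, **cutoffs):
    """Percentage of frames in which the hydrogen bond is present.

    frames: list of (donor_xyz, hydrogen_xyz, acceptor_xyz) tuples in Å.
    """
    hits = sum(is_hbond(d, h, a, **cutoffs) for d, h, a in frames)
    return 100.0 * hits / len(frames)

# Hypothetical coordinates for one donor/acceptor pair over four frames.
example_frames = [
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.9, 0.0, 0.0)),  # linear, short
    ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.5, 0.0, 0.0)),  # too long
    ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (2.9, 0.0, 0.0)),  # strongly bent
    ((0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.9, 0.0, 0.0)),  # nearly linear
]

print(occupancy(example_frames))  # 50.0
```

In practice the per-frame coordinates would be read from the trajectory files with a trajectory analysis package; the lists above only stand in for that step.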
Figure 9
Depiction of the residues involved in the most common
hydrogen-bonding
interactions between the dNTP and Pol η: C16, F17, F18, Y52, R55, and R61, which
are located in the finger domain, and K231, which is located in the
palm domain. Mg2+ ions are not shown for clarity. The dNTP
molecule is displayed in ball-and-stick representation, whereas the
protein residues are represented as sticks.
In addition to these intermolecular hydrogen bonds holding
the
free dNTP in place, the two Mg2+ ions located near the
triphosphate group may also contribute to the positioning of the dNTP.
These combined effects create an electrostatic environment in which
oppositely charged species are in close proximity and the triphosphate
is fixed where it must be for the nucleotidyl addition reaction. The
exact positioning and configuration of the dNTP are also maintained
by a persistent intramolecular hydrogen bond between a phosphate oxygen
and the O3′ hydrogen, as well as a persistent hydrogen bond
between its O5′ and the O3′ hydrogen of the 3′
terminal nucleotide on the DNA primer. The structural stability of
the dNTP is apparent from the RMSD and RMSF analyses (Figure S14 and Table S3). Although the average
RMSFs of the dNTP range only from 0.23 to 0.31 Å across the four
systems studied (Table S3), the RMSFs,
as well as the fluctuations in the RMSD, are slightly lower in the
TTD3′-A and TTD5′-A systems. These differences underscore
the importance of the Watson–Crick base-pairing interactions
that the dNTP base establishes with the TTD, where the interaction
of A with T is expected to be stronger than the interaction of G with
T.

Another hydrogen-bonding network is observed around the TTD.
The
occurrence of hydrogen bonds between the TTD and the protein is not
as high as that of hydrogen bonds between the dNTP and the protein.
The most recurrent hydrogen bond between the TTD and the protein involves
residue Q38 through its side chain amide group. The hydrogen attached
to the nitrogen of the side chain amide forms a hydrogen bond with
the O2 atom of either thymine within the TTD. Residue Q38 establishes
hydrogen bonds to the TTD in 34.4% of all saved snapshots with an
average distance of 2.98 Å and an average angle of 156°.
In some cases, it establishes two hydrogen bonds using both hydrogen
atoms attached to the side chain amide nitrogen atom and the O2 atoms
of both thymines of the TTD. This observation provides a possible
explanation for the key role of residue Q38, as it is one of the two
uniquely conserved amino acids in the entire Pol η family, and
its substitution with Ala decreases the catalytic efficiency.[19] In addition to residue Q38, the protein residues
A87, N324, Y39, R61, and R371 also form hydrogen-bonding interactions
with the TTD or the TT motif in the N/A-A system in some configurations.
The extent of hydrogen bonding around these thymines is much less
in the absence of a TTD defect than in the presence of a TTD defect.
In the absence of a TTD, the second thymine, which does not form a
base pair with the incoming dNTP, can move freely due to the lack
of steric constraints enforced by the covalent bonds constituting
the cyclobutane moiety of the TTD. The enhanced mobility of the TT
motif compared to the TTD motif is illustrated by our RMSD and RMSF
analyses (Figure S15 and Table S3). Specifically,
the average RMSF of the TT in the N/A-A system is 0.97 Å, whereas
the average RMSF of the TTD in the other three systems ranges from
0.38 to 0.59 Å. This decrease in hydrogen-bonding interactions
between the normal TT motif and the protein could contribute to the
reduced overall efficiency in nucleotidyl addition reactions for undamaged
DNA.

In addition to residue Q38, residue R61 is the second strictly
conserved amino acid in the Pol η family of enzymes.[19] It exhibits a persistent cation-π interaction
with the incoming dNTP molecule as well as less persistent hydrogen-bonding
interactions between its guanidinium hydrogens and the nitrogen or oxygen
atoms of the dNTP. Such cation-π interactions involving arginine
residues are fairly common in proteins.[56,57] In 83.2–95.8%
of all saved configurations for the four systems, the distance between
the central carbon of the guanidinium side chain and the center of
mass of the purine heavy atoms was found to be less than 6 Å,
which strongly suggests a cation-π interaction when combined
with our thorough visual analysis. In contrast to the claim in ref (19), the closed finger domain
was not observed to block residue R61 from stacking with the purine
base.[19] This finding is consistent with the
observation that residue R61 forms cation-π interactions with
the base of the dNTP molecule in the crystal structures of the yeast
Pol η-cisPt-DNA complex.[58]

Another factor to explore is the extent of hydrogen bonding between
the incoming nucleotide dNTP and the TTD. For the TTD3′-A,
TTD3′-G, and TTD5′-A systems, Watson–Crick base
pairing between one of the T residues of the TTD and the purine base
of the dNTP is observed. These hydrogen bonds are mostly conserved
throughout the trajectories for these systems. For the N/A-A system,
the Watson–Crick base pairing involves the dATP and the 3′
thymine residue opposite it, although the hydrogen bonds associated
with this base pairing disappear occasionally during the trajectories.
In this case, either the dATP forms weak interactions with the 5′
thymine or does not pair with any bases.
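The distance criterion used above for the R61 cation-π interaction (central guanidinium carbon to purine heavy-atom center of mass, less than 6 Å) can be sketched as follows. This is an illustrative outline only: the function names, the four-atom toy "ring", and its coordinates are hypothetical, and, as noted in the text, a distance test alone only suggests a cation-π interaction and was combined with visual inspection.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[k] for c, m in zip(coords, masses)) / total
        for k in range(3)
    )

def cation_pi_fraction(frames, masses, cutoff=6.0):
    """Fraction of frames with the cation carbon within `cutoff` Å of the ring.

    frames: list of (cation_carbon_xyz, ring_heavy_atom_coords) per frame.
    masses: masses of the ring heavy atoms, in the same order as the coords.
    """
    hits = 0
    for cation_carbon, ring in frames:
        if math.dist(cation_carbon, center_of_mass(ring, masses)) < cutoff:
            hits += 1
    return hits / len(frames)

# Toy ring of two carbons and two nitrogens centered on the origin
# (hypothetical geometry, not a real purine).
toy_masses = [12.011, 14.007, 12.011, 14.007]
toy_ring = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
            (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]

toy_frames = [
    ((0.0, 0.0, 4.0), toy_ring),  # 4.0 Å: within cutoff
    ((0.0, 0.0, 7.5), toy_ring),  # 7.5 Å: outside cutoff
    ((3.0, 0.0, 4.0), toy_ring),  # 5.0 Å: within cutoff
    ((0.0, 0.0, 5.9), toy_ring),  # 5.9 Å: within cutoff
]

print(cation_pi_fraction(toy_frames, toy_masses))  # 0.75
```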
Additional Observations
As a further analysis, we examined
the physical positioning of dNTP and the 3′ terminus of the
DNA primer to determine if any of the systems at hand would favor
the nucleophilic attack by the O3′ atom of the 3′-end
of the DNA primer on the α-phosphate (Pα) of
the dNTP molecule. The separation between these two atoms remains
between 2.9 and 3.6 Å throughout all of the trajectories. The
most prevalent value for this distance is 3.1–3.2 Å, as
illustrated in Figure S16. Hence, none
of the four systems exhibits an advantage or a disadvantage in terms
of the ease of the nucleophilic attack. The relative positions of
the dNTP and the 3′-DNA primer remain virtually the same in
all four systems.

Finally, we performed an analysis of the atomic
fluctuations of
the phosphorus atoms of the DNA backbone to investigate the relative
mobilities of the nucleotides in the DNA strands. Note that the 5′-terminal
nucleotides are excluded from this analysis because of their lack of
phosphate groups. In the template strand, the mobility is consistently
low in the region around the TTD, as depicted in Figures 10 and S17. In general, the nucleotides toward the strand ends are
more mobile. Overall, the residues that are closer to the active site
are less mobile, as they are embedded in a more extended hydrogen-bonding
network involving not only their phosphate groups, but also the thymine
bases themselves. Additionally, covalent bonding enforces restraints
on the motions of the thymines of the TTD, as illustrated by the typically
lower mobilities of the P1 and P2 atoms of the TTD3′-A, TTD3′-G,
and TTD5′-A systems compared to the P atoms of residues 448
and 449 in the N/A-A system (Figure S17).
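For readers less familiar with the quantity, the RMSF of an atom is the root-mean-square deviation of its position from its own time-averaged position. The sketch below shows that definition on toy data; it assumes the frames have already been superimposed on a common reference, and the function name and coordinates are hypothetical rather than taken from the analysis scripts used here.

```python
import math

def rmsf(trajectory):
    """Per-atom root-mean-square fluctuation in the units of the input.

    trajectory: list of frames, each a list of (x, y, z) tuples, one per atom.
    Assumes all frames were aligned to a common reference beforehand.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        # Time-averaged position of atom i.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames
                for k in range(3)]
        # Mean squared displacement from that average.
        msd = sum(math.dist(frame[i], mean) ** 2
                  for frame in trajectory) / n_frames
        values.append(math.sqrt(msd))
    return values

# Two toy "P atoms": the first is fixed, the second oscillates by ±1 Å in x.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]

print(rmsf(toy_trajectory))  # [0.0, 1.0]
```

A restrained atom therefore gives a value near zero, which is the sense in which the covalently linked thymines of the TTD show lower RMSFs than the free thymines of the N/A-A system.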
Figure 10
RMSFs of the P atoms in the DNA constructs in systems
TTD3′-A
(black), TTD3′-G (red), N/A-A (green), and TTD5′-A (blue)
from one of the three independent trajectories. The primer strands
are represented as solid lines and the template strands are represented
as dashed lines. P atoms 447 and 448 represent the P atoms of the
TTD in the TTD3′-A and TTD3′-G systems, while P atoms
448 and 449 constitute the TTD in the TTD5′-A and the normal
TT in the N/A-A systems. The analogous data for all trajectories are
depicted in Figure S17.
Conclusions
In this paper, we used classical MD to
explore the structure and
dynamics of four different systems containing the catalytic domain
of the enzyme Pol η, a DNA primer/template bound to the enzyme,
and a free dNTP molecule in its active site. The bound DNA has a TTD
in only three of the four systems to assess the effects of damaged
versus undamaged DNA, while the dNTP is either dATP or dGTP to determine
the impact of the purine base identity. We specifically sought a molecular
level explanation for the low fidelity and processivity of Pol η
in the absence of a TTD in the bound template DNA strand. Previously,
in vitro studies observed both of these properties to increase significantly
if the enzyme was acting upon DNA templates with cyclobutane pyrimidine
dimers, especially TTDs.[28−31] An objective of this work was to determine if this
improvement could arise from structural and dynamical differences
in Pol η depending on the presence or absence of a TTD.

We generated a total of 3 μs of classical MD data on each
of the four Pol η-DNA-dNTP systems through three independent
1 μs trajectories. Analyses of the RMSD, RMSF, active site surface
area, radius of gyration, and cross-correlation maps for the enzyme,
in conjunction with analyses of nucleic acid flexibility, major/minor
groove width, nucleophilic attack distance for the nucleotidyl addition
reaction, RMSF, cation-π interaction, and hydrogen-bonding networks
for the bound DNA and dNTP, identified only minor differences in the
overall structures and equilibrium dynamics among the four systems
studied. A notable exception to this similarity among the four systems
is the hydrogen-bonding patterns around the TTD and between the dNTP
and the TTD or, for one system, the two consecutive, normal thymines
at the same location as the TTD. The Y52, C16, F17, R55, and F18 residues
were found to form the most persistent hydrogen bonds with the dNTP,
regardless of the purine base identity. On the other hand, the TTD
was found to participate in a higher number of hydrogen bonds with
the enzyme than its healthy counterpart of two normal thymines, thereby
potentially leading to a stronger binding interaction between the
enzyme and the TTD. The Q38 residue established the most persistent
hydrogen bond with the TTD, underscoring its importance as one of
the two strictly conserved residues in the Pol η family. Moreover,
the second strictly conserved R61 residue was observed to form persistent
cation-π interactions with the purine base of the dNTP, also
providing an indication of its key role.

Our findings are consistent
with the low fidelity of Pol η
with respect to the base of the incoming dNTP molecule[28−31] because no significant differences were detected between the structures
and dynamics of the systems containing dATP versus dGTP. The structural
and dynamical similarities suggest that the enzyme may not be able
to easily differentiate between dATP and dGTP, therefore potentially
leading to replication errors. Furthermore, the observed differences
in hydrogen-bonding interactions between the DNA primer/template and
the enzyme in the absence or presence of the TTD could explain the
elevated overall efficiency of the enzyme in the presence of a TTD
defect.[19,29] In particular, the covalently bonded thymines
comprising a cyclobutane dimer were found to be held more tightly
in place than their normal counterparts through a more extensive hydrogen-bonding
network. Pol η appears to be designed to hold the TTD defect
region more rigidly than most nucleotides bound to its DNA-binding
interface, although its base pair partner, the dNTP molecule, is not
covalently bonded to the upstream DNA primer yet. This more rigid
and specific orientation could contribute to the elevated bypass efficiency
of the enzyme in the presence of a TTD lesion.

The results from
these MD trajectories provided the groundwork
for another study focusing on the relative binding free energies of
dATP and dGTP to the enzyme-TTD or enzyme-TT complex.[59] The relative binding free energies for these systems were
explained through differences in hydrogen-bonding interactions observed
during the microsecond MD trajectories. The conclusion from the present
study that the overall structure and dynamics of the protein are similar
for the TTD versus the normal TT and the dATP versus dGTP systems
provides support for explaining the differences in binding free energies
in terms of local interactions in the binding regions. In particular,
the more persistent hydrogen-bonding interactions between the enzyme
and the TTD provide a molecular-level explanation for the greater
binding free energy of dATP to the TTD-containing DNA than to the
undamaged DNA. Analysis of the hydrogen-bonding interactions between
the DNA and the incoming base provided additional insights into the
greater binding free energy of dATP versus dGTP to the enzyme–DNA
complex. These types of comparative studies are enhancing our understanding
of the molecular basis for the fidelity and overall efficiency of
this biomedically important enzyme. A molecular-level understanding
of this system could assist in the design of inhibitors to improve
the effectiveness of cancer chemotherapy treatments for skin cancer
in the future.[33,34,36,58]