Wei Yang. Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States.
Abstract
Y-Family DNA polymerases specialize in translesion synthesis, bypassing damaged bases that would otherwise block the normal progression of replication forks. Y-Family polymerases have unique structural features that allow them to bind damaged DNA and use a modified template base to direct nucleotide incorporation. Each Y-Family polymerase is unique and has different preferences for lesions to bypass and for dNTPs to incorporate. Y-Family polymerases are also characterized by a low catalytic efficiency, a low processivity, and a low fidelity on normal DNA. Recruitment of these specialized polymerases to replication forks is therefore regulated. The catalytic center of the Y-Family polymerases is highly conserved and homologous to that of high-fidelity and high-processivity DNA replicases. In this review, structural differences between Y-Family and A- and B-Family polymerases are compared and correlated with their functional differences. A time-resolved X-ray crystallographic study of the DNA synthesis reaction catalyzed by the Y-Family DNA polymerase human polymerase η revealed transient elements that led to the nucleotidyl-transfer reaction.
Replication
of genomic DNA is
a prerequisite for cell proliferation. High-fidelity DNA polymerases,
regardless of their diversity in amino acid sequence, three-dimensional
structure, origin, or complexity of subunit composition, are uniformly
accurate and processive, such that up to billions of Watson–Crick
(WC) base pairs are copied rapidly with high fidelity.[1−3] However, these remarkable high-speed and high-accuracy machines
have an Achilles heel that is the intolerance of any alteration in
the four normal bases, A, G, T, and C. The presence of a chemically
modified DNA base or a DNA strand break could cause stalling or collapse
of a replication fork. Because genomic DNA is frequently damaged by
environmental toxins, radiation, and endogenous and exogenous metabolites,[4] every living organism relies on two general pathways
to cope with “road blocks” in DNA replication.[5,6] One is to use specialized translesion DNA polymerases to accommodate
DNA base lesions and continue DNA synthesis past damaged sites.[7−9] Alternatively, a detour by the homologous recombination pathway
can avoid troublesome lesion sites and allow replication to proceed.[10,11]

DNA base lesions cause structural and chemical changes at
the site
of damage and inevitably hinder normal base pairing and base stacking
in the vicinity, hence leading to general instability and distortion
of the DNA double helix.[12−14] As a result, it is suggested
that two different specialized polymerases are required to bypass
a lesion in translesion DNA synthesis (TLS).[15] The first polymerase accommodates the damaged base and performs
nucleotide incorporation directly opposite the lesion; the second
polymerase extends the DNA primer beyond the lesion for several base
pairs before the damaged DNA can be safely handled by a replicase
(defined as a high-fidelity and high-processivity polymerase dedicated
to the replication of normal DNA genomes). Y-Family polymerases are
generally employed in the first step of TLS. For the second step of
TLS, Escherichia coli DNA pol II and eukaryotic pol
ζ, both of which belong to the B-Family, have been shown to
perform primer extension after a lesion or a mismatched base pair.[16−18]

Because DNA lesions are varied in shape and chemical nature,
a
large fraction of DNA polymerases in each organism are specialized
for repair and translesion synthesis, while replicases that copy genomic
DNA of uniform WC base pairs consist of a relatively small fraction
of polymerases in the genome. In E. coli, only one
of the five total DNA polymerases is the high-fidelity and high-processivity
replicase (pol III),[19] and the most abundant
pol I has high fidelity, but low processivity, and performs short
patch synthesis in DNA repair.[20] The remaining
three, the Y-Family pol IV and pol V and the B-Family pol II, are
involved in translesion and mutagenic DNA synthesis.[3,21−23] In humans, only five of 17 DNA polymerases are involved
in normal genomic and mitochondrial DNA replication (pol α,
δ, ε, and γ and telomerase). Of the remaining 12,
four are Y-Family TLS polymerases [pol η, ι, κ,
and Rev1 (see details below)], and eight others (pol β, λ,
and μ, TdT, pol ν, θ, and ζ, and PrimPol)
are required for TLS and repair of DNA breaks.[24−34]

Individual Y-Family polymerases have different template base-binding
sites and different preferences for the incoming nucleotide.[8,35] The general properties of Y-Family DNA polymerases are (1) the ability
to conduct translesion synthesis with good accuracy, (2) a high error
rate when copying normal DNA, (3) a lack of the 3′–5′
exonuclease activity and intrinsic proofreading ability,[36] (4) a low catalytic efficiency, and (5) a low
processivity compared to that of DNA replicases.[9] The last two seemingly undesirable features ensure that
they do not associate with or stay on a replication fork for more
than several base pairs, which is thus advantageous in the scheme
of DNA replication. The Y-Family polymerases are recruited to replication
forks only when necessary. Regulation of their recruitment will be
summarized in this review. For more comprehensive treatments, readers
are referred to two recent reviews.[9,28] The focus
of this review will be on the catalytic properties of Y-Family polymerases
and comparisons with those of replicases.
Y-Family Polymerases
Y-Family polymerases are divided into six major groups based on
amino acid sequence. They are represented by E. coli pol IV (known as DinB) and pol V (of which UmuC is the catalytic
subunit) and four human enzymes, pol η, ι, κ, and
Rev1[7,9] (Figure 1A). All
Y-Family polymerases comprise two functional parts, the polymerase
catalytic region of 350–500 residues and a regulatory region
from 10 (as in DinB, Dbh, and Dpo4) to 600 residues (as in Rev1).
Figure 1
Diagram
of domain structures and protein interactions of Y-Family
polymerases. (A) Catalytic and regulatory regions of six subgroups
of Y-Family polymerases. The catalytic core contains finger, palm,
and thumb subdomains. LF denotes the little finger domain. PIP stands
for the PCNA interaction peptide, RIR for the Rev1 interaction region,
NLS for the nuclear localization signal, Ub for ubiquitin, UBZ for
the Ub-binding zinc finger, UBM for the Ub-binding module, and BRCT
for the BRCA1 C-terminal domain. (B) Schematic diagram of ubiquitinated
PCNA interacting with all Y-Family polymerases and Rev1 interacting
with pol η, ι, and κ.

Among these six groups, bacterial DinB and its archaeal homologues,
Dpo4 and Dbh, are related to eukaryotic pol κ by sharing a structural
gap in the catalytic region (Figure 2) and
their tendency to make deletion mutations and their ability to bypass bulky
DNA adducts[8,37−39] (see details
below). Initially, pol η and ι were considered close homologues,[40] but the two enzymes turn out to share little
functional similarity.[9,41] Inactivation of POLH (the gene
encoding pol η) in humans leads to extreme UV sensitivity and
predisposition to skin cancers, and the genetic disease is known as
the variant form of xeroderma pigmentosum (XPV);[42,43] however, the function of pol ι remains uncertain as Poli–/– and Poli+/+ mice are alike in all aspects of growth.[44]
Figure 2
Varied interactions between the catalytic core (CC) and
little
finger domain (LF) of Y-Family polymerases. All structures shown here
are of ternary complexes with DNA and dNTP. The CCs are superimposed,
and the DNA and dNTP have been removed for clarity. (A) The relative
orientation of CC (blue) and LF (purple) differs in archaeal Dbh [Protein
Data Bank (PDB) entry 3BQ1] and Dpo4 (PDB entry 2AGQ), resulting in a small gap between the CC
and LF in Dpo4 and a large opening in Dbh. (B) The gap between the
CC and LF is large in pol κ (PDB entry 2OH2) and Rev1 (PDB entry 2AQ4) and small to nonexistent
in pol η (PDB entry 4ED8) and ι (PDB entry 3GV8). For the integrity of the catalytic
region, pol κ uses its N-terminal extension (N-clasp, colored
yellow) to bridge the CC and LF (in the back for this view), and Rev1
uses the N-terminal extension to fill the gap (crossing from the back
to the front).

As the tenets of evolution
would predict, the substrate specificity
and prevalence of Y-Family polymerases are highly correlated with
the abundance of naturally occurring DNA lesions. Ultraviolet radiation
(UV) is an intrinsic part of sunlight, and UV-induced cis-syn pyrimidine dimers (cyclobutane pyrimidine dimers, or CPD) and 6-4
photoproducts are the oldest and most prevalent DNA lesions on earth.[45] Accordingly, UmuC, ubiquitous in bacteria, and
pol η, found in all eukaryotes, primarily bypass UV-induced
CPD lesions.[42,43,46] Because 6-4 photoproducts are efficiently removed by nucleotide
excision repair, they are rarely encountered in the S phase. In addition,
Rev1, which was originally identified and isolated because of its
UV-induced expression and UV sensitivity in its absence,[47] is present universally among eukaryotes. Rev1
is a template-independent deoxycytidyl transferase and has more than
600 residues outside of the catalytic region (Figure 1).[48] The noncatalytic regions of
Rev1 appear to regulate other Y-Family polymerases and contribute
to TLS more than its catalytic activity.[49−52]

Polycyclic aromatic hydrocarbons
(PAHs) are environmental pollutants
produced by incomplete combustion of fossil fuels and biomass, and
they are also present in tobacco smoke and the human diet.[53] PAHs such as benzo[a]pyrene, one of the earliest
identified carcinogens, can form DNA base adducts.[54] Dpo4 and eukaryotic pol κ in particular
are specialized to bypass these bulky DNA adducts.[37,55−61] Their close relative E. coli pol IV (DinB) bypasses
N2-furfuryl-dG adducts efficiently.[62] Interestingly,
the absence of pol κ in cell cultures also causes mild UV sensitivity,
and pol κ has been shown to participate in excision repair of
UV lesions.[63,64] In addition, 7,8-dihydro-8-oxyguanine
(8-oxo-G) is a major oxidative lesion and promutagenic. Replicative
DNA polymerases tend to incorporate dATP opposite an 8-oxo-G, resulting
in a G → T transversion mutation.[65] Dpo4, pol η and ι, and presumably Rev1 are able to bypass
8-oxo-G with correct dCTP incorporation[66−71] (Table 1).
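
The origin of the G → T transversion can be traced with a small sketch. The Python below is illustrative only: the function name `copy_strand`, the four-base example sequence, and the simplification that copies are paired position by position (strand polarity is ignored) are assumptions made for this illustration and are not taken from the cited studies. It follows a template through two rounds of copying, once with dATP inserted opposite the lesion and once with dCTP.

```python
# Illustrative sketch: how dATP inserted opposite 8-oxo-G becomes a G -> T change.
# Strand polarity is ignored; position i of a copy pairs with position i of its template.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def copy_strand(template, lesion_sites=(), opposite_lesion="C"):
    """Return the bases inserted opposite each template position.

    lesion_sites: indices of template G's assumed to be oxidized to 8-oxo-G.
    opposite_lesion: base inserted opposite the lesion ("A" models the
    replicase error, "C" models accurate bypass).
    """
    return "".join(
        opposite_lesion if i in lesion_sites else COMPLEMENT[base]
        for i, base in enumerate(template)
    )


original = "ACGT"   # hypothetical template; the G at index 2 carries the lesion
lesion = {2}

# Round 1: synthesis across the lesion. Round 2: the new strand serves as a template.
error_prone = copy_strand(copy_strand(original, lesion, opposite_lesion="A"))
accurate = copy_strand(copy_strand(original, lesion, opposite_lesion="C"))

print(original, "->", error_prone)  # the G at index 2 is replaced by T
print(original, "->", accurate)     # the sequence is preserved
```

In this toy model the only difference between the two outcomes is the nucleotide chosen opposite the lesion in the first round, which is the step at which a translesion polymerase that inserts dCTP avoids the mutation.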
There are additional structures
in the PDB of Dpo4 complexed with DNA lesions beyond those listed
here.
Regulatory Regions of Y-Family Polymerases
Regulatory regions of Y-Family polymerases often
contain a PCNA
interacting peptide (PIP), a Rev1 interacting region (RIR), a ubiquitin-binding
module (UBM), and a ubiquitin-binding zinc domain (UBZ)[8,9] (Figure 1). Rev1 is unique and has an N-terminal
BRCT domain and a C-terminal protein-binding domain (Figure 1). Structural studies of these regulatory modules
are mostly conducted by NMR as summarized in Table 1 and shown in Figure 3. Every Y-Family
polymerase interacts with the replication processivity cofactor, β-sliding
clamp in bacteria and PCNA in archaea and eukaryotes, via its PIP
or BRCT domain (Rev1).[72,73] When DNA is damaged, PCNA becomes
ubiquitinated,[74,75] and each eukaryotic Y-Family
polymerase uses UBM or UBZ to interact with ubiquitin. In addition,
human Y-Family polymerases η, ι, and κ each have
an RIR that binds the C-terminal domain of Rev1, which in turn interacts
with Rev3 and Rev7 subunits of pol ζ.[51,76−78] Ubiquitinated PCNA and Rev1 appear to be the hubs
of the TLS network.
Figure 3
Structural domains in human pol η. (A) Diagram of
the linear
arrangement of functional domains in human pol η. Domains are
color-coded. (B) Crystal structure of the catalytic region (amino
acids 1–432) in a complex with DNA and dNTP (PDB entry 3MR2). (C) Structural
model of the quaternary complex of pol η RIR complexed with
the Rev1 CTD, Rev3, and Rev7. The model is a composite of RIR (η)–Rev1
(PDB entry 2LSK) and RIR(κ)–Rev1–Rev3–Rev7 complexes
(PDB entry 4GK5). (D) NMR structure of the UBZ domain (PDB entry 2I5O) with the residues
interacting with Ub shown as sticks. The Zn2+ ion is shown
as a green sphere. (E) Crystal structure of human PCNA complexed with
the pol η PIP (PDB entry 2ZVK). All parts of human pol η are
shown as ribbon diagrams, and their interacting partners are shown
as a molecular surface. The α-helices of pol η are shown
as cylinders in panels B and E.

Human pol η, whose inactivation directly leads to cancer,[79−82] is among the most scrutinized. In addition to the regulatory elements
listed above, pol η itself can be ubiquitinated on a Lys residue
within the nuclear localization signal (NLS) segment (Figure 3A), and this modification prevents the nearby PIP
from interacting with PCNA[83] (Figure 1B). Pol η is also phosphorylated by ATR kinase
upon UV irradiation (Figure 3A), which allows
its TLS activity and checkpoint response to UV damage.[84]
A Preformed Active Site of Y-Family Polymerases
The Y-Family polymerase catalytic region is composed of a conserved
catalytic core, which includes the finger, palm, and thumb subdomains,
and an appendage of the little finger domain.[8] The palm, finger, and thumb subdomains are found in all DNA polymerases.
The “little finger” domain (LF), also known as the polymerase-associated
domain (PAD),[85] is unique to the Y-Family
and has more sequence variability than the catalytic core.[8,86] Three-dimensional structures of the catalytic region of a number
of Y-Family polymerases from bacteria and archaea to humans have been
reported (Table 1 and references therein).
DNA substrate is bound between the thumb and LF. The palm subdomain
contains the catalytic carboxylates that coordinate two Mg2+ ions, and the finger subdomain interacts with the template base
and the incoming nucleotide (Figure 3).

In DNA polymerases other than the Y-Family members, the finger
subdomain undergoes a large conformational change from an open state
represented by apoproteins to a closed state adopted by the catalysis-ready
enzyme–DNA–dNTP ternary complexes[87−92] (Figure 4A,B). In contrast, the catalytic
core of Y-Family polymerases contains a preformed active site with
the finger closed in the absence of DNA, dNTP substrate, or both[85,93−95] (Figure 4C,D). Some Y-Family
members can bind an incoming dNTP even in the absence of a base pair
partner (templating base) and base stacking with the primer end.[96]
Figure 4
Different conformational changes in replicases and Y-Family
polymerases.
(A and B) Open (PDB entry 4BDP) and closed (PDB entry 3THV) structures, respectively, of Bacillus DNA polymerase I. The structure of the apo
polymerase[135] is identical to the structure
of a polymerase–DNA binary complex, and both have an open conformation.
The “O” helix (colored deep purple) and the two surrounding
helices (colored pink) undergo a “closing” motion upon
binding of a correct incoming dNTP and metal ion (shown as a green
sphere). (C and D) Structures of Dpo4 in apo (PDB entry 2RDI) and DNA-bound (PDB
entry 2AGQ)
forms, respectively. The little finger domain (LF) rotates nearly
130° between the two forms.

Correlated with being preformed, the active site of Y-Family
members
is large and solvent-exposed[8,67,97] (Figure 5). Thus, in a complete cycle of
primer extension, from diffusion-in of dNTP and Mg2+ ions
and the nucleotidyl-transfer reaction to pyrophosphate release and
DNA translocation, no significant domain movement is needed or observed.[98−100] In contrast, the closed catalytic center of a replicative polymerase
can accommodate only a WC base pair between a template base and an
incoming dNTP, and the DNA and dNTP are constrained to the correct alignment
for the nucleotidyl-transfer reaction.[88,101] The finger
subdomain of a replicase has to open to release reaction product pyrophosphate
and allow DNA to translocate and also a dNTP to bind for the next
round of DNA synthesis (Figure 4A,B). The perfect
alignment of a normal DNA and a correct dNTP in the active site of
a replicative polymerase underlies its high fidelity and high catalytic
efficiency. In contrast, the loose fit of substrates in Y-Family polymerases
leads to a low fidelity, a low catalytic efficiency, and a low processivity
of DNA synthesis.
Figure 5
Comparison
of the active site of a replicase and pol η. Close-up
views of the ternary complexes of (A) T7 DNA polymerase (PDB entry 1T7P) and (B) pol η
(PDB entry 3MR3). The finger (blue), palm (red), thumb (green), and LF (purple)
domains are shown as ribbon diagrams superimposed with semitransparent
molecular surfaces. DNA is shown as yellow (primer) and orange (template)
tube and ladders; incoming dNTPs are shown as white sticks, and metal
ions are shown as cyan spheres. The template base and catalytic metal
ions in the T7 complex are barely visible, while the cis-syn thymine dimer (shown as red sticks), incoming nucleotide, and both
active site metal ions in the pol η complex are solvent-exposed.
Large Conformational Changes of Dpo4, Dbh, and Pol κ
Substrate-dependent large conformational
changes do occur in some
Y-Family members. The catalytic core (CC) and little finger (LF) are
connected by a flexible linker in all Y-Family polymerases. In the
DinB subfamily, including Dpo4, Dbh, and pol κ, the CC and LF
freely move relative to each other in the absence of DNA.[86,93,100,102] Large conformational changes are induced upon DNA binding.[86,100,103] Because of limited interactions
between the CC and LF in Dpo4, Dbh, and pol κ, a large structural
gap divides the CC and LF of these polymerases even in the ternary
complexes with DNA and dNTP[8,97] (Figure 2). Pol κ depends on its N-terminal extension (N-clasp)
to hold the CC and LF together[102,103] (Figures 1A and 2). The function of
the structural gap in Dpo4, Dbh, and pol κ appears to be to accommodate
minor groove bulky PAH adducts for lesion bypass[37,104] or nucleotides looped out of a DNA template strand leading to a
deletion frameshift.[38,39,105] Similarly, a large gap between the CC and LF of Rev1 is filled by
the N-terminal extension from the catalytic core[106,107] (Figures 1A and 2).
In human Rev1, the finger and thumb are connected by a long loop insertion,
and the CC is ring-shaped and encircles the DNA substrate.[107]

In pol η, however, there is no
structural gap, and extensive
interactions between the LF and the catalytic core stabilize the entire
polymerase region (Figure 2). As a result,
the apo pol η and DNA-bound complex structures are superimposable.[67,85,97] The interactions between the
CC and LF of pol η form a “molecular splint” to
hold the template strand in the B-form even in the presence of a UV
cross-linked cis-syn pyrimidine dimer.[97] With this “molecular splint”,
pol η is able to insert dAMPs across a cis-syn thymine dimer and also extend the primer for two more nucleotides,[97,108] so that the resulting TLS product can be utilized by a replicase
to continue DNA synthesis.

The rate of DNA binding has not been
reported for most of the Y-Family
polymerases.[35] The poor affinity of Dbh
for DNA[109] perhaps is due to its unique
linker between the CC and LF[86] rather than
the large conformational changes per se. It will be interesting to
study whether the lack of CC–LF interaction in Dpo4 and pol
κ slows DNA substrate binding compared to that of pol η,
in which CC and LF do not undergo such conformational changes.
Substrate Selection and Lesion Bypass Specificity
Since the structures
of Dbh, Dpo4, and yeast pol η were determined
12 years ago, the unusually large and solvent-accessible active sites
of Y-Family polymerases have been thought to underlie promiscuous
lesion and dNTP selection. This notion has been supported by Dpo4’s
ability to bind and bypass many lesions (Table 1). Despite a generally enlarged replicating base pair-binding site,
each Y-Family polymerase differs significantly in the size and shape
of template base-binding site and has a different lesion preference.
Pol η accommodates and accurately bypasses cis-syn pyrimidine dimers and also a 1,2-intrastrand d(GpG)–cisplatin
cross-link[67,97,98] (Figure 5), but it is blocked by 6-4 photoproducts,
BPDE–dG, and other bulky adducts.[110] Neither pol κ nor pol ι is capable of bypassing the
3′ base of a cross-linked base dimer. Likewise, only pol κ,
and not pol η or pol ι, can accommodate the minor groove
BPDE–dG adduct and accurately incorporate dCTP opposite the
lesion.[55−57,59,60] To achieve this feat, pol κ depends on the flexibility between
the catalytic core and little finger domain, the large structural
gap, and its unique N-terminal extension (N-clasp).[103,104]

A couple of Y-Family polymerases have a specific preference
for
the incorporation of incorrect nucleotides, and the resulting mutations
are sequence-specific. Pol η depends on two uniquely conserved
residues (Arg61 and Gln38 in humans) for the accurate bypass of cis-syn pyrimidine dimers.[97,111] However,
these two residues also play a key role in the efficient misincorporation
of dGTP opposite a dT, particularly when the mismatched dT:dGTP replicating
base pair is immediately preceded by an AT base pair,[99] in the DNA sequence called the WA motif. The misincorporation
of dGTP by human pol η at the WA motif in undamaged DNA contributes
to somatic hypermutation in the normal development process of the
adaptive immune system.[112−114]

Rev1 uses an arginine
side chain to form two hydrogen bonds with
the base of a dCTP and to serve as a template for dCTP incorporation,
while a DNA template base is flipped out of the active site.[106,107] The biological consequence of poly-dC synthesis remains unclear.
In parallel, pol ι has a narrow binding site for the replicating
base pair and thus favors the Hoogsteen base pair.[115,116] It is well-known that pol ι prefers to incorporate dGTP opposite
template dT instead of the correct dATP.[40,117] However, crystal structures of pol ι show that an incoming
dGTP and templating dT form neither a wobble nor a Hoogsteen base
pair,[116,118] and the molecular mechanism of the preference
of dGTP misincorporation over correct dATP incorporation remains unclear.
Perhaps because of the lack of pol ι recruitment in the immune
cells, Poli–/– mice do not
exhibit altered somatic hypermutation as do Polη–/– mice.
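
To make the WA motif mentioned above concrete, the following minimal sketch scans a sequence for candidate sites. It assumes that W is the IUPAC code for A or T and that the motif is read 5′→3′ on the strand being synthesized, so that the A of each WA dinucleotide marks the position where dGTP could be misincorporated opposite a template dT. The function name `wa_hotspots` and the example sequence are hypothetical, and the sketch says nothing about how often such sites are actually mutated.

```python
def wa_hotspots(seq):
    """Return 0-based indices of the A in every WA dinucleotide (W = A or T).

    The sequence is assumed to be written 5'->3' for the strand being
    synthesized; each returned index marks an A that follows an A or a T.
    """
    seq = seq.upper()
    return [
        i + 1
        for i in range(len(seq) - 1)
        if seq[i] in "AT" and seq[i + 1] == "A"
    ]


# Hypothetical example: "TA" starts at index 1 and "AA" starts at index 4.
print(wa_hotspots("GTACAAG"))  # [2, 5]
```

Overlapping motifs are reported independently, so a run such as AAA yields two adjacent candidate positions.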
Nucleotidyl-Transfer Reaction Catalyzed by DNA Polymerases
All DNA
polymerases, regardless of the structure
and fidelity differences, catalyze the same nucleotidyl-transfer reaction,
that is the formation of a new phosphodiester bond between the 3′-OH
of a primer strand and the α-phosphate of an incoming dNTP,
and concomitantly break the phosphodiester bond between the α-
and β-phosphates of dNTP (Figure 6A).
The reaction requires two Mg2+ ions in addition to a DNA
template and primer and a dNTP,[8,24,119] and polymerases lower the energy barrier for product formation.
All DNA polymerases contain two to three conserved carboxylates in
the catalytic center. The Mg2+ ions neutralize the catalytic
carboxylates and triphosphates of dNTP and facilitate the alignment
of the substrates for the chemical reaction. The reaction is pH-dependent
and analogous to acid–base catalysis, in which the nucleophile
(3′-OH) needs to be deprotonated and the leaving group (pyrophosphate)
needs to be protonated.[120] However, different
from the classic acid–base catalysis in which conserved protein
side chains serve as the general base and acid to extract and donate
protons, DNA and RNA syntheses require two Mg2+ ions for
the catalysis.[121,122]
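
The overall reaction described above can be summarized schematically as follows. The subscript n, denoting the number of nucleotides in the primer strand, is notation introduced here for illustration; PP_i stands for the pyrophosphate leaving group.

```latex
\[
\mathrm{DNA}_{n}\text{--}3'\text{-OH} \;+\; \mathrm{dNTP}
\;\xrightarrow{\ \text{polymerase},\ 2\,\mathrm{Mg^{2+}}\ }\;
\mathrm{DNA}_{n+1}\text{--}3'\text{-OH} \;+\; \mathrm{PP_i}
\]
```

The scheme shows only the net change: the primer gains one nucleotide and a new 3′-OH, while the bond between the α- and β-phosphates of the dNTP is broken.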
Figure 6
Nucleotidyl-transfer reaction. (A) Diagram
of the reaction catalyzed
by DNA polymerases. (B) Time-lapse recording of the reaction catalyzed
by human pol η. At time zero, a Ca2+ ion occupied
the B site, and the 3′ end of the primer (circled) is not aligned
with the α-phosphate of dNTP. At 40 s (40 s after the addition
of Mg2+), both metal ion-binding sites A and B become occupied
by Mg2+ ions, and the 3′-OH and α-phosphate
are aligned. However, no chemical reaction is detected. At 80 s, a
transient water appears (indicated by a gray arrowhead), and the new
bond starts to form as indicated by the black double arrow. At 230
s, there are more products than substrate, and the third Mg2+ ion partially occupies the C site. (C) Composite of mixed substrate
(colored yellow) and product (colored blue) in the middle of the reaction
time course.

Chemical reactions are
heterogeneous because enzyme molecules are
usually not synchronized. In addition, large conformational changes
of many polymerases associated with each catalytic cycle easily obscure
the small changes of chemical bond formation. Therefore, structural
analyses of DNA polymerases were limited to static states of a homogeneous
population. Crystal structures of a variety of DNA polymerases in
complex with DNA and dNTP substrate have been reported,[24−26,89,92,119,123−125] where the nucleotidyl-transfer reaction was prevented by a nonreactive
substrate, an inactivated polymerase, or the replacement of Mg2+ with Ca2+ because Ca2+ supports the
formation of the enzyme–DNA–dNTP ternary complex but
not the chemical reaction.[122] Crystal structures
of polymerase–DNA product complexes are also available. However,
how the actual nucleotidyl-transfer reaction takes place is still
unclear.
Time-Resolved Catalysis of DNA Synthesis by Pol η
In the course of our structural studies of pol η by X-ray
crystallography, we noted that both substrate and product complexes
of pol η coexisted in the same crystal lattice. Taking advantage
of the absence of large conformational changes and slow reaction rates,
we set out to record the course of the nucleotidyl-transfer reaction
catalyzed by pol η using time-resolved X-ray crystallography.
To achieve this goal, it is necessary to turn off and turn on a chemical
reaction at will. After our failed attempt to make a caged dNTP that
can be activated by laser light,[126] we
succeeded in growing crystals of native pol η with normal DNA
and dNTP at pH 6.0 in which, with one Ca2+ per complex, no DNA
synthesis took place.[127] These crystals
were soaked in stabilization buffers at pH 6.8–7.2
and continued to diffract X-rays to better than 2.0 Å resolution.
Addition of 1 mM Mg2+ in the soaking buffer at pH 7.0 initiated
the nucleotidyl-transfer reaction in crystallo. After
Mg2+ exposure, at roughly 40 s intervals crystals were
flash-cooled in liquid nitrogen to stop the chemical reaction, and
the frozen samples at different reaction stages were analyzed by X-ray
crystallography.

We found that binding of two Mg2+ ions was necessary
to align the 3′-OH of the primer strand and the α-phosphate
of dNTP (Figure 6B). With a single Ca2+ occupying the B metal-binding site, the 3′-OH together with
the deoxyribose at the primer end was oriented away from the dNTP.
After two Mg2+ ions had bound (at 40 s), the fully occupied
A-site Mg2+ coordinated by the 3′-OH and the α-phosphate
of dNTP led to the alignment of the reactants and tightening of the
catalytic center. Interestingly, there was a delay between the presence
of the well-aligned reactants and the formation of the new phosphodiester
bond. The delay may correspond to a rate-limiting step observed in
solution.[128]

Because the chemical reaction in the trillions of polymerase molecules
in each crystal took place stochastically and at different rates,
when the new bond became detectable after Mg2+ exposure
for 80 s, every crystal contained a mixture of two reaction species, the
substrate and product (Figure 6B). With a diffraction
resolution of 1.5–2.0 Å, the structures and proportions
(occupancies) of the two reaction states of each crystal were readily
refined. As the reaction progressed, the level of substrates decreased
and the products accumulated. At 200 s, the crystals contained roughly
equal amounts of the substrate and product immediately before and
after the nucleotidyl-transfer reaction, and a composite of the two
resembles the hypothesized pentacovalent phosphoryl-transfer reaction
intermediate (Figure 6C). After 250 s, the
two species seemed to reach equilibrium, with 70% being the product.

Our time-resolved study has revealed two transient elements in
the chemical reaction. A water molecule, which formed a hydrogen bond
with the 3′-OH, appeared at 80 s when the new bond started
to form and disappeared at 200 s before the product peaked. This transient
water was suspected to deprotonate the 3′-OH, but after substantial
investigation, we found that this transient water is not the nominal
general base because it can be replaced by a 2′-OH at the primer
end without reducing the reaction rate.[127] Instead, a native dNTP may directly contribute to deprotonation
of the 3′-OH because in the ternary complex of a nonreactive
incoming nucleotide (dNMPNPP) a similar water molecule is stably bound
to the 3′-OH without a chemical consequence.[97] The largest positional change in the transition from substrate
to product occurred at the α-phosphorus of the incoming nucleotide,
which moved 1.4 Å, and not the nucleophile 3′-O atom.
It appears that the 3′-OH and dNTP held together by the two
Mg2+ ions in the catalytic center may stochastically approach
each other to be within such a critically short distance that the
pKa of the 3′-OH is reduced for
the proton to leave readily. The transient water may be simply recruited
to carry the proton away. In the absence of this particular water,
such as in the case of the ribonucleotide substitution at the primer
end, the proton may leave via 2′-OH to another water molecule.
An alternative path of deprotonation of the 3′-OH has been
proposed in dynamic simulation studies of three other DNA polymerases,[60,129,130] in which the proton leaves via
the water coordinated by metal ion A to the end with the γ-phosphate
of dNTP.

Among the two-Mg2+-ion-dependent DNA polymerases, a
conserved general base is absent, and there is likely no fixed path
for the proton to depart. Instead, the two Mg2+ ions coordinated
by the catalytic carboxylates perfectly align the primer end of DNA
and dNTP, and the resulting electrostatic environment, including the
contribution by the dNTP itself, likely reduces the pKa of the 3′-OH for the nucleophilic attack to take
place.

The second transient element observed in the in crystallo reaction catalyzed by pol η is a third
Mg2+. The
third Mg2+, which replaced the conserved Arg (Arg61) that
neutralizes the dNTP, started to appear after 140 s when the products
began to accumulate and its occupancy increased with an increase in
the level of products (Figure 6B). Obviously,
the +2 charged Mg2+ is more effective than the +1 charged
Arg in neutralizing the locally accumulated negative charge in the
transition states. This Mg2+ is coordinated by the reaction
products, namely the primer and pyrophosphate
(Figure 6C). The nature of this transient divalent
cation has been verified by using Mn2+ instead of Mg2+ in the reaction buffer. The third metal ion has also been
observed in the time-resolved study of the X-Family DNA pol β.[131] In a related study of bacteriophage N4 RNA
polymerase (A-Family), the third metal ion is not observed,[132] and its absence could be due to insufficient
spatial resolution (2.0 Å instead of the 1.5–1.7 Å achieved in the studies
of pol η and β) and temporal resolution, as a reaction
intermediate state was not recorded. In addition to neutralizing the
excess negative charge of the reaction intermediates and stabilizing
products, the third Mg2+ may play a role in protonating
the pyrophosphate product.[127]

The two transient elements, the water and the third metal ion,
revealed by time-resolved analyses have escaped notice in conventional
studies of either enzyme–substrate or enzyme–product
complexes. The third metal ion may be a general feature of DNA synthesis
and many reactions catalyzed by the two-metal ion mechanism. The transient
nature of the third metal ion may explain the discrepancy between
“three metal ions” reported by chemical studies of the
reaction process catalyzed by a group I intron and “two metal
ions” observed in the crystal structures of ribozyme–substrate
and ribozyme–product complexes.[133,134] The time-lapse
series of the in crystallo reaction also reveals
a delay between substrate and cofactor binding and the actual chemistry,
during which a transient water molecule emerges to hydrogen bond with
the 3′-OH. Deprotonation of the 3′-OH could be the rate-limiting
step, and the details of what leads to its pKa shift and the chemical reaction await future studies.
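
As a rough illustration of the time course described above, the reported occupancies can be placed on a lagged first-order curve. This is a minimal sketch under assumptions that are not part of the original study: the model form, the choice of 80 s as the lag, and the function names are all hypothetical. The study reports refined occupancies, not a fitted rate constant. Only the three milestones (bond detectable at 80 s, roughly equal substrate and product at 200 s, a plateau near 70% product) come from the text.

```python
import math

# Milestones taken from the text; the kinetic model itself is an assumption.
T_LAG_S = 80.0      # new bond first detectable after Mg2+ exposure
T_EQUAL_S = 200.0   # substrate and product roughly equal
P_AT_EQUAL = 0.5    # product occupancy at T_EQUAL_S
P_PLATEAU = 0.7     # product occupancy at the apparent equilibrium


def apparent_rate_constant():
    """Solve P_PLATEAU * (1 - exp(-k * (T_EQUAL_S - T_LAG_S))) = P_AT_EQUAL for k."""
    remaining = 1.0 - P_AT_EQUAL / P_PLATEAU
    return -math.log(remaining) / (T_EQUAL_S - T_LAG_S)


def product_occupancy(t_s, k=None):
    """Illustrative product occupancy at time t_s (seconds after Mg2+ exposure)."""
    if k is None:
        k = apparent_rate_constant()
    if t_s <= T_LAG_S:
        # Assumed lag: reactants may already be aligned, but no product yet.
        return 0.0
    return P_PLATEAU * (1.0 - math.exp(-k * (t_s - T_LAG_S)))


# Print the illustrative curve at a few soak times.
for t in (40, 80, 140, 200, 230, 250):
    print(f"{t:4d} s  product occupancy ~ {product_occupancy(t):.2f}")
```

The sketch only shows that a delay followed by simple accumulation is compatible with the reported milestones. It says nothing about which step is rate-limiting, and the apparent rate constant it yields should not be read as a measured value.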
Concluding Remarks
In the 18 years since yeast Rev1 was first purified and shown to
be a nucleotidyl-transferase, Y-Family DNA polymerases have been discovered
and characterized. However, many questions are yet to be answered.
The most fundamental question remaining is how so many DNA polymerases
are coordinated at a replication fork to ensure that DNA synthesis
occurs with the highest accuracy and efficiency in the presence of
various roadblocks. The molecular details of how DNA polymerases switch
when a lesion is encountered remain vague. With regard to the polymerase
catalytic region, atomic structures of pol κ bypassing a BPDE
adduct are yet to be obtained. While more protein modifications are
likely to be found, the importance of the linear arrangement of PIP,
RIR, UBM, and UBZ in each polymerase and whether these homologous
functional parts can be exchanged between Y-Family members are not
known.