
The solution structure of the amino-terminal domain of human DNA polymerase epsilon subunit B is homologous to C-domains of AAA+ proteins.

Tarmo Nuutinen, Helena Tossavainen, Kai Fredriksson, Päivi Pirilä, Perttu Permi, Helmut Pospiech, Juhani E Syväoja.

Abstract

DNA polymerases alpha, delta and epsilon are large multisubunit complexes that replicate the bulk of the DNA in the eukaryotic cell. In addition to the homologous catalytic subunits, these enzymes possess structurally related B subunits, characterized by a carboxyterminal calcineurin-like domain and an aminoproximal oligonucleotide/oligosaccharide binding-fold domain. The B subunits also share homology with the exonuclease subunit of archaeal DNA polymerases D. Here, we describe a novel domain specific to the N-terminus of the B subunit of eukaryotic DNA polymerases epsilon. The N-terminal domain of human DNA polymerase epsilon subunit B (Dpoe2NT), expressed in Escherichia coli, was characterized. Circular dichroism studies demonstrated that Dpoe2NT forms a stable, predominantly alpha-helical structure. The solution structure of Dpoe2NT revealed a domain that consists of a left-handed superhelical bundle. Four helices are arranged in two hairpins, and the connecting loops contain short beta-strand segments that form a short parallel sheet. DALI searches demonstrated a striking structural similarity of Dpoe2NT with the alpha-helical subdomains (the C-domains) of ATPases associated with various cellular activities (AAA+ proteins). Like C-domains, Dpoe2NT is rich in charged amino acids. The biased distribution of the charged residues is reflected by a polarization and a considerable dipole moment across Dpoe2NT. Dpoe2NT represents the first C-domain fold not associated with an AAA+ protein.


Year: 2008 PMID: 18676977 PMCID: PMC2528186 DOI: 10.1093/nar/gkn497

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971

INTRODUCTION

Early cellular evolution was accompanied by gross genetic rearrangements (1–3), which resulted in apparent modularity of proteins (4) and in unexpected structural similarities and dissimilarities among cellular forms of life (5,6). Modular proteins involved in DNA metabolism, including DNA polymerases (Pols), have received particular attention (7). In the eukaryotic cell, it is the family B Pols α, δ and ε that replicate the bulk of the nuclear DNA. These enzymes are all large multisubunit and multidomain complexes. The largest subunit, the catalytic subunit A, contains the motifs for DNA polymerase activity and, in the case of Pols δ and ε, proofreading 3′-exonuclease activity (8). Subunit B belongs to a superfamily of B subunits conserved among eukaryotic family B Pols α, δ and ε (9,10). Surprisingly, this family also contains subunits of archaeal family D Pols, whose large subunits do not show any homology to family B Pols, except for putative zinc fingers. In this B-subunit superfamily, a calcineurin-like phosphoesterase domain spanning the C-terminal half of the B subunit is separated from an amino-proximal OB (oligonucleotide/oligosaccharide binding)-fold domain (11) by a short proline-rich region (9). The seven residues involved in metal coordination and catalysis in the calcineurin-like phosphoesterase domain are conserved in archaeal B-subunits but disrupted in their eukaryotic counterparts (10). Consistent with the conservation of the phosphoesterase domain, the B-subunit of Methanococcus jannaschii was subsequently found to display 3′-exonuclease activity (12). Pol ε is required for chromosomal DNA replication in eukaryotic cells, probably as the enzyme that synthesizes the leading strand (13–16). Pol ε has also been implicated in many functions independent of DNA replication (13). Human Pol ε is composed of four subunits, A–D, with molecular weights of 262, 60, 12 and 17 kDa, respectively. 
Subunits A and B are essential for the viability of yeast cells, while the histone-fold subunits C and D are not. Low-resolution structures of yeast Pol ε, determined by cryo-electron microscopy, indicate that the three small subunits form an extended structure that is flexibly connected to a larger globular domain formed by the catalytic subunit A (17). We report here the identification of a novel conserved domain in the very N-terminal end of human Pol ε subunit B (Dpoe2). Structural characterization by means of solution state NMR indicates a close structural relationship of this domain with the carboxyproximal α-helical domain of AAA+ proteins (ATPases associated with various cellular activities).

MATERIALS AND METHODS

DNA constructs and preparation of protein samples

Cloning

DNA insert corresponding to amino acids 1–75 of human Pol ε subunit B was amplified by conventional PCR with primers 5′-CGTGGATCCGTCGACATGGCGCCGGAGCGGCTGCGGAG-3′ and 5′-TTGAATTCTCGAGATTAACTGGATTCCTGGACTGCTGC-3′, and plasmid harboring human POLE2 cDNA as the template. SalI and XhoI digested insert was ligated to SalI, XhoI and CIP-treated pRSFDuet-1 vector (Novagen, Gibbstown, New Jersey, USA). The resulting construct encodes the peptide sequence: MGSSHHHHHHSQDPNSSSARLQVDMAPERLRSRALSAFKLRGLLLRGEAIKYLTEALQSISELELEDKLEKIINAVEKQPLSSNMIERSVVEAAVQESS. The peptide contains a histidine affinity purification tag (aa 1–24, italics) followed by the sequence representing the amino-terminus of Dpoe2 (aa 25–99). The amino acid substitution C98S is underlined.
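The residue numbering of the construct can be checked directly against the quoted peptide sequence. The following is a minimal Python sketch; the names `CONSTRUCT`, `TAG_END` and `residue` are illustrative and not part of the original work, and only the sequence string itself is taken from the text above.

```python
# Construct sequence exactly as quoted in the text.
CONSTRUCT = (
    "MGSSHHHHHHSQDPNSSSARLQVD"  # His-tag and linker, construct residues 1-24
    "MAPERLRSRALSAFKLRGLLLRGEAIKYLTEALQSISELELEDKLEKIINAVEKQPLSSNMIERSVVEAAVQESS"
)

TAG_END = 24  # last tag residue, 1-based construct numbering (from the text)


def residue(seq: str, pos: int) -> str:
    """Return the residue at a 1-based construct position."""
    return seq[pos - 1]


tag = CONSTRUCT[:TAG_END]       # affinity tag, residues 1-24
dpoe2nt = CONSTRUCT[TAG_END:]   # Dpoe2 amino-terminus, residues 25-99

# Fraction of the construct contributed by the tag.
tag_fraction = len(tag) / len(CONSTRUCT)

print("construct length:", len(CONSTRUCT))
print("Dpoe2NT length:", len(dpoe2nt))
print("residue 25:", residue(CONSTRUCT, 25))
print("residue 98 (site of the C98S substitution):", residue(CONSTRUCT, 98))
print(f"tag fraction: {tag_fraction:.1%}")
```

Slicing at position 24 keeps the construct numbering (25–99) and the native Dpoe2 numbering (1–75) easy to convert between: native position = construct position − 24.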

Production of labeled samples for NMR

Escherichia coli (BL21[DE3]) cells carrying the plasmid described above were grown to saturation at +37°C in 5.5 ml of LB medium supplemented with 35 μg/ml of kanamycin. The culture was diluted in a stepwise manner to M9 labeling medium with 2.0 g/l of 13C d-glucose and 1.0 g/l of 15N–NH4Cl as the only source of carbon and nitrogen, respectively, and supplemented with MEM-vitamins (Sigma, St. Louis, Missouri, USA) and antibiotics to a final volume of 300 ml. Isotope labels were all purchased from Spectra, Andover, MD, USA, at >99% grade. Temperature was gradually decreased during the dilution. When temperature was stabilized to +25°C and cell density reached OD600 0.4, the protein expression was induced by adding IPTG to a final concentration of 0.65 mM. Cells were harvested by centrifugation when the culture had reached OD600 1.8.

Protein purification

Cells were suspended and incubated in lysis buffer (0.5 M NaCl, 50 mM MgCl2, 0.3% Triton-X-100 and lysozyme in 100 mM Tris–HCl, pH 6.8). Lysis was completed by brief pulses of ultrasonication. The solution was clarified by centrifugation and Ni-chelating sepharose particles were added to the supernatant. Particles were collected, washed once with 10 vol. of lysis buffer and twice with washing buffer (0.5 M NaCl in 100 mM Tris–HCl, pH 6.8). Proteins were eluted with an elution buffer (0.5 M NaCl, 0.5 M imidazole, pH 6.5). The eluted fraction was microfiltered and dialyzed against the NMR buffer (155 mM NaCl, 25 mM Tris–HCl, 25 mM imidazole, 1.25 mM EDTA, pH 6.62). As estimated by SDS–PAGE the purity of the protein was 80–90% with no prominent contaminant. D2O was added to protein solution prior to structure determination to a final concentration of 7% (v/v). The concentration of the protein was 0.6–0.8 mM. Unlabeled protein samples for circular dichroism (CD) spectroscopy were prepared as mentioned earlier, with the exception that M9-medium was substituted with standard LB medium supplemented with kanamycin. Due to the high expression levels in LB medium, a purity of >95% was reached. Samples were microfiltered and dialyzed against a 10 mM potassium phosphate buffer, pH 8.2.

CD spectroscopy

Protein concentrations for CD spectroscopy were adjusted using the Bradford method with BSA as a standard. CD spectroscopy was carried out at 20°C using a Jasco J-715 spectropolarimeter with a Jasco PFD-3505 temperature controller (Jasco, Easton, MD, USA). The far-UV spectra of the proteins (0.125 mg/ml) were measured from 190 nm to 250 nm in 5 mM potassium phosphate, pH 8.2, with the following settings: response, 1 s; speed, 50 nm/min; path length, 1 mm; band width, 1 nm; average of eight scans. The mean molar ellipticities were calculated with Jasco software. Quantitative estimations of the secondary structure content were made with the programs CDSSTR, CONTIN and SELCON3 included in the CDPro software package (http://lamar.colostate.edu/~sreeram/CDPro) (18). The melting curve was generated by monitoring the ellipticity of the protein solution at 222 nm over a temperature gradient of 20–98°C with the following instrument settings: temperature increase, 30°C/h; response, 16 s; and path length, 1 mm. The temperature titration was followed by an immediate reverse scan over the same temperature range to study the reversibility of the unfolding.
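For orientation, the conversion from a raw CD signal to mean residue ellipticity can be sketched as follows. This is the commonly used textbook conversion, not necessarily the exact computation performed by the Jasco software, which the text does not specify. The concentration (0.125 mg/ml) and path length (1 mm = 0.1 cm) come from the settings above; the molecular weight and the raw signal value are hypothetical placeholders.

```python
def mean_residue_weight(mol_weight_da: float, n_residues: int) -> float:
    """Mean residue weight: molecular weight divided by the number of peptide bonds."""
    return mol_weight_da / (n_residues - 1)


def mean_residue_ellipticity(theta_mdeg: float, conc_mg_ml: float,
                             path_cm: float, mrw: float) -> float:
    """Convert observed ellipticity (mdeg) to deg cm^2 dmol^-1 per residue."""
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_ml)


# Hypothetical inputs for illustration only.
ASSUMED_MW_DA = 11000.0   # placeholder molecular weight, NOT from the source
N_RESIDUES = 99           # construct length from the quoted sequence
RAW_SIGNAL_MDEG = -20.0   # placeholder reading at 222 nm, NOT from the source

mrw = mean_residue_weight(ASSUMED_MW_DA, N_RESIDUES)
mre = mean_residue_ellipticity(RAW_SIGNAL_MDEG, conc_mg_ml=0.125,
                               path_cm=0.1, mrw=mrw)
print(f"MRW = {mrw:.1f} Da, [theta] = {mre:.0f} deg cm^2 dmol^-1")
```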

NMR spectroscopy and structure determination

NMR spectra for the structure determination were acquired at 25°C on Varian Unity Inova spectrometers operating at 600 and 800 MHz 1H frequencies, equipped with 5 mm triple resonance probeheads and actively shielded z- and xyz-axis gradient systems. Spectra were processed with Vnmr 6.1 revision C (Varian Inc., Palo Alto, CA, USA), and analyzed with Sparky 3.106 (19). A set of 3D experiments, i.e. HNCA, HN(CO)CA, iHNCA, HNCACB, HN(CO)CACB and HNCO, was used for backbone assignment (20,21). Aliphatic side-chain assignment was accomplished using 3D H(CCO)NH, CC(CO)NH and HCCH-COSY experiments. A combination of (Hβ)Cβ(CγCδ)Hδ, (Hβ)Cβ(CγCδCε)Hε (22) and NOE experiments was employed for the assignment of aromatic side-chains. NOE peaks were picked from a 3D 15N-edited NOESY-HSQC spectrum (23) and a 13C-edited NOESY-HSQC spectrum (24), the latter modified to simultaneously excite aliphatic and aromatic carbon resonances. The NOE signal assignment and structure calculations, based on NOE-peak intensities, were performed automatically by the program CYANA (25). Three hundred conformers were generated, and the 30 conformers with the lowest target function values were subjected to restrained energy minimization with AMBER 8 (26). AMBER refinement consisted of a cycle of simulated annealing using the generalized Born implicit solvent model. Structures were analyzed with PROCHECK-NMR (27). A final family of 20 structures with the lowest NOE restraint violation energies was selected to represent the Dpoe2NT structure in solution.
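The two-stage selection described above (300 conformers, the 30 with the lowest target function refined, then the 20 with the lowest restraint violation energy kept) can be illustrated with a short sketch. It does not call CYANA or AMBER; all numerical values are randomly generated stand-ins, and the `Conformer` class and `select_lowest` helper are hypothetical names introduced only for this illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Conformer:
    index: int
    target_function: float          # stand-in for the CYANA target function value
    violation_energy: float = 0.0   # stand-in for the post-refinement violation energy


def select_lowest(items, key, n):
    """Return the n items with the smallest key value."""
    return sorted(items, key=key)[:n]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Stage 1: 300 calculated conformers, keep the 30 with the lowest target function.
conformers = [Conformer(i, rng.uniform(0.5, 20.0)) for i in range(300)]
refined = select_lowest(conformers, key=lambda c: c.target_function, n=30)

# Stage 2: refinement would assign an energy to each conformer; random values here.
for c in refined:
    c.violation_energy = rng.uniform(3.0, 8.0)

# Stage 3: keep the 20 refined conformers with the lowest violation energy.
final = select_lowest(refined, key=lambda c: c.violation_energy, n=20)
print(len(conformers), len(refined), len(final))
```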

Bioinformatics

Multiple sequence alignments were generated with ClustalW (28), using the Gonnet matrix. Sequence alignments were revised with secondary structure prediction data retrieved from the PredictProtein server (29). Multiple sequence alignments were managed with the GeneDoc sequence editor (30). The following web servers and programs were used for structural homolog searches, superimposition of molecular structures, structural comparison and alignment: DALI (31), DaliLite (32), CATHedral (33), MOLMOL (34) and Swiss-PdbViewer (35). Structure searches and comparisons were performed using structural model no. 1 (residues 25–99) as a representative structure for Dpoe2NT. Dipole moments for all isolated domain structures were calculated at http://bip.weizmann.ac.il/dipol (36). HMM–HMM homology searches were performed at the HHPred server (37).

Database deposition

The atomic coordinates, NMR restraint data and chemical shifts for the 6His-tagged amino-terminal domain of Pol ε subunit B have been deposited at the EBI Macromolecular Structure Database (EBI-MSD), with the use of the PDB AutoDep database deposition tool, and in the RCSB Protein Data Bank (PDB) under the PDB code 2v6z.

RESULTS AND DISCUSSION

A novel domain in the aminoterminus of Pol ε subunit B

Previous studies have identified a C-terminal calcineurin-like phosphoesterase domain (10) and an OB-fold domain (11) within the B subunits of eukaryotic Pols α, δ and ε, as well as archaeal Pol D, whereas the N-terminal part of the B subunits shows very poor conservation. We re-analyzed the sequence aminoproximal to the OB-fold domain of Dpoe2, the human Pol ε B subunit. Sequence comparison revealed a previously unidentified region of conservation corresponding to the first 75 amino acids of human Dpoe2 (Figure 1). A regular pattern of charged and hydrophobic residues suggests a conserved fold. Independent secondary structure predictions of sequences from representative species indicate that the amino terminus of eukaryotic Pol ε B subunits may form a bundle of four helices. In contrast, the neighboring region, corresponding to amino acids 76–180 of Dpoe2, shows no apparent sequence conservation, suggesting that it is largely disordered.

Figure 1.

Domain organization of human Pol ε subunit B and conservation of the amino acid sequence in the N-terminal domain of eukaryotic Pol ε B subunits. Uniprot accession numbers are given on the left. The alignment was generated with ClustalW (28). XENLA, Xenopus laevis; CHICK, Gallus gallus; DANRE, Danio rerio; AEDAE, Aedes aegypti; CAEEL, Caenorhabditis elegans; YEAST, Saccharomyces cerevisiae; CANAL, Candida albicans; YARLI, Yarrowia lipolytica; ASPFU, Aspergillus fumigatus; CRYNE, Cryptococcus neoformans; ARATH, Arabidopsis thaliana; ORYSJ, Oryza sativa subsp. japonica; DICDI, Dictyostelium discoideum; LEIMA, Leishmania major; 9TRYP, Trypanosoma brucei.

As sequence alignments and secondary structure predictions suggested that the N-terminus of human Dpoe2 forms an independent domain, we expressed this region (aa 1–75) with an aminoterminal 6xHis-tag separated by a flexible linker as a recombinant protein (His-Dpoe2NT) in E. coli. The fragment was well expressed and soluble, and could be purified to near homogeneity on a chelating column (Supplementary Figure 1A). The purified protein was subjected to CD studies. The far-UV CD spectrum exhibited double minima at 208 and 222 nm, characteristic of α-helical structures (Supplementary Figure 1B). The CD spectrum of the native protein was analyzed with the CDPro package (18). The CDSSTR, CONTIN and SELCON3 algorithms predicted similar secondary structure compositions of 58% α-helix, 7% β-strand, 14% β-turns and 20% random coil. The predicted random coil content may derive from the purification tag, which comprises 24% of the total protein. Temperature titration revealed considerable thermal stability of recombinant His-Dpoe2NT (Supplementary Figure 1C). The melting point Tm was estimated to be 71.4°C from the first and second derivative of the temperature titration and 70.3°C from the van't Hoff plot. We found that the temperature-dependent change in ellipticity at 222 nm was ∼80% reversible. The high Tm, together with the considerable degree of reversibility of thermal unfolding, supports the view that the N-terminus of Dpoe2 is folded as an independent, compact domain.
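The two ways of estimating Tm mentioned above (the derivative of the melting curve and a van't Hoff plot) can be sketched on a synthetic curve. The sketch assumes an ideal two-state unfolding transition with a temperature-independent enthalpy, which the source does not state; the enthalpy and the folded and unfolded baseline signals are hypothetical, and only the temperature range and the 71.4°C midpoint are taken from the text. The measured data are not reproduced here.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def fraction_unfolded(temp_c, tm_c, dh_j_mol):
    """Two-state model: K = exp(-dH/R * (1/T - 1/Tm)), f_u = K / (1 + K)."""
    t, tm = temp_c + 273.15, tm_c + 273.15
    k = math.exp(-dh_j_mol / R * (1.0 / t - 1.0 / tm))
    return k / (1.0 + k)


def tm_from_first_derivative(temps, signal):
    """Temperature at which the central-difference slope is steepest."""
    best_t, best_slope = None, 0.0
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > abs(best_slope):
            best_t, best_slope = temps[i], slope
    return best_t


def tm_from_vant_hoff(temps, signal, folded, unfolded):
    """Fit ln K against 1/T in the transition region; Tm is where ln K = 0."""
    xs, ys = [], []
    for t, s in zip(temps, signal):
        f = (s - folded) / (unfolded - folded)
        if 0.1 < f < 0.9:
            xs.append(1.0 / (t + 273.15))
            ys.append(math.log(f / (1.0 - f)))
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return -slope / intercept - 273.15


# Synthetic melting curve over 20-98 degrees C (hypothetical parameters).
TRUE_TM_C, ASSUMED_DH = 71.4, 250_000.0   # enthalpy is a placeholder
FOLDED, UNFOLDED = -20.0, -4.0            # placeholder baselines in mdeg
temps = [20.0 + 0.5 * i for i in range(157)]
signal = [FOLDED + (UNFOLDED - FOLDED) * fraction_unfolded(t, TRUE_TM_C, ASSUMED_DH)
          for t in temps]

tm_deriv = tm_from_first_derivative(temps, signal)
tm_vh = tm_from_vant_hoff(temps, signal, FOLDED, UNFOLDED)
print(f"Tm (derivative) = {tm_deriv:.1f} C, Tm (van't Hoff) = {tm_vh:.1f} C")
```

On real data the two estimates need not coincide, since the van't Hoff analysis depends on the choice of folded and unfolded baselines and on how closely the transition follows a two-state model.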

The NMR structure of the Dpoe2NT

Based on observations on protein stability and conservation, we expressed and purified the fully 13C-/15N-labeled His-Dpoe2NT protein for structure determination by NMR. The 15N-HSQC spectrum of Dpoe2NT showed moderately dispersed resonances typical of a highly α-helical protein. In total, 91.3% of all 1H resonances were assigned (Figure 2). The missing assignments belong to the first 11 residues of the N-terminal His-tag. Assignment of the 13 leucine residues was particularly challenging due to heavily overlapping chemical shifts, especially in the methyl groups. Structure calculation was carried out automatically using the program CYANA. A total of 1488 nonredundant NOE distance restraints, obtained from 3D 15N- and 13C-edited NOESY spectra, were assigned by the NOEASSIGN algorithm in CYANA. The automatic method was employed to calculate 300 initial structures. Of these, the 30 structures with the lowest target function values were selected and subsequently energy-minimized with AMBER to obtain the final ensemble of 20 structures, representing Dpoe2NT in solution. Of the 1488 restraints, 1439 were evenly distributed through residues 25–99 of the construct, which constitute the Dpoe2NT. The remaining 49 restraints involved the amino-terminal His-tag (residues 1–24 of the construct) and included no long-range NOEs and only two medium-range NOEs. Special care was taken to verify that the first 25 residues not belonging to the native sequence of Dpoe2NT would not contribute to the automatic NOE assignment and structure calculation procedure in CYANA. We also analyzed our structure by verifying its NOE completeness as described by Doreleijers et al. (38). Neglecting intraresidual restraints, the results indicate an NOE completeness of 71, 54 and 22% for 3, 4 and 5 Å cut-offs, respectively. This fits well with the average values found for 97 proteins, i.e. 68 ± 14%, 48 ± 13% and 26 ± 9% (38). 
A summary of the structural statistics is provided in Table 1, and an ensemble of the 20 final structures with the lowest NOE restraint violation energies for the human Dpoe2NT is displayed in Figure 3A. The 20 best conformers had very good covalent geometry and small energy terms (Table 1). A structure quality analysis by PROCHECK-NMR (27) revealed 92.5, 7.1 and 0.4% of the protein residues in the most favored, additionally allowed and generously allowed regions of the Ramachandran plot, respectively. Overall, the structure ensemble is well defined, except for the less well-ordered N-terminal residues 25–31 of Dpoe2NT (Figure 3A–C). Even for this part, the majority of the structures exhibit at least moderate helicity. For the well-ordered residues 32–99, backbone and heavy atoms show a RMSD of 0.37 ± 0.07 and 0.80 ± 0.06 Å, respectively.
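The statement that the NOE completeness fits well with the reference values can be made concrete by expressing each observed value as a deviation from the reference mean in units of the reference standard deviation. The sketch below uses only the numbers quoted in the text; the variable and function names are illustrative.

```python
# NOE completeness (%) at each distance cut-off (angstrom), as quoted in the text.
observed = {3: 71.0, 4: 54.0, 5: 22.0}

# Reference mean and standard deviation (%) for 97 proteins (38).
reference = {3: (68.0, 14.0), 4: (48.0, 13.0), 5: (26.0, 9.0)}


def standardized_deviation(value, mean, sd):
    """Number of reference standard deviations between value and mean."""
    return (value - mean) / sd


deviations = {
    cutoff: standardized_deviation(observed[cutoff], *reference[cutoff])
    for cutoff in observed
}

for cutoff, z in deviations.items():
    print(f"{cutoff} A cut-off: {z:+.2f} SD from the reference mean")
```

All three values lie within one standard deviation of the corresponding reference mean.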

Figure 2.

Table 1.

NMR restraints and structural statistics for the ensemble of the 20 best conformers of His-tagged (1–24) Dpoe2NT (25–99 of the construct)

NMR restraints
Total distance restraints	1488
Intraresidue	426
Sequential, |i−j|=1	381
Medium range, 1<|i−j|<5	338
Long-range, |i−j|≥5	343
Violation statistics
Maximum NOE restraint violation (Å)	0.22
Number of NOE violations ≥ 0.20 Å	2
Energies
Average restraint violation energy (kcal/mol)	4.6 ± 0.7
Average AMBER energy (kcal/mol)	−4199.6 ± 11.8
RMS deviations from ideal covalent geometry
Bond lengths (Å)	0.0107 ± 0.0001
Bond angles (°)	2.33 ± 0.02
Atomic coordinate rmsd (Å), residues 32–99
Backbone atoms	0.37 ± 0.07
Heavy atoms	0.80 ± 0.06
Ramachandran plot regions (%), residues 25–99
Residues in most favored regions	92.5
Additionally allowed regions	7.1
Generously allowed regions	0.4
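The restraint classes in Table 1 are defined by the sequence separation |i−j| between the two residues involved. A minimal sketch of that classification, together with a consistency check of the tabulated counts, is given below; the function name `noe_class` is illustrative, and the counts are those listed in the table and text.

```python
def noe_class(i: int, j: int) -> str:
    """Classify an NOE restraint by the sequence separation of residues i and j."""
    sep = abs(i - j)
    if sep == 0:
        return "intraresidue"
    if sep == 1:
        return "sequential"
    if sep < 5:
        return "medium-range"
    return "long-range"


# Restraint counts from Table 1.
TABLE1_COUNTS = {
    "intraresidue": 426,
    "sequential": 381,
    "medium-range": 338,
    "long-range": 343,
}

# Split quoted in the text: restraints within Dpoe2NT versus within the His-tag.
DOMAIN_RESTRAINTS, TAG_RESTRAINTS = 1439, 49

total = sum(TABLE1_COUNTS.values())
print("total from table:", total)
print("total from text:", DOMAIN_RESTRAINTS + TAG_RESTRAINTS)

# Example: a contact between the two beta-strands (residues 44-46 and 84-86).
print(noe_class(45, 85))
```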

Figure 3.

The NMR structure and structural statistics of the amino-terminal domain of the human Pol ε B subunit. (A) Stereo view of backbone traces over residues 25–99 of the final ensemble of 20 models with the lowest NOE restraint violation energies. The program MOLMOL (34) was used to prepare the figure. (B) Ribbon presentation over residues 25–99, showing the orientation of secondary structure elements. (C) NMR restraints and RMSD broken down by residue for residues 25–99 of the construct. Colored bars indicate the numbers of NOE restraints: intraresidue NOEs, interresidue sequential NOEs (|i − j| = 1), interresidue medium-range NOEs (1 < |i − j| < 5) and interresidue long-range NOEs (|i − j| ≥ 5), in dark red, blue, light blue and yellow, respectively, from bottom to top. The black curve indicates local backbone heavy atom RMSDs. The program MOLMOL (34) was used to calculate RMSDs after superimposition of the structures over residues 32–99.

Figure 2. A sensitivity-enhanced 1H–15N HSQC spectrum of His-tagged Dpoe2NT. The spectrum was acquired at 25°C on an 800 MHz spectrometer. Backbone amide correlations are indicated by the one-letter amino acid code.

The human Dpoe2NT consists of a left-handed superhelix bundle that is formed by four α-helices. The helices are arranged in two hairpins, and the two helices of each hairpin are connected by a loop that contains a short β-strand. These two strands, composed of residues 44–46 and 84–86, are arranged into a parallel sheet (Figure 3B). Notably, the loops that connect the α-helices in the bundle superimpose tightly in the structure ensemble, suggesting that they are well ordered in solution (Figure 3A and C). The less ordered residues 25–31 in the N-terminal part are defined by only a few long-range NOEs, although the number of sequential and medium-range NOEs is average. Consequently, the structure of Dpoe2NT is more loosely defined for residues 25–31. Nevertheless, based on sequential and short-range NOEs, all of the structures in the ensemble show apparent helical character almost to the very amino-terminal end of the Dpoe2 sequence of the construct. For further analysis of the structure, we selected model number one, the member of the final ensemble with the lowest NOE restraint violation energy, to represent the structure.

Structural similarity to AAA+ C-domains

A search of the DALI database (31) with the Dpoe2NT (residues 25–99) revealed the carboxyproximal, α-helical domains of AAA+ proteins (the C-domains) as the best structural homologs. The C-domain of the E. coli Pol III δ clamp loader subunit (PDB code 1jqjC) aligned with the Dpoe2NT as the best DALI database hit, with a Z-score of 8.6, followed by C-domains of several other AAA+ family subgroups (Table 2). The AAA+ superfamily represents a large group of enzymes associated with the assembly, disassembly and operation of protein complexes (39,40). AAA+ proteins are characterized by an N-terminal Rossmann fold that contains the Walker-A and -B motifs that form the NTPase domain, fused to a small C-proximal α-helical bundle that forms the lid of the ATP-binding pocket (the C-domain) (Figure 4A). The C-domain is an important distinguishing feature of the AAA+ superfamily in comparison to other NTP-binding proteins (40). AAA+ proteins function by linking conformational changes induced by nucleotide binding and hydrolysis within an oligomeric assembly (41). AAA+ proteins have been classified into different clades and clusters based on their amino acid sequence as well as structural features (42,43). Although AAA+ proteins are functionally remarkably diverse, several groups of this family are involved in DNA metabolism. These include clamp loaders, the Holliday junction migration motor protein RuvB and RuvB-like proteins, MCM-like helicases and the Orc/DnaA group of cellular DNA replication initiators (44,45). Among AAA+ proteins, clamp loaders and RuvB and RuvB-like proteins represented the best DALI hits when searching with the Dpoe2NT domain as bait, followed by the archaeal and eukaryotic replication initiator proteins Cdc6 and ORC. The first hit that did not belong to the AAA+ family was the histone fold protein (1f1eA) from Methanopyrus kandleri at rank 13. 
Reciprocal searches with the C-domains identified in the initial search indicate that Dpoe2NT aligns with C-domains with Dali search Z-scores and RMSD values comparable to the alignment of C-domains with each other (Supplementary Table 1). Parallel searches with the CATHedral-server (33) gave similar results, the majority of the top-listed folds being C-domains and falling into topological class 1.10.8 and superfamily 1.10.8.60 (data not shown). Superimposition of the Dpoe2NT with C-domains of AAA+ proteins confirms that the structures align well (Figure 4A and B). It also becomes apparent that these proteins share the same topology, composed of a four-helix bundle and a short parallel β-sheet (43). Most significantly, the structure of Dpoe2NT represents, to the best of our knowledge, the first report of a C-domain fold that is not in the context of the AAA+ protein superfamily.

Table 2.

Summary of the key values from structure superimpositions of selected structures identified by DALI searches with Dpoe2NT

PDB ID	RMSD to Dpoe2NT (Å)	DALI Z-score	Number of Cα atoms aligned	Sequence identity (%)	Source and description
1in4A	2.0	9.1	72	8	Thermotoga maritima RuvB*
1iqpA	2.1	5.7	62	16	Pyrococcus furiosus RFCS
1jqjC	2.3	8.6	69	16	Escherichia coli HolA

An asterisk indicates the values retrieved by pairwise DaliLite comparison for 1in4A that was not included in the DALI database.

Figure 4.

Comparison of the amino-terminal structure of Pol ε subunit B with C-domain structures of AAA+ family proteins. (A) Ribbon presentation of the δ subunit of the E. coli Pol III γ complex (PDB entry 1jr3D) on the left and the T. maritima RuvB Holliday junction branch migration motor protein with bound ADP molecule (PDB entry 1in4A) on the right. The Dpoe2NT solution structure superimposed on the C-domain is represented in red. The ADP molecule present in structure 1in4A was included to emphasize the location of the homologous structure relative to the functional site at the domain interface. (B) Superimposed Cα traces of C-domain structures against Dpoe2NT. The color scheme was adopted from the coloring in the cluster maps by Ammelburg and co-workers (43): Tma RuvB (PDB ID 1in4A), Tth RuvB (1hqcA) and Hsa RuvBL1 (2c9oA) are in bright green, crenarchaeal replication initiator proteins Ape ORC2 (1w5sA) and Pae Cdc6 (1fnnA) are in dark green, Eco Pol III clamp loader subunits δ (1jqjC) and γ (1jr3A) are in greenish yellow, while clamp loader proteins from Pfu (RFCS, 1iqpA) and yeast (RFC3, 1sxjC) are in olive green. Bacterial proteases and chaperones Bsu Lon1 (1x37A), Eco ClpA (1r6bX) and Eco HslU (1do0A) are in violet, and proteins belonging to the AAA-D1 cluster, Hsa VPS4B (1xwiA) and Tth FtsH (2dhrA), are in light red. Dpoe2NT (2v6zM) is in red. The upright 3D bars in the stereo view indicate the directions (positive) and relative magnitudes (0.57 debye/atom on average) of the dipole moments calculated independently for each domain structure. (C) Structure-based sequence alignment of a representative selection of homologous C-domain structures: E. coli Pol III γ complex subunit δ (1jr3D), Pyrococcus furiosus clamp loader small subunit (1iqpA) and T. maritima RuvB Holliday junction branch migration motor protein (1in4A) against Dpoe2NT (2v6zM). Secondary structure elements of Dpoe2NT are indicated above the alignment. 
The Dpoe2NT domain and the C-domains of AAA+ proteins are both rich in charged amino acids. The structures presented in Figure 4B contain 30.0% (±5.4%) of charged amino acids. In the Dpoe2NT and the majority of the C-domain structures, the parallel helices α1 and α3 are predominantly positively charged, while acidic side chains are distributed throughout the structures but are mostly located in the loops, the strands and the less conserved α4 (Figures 1 and 4C). The uneven distribution of the charged residues is reflected in a polarization that can be visualized as considerable dipole moments (upright in Figure 4B) among the C-domains and Dpoe2NT. Due to the large proportion of charged residues, the dipole moments are approximately twice as large as could be expected from the size of the domains (36). One of the conserved sequence motifs of AAA+ proteins is located in the loop between the second and third helix of the C-domain (46). This ‘sensor-2’ motif is characterized by a conserved arginine residue that is implicated in the sensing of ATP binding and hydrolysis. A motif that corresponds to the sensor-2 cannot be recognized in Dpoe2NT. The arginine site of the sensor-2 motif is occupied by a glutamate in the human Dpoe2NT and by an acidic residue in the majority of its eukaryotic orthologs (Figures 1 and 4C). This is reminiscent of the classical AAA+ clade that has a conserved aspartate instead of arginine (41). In the Dpoe2NT domain, the stem of helix α3 is rich in acidic residues, and there are insertions in the loops preceding and following helix α3. These result in distortion, particularly in the stem of helix α3 (Figure 4A and B). When Dpoe2NT is superimposed onto C-domains in the context of functional AAA+ proteins, the acidic loop extends towards the position occupied by the phosphate groups of the ATP/ADP molecule. Taken together, it is therefore highly unlikely that Dpoe2NT could sense ATP/ADP in the context of AAA+ protein complexes, although mimicry of, or compensation for, the acidic moiety of nucleotides cannot be excluded.
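How an uneven charge distribution translates into a dipole moment can be illustrated with a point-charge sketch. This is the generic definition, μ = Σ qᵢ(rᵢ − r_ref), and is not a reimplementation of the server used in the study (36), whose charge assignment and choice of reference point are not described here. The reference point is taken as the geometric center of the atoms, and the coordinates and charges are a hypothetical toy system.

```python
import math

E_ANGSTROM_TO_DEBYE = 4.80320  # one elementary charge times one angstrom, in debye


def dipole_moment(atoms):
    """Dipole vector (debye) of point charges about their geometric center.

    atoms: list of (charge_in_e, (x, y, z)) with coordinates in angstrom.
    """
    n = len(atoms)
    center = tuple(sum(xyz[k] for _, xyz in atoms) / n for k in range(3))
    vec = [0.0, 0.0, 0.0]
    for charge, xyz in atoms:
        for k in range(3):
            vec[k] += charge * (xyz[k] - center[k])
    return tuple(v * E_ANGSTROM_TO_DEBYE for v in vec)


def magnitude(vector):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in vector))


# Hypothetical toy system: one positive and one negative charge 2 angstrom apart,
# plus two neutral atoms that only affect the per-atom normalization.
toy_atoms = [
    (+1.0, (1.0, 0.0, 0.0)),
    (-1.0, (-1.0, 0.0, 0.0)),
    (0.0, (0.0, 2.0, 0.0)),
    (0.0, (0.0, -2.0, 0.0)),
]

mu = dipole_moment(toy_atoms)
print(f"|mu| = {magnitude(mu):.3f} D, per atom = {magnitude(mu) / len(toy_atoms):.3f} D")
```

For a molecule with a net charge the dipole moment depends on the reference point, so values are only comparable when the same convention is used for every structure, as is the case when all domains are processed by the same server.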

Evolutionary implications

The structural similarities with a variety of C-domains of AAA+ proteins involved in DNA metabolism, in particular with bacterial RuvB and clamp loader subunits, raise the question of whether Dpoe2NT was acquired by partial gene transfer or genetic rearrangement, or whether Dpoe2NT represents merely an analogous structure that has adopted the same fold without an evolutionary relationship to the C-domains. An evolutionary relationship between structures has been inferred from the presence of sequence homology or a similar function in addition to the related fold (47–49). Since normal and profile-based searches did not detect sequence homology between Dpoe2NT and AAA+ proteins, we utilized the highly sensitive HHPred server, which uses HMM–HMM comparison (37). Searches with full-length Dpoe2 sequences from different species identified exclusively C-domains of AAA+ proteins as potential sequence homologs of Dpoe2NT, albeit with only moderate probability values (45–89%). Again, C-domains of AAA+ proteins involved in DNA metabolism, such as clamp loaders, RuvB Holliday junction resolvase or DnaA helicase, were detected (data not shown). Several searches with Dpoe2 also identified C-domains when the secondary structure scoring was turned off, indicating that the hits were not based only on similarity in secondary structure. Taken together, Dpoe2NT shares (i) a common fold, (ii) common electrostatic characteristics (50) and (iii) remote sequence homology with C-domains of AAA+ proteins. Furthermore, Dpoe2NT is implicated in DNA metabolism, as are the most similar C-domains. Collectively, these properties strongly suggest a common evolutionary origin of Dpoe2NT with the C-domains. Recently, Alva and co-workers (51) have proposed a common origin for the N-terminal substrate recognition domain of Clp/Hsp100 proteins, the C-domains of AAA+ proteins, and for the histone fold. The authors suggest that these folds are all derived from an antecedent helix–strand–helix segment (51). 
This study extends the list to the N-terminal domain of Pol ε B subunit.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online.


Authors: Hao Huang; Brian E Weiner; Haijiang Zhang; Brian E Fuller; Yue Gao; Brian M Wile; Kun Zhao; Diana R Arnett; Walter J Chazin; Ellen Fanning
Journal: J Biol Chem Date: 2010-03-16 Impact factor: 5.157

2. Crystal Structure of the Human Pol α B Subunit in Complex with the C-terminal Domain of the Catalytic Subunit.

Authors: Yoshiaki Suwa; Jianyou Gu; Andrey G Baranovskiy; Nigar D Babayeva; Youri I Pavlov; Tahir H Tahirov
Journal: J Biol Chem Date: 2015-04-06 Impact factor: 5.157

3. Crystal structure of the human Polϵ B-subunit in complex with the C-terminal domain of the catalytic subunit.

Authors: Andrey G Baranovskiy; Jianyou Gu; Nigar D Babayeva; Igor Kurinov; Youri I Pavlov; Tahir H Tahirov
Journal: J Biol Chem Date: 2017-07-26 Impact factor: 5.157

Review 4. Modulation of mutagenesis in eukaryotes by DNA replication fork dynamics and quality of nucleotide pools.

Authors: Irina S-R Waisertreiger; Victoria G Liston; Miriam R Menezes; Hyun-Min Kim; Kirill S Lobachev; Elena I Stepchenkova; Tahir H Tahirov; Igor B Rogozin; Youri I Pavlov
Journal: Environ Mol Mutagen Date: 2012-10-10 Impact factor: 3.216

Review 5. DNA replication and homologous recombination factors: acting together to maintain genome stability.

Authors: Antoine Aze; Jin Chuan Zhou; Alessandro Costa; Vincenzo Costanzo
Journal: Chromosoma Date: 2013-04-16 Impact factor: 4.316

6. Kinetic investigation of the polymerase and exonuclease activities of human DNA polymerase ε holoenzyme.

Authors: Walter J Zahurancik; Zucai Suo
Journal: J Biol Chem Date: 2020-10-13 Impact factor: 5.157

7. Structural basis for processive DNA synthesis by yeast DNA polymerase ɛ.

Authors: Matthew Hogg; Pia Osterman; Göran O Bylund; Rais A Ganai; Else-Britt Lundström; A Elisabeth Sauer-Eriksson; Erik Johansson
Journal: Nat Struct Mol Biol Date: 2013-12-01 Impact factor: 15.369

8. Defective interaction between Pol2p and Dpb2p, subunits of DNA polymerase epsilon, contributes to a mutator phenotype in Saccharomyces cerevisiae.

Authors: Malgorzata Jaszczur; Justyna Rudzka; Joanna Kraszewska; Krzysztof Flis; Piotr Polaczek; Judith L Campbell; Iwona J Fijalkowska; Piotr Jonczyk
Journal: Mutat Res Date: 2009-05-20 Impact factor: 2.433

9. Functional mapping of the fission yeast DNA polymerase delta B-subunit Cdc1 by site-directed and random pentapeptide insertion mutagenesis.

Authors: Javier Sanchez Garcia; Andrey G Baranovskiy; Elena V Knatko; Fiona C Gray; Tahir H Tahirov; Stuart A MacNeill
Journal: BMC Mol Biol Date: 2009-08-17 Impact factor: 2.946

10. The C-terminus of Dpb2 is required for interaction with Pol2 and for cell viability.

Authors: Isabelle Isoz; Ulf Persson; Kirill Volkov; Erik Johansson
Journal: Nucleic Acids Res Date: 2012-10-02 Impact factor: 16.971