Investigating the Conformational Dynamics of a Y-Family DNA Polymerase during Its Folding and Binding to DNA and a Nucleotide.

Abstract

During DNA polymerization, the Y-family DNA polymerases are capable of bypassing various DNA damage, which can stall the replication fork progression. It has been well acknowledged that the structures of the Y-family DNA polymerases have been naturally evolved to undertake this vital task. However, the mechanisms of how these proteins utilize their unique structural and conformational dynamical features to perform the translesion DNA synthesis are less understood. Here, we developed structure-based models to study the precatalytic DNA polymerization process, including DNA and nucleotide binding to DPO4, a paradigmatic Y-family polymerase from Sulfolobus solfataricus. We studied the interplay between the folding and the conformational dynamics of DPO4 and found that DPO4 undergoes first unraveling (unfolding) and then folding for accomplishing the functional "open-to-closed" conformational transition. DNA binding dynamically modulates the conformational equilibrium in DPO4 during the stepwise binding through different types of interactions, leading to different conformational distributions of DPO4 at different DNA binding stages. We observed that nucleotide binding induces modulation of a few contacts surrounding the active site of the DPO4-DNA complex associated with a high free energy barrier. Our simulation results resonate with the experimental evidence that the conformational change at the active site led by nucleotide is the rate-limiting step of nucleotide incorporation. In combination with localized frustration analyses, we underlined the importance of DPO4 conformational dynamics and fluctuations in facilitating DNA and nucleotide binding. Our findings offer mechanistic insights into the processes of DPO4 conformational dynamics associated with the substrate binding and contribute to the understanding of the "structure-dynamics-function" relationship in the Y-family DNA polymerases.

Year: 2021 PMID: 35252985 PMCID: PMC8889613 DOI: 10.1021/jacsau.1c00368

Source DB: PubMed Journal: JACS Au ISSN: 2691-3704

DNA replication, an essential process occurring in all living organisms, is finely tuned by the DNA polymerases. During in vivo DNA polymerization, these protein machines may frequently encounter lesions in the DNA template, which can potentially block the normal progression of replication forks. To resolve this critical issue, the Y-family DNA polymerases can perform the translesion synthesis and bypass the DNA lesions.[1,2] Meanwhile, the Y-family DNA polymerases catalyze DNA synthesis with low catalytic efficiency, low processivity, and low fidelity with both undamaged and damaged DNA,[2] compared to the DNA polymerases in the A- and B-families.[3] Structural analysis revealed that the Y-family DNA polymerases have conserved architectures,[4−8] which are different from those of the high-fidelity replicative DNA polymerases. In light of the “structure–function” paradigm, it has been acknowledged that the function of the Y-family DNA polymerase is characterized by its unique structure.[9,10] Nevertheless, a clear picture of how the Y-family DNA polymerases regulate translesion DNA synthesis through the structure and associated conformational dynamics is still not present. As a paradigmatic Y-family DNA polymerase, DNA polymerase IV (DPO4) from Sulfolobus solfataricus has a conserved polymerase core composed of a finger (F), a palm (P), and a thumb (T) domain, as well as a C-terminal little finger (LF) domain tethered to the T domain through a flexible linker.[4] Prior to the catalytic process, there are two essential substrate binding processes, including DPO4 binding to DNA and subsequently recruiting a nucleotide to the active site in the DPO4–DNA complex. 
Crystal structures of DPO4 in the apo state, DPO4–DNA binary, and DPO4–DNA–nucleotide ternary complexes revealed a global conformational change in DPO4 during DNA binding, which occurs through the relocation of the LF domain relative to the polymerase core, whereas the global conformation of DPO4 changes only slightly between the binary and the ternary forms.[4,11] The large-scale “open-to-closed” DPO4 conformational transition induced by DNA has been found to result in a dynamical protein–DNA recognition process that may contribute to the low-fidelity DNA synthesis.[12] Besides, a slight local structural adaptation in the F domain of DPO4 was identified to stabilize the bound incoming nucleotide.[11] However, the full dynamical picture of the structural rearrangements of DPO4 from the apo state, then to the DNA-bound binary complex, and finally to the nucleotide-bound ternary complex remains largely elusive. Akin to the other Y-family DNA polymerases, DPO4 is a typical multidomain protein. DPO4 has been observed to undergo stepwise unfolding, with an intermediate observed in both experiments[13] and simulations.[14−16] The unfolding intermediate, which shows an extended linker and unstable LF domain interfaces with well-folded individual domains in DPO4, was further hypothesized to benefit the formation of multiple DPO4 conformations during its binding to DNA or proliferating cell nuclear antigen (PCNA).[13] This indicates a positive role of DPO4 unfolding in facilitating the functional binding processes. Recent theoretical work found that the weakly formed domain interfaces in DPO4 are the key to simultaneously realizing highly efficient folding and DNA binding.[16] Currently, the interplay between the global (un)folding and domain spatial rearrangements in DPO4, in particular how the (un)folding affects the functional conformational dynamics in DPO4, is still in need of a quantitative investigation. 
DPO4–DNA binding was characterized to be a complex process that shows multistep characteristics associated with dynamically arranging the DPO4 conformational distribution.[17−22] This picture of DPO4 binding to DNA with conformational fluctuations may help the intricate regulation of DPO4 binding to the replication forks during the translesion synthesis through coordinating the movements of the LF domain, which can contribute to the polymerase switching between DPO4 and a replicative DNA polymerase.[23] Currently, it is still not clear how the conformational transition in DPO4 and the DPO4–DNA interactions evolve during DNA binding. After DNA binding, an incoming nucleotide binds to the DPO4–DNA binary complex to form the precatalytic ternary complex. The conformational rearrangements of the active site in the DPO4–DNA complex induced by nucleotide binding have been considered to be the rate-limiting step of the whole enzymatic process.[17,24,25] The results based on the stopped-flow Förster resonance energy transfer (FRET) study suggested that the F domain motions should account for the slow conformational rearrangement in DPO4 during nucleotide incorporation, but the process is less understood at the single-molecular level and the underlying mechanism remains unclear.[19] Here, we addressed the DPO4 conformational dynamics at the precatalytic steps, i.e., an initial DNA binding followed by a nucleotide binding to DPO4. Technically, we developed structure-based models (SBMs) to study the folding, conformational transition, and substrate binding of DPO4. Motivated by the recent experimental evidence that DPO4 is largely in the apo structure and able to adopt the DNA-binding structure with a minor population,[11,26] we extended the single-basin SBM, which was used in our previous studies,[14−16] to the double-well one that incorporates the structural information from the apo and ternary DPO4 forms. 
The model not only generated the experimentally consistent simulation results[13,26] but also enabled us to simultaneously study the global folding and local conformational dynamics of DPO4. We uncovered that the “open-to-closed” functional conformational transition in DPO4 occurs at the bottom of the folding energy landscape and pre-exists in the absence of DNA. The binding of DPO4 to DNA undergoes multiple steps associated with the different conformational distributions of DPO4 that are determined by different interactions. Furthermore, we found that there is a high free energy barrier during nucleotide binding. Careful examinations show subtle destabilization in the interactions surrounding the active site of the DPO4–DNA complex during nucleotide binding and give a hint on the origin of the binding free energy barrier. By performing the localized frustration analyses, we found that the DPO4 conformational dynamics induced by substrate binding are closely related to the highly frustrated interactions present in the native structures. Our theoretical work provides mechanistic insights into the rate-limiting, prechemistry step of the DPO4 catalyzed reaction and helps the understanding of translesion DNA synthesis by the Y-family polymerases.

Results

Global Folding and Local Conformational Dynamics of DPO4

We built a double-well two-bead SBM to study DPO4 folding and conformational dynamics. Each residue, except glycine, was modeled as two beads, representing the backbone and side chain, respectively. In our previous study,[16] we found that the one-bead homogeneous SBM may overweight the contribution of the interdomain interactions in the total energy in the native structure of DPO4. We further suggested that weakening the strengths of interdomain interactions in the SBM can optimize the folding and DNA binding of DPO4. Here, we found that the two-bead homogeneous SBM can naturally lead to a decreased proportion of interdomain interactions in the total energy for stabilizing the native structures with respect to the one-bead homogeneous SBM, possibly due to the fact that an improved native contact map was used with the presence of the side chain in the two-bead coarse-grained model (see Materials and Methods). In addition, considering the highly charged property of DPO4 as a DNA binding protein, we further included the electrostatic interactions described by the Debye–Hückel model and placed the charges onto the side chain of the designated residues (one positive charge for arginine and lysine, one negative charge for aspartic and glutamic acid). The model takes into account the effects of salt concentration through the Debye screening length. Unless otherwise specified, we used the salt concentration of 0.05 M throughout the simulations in accordance with previous experiments for DPO4 folding and substrate binding.[13,24,25,28] The double-well model is realized by a mixture of native contact maps of DPO4 in the apo DPO4 structure (DPO4A)[11] and ternary DPO4–DNA–nucleotide structure (DPO4T)[4] and aims to produce two basins at the energy landscape in order to describe the “open-to-closed” conformational transition of DPO4 (see Supporting Information). 
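The salt dependence that enters the model through the Debye screening length can be illustrated with a minimal sketch. It assumes a 1:1 salt in water near room temperature, for which the Debye length is approximately 3.04 Å divided by the square root of the molar ionic strength. The function names, the dielectric constant of 80, and the Coulomb prefactor in kcal·Å/(mol·e²) are illustrative assumptions; the simulations themselves use reduced units and parameters given in the Supporting Information.

```python
import math

def debye_length_angstrom(salt_molar: float) -> float:
    """Approximate Debye length (in Å) for a 1:1 salt in water near 25 °C."""
    if salt_molar <= 0:
        raise ValueError("salt concentration must be positive")
    return 3.04 / math.sqrt(salt_molar)

def debye_huckel_energy(q_i: float, q_j: float, r: float, salt_molar: float,
                        dielectric: float = 80.0,
                        coulomb_const: float = 332.0) -> float:
    """Screened Coulomb energy between two point charges (illustrative units).

    q_i, q_j are charges in units of e, r is the separation in Å.
    dielectric and coulomb_const are assumed values, not taken from the paper.
    """
    screening = math.exp(-r / debye_length_angstrom(salt_molar))
    return coulomb_const * q_i * q_j * screening / (dielectric * r)

if __name__ == "__main__":
    # Screening length at the 0.05 M salt concentration used in the simulations
    print(round(debye_length_angstrom(0.05), 2))
    # A lysine-like (+1) and a glutamate-like (-1) side-chain bead 10 Å apart
    print(debye_huckel_energy(+1, -1, 10.0, 0.05))
```

Under these assumptions the screening length at 0.05 M is roughly 13.6 Å, so charged side-chain beads interact appreciably over distances comparable to the size of a domain interface, and lowering the salt concentration strengthens the interaction.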
Crystallographic structural analysis revealed that the major differences of DPO4 between the apo and the ternary structures are attributed to the spatial position and rotation of the LF domain that forms interactions with the T domain in the DPO4A and the F domain in the DPO4T, respectively (Figure 1A). Meanwhile, the other segments of DPO4, including the individual domain structures and domain–domain interfaces, remain largely the same. Therefore, the conformational transition of DPO4 between the DPO4A and the DPO4T corresponds to the rearrangements of the interfacial LF domain contacts.

Figure 1

Folding and conformational dynamics of DPO4. (A) The contact map of DPO4 at the apo structure (PDB: 2RDI[11]) (top left) and the ternary DPO4–DNA–nucleotide structure (PDB: 1JX4[4]) (bottom right). The domains in DPO4 are finger domain (F domain, residues 11–70, blue), palm domain (P domain, residues 1–10 and 71–166, red), thumb domain (T domain, residues 167–229, green) and little finger domain (LF domain, residues 245–341, magenta). The flexible linker (residues 230–244) that tethers the T and LF domains is colored gray. The red and blue rectangles indicate the major change of the contacts in DPO4 between the apo and the ternary structures, corresponding to the contacts formed at the T–LF and F–LF interfaces, respectively. (B) Proportion of helical formation in DPO4 and heat capacity curve along with the temperature. Due to the lack of Ramachandran angles in our coarse-grained model, we defined the formation of a helical segment as the one that has at least three continuous dihedrals within the range of −35°–145°.[27] The experimental temperatures are approximately mapped to the simulation temperatures with the knowledge of folding temperatures and further using a linear relation.[13] (C) Free energy landscapes of DPO4 at the room temperature T (left), the first folding transition temperature T1 (right), and the second folding transition temperature T2 (middle). The free energy profiles are projected onto QDPO4(Rest) and QDPO4(CT). QDPO4(CT) = QDPO4(F – LF) – QDPO4(T – LF), where QDPO4(T – LF) is the fraction of the interdomain native contacts between the T and the LF domains in the apo structure and QDPO4(F – LF) is the fraction of the interdomain native contacts between the F and the LF domains in the ternary structure. QDPO4(Rest) is the fraction of the native contacts in DPO4, excluding the ones at the T–LF and F–LF domain interfaces. The free energy is in the unit of kT. T is the corresponding temperature where the free energy was calculated. (D) Structural illustrations of DPO4 at the apo (DPO4A), intermediate (DPO4I), and ternary states (DPO4T). The domains in DPO4 have the same color schemes as the ones at the axes in (A). (E) Scheme illustrating the energy landscape of DPO4 folding and conformational dynamics.

We performed Replica-Exchange Molecular Dynamics (REMD) simulations to explore DPO4’s folding and conformational dynamics.[29] With the Weighted Histogram Analysis Method (WHAM),[30] we investigated the thermodynamics of DPO4 folding, including the heat capacity curve and the melting curve (Figure 1B). We observed an apparent two-step DPO4 folding process exhibiting two folding temperatures (the low melting temperature T2 = 0.96 and the high melting temperature T1 = 1.04 in Figure 1B; the temperature is in reduced units), consistent with the observations in the experiments, which identified two melting temperatures of DPO4 at 89.3 and 102.6 °C.[13] It is worth noting that our previous simulations with single-basin one-bead SBMs generated only one peak on the heat capacity curve and resulted in sigmoidal-like melting curves.[14−16] The results suggest that the presence of the side chain bead and electrostatics in the SBM is critical to recapture the global folding behaviors of DPO4.[31] Due to the simplified interactions and the coarse-grained nature in the SBM, the simulation temperatures cannot directly correspond to the experimental ones. In this regard, we assumed a linear temperature dependence on the energy and provided an approximate connection to bridge the simulation temperatures and the experimental ones with the knowledge of the folding temperatures (see Supporting Information). We quantified the free energy landscapes of DPO4 onto the fraction of native contacts of folding (QDPO4(Rest)) and conformational transition between the DPO4A and DPO4T (QDPO4(CT)) at room temperature T and the two folding temperatures T2 and T1 (Figure 1C).
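As a rough illustration of how simulation and experimental temperatures might be connected, the sketch below fits a straight line through the two pairs of melting temperatures quoted above (0.96 and 1.04 in reduced units; 89.3 and 102.6 °C in experiment). This two-point interpolation is an assumption made here for illustration; the actual mapping procedure is described in the Supporting Information and may differ. The function name `make_linear_map` is hypothetical.

```python
def make_linear_map(sim_points, exp_points):
    """Return (sim -> exp, exp -> sim) linear maps through two anchor points."""
    (s1, s2), (e1, e2) = sim_points, exp_points
    if s1 == s2:
        raise ValueError("anchor simulation temperatures must differ")
    slope = (e2 - e1) / (s2 - s1)
    intercept = e1 - slope * s1

    def to_exp(t_sim):
        return slope * t_sim + intercept

    def to_sim(t_exp):
        return (t_exp - intercept) / slope

    return to_exp, to_sim

# Anchor points: the two melting temperatures from simulation and experiment
to_exp, to_sim = make_linear_map((0.96, 1.04), (89.3, 102.6))

if __name__ == "__main__":
    # Reduced temperature that this two-point fit would assign to 25 °C
    print(to_sim(25.0))
```

Because the map is affine, it gives the same correspondence whether the experimental temperatures are expressed in °C or in K.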
QDPO4(CT) is the fraction of the interdomain native contacts between the F and LF domains in the ternary structure (QDPO4(F – LF)) minus the fraction of the interdomain native contacts between the T and the LF domains in the apo structure (QDPO4(T – LF)), so DPO4 in the apo and ternary structures has QDPO4(CT) equal to −1 and 1, respectively. In order to see whether our model can successfully capture the structures of the DPO4A and DPO4T, we also quantified the free energy landscapes of DPO4 projected onto the root-mean-square deviation (RMSD) toward the apo (RMSDA) and ternary structure (RMSDT) of DPO4 (Figure S7). At room temperature, we can see that when RMSDA and RMSDT are close to 0, QDPO4(CT) values approach −1 and 1, respectively. This suggests that the DPO4A and DPO4T structures are formed at room temperature and that QDPO4(CT) is capable of describing the transitions between the DPO4A and the DPO4T forms. The conformational dynamics of DPO4 at room temperature is limited and entirely attributed to the transition between the DPO4A and DPO4T (Figure 1C). Besides, there is an intermediate state of DPO4 (DPO4I) formed during the transition between the DPO4A and the DPO4T (Figure 1D). The intermediate state DPO4I, at which the LF domain in DPO4 shows no interactions with either the F or T domain and the other regions of DPO4 are well folded, is an inevitable on-pathway intermediate state. In other words, DPO4 at the DPO4I exhibits an extended flexible linker and serves as the bridge to connect the structurally distinct DPO4A and DPO4T. The observation of DPO4I here is consistent with the melting experiment,[13] resonating with the fact that the flexible linker is the key to realizing DPO4 substrate binding through the conformational dynamics of DPO4.[12] As the temperature increases to the low melting temperature T2, the DPO4I state becomes more populated than the DPO4A and DPO4T states, indicating that the DPO4I state is entropically favored. 
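The conformational-transition coordinate defined above can be sketched in a few lines. The example assumes that a native contact counts as formed when the instantaneous distance between the two beads is within 1.2 times its native distance; this cutoff and the function names are illustrative assumptions, not values taken from the paper.

```python
def fraction_formed(distances, native_distances, tolerance=1.2):
    """Fraction of native contacts whose distance is within tolerance * native."""
    if len(distances) != len(native_distances) or not native_distances:
        raise ValueError("need equally sized, non-empty distance lists")
    formed = sum(d <= tolerance * d0 for d, d0 in zip(distances, native_distances))
    return formed / len(native_distances)

def q_ct(d_f_lf, d0_f_lf, d_t_lf, d0_t_lf, tolerance=1.2):
    """Q(CT) = Q(F-LF, ternary contacts) - Q(T-LF, apo contacts)."""
    return (fraction_formed(d_f_lf, d0_f_lf, tolerance)
            - fraction_formed(d_t_lf, d0_t_lf, tolerance))

if __name__ == "__main__":
    native = [5.0, 6.0, 7.0]          # toy native distances in Å
    broken = [50.0, 60.0, 70.0]       # the same contacts pulled far apart
    print(q_ct(broken, native, native, native))   # apo-like snapshot
    print(q_ct(native, native, broken, native))   # ternary-like snapshot
    print(q_ct(broken, native, broken, native))   # intermediate-like snapshot
```

With both LF interfaces broken the coordinate is 0, which is where the intermediate state DPO4I appears on the landscape.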
Structural analysis on the free energy minimum at QDPO4(Rest) ∼ 0.7 shows that the LF domain interfaces are entirely broken while other regions in DPO4 remain folded (Figure S8). Moreover, a new free energy minimum emerges on the landscape at QDPO4(CT) ∼ 0 and QDPO4(Rest) ∼ 0.6, signifying an intermediate state for DPO4 (un)folding. We found that the LF domain at this intermediate state is unfolded while the polymerase core remains fully folded (Figure S8). Further increasing the temperature to the high, dominant melting temperature T1 results in two new minima on the free energy landscape, corresponding to an additional folding intermediate (QDPO4(CT) ∼ 0 and QDPO4(Rest) ∼ 0.4) and the unfolded state (QDPO4(CT) ∼ 0 and QDPO4(Rest) ∼ 0.2), respectively. The structural analysis on the intermediate state at T1 shows that DPO4 has the folded F and P domains and the formed F–P domain interface (Figure S8). The multiple intermediate unfolding states of DPO4 were observed in our previous studies as a result of “divide-and-conquer” domain-wise folding.[14,16] Free energy profiles show that QDPO4(CT) can reach values higher than 0.5 or lower than −0.5 only when QDPO4(Rest) is higher than 0.7. This indicates that the interfacial LF domain interactions, which are responsible for the functional conformational dynamics of DPO4, can only be formed when the other regions of DPO4 have accomplished folding. It also suggests that the vulnerable structural characteristics of the LF domain interfaces in DPO4 serve a functional purpose. We also found that modulating the model parameters related to the strengths of the LF domain interacting with the T and F domains has minor effects on the global folding temperatures (Figure S9). This indicates that the functional conformational dynamics of DPO4 has a minimal impact on the global folding process. 
From the energy landscape perspective (Figure 1E), the apo and ternary states of DPO4 represent two energy basins that are globally located at the bottom of the energy landscapes. In other words, the structures and interactions within DPO4 that underpin its function can only be formed at the late stages of folding, and DPO4 transforms from an inactive (DPO4A) state to an active (DPO4T) state through the local unfolding, which has been largely regarded as an effect of the frustration on the energy landscape in favor of the protein conformational dynamics.[32−34]

Conformation-, Interaction- and Salt-Dependent Multistep DPO4–DNA Binding

On the basis of the double-well SBM of DPO4, we further studied DPO4–DNA binding (in the absence of nucleotide). The DNA binding model includes the interactions of the DPO4–DNA native contacts derived from the ternary crystal structure of DPO4–DNA–nucleotide[4] and electrostatic interchain interactions between DPO4 and DNA.[37,38] To achieve sufficient sampling, we performed umbrella sampling simulations for the DPO4–DNA binding process. We implemented the biasing potentials along the binding reaction coordinate QDNA#, defined as QDNA# = NDNA × QDNA – dRMSDNA, which combines the fraction of the interchain native contacts QDNA with the Euclidean distance of the interchain native contacts to the bound structure dRMSDNA (NDNA is the number of interchain native contacts). A high (low) value of QDNA# corresponds to a high (low) degree of native similarity to the DNA binding in the native structure. It has been recognized that QDNA and dRMSDNA are good at describing the process after the native contacts start to establish and the unbound states with no native contacts formed, respectively.[39] Thus, QDNA# can provide a more precise description at both unbound states and binding states, compared to our previous studies.[16,22] We further calibrated our model to the binding affinity by modulating the strengths of the DPO4–DNA interchain native contacts. From the quantified binding free energy landscape (Figure 2A), we identified four free energy minima that separate the DNA-binding process into three stages: from the completely dissociated unbound state (DNAUS) to the initially anchoring encounter complex (DNAEC), then to the partially bound intermediate state (DNAIS), and finally to the fully bound state (DNABS). The multistep DNA binding picture obtained here is consistent with experimental observations.[17−21] Further analysis shows the DNAEC is made up of two metastable states on the free energy landscape (DNAEC1 and DNAEC2). 
We note that the DNAEC state was not able to be detected as a metastable state in our previous study,[16] where the one-bead SBM without calibrations to the experiments was used along with dRMSDNA as the reaction coordinate. Here, the careful determination on the DNAEC enabled us to characterize the conformational distribution of DPO4 and further dissect the conformational dynamics mechanism of DPO4 during DNA binding.
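The combined binding coordinate introduced above (number of native contacts times the fraction formed, minus the distance deviation from the bound structure) can be sketched as follows. The sketch assumes dRMS is the root-mean-square difference between instantaneous and native contact-pair distances and reuses an assumed 1.2 × native-distance cutoff for counting formed contacts; both choices and the function name are illustrative, and the exact definitions are given in the cited references and the Supporting Information.

```python
import math

def binding_coordinate(distances, native_distances, tolerance=1.2):
    """Q# = N * Q - dRMS for one snapshot (distances in Å)."""
    n = len(native_distances)
    if n == 0 or len(distances) != n:
        raise ValueError("need equally sized, non-empty distance lists")
    # Q: fraction of interchain native contacts that are formed
    q = sum(d <= tolerance * d0 for d, d0 in zip(distances, native_distances)) / n
    # dRMS: root-mean-square deviation of contact-pair distances from native
    d_rms = math.sqrt(sum((d - d0) ** 2 for d, d0 in zip(distances, native_distances)) / n)
    return n * q - d_rms

if __name__ == "__main__":
    native = [5.0, 6.0, 7.0]
    print(binding_coordinate(native, native))                     # fully bound
    print(binding_coordinate([d + 100.0 for d in native], native))  # far apart
```

In the bound structure the coordinate equals the number of native contacts, while for a dissociated pair the contact term vanishes and the coordinate becomes increasingly negative with separation, which is why it resolves both the bound and the unbound regions of the landscape.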

Figure 2

DPO4–DNA binding. (A) Free energy landscapes of DPO4 binding to DNA projected onto the binding reaction coordinate QDNA# at different ϵDNA, where QDNA# = NDNA × QDNA – dRMSDNA, QDNA is the fraction of DPO4–DNA interchain native contacts, NDNA is the number of DPO4–DNA interchain native contacts, and dRMSDNA is the difference of the distance of native contact pairs between DPO4 and DNA with deviation from 0 indicating deviation from the native structure.[35,36] dRMSDNA is in the unit of Å. ϵDNA is the strength of the native contacts between DPO4 and DNA. The free energy landscapes show 4 minima, which are denoted as “Unbinding State (DNAUS)”, “Encounter Complex (DNAEC)”, “Intermediate State (DNAIS)”, and “Bound State (DNABS)”. Inset plots are the zoom-in free energy landscapes at the region of the DNAEC state (left) and binding affinity (K) along with ϵDNA (right). The free energy landscapes and K at different ϵDNA were calculated from reweighting the thermodynamics at ϵDNA = 1.0 (see Supporting Information). The black line in the free energy plot (ϵDNA = 0.70), which matches with the experimental K (3–10 nM),[24,28] was obtained from the direct umbrella sampling simulations. In the zoom-in free energy landscape plot, the DNAEC state is further divided into the DNAEC1 and DNAEC2 states, separated by a minor free energy barrier. In the K plot, the gray shadow region corresponds to the standard error of the mean value (black line) and the yellow line indicates the experimental affinity. The cyan points in the K plot are the results from the direct umbrella sampling simulations. (B) Typical DPO4–DNA structures in the DNAUS, DNAEC1, DNAEC2, DNAIS, and DNABS states extracted from the simulations. The structure is shown in three different views for each binding state. (C) Conformational dynamics of DPO4 at each binding state shown by the probability distribution along with QDPO4(CT). (D) Probability distribution of the fraction of native contacts formed by the individual domains and the linker in DPO4 with DNA. (E) Probability distribution of the interaction energy between DPO4 and DNA.

We found that the conformational dynamics of DPO4 is modulated by DNA during binding (Figure 2B,C and Figure S14). When DPO4 is in the DNAEC stage, it exhibits remarkable population in the DPO4I form (QDPO4(CT) ∼ 0.0), in particular in the DNAEC2. Bearing in mind that DPO4 is largely in the DPO4A form when isolated, DPO4 has significantly shifted its conformational equilibrium toward the DPO4I form in forming the DNAEC. Native contact and interaction energy analyses revealed a negligible amount of native contacts (interactions) between DPO4 and DNA formed in both the DNAEC1 and the DNAEC2, where the interchain interactions are purely non-native electrostatic (Figure 2D,E). These features suggest that the formation of the DNAEC state is nonspecific and driven by the non-native electrostatic interactions. The formations of the DNAEC1 and DNAEC2 from the DNAUS can contribute to the “facilitated diffusion” in the protein–DNA recognition process by reducing the dimensionality of the searching space from 3D to 1D.[40−43] The non-native electrostatic interactions play the key role in forming these two states, where DPO4 is populated in the DPO4I form associated with the extended and flexible linker. Since the linker region is positively charged, we further removed the positive charges in the linker region and performed additional DPO4–DNA binding simulations to examine the effects of the charges in the linker on the DNA binding process. We found that overall the free energy landscape after DPO4 initializing DNA binding (QDNA# > −60) was elevated from the original one (Figure S15). This implies that the charged interactions between the linker in DPO4 and DNA can increase the stability of the DPO4–DNA binding states. 
A significant decrease was observed in the barrier height of the transition from the DNAEC to the DNAUS state after removing the positive charges in the linker region. The results indicate that the flexible and extended, positively charged linker in DPO4 prevents the dissociation between DPO4 and DNA, thus favoring the “facilitated diffusion”. In the DNAIS, the LF domain and the linker region in DPO4 accomplish DNA binding, and DPO4 is largely in the DPO4A form. This indicates that the transition from the nonspecific DNAEC to the partially specific DNAIS involves the modulation of DPO4 conformational dynamics from the DPO4I to the DPO4A form coupled with DNA binding. In the last stage, the DPO4–DNA binding and DPO4 conformational transition to the DPO4T form were found to be strongly coupled. Overall, the stepwise DPO4–DNA binding with the nonmonotonic adaptation of DPO4 conformational dynamics underlines the complexity of the DPO4–DNA binding process. To investigate the effects of the specific and nonspecific interactions on DPO4–DNA binding, we changed the strengths of the DPO4–DNA native contact interactions and the salt concentrations, which result in different strengths of electrostatic interactions in the system. We found an apparent decrease of barrier height for the transition of DNAEC → DNAIS with increasing the strength of the DPO4–DNA native contact interactions (Figure 3A). This indicates that a strong specific DPO4–DNA interaction can help the formation of the DNAIS state. However, strengthening the DPO4–DNA native contacts has minor effects in accelerating the transition from the DNAIS to the DNABS. Our previous work has demonstrated that the flexible domain interface in DPO4 plays a significant role in inducing the DNAIS toward the DNABS.[16] These results together suggest that the last stage of DPO4–DNA binding is controlled by the intrinsic conformational dynamics of DPO4, rather than the interactions between DPO4 and DNA. 
Further calculation of the conformational distribution in DPO4 along with DNA binding shows that DPO4 has a notable population in the DPO4T form at the transition state (barrier region, QDNA# ∼ 60) between the DNAIS and DNABS (Figure S16). This feature signifies a “conformational selection”[44−46] for the last stage of the DPO4–DNA binding process, in line with the theoretical inference that the slow and large-scale conformational dynamics of the proteins favor the “conformational selection” mechanism.[47−49] For the unbinding process, we observed consistently accelerating effects led by decreasing the strength of the DPO4–DNA native contact interactions on the transitions from the DNABS to DNAIS and then to the DNAEC (Figure 3B). This is an intuitive finding as both DNABS and DNAIS are stabilized by the specific DPO4–DNA interactions (Figure 2E).

Figure 3

Barrier heights of DPO4–DNA binding and unbinding. The barrier heights at different stages along with ϵDNA for (A) binding and (B) unbinding processes. The barrier heights at different stages along with CSalt for (C) binding and (D) unbinding processes. Shadow regions represent the standard errors at the corresponding mean values.

Decreasing the salt concentrations (increasing the strength of electrostatic interactions) decreases the barrier height for DPO4 capturing DNA (Figure 3C), likely because of the “fly-casting” effects enhanced by the strengthening of the nonspecific DPO4–DNA electrostatic interactions at low salt concentrations.[50] Meanwhile, the barrier heights for forming the specific DPO4–DNA complex during the following two stages are slightly decreased by weakening the electrostatic interactions. This is possibly due to the fact that, during DNA binding, the magnitude of electrostatic interactions, which are largely non-native, exhibits only a slight decrease (Figure 2E). Similarly, it is also expected that the salt concentration plays a minor role in modulating the unbinding process from the DNABS toward the DNAEC (Figure 3D). However, the electrostatic interactions were found to significantly impact the transition of DNAEC → DNAUS, which becomes the rate-limiting step for the unbinding process at the very low salt concentration. Based on the effects of the interactions on the DPO4–DNA binding and unbinding, we conclude that different stages of the DPO4–DNA binding process are controlled by different types of interactions. For binding, the nonspecific electrostatic interactions provide the driving forces to initialize DPO4–DNA binding; then, the native contacts promote the formation of the partially bound complex, and finally the conformational dynamics in DPO4 drive the transition from the DNAIS to the DNABS through the “conformational selection” mechanism. 
For unbinding, the native contact interactions between DPO4 and DNA play significant roles in the stages of transitions from the bound complex to the encounter complex; finally, the dissociation rate of DPO4–DNA is determined by the nonspecific electrostatic interactions. Our simulation results show the complex DPO4–DNA binding and unbinding processes that are strongly dependent on the intrachain and interchain interactions as well as the ionic environments.
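The barrier heights discussed in this section are read off one-dimensional free energy profiles. A minimal sketch of that bookkeeping is shown below, assuming an already unbiased histogram of the reaction coordinate (in the paper the profiles come from umbrella sampling and reweighting, which is not reproduced here). The free energy is taken as the negative logarithm of the bin probability, in units of kT, and the barrier for a transition is the highest point between two chosen minima relative to the starting minimum. The function names and the toy counts are hypothetical.

```python
import math

def free_energy_profile(counts):
    """Free energy per bin in units of kT, shifted so the global minimum is 0."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("histogram is empty")
    profile = [-math.log(c / total) if c > 0 else math.inf for c in counts]
    lowest = min(profile)
    return [f - lowest for f in profile]

def barrier_height(profile, start, end):
    """Barrier (kT) for going from bin index `start` to bin index `end`."""
    lo, hi = sorted((start, end))
    top = max(profile[lo:hi + 1])   # highest point along the path
    return top - profile[start]

if __name__ == "__main__":
    counts = [100, 10, 1, 10, 1000]      # toy histogram with two minima
    profile = free_energy_profile(counts)
    print(barrier_height(profile, 0, 4))  # forward (binding-like) barrier
    print(barrier_height(profile, 4, 0))  # backward (unbinding-like) barrier
```

Because the forward and backward barriers are measured from different minima, they differ by the free energy gap between the two states, which is why binding and unbinding barriers can respond differently to the same change in interaction strength or salt concentration.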

Two-State Nucleotide Binding to the DPO4–DNA Complex

Finally, we studied the process of nucleotide binding to the DPO4–DNA complex, which completes the precatalytic steps for nucleotide incorporation. Our model, based on the DPO4–DNA binding model, further includes the native contacts and electrostatic interactions between the incoming nucleotide and the DPO4–DNA complex.[4] Here, we applied the same umbrella sampling strategy as in the DPO4–DNA binding simulations, now with a focus on nucleotide binding. From the quantified free energy landscapes (Figure 4A), we observed a two-state binding process with a free energy barrier around 10 kT, which indicates a highly cooperative process. The free energy landscape changes with the strength of the nucleotide binding native contact interactions (ϵNT), leading to a switch of the location associated with the highest free energy barrier. When ϵNT is small, close to 0.9, the highest free energy barrier is located at QNT# ∼ 21; when ϵNT is large, close to 1.4, the highest free energy barrier is located at QNT# ∼ 0. This indicates an essential change of the rate-limiting step in nucleotide binding due to the change of the interactions associated with the nucleotide.

Figure 4

Nucleotide binding to the DPO4–DNA complex. (A) Free energy landscapes of nucleotide binding to the DPO4–DNA complex projected onto the binding reaction coordinate QNT# at different ϵNT, where QNT# = NNTQNT – dRMSNT, QNT is the fraction of nucleotide interchain native contacts, NNT is the number of nucleotide interchain native contacts, and dRMSNT is the difference of the distance of native contact pairs between the nucleotide and the DPO4–DNA complex with deviation from 0 indicating deviation from the native structure. dRMSNT is in the unit of Å. ϵNT is the strength of the nucleotide interchain native contacts. The free energy landscapes show two minima and one transition state, which are denoted as “Unbinding State (NTUS)”, “Transition State (NTTS)”, and “Bound State (NTBS)”. Inset plots are the zoom-in free energy landscapes at the region of the NTTS (bottom) and binding affinity (K) along with ϵNT (top). The free energy landscapes and K at different ϵNT values were calculated from reweighting the thermodynamics at ϵNT = 1.00. The black line in the free energy plot (ϵNT = 1.13), which matches with the experimental K (200–800 μM),[24,25] was obtained from the direct umbrella sampling simulations. In the zoom-in free energy landscape plot, the NTTS state is further divided into the NTTS1 and NTTS2 states. In the K plot, the gray shadow region corresponds to the standard error of the mean value (black line), and the yellow line indicates the experimental affinity. The cyan points in the K plot are the results from the direct umbrella sampling simulations. (B) Typical structures of the ternary DPO4–DNA–nucleotide system in the NTUS, NTTS1, NTTS2, and NTBS states extracted from the simulations. The structure is shown in global view and zoom-in view for the nucleotide binding site at each binding state except the NTUS state. 
(C) Probability distribution of the fraction of native contacts formed by the individual domains and the linker in DPO4 and DNA with the nucleotide. The probability distribution of the interaction energy (D) between nucleotide and DPO4 and (E) between nucleotide and DNA.

The contact analysis shows that the nucleotide initializes the binding process by forming the preliminary contacts with the F domain in DPO4 and DNA at the first transition state NTTS1 (Figure B,C and Figure S21). Upon proceeding to the second transition state NTTS2, the nucleotide continues to stabilize the interactions with DPO4 and nearly completes the contacts that it forms with the DNA in the final NTBS. This indicates that the nucleotide at the NTTS2 has arrived at the correct spatial position on the DNA and that the transition of NTTS2 → NTBS mainly corresponds to the stabilization of the native contacts between the nucleotide and DPO4. Further analysis of the interaction energy shows that the driving forces for nucleotide binding at the early stage in forming NTTS1 are both the native contacts and the non-native electrostatic interactions (Figure D,E), different from DPO4 binding to DNA. The transition from the NTTS2 to the NTBS is promoted by the native contacts between the nucleotide and DPO4, so increasing the nucleotide native contact strength decreases the barrier height at the NTTS2 more than that at the NTTS1 (Figure S22), resulting in the switching of the transition state region.

To see how nucleotide binding influences the DPO4–DNA complex, we performed further analyses on the conformational dynamics of the DPO4–DNA complex. Overall, the DPO4–DNA complex exhibits very similar structures and interactions during nucleotide binding (Figure S24). However, closer examination revealed mild changes primarily associated with the F domain in DPO4 during nucleotide binding. 
To further assess the origins of the changes in the probability distribution of Q, we calculated the probability of formation of each individual native contact at the four nucleotide binding states and compared it with that at the NTBS state (Figure ). We found that most of the native contacts remain similar to those at the NTBS state during nucleotide binding. However, there are notable changes in a few native contacts within the F and P domains and at the F–P and F–DNA interfaces. In this regard, the slight changes in QDPO4(F domain), QDPO4(F – P), and QDNA(F domain) during nucleotide binding arise from the destabilization of a small number of native contacts. The findings indicate that a few native contacts within the F domain and at the F–P and F–DNA interfaces are distorted by the nucleotide during its binding. We further characterized these contacts and mapped them onto the structure (Figure D). We found that all these contacts are located at or near the nucleotide binding site; thus, the partial breaking of these contacts at the binding transition states can open the binding site to accommodate the incoming nucleotide. Our results suggest that the opening of the active site in the DPO4–DNA complex may facilitate nucleotide binding, similar to what was observed previously in a protein kinase that opens its active site for ATP recruitment.[51,52]
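The per-contact comparison described above can be sketched as follows. This is an illustrative sketch only: the −0.1 threshold is the one quoted in the Figure 5 caption, but the definition of a "formed" contact (pair distance within 1.2 times its native value) and all function names are assumptions introduced for the example, not details taken from the study.

```python
import numpy as np

def contact_formation_probability(distances, native_distances, tolerance=1.2):
    """Per-contact probability of being formed over a set of frames.

    distances: array of shape (n_frames, n_contacts).
    A contact counts as formed when its distance is within
    `tolerance` times the native distance (assumed criterion).
    """
    formed = np.asarray(distances) <= tolerance * np.asarray(native_distances)
    return formed.mean(axis=0)

def destabilized_contacts(q_state, q_bound, threshold=-0.1):
    """Indices of contacts whose formation probability in a given state
    is lower than in the bound state by more than |threshold|."""
    diff = np.asarray(q_state) - np.asarray(q_bound)
    return np.where(diff < threshold)[0], diff
```

Applied to frames assigned to each binding state, the returned indices would correspond to the small set of contacts that are partially broken relative to the bound state.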

Figure 5

Formation of native contacts (A) within the F domain in DPO4, (B) at the F–P domain interface in DPO4, and (C) between the F domain in DPO4 and DNA, at the NTUS, NTTS1, NTTS2, and NTBS states during the binding of nucleotide to the DPO4–DNA complex. In each part, the top panel shows the probability distribution of Q and the bottom panel shows the changes in the individual native contact (Q, the probability of the individual native contact formed between bead i and bead j in the SBM) from the binding state to the bound state (NTBS). (D) Illustration of the contacts that have large discrepancies during nucleotide binding. These contacts were identified with Q(State) – Q(NT) < −0.1.

Metal ions are indispensable when the DNA polymerase incorporates the nucleotide into the DNA molecule through the phosphoryl transfer reaction.[53] Since it is still challenging to accurately describe and model the ion interactions in a classical molecular dynamics simulation,[54,55] for simplicity, we coupled the ion and nucleotide binding by establishing bonded interactions between the ion and the nucleotide in our model. To see the effects of the ion on nucleotide binding, we removed the ion and its related interactions and performed the simulations again. We still observed a high free energy barrier for nucleotide binding with two transition states when the ion was absent (Figure S25). Further analyses of the native contacts and interactions between the nucleotide and the DPO4–DNA complex showed similar results with and without the ion. This suggests that the binding pathways are not substantially altered by the ion. However, we found a significant decrease in the stability of the bound state when the ion is absent. The result indicates that the interactions from the ion can help to form a stable ternary DPO4–DNA–nucleotide complex.

Localized Frustration in DPO4 at the apo, DNA Binary, and DNA–Nucleotide Ternary States

Naturally foldable proteins are deemed to obey the “principle of minimal frustration”,[56] which efficiently guides the folding on the funneled energy landscapes. In reality, proteins often endure a limited fraction of interresidue interactions that conflict with others. Although these interactions generally weaken the stability of folded structures, they can promote specific conformational movements, which may be related to the functional purposes. Our results have indicated that the local conformational dynamics, rather than the global unfolding, has important effects on the substrate binding processes. To study the local functional conformational dynamics in DPO4 by taking into account the energetic frustrations, we quantified and compared the frustrations in DPO4 at the native apo, DNA binary, and DNA–nucleotide ternary states based on the method introduced by Ferreiro et al.[32] In all three states of DPO4, we see that the interactions in DPO4 are dominated in the minimally frustrated way (Figure S26), indicating that DPO4 possesses a globally funneled folding energy landscape. The highly frustrated contacts are generally located on the surfaces of individual domains in DPO4, similar to the observations in the single domain proteins.[32,57] Interestingly, there are a notable amount of highly frustrated contacts formed between the T and the LF domains in the apo state of DPO4 (Figure S26A). This signifies a frustrated T–LF domain interface that is prone to unravel or crack in favor of the “open-to-closed” state transitions of DPO4. 
Our results are in line with the previous findings that the highly frustrated interactions in proteins are often enriched at the regions responsible for the large-scale conformational changes.[58] To see how DNA and nucleotide affect the frustrations in DPO4, we compared the differences of the highly frustrated contacts formed in the vicinity of each residue in DPO4 between the apo and the binary states, as well as the binary and ternary states (Figure A). We found that the presence of DNA overall increases the degree of frustration for the residues in the F and P domains (Figure A, middle). This indicates that the residues in the F and P domains of DPO4 are more mobile in the binary state than they are in the apo state. Meanwhile, DNA binding decreases the degree of frustration at several residues located at the interface between the T and the LF domains (Figure B). This indicates that the highly frustrated contacts that favor the “open-to-closed” state transitions in DPO4 are diminished after DNA binding. Nucleotide binding has a much weaker effect on modulating the frustration in DPO4 than DNA binding does (Figure A, bottom, and Figure C). However, we note that a short segment (residues 145–152) in the P domain possesses less highly frustrated contacts when the nucleotide is present. We found that this region is located at the surface of the P domain, interacting with the T domain. Thus, our results indicate that the mobility of the P–T domain interface is weakened by nucleotide binding.

Figure 6

Localized frustration in DPO4. (A) Number of highly frustrated interactions in the vicinity of each residue in the apo (PDB: 2RDI[11]), binary (PDB: 2RDJ[11]), and ternary (PDB: 1JX4[4]) states (top). The differences between the apo and binary states and between the binary and ternary states are shown in the middle and bottom panels of (A), respectively. The x-axis is colored according to the domain index in DPO4, same as that in Figure . (B) and (C) are DPO4 structures colored according to the differences in contacts shown in (A). (D) Differences of the frustration indexes of contacts in DPO4 between the apo and binary states and between the binary and ternary states, calculated from the intradomain (top) and interdomain (bottom) interactions.

We further studied the effects of substrate binding on the highly frustrated contacts formed by the intra- and interdomain interactions by measuring the changes of the frustration index of contacts upon substrate binding. The frustration index measures how favorable a particular contact is relative to the set of all possible contacts in that location normalized by the variance of that distribution.[32] Thus, a low (high) value of the frustration index corresponds to a strongly (weakly) frustrated contact. We see that DNA binding destabilizes all three individual domains in the conserved polymerase core (the F, P, and T domains) by decreasing the frustration index of the corresponding intradomain contacts (Figure D, top). In addition, the F–P domain interface is considered to be more flexible, with a lower frustration index in the binary state than in the apo state. Meanwhile, the frustrated contacts formed by the T–LF domain interface and linker in the DPO4 apo state are markedly reduced by DNA binding, indicating that the conformational dynamics related to the T–LF domain interface and linker in DPO4 vanish in the binary state. 
Further binding of nucleotide to the binary state stabilizes the F domain and domain interfaces of F–P and P–T through increasing the frustration index of the contacts (Figure D, bottom). Together, our results suggest that the F domain and the F–P domain interface in DPO4 are more unstable at the binary state than at the apo and ternary states, and the P–T domain interface is more stable at the ternary state than at the binary state. The frustration analysis echoes our SBM simulation results that the F domain and the domain interfaces of F–P and P–T have high propensities to enable the specific conformational motions during the nucleotide binding process.
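The frustration index used in this analysis can be pictured as a Z-score of the native contact energy against a distribution of decoy energies. The sketch below is a schematic illustration under stated assumptions, not the frustratometer implementation: the energies are supplied by the caller, the normalization uses the standard deviation of the decoy distribution, and the cutoffs of −1 (highly frustrated) and 0.78 (minimally frustrated) are the values commonly quoted for this type of analysis and are taken here as assumptions.

```python
import numpy as np

def frustration_index(native_energy, decoy_energies):
    """Z-score of the native contact energy against decoy energies.

    Lower (more negative) energy is more favorable, so a positive index
    means the native contact is more favorable than the typical decoy.
    """
    decoys = np.asarray(decoy_energies, dtype=float)
    return float((decoys.mean() - native_energy) / decoys.std())

def classify(index, high_cutoff=-1.0, minimal_cutoff=0.78):
    """Label a contact from its frustration index (cutoffs are assumed)."""
    if index < high_cutoff:
        return "highly frustrated"
    if index > minimal_cutoff:
        return "minimally frustrated"
    return "neutral"
```

With this sign convention, a decrease of the index upon substrate binding corresponds to a contact becoming more frustrated, which is how the destabilization of the F and P domains in the binary state is read in the text.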

Discussion and Conclusions

Crystal structures revealed that DPO4 adopts distinct conformations with and without substrate binding.[4,11] With the double-well SBM, we studied the conformational transition of DPO4 between the inactive (DPO4A) and the active (DPO4T) form. During the transition, DPO4 forms an inevitable DPO4I state, which shows an extended linker connecting the T domain and the LF domain. A moderate increase of the temperature from room temperature leads to an increase of the population of the DPO4I state, where the individual domains and domain interfaces except the ones involving the LF domain remain folded (Figure C and Figure S6). The DPO4I state was found to be dominant when DPO4 forms the nonspecific encounter complex with the DNA (Figure C). This indicates that an elevated temperature leads to the increased population of the DPO4I state and thus may promote the DPO4–DNA binding process. On the other hand, the crystallographic DPO4–PCNA structure revealed that the LF domain in DPO4 is adapted from it in the DPO4A form to anchor PCNA for forming the complex.[12] This also implies that the DPO4I state may promote the DPO4–PCNA binding through breaking the interface of the LF domain in the DPO4A form. Since the DPO4I is an entropy-driven state, we proposed a positive role of the temperature in facilitating the binding of DPO4 to the substrates/proteins by inducing the formation of the DPO4I state. It is worth noting that, in our recent studies with the single-basin one-bead model,[15,16] the DPO4I state was not detected during DPO4 unfolding. With considering the local conformational transition in DPO4 and improving the coarse-grained level of the model to the two-bead double-well SBM, here we characterized the DPO4I as the intermediate state for both the “open-to-closed” transition and unfolding of DPO4. 
An unfolding intermediate state with similar structural characteristics of the DPO4I was previously observed in the melting experiments.[13] In this regard, the results generated by the current model are in good agreement with the experiments, suggesting that DPO4 undergoes partial unfolding to accomplish the functional conformational transition. The finding enriches the current understanding of conformational flexibility and frustration in the multidomain protein native structures for promoting the functional structure arrangements.[59,60] Our simulations show that the conformational transition of DPO4 occurs through adapting the interfacial domain interactions involved by the LF domain while the other regions in DPO4 remain structurally unaltered. The results indicate that the domain interfaces of the LF domain in DPO4, which are responsible for the functional conformational dynamics, are more fragile than the others, which are responsible for maintaining the DPO4’s folded structure. This has led to a globally funneled energy landscape of DPO4 with two small basins at the bottom of the funnel, corresponding to the inactive and active DPO4 conformational states (Figure ). The transition between the DPO4A state and the DPO4T state has to go through the entropy-driven DPO4I state located at the upper layer of the energy landscape compared to that of these two states. All three states of DPO4 are located at the bottom of the funnel-like energy landscape, so the functional conformational dynamics of DPO4 is restricted to an efficient local structural rearrangement of the domain interfaces rather than a slow global unfolding.[34,61] This leads to first unraveling and then folding for the dynamical scenario of the conformational change.

Figure 7

Scheme illustrating DPO4 folding, conformational dynamics, and substrate binding from the energy landscape perspective. For folding, the global energy landscape of DPO4 is funnel-like with two basins at the bottom. For substrate binding, the local energy landscapes responsible for the functional “open-to-closed” conformational dynamics of DPO4 are illustrated. DPO4 is shown in cartoon representation with each domain and the linker region colored by the same scheme used in Figure .

The DPO4–DNA encounter complex is stabilized by the non-native electrostatic interactions with DPO4 largely in the DPO4I form (Figure ). Given the fact that the linker in DPO4 is positively charged, we performed additional DPO4–DNA binding simulations with the linker in DPO4 free of positive charges. Despite the notable destabilization in the binding states led by removing the positive charges in the linker of DPO4, the binding barrier heights remain almost the same regardless of the presence of the positive charges in the linker. A significant decrease of the barrier height was observed in the transition from the DNAEC to the DNAIS state. Thus, the extended, positively charged linker of DPO4 can prevent the dissociation of the DPO4–DNA encounter complex, facilitating the binding process by restricting the search to a 1D manner. From the DNAEC to the DNAIS, DPO4 undergoes a short-range translocation on DNA by forming the native contacts with DNA, primarily through the LF domain and linker region (Figure ). Further analysis of the structural distribution of DPO4 during DNA binding shows that the population of the DPO4I form decreases significantly and the population of the DPO4A form increases upon forming the DNAIS state (Figure S16). It is worth noting that the most populated forms of DPO4 in the DNAUS, DNAEC, and DNAIS are the DPO4A, DPO4I, and DPO4A, respectively. 
The observation leads to the “backtracking” of DPO4 conformation during DNA binding.[62,63] In addition, we found that the transformation of the nonspecific DNAEC to the specific DNAIS is the rate-limiting step for DPO4–DNA binding and can be accelerated by strengthening the DPO4–DNA native contact interactions. This underlines the importance of the specific interactions in guiding and promoting DPO4–DNA binding. In cells, the coordination of the DNA polymerase is usually undertaken by the sliding clamps (PCNA and bacterial β-clamp).[64−66] Structural and biochemical studies revealed that DPO4 binds to PCNA with multiple conformations.[12] As revealed by our recent study,[67] the specific conformational adaption of DPO4 coupled with PCNA binding may be advantageous to regulate the activity and the accessibility of DPO4 at the replication site. Here, we found that the translocation of DPO4 to the replication site on DNA is slow because of the energetically frustrating protein–DNA landscape led by the nonspecific electrostatic interaction during the DNA searching process.[43,68,69] In this regard, we suggest that this process can be accelerated by PCNA in vivo, which provides the guiding interactions to position DPO4 to the spatial proximity of the DNA replication site. Although our previous SBM simulations observed the similar multistep DPO4–DNA binding process,[16,22] the conformational distribution of the DPO4 in the dissociative state and the DNA-binding interactions in the models were not calibrated to the experiments. Furthermore, the binding reaction coordinate was not optimally chosen, so that the precise characterization of the DPO4–DNA binding process was not possible. Here with the current well-calibrated model, we determined the binding mechanisms of the complex DPO4–DNA binding process, including the “backtracking” in DPO4 upon forming the DNAIS state and “conformational selection” of DPO4 during the last transition of the DNAIS to DNABS state. 
Therefore, we found that the well-calibrated double-well two-bead SBM developed here goes beyond our previous models in studying DPO4 folding[14,15,22] and DNA binding[16,22] in the following three aspects. First, we upgraded the one-bead model to the two-bead one. We demonstrated that, relative to the one-bead model, the two-bead model naturally reduces the contribution of the interdomain interactions to the total energy. Weak interdomain interactions in DPO4 are required for efficient folding and DNA binding.[16] In addition, it has been recognized that the presence of the side chain in the two-bead model allows better placement of the charges than in the one-bead model,[37] which matters because the electrostatic interactions are important for both the DPO4 folding and the DNA binding processes.[13,17] Second, the simulations of DPO4 with the double-well SBM led to the observation of a metastable DPO4I state, which could not be characterized by the single-basin SBM developed for DPO4 folding to the apo structure.[15,16] The DPO4I state was further identified as the intermediate state for both folding and conformational transition, which advances the understanding of the interplay between DPO4 folding and conformational dynamics. Third, the DPO4–DNA binding model was calibrated to the experiments, and simulations were performed with a carefully determined reaction coordinate. This has enabled us to dissect the underlying mechanisms of conformational dynamics in DPO4 during its multistep binding to DNA.

Nucleotide binding goes through a typical two-state process associated with a high energy barrier. There are only minor structural changes in the DPO4–DNA complex after nucleotide binding, leading to similar energy landscapes of DPO4 with and without nucleotide binding (Figure ). However, there are notable changes in a few contacts in the DPO4–DNA complex during the nucleotide binding process. 
The nucleotide can destabilize several interactions surrounding the active site within the F domain, at the F–P domain interface and the interface between the F domain and DNA at the binding transition states. Protein structure opening for recruiting a substrate via partial protein unfolding, particularly at the binding site, was previously found in other protein systems.[51,52] This again underlines the importance of frustrations in protein structure for functional purposes. Interestingly, we found that the changes in interactions led by nucleotide binding are mainly associated with the F domain in DPO4. The flexibility inside the F domain and the fluctuating interactions at the interfaces of the F domain were previously characterized to have a potential contribution in catalyzing the translesion synthesis across various DNA lesions,[16,70,71] the in vivo role of DPO4 as a Y-family polymerase.[1,2] Here, we suggest a positive role of the small and intrinsically fluctuating F domain in facilitating nucleotide binding. Our frustration analyses show that the T–LF domain interface is highly frustrated in the apo DPO4 state, and DNA binding increases the degree of frustration in the F domain and the domain interfaces of F–P and P–T. The frustrated regions and interactions have a high propensity to promote the specific conformational changes during substrate binding. Therefore, the results through calculating the energetic frustration at the native state[32] resonated with our SBM simulation findings and further provided a different way to dissect the roles of local DPO4 conformational dynamics in its functional substrate binding processes. 
It has been suggested that DPO4 can readily accept the damaged or mismatched base pairs during low-fidelity DNA polymerization due to the small energetic cost of adapting the DPO4 conformation to accommodate the base pair at the active site.[72] The notable enhancement of frustration in the polymerase core at the DPO4–DNA binary state observed in our study can induce the conformational flexibility in the DPO4–DNA complex, in particular at the active site, thus in favor of the recruitment of the incoming nucleotide. We further performed similar frustration analyses on a high-fidelity DNA polymerase, the DNA polymerase I large fragment from a thermostable strain of Bacillus stearothermophilus (Bacillus fragment, BF) (Figure S27).[73−75] We found that DNA binding induces the stabilization of the F domain in BF by decreasing the degree of the localized frustration (Figure S28). Meanwhile, the frustration index of the contacts within the P domain and at the F–P domain interface have only subtle changes upon DNA binding. The observations are very different from those of DPO4, where the F and P domains, as well as the F–P domain interface, are destabilized by DNA binding with decreasing the frustration index of the associated contacts. It has been well-known that the F domain is critical for modulating the fidelity of the DNA polymerase as it forms the contacts with the replicating base pair.[76] In this regard, we suggest that the stable F domain in the BF–DNA binary complex contributes to the sterically tight active site. This further promotes establishing the contacts between the F domain and the replicating base pair, responsible for the fidelity-checking mechanisms of nucleotide incorporation. The distinct results from the frustration analyses on DPO4 and BF indicate the potential connections of the localized frustrations to the polymerase fidelity. 
Therefore, our study provides a plausible explanation on the origin of the low-fidelity DNA polymerization by DPO4 from the conformational frustration and dynamical perspective. In this study, we developed the SBMs to study the DPO4’s global folding, local conformational transition, DNA binding, and nucleotide binding. We provided a full picture of conformational dynamics in DPO4 during its precatalytic substrate binding processes and characterized its relation and impacts on the substrate binding. Together with the localized frustration analyses, we emphasized the importance of the conformational dynamics and structural fluctuations of DPO4 in promoting the conformational transition from the inactive to active state, which forms the bound DPO4–DNA complex and facilitates nucleotide binding. Our findings provided mechanistic insights into the DPO4 conformational dynamics upon substrate binding. We anticipate that the results from the DPO4 study can be used to understand the conformational dynamics of other Y-family DNA polymerases, as they have the conserved structural architecture[4−8] with the flexible charged linker, which promotes the intermediate state formation.[13]

Materials and Methods

A coarse-grained SBM was developed for studying the DPO4 conformational dynamics and its binding to DNA and a nucleotide. SBM is inspired by the energy landscape theory,[56,77] which assumes a “minimally frustrated” funnel-like energy landscape biased toward the native state of folding and binding. Thus, SBM only considers the interactions in the protein native structure, so the relevant protein folding and binding processes can be accelerated. SBMs have been widely applied in studying various protein dynamics, including protein folding,[78,79] protein–DNA recognition,[42,43,80] the binding–folding of intrinsically disordered proteins,[27,38,81,82] and protein aggregation.[83] The results obtained from these simplified models were found to be consistent with experiments in many aspects,[78,84,85] confirming the validity of the SBMs. For DPO4, we adapted the SBM, which often exhibits one basin representing the native state, to the double-well SBM, which has two basins corresponding to the apo DPO4 state and the ternary DPO4–DNA–nucleotide state. Each residue in DPO4 is represented by two beads (except glycine), with one bead placed at the Cα position and the other placed at the centroid of the side chain. One unit charge was assigned to lysine and arginine (positive) and to glutamic acid and aspartic acid (negative). The SBM potential for DPO4 used in our study comprises the following terms: VLocalDPO4 describes the local interactions, including the bond stretching, angle bending, dihedral rotation, and chirality maintenance. Each term of VLocalDPO4 (except bond stretching) has two potential minima with the positions adapted from the DPO4 apo and ternary crystal structures; VNativeDPO4 is the nonlocal native biasing potential, based on a mixture contact map from the DPO4 apo and ternary crystal structures; VNon-nativeDPO4 represents the volume-excluding potential; and VElectrostaticDPO4 describes the electrostatic interactions through the Debye–Hückel model. 
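To make the salt dependence of the Debye–Hückel term concrete, the sketch below evaluates a screened Coulomb interaction between two unit charges. It is a generic illustration, not the parameterization of this model: the simulations in this study used reduced units, whereas the example works in kJ/mol and nm, uses the textbook Debye length of a 1:1 electrolyte in water near 25 °C (about 0.304/√I nm, with I in mol/L), and assumes a dielectric constant of 80. The function names are hypothetical.

```python
import math

# Electric conversion factor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2.
COULOMB_KJ_NM = 138.935

def debye_length_nm(salt_molar):
    """Debye screening length (nm) for a monovalent salt in water near 25 C."""
    return 0.304 / math.sqrt(salt_molar)

def debye_huckel_energy(q_i, q_j, r_nm, salt_molar, dielectric=80.0):
    """Screened Coulomb energy (kJ/mol) between charges q_i and q_j (in e)
    separated by r_nm. The dielectric constant is an assumed value."""
    screening = math.exp(-r_nm / debye_length_nm(salt_molar))
    return COULOMB_KJ_NM * q_i * q_j / (dielectric * r_nm) * screening
```

Lowering the salt concentration lengthens the screening length, so the attraction between a positively charged residue bead and a DNA phosphate bead reaches further, which is the regime in which the nonspecific encounter complex is favored.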
In our previous study,[16] we found that the application of the default homogeneous strength of the intra- and interdomain native contacts in the SBM does not result in the efficient folding and DNA binding processes for DPO4 from the kinetic aspects. From the evolutionary perspective, proteins are deemed to be evolved to optimize folding and function.[59] Slightly decreasing the interdomain native contacts in the homogeneous SBM can accelerate DPO4 folding and achieve efficient DPO4–DNA binding. The findings appear to be reasonable considering the fact that there are a large number of hydrophobic residues within the domains of DPO4,[4] so the intradomain interactions have been naturally strengthened in stabilizing the native structure of DPO4. Thus, it is important to take into account the heterogeneity of the interactions and weaken the interdomain interactions in the SBM. However, there is no experimental data serving as quantitative guidance to determine the strength of the interdomain interaction; here, we used our previous study as a reference.[16] We used the single-basin one-bead SBM to study DPO4 folding and DNA binding previously.[16] We found that the optimal strength of the interdomain native contacts should be rescaled to 0.7–0.8 in order to achieve efficient DPO4 folding and DNA (un)binding. This results in a 10.95%–12.51% proportion of interdomain energetic contribution to the total, regarded as the optimal values. Here, we calculated the proportions of the energetic contribution of the interdomain interactions to the total energy of the apo structure and ternary structure with the default parameters of the double-well two-bead SBM. We found that the percentages are 12.91% and 12.40%, respectively. These two values are close to the range suggested by our previous study using the single-basin one-bead SBM. 
In other words, our current model with default parameters for the intra- and interdomain interaction strengths naturally generates an optimal energetic contribution of the interdomain interactions to the total energy for efficient DPO4 folding and DNA binding. Therefore, we used the default intra- and interdomain interaction strength in the current model.

Further calibration of the strengths of the native contacts from the apo and ternary structures in building the mixed contact map was performed. This was realized by modulating the strengths and generating the probability distribution of DPO4 at the DPO4A, DPO4I, and DPO4T states. In principle, strengthening the contacts derived from the apo (ternary) structure should increase the probability of the DPO4A (DPO4T) state (Figure S5). We determined the strengths of the native contacts based on the following two experimental observations. First, the crystal structure of DPO4 indicates that DPO4 should be mainly in the apo structure at room temperature.[11] Second, increasing temperature leads to an increase of the population of DPO4 in the ternary structure in solution.[26] In practice, we applied the thermodynamic reweighting method to the data generated from the simulations with the default SBM parameters to obtain the thermodynamic results at the other designated parameters.[16,39,86]

For DNA, we used the short DNA segment (primer/template 13/17-mer DNA substrate) present in the ternary crystal structure.[4] Each nucleotide was reduced into three beads, representing the sugar, base, and phosphate groups, respectively. The phosphate pseudobead was modeled to carry one negative charge. In the simulations, the short DNA segment was used and set to be rigid. This choice rests on the following two considerations. 
First, the binding of DPO4 to DNA is coordinated by PCNA in vivo.[65,66] During DPO4–DNA binding, PCNA binds DPO4 and relocates it to the vicinity of the DNA replication site, so DPO4 does not have to perform 1D diffusion along a long DNA molecule. Instead, a combination of short-range 3D diffusion and local 1D diffusion appears appropriate to describe DPO4–DNA binding. Second, DNA is stiff, with a persistence length of ∼50 nm (∼150 bp).[87] Because the DNA segment used here is much shorter than this, the effects of DNA conformation and flexibility on DPO4 binding should be negligible. We note that the DNA model can be further improved by taking into account the DNA conformational flexibility while still using the SBMs for the proteins.[88−90] The potential of the DPO4–DNA system is expressed as follows:

$$V_{\text{DPO4-DNA}} = V_{\text{DPO4}} + V_{\text{Non-local}}^{\text{DPO4-DNA}}$$

where $V_{\text{DPO4}}$ is the SBM potential of DPO4 and $V_{\text{Non-local}}^{\text{DPO4-DNA}}$ is made up of the native, non-native, and electrostatic interaction potentials between the DPO4 and DNA chains. The strength of the interchain native contacts between DPO4 and DNA was calibrated in accordance with the experimental affinity.[24,28]

For the nucleotide, we determined the native contacts in the ternary structure.[4] The nucleotide was coarse-grained into five beads, representing the base, sugar, two phosphate groups, and one calcium ion. The potential of the DPO4–DNA–nucleotide system is expressed as follows:

$$V_{\text{DPO4-DNA-NT}} = V_{\text{DPO4-DNA}} + V_{\text{SBM}}^{\text{NT}} + V_{\text{Non-local}}^{\text{DPO4,DNA-NT}}$$

where $V_{\text{SBM}}^{\text{NT}}$ biases the nucleotide toward its native structure in the crystal structure with a typical SBM expression and $V_{\text{Non-local}}^{\text{DPO4,DNA-NT}}$ describes the nonlocal interactions of the nucleotide with DPO4 and DNA. The strength of the interchain native contacts between the nucleotide and the DPO4–DNA complex was calibrated in accordance with the experimental affinity.[24,25]

Simulations were performed with the Gromacs software (version 4.5.7).[91] Reduced units were used throughout the simulations, except for length, which is in units of nm or Å.
For DPO4 folding, we performed two sets of REMD simulations starting from DPO4 structures in the apo and ternary forms, respectively. For DNA and nucleotide binding, we performed umbrella sampling simulations along the corresponding binding reaction coordinate Q#, which is expressed as Q# = N·Q − dRMS, where Q is the fraction of interchain native contacts for substrate binding (DNA or nucleotide), N is the number of the interchain native contacts, and dRMS is the root-mean-square deviation of the distances of the native contact pairs from their values in the bound structure. For the SBMs, Q has been regarded as a good reaction coordinate for describing protein folding[92] and for applying the biasing potentials in umbrella sampling simulations.[93,94] However, for protein binding, Q was found to be incapable of discriminating among different unbound conformations,[39] all of which have an interchain Q value of 0. Discriminating among the unbound states is critical for determining the binding and unbinding pathways. Previously, we used dRMS, which measures the degree of dissociation relative to the bound structure. Although dRMS has been proved effective for studying protein binding in umbrella sampling simulations,[95,96] we previously found that dRMS does not capture well the conformational differences after the ligand anchors to the target protein.[36] We therefore applied the biasing potentials on the reaction coordinate Q#, which contains the information from both Q and dRMS. When the substrate is unbound from DPO4, Q ∼ 0 and the change of Q# depends strongly on the change of dRMS, which is able to discriminate among the unbound states; when the substrate approaches the binding site, dRMS becomes small, close to 0, so Q# relies mainly on Q, which has been proven to be an optimal reaction coordinate for the SBMs.
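The definition of Q# given above can be illustrated with a minimal Python sketch. It is not taken from the authors' scripts. The contact criterion (a pair counts as formed when its distance is below 1.2 times the native distance), the function name `q_sharp`, and the use of a single length unit for dRMS are assumptions made here for illustration only.

```python
import numpy as np

def q_sharp(dists, native_dists, cutoff_factor=1.2):
    """Return (Q#, Q, dRMS) for one frame.

    dists         : current distances of the interchain native contact pairs
    native_dists  : distances of the same pairs in the bound (native) structure
    cutoff_factor : ASSUMED contact criterion, formed if d < cutoff_factor * d0
    """
    d = np.asarray(dists, dtype=float)
    d0 = np.asarray(native_dists, dtype=float)
    if d.shape != d0.shape or d.ndim != 1:
        raise ValueError("dists and native_dists must be 1D arrays of equal length")

    n = d0.size                                # N: number of interchain native contacts
    q = float(np.mean(d < cutoff_factor * d0)) # Q: fraction of formed native contacts
    drms = float(np.sqrt(np.mean((d - d0) ** 2)))  # dRMS relative to the bound structure
    return n * q - drms, q, drms

# Toy data: four native contact pairs (arbitrary length units).
native = np.array([0.5, 0.6, 0.7, 0.8])
bound = q_sharp(native, native)        # all contacts formed, dRMS = 0
near = q_sharp(native + 3.0, native)   # unbound, substrate close by
far = q_sharp(native + 5.0, native)    # unbound, substrate farther away
```

In this toy case both unbound frames have Q = 0, yet their Q# values differ through dRMS, which is the property the text relies on to separate unbound states.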
In this regard, Q# can provide a comprehensive description of both the unbound states and the states after the substrate initiates its interactions with DPO4, thereby allowing the (un)binding pathways to be characterized. The umbrella sampling simulations were conducted with the aid of the PLUMED plugin (version 2.5.0).[97] Three sets of umbrella sampling simulations with different initial structures were performed at each binding contact strength or salt concentration. The multiple trajectories in each set of simulations were analyzed by the Weighted Histogram Analysis Method (WHAM).[30] The trajectories were further analyzed by the reweighting method, which uses the principles of statistical mechanics to obtain the thermodynamic results at other parameters in the SBMs.[16,39,86] The details of the models and simulations can be found in the Supporting Information.

Frustration analyses were carried out with the frustratometer server.[98] The server uses the associative memory, water mediated, structure, and energy model (AWSEM), in which a coarse-grained representation of each residue with interaction parameters optimized from landscape theory is used.[99] The latest version of AWSEM, which considers the electrostatic interactions,[100] is included in the frustratometer server and was used in this study. We used the crystal structures of DPO4 in the apo (PDB: 2RDI[11]), binary (PDB: 2RDJ[11]), and ternary (PDB: 1JX4[4]) states to perform the frustration analyses. The details of the method can be found elsewhere.[98]

The files needed for setting up the Gromacs (version 4.5.7 with PLUMED version 2.5.0) simulations and the analysis programs/scripts are publicly available at https://osf.io/sj86k/.
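The reweighting step mentioned above (obtaining thermodynamic results at other SBM parameters from data sampled at the default ones) can be sketched as follows. This is a generic single-state Boltzmann reweighting written for illustration, not the authors' implementation. It assumes the energy depends linearly on the rescaled contact strength, and the names `reweighted_average`, `contact_energy`, and `base_weights` are hypothetical. For umbrella sampling data, `base_weights` would hold the per-frame unbiasing weights (for example from WHAM); for unbiased data it can be omitted.

```python
import numpy as np

def reweighted_average(observable, contact_energy, eps_old, eps_new,
                       kT=1.0, base_weights=None):
    """Estimate <observable> at contact strength eps_new from frames sampled at eps_old.

    contact_energy : per-frame energy of the rescaled contacts per unit strength,
                     so that their contribution to the energy is eps * contact_energy
                     (ASSUMED linear dependence on eps)
    base_weights   : optional per-frame weights that already remove any sampling bias
    """
    a = np.asarray(observable, dtype=float)
    u = np.asarray(contact_energy, dtype=float)

    # Boltzmann factor for the energy change caused by rescaling the contacts.
    log_w = -(eps_new - eps_old) * u / kT
    log_w -= log_w.max()              # guard against overflow in exp
    w = np.exp(log_w)
    if base_weights is not None:
        w = w * np.asarray(base_weights, dtype=float)
    w /= w.sum()
    return float(np.sum(w * a))

# Toy data: frame 0 has many contacts formed (low energy), frame 1 has none.
in_state = [1.0, 0.0]                 # indicator: frame belongs to the state of interest
u_contacts = [-10.0, 0.0]
p_default = reweighted_average(in_state, u_contacts, 1.0, 1.0)
p_stronger = reweighted_average(in_state, u_contacts, 1.0, 1.1)
```

In this toy case strengthening the contacts raises the estimated population of the state that forms them, in line with the calibration logic described in the text. The estimate is only reliable when the new parameters stay close enough to the sampled ones that the two ensembles overlap.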

96 in total

1. Crystal structure of a DinB family error-prone DNA polymerase from Sulfolobus solfataricus.

Authors: L F Silvian; E A Toth; P Pham; M F Goodman; T Ellenberger
Journal: Nat Struct Biol Date: 2001-11

2. GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation.

Authors: Berk Hess; Carsten Kutzner; David van der Spoel; Erik Lindahl
Journal: J Chem Theory Comput Date: 2008-03 Impact factor: 6.006

3. Ligand-induced global transitions in the catalytic domain of protein kinase A.

Authors: Changbong Hyeon; Patricia A Jennings; Joseph A Adams; José N Onuchic
Journal: Proc Natl Acad Sci U S A Date: 2009-02-09 Impact factor: 11.205

4. Dynamic energy landscape view of coupled binding and protein conformational change: induced-fit versus population-shift mechanisms.

Authors: Kei-Ichi Okazaki; Shoji Takada
Journal: Proc Natl Acad Sci U S A Date: 2008-08-04 Impact factor: 11.205

5. Probing possible downhill folding: native contact topology likely places a significant constraint on the folding cooperativity of proteins with approximately 40 residues.

Authors: Artem Badasyan; Zhirong Liu; Hue Sun Chan
Journal: J Mol Biol Date: 2008-09-17 Impact factor: 5.469

6. Protein Assembly and Building Blocks: Beyond the Limits of the LEGO Brick Metaphor.

Authors: Yaakov Levy
Journal: Biochemistry Date: 2017-08-31 Impact factor: 3.162

7. Kinetic basis for the differing response to an oxidative lesion by a replicative and a lesion bypass DNA polymerase from Sulfolobus solfataricus.

Authors: Brian A Maxwell; Zucai Suo
Journal: Biochemistry Date: 2012-04-10 Impact factor: 3.162

8. Mechanistic Basis for the Bypass of a Bulky DNA Adduct Catalyzed by a Y-Family DNA Polymerase.

Authors: Rajan Vyas; Georgia Efthimiopoulos; E John Tokarsky; Chanchal K Malik; Ashis K Basu; Zucai Suo
Journal: J Am Chem Soc Date: 2015-09-11 Impact factor: 15.419

9. Confinement and Crowding Effects on Folding of a Multidomain Y-Family DNA Polymerase.

Authors: Xiakun Chu; Zucai Suo; Jin Wang
Journal: J Chem Theory Comput Date: 2020-01-30 Impact factor: 6.006

10. Kinetic Mechanism of DNA Polymerases: Contributions of Conformational Dynamics and a Third Divalent Metal Ion. (Review)

Authors: Austin T Raper; Andrew J Reed; Zucai Suo
Journal: Chem Rev Date: 2018-06-04 Impact factor: 60.622