Emmanuelle Bignon1, Natacha Gillet1, Chen-Hui Chan1, Tao Jiang1, Antonio Monari2, Elise Dumont1,3. 1. Univ. Lyon, ENS de Lyon, CNRS UMR 5182, Université Claude Bernard Lyon 1, Laboratoire de Chimie, F69342 Lyon, France. 2. Université de Lorraine and CNRS, LPCT UMR 7019, 54000 Nancy, France. 3. Institut Universitaire de France, 5 rue Descartes, 75005 Paris, France.
Abstract
The combination of several closely spaced DNA lesions, which can be induced by a single radical hit, constitutes a hallmark in the DNA damage landscape and radiation chemistry. The occurrence of such a tandem base lesion gives rise to a strong coupling with the double helix degrees of freedom and induces important structural deformations, in contrast to DNA strands containing a single oxidized nucleobase. Although such complex lesions are known to be refractory to repair by DNA glycosylases, there is still a lack of structural evidence to rationalize these phenomena. In this contribution, we explore, by numerical modeling and molecular simulations, the behavior of the bacterial glycosylase responsible for base excision repair (MutM), specialized in excising oxidatively-damaged defects such as 7,8-dihydro-8-oxoguanine (8-oxoG). The difference in lesion recognition between a simple damage and a tandem lesion featuring an additional abasic site is assessed at atomistic resolution owing to microsecond molecular dynamics simulations and machine learning postprocessing, allowing to extensively pinpoint crucial differences in the interaction patterns of the damaged bases. Our results reveal substantial changes in the interaction network surrounding the 8-oxoG upon addition of an adjacent abasic site, leading to the perturbation of the intercalation triad which is crucial for lesion recognition and processing. The recognition process might also be impacted by a more constrained MutM-DNA binding upon tandem damage, as shown by the machine learning post-processing. This work advocates for the use of such high throughput numerical simulations for exploring the complex combinatorial chemistry of tandem DNA lesions repair and more generally local multiple damaged sites of the utmost significance in radiation chemistry.
The combination of several closely spaced DNA lesions, which can be induced by a single radical hit, constitutes a hallmark in the DNA damage landscape and radiation chemistry. The occurrence of such a tandem base lesion gives rise to a strong coupling with the double helix degrees of freedom and induces important structural deformations, in contrast to DNA strands containing a single oxidized nucleobase. Although such complex lesions are known to be refractory to repair by DNA glycosylases, there is still a lack of structural evidence to rationalize these phenomena. In this contribution, we explore, by numerical modeling and molecular simulations, the behavior of the bacterial glycosylase responsible for base excision repair (MutM), specialized in excising oxidatively-damaged defects such as 7,8-dihydro-8-oxoguanine (8-oxoG). The difference in lesion recognition between a simple damage and a tandem lesion featuring an additional abasic site is assessed at atomistic resolution owing to microsecond molecular dynamics simulations and machine learning postprocessing, allowing to extensively pinpoint crucial differences in the interaction patterns of the damaged bases. Our results reveal substantial changes in the interaction network surrounding the 8-oxoG upon addition of an adjacent abasic site, leading to the perturbation of the intercalation triad which is crucial for lesion recognition and processing. The recognition process might also be impacted by a more constrained MutM-DNA binding upon tandem damage, as shown by the machine learning post-processing. This work advocates for the use of such high throughput numerical simulations for exploring the complex combinatorial chemistry of tandem DNA lesions repair and more generally local multiple damaged sites of the utmost significance in radiation chemistry.
The chemical stability of DNA components is fundamental to maintain the genome stability, hence preventing unwanted mutations or cell death. Indeed, the accumulation of DNA lesions has been recognized as one of the principal causes of cancer development [1]. Although DNA maximizes its stability through its helical structure, its constituting nucleic acids are constantly exposed to damaging agents, either endogenous or exogenous, that inevitably lead to the production of lesions. Among the different sources of DNA lesions, most common oxidative agents include free radicals or reactive oxygen species (ROS), UV light, and ionizing radiations. In addition to the chemistry of oxidatively-generated lesions, ionizing radiations also damage DNA, as the excess energy along a single track is deposited along localized regions of space. This in turn leads to an accumulation of DNA damages (oxidation of purines, abasic sites and strand breaks) in close spatial proximity, i.e. the so-called local multiple damaged sites (MDS). MDS have recently been observed by atomic force microscopy (AFM) [2]. When two or more damaged nucleotides are present within one or two helical turns it is customary to refer of MDS as clustered lesions [3]. As a consequence of lesion formation, specific and highly efficient repair machineries exist that are able to recognize the presence of lesions in the genome and remove them to reinstate undamaged DNA strands [4], [5]. Failure in the regulation of DNA processes is also related to the development of debilitating pathologies such as Xeroderma pigmentosa [6] or to incidence, development, and morbidity of various cancers. As a consequence DNA repair enzymes also constitute a most attractive therapeutic target [7], [8].Specific DNA repair pathways may depend on the organisms, and are also related to the kind of lesions, for instance for localized oxidatively-induced damages the base excision repair (BER) pathway is preferred [9], [10], while for more extended and bulky lesions, such as base dimerization, the nucleotide excision repair (NER) mechanism is favored. Yet these sophisticated repair mechanisms have been reported to be strongly impaired when not only one but two adjacent DNA lesions are located on the same strand, the so-called tandem lesions. The formation of tandem lesions can derive from a single radical hit, and their biological impact is now well established. While their formation mechanism has been delineated [11], the reasons underlying their resistance to repair are more elusive and should be analyzed taking firmly into account specific structural modifications. The most common oxidative tandem lesions feature two adjacent oxidized nucleobases. In the following we will specifically consider 8-oxoguanine (8-oxoG) and an abasic apurinic/apyrimidinic site (Ap). This arrangement is particularly relevant because Ap are also the most common outcome of ionizing radiations after excision of an entire nucleobase [12].Interestingly, Ap sites also represent key intermediates of the BER machinery and result from the action of DNA glycosylases before being further processed and removed by endonucleases. The presence of a tandem lesion, or more generally local MDS, that are the hallmarks of radiation chemistry [13], induces strong coupling between the lesions that in turn is translated into important structural deformations of the nucleic acid as compared to its ideal structure, i.e. either undamaged strand or sequences containing an isolated lesion. The unusual structural deformations induced by tandem lesions or MDS may also well justify their globally lower repair rate as compared to other lesions [14], [15], [16], [17]. To cope with their frequency, canonical DNA lesions benefit from a most efficient repair. For instance, 8-oxoG, which is well-known to mismatch with adenine and hence is potentially mutagenic [18], is repaired by formamidopyrimidine DNA glycosylase. The latter recognizes the presence of 8-oxoG in the genome and specifically binds at the damaged site [19]. Many studies have contributed to dissect the mode of action of the bacterial glycosylase MutM in presence of a single 8-oxoG in particular concerning the recognition of the lesion [20], [21], [22]. MutM [23], [24] has been shown to recognize 8-oxoG among other oxidatively-induced lesions and to subsequently proceed to its extrusion initiating the base excision process [9], [10]. The mechanisms of recognition [25] and extrusion [26], [27], [23] of 8-oxoG have been scrutinized through a series of techniques, including molecular modeling and simulations, and are now relatively well characterized. Recently, Simmerling et al. [28], while recognizing the role of the damaged base flipping in favoring its recognition, have also pinpointed the existence of preliminary recognition steps correlating with the rapid sliding of MutM along the DNA strand that is incompatible with a recognition mechanism based on the systematic flip of all the bases. In addition, the same authors have also identified that 8-oxoG flips preferably through the major groove. The free energy required for the extrusion of 8-oxoG in extrahelical position has also been estimated by La Rosa and Zacharias [22], also taking into account the contributions due to the DNA global bending and twisting. A most important feature of MutM efficiency has been traced back to the crucial M77, R112, and F114amino acids triad – see Fig. 1-A. Indeed, it permits to disrupt 8-oxoG interactions within the DNA helix by intercalating above the 8-oxoG position, thereby facilitating its extrusion towards the active site. Besides, other several important MutM residues (K60, H74, Y242, K258, and R264) are known to stabilize the DNA helix by interacting with its backbone [26]. Interestingly, iron-sulfur cluster cofactors can play an important role in lesion processing by repair enzymes [29] including glycosylases (as shown for MutY and its homologs [30]), highlighting the finely-tuned redox mechanisms which unfortunately require very complex techniques to be modeled.
Fig. 1
(A) Cartoon representation of the bacterial MutM in interaction with a 13-bp double stranded DNA helix harboring 8-oxoguanine (OG19) as the 19th nucleobase – PDB IDcode3GO3GO8 [27]. The magnified section highlights the position of the catalytic triad (M77, R112, and F114 in green) and the residues interacting with the DNA backbone (in orange) around the damage. (B) Sequence of the 13-bp oligonucleotide, showing the position of the 8-oxoguanine (in red). In simulations with tandem lesions, dG20 is mutated in silico into an abasic site (Ap). (C) Chemical structure of the 8-oxoguanine and the abas.ic site lesions.
(A) Cartoon representation of the bacterial MutM in interaction with a 13-bp double stranded DNA helix harboring 8-oxoguanine (OG19) as the 19th nucleobase – PDB IDcode3GO3GO8 [27]. The magnified section highlights the position of the catalytic triad (M77, R112, and F114 in green) and the residues interacting with the DNA backbone (in orange) around the damage. (B) Sequence of the 13-bp oligonucleotide, showing the position of the 8-oxoguanine (in red). In simulations with tandem lesions, dG20 is mutated in silico into an abasic site (Ap). (C) Chemical structure of the 8-oxoguanine and the abas.ic site lesions.On the other hand, several studies have addressed the behavior of tandem lesion-containing oligonucleotides, either from a biochemical and repair perspective [15] or from a structural point of view [31], also relying on molecular modeling and simulations [32], [33], [17]. Globally, the different approaches agree in pointing out a strong effect of closely spaced lesions in modifying the structure and dynamics of the oligonucleotide. In addition, strong sequence effects, depending both on the relative position of the cluster lesions and on the nearby undamaged bases contribute to the complexity of the global landscape. The interaction of MDS-containing oligonucleotides with repair enzymes and in particular both E. Coli and human endonucleases [34], [16] has been reported. The perturbations exerted by the secondary lesion on the protein/DNA contact regions, and the consequent decrease in its repair efficiency, as observed for some particular tandem lesions, have also been highlighted [16], [35], [36], [37], [38]. However, no analysis of the structural behavior of MutM in presence of tandem DNA lesions has been reported, despite the relevance that such lesions may assume in conditions of strong oxidative stress or ionizing radiations.In this work, we take advantage of the existing knowledge of 8-oxoG recognition by MutM to investigate the structural and dynamic impact of the presence of a second, adjacent lesion, namely an Ap site. Relying on all-atom, explicit-solvent molecular dynamics [39], [16] we simulate the structural and dynamical behavior of a MutM:DNA complex – see Fig. 1-B. We consider both the situation in which only a single lesion (8-oxoG) is present and compare it to the one in which the adjacent guanine base dG20, present in the X-ray structure (PDB ID 3GO8), has been in silico mutated to an Ap site – see Fig. 1. We clearly show that the presence of the tandem lesion induces important structural deformations to the DNA that significantly perturb the protein/nucleic acid interaction pattern, hence being susceptible to alter the 8-oxoG extrusion. We extensively describe the changes in the 8-oxoG lesion structural signature upon the presence of an adjacent Ap site and the perturbation of the interaction network with MutM, which contribute to ultimately diminish the recognition efficiency.
Results
We report the structural and dynamical properties of the bacterial MutM interacting with a 13-bp DNA sequence harboring either a single 8-oxoG at the position 19 (as found in the crystal structure PDB ID3GO8) or 8-oxoG coupled with an Ap site at position 20, along two replicas reaching 1s MD simulation time each. The numbering of the nucleic acids used hereafter corresponds to the one in Fig. 1-B; the numbering of MutM residues refers to the crystallographic structure (PDB ID3GO8).
Tandem lesions impact the interaction network around 8-oxoG
The interaction network as found in the MutM:DNA crystal containing a single 8-oxoG lesion is stable along our MD simulations. A most important structural feature in MutM is its intercalation triad, consisting of the M77, R112 and F114 residues. Those three amino acids are located around the 8-oxoG in the minor groove, weakening the stabilizing interactions of the lesion within the double-helix to facilitate its extrusion. R112 interacts with the complementary dC7, while M77 and F114 intercalate directly above 8-oxoG and disrupt the stable -stacking with the adjacent base-pair – see Fig. 2-A. These interactions are persistent along the entire MD simulations of the singly-damaged system. F114 is involved in -staking with dG20 over 91.8% of the time series, with the distance between heavy atoms of their aromatic rings averaging at 5.35 0.01 Å. The rest of the time, F114 interacts transiently with dC7 facing dG20 and their aromatic rings maintain a distance of 6.73 0.01 Å – see Fig. 3. M77 intercalates between OG19 and dG20, as its terminal methyl group remains at 4.80 0.01 Å of the N9 atom involved in the N-glycosidic bond, and is ideally positioned to act on OG19 desoxyribose moiety to drag it outwards [26].
Fig. 2
Cartoon representation of MutM interacting with the DNA helix harboring a single 8-oxoG lesion (OG19, A) or tandem lesion 8-oxoG + Ap (OG19 and Ap20, B). H-bonds are depicted as dashed pink lines and the DNA structure is rendered transparent for sake of clarity. Upon local multiple lesions, the interaction pattern around 8-oxoG (OG19) is perturbed. The intercalation triad M77/R112/F114 is shifted down by R76 which comes to interact between Ap20 and the facing dC7, preventing M77 ad F114 intercalation above 8-oxoG. R264, normally interacting with the DNA backbone phosphates between positions 19 and 20, is now involved in hydrogen bonding with OG19 carbonyl.
Fig. 3
Distribution of relevant distances involving the intercalation triad (A) and R76 (B) upon a single 8-oxoG mutation (single) or Ap + 8-oxoG lesions (tandem). The presence of the Ap site at position 20 makes M77 and R112 move away from 8-oxoG and the facing dC8. Steric hindrance provoked by R76 movement pushes F114 closer to dC7. R76 comes closer to 8-oxoG and intercalates in the gap left by the abasic site, with formation of very stable H-bonds bridging Ap20 and the facing dC7. It also interacts with F114, preventing it to stack within the double-helix.
Cartoon representation of MutM interacting with the DNA helix harboring a single 8-oxoG lesion (OG19, A) or tandem lesion 8-oxoG + Ap (OG19 and Ap20, B). H-bonds are depicted as dashed pink lines and the DNA structure is rendered transparent for sake of clarity. Upon local multiple lesions, the interaction pattern around 8-oxoG (OG19) is perturbed. The intercalation triad M77/R112/F114 is shifted down by R76 which comes to interact between Ap20 and the facing dC7, preventing M77 ad F114 intercalation above 8-oxoG. R264, normally interacting with the DNA backbone phosphates between positions 19 and 20, is now involved in hydrogen bonding with OG19 carbonyl.Distribution of relevant distances involving the intercalation triad (A) and R76 (B) upon a single 8-oxoG mutation (single) or Ap + 8-oxoG lesions (tandem). The presence of the Ap site at position 20 makes M77 and R112 move away from 8-oxoG and the facing dC8. Steric hindrance provoked by R76 movement pushes F114 closer to dC7. R76 comes closer to 8-oxoG and intercalates in the gap left by the abasic site, with formation of very stable H-bonds bridging Ap20 and the facing dC7. It also interacts with F114, preventing it to stack within the double-helix.R112 side chain amino groups form H-bonds with the nitrogen and carbonyl of dC8 over 76.8% of the simulation time, the distance between these two atom groups being of 5.12 0.01 Å. Several other amino acids have been identified to stabilize the MutM:DNA complex by interacting with the negatively-charged phosphate groups of the backbone namely K60, H74, Y242, K258, and R264. These interactions are stable in our simulations and the highly conserved R264 forms strong H-bonds between OG19 and dG20phosphate groups – as shown in Figure S1. Noteworthy, R264 is known to play a role in 8-oxoG extrusion [28]. The structural behavior of MutM:DNA(8-oxoG) observed here corroborates the hypothesis of a highly dynamic system, whose functional flexibility is known to be central to ensure its biological role through the recognition and extrusion of 8-oxoG [40], [41].The dG20 Ap20 mutation induces a clear perturbation of this well-characterized interaction network. A first consequence is the perturbation of the dynamics of the intercalation triad. The presence of the abasic site involves, in the first 100 ns of simulation, a rapid reorientation of R76 situated just above the intercalation triad. R76 side chain turns towards the damaged site, and is found closer to OG19, at 7.78 0.02 Å vs. 11.84 0.02 Å observed in the singly 8-oxoG-containing duplex. R76 does not interact directly with OG19 but rather positions itself in the gap between the Ap site and the facing dC7, bridging the two residues through stable H-bonds as reported in Fig. 2-B. The distances between the R76 guanidinium nitrogens and the dC7/Ap20 H-bond acceptor atoms lie at 3.00 0.01 Å and 3.02 0.04 Å, respectively, in the tandem-damaged MutM:DNA complex. Comparatively, the dC7-R76 distance is of 8.14 0.03 Å in the singly damaged system – see Fig. 3-B. The reorientation of R76 reshapes the canonical interaction network of the intercalation triad, which is globally shifted downwards the duplex. F114 is pushed away from position 20 and comes closer to the opposite strand, the dC7-F114 distance drops to 5.65 0.01 Å, although its strong cation- interaction with R76 avoids direct stacking with dC7. Additionally, the distance between the R76 guanidinium extremity and the F114 aromatic ring is of 3.93 0.01 Å vs. 7.93 0.02 Å in the singly-damaged system, while the interaction of R112 with the estranged dC8 is destabilized. In presence of tandem lesions, R112 lies further from dC8 (5.93 0.02 Å) than what is observed for the singly-damaged complex (5.12 0.01 Å). The intercalation of M77 is prevented in presence of the tandem lesion since its terminal methyl group rotates away from OG19:N9 (5.31 0.01 Å), while the interaction with M77 corresponds to a more rigid binding mode, with the formation of a H-bond between the sulfur atom and one hydrogen of OG19. The specific interaction network emerging in case of R76 and F114amino acids appears as crucial to design a recognition pattern that prevents those residues to dive to fill the void in the nucleic acid/protein interface to compensate for the absence of an entire nucleobase. Even though further studies should be required to confirm its specific role, this can suggest that those two residues have an important, and rather unique, effect in assuring repairing of clustered damages.Overall, interactions between the DNA backbone and MutM tend to be stronger upon tandem damages than in the singly damaged duplex. The H-bond between K60 and the phosphate at position 20 is stronger as witnessed by the –NH3+ P distance that is reduced to 5.72 0.02 Å vs. 7.51 0.06 Å for the singly-damaged system, as well as the interaction of H24 with dA21 (NH – P distance of 4.64 0.01 Å vs. 5.25 0.03 Å with the isolated 8-oxoG) and Y242 H-bond with OG19 (OH – P distance of 5.45 0.03 Å vs 6.14 0.05 Å). However, R264 interaction with the DNA helix is strongly perturbed: in the singly-damaged system, R264 forms H-bonds with OG19 and dG20phosphates (CZ – P distance of 4.86 0.04 Å and 4.52 0.03 Å, respectively) which are disrupted upon the presence of the additional Ap site (CZ – P distance of 8.711 0.07 Å and 7.17 0.05 Å, respectively). The R264 position experiences important fluctuations in the tandem-damaged complex, and can form H-bond with OG19:O8 – see Fig. 2-B. the R264:CZ – OG16:O8 distance is below 4 Å for 42% of the simulation time in the tandem-damaged complex, while in the singly-damaged complex such short distance amounts to 11% only – see Figure S2.This first local analysis suggests that the singly- vs. tandem-damaged 13-bp duplex present different interaction patterns, with non trivial changes in the binding mode and its dynamics. In order to probe more extensively the structural and dynamic consequences of dG20 Ap20 substitution, we have relied on a recently-proposed machine-learning protocol [42] to identify other residues possibly implied in the recognition mechanism.
Systematic assessment of interacting residues through machine-learning protocol
In order to probe the residues that exhibit important interactions with the DNA duplex, a machine-learning protocol based on the principal components analysis (PCA) was set up. The latter allows to generate a “footprint” of the residues that are particularly involved in MutM:DNA bonding – see Fig. 4. A score function, in the following referred to as “importance”, is attributed to each residue: the higher the score, the higher the contribution to the MutM:DNA complex stabilization. Using a threshold of 0.04 of importance, 47 and 61 residues out of 273 single out in the singly and tandem-damaged system, respectively.
Fig. 4
Importance of the contribution of residues to the MutM:DNA complex bonding for the singly-damaged (dark red) and the tandem-damaged (green) systems. The threshold value above which the importance of the residue for the stabilization of the complex is considered as significant is 0.04. Some of the key-residues as well as the flexible loop region are pinpointed by the arrows. Contributions of amino acids to the bonding are mostly higher upon 8-oxoG/Ap combination, suggesting a more rigid complex upon local multiple damage sites than with an isolated 8-oxoG.
Importance of the contribution of residues to the MutM:DNA complex bonding for the singly-damaged (dark red) and the tandem-damaged (green) systems. The threshold value above which the importance of the residue for the stabilization of the complex is considered as significant is 0.04. Some of the key-residues as well as the flexible loop region are pinpointed by the arrows. Contributions of amino acids to the bonding are mostly higher upon 8-oxoG/Ap combination, suggesting a more rigid complex upon local multiple damage sites than with an isolated 8-oxoG.The three residues of the intercalation triad (i.e. M77, R112 and F114) show a slightly higher contribution in the tandem- (0.043, 0.042, 0.045) than in the singly-damaged system (0.041, 0.038, 0.037). R76 and E78, adjacent to M77 in the MutM sequence, also present high values. As highlighted by the visual inspection of our MD trajectories, in the tandem-damaged system, R76 flips towards the lesion site to compensate for the nucleobase removal at position 20 by bridging Ap20 to the facing dC7 through strong H-bonds. The importance score for R76 is 0.045 with tandem lesions vs 0.037 with isolated 8-oxoG, corroborating the significant role of this residue in MutM:DNA binding upon the presence of Ap20, in line with the newly-formed and very stable H-bonds with the lesion site. E78 importance is higher than the threshold in both tandem- (0.042) and singly- (0.043) damaged systems. This residue interacts with R112, contributing to the H-bonds network in the vicinity of the lesion. Adjacent to F114, G115 importance in the stabilization of the complex is also enhanced upon dG20 Ap20 mutation (0.042 in tandem vs. 0.033 with isolated 8-oxoG). Additional visual inspection of the MD trajectories reveals that G115 forms a strong H-bond with R76, helping in maintaining the latter intercalated between Ap20 and the facing dC7.Over the five key residues reported to anchor the phosphate DNA backbone (K60, H74, Y242, K258, and R264 [43]), only the closest to OG19 are associated with importance scores above the threshold of 0.04: Y242 (0.040 and 0.052 for the singly- and tandem-damaged system), K258 (0.037 and 0.040) and R264 (0.051 and 0.061). The G173 and Y176 residues also contribute to the H-bonding with the DNA backbone. N174 shows high importance values, 0.040 and 0.053 for singly- and tandem-damaged systems, this is due either to its interaction with the damaged site backbone or through indirect coupling with R264, as previously described in the literature [23]. I175 is involved in hydrophobic interactions with Y242 that in turn interacts with OG19 backbone. Among other residues whose contribution is above the threshold, R35 forms H-bonds with either dC8 or dC23 backbone, P130 and C166 interact with dT22:P, D165 and E150 maintain the 5’-terminus backbone of the DNA strand 1 (in the dG1 and dT2 surroundings), L164 stabilizes the position of the key-residue R264, while G263 and G265 form H-bonds with R264 or directly with the DNA backbone, and K258 interacts with the dC17 phosphate. Of note, the Q166C mutation kept from the crystal should have only a shy impact on the interaction with the DNA backbone, as methionine and glutamine share similar side chain length, we can safely hypothesize that a glutamine would have been pinpointed by the neural network analysis, probably with a higher contribution.Globally the PCA analysis clearly reveals that the protein residues comprised between the position 210 and 237 exhibit the highest values of importance. They correspond mostly to a large, flexible loop, comprising the residues 221–234 prior to the C-terminus that is prone to disorder, but also known to have an implication for DNA recognition despite being spatially far from the double-helix [27]. Amino acids at the N-terminus also show significant contributions to the MutM:DNA bonding. The proline located at the very end of the N-terminal region has an important catalytic role since it reacts with the C1’ atom of the deoxyribosesugar moiety of the 8-oxoguanine to form a Schiff base, and hence it induces the cleavage of the N-glycosidic bond which constitutes the first step of the repair process [44], [45]. Adjacent to P2, the vicinal E3 is also known to play a role in MutM catalytic efficiency [46], [47]. Interestingly, the contribution of these two residues to the MutM:DNA stability decreases from 0.054 in the singly-damaged to 0.044 for the tandem-damaged complex, hence corroborating a subtle reduction of the excision efficiency. Other residues of the N-terminus (L4, P5, E6) also show a drop in their contribution upon dG20 Ap20 mutation.The residues which single out in this PCA analysis match very well with the ones evidenced by previous works on MutM [26], [20], [21], [23], [25], [40]. Our machine-learning post-processing allows to disentangle a complex interaction pathway, which is already well-established for 8-oxoG-containing DNA [41] but perturbed upon the presence of tandem lesions as revealed by the present simulations. It allows to generate an exhaustive map of residues showing importance for the protein-DNA interactions, beyond the simple visual investigation based on the data from the literature. Noteworthy, the nucleic acid importance score in the MutM:DNA bonding is enhanced upon the presence of Ap20, denoting again a more constrained oligonucleotide – see Figure S3.
Mechanical and dynamic properties of the DNA strand
In order to assess the mechanical and dynamic properties of the DNA strand, the MD trajectories were post-processed with the Curves + program [48] to evaluate the structural parameters of the double helix.The first signature of the B-helix is often the bend angle, which reaches typical values around 51.55 0.34 upon interaction with MutM for the singly-damaged oligonucleotide. Such extreme values for bending are typical [49], [50] and necessary to facilitate the extrusion of the lesion towards the enzyme active site. The presence of the Ap site at position 20 is not sufficient to perturb the global bending of the 13-bp oligonucleotide (49.12 38), but rather induces local deformations.Structural parameters of the dC8-OG19 basepair are particularly impacted – see Table 1. Importantly, the backbone parameters ‘Bend’, ‘Tip’ and ‘Inclination’ are lower when the Ap site is present at position 20, denoting a straighter portion of DNA helix than what is normally found in the canonical single-damaged MutM:DNA complex – see Fig. 5 and Figure S4. The values monitored for these parameters are of 8.64 0.03, 13.96 0.13, and 19.50 0.10, respectively in the singly-damaged system, vs 7.59 0.04, 8.10 0.14, and 18.26 0.16 in the tandem-damaged complex. Besides, several intra base-pair structural parameters are also found closer to the canonical B-DNA for the dC8-OG19 base-pair. Especially, the ‘Buckle’ and ‘Propeller’ drop from −16.13 0.19 to −8.83 0.24 and from 5.63 0.17 to −0.01 0.09, respectively upon dG20 Ap20 mutation. Other parameters (Opening, Shear, Stretch, and Stagger) show much less significant deviations – see Table 1 and Figures S4 and S5.
Table 1
Averaged values and standard errors of the oligonucleotide global bending and of dC8-OG19 base-pair structural parameters, for systems harboring either a single 8-oxoG (Single, left) or a tandem 8-oxoG + Ap lesion (Tandem, right).
Single
Tandem
Global bending (°)
51.55 ± 0.34
49.19 ± 0.38
dC8-OG19 parameter
Local bending (°)
8.64 ± 0.03
7.59 ± 0.04
Tip (°)
13.96 ± 0.13
8.10 ± 0.14
Inclination (°)
19.50 ± 0.10
18.26 ± 0.16
Buckle (°)
−16.13 ± 0.19
−8.83 ± 0.24
Propel (°)
5.63 ± 0.17
−0.01 ± 0.09
Opening (°)
0.14 ± 0.09
1.02 ± 0.08
Shear (Å)
−0.14 ± 0.01
−0.21 ± 0.01
Stretch (Å)
0.05 ± 0.00
0.02 ± 0.00
Stagger (Å)
0.54 ± 0.01
0.49 ± 0.01
Fig. 5
Distribution of three characteristic DNA helix intra base-pair parameters for dC8-OG19 over 2 s MD simulation, for a single 8-oxoguanine (Single, red, top) and both 8-oxoguanine and Ap site (Tandem, blue, bottom). The structural signature of the tandem lesion-containing oligonucleotide is closer to canonical B-DNA than in the presence of a single 8-oxoG.
Averaged values and standard errors of the oligonucleotide global bending and of dC8-OG19 base-pair structural parameters, for systems harboring either a single 8-oxoG (Single, left) or a tandem 8-oxoG + Ap lesion (Tandem, right).Distribution of three characteristic DNA helix intra base-pair parameters for dC8-OG19 over 2 s MD simulation, for a single 8-oxoguanine (Single, red, top) and both 8-oxoguanine and Ap site (Tandem, blue, bottom). The structural signature of the tandem lesion-containing oligonucleotide is closer to canonical B-DNA than in the presence of a single 8-oxoG.The role of the sugar moiety, and in particular its ideal locking, could be crucial to explain the recognition of the damaged nucleic acid. Indeed, while AP sites have a propensity to assume extrahelical positions, they are also characterized, at least in naked configuration, to a strong variability and a coexistence of diverse conformations. The specific interactions developed in the enzyme conformation can, as a consequence, be crucial for the lesion recognition. Qi et al. [27] reported a change in puckering values upon oxidation of a guanine residue. 8-oxoG would exhibit a C4’-exo puckering while a canonical dG ribose moiety would harbor a C2’-endo conformation. This would promote the recognition of 8-oxoG by MutM. In our simulations, the frequency of the C2’-endo conformation of OG19 is increased by the presence of tandem lesion compared to a single 8-oxoG (42.4% and 21.3%, respectively). However, rather than the C4’-exo puckering (2.9% and 7.6% for tandem- and single-damaged), the C1’-exo is the main or second preponderant conformation (42.5% and 61.5%) – see Figures S6. Interestingly, the different trends of the 8-oxoG phase angle are correlated to differences in the chi angle values, which can reach 180 with a tandem lesion – see Figure S7. Besides, the distribution of epsilon-zeta angle characterizing the BI or BII conformational state of the 8-oxoG backbone is broader with a single lesion, further underlining the enhanced stiffness upon the presence of a tandem lesion – see Figure S8. Likewise, the distribution of the distance between the O8 and C5’ of the 8-oxoG shows different trends for single and tandem lesions, highlighting again the difference of structural signature – see Figure S9. Concerning the inter base-pair parameters (dC8-OG19/dG9-dC18) and parameters of the adjacent dG9-dC18 base-pair, DNA structural values are comparable for single- and tandem-damaged systems and in agreement with previous works [22] – see Figure S10, Figure S11 and Figure S12. As could be expected though, the absence of the nucleobase at position 20 upon mutation to Ap site influences the stability of the canonical stacking that is usually conserved in the singly-damaged complex. This highlights the blurrier structural signature exhibited by the tandem-damaged DNA helix, which is another criteria that might affect the interaction with the surrounding amino acids, hence the efficiency of the 8-oxoG extrusion by MutM.
Discussion
MutM, the bacterial analog of the humanOGG1, is responsible for the recognition and repair of the utmost common 8-oxoG lesion. The MutM:DNA interface has been investigated by NMR, X-ray and molecular dynamics simulations, probing the key residues that play a crucial role in the most specific recognition of 8-oxoG, but also of other DNA lesions [51], [50], [28], [26], [25], [40], [21], [23], [27], [52], [53]. An intercalation triad (M77, R112, F114) has been characterized, and several other residues are known to be essential in MutM:DNA interactions and 8-oxoG extrusion, guiding the lesion towards the N-terminal proline responsible for the Schiff base formation. Intrahelical insertion of a single F114 wedge residue [52], [53], [28] is marked and allows a slow scanning of the double helix by MutM and analog enzymes. Among MutM key-residues, R264, located in the Zn-finger domain, is highly conserved and important for 8-oxoG extrusion [26], [27]. N174 also plays a key-role and its mutation leads to the perturbation of the R264 contacts [23], [28]. Besides, the flexible loop prior to the C-term is known to be essential for the 8-oxoG recognition by folding over the lesion in a capping process [27], [25].While the recognition and repair of single 8-oxoG by MutM are well documented, their perturbation upon the presence of tandem lesions is very poorly understood. However, it has been shown that ionizing radiations can lead to the formation of tandem lesions [54], rendering 8-oxoG refractory to excision by glycosylases [14], [55]. Such local multiple damaged sites are highly mutagenic and increase the risks of cancer development [56], [57]. They can also be cytotoxic as their error-prone repair can result in the formation of deleterious double-strand breaks [58], [59]. Noteworthy, the high toxicity of the DNA lesions induced by ionizing radiation is also exploited for the development of cancer (radio)-therapies [60]. In this context, we investigated the structural impact of tandem lesions on the interactions between MutM and a 13-bp oligonucleotide harboring the 8-oxoG lesion at position 19. Using molecular modeling and machine-learning analysis, we highlighted a structural re-organization of MutM canonical interaction network around 8-oxoG upon the presence of an adjacent abasic site at position 20.The interaction network involving the intercalation triad and the damage is perturbed by the dG20 Ap20 mutation. The MutM:DNA interactions are more pronounced, leading to a more rigid system, which could explain the difficulty of MutM to process such local multiple damaged sites. While in the simulation of the singly-damaged system, the classical interaction patterns are observed, the presence of an additional Ap site results in the rotation of R76 that provokes a shift of the intercalation triad. Noteworthy, as R76 is poorly conserved in MutM sequences from different organisms [26], [41], one cannot rule out the possibility of a different reorganization around the intercalation triad. First observations of our MD trajectories allowed to describe the re-shaping of the MutM:DNA interaction patterns – see Fig. 2, Fig. 3. The structural analysis of the DNA oligonucleotide also reveals changes in the local conformation of the lesion site (see Fig. 5 and Table 1), which might jeopardize the efficiency of 8-oxoG recognition by the enzyme.In order to go beyond the visual observation of MutM:DNA interactions, we applied machine learning (ML) techniques to provide an extensive map of these contacts. ML methods have gained enormous amount of attention in recent years. Their power in finding important information out of large amount of data has been exploited by the biochemistry community, many interesting applications have been showcased in the literature. Recently, Fleetwood et al. [42] have demonstrated its capability in learning ensemble properties from molecular simulations and providing easily interpretable metrics describing important structural or chemical features. The machine-learning analysis of our trajectories is based on the demystifying package from Fleetwood et al. [42]. Residues highlighted as providing a significant contribution to the MutM:DNA bonding by the PCA analysis are in agreement with data from the literature. Comparison of the residues importance in MutM:DNA interactions upon single or tandem lesions allowed to pinpoint the changes in the interaction patterns, which concern the most important features of MutM – see Fig. 4. Apparently in contradiction with common chemical sense, the PCA analysis revealed that the dG20 Ap20 mutation leads to stronger, more stable interactions between the two macromolecules. The contribution of the residues involved in DNA anchoring is almost systematically increased in the tandem-damaged system. Nucleic acids also exhibit stronger interactions with MutM in the case of the tandem lesion, which overall suggests that the presence of a second damage somehow results in a more rigid complex than when an isolated 8-oxoG is present. The difference of structural signatures between the single and tandem lesions is further highlighted by the perturbation of the 8-oxoG backbone parameters upon addition of the adjacent abasic site. However, the global rigidity of the tandem-damaged MutM:DNA complex can actually be counterproductive for repair since it has been evidenced that flexibility of the DNA strand is a key feature correlating with 8-oxoG removal [41]. This consideration is also further reinforced by the fact that conversely, the catalytic N-terminal residues are less involved in the MutM:DNA complex stability in the case of the tandem lesion. This is also the case for the 211–234 loop region which is known to play a key-role in 8-oxoG extrusion. Hence, the presence of the Ap site alongside the 8-oxoG lesion impacts the canonical structural behavior of these two important MutM regions, which might also contribute to the lower repair efficiency.Our study provides an example of the predictive power of all-atom, MD simulations coupled to machine learning analysis, applied to a very challenging test-case. Indeed, the multiplicity of oxidatively-generated DNA lesions embrace a combinatorial chemistry, with contrasted structural, mechanical and dynamic properties. Additionally, MutM is a very flexible protein [41], certainly difficult to properly sample. The efficiency of our protocols gives perspectives for its extension towards other tandem systems and the investigation of sequence effects [18], [61], [62]. Furthermore, the biological significance of rationalizing this complex scenario is also unquestionable. Indeed, ionizing radiations can be satisfactorily exploited in cancer therapy, and the inhibition of repair enzyme by combined chemotherapy can prove a most valuable synergy in assuring the accumulation of lesions necessary to reach the apoptosis threshold. Understanding of the molecular mechanisms underlying DNA repair is thus crucial for also offering novel perspectives for cancer research.
Materials and methods
All-atom molecular dynamics simulations
All MD simulations were performed with the Amber and Ambertools 2018 packages [63]. The starting X-ray structure of Bacillus stearothermophilusMutM was taken from the structure obtained by Verdine and coworkers [27], PDB ID code3GO8. The crystallographic self-complementary ds-DNA is a 13-bp sequence d(GTAGATCCGACG). (CGTCCGGATCT) featuring 8-oxoG as the 19th nucleobase (in bold). It should be noted that the F-10 loop 217–237 of MutM, absent from the crystal structure, was reconstructed using Modeller [64]. The zinc atom present in the zinc-finger motif of MutM was kept and described with parameters taken from the Zinc Amber Force Field (ZAFF) developed by Merz and coworkers [65]. 19 potassium ions were added to neutralize the MutM:DNA complex, which was embedded in a 92x97x91 Å3 TIP3P water molecules bath. The Amber ff14SB [66] was used throughout, including the bsc1 force field corrections for the DNA duplex [67].The parameters for 8-oxoG and Ap site have been generated with a standard antechamber procedure embedded in Amber 18 [63], as described in previous references [68], [69], [70] and in agreement to the literature. The abasic site was modeled as an alpha anomer, and the Q166C mutation in the crystal was conserved to start from a structure as close to experimental data as possible. Four 10,000 steps minimization runs were carried out on the initial MutM:DNA complex, imposing restraints on the amino and nucleic acids, that were gradually decreased from 20 to 0 kcal/mol/Å2 along the four runs. The temperature was then raised from 0 to 300 K in a 50 ps thermalization step, and afterwards kept constant using the Langevin thermostat with a collision frequency ln of 1 ps−1. The system was subjected to a 1 ns equilibration run in the NPT ensemble. Finally, two replica of 1 s production run were performed to sample the conformational ensemble of the system. The integration step was set to 2 fs and trajectory files were written every 100 ps. The Particle Mesh Ewald method was used to treat electrostatic interactions, with a 9.0 Å cutoff.RMSD were monitored to ensure the sampling of relevant conformational space – see Figure S13. A clustering was carried out for both systems using the cpptraj module of Amber, based on the RMSD of the proteic and nucleic acids. The pdb file of the representative frame of the main cluster for each MD trajectory is available in SI (mono_MD1.pdb, mono_MD2.pdb, tandem_MD1.pdb and tandem_MD2.pdb). The structural descriptors of the DNA helix were evaluated based on a post-processing analysis with Curves+ [48] and other distances and RMSD values were monitored using Ambertools [63] on 1000 frames for each MD replicate. A description of the parameters (schemes and how they are computed) is available on curves + website (http://curvesplus.bsc.es/helpar). Plots were generated using the ggplot2 package of R [71] and figures were rendered using the VMD software (http://www.ks.uiuc.edu/Research/vmd/) [72].
Machine-learning post-analysis
The Principal Components Analysis (PCA) is a covariance-matrix-based unsupervised learning technique that take the inverse distances between residues as input, and produces the eigenvalues and eigenmodes of this covariance-matrix. The eigenmodes can be interpreted as the modes of motion, and the corresponding eigenvalues are the amplitude. The principal components (eigenmodes with highest amplitudes) capture the major fluctuations of the system can in turn be used to find out the important features of the DNA/MutM interface. We also initially compare to a supervised approach, with the distance between dC7 and R76 as a labelling tag to discriminate between bounded and unbounded states. However, for this system, this “tag” turns out to be not discriminative enough and only single out R75 and F114, and the unsupervised learning turned out to be superior.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors: Rotem Gilboa; Dmitry O Zharkov; Gali Golan; Andrea S Fernandes; Sue Ellen Gerchman; Eileen Matz; Jadwiga H Kycia; Arthur P Grollman; Gil Shoham Journal: J Biol Chem Date: 2002-03-23 Impact factor: 5.157