Molecular Mechanisms Associated with Clustered Lesion-Induced Impairment of 8-oxoG Recognition by the Human Glycosylase OGG1

8-oxo-7,8-dihydroguanine (8-oxoG) is a highly mutagenic DNA lesion that can give rise to mismatches if it escapes the DNA damage response. The specific recognition of its structural signature by the hOGG1 glycosylase is the first step of the Base Excision Repair pathway, which ensures the integrity of the genome by preventing the emergence of mutations. The formation, structural features, and repair of 8-oxoG have been the subject of extensive research; more recently, this active field has expanded to the more complicated case of 8-oxoG within clustered lesions. Indeed, the presence of a second lesion within one or two helical turns can dramatically impact the repair yields of 8-oxoG by glycosylases. In this work, we use μs-range molecular dynamics simulations and machine-learning-based postanalysis to explore the molecular mechanisms associated with the recognition of 8-oxoG by hOGG1 when it is embedded in a multiple-lesion site with a mismatch in 5′ or 3′. We delineate the stiffening of the DNA–protein interactions in the presence of the mismatches, and rationalize the much lower repair yields reported with a 5′ mismatch by describing the perturbation of 8-oxoG structural features upon addition of an adjacent lesion.


Introduction
8-oxo-7,8-dihydroguanine (8-oxoG) is the most common oxidatively generated DNA lesion and affects at least 1 in 10⁶ of the total pool of guanines in human cells [1]. It is mostly generated by oxidative stress, either upon hydration of the guanine radical cation or by reaction of the hydroxyl radical (·OH) with guanine [2,3]. It may also be generated by UV irradiation, most notably via the activation of singlet oxygen by photosensitizers. The high carcinogenic potential of 8-oxoG arises from its capacity to bypass DNA polymerase checkpoints, leading to the incorporation of an adenine in the nascent strand and ultimately to G:C → A:T mutations [4,5]. When 8-oxoG is not isolated but is instead embedded in multiple damaged sites, i.e., multiple spatially close DNA lesions, which are particularly challenging for the DNA damage response, its mutagenicity is even higher [6–9]. Of note, multiple damaged sites within one or two helical turns are referred to as clustered lesions [10].
To counter the impact of 8-oxoG on DNA integrity, this lesion is efficiently repaired by Base Excision Repair enzymes [11,12]. Human OGG1 (hOGG1) is the first enzyme involved in 8-oxoG repair. It catalyzes the hydrolysis of the N-glycosidic bond and the subsequent β-elimination of the 3′ phosphate of the resulting abasic site. The molecular mechanisms driving 8-oxoG recognition by OGG1 constitute a complex matter of research [13], and a recent joint cryo-EM/in silico study of hOGG1 bound to an oligonucleotide harboring an intrahelical 8-oxoG provides crucial insights into key residues involved in the 8-oxoG extrusion process [14]. The most important and conserved structural feature of OGG1 is the N149-N150-N151 motif, which plays a major role in interacting with the DNA helix and stabilizing the 8-oxoG-extruded structure. In particular, N149, which interacts directly with the intrahelical 8-oxoG in the minor groove, fills the gap left by the oxidized nucleobase once it is extruded through the major groove, and thereby stabilizes the double helix (see Figure 1A). Two arginines (R154 and R204) interact with both N149 and the facing orphan cytosine to form a stable hydrogen bond network. In addition, a tyrosine (Y203) partially inserts into the helix to maintain the helix bend by π-stacking with the nucleobase in the 3′ position of the orphan cytosine.
In the presence of clustered lesions, hOGG1 repair rates can undergo a dramatic decrease, depending on the nature and position of the second lesion [15–17]. Sassa et al. investigated the effects of sequence and of adjacent lesions on hOGG1 repair rates [18]. They highlighted the relative tolerance of hOGG1 to sequence effects, with only a slight decrease when 8-oxoG is flanked by a 5′ C, A, T, or G. However, hOGG1 efficiency is dramatically impacted by the presence of a mismatch at the 5′ position of 8-oxoG, with repair rates reduced by up to 100-fold [18] (see Figure 1B). Interestingly, a similar mismatch placed 3′ of the lesion has a negligible effect on hOGG1 efficiency. Using µs-range molecular dynamics (MD) simulations, we investigated the effect of a mismatch adjacent to 8-oxoG on the hOGG1-DNA interaction, and hence on damage recognition and processing. Our results reveal contrasting fingerprints of the hOGG1-DNA interaction when a mismatch and 8-oxoG are clustered. The perturbation of the Watson-Crick network in the presence of an additional mismatch leads to stronger interactions with the surrounding amino acids, which might disfavor the lesion extrusion. Machine Learning postanalysis highlights that the hOGG1 regions in contact with the surroundings of the lesion site, such as the highly conserved NNN motif, stiffen upon the addition of a mismatch adjacent to 8-oxoG. Interestingly, the position of the mismatch in 5′ or 3′ affects the helical properties only subtly, yet a thorough analysis of the DNA-protein interaction network reveals contrasting patterns that might rationalize the difference in repair rates. Thus, our simulations bring insights into the molecular mechanisms underlying the perturbation of hOGG1 efficiency in the presence of clustered lesions and, in particular, help to rationalize the sequence effects highlighted by several previous experimental works.

MD Simulations
All MD simulations were performed with NAMD2.14 [19], on systems set up using AmberTools18 [20]. The ff14SB force field and the bsc1 corrections for DNA backbone parameters were employed [21,22], and parameters for the 8-oxoG noncanonical nucleobase were taken from previous works [23]. The systems were built from the experimental structure of OGG1 interacting with an intrahelical 8-oxoG-bearing 9 bp DNA duplex (PDB ID 6W13) [14]. To avoid edge effects, the DNA duplex was extended to 10 bp. The DNA sequence was modified to match the ones used by Sassa et al. to allow the direct comparison of the MD results with the experiments [18] (see Figure 1C). The missing loop 79-83 was reconstructed using the SwissModel server [24], the Y207C and C253W mutations added for crystallization purposes were reversed, and crystallographic waters were kept as such. Each system was soaked in an orthorhombic TIP3P water box with a 12.0 Å buffer and neutralized with K⁺ cations, for a total of ∼60,000 atoms that were minimized in three runs of 30,000 steps with decreasing restraints on backbone atoms. Each minimization run was coupled with a 4-ns equilibration run at 300 K to ensure a smooth relaxation. A 400-ns production run was then carried out to provide sampling for structural analysis. The temperature was maintained at 300 K using a Langevin thermostat with a collision frequency γ_ln set to 1 ps⁻¹. Electrostatics were handled using the Particle Mesh Ewald method [25] and a 9 Å cutoff was used for nonbonded interactions. The hydrogen-mass repartitioning (HMR) approach was used to allow stable simulations with a time step of 4 fs, and simulation data were saved every 40 ps. Two replicates were performed for each system, resulting in a total of 2.4 µs of sampling.
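The sampling figures quoted in the protocol can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid, not part of the simulation setup; the variable and function names are illustrative, and the count of three systems (isolated 8-oxoG, 5′ mismatch, 3′ mismatch) is inferred from the text.

```python
# Bookkeeping sketch for the protocol described above.
# Numerical values come from the text; all names are illustrative.

N_SYSTEMS = 3           # assumed: isolated 8-oxoG, 5' mismatch, 3' mismatch
N_REPLICATES = 2        # replicates per system
PRODUCTION_NS = 400     # production length per replicate (ns)
TIMESTEP_FS = 4         # integration time step enabled by HMR (fs)
SAVE_INTERVAL_PS = 40   # interval between saved frames (ps)


def total_sampling_us(n_systems, n_replicates, production_ns):
    """Aggregate production sampling in microseconds."""
    return n_systems * n_replicates * production_ns / 1000


def steps_per_run(production_ns, timestep_fs):
    """Number of integration steps in one production run (1 ns = 1e6 fs)."""
    return production_ns * 1_000_000 // timestep_fs


def frames_per_run(production_ns, save_interval_ps):
    """Number of saved frames in one production run (1 ns = 1e3 ps)."""
    return production_ns * 1000 // save_interval_ps


if __name__ == "__main__":
    print(total_sampling_us(N_SYSTEMS, N_REPLICATES, PRODUCTION_NS), "us in total")
    print(steps_per_run(PRODUCTION_NS, TIMESTEP_FS), "steps per run")
    print(frames_per_run(PRODUCTION_NS, SAVE_INTERVAL_PS), "frames per run")
```

With these inputs, each replicate corresponds to 10,000 saved frames, and the six runs together add up to the 2.4 µs stated above.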

Structural Analysis
Structural parameters of the DNA duplex were computed using Curves+ [26] on 1000 frames extracted from each replicate. Monitoring of the other descriptors and clustering were carried out using the cpptraj module of AmberTools18 [20]. The clustering of the structures was performed based on the RMSD of the 5 bp centered on the 8-oxoG and of the amino acids within 7 Å of the 8-oxoG. Plots, pictures, and figures were generated using the ggplot2 package of R3.6.3 [27,28], VMD [29], and Inkscape (https://inkscape.org/, accessed on 1 June 2021).
The protein flexibility was assessed using a machine learning protocol proposed by Fleetwood and coworkers [30], which was implemented in-house and previously applied successfully to MutM, the bacterial analog of OGG1 [23].

Protein-DNA Interaction Network
The interactions previously observed between hOGG1 and damaged oligonucleotides are correctly reproduced in our simulations. Hereafter, we describe the key residues interacting with the damaged region and forming the protein-DNA contact surface, and scrutinize the perturbations of this network in the presence of a 3′ or 5′ mismatch.

Interactions in Proximity of the Damaged Site
Several amino acids surrounding the damage site have been previously identified as being crucial for 8-oxoG extrusion: N149, R154, R204, and Y203 (see Figure 2).
The nearby N149, which belongs to the NNN motif, is especially important because this residue fills the gap left in the helix by the 8-oxoG extrusion, thus stabilizing the oligonucleotide. With an isolated 8-oxoG, our simulations show that this residue remains close to the damage site in the minor groove but interacts only transiently with the lesion surroundings, lying at distances from 8-oxoG and dG17 of 5.27 ± 2.20 Å (N149:ND2-8OG:N3) and 4.75 ± 2.85 Å (N149:ND2-dG17:N3), respectively. This is in agreement with the recent study by Shigdel et al. showing that N149 interacts with both nucleobases of the damaged base pair [14].
The two arginines R154 and R204 also play an important role in stabilizing the DNA helix after 8-oxoG extrusion from the duplex, by interacting with the inserted N149 and the facing orphan base. Along our MD simulations, R154 makes contacts with the dT19 and dG18 backbone atoms (R154:CZ-dT19:P and -dG18:P distances of 6.49 ± 2.33 Å and 5.42 ± 1.56 Å, respectively), keeping it close enough to the lesion site for further stabilization. R204 interacts with the target strand, either in the minor groove with the dG6 nucleobase (R204:CZ-dG6:N3 at 8.55 ± 2.48 Å) or with the dA7 phosphate group below (R204:CZ-dA7:P at 7.42 ± 3.16 Å). Finally, the stabilization of the 8-oxoG extruded structure also involves a partial insertion of Y203 in 3′ of the orphan base (dC16). In our simulations, Y203 stays near the minor groove, interacting mainly with dG17 on the facing strand through its backbone atom (Y203:N-dG17:P distance of 7.07 ± 1.93 Å) but also transiently with the target strand, through either the dA7 sugar (Y203:HH-dA7:O4' distance of 5.65 ± 2.48 Å) or the dG6 nucleobase (Y203:HH-dG6:N3 distance of 6.25 ± 2.47 Å).
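The contacts in this section are reported as an average interatomic distance followed by its spread over the trajectory. As a minimal sketch of how such a "mean ± SD" descriptor can be obtained from per-frame coordinates, the snippet below uses only the Python standard library. The function names and the toy coordinates are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import math
import statistics


def distance_series(coords_a, coords_b):
    """Per-frame Euclidean distance between two atoms.

    coords_a and coords_b are sequences of (x, y, z) tuples in Å,
    one entry per trajectory frame.
    """
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def mean_and_sd(values):
    """Mean and population standard deviation of a distance series."""
    return statistics.fmean(values), statistics.pstdev(values)


if __name__ == "__main__":
    # Toy three-frame example (hypothetical coordinates, not simulation data).
    atom_a = [(0.0, 0.0, 0.0)] * 3
    atom_b = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
    mean, sd = mean_and_sd(distance_series(atom_a, atom_b))
    print(f"{mean:.2f} ± {sd:.2f} Å")
```

A large spread relative to the mean, as for several contacts of the isolated lesion, is what one would expect from a transient interaction, whereas a small spread points to a persistent one.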
Upon addition of an adjacent mismatch, the above-described interaction patterns exhibit perturbations that result in a stiffening of the DNA-protein contact region around the lesion, with the position of the mismatch, 5′ or 3′ to the lesion, dictating the new interaction network. When the secondary mismatch is located in 3′, N149 interacts more tightly with 8-oxoG and dG17 (3.12 ± 0.28 Å and 4.98 ± 0.77 Å, respectively). In addition, R154 gets closer to the dT19 phosphate (5.45 ± 1.46 Å) at the expense of dG18 (6.56 ± 1.84 Å), and R204 becomes involved in more exclusive hydrogen bonding with the dA7 phosphate (R204:CZ-dA7:P at 4.89 ± 1.32 Å). Similarly, Y203 also forms more stable interactions with the DNA duplex: it becomes mainly involved in persistent hydrogen bonding with dG15 from the mismatch base pair (Y203:OH-dG15:N2 at 3.95 ± 1.58 Å), and its backbone can still interact with the dG17 phosphate (Y203:N-dG17:P distance of 4.75 ± 1.07 Å). Y203 also gets slightly closer to the target strand in 5′ of the damage, exhibiting decreased distances compared with the isolated 8-oxoG: Y203:HH-dT6:O2 at 5.65 ± 2.10 Å and Y203:HH-dA7:O4' at 5.29 ± 1.82 Å. A mismatch in 5′ of 8-oxoG induces a slightly different rewiring of the interaction network in the damaged region. The N149 interaction with 8-oxoG is also stronger than for an isolated lesion (N149:ND2-8OG:N3 of 3.92 ± 0.94 Å), and hydrogen bonding with the dG17 nucleobase is even more pronounced than with a 3′ mismatch (N149:ND2-dG17:N3 of 3.97 ± 1.18 Å). R154 exhibits an interaction with the dT19 phosphate similar to that observed for the 3′ mismatch (5.74 ± 1.22 Å) and gets even closer to the dG18 backbone (4.67 ± 1.0 Å), but R204 does not get closer to either dA7 or dG6 compared with the isolated lesion (R204:CZ-dA7:P at 7.01 ± 2.69 Å and R204:CZ-dG6:N3 at 7.70 ± 1.58 Å).
The stable Y203-dG15 interaction observed for the 3′ mismatch is not retrieved with the 5′ mismatch, yet Y203 also gets closer to the dG17 phosphate (Y203:N-dG17:P of 4.62 ± 1.14 Å) and to the dA7 sugar (Y203:HH-dA7:O4' of 5.13 ± 2.15 Å). The hydrogen bond with the dG6 nucleobase is also tighter than with the isolated lesion (Y203:HH-dG6:O2 at 5.74 ± 2.78 Å). The stronger interaction networks observed in the presence of 3′ and 5′ mismatches translate into a stiffening of the protein region harboring the interacting residues. The flexibility profile generated by Machine Learning (ML) analysis of our simulations shows higher flexibility values for the reference structure (isolated 8-oxoG) around the N149, R154, and Y203-R204 key amino acids (see Figure 3). A global stiffening of the DNA structure is also observed upon addition of a mismatch, regardless of its position. In addition, the clustering analysis reveals a broader distribution of the four main clusters in the reference system (25%/22%/20%/15%) than with additional 3′ and 5′ mismatches (44%/20%/12%/8% and 39%/17%/13%/12%, respectively), for which a slightly dominant structure emerges, underlining the enhanced stiffness of the hOGG1-DNA complex (see Figures S1-S3 and supplementary pdb files).
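One way to put a number on how "broad" a cluster distribution is would be the normalized Shannon entropy of the cluster populations. The sketch below applies it to the four main cluster populations quoted above. This metric is an illustrative choice and is not part of the analysis reported here; the populations are renormalized over the four listed clusters only, which is a further assumption, and all names are hypothetical.

```python
import math

# Populations (%) of the four main clusters, as quoted in the text.
CLUSTER_POPULATIONS = {
    "isolated 8-oxoG": [25, 22, 20, 15],
    "3' mismatch": [44, 20, 12, 8],
    "5' mismatch": [39, 17, 13, 12],
}


def normalized_entropy(populations):
    """Shannon entropy of the populations, scaled to the range [0, 1].

    A value of 1 means equally populated clusters (broad distribution);
    values closer to 0 mean that one cluster dominates.
    """
    total = sum(populations)
    probs = [p / total for p in populations if p > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(populations))


if __name__ == "__main__":
    for system, pops in CLUSTER_POPULATIONS.items():
        print(f"{system}: dominant cluster {max(pops)}%, "
              f"normalized entropy {normalized_entropy(pops):.2f}")
```

Under these assumptions the isolated lesion gives the highest value and both mismatch systems give lower ones, in line with the emergence of a dominant structure described above.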

Interactions Maintaining the DNA-Protein Interface
At the hOGG1-DNA interface, key contacts ensure the stability of the complex. Several hOGG1-DNA structures exhibit a hydrogen bonding network involving, among others, G245, K249, and V250, which are crucial for maintaining the target strand during the 8-oxoG extrusion [14,31–33]. In agreement with these observations, our simulations of the reference system harboring a single 8-oxoG show similar interaction patterns (see Figure 4). G245 interacts mostly with the dG8 backbone (G245:N-dG8:P distance at 6.22 ± 2.80 Å) and K249 can form hydrogen bonds with dG6 (K249:NZ-dG6:P at 7.71 ± 3.94 Å). V250 does not interact directly with the target strand in our simulations but stays close to the 8-oxoG extrusion path. V250 might interact at a later point of the extrusion process, since we investigate here the 'encounter' stage that precedes extrusion, as described by Shigdel et al. [14]. Interactions with the facing strand are also observed. In line with previous studies, we retrieved contacts between the facing strand in 5′ of the lesion and N151 and R154, mostly with dT19 (N151:ND2-dT19:P at 5.14 ± 1.61 Å and R154:CZ-dT19:P at 6.49 ± 2.33 Å). Our simulations also revealed interactions between K207 and dG8 (K207:NZ-dG8:P at 4.37 ± 1.24 Å) and between S148 and the dA7 phosphate (S148:HG-dA7:P at 4.17 ± 1.87 Å). Interestingly, H270 and K249 lie in the vicinity of the 8-oxoG backbone (distance between centers of mass within 12 Å); hence, they remain ideally positioned along the path of the extrusion.
The impact of an additional mismatch on hOGG1 contacts with the DNA backbone appears more pronounced for the 5′ than for the 3′ mismatch position.
[Figure caption fragment: The protein is depicted in purple and the DNA helix in white; the side chains of the interacting residues are displayed in licorice, with thicker tubes for amino acids. The histograms show the normalized distributions of the key distances in Å.]

DNA Structural Features
The most important feature for DNA lesion recognition is the structural signature of the damage site within the helix. The perturbation of the structural features of the DNA helix harboring 8-oxoG in the presence of a vicinal mismatch is described hereafter, focusing on intra- and inter-base-pair descriptors and on backbone properties.

Base Pair Descriptors
The presence of an adjacent mismatch does not strongly perturb the intra-base-pair parameters of 8-oxoG (see Figure S4). Yet, a deviation of the base-pair displacement along the X axis can be observed, especially with a 5′ mismatch, for which the average value is −2.27 ± 0.91 Å, compared with −1.38 ± 1.00 Å for the isolated 8-oxoG and −1.76 ± 0.65 Å with a 3′ mismatch. Likewise, the local inclination of the helix at 8-oxoG slightly increases in the presence of a mismatch: 5.59 ± 5.0°, 12.45 ± 4.96°, 9.45 ± 5.47° (see Figure S4). In contrast, the intra-base-pair parameters at the position of the mismatch experience more pronounced deviations, as can be expected from the perturbation of the Watson-Crick pairing. With a mismatch in 3′ or 5′ to the lesion, the shear parameter at the mismatch base pair increases to around 2.3 Å vs. 0.0 Å in the reference (see Figure 5A). Interestingly, a mismatch in 5′ also increases the deviations of the displacement along the X axis by ∼1 Å at both the 3′ and 5′ positions. The local inclinations of the base pairs in 3′ and 5′ of 8-oxoG are slightly impacted by the mismatches, especially when the latter is located in 3′ (see Figures S5 and S6).
Inter-base-pair parameters undergo more drastic perturbations in the presence of a mismatch, especially between 8-oxoG and its adjacent 5′ base pair (see Figure 5B). With a mismatch in 5′ of the lesion, the twist angle between the dT(C)4-dG17/8-oxoG-dC16 base pairs drops by ∼10°. This deviation is only ∼3° with a 3′ mismatch. A similar trend is observed for the helix twist, denoting a local straightening of the helix due to the 5′ mismatch (see Figure S7). In addition, the distributions of the shift and slide parameters are much broader with a 5′ mismatch than for the reference system and a 3′ mismatch. Parameters of the 8-oxoG-dC16/dG(T)6-dC(G)15 base pairs are less impacted by clustered lesions, yet the twist angle increases by ∼10° with a mismatch in 3′ of the lesion. The helix twist angle undergoes a similar deviation with a 3′ mismatch, underlining in this case a more pronounced local deformation of the DNA duplex that might facilitate 8-oxoG extrusion compared with the 5′ mismatch structure (see Figure S8).

Backbone Parameters
The structural signature of the 8-oxoG backbone is of utmost importance for its processing by glycosylases [34]. In our simulations, the 8-oxoG ribose puckering is mostly C1'-exo (50% of the time) and, to a lesser extent, C2'-endo (26%) or O4'-endo (17%) (see Figure 6A). In the presence of an additional mismatch, the 8-oxoG ribose can exhibit a different puckering. When the mismatch is in 3′, the effect is small and the sugar still shows mostly C1'-exo or C2'-endo conformations (40% and 43%). However, with a 5′ mismatch, the C3'-exo conformation is much more frequent (24%), although the puckering remains mostly C1'-exo or C2'-endo (35% and 36%).
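Pucker labels such as C1'-exo or C2'-endo are conventionally assigned from the pseudorotation phase angle of the sugar ring, with the 360° pseudorotation cycle divided into ten 36° sectors. The sketch below illustrates this standard binning and the resulting populations. It assumes the phase angles have already been computed by an external tool; whether the analysis software bins them in exactly this way is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Ten envelope conformations along the pseudorotation cycle,
# each covering a 36-degree sector starting at phase 0.
PUCKER_NAMES = [
    "C3'-endo", "C4'-exo", "O4'-endo", "C1'-exo", "C2'-endo",
    "C3'-exo", "C4'-endo", "O4'-exo", "C1'-endo", "C2'-exo",
]


def classify_pucker(phase_deg):
    """Map a pseudorotation phase angle (degrees) to a pucker label."""
    sector = int((phase_deg % 360.0) // 36.0) % 10  # % 10 guards rounding to 360
    return PUCKER_NAMES[sector]


def pucker_populations(phases_deg):
    """Percentage of frames found in each pucker conformation."""
    counts = Counter(classify_pucker(p) for p in phases_deg)
    n_frames = len(phases_deg)
    return {name: 100.0 * count / n_frames for name, count in counts.items()}


if __name__ == "__main__":
    # Toy phase angles in degrees (hypothetical, not simulation data).
    print(pucker_populations([126.0, 130.0, 162.0, 90.0]))
```

In this convention, the C1'-exo, C2'-endo, and C3'-exo states discussed above occupy three consecutive sectors (108° to 216°), so the shift seen with a 5′ mismatch corresponds to a drift toward higher phase angles.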
The 8-oxoG backbone angles are also different in clustered lesions, especially with a mismatch in 5′. The distribution of the 8-oxoG φ angle with respect to its χ angle shows similar trends for the 3′ mismatch and the reference system, but a slightly different pattern when the mismatch is in 5′, with the 8-oxoG χ angle then being less prone to adopt values around ±180° (see Figure 6B).
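Dihedral angles that sit around ±180° need periodic handling: a plain arithmetic mean of 170° and −170° gives 0°, which is the opposite of where the angle actually is. The following minimal sketch shows a circular mean and a simple count of frames close to 180°. The function names and the 30° window are hypothetical choices for illustration, not values taken from the analysis above.

```python
import math


def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees, returned in (-180, 180]."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum))


def wrapped_difference_deg(angle, reference):
    """Signed smallest difference between two angles, in [-180, 180)."""
    return (angle - reference + 180.0) % 360.0 - 180.0


def fraction_near_180(angles_deg, window_deg=30.0):
    """Fraction of frames whose angle lies within window_deg of ±180°."""
    hits = sum(
        1 for a in angles_deg
        if abs(wrapped_difference_deg(a, 180.0)) <= window_deg
    )
    return hits / len(angles_deg)


if __name__ == "__main__":
    chi = [175.0, -175.0, 60.0, -120.0]  # toy χ values in degrees
    print(circular_mean_deg([170.0, -170.0]))
    print(fraction_near_180(chi))
```

A lower value of such a fraction for the 5′ mismatch system than for the other two would be one way to express the shift of the χ distribution described above.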

Discussion and Conclusions
Understanding how the human DNA repair machinery processes the highly mutagenic 8-oxoG lesion is of crucial importance for cancer research and life sciences. Along with investigations of 8-oxoG formation, structural signature, and processing by the BER enzymes [14,34–38], the lower repair yields of 8-oxoG when embedded in local clustered lesions are also a matter of intensive research [6–10]. In this contribution, we explored the molecular mechanisms associated with 8-oxoG recognition by the human hOGG1 glycosylase when 8-oxoG forms clustered lesions with an adjacent mismatch in 5′ or 3′, in order to rationalize their different repair yields, which are drastically lowered with a 5′ mismatch. Our µs-range simulations allowed us to scrutinize the DNA-protein interaction to delineate perturbations of the key contacts for the lesion processing, as well as to detail the changes in the structural signature of the DNA duplex, which is of utmost importance for its recognition by the repair enzyme.
Investigation of the DNA-protein contacts revealed the perturbation of the interaction network in the presence of 3′ and 5′ mismatches. In both cases, the lower flexibility of the α-helix harboring key residues for the extrusion and of the S148-R154 region harboring the NNN motif conserved in the glycosylases reflects the tighter interactions found with clustered lesions (see Figure 3). The weakening of the Watson-Crick network might provide more opportunities to form DNA-protein hydrogen bonds in the presence of a mismatch. Interestingly, with a 5′ mismatch, dG17 is more prone to interact with the enzyme, and the rewiring of the interactions might perturb the intercalation of Y203 and the interactions involving N149, R204, R154, thus initiating hOGG1 activity. Overall, hOGG1 key residues are more tightly involved in interactions with the damaged area and with the DNA backbone in the presence of a mismatch (especially in 5′) than for an isolated 8-oxoG (see Figures 2 and 4). This might be counter-intuitive, as DNA duplexes harboring a mismatch or 8-oxoG generally become more flexible than undamaged oligonucleotides, which facilitates their recognition [39–42]. Hence, this outlines the importance of exploring the perturbation of their structural behavior when embedded in a clustered lesion, which impacts their repair. The observed global stiffening resulting from the clustered lesions might disfavor the extrusion of the 8-oxoG by hOGG1 by constraining the DNA helix. This process requires an optimal DNA twisting to take place efficiently, as reported for bistranded 8-oxoG/abasic site clustered lesions [43].
Indeed, our results show that the 5′ mismatch perturbs the geometry of the DNA duplex, thus modifying the structural signature of the nucleic acid sequence surrounding 8-oxoG (see Figure 5). The impact of a 3′ mismatch is less pronounced, and most of the DNA structural properties exhibit values more comparable to the reference isolated 8-oxoG system. The 8-oxoG backbone angles deviate particularly in the presence of a 5′ mismatch, and the puckering of the lesion ribose moiety is also perturbed with clustered lesions (see Figure 6). Therefore, the difference in the structural signature of the lesion backbone, which is especially important for the recognition and the extrusion by hOGG1, might perturb the lesion processing in the presence of an adjacent mismatch. Altogether, our simulations delineate the molecular mechanisms driving the difference in repair yields of 8-oxoG by hOGG1 when embedded in a clustered lesion with a mismatch: a stiffening of the DNA-protein interactions involving key residues for the extrusion process, and a modified structural signature of the 8-oxoG, especially in its backbone features, which might disfavor the enzyme processing. Of note, some of the above-mentioned key residues or vicinal amino acids can undergo mutations in different cancer types (N151S in bladder urothelial carcinoma, G245R in colon adenocarcinoma, R206C in lung squamous cell carcinoma, R206H or R206S in cutaneous melanoma) [44], highlighting their importance in carcinogenesis. Our study presents a first exploration of the impact of 5′ and 3′ mismatches adjacent to the 8-oxoG on the DNA-protein contact network and the DNA structural properties. Free energy calculations of the extrusion process would provide complementary information to explore the importance of these mechanisms in more depth.

Data Availability Statement: The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.