Genome-Wide Adductomics Analysis Reveals Heterogeneity in the Induction and Loss of Cyclobutane Thymine Dimers across Both the Nuclear and Mitochondrial Genomes

The distribution of DNA damage and repair is considered to occur heterogeneously across the genome. However, commonly available techniques, such as the alkaline comet assay or HPLC-MS/MS, measure global genome levels of DNA damage, and do not reflect potentially significant events occurring at the gene/sequence-specific level, in the nuclear or mitochondrial genomes. We developed a method, which comprises a combination of Damaged DNA Immunoprecipitation and next generation sequencing (DDIP-seq), to assess the induction and repair of DNA damage induced by 0.1 J/cm2 solar-simulated radiation at the sequence-specific level, across both the entire nuclear and mitochondrial genomes. DDIP-seq generated a genome-wide, high-resolution map of cyclobutane thymine dimer (T<>T) location and intensity. In addition to being a straightforward approach, our results demonstrated a clear differential distribution of T<>T induction and loss, across both the nuclear and mitochondrial genomes. For nuclear DNA, this differential distribution existed at both the sequence and chromosome level. Levels of T<>T were much higher in the mitochondrial DNA, compared to nuclear DNA, and decreased with time, confirmed by qPCR, despite no reported mechanisms for their repair in this organelle. These data indicate the existence of regions of sensitivity and resistance to damage formation, together with regions that are fully repaired, and those for which > 90% of damage remains, after 24 h. This approach offers a simple, yet more detailed approach to studying cellular DNA damage and repair, which will aid our understanding of the link between DNA damage and disease.


Introduction
DNA damage arises from endogenous sources, such as normal cellular metabolism, or exogenous sources, including exposure to ultraviolet radiation (UVR), chemicals and ionising radiation. These damaging agents modify the structure of DNA components, which may subsequently lead to mutations or cell death, with implications for diseases such as cancer. Arguably, the most serious effects of UVR exposure are DNA mutation and carcinogenesis [1]. The absorption of UVR damages DNA by forming photoproducts, such as cyclobutane pyrimidine dimers (CPD), that are mutagenic, and therefore potentially carcinogenic [2]. Additionally, an indirect mechanism, via the chemoexcitation of melanin, has been reported to induce CPD [3]. A variety of DNA repair pathways protect the cell from the detrimental effects of damage, of which nucleotide excision repair is of greatest importance for the removal of CPD [4,5].
The induction of damage by UVR depends upon the DNA sequence, local structure and chromatin environment/organisation [6,7] which, in part, contributes to the expected, differential distribution of damage formation. The other major factor determining the distribution of damage is DNA repair, which is mediated by distinct DNA repair pathways, and here chromatin organisation also plays a role [8], as chromatin structure and accessibility alter following UVR exposure [9]. These repair pathways are critical to maintain the integrity of the genome and prevent disease [10]. Furthermore, there is evidence that cells prioritise repair machinery to regions of specific need, to minimise disruption of function. For example, it is well established that repair is site-specific, with preferential removal of DNA damage from transcriptionally active genes over inactive regions [11,12] and transcribed strand-specific repair [12]. In addition, the nature of the lesion influences whether or not there is preferential repair in transcriptionally active genes, and in what stage of the cell cycle repair occurs [13]. Even within genes, particular regions may be favoured, such as the 5′ portion of the DHFR gene [14], although the molecular basis for such differential repair across genes remains subject to speculation. Supporting the importance of sequence-specific damage formation and repair is evidence that hotspots of CPD persistence are more likely to yield mutations [15,16]; that about 80-90% of all human cancers can be correlated to regions of unrepaired DNA [17]; and that in melanoma, changes to local DNA structure favour the formation of CPD hotspots, which are highly correlated with sites of recurrent mutation [18].
A growing number of techniques evaluating damage and repair within discrete locations are now emerging. Initially, these were targeted towards individual genes, e.g., through the use of ligation-mediated PCR [19] and immuno-coupled PCR [20]. However, more recently, genome-wide mapping of damage has become possible (reviewed in Mao et al. [18]). The earliest reports were limited to providing information at a chromosomal level only, with rather crude resolution [21], or offered little information in terms of gene-specific or intergenic regions [22]. In the last few years there has been a small number of reports in the literature describing methods for the genome-wide mapping of different types of DNA damage at high resolution. These methods include a series of approaches based upon combinations of excision repair enzymes (e.g., the Excision-seq approach [23]), modifications of methodology to map ribonucleotide incorporation [24], or damaged DNA immunoprecipitation (DDIP, analogous to methylated DNA immunoprecipitation (MeDIP) [25]), coupled with microarray (DDIP-chip, e.g., Teng et al. [26]) or next generation sequencing (DDIP-seq). These have then been applied to study the formation of a variety of DNA damage products, e.g., CPD [7], photoproducts [27], platinum-induced guanine adducts [28], double-strand breaks [29], 8-oxo-7,8-dihydro-2′-deoxyguanosine (8-oxodG) [25,30], and uracil [23]. Whilst DDIP-chip is a sensitive, reliable assay for DNA damage, and can evaluate the location of DNA damage at high resolution (100-1000 bp), this approach precludes detection at specific sites for which there is no array coverage. Additionally, covering the entire human genome by microarray at high resolution requires multiple microarrays, which may not be practical, or financially feasible [31].
Here, we report the application of a straightforward method that utilises the DDIP-seq approach to analyse UVR-induced DNA damage and repair across the entire human genome. DDIP-seq was used to characterise solar simulated radiation (SSR)-induced DNA damage and repair in the genome of human skin keratinocytes, and adds to our growing understanding of the distribution of damage and repair in both the nuclear and mitochondrial genomes.

The Effect of SSR Irradiation on HaCaT Cell Viability
Following exposure to 0.1 J/cm2 of SSR, the cells were allowed to recover for 24, 48 and 72 h. The administered dose of SSR did induce some cell death; however, most cells were viable and capable of repair and growth (Figure 1). The dose of SSR used is considered to be in the range of the erythemal dose (0.1-0.2 J/cm2) in Europe, according to the Tropospheric Emission Monitoring Internet Service (TEMIS).

Figure 1. HaCaT cells were irradiated with 0.1 J/cm2 SSR and then stained with Annexin V and propidium iodide, and assayed by flow cytometry after 24, 48 and 72 h. Control cells were sham-irradiated. Etoposide was used as a positive control, and viability assayed immediately after exposure. Error bars represent the mean ± SEM for three independent experiments.

Optimisation of DNA:Anti-T<>T MAb Ratio
We based our protocol for DDIP-seq upon a commercially available MeDIP assay, with optimisation for the detection of T<>T. In addition to the manufacturer's information for the anti-T<>T MAb, and previous use [27], we provided some additional characterisation data. These data demonstrated that the anti-T<>T MAb overwhelmingly recognised UVC-induced modified DNA, the predominant lesion in which is T<>T, over unmodified or H2O2-modified DNA (Figure 2A).
For initial method development, we used commercially available human genomic DNA, which was irradiated with UVB (0, 0.1, 0.2 and 0.5 J/cm2), fragmented to 100-300 bp by sonication, and the DNA:antibody ratio varied. After the immunoprecipitation, the samples were quantified by qPCR (DDIP-qPCR) at the GAPDH promoter and Myoglobin exon 2 regions, as representative transcriptionally active and inactive genes, respectively. The most pronounced dose-response was seen with a DNA/antibody ratio of 1:1 µg/mL, for GAPDH (Figure 2B), although this was less pronounced for Myoglobin exon 2 (Figure 2C).

Figure 2. Specificity of the anti-T<>T MAb, and its optimisation, determined by DDIP-qPCR. DNA immunoprecipitation was performed using the MeDIP kit from Diagenode, with commercially available, extracted, human DNA irradiated with 0, 0.1, 0.2 and 0.5 J/cm2 UVB, and a monoclonal antibody against thymine dimers (T<>T). (A) ELISA results demonstrating the specificity of the anti-T<>T MAb for UV-modified DNA. Quantitative PCR was performed using primers specific for (B) the GAPDH gene promoter, an actively expressed gene, and (C) the Myoglobin exon 2, an inactive gene. Recovery was expressed as a percentage of the amount of immunoprecipitated DNA compared to the input DNA after qPCR. The results are presented as the mean ± SEM of three independent experiments. **** represents p < 0.001.
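
The recovery metric used throughout (immunoprecipitated DNA as a percentage of input DNA) can be illustrated with a short calculation. The sketch below uses the widely used percent-of-input approach for immunoprecipitation qPCR; it is not taken from the paper, and the function name, the Ct values, the 10% input aliquot and the assumed amplification efficiency of 2 are all hypothetical choices made for illustration.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.1, efficiency=2.0):
    """Estimate IP recovery as a percentage of input from qPCR Ct values.

    ct_input is assumed to be measured on an aliquot that represents
    `input_fraction` of the DNA used for the immunoprecipitation.
    `efficiency` is the assumed per-cycle amplification factor (2 = perfect).
    """
    # Shift the input Ct to the value expected for 100% of the input DNA.
    adjusted_input_ct = ct_input - math.log(1.0 / input_fraction, efficiency)
    # Each cycle of difference corresponds to one fold-change of `efficiency`.
    return 100.0 * efficiency ** (adjusted_input_ct - ct_ip)


# Hypothetical Ct values, not data from the study.
example = percent_input(ct_ip=28.0, ct_input=25.0, input_fraction=0.1)
print(f"Recovery: {example:.2f}% of input")
```

If the input aliquot or primer efficiency differs from these assumptions, the corresponding arguments would need to be changed; the structure of the calculation stays the same.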

DDIP-qPCR Quantification of the Induction and Repair of CPDs Induced by UVR
The assay was then repeated to optimise the number of cells required to assess the UVB induction of CPD. The results revealed that when 1 million HaCaT cells were used, a good UVB dose-response was observed for both the GAPDH promoter (Figure 3A) and Myoglobin exon 2 (Figure 3B) loci, with a higher level of damage induction being noted at the active GAPDH promoter (Figure 3A) compared to the inactive Myoglobin exon 2 gene region (Figure 3B).
Following optimisation of the DDIP assay, the conditions were used to assess the induction and repair of CPDs in HaCaT cells following irradiation with SSR. Again, DDIP-qPCR was performed using GAPDH and Myoglobin exon 2 gene primers. The results demonstrated that SSR appears to preferentially induce T<>T in the active GAPDH gene, compared to the inactive Myoglobin exon 2 gene region (Figure 3C,D, respectively), confirming the results seen for naked DNA in Figure 2. For GAPDH, immediately after SSR irradiation, the percentage recovery of the immunoprecipitated sample to the input was 1.3%. This percentage decreased to 0.27% at 6 h, to 0.04% at 24 h, and to 0.028% at 36 h post-irradiation (Figure 3C). A markedly less rapid decrease was noted in the Myoglobin gene region, albeit with much less damage induced in the first place, and no noticeable repair over the first 6 h (Figure 3D).
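
To make the GAPDH time course easier to read, the reported recoveries can be re-expressed relative to the 0 h value. The sketch below uses only the four percentages quoted in the text; the function and variable names are illustrative, and treating recovery as directly proportional to remaining T<>T is an assumption of this example rather than a claim made by the authors.

```python
# Percentage recoveries at the GAPDH promoter quoted in the text (Figure 3C),
# keyed by hours after SSR irradiation.
gapdh_recovery_percent = {0: 1.3, 6: 0.27, 24: 0.04, 36: 0.028}


def signal_remaining(recovery, baseline_time=0):
    """Express each recovery as a percentage of the value at baseline_time."""
    baseline = recovery[baseline_time]
    return {t: 100.0 * r / baseline for t, r in recovery.items()}


gapdh_remaining = signal_remaining(gapdh_recovery_percent)
for hours in sorted(gapdh_remaining):
    print(f"{hours:>2} h: {gapdh_remaining[hours]:.1f}% of the initial signal")
```

On these figures, roughly a fifth of the initial signal remains at 6 h and only a few percent at 24 h and 36 h.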

Figure 3. DDIP-qPCR analysis of the induction, and repair, of T<>T in nuclear DNA of HaCaT cells exposed to UVR. DDIP-qPCR for T<>T was performed immediately after exposure using primers specific for the (A) GAPDH gene promoter, representative of active genes, and (B) for Myoglobin exon 2, representative of inactive genes, using an optimised level of 1 million cells, following increasing doses of UVB. The same assay was then applied to the analysis of T<>T levels in (C) GAPDH and (D) Myoglobin exon 2 genes, at 0, 6, 24 and 36 h after exposure to 0.1 J/cm2 SSR. Recovery, which represents the induction/repair of damage, was expressed as a percentage of the amount of immunoprecipitated DNA compared to the input DNA after DDIP-qPCR. The results are the mean ± SEM of three independent DDIP-qPCR experiments, **** p < 0.0001, ** p < 0.001.

Nuclear and Mitochondrial Genome-Wide Mapping of T<>T Induction and Repair
For the purposes of demonstrating the application of our DDIP-seq assay, our analyses focused specifically upon damage within gene regions. We identified the presence of damage in 13,680 genes in HaCaT cells immediately following irradiation with SSR. Representative results of the whole nuclear, genome-wide mapping of T<>T, to the human genome reference GRCh38, across a 7134 kb region of chromosome 11, q13.2, and a 7605 kb region of chromosome 7, q21.11, are illustrated in Figure 4 (A and B, respectively). Figure 4A,B (upper panels, in blue) both illustrate a clear heterogeneous distribution of reads (damage), in terms of amount and location, induced immediately after irradiation (0 h) across both regions. Some regions clearly reveal higher levels of damage, with an absence of damage in other regions. At 24 h post-irradiation, the total levels of damage decreased (and the number of genes in which damage was detected had decreased to 10,822), and the number of locations lacking damage increased (Figure 4A,B, lower panels in red). Damage clearly persisted for at least 24 h in some regions, whereas in others it was fully repaired, which did not appear related to the initial, induced levels of damage.
We extended this analysis to study the distribution of damage between chromosomes. Figure 5A illustrates the total levels of T<>T per chromosome, immediately after irradiation, and 24 h later. As might be expected for a directly damaging agent such as UVR, at this macro-scale, the total levels of T<>T per chromosome generally correlated with chromosome length. The exception to this was the X chromosome, which contained comparable levels of T<>T, before and after repair, to chromosome 20, despite being approximately 2.5 times longer. At this crude resolution, the amount of T<>T remaining after 24 h appeared to be proportional to initial levels of damage, representing a decrease of approximately 50% for each chromosome (Figure 5A). Expressing these data as number of T<>T-containing genes per chromosome (Figure 5B) revealed a similar distribution, for 0 h, to that seen in Figure 5A. However, there was less of a pronounced decrease in the number of T<>T-containing genes between 0 h and 24 h.
The heterogeneity in damage induction and repair noted in Figure 4 was confirmed by the detailed analysis of a smaller number of genes, as shown in Figure 6A, which illustrates differential sensitivities to damage formation, and rates of repair across a number of different nuclear genomic regions (Figure 6B). We also examined SSR-induced levels of T<>T at representative loci within the mitochondrial genome. Levels of damage were not uniformly distributed across the loci examined. Levels of damage tended to be higher at the mitochondrial loci (Figure 6C), compared to the nuclear loci (Figure 6A), being as much as 2.5-fold greater when comparing the most damaged loci in both genomes. Repair was more effective towards nuclear damage, with generally more damage persisting in the mitochondria after 24 h. As was observed with nuclear damage, loss of T<>T in mitochondria did not appear to be influenced by initial levels.
We used short-range qPCR as an alternative approach to further investigate the time-dependent loss of T<>T from the mitochondrial genome observed using DDIP-seq. This approach confirmed that levels of T<>T, in a 221 bp region spanning the Cytb and ND6 genes, decreased significantly over a 48 h period post-irradiation (Figure 7A). Given that it is possible that the loss of T<>T was due to turnover of the genomes of damage-containing mitochondria, we simultaneously evaluated mitochondrial DNA content in the irradiated HaCaTs. Although levels of mitochondrial DNA content appeared to decrease 6 h following irradiation, there were no significant differences in content between any of the timepoints (Figure 7B).
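
The nuclear gene counts reported in this section can be compared with the approximately 50% per-chromosome decrease in total T<>T using simple arithmetic. The sketch below uses only the two gene counts quoted in the text (13,680 at 0 h and 10,822 at 24 h); the names are illustrative, and the comparison assumes nothing beyond those published numbers.

```python
# Number of genes in which T<>T was detected, as quoted in the text.
genes_with_damage_0h = 13_680
genes_with_damage_24h = 10_822


def percent_decrease(before, after):
    """Return the decrease from `before` to `after` as a percentage of `before`."""
    return 100.0 * (before - after) / before


gene_count_decrease = percent_decrease(genes_with_damage_0h, genes_with_damage_24h)
print(f"Genes with detectable T<>T fell by {gene_count_decrease:.1f}% over 24 h")
```

The decrease in the number of damage-containing genes (about one fifth) is clearly smaller than the roughly 50% fall in total T<>T, which is consistent with damage being reduced within many genes without being cleared from them entirely.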

Figure 7. (A) The loss of solar simulated radiation-induced DNA damage (T<>T) from a representative region of the mitochondrial genome, determined by short-range qPCR. Points and bars represent the mean ± SEM of three independent experiments. *** p < 0.001, ** p < 0.01, and * p < 0.05, relative to an unirradiated sample. (B) Corresponding mitochondrial DNA content, from the experiment described in Figure 7A, determined by real-time qPCR. Bars represent the mean ± SEM of three independent experiments, ns = not significant, compared to the result for the zero h samples.

Discussion
There is a growing number of methods for studying the genome-wide induction of DNA damage and its repair [29,33,34]. Using our DDIP-seq method, we have mapped T<>T formation and repair across the entire nuclear and mitochondrial genomes. We noted that in the absence of repair (i.e., immediately after exposure), at high resolution, the induction of T<>T was distributed heterogeneously across the genome, presumably due to region-specific susceptibility to damage formation. Indeed, we showed that, following irradiation of extracted DNA, GAPDH appeared to be more intrinsically prone to damage formation than Myoglobin exon 2, suggesting the presence of possible conformational differences between the two genes that are present even in naked DNA, and which render GAPDH more sensitive to damage formation. However, other recent studies describe a uniform induction of pyrimidine dimers [(6-4)PP and CPD] [35] and cisplatin adducts [36]. At whole chromosome resolution, we noted similar results, but not at higher resolution. Indeed, a high-resolution examination of damage over representative 7134 or 7605 kb regions showed that the damage was heterogeneously distributed, confirming previous observations [7,37]. This finding was reiterated when we noted marked variation in the levels of damage induced in individual gene loci. The distribution of damage, depending upon its source, appears to be determined by a variety of inter-related elements [38,39], e.g., nuclear organisation (e.g., greater damage in sites in proximity to the nuclear membrane) [30], nucleotide sequence [34,40], proximity of metal ions [41], DNA-histone interactions, and epigenetic factors [42-45].
We also showed that repair is site-specific, confirming the results of others using genome-wide mapping techniques [27] and approaches targeted towards individual loci [12,46,47]. It is not entirely clear whether heterogeneous susceptibility, and hence damage distribution [48], or the distribution of repair activities [35,36], is primarily responsible for the steady-state distribution of damage across the genome.
Although genome-wide mapping techniques are being used more frequently, little attention had been directed towards mitochondria until one recent report [49]. These authors noted the presence of the DNA adduct 3-(2-deoxy-β-D-erythro-pentofuranosyl)pyrimido[1,2-α]purin-10(3H)-one (M1dG) at roughly equal levels throughout the mitochondrial genome, with no specific sites of enrichment, compared to untreated cells. Here, we are the first to study the induction and loss of T<>T across the mitochondrial genome. As in the nuclear genome, we note an apparent non-random distribution of damage, evidenced by different levels of damage at representative gene loci, in contrast to the results for M1dG, which the authors described as having no particular sites of accumulation [49]. Our findings are consistent with an assessment of ROS-induced DNA damage in specific coding regions of mitochondria, in which levels of damage differed across four sites (D-Loop, COII/ATPase6/8, ND4, ND5, and ND1) [50]. Unfortunately, the authors did not report the repair of damage at these individual sites [50]. In contrast, a more recent report indicated that levels of oxidised purines appear to be the same across three mitochondrial DNA regions (D-loop, Ori-L, and ND1) [51].
Similar to the findings for M1dG, we noted that levels of T<>T were significantly higher in the mitochondrial DNA, compared to nuclear. We also noted a loss of T<>T from mtDNA with time, and observed that this was not equally distributed across the mitochondrial genome, with some loci targeted preferentially. Using an assay highly sensitive to the detection of thymine dimers, the loss of UVC-induced mitochondrial DNA damage has been reported previously and attributed to DNA repair [52]. In the case of M1dG, induced global levels of damage persisted in mtDNA for at least 24 h; however, a genome-wide analysis was not performed, unlike the present study, so whether or not the sequence-specific loss of M1dG occurred cannot be evaluated.
In our study, mitochondrial levels of T<>T clearly decreased with time, as determined by DDIP-seq and confirmed by qPCR. The term loss is used here, rather than repair, as it is widely considered that mitochondria have no NER pathway per se for the removal of T<>T, although some NER-related proteins have been noted in the mitochondria, seemingly due to their association with the repair of oxidatively damaged DNA (reviewed in [53]). While alternative excision repair pathways exist in other species, to date, none have been reported in mammalian cells. It is possible that, in the absence of NER of T<>T or indeed M1dG, mitochondria with highly damaged DNA are degraded [54] or rescued by fusion with a mitochondrion with relatively undamaged DNA [55,56]. Were either the case, then one might expect the pattern of damage to remain the same at zero and 24 h post-irradiation, with the same decrease in damage across all loci, but this was not the case. Furthermore, we studied the mtDNA content of UV-irradiated cells across the time course of repair, and noted no significant changes in mtDNA content. Unless the production of new mitochondria and mitophagy were in equilibrium, these data suggest that T<>T can be actively removed from mtDNA, by unknown processes. From a cellular perspective, despite the 'logistics' of targeting NER proteins to the excess of mitochondrial genomes, compared to a singular nuclear genome, it might be more economical to repair bulky adducts, rather than generate new mitochondria. We are currently investigating this further.
It is also worth noting the differences between global genome, and genome-wide assessments of damage and repair. We and others have demonstrated previously that the global genome repair of UVB-induced CPD is a lengthy process (t1/2 > 48 h) [57]. This is markedly different to the results with DDIP-seq, which revealed that some loci are fully repaired within 24 h of irradiation, whereas for others, up to 96% of the initial damage remains 24 h later. This indicates that, whilst informative, measurement of global genome levels of DNA damage and repair may not fully reflect events in specific regions of the genome; this in turn has consequences for follow-on biological effects, notably cell transformation and/or death.
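
The contrast between the global half-life and the locus-specific observations can be made concrete with a small calculation. The sketch below assumes simple first-order (exponential) loss of damage, which is an illustrative simplification not stated in the paper; the function names are hypothetical, and the only values taken from the text are the 48 h lower bound on the global half-life, the 24 h time point and the 96% figure.

```python
import math


def fraction_left(t_hours, half_life_hours):
    """Fraction of damage remaining after t_hours, assuming first-order loss."""
    return 0.5 ** (t_hours / half_life_hours)


def implied_half_life(t_hours, fraction_remaining):
    """Half-life (h) implied by a fraction remaining at t_hours, first-order loss."""
    return t_hours * math.log(0.5) / math.log(fraction_remaining)


# Global genome: a half-life of 48 h is the lower bound quoted in the text.
global_left_24h = fraction_left(24, 48)
# Slowest loci: 96% of the initial damage still present at 24 h.
slow_locus_half_life = implied_half_life(24, 0.96)

print(f"Global estimate: at least {100 * global_left_24h:.0f}% remaining at 24 h")
print(f"Slowest loci: implied half-life of about {slow_locus_half_life:.0f} h")
```

Under this assumption, a global half-life of 48 h or more means that at least about 71% of the damage would remain at 24 h on average, whereas individual loci range from complete removal to an implied half-life of several hundred hours.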

Conclusions
Like others, we have demonstrated that the induction and repair of damage is heterogeneous in nuclear DNA, but importantly we have extended these investigations to include the mitochondrial genome and, for the first time, shown results similar to those for the nuclear genome. These findings imply the presence of an, as yet, unidentified process for the removal of T<>T from mitochondrial DNA.
These data illustrate that genome-wide mapping adductomic approaches, such as DDIP-seq, provide the potential for developing a greater understanding of the formation and repair of damage, giving a more mechanistic insight into the link between DNA damage, repair, downstream events and disease, which is currently a "black box".

Cell Preparation and Treatment
Cells at 80% confluence were irradiated with SSR, or UVB, or UVC on ice. Following irradiation, fresh growth medium was added, and the cells were then incubated in a 5% CO2 incubator at 37 °C for different times to permit DNA repair, and/or evaluate viability. At each time point the cells were trypsinised and used in subsequent assays.
The source of SSR was a SUNTEST® CPS+ cabinet (Atlas, Mount Prospect, IL, USA), which was programmed to irradiate the cells with 0.1 J/cm2. This low dose of SSR is considered to be in the range of the erythemal dose (0.1-0.2 J/cm2) in Europe, according to TEMIS (http://www.temis.nl/uvradiation/UVdose.html). UVB irradiation was performed using a custom-made exposure cabinet (Hybec Ltd., Leicester, UK), as described previously [57]. For optimisation of some assay conditions, isolated DNA was irradiated at different doses of UVB (0.25, 0.5, 0.75 and 1 J/cm2). To study the loss of T<>T in mtDNA specifically, HaCaTs were cultured in Petri dishes, as above, and exposed to 0.12 J/cm2 UVC (as a model system for effectively inducing T<>T), before being returned to the cell culture incubator for 3, 6, 24 and 48 h. At these specific time points, the Petri dishes containing cells were removed. DNA was extracted using a QIAamp DNA mini kit (Qiagen, Manchester, UK), and quantified using a NanoDrop One (Thermo Fisher Scientific, Waltham, MA, USA), prior to quantitative PCR analysis, as described below. For the ELISA, cells were exposed to 50 µM H2O2 for 30 min, on ice (as described elsewhere [59]), before DNA was extracted, and ELISA performed, as referenced below.

Cell Viability
At 24, 48 and 72 h following SSR exposure, cell viability was assessed using the Human Annexin V-FITC Apoptosis Kit (Bender Medsystems, Vienna, Austria) as described in our previous study [57]. Briefly, cells were trypsinised and centrifuged at 300× g for 5 min, prior to resuspension in 5 mL of fresh media, transferred to FACS tubes, and centrifuged at 300× g for 5 min. The supernatant was discarded, and the pellet resuspended in 1 mL of Annexin buffer, followed by the addition of 4 µL of Annexin V-FITC conjugate, and incubated at room temperature for 10 min. Subsequently, 30 µL (0.05 mg/mL) propidium iodide was added, and the cells incubated at room temperature for 1 min. Finally, the cells were analysed by flow cytometry (FACScan flow cytometer, Becton Dickinson, Wokingham, UK) using CellQuest software (Becton Dickinson, Wokingham, UK).

DNA Extraction and Preparation
Following treatment, HaCaT cells were pelleted by centrifugation at 300× g for 5 min, and the supernatant discarded. The cell pellet was washed twice with PBS, resuspended in 10 mL of PBS and then centrifuged at 500× g for 5 min at 4 °C, and the supernatant discarded. The cell pellet was then resuspended in 500 µL of complete GenDNA Digestion buffer (5 µL of 200× proteinase K added to 1 mL of GenDNA Digestion buffer) and incubated at 50 °C for 18 h in a thermoshaker. DNA was extracted using the GenDNA module buffers (Diagenode MeDIP kit; Diagenode, Liège, Belgium), according to the manufacturer's instructions.

DNA Immunoprecipitation (DDIP) Assay
The principle of the DDIP assay for 8-oxodG (OxiDIP-Seq) was first described by Amente et al. [25], and developed in house based upon the MeDIP kit from Diagenode. The kit was used as described by the manufacturer, but using an anti-thymine dimer antibody (clone KTM53, Kamiya Biomedical Company, Tukwila, WA, USA), with the DNA:antibody ratio optimised (1:1 µg/mL), as determined in the present study. For optimisation, DDIP-qPCR was performed using different ratios of DNA and anti-T<>T antibody (DNA:Ab): 1:0.1 µg/mL, 0.1:0.1 µg/mL and 1:1 µg/mL. Evidence for the specificity of the anti-T<>T mAb for T<>T, rather than ROS-induced DNA damage, was provided by ELISA [60], using DNA extracted from cells exposed to UVC, or H2O2.
The immunoprecipitated (IP) DNA sample incubation mix (without the DNA samples added) was prepared in a total volume of 65 µL for one immunoprecipitated (IP) and input (IN) sample as follows, using buffers supplied with the kit: Buffer A, 24 µL; Buffer B, 6 µL; water, 35 µL. Then, 65 µL of IP incubation mix and 10 µL of sheared DNA were added per tube, making the total volume per IP'd sample 75 µL. For the input (control; IN) sample, 13 µL of IP incubation mix and 2 µL of DNA were added, making the total volume 15 µL. The IN sample acts as an internal control and represents purified, total background genomic DNA taken prior to IP that does not undergo IP, but does undergo amplification by qPCR. The samples were then incubated at 95 °C for 3 min, quickly chilled on ice and then pulse microfuged for a short time at 4 °C. The IN samples were kept at 4 °C overnight. Throughout the experimental workflow for each IP DNA sample, a separate IN sample was also used.
The anti-T<>T KTM53 antibody was added to each IP sample, and all contents were transferred to new tubes containing 20 µL of preblocked protein A/G beads. The tubes were then placed on a rotating wheel at 4 °C and incubated overnight. Next, the IP samples were washed with 450 µL of ice-cold wash buffer-1 and placed on a rotating wheel for 5 min at 4 °C. After the incubation, the samples were centrifuged at 6000 rpm at 4 °C for 1 min, and the supernatant discarded. The bead pellets were washed again with 450 µL of ice-cold wash buffer-2 and -3 and twice with wash buffer-4.
The IN samples were treated in parallel with the IP samples from this point. The elution buffer was prepared (103.5 µL buffer D, 11.5 µL buffer E and 5 µL buffer F). Next, 120 µL of complete elution buffer was added to the bead pellets and IN samples. All the tubes were incubated in a thermo-shaker for 10 min at 65 °C (1000-1300 rpm). DNA was purified and eluted using a QIAquick® PCR purification kit (Qiagen, Manchester, UK). Briefly, 600 µL of PB buffer was added to each tube. The tube's contents were then transferred to QIAquick columns and centrifuged at 4000 rpm for 1 min. The filter column was washed with 700 µL of PE buffer and was then centrifuged at 4000 rpm for 1 min. The flow-through was then discarded. The filter columns were centrifuged at 13,000 rpm for 1 min and were eluted with 50 µL of EB. The tubes were then incubated at 50 °C for 5 min and centrifuged at 13,000 rpm for 1 min. The concentrations of the IP and IN samples were measured with a Qubit® fluorimeter using a Qubit® dsDNA HS Assay Kit, as directed by the manufacturer (Thermo Fisher Scientific, Altrincham, UK).
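The volumes quoted in this section can be cross-checked with a few lines of arithmetic. The following is a minimal sketch only: the dictionary and function names are illustrative, and the volumes are simply copied from the protocol text. It shows that the stated totals are self-consistent and that the IN tube has the same composition as the IP tube at one-fifth of the volume.

```python
from fractions import Fraction

# Volumes in µL, copied from the protocol text (names are illustrative).
INCUBATION_MIX = {"Buffer A": 24, "Buffer B": 6, "water": 35}
ELUTION_BUFFER = {
    "buffer D": Fraction("103.5"),
    "buffer E": Fraction("11.5"),
    "buffer F": 5,
}
IP_TUBE = {"incubation mix": 65, "DNA": 10}
IN_TUBE = {"incubation mix": 13, "DNA": 2}


def total(volumes):
    """Sum of all component volumes (µL)."""
    return sum(volumes.values())


def dna_fraction(tube):
    """Exact fraction of the tube volume contributed by the DNA sample."""
    return Fraction(tube["DNA"], total(tube))


if __name__ == "__main__":
    print("incubation mix:", total(INCUBATION_MIX), "µL")
    print("elution buffer:", float(total(ELUTION_BUFFER)), "µL")
    print("IP tube:", total(IP_TUBE), "µL; IN tube:", total(IN_TUBE), "µL")
    print("DNA fraction IP / IN:", dna_fraction(IP_TUBE), "/", dna_fraction(IN_TUBE))
```

Exact fractions are used so that the comparison between the IP and IN tubes does not depend on floating-point rounding.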

Removal of CPD Adducts Prior to PCR and Next-Generation Sequencing
CPD adducts are bulky and can block the DNA polymerase during the PCR amplification steps. The purified IP samples (25 µL) were all incubated with 1 µL of reagent from the PreCR DNA repair kit (New England Biolabs, city, UK) for 20 min at 37 °C, to remove the CPD prior to the analysis by qPCR. Then, 5 µL of the mixture was added to 15 µL of the master mix including the control primer. The DNA samples were then amplified as follows: 2 min at 95 °C, then 25 cycles of 10 s at 95 °C, 30 s at 65 °C, and 1 min at 72 °C. After the PCR, DNA was purified using a QIAquick PCR purification kit (Qiagen, Manchester, UK) and eluted in 50 µL of elution buffer (Qiagen, Manchester, UK).

Quantitative PCR Analysis
Following DIP, quantitative PCR (qPCR) was conducted to assess the success of the immunoprecipitation step. qPCR reactions were performed in duplicate in a final volume of 16 µL using the SensiMix™ SYBR® Hi-ROX Kit (Bioline, London, UK), on a 7300 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA). A total of 5 µL of IP and IN samples were used per reaction with 1× SYBR green, and a mix of forward and reverse primer at a final concentration of 259 nM. Two primer sets were used for qPCR analysis: GAPDH and Myoglobin Exon 2 (Table 1). The PCR conditions included a 10 min denaturation step at 95 °C, followed by 40 cycles of 30 s at 95 °C, 30 s at 60 °C and 30 s at 72 °C.
Table 1. PCR primer sequences for target gene regions, performed by DIP-qPCR analysis.

MicroPlex Library Preparation™ Kit for Next-Generation Sequencing
The MicroPlex Library Preparation kit (Diagenode, Liège, Belgium) was used to prepare the indexed sequencing libraries for Illumina MiSeq next-generation sequencing. Briefly, template DNA was first repaired and blunt-end molecules were generated using the manufacturer's template preparation. Stem-loop adaptors were then ligated to the 5′ end of the genomic DNA and the 3′ ends of the genomic DNA were extended. The libraries were then amplified using Illumina-compatible index primers. Each single step was performed according to the manufacturer's recommendations.
The MicroPlex libraries were purified using AMPure® XP beads (Beckman Coulter, High Wycombe, UK), and quantified by real-time qPCR using the KAPA Biosystems library quantification kit (Roche Diagnostics, Burgess Hill, UK) according to the manufacturer's instructions. The indexed sequencing libraries were then pooled at equimolar concentration. The pool was spiked with 1% PhiX and sequenced on an Illumina MiSeq sequencer using 2 × 308 paired-end sequencing.
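Equimolar pooling amounts to taking the same molar quantity of each indexed library, so the volume drawn from each one is inversely proportional to its concentration. The following is a minimal sketch of that calculation, assuming library concentrations in nM (as typically reported by qPCR-based quantification); the function name, library names and target amount are hypothetical and are not taken from the study.

```python
def equimolar_volumes(conc_nM, fmol_per_library):
    """Return the volume (µL) of each library that contains fmol_per_library.

    1 nM equals 1 fmol/µL, so volume (µL) = amount (fmol) / concentration (nM).
    """
    volumes = {}
    for name, conc in conc_nM.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = fmol_per_library / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical concentrations, for illustration only.
    example = {"lib_0h": 4.0, "lib_24h": 2.0}
    for name, vol in equimolar_volumes(example, fmol_per_library=10.0).items():
        print(f"{name}: {vol:.2f} µL")
```

In practice the target amount would be chosen so that no volume exceeds what is available for the most dilute library.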

Bioinformatic Analysis of NGS Data
Raw sequencing fastq files were analysed using the bioinformatics pipeline, as follows. First, the data were preprocessed and quality control performed on the reads, anaTQC (http://goo.gl/6TUqD). Then, the adapter sequences were removed using Trimmomatic, and the read mapper Burrows-Wheeler Aligner (BWA; http://bio-bwa.sourceforge.net/) [61] was used for alignment. The reads from each sample were mapped to the human genome assembly GRCh38. Next, the mapped data (SAM files) were filtered, in BWA, using a mapping quality (MAPQ) score with a cut-off of q30, to eliminate reads that mapped to more than one location in the genome, as well as poor-quality and erroneous alignments. Following the alignment, the resulting SAM files were sorted and indexed using SAMtools (http://www.htslib.org/) [62]. The Integrative Genomics Viewer (IGV) (http://www.broadinstitute.org/igv/) was used to visualise the reads mapped to the reference genome GRCh38. Next, the genes were identified per sample using the GENCODE tool to find gene annotations and to compare them at different time points following exposure to SSR.
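The MAPQ filtering step can be illustrated in plain Python. This is a sketch under stated assumptions rather than the command used in the study: it reads SAM-formatted text lines, keeps header lines, and retains only mapped alignments whose MAPQ (the fifth tab-separated column in the SAM format) is at least 30. The function name and the example reads are hypothetical.

```python
def filter_sam_by_mapq(lines, min_mapq=30):
    """Yield SAM header lines and mapped alignments with MAPQ >= min_mapq."""
    for line in lines:
        if line.startswith("@"):
            yield line  # header lines are passed through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM alignment line has at least 11 mandatory fields
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & 0x4:
            continue  # FLAG bit 0x4 marks an unmapped read
        if mapq >= min_mapq:
            yield line


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    example = [
        "@HD\tVN:1.6\tSO:unsorted",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t0\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
        "r4\t16\tchrM\t300\t30\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    for kept in filter_sam_by_mapq(example):
        print(kept)
```

For real data, an established tool that reads SAM/BAM files directly would normally be preferred over hand-parsing; the sketch is only meant to make the filtering criterion explicit.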

Quantification of mtDNA Damage
In order to further study the loss of T<>T from the mitochondrial genome noted by DDIP-seq, we performed "short range" qPCR of mtDNA damage in HaCaTs, using UVC as an effective source of T<>T. Using a method based on that of Santos et al. [63], total genomic DNA from UVC-irradiated HaCaTs underwent PCR (Mastercycler Pro; Eppendorf, Hamburg, Germany) using primers specific for a 221 bp region spanning the Cytb and ND6 genes, and using LongAmp Taq (New England Biolabs, Ipswich, MA, USA). The PCR products were quantified in a Synergy 2 microplate reader (BioTek, Winooski, VT, USA), based upon PicoGreen fluorescence. Lesion frequency/10 kb was calculated according to the formula reported elsewhere [64].
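The formula from [64] is not reproduced in the text. A form commonly used in PCR-based lesion assays treats lesions as Poisson-distributed, so that the fraction of undamaged template equals the ratio of amplification in treated versus control samples, and lesions per amplicon = −ln(A_treated / A_control). The sketch below assumes that form and scales it to 10 kb using the 221 bp amplicon length given above; the function name and the fluorescence values are illustrative assumptions, and the exact expression in [64] may differ.

```python
import math


def lesion_frequency_per_10kb(treated_signal, control_signal, amplicon_bp=221):
    """Estimate lesions per 10 kb from relative amplification (assumed Poisson model).

    treated_signal and control_signal are blank-subtracted product signals
    (for example PicoGreen fluorescence) from damaged and undamaged DNA.
    """
    if treated_signal <= 0 or control_signal <= 0:
        raise ValueError("signals must be positive")
    relative_amplification = treated_signal / control_signal
    # Poisson zero class: P(no lesion in amplicon) = exp(-lesions per amplicon)
    lesions_per_amplicon = -math.log(relative_amplification)
    return lesions_per_amplicon * 10_000 / amplicon_bp


if __name__ == "__main__":
    # Hypothetical readings: treated DNA amplifies to half of the control.
    print(round(lesion_frequency_per_10kb(50.0, 100.0), 2))
```

Under this model, equal amplification in treated and control samples gives zero lesions, and a return towards the control signal over time corresponds to loss of lesions from the amplified region.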

Quantification of mtDNA Content
Mitochondrial DNA content was evaluated via the analysis of a small region of mtDNA (83 bp amplicon of the D-loop), relative to a small region of a nuclear, single-copy gene (93 bp amplicon of beta 2 microglobulin, β2M). The regions of interest were amplified using a real-time quantitative PCR method, after modification and optimisation of the PCR conditions reported elsewhere [65][66][67]. The mitochondrial and nuclear DNA was amplified using Maxima SYBR Green qPCR Master Mix (2×) and a QuantStudio 3 (Applied Biosystems, Foster City, CA, USA). The relative amplification of mtDNA, and hence mtDNA content, was calculated using the 2^(−ΔΔCt) method, as described in [67].
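As an illustration of the 2^(−ΔΔCt) calculation, the sketch below normalises the mitochondrial (D-loop) Ct to the nuclear (β2M) Ct within each sample, then expresses the treated sample relative to a control. The function names and Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both amplicons, as the method itself does.

```python
def delta_ct(ct_mt, ct_nuclear):
    """Ct of the mtDNA target minus Ct of the nuclear reference, same sample."""
    return ct_mt - ct_nuclear


def relative_mtdna_content(sample_mt, sample_nuc, control_mt, control_nuc):
    """Relative mtDNA content of a sample versus a control by 2^(-ddCt)."""
    ddct = delta_ct(sample_mt, sample_nuc) - delta_ct(control_mt, control_nuc)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values: the mtDNA target crosses threshold one cycle
    # earlier in the sample than in the control, with the same nuclear Ct.
    print(relative_mtdna_content(17.0, 25.0, 18.0, 25.0))
```

A value close to 1 indicates no change in mtDNA content relative to the control.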

Statistical Analysis
All the experiments were conducted in triplicate and the results expressed as mean ± SEM, unless indicated otherwise. The statistical analysis was performed by one-way analysis of variance (ANOVA), using GraphPad Prism software v. 6.0. Significance limits were set at * p < 0.05, ** p < 0.01, *** p < 0.001 and **** p < 0.0001.

Funding: The research reported in this publication was supported, in part, by the National Institute of Environmental Health Sciences of the National Institutes of Health under award number R15ES027196. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.