Demethylation of Non-CpG Sites in DNA Is Initiated by TET2 5-Methylcytosine Dioxygenase

Abstract: In the mammalian genome, cytosine methylation predominantly occurs at CpG sites. In addition, a number of recent studies have uncovered extensive C5 cytosine methylation (5mC) at non-CpG (5mCpH, where H = A/C/T) sites. Little is known about the enzyme responsible for active demethylation of 5mCpH sites. Using a very sensitive and quantitative LC–MS/MS method, we demonstrate that human TET2, an iron(II)- and 2OG-dependent dioxygenase whose gene is frequently mutated in several myeloid malignancies, as well as in a number of other types of cancers, can oxidize 5mCpH sites in double-stranded DNA in vitro. Similar to oxidation of 5mCpG, oxidation of 5mC at CpH sites produces 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxycytosine (5caC) bases in DNA. After 5mCpG, which is the most preferred substrate, TET2 prefers 5mCpC as a substrate, followed by 5mCpA and then 5mCpT. Since the TDG/BER pathway and deformylation or decarboxylation of 5fC or 5caC, respectively, can convert 5fCpH and 5caCpH to an unmodified cytosine base in DNA, our results suggest a novel demethylation pathway of 5mCpH sites initiated by TET2 dioxygenase.


Introduction
Methylation of cytosine at carbon-5 (5mC) within the CpG dinucleotide (5mCpG) in DNA plays crucial roles in X-chromosome inactivation, gene imprinting, nuclear reprogramming, and tissue-specific gene expression in mammalian cells [1][2][3]. Dynamic epigenetic regulation of 5mCpG also plays critical roles in multiple stages of pluripotency, differentiation, and development [4]. There appear to be more than 20 million CpG dinucleotides in the human genome with 70-80% of CpG cytosines being methylated [5]. A delicate balance between cytosine methylation and demethylation within CpG dinucleotides shapes the final epigenetic pattern of a cell, and any imbalance in methylation patterns results in pathological conditions, including cancer [6,7].
In addition to 5mCpG marks, several recent genome-wide bisulfite sequencing studies at single-base resolution have uncovered a significant amount of 5mC at CpH sites in almost all human cells [8][9][10][11][12]. Particularly, in human embryonic stem cell lines, 67.85%, 6.68%, 1.48%, and 0.63% of all CpG, CpA, CpT, and CpC sites are methylated, respectively [11]. In brain cells, neurons, and different embryonic stem cells (ESCs), one quarter of all methylation marks identified were on CpH sites [8][9][10][11][13]. Generally, as in the case of 5mCpG marks, the presence of 5mCpH sites also inversely correlated with transcription [8][9][10][11][13]. In addition, the 5mCpH marks correlate with genomic imprinting and regulation of inter-chromosomal interactions between enhancer elements and receptor genes, and are predictive of genes that escape X-inactivation [1,14]. Further, several recent studies have demonstrated a correlation between non-CpG methylation and gene expression in different types of cancers [15,16]. Finally, a number of studies have reported the presence of 5hmC in the context of CpH in human cells [17,18]. Collectively, these initial studies suggest critical roles for 5mCpH marks in the mammalian genome.
Interestingly, in vitro substrate specificity studies with DNMT3A using oligonucleotides demonstrated that this enzyme could methylate CpH sites [25,26]. Further in vivo studies have established a strong correlation between the presence of 5mCpH marks and the expression of DNMT3A, DNMT3B, and DNMT3L genes [27][28][29]. Finally, studies in ESCs provided convincing evidence that DNMT3A and DNMT3B methylate CpH sites [30,31]. Although passive demethylation of 5mCpH sites during cell division may occur, in some instances dynamic active demethylation has been observed [13,17,32,33]. However, Hu et al. demonstrated that, although the human TET2 enzyme can rapidly oxidize 5mCpG sites, oxidation of both the 5mCpA and 5mCpC sites was negligible [34]. In contrast to human TET2 dioxygenase, a TET homolog from N. gruberi, NgTET1, which is structurally very similar to human TET2, can oxidize DNA containing 5mCpG and 5mCpA sites with comparable efficiency [35]. These conflicting results raise questions with regard to the tolerance of bases at the +1 position in TET-mediated 5mC oxidation of DNA substrates.
In this study, using a very sensitive LC-MS/MS-based assay, we show that human TET2 can oxidize 5mCpH sites in double-stranded DNA. Similar to oxidation of 5mCpG, oxidation of 5mC in CpH produces 5hmC, 5fC, and 5caC oxidation marks. Since the human TDG enzyme can excise 5fC and 5caC from CpH sites, allowing their replacement with unmodified cytosine, along with the recently established deformylation and decarboxylation pathways, our results demonstrate a novel demethylation pathway of 5mCpH sites initiated by TET2 dioxygenase [36][37][38][39][40]. These results may also help to elucidate the emerging role of non-CpG methylation in gene expression and cancer.

Chemicals and Reagents
Analytical grade chemicals were purchased from Thermo Fisher Scientific or Sigma-Aldrich. All molecular biology kits were obtained from Superior Scientifics (Lenexa, KS, USA). Nucleoside standards and growth media were supplied by Carbosynth (now Biosynth Carbosynth; Compton, UK) and Difco Laboratories (Detroit, MI, USA), respectively.

Purification of hTET2 Catalytic Domain
In our earlier publication, we reported the cloning and expression of the human TET2 catalytic domain (TET2 1129-1936, ∆1481-1843) [41]. Purification of the TET2 catalytic domain was performed using cation exchange chromatography, which is described in our previous publication [42].

Purification of Full-Length Myc-Tagged hTET2
The full-length hTET2 was cloned with an N-terminal Myc-tag, as described earlier, and purified on a large scale with minor modifications [43]. Briefly, the Myc-TET2 was ectopically expressed in human embryonic kidney 293 cells, obtained from ATCC (Manassas, VA, USA), by plasmid transfection for 3 days. Protein was extracted in RIPA buffer supplemented with 300 mM NaCl and 1X protease inhibitor. Protein concentration was measured using the BCA assay, and the extract was diluted to 1 mg/mL with 1X PBS for affinity purification using Myc-tag antibody-conjugated magnetic beads (Cell Signaling, Danvers, MA, USA). The Myc-TET2 protein was eluted using 0.5 mg/mL c-Myc peptide in Tris-buffered saline containing 0.05% Tween-20 for 5-10 min at 37 °C.

5mC Oxidation by TET2 Dioxygenase
25-mer double-stranded DNA was used for the TET2 enzymatic assay. The substrates contain only one 5-methylcytosine residue on one strand only. The specific sequences of the substrates used are provided in Table 1A-C and Table 2. The reactions were performed in triplicate. Each reaction was carried out in 100 µL total reaction volume, containing 3 µg of substrate, 500 µg of TET2 enzyme, 75 µM FeSO4, 1 mM 2-oxoglutarate, 5 mM ascorbate, and 50 mM HEPES (pH 8.0). The enzymatic reactions of the GST-tagged enzyme (100 µg) and full-length TET2 enzyme were carried out under identical conditions. After incubating at 37 °C for 30 min, the reactions were stopped using 5 µL of 500 mM EDTA. Using an oligo purification technique, the DNA was separated from the reaction. Then, the DNA (20 µL) was digested with 2 units of DNase I and 60 units of S1 Nuclease in 40 µL reaction volume. The reactions were incubated for 12 h at 37 °C to generate individual nucleoside monophosphates. After this step, 2 units of calf-intestinal alkaline phosphatase (CIAP) were added to the reactions. After 4 h of incubation at 37 °C, individual nucleosides were produced. Thirty-five µL of each sample was analyzed by LC-MS/MS.
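The quenching step above can be checked with simple dilution arithmetic. The following Python sketch is illustrative only: it assumes that the stated 75 µM FeSO4 is the final concentration in the 100 µL reaction, and the function and variable names are hypothetical. EDTA is a metal chelator, so the quantity of interest is its molar excess over Fe(II).

```python
# Minimal sketch of the quench arithmetic (assumption: 75 uM FeSO4 is the
# final concentration in the 100 uL reaction; names are illustrative).

def final_concentration_mM(stock_mM, added_uL, start_uL):
    """Concentration of an added stock after mixing (C1*V1 = C2*V2)."""
    return stock_mM * added_uL / (start_uL + added_uL)

REACTION_UL = 100.0      # total reaction volume from the text
EDTA_STOCK_MM = 500.0    # EDTA stock concentration from the text
EDTA_ADDED_UL = 5.0      # EDTA volume from the text
FE_UM = 75.0             # FeSO4 concentration from the text

# Final EDTA concentration in the quenched 105 uL mixture.
edta_final_mM = final_concentration_mM(EDTA_STOCK_MM, EDTA_ADDED_UL, REACTION_UL)

# Amounts in nmol: mM * uL = nmol, and uM * uL / 1000 = nmol.
edta_nmol = EDTA_STOCK_MM * EDTA_ADDED_UL
fe_nmol = FE_UM * REACTION_UL / 1000.0
edta_to_fe_ratio = edta_nmol / fe_nmol

print(f"Final EDTA: {edta_final_mM:.1f} mM")
print(f"EDTA : Fe(II) molar ratio: {edta_to_fe_ratio:.0f} : 1")
```

Under these assumptions the quenched mixture contains roughly 24 mM EDTA, a several-hundred-fold molar excess over the iron cofactor.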

Liquid Chromatographic Conditions for Nucleosides under Different MS/MS Modes
A neutral solvent system was used for chromatographic separation in negative mode. Solvent A was 10 mM ammonium acetate (pH 6.5), and solvent B was 80% acetonitrile/20% 10 mM ammonium acetate (pH 6.5). The gradient was 0% solvent B at 0-2 min, 0-20% at 2-5 min, 20-60% at 5-9 min, 60-0% solvent B at 9-10 min and a 5 min post-equilibration with solvent A. The overall flow rate was 0.3 mL/min. A water/methanol-based solvent system was used for chromatographic separation in positive mode. Solvent A was water (adjusted to pH 3.5 using formic acid) and solvent B was methanol (adjusted to pH 3.0 using formic acid). The gradient was 0% solvent B at 0-1 min, 0-2% at 1-12 min, 2-30% at 12-17 min, 30% at 17-18 min, 30-0% solvent B at 18-18.5 min and a 3.5 min post-equilibration with solvent A. The overall flow rate was 0.3 mL/min. Standard curves, the limit of detection (LOD), the lower limit of quantification (LLOQ), and the matrix effect for all eight nucleosides in the positive and negative modes were calculated using our developed method [42]. Mass spectrometry parameters and standard curves for modified cytosines are provided under Supplementary Materials (Tables S1-S3 and Figures S1 and S2). An in vitro TET2 enzymatic assay for each substrate was performed in triplicate. The amount of product formed during TET2 oxidation reactions across different DNA substrates was normalized by calculating the peak area of each product (i.e., 5hmC or 5fC or 5caC) and dividing it by the area represented by one deoxycytidine residue. For statistical analysis, standard deviation and standard error were calculated for each oxidative product generated in the assays performed in triplicate. Calculated standard error is represented in figures and tables.
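The two gradient programs described above can be written down as breakpoint tables, which makes the total run times easy to verify. The Python sketch below is a hedged illustration: the breakpoints are transcribed from the text, but linear ramps between breakpoints are an assumption (the text gives only segment endpoints), and the names `NEGATIVE_MODE`, `POSITIVE_MODE`, and `percent_b` are hypothetical.

```python
from bisect import bisect_right

# (time in min, % solvent B) breakpoints transcribed from the text.
# The final point of each program is the end of post-equilibration.
NEGATIVE_MODE = [(0, 0), (2, 0), (5, 20), (9, 60), (10, 0), (15, 0)]
POSITIVE_MODE = [(0, 0), (1, 0), (12, 2), (17, 30), (18, 30), (18.5, 0), (22, 0)]

def percent_b(program, t):
    """Return % solvent B at time t, assuming linear ramps between breakpoints."""
    times = [point[0] for point in program]
    if t <= times[0]:
        return float(program[0][1])
    if t >= times[-1]:
        return float(program[-1][1])
    i = bisect_right(times, t)          # index of the first breakpoint after t
    (t0, b0), (t1, b1) = program[i - 1], program[i]
    return b0 + (b1 - b0) * (t - t0) / (t1 - t0)

def run_time(program):
    """Total method length in minutes, including post-equilibration."""
    return program[-1][0]

print(run_time(NEGATIVE_MODE), run_time(POSITIVE_MODE))
print(percent_b(NEGATIVE_MODE, 3.5), percent_b(POSITIVE_MODE, 14.5))
```

With post-equilibration included, the programs are 15 min long in negative mode and 22 min long in positive mode.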

Detection of Cytosine Derivatives Produced by Oxidative Demethylation of 5mCpG and 5mCpA Sites
Despite the critical roles played by cytosine derivatives (5hmC, 5fC, and 5caC) in human health and disease, in vitro quantification of these recently identified nucleic acid bases is still in an early stage. Recently, we described an improved positive/negative ion-switching-based LC-MS/MS method that can separate and quantify these modified cytosine bases [42]. Compared to an earlier method [34], this method improved the detection sensitivity for 5hmC and 5fC 6-8-fold and for the 5caC base 32-fold. Using this sensitive method, we determined the activity of the TET2 catalytic domain, using a 25-mer dsDNA containing one 5mC residue in a CpG mark (Table 1A). After TET2 enzymatic reactions, the DNA was purified and converted into nucleosides. The nucleoside mixtures were subjected to mass spectrometry analysis using our previously described method [42]. In the TET2 enzymatic reactions, in addition to A, T, G, C, and 5mC peaks, three new peaks were observed: two, corresponding to 5hmC and 5fC, in the positive mode, and one, corresponding to 5caC, in the negative mode (Figure 1A).

Figure 1.
Top panels are for the negative control (without TET2 enzyme), which showed only 5mC but no 5hmC, 5fC, and 5caC. Lower panels are for the MRM chromatograms of 5mC, 5hmC, 5fC, and 5caC in TET2 catalyzed oxidation of DNA containing 5mCpG (A) and 5mCpA (B) sites. Please note that 5mC and 5caC were detected under the negative ion mode using a 15 min chromatographic gradient, while 5hmC and 5fC were detected under the positive ion mode using a 22 min gradient.
Further, using this sensitive method, we determined the activity of the TET2 catalytic domain using a 25-mer dsDNA containing one 5mC residue in a CpA mark (Table 1A). To our surprise, similar to the oxidation of 5mC in CpG sites, three new peaks were observed in the LC-MS/MS analysis. These novel peaks in the TET2 enzymatic reactions using 5mCpA as a substrate had identical mass characteristics and elution profiles to the 5hmC, 5fC, and 5caC bases (i.e., 5hmC and 5fC observed in the positive mode, while 5caC was observed in the negative mode) [42] (Figure 1B).

Quantification of Cytosine Derivatives Produced by TET2-Mediated Oxidation of 5mCpH Sites; Effect of +1 Base
The oxidation of 5mC in the CpA site (~50% oxidation product compared to 5mCpG) was surprising, because a previous study reported that human TET2 predominantly oxidized 5mCpG sites (>85% 5mC oxidized), while the oxidation of 5mCpA was negligible (<2% 5mC oxidized) [34]. That observation was rationalized by a specific hydrogen bond between the phosphate group of +1G of the 5mCpG mark and the S1290 residue of TET2 [34]. The crystal structure further demonstrated a base-stacking interaction between the +1G:C base pair of DNA (where the C is present in the reverse strand, opposite the +1G base) and the Y1294 residue of TET2, resulting in specific recognition of the +1G:C base pair in the 5mCpG dinucleotide by TET2.
To validate our initial observation that TET2 can significantly oxidize 5mCpA sites, we tested the activity of the TET2 catalytic domain with three additional sequences with 5mCpA sites (Table 1B). For comparison, we studied TET2 activity with similar substrates (i.e., length and sequences) which included 5mCpG sites instead of 5mCpA sites in our study (Table 1). Repeatedly, we observed that TET2 can significantly oxidize 5mC in CpA (~50%) compared to 5mCpG (Figure 2). Specifically, little difference was observed in the formation of 5hmC and 5fC, while the amount of 5caC, in the case of 5mCpA, was significantly less than in the case of 5mCpG. Intrigued by these results, we further tested the activity of the TET2 catalytic domain with dsDNA sequences containing one 5mC residue in CpC and CpT sites (Table 1C). Indeed, our experiments demonstrated that the TET2 catalytic domain can oxidize 5mC in CpH sites (Figure 3). Our results demonstrated that, after 5mCpG, TET2 prefers 5mCpC over 5mCpA, with substrates with 5mCpT being the least preferred. Preference for the 5mCpC substrate over 5mCpA and 5mCpT could be due to a base-stacking interaction between the +1C:G base pair of the substrate and the Y1294-TET2 residue, as suggested by the crystal structure of the TET2 enzyme [34].

Comparison of total products (5hmC + 5fC + 5caC) formed after TET2 catalyzed oxidation of DNA containing 5mCpG, 5mCpA, 5mCpC, and 5mCpT sites.
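The normalization and triplicate statistics behind such substrate comparisons can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis script: "the area represented by one deoxycytidine residue" is interpreted here as the total dC peak area divided by the number of dC residues in the substrate, and every number below is hypothetical and not a measurement from this study.

```python
from math import sqrt
from statistics import mean, stdev

def normalize(product_area, dc_area, n_dc):
    """Product peak area divided by the area of one dC residue.

    Assumption: area per dC residue = total dC peak area / number of dC
    residues in the substrate (n_dc).
    """
    return product_area / (dc_area / n_dc)

def summarize(values):
    """Return (mean, sample standard deviation, standard error)."""
    sd = stdev(values)
    return mean(values), sd, sd / sqrt(len(values))

# Hypothetical normalized totals (5hmC + 5fC + 5caC) for triplicate assays.
HYPOTHETICAL_TOTALS = {
    "5mCpG": [0.80, 0.90, 1.00],
    "5mCpA": [0.30, 0.36, 0.42],
}

summary = {site: summarize(vals) for site, vals in HYPOTHETICAL_TOTALS.items()}

# Express each substrate relative to the 5mCpG reference.
reference_mean = summary["5mCpG"][0]
relative = {site: stats[0] / reference_mean for site, stats in summary.items()}

for site, (m, sd, se) in summary.items():
    print(f"{site}: mean={m:.3f} SD={sd:.3f} SE={se:.3f} relative={relative[site]:.2f}")
```

Reporting the standard error (SD divided by the square root of the number of replicates) matches the convention stated in the methods for figures and tables.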

TET2 Can Initiate Demethylation of 5mCpH Sites
The TET2 catalytic domain (TET2 1129-1936, ∆1481-1843) was identified as the minimal catalytically active domain present on the C-terminus of the full-length human TET2 enzyme [34]. Hu et al. produced this minimal catalytically active TET2 domain as a GST-tagged protein in bacteria to solve its structure. Since this clone is an engineered TET2 protein and does not represent the native enzyme, we produced the full-length human TET2 enzyme in human embryonic kidney 293 cells. For comparison with the untagged TET2 catalytic domain, which was predominantly used in this study, we performed oxidation assays with the GST-TET2 catalytic domain and full-length human TET2, using 5mCpG and 5mCpA (which is the most common CpH methylation found in the human genome) as substrates. Indeed, both the GST-TET2 catalytic domain and full-length human TET2 oxidized 5mC on CpA sites in DNA and converted it into 5hmC, 5fC, and 5caC (Table 2). Thus, our results confirm that human TET2 dioxygenase can carry out iterative oxidation of 5mC in CpA dinucleotides, similar to the oxidation of 5mCpG sites in DNA.
During the course of this submission, DeNizio et al. demonstrated, consistent with our results, that TET2 can oxidize 5mCpH sites, albeit less efficiently than 5mCpG [40]. Similar to our results, they demonstrated that oxidation of 5mCpH produces 5hmCpH, 5fCpH, and 5caCpH. Interestingly, DeNizio et al. demonstrated that the rate of formation of 5hmC in CpG substrates is higher than in CpH substrates; however, there was no difference in the rate of formation of 5fC and 5caC in substrates with CpG vs. CpH. DeNizio et al. further demonstrated that the order of TET2 substrate preference, in the case of 5mCpH oxidation, is 5mCpH ≥ 5hmCpH > 5fCpH, which is slightly different from the oxidation of 5mCpG (i.e., 5mCpG > 5hmCpG > 5fCpG). A previous study reported that, in addition to excising 5fC and 5caC residues in CpG sites, which are then replaced with unmodified cytosine bases by the TDG/BER pathway [39], TDG can act on 5fC and 5caC residues in CpH sites [36]. DeNizio et al. further confirmed, in agreement with the previously published study by Papin et al. [36], that TDG can excise 5fC and 5caC in CpH contexts [36,40], which may allow the replacement of these modified cytosine bases with cytosine. Further, a recent, elegant study demonstrated an alternate demethylation process in which 5fC and 5caC deformylate and decarboxylate, respectively, to directly reinstall unmodified cytosine bases at previously methylated sites [37,38]. As such, this pathway will not distinguish the +1 base between the CpG and CpH sites. Taken together, our results suggest a novel demethylation of 5mCpH sites through the oxidation of 5mCpH sites by TET2 into 5hmCpH, 5fCpH, and 5caCpH, followed by the replacement of 5fCpH and 5caCpH by either the TDG/BER pathway or direct C-C bond cleavage (Figure 4).

Conclusions
TET2 is one of the most frequently mutated genes in different types of leukemia, including myelodysplastic syndromes (MDS), MDS-myeloproliferative neoplasms (MDS-MPN), and acute myeloid leukemia derived from MDS and MDS-MPN (sAML). Mutations in the TET2 gene have been reported in many diverse solid tumors as well. The TET2 gene codes for 5-methylcytosine dioxygenase, an enzyme that epigenetically regulates 5mC levels in the genome. Although the role of TET2 in the oxidation of 5mC at CpG sites was well documented in the literature, its role in the demethylation of 5mC at CpH sites remained questionable. Using a sensitive LC-MS/MS-based method in this study, we provided multiple lines of evidence that the human TET2 can oxidize 5mCpH sites in DNA. Like oxidation of 5mCpG, the oxidation of 5mC at CpH sites produces 5hmC, 5fC, and 5caC marks in DNA. The preferred order of TET2 substrates is 5mCpG > 5mCpC > 5mCpA >> 5mCpT. Since the TDG/BER pathway and deformylation or decarboxylation of 5fC or 5caC, respectively, can replace 5fCpH and 5caCpH with an unmodified cytosine base in DNA, our results suggest a novel demethylation of 5mCpH sites by TET2 dioxygenase. Since our study, along with the study by DeNizio et al., was performed in vitro, in vivo confirmation of our results will provide more insights regarding the demethylation of 5mCpH sites by TET2. TET2 uses a base flipping mechanism for its interaction with DNA, like many other DNA modifying enzymes [34,44]. The activity of some of these base flipping enzymes depends on flanking bases [45,46]. Since Hu et al. studied non-CpG demethylation using only one DNA sequence, further follow-up investigations based on our study reported here, along with crystallography, will help to improve our understanding of non-CpG demethylation [34].
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/dna1010004/s1, Figure S1: Standard curves of different nucleosides in negative mode, Figure S2: Standard curves of different nucleosides in positive mode, Table S1: The chemical structure of four different modified deoxyribonucleosides, indicating the charge fragmentation position and the mass transition used for MRM detection, Table S2: Mass spectrometric parameters for the most intense MS/MS transitions of the eight nucleosides in negative mode detection, Table S3

Acknowledgments:
The authors would like to thank William Gutheil and Navid Ayon for their help with mass spectrometry.

Conflicts of Interest:
The authors declare that they have no known conflicts of interest.