Integrated Methylome and Transcriptome Analysis between the CMS-D2 Line ZBA and Its Maintainer Line ZB in Upland Cotton

DNA methylation is an important epigenetic modification involved in multiple biological processes. Altered methylation patterns have been reported to be associated with male sterility in some plants, but their role in cotton cytoplasmic male sterility (CMS) remains unclear. Here, integrated methylome and transcriptome analyses were conducted between the CMS-D2 line ZBA and its near-isogenic maintainer line ZB in upland cotton. More methylated cytosine sites (mCs) and higher methylation levels (MLs) were found among the three sequence contexts in ZB compared to ZBA. A total of 4568 differentially methylated regions (DMRs) and 2096 differentially methylated genes (DMGs) were identified. Among the differentially expressed genes (DEGs) associated with DMRs (DMEGs), 396 genes were upregulated and 281 genes were downregulated. A bioinformatics analysis of these DMEGs showed that hyper-DEGs were significantly enriched in the “oxidative phosphorylation” pathway. Further qRT-PCR validation indicated that these hypermethylated genes (encoding the subunits of mitochondrial electron transport chain (ETC) complexes I and V) were all significantly upregulated in ZB. Our biochemical data revealed a higher extent of H2O2 production but a lower level of adenosine triphosphate (ATP) synthesis in CMS-D2 line ZBA. On the basis of the above results, we propose that disrupted DNA methylation in ZBA may disrupt the homeostasis of reactive oxygen species (ROS) production and ATP synthesis in mitochondria, triggering a burst of ROS that is transferred to the nucleus to initiate programmed cell death (PCD) prematurely, ultimately leading to microspore abortion. This study illustrates the important role of DNA methylation in cotton CMS.


Introduction
Cotton (Gossypium hirsutum L.), as a vital resource for plant fiber and oil, is widely cultivated worldwide [1]. However, its low yield is one of the key factors hindering its further development. and the possible role of DNA methylation in cotton CMS (a widely grown fiber crop worldwide) have not been characterized to date.
In this study, a comparative analysis of the integrated methylome and transcriptome was performed in anthers of the CMS-D2 line ZBA and its isonuclear alloplasmic near-isogenic maintainer line ZB using Illumina sequencing technology. A bioinformatics analysis of differentially methylated DEGs suggested that the "oxidative phosphorylation" pathway may play a crucial role in anther development. Combined with our biochemical data, we propose that DNA hypermethylation is involved in regulating the dynamic balance of ROS production and ATP synthesis to maintain the normal development of cotton anthers. In other words, increased ROS may act as a signaling molecule in the mitochondria or be released into the cytoplasm in some way and then prematurely turn on PCD, ultimately leading to CMS. The epigenetic resources here open up a new pathway for elucidating the molecular mechanisms of CMS in cotton.

Methylation Landscapes of ZB and ZBA
The representative anther phenotypes of the CMS line ZBA and its near-isogenic maintainer line ZB and 0.5% 2,3,5-triphenyltetrazolium chloride (TTC) staining of pollen grains are shown in Figure 1. Obviously, the stamen filaments and stigma of ZBA were shorter than those of ZB ( Figure 1a). The anthers of ZBA were more wrinkled and had no pollen grains (Figure 1a,b), whereas the corresponding anthers in ZB were typically dehiscent to release normal pollen grains (Figure 1a,c). line ZB using Illumina sequencing technology. A bioinformatics analysis of differentially methylated DEGs suggested that the "oxidative phosphorylation" pathway may play a crucial role in anther development. Combined with our biochemical data, we propose that DNA hypermethylation is involved in regulating the dynamic balance of ROS production and ATP synthesis to maintain the normal development of cotton anthers. In other words, increased ROS may act as a signaling molecule in the mitochondria or be released into the cytoplasm in some way and then prematurely turn on PCD, ultimately leading to CMS. The epigenetic resources here open up a new pathway for elucidating the molecular mechanisms of CMS in cotton.

Methylation Landscapes of ZB and ZBA
The representative anther phenotypes of the CMS line ZBA and its near-isogenic maintainer line ZB and 0.5% 2,3,5-triphenyltetrazolium chloride (TTC) staining of pollen grains are shown in Figure  1. Obviously, the stamen filaments and stigma of ZBA were shorter than those of ZB ( Figure 1a). The anthers of ZBA were more wrinkled and had no pollen grains (Figure 1a,b), whereas the corresponding anthers in ZB were typically dehiscent to release normal pollen grains (Figure 1a,c). To explore the potential role of DNA methylation dynamics in anther development, we deciphered cytosine methylation at single-base resolution with high confidence across the whole genome of ZB and ZBA anthers using the WGBS technique (Figure 2a). The results of the WGBS are listed in the Supplementary Materials, Table S1. The Q30 percentages exceeded 94% in each sample. More than 300 million 125 bp paired-end raw reads comprising 78.38 Gb and 76.60 Gb were generated for ZB and To explore the potential role of DNA methylation dynamics in anther development, we deciphered cytosine methylation at single-base resolution with high confidence across the whole genome of ZB and ZBA anthers using the WGBS technique ( Figure 2a). The results of the WGBS are listed in the Supplementary Materials , Table S1. The Q30 percentages exceeded 94% in each sample. More than 300 million 125 bp paired-end raw reads comprising 78.38 Gb and 76.60 Gb were generated for ZB and ZBA, respectively, representing >30× of the upland cotton TM-1 reference genome, where the genome size was about 2.5 Gb [49,50] (Supplementary Materials, Table S1). After trimming adapters and filtering low-quality reads, a total of 309,694,292 and 301,003,201 clean reads were obtained for ZB and ZBA, respectively. More than 74% of these clean reads were successfully mapped to the TM-1 reference genome (Supplementary Materials, Table S1), which was slightly higher than the rate in a recent study of cotton [53], indicating the high credibility and accuracy of the methods and results in this study. Next, these mapped data were used to retrieve the methylation level (ML) of each cytosine site in the CG, CHG, and CHH contexts.
There were over 700 million cytosines covered in each sample, a number sufficient for further analysis (Figure 2a). Of these, a total of 232,123,389 potentially methylated cytosine sites (mCs) (23.40% at CG sites; 23.22% at CHG sites; 53.37% at CHH sites; and H representing A, T, or C) and 209,955,566 mCs (24.61% at CG sites, 24.39% at CHG sites, and 51.00% at CHH sites) were identified in ZB and ZBA, respectively ( Figure 2a). Additionally, the percentages of mC, mCG, mCHG, and mCHH in corresponding cytosine contexts across the whole genome of ZB and ZBA anthers were calculated. We found the overall genomic methylation degree of the mCs was significantly higher in ZB (31.59%) than in ZBA (28.58%), and the MLs in CG, CHG, and CHH contexts also presented similar comparative trends ( Figure 2b). We wondered here whether CHH hypermethylation accompanied by CG demethylation in Arabidopsis and rice endosperms [54,55] may also occur during anther development. Further calculation of the percentages of DNA MLs in C, CG, CHG, and CHH contexts throughout the 26 cotton chromosomes revealed that hypermethylation in ZB relative to ZBA was mainly due to increased MLs in the CHH context, whereas CG and CHG methylation generally showed a decreasing trend in each chromosome (Supplementary Materials, Figure S1). Moreover, these 26 chromosomes exhibited methylation patterns similar to the whole genome, and the MLs of CG, CHG, and CHH in each chromosome were significantly different between ZB and ZBA samples, indicating that there were some local changes in the DNA MLs (Supplementary Materials, Figure S1).
To further compare the DNA methylation patterns of the two samples in different genomic regions, we also analyzed the methylation profiles within genes, including promoters, exons, introns, and repetitive sequences. Apparently, the CG context was the highest and the CHH context the lowest ML among the three contexts in each gene region ( Figure 2c). For CG and CHG contexts, the MLs within the promoter, exon, and intron regions of ZB were significantly lower than in ZBA. However, compared to ZBA, the repeat region of ZB showed slightly higher MLs in all three sequence contexts (Figure 2c).

Differential Methylome Analysis between ZB and ZBA
Differences in DNA methylation between ZB and ZBA samples can be quantitatively characterized by differentially methylated cytosines (DMCs) and differentially methylated regions (DMRs). To investigate the difference between the two samples, we compared the DNA methylomes of ZB and ZBA. We identified more than 2.5 million DMCs in ZB relative to ZBA, 73.23% of which were hypermethylated (hyper-DMCs): 64.05% of the hyper-DMCs occurred in the CHH context, signifying an increase mainly in CHH DNA methylation in ZB relative to ZBA (Figure 3a). A circos plot was used to show the difference in overall MLs between the two samples, presenting especially widespread DNA methylation increases in ZB (Figure 3b).

Differential Methylome Analysis between ZB and ZBA
Differences in DNA methylation between ZB and ZBA samples can be quantitatively characterized by differentially methylated cytosines (DMCs) and differentially methylated regions (DMRs). To investigate the difference between the two samples, we compared the DNA methylomes of ZB and ZBA. We identified more than 2.5 million DMCs in ZB relative to ZBA, 73.23% of which were hypermethylated (hyper-DMCs): 64.05% of the hyper-DMCs occurred in the CHH context, signifying an increase mainly in CHH DNA methylation in ZB relative to ZBA (Figure 3a). A circos plot was used to show the difference in overall MLs between the two samples, presenting especially widespread DNA methylation increases in ZB (Figure 3b). To further characterize the change in DNA methylation between ZB and ZBA samples, a method based on Fisher's exact test was used to identify DMRs between two methylomes [56]. A total of 4568 DMRs were identified, and the length of most DMRs was found to be approximately 1000 bp (Supplementary Materials, Figure S2). Remarkably, we found that the overall ML of DMRs in ZB was higher than that of ZBA ( Figure 3c). Moreover, distribution statistics of the functional genomic regions associated with DMRs were also performed between the ZB and ZBA samples. Obviously, the most hypermethylated and hypomethylated DMRs occurred in the promoter region. More details are shown in Figure 3d. Furthermore, the genes located in DMRs, which are called differentially methylated genes (DMGs), were also characterized. In total, 2096 DMGs were identified in ZB versus ZBA, of which 2298 hypermethylated DMRs overlapped with 731 hypermethylated genes and 2270 hypomethylated DMRs contained 1365 hypomethylated genes (Figure 3e).

Correlation Between Altered DNA Methylation Patterns and Differential Gene Expression during Anther Development
To investigate whether the DNA methylation changes during anther development were associated with changes in gene expression, transcriptome profiles were generated with three biological replicates on the same materials used for methylome analysis. The FPKM (fragments per kilobase of exons per million fragments mapped) values of ZB and ZBA were evaluated with Pearson's correlation test, and the average correlation coefficients were 0.889 and 0.941, respectively (Supplementary Materials, Figure S3), which was higher than the correlation coefficient in a previous cotton transcriptome study [57]. In addition, a heatmap analysis of 13 selected cotton reference (housekeeping) genes [58][59][60] with their respective FPKM values was performed using TBtools software [61]. Clearly, the expression values of most housekeeping genes between ZB and ZBA were similar (Supplementary Materials, Figure S4). Overall, these results indicated relatively high correlation between the different replicates in our study, which contributed to the accuracy and reliability of the subsequent quantitative gene expression analysis. Next, we identified differentially expressed genes (DEGs) in ZB versus ZBA. In total, 11,944 upregulated and 8482 downregulated DEGs were identified in ZB relative to ZBA ( Figure 4a and Supplementary Materials, Figure S5). We also performed a comparative analysis of MLs and the densities of C-sites in the CG, CHG, and CHH contexts in different gene regions of the up-or downregulated DEGs, including promoter, exon, and intron regions ( Figure 4b and Supplementary Materials, Figure S6). Both the promoter and intron regions of the upregulated and downregulated DEGs presented significantly reduced MLs in the CG and CHG contexts, whereas only CHH methylation in the promoter region increased slightly in ZB relative to ZBA (Figure 4b).
To further explore the possible association between DNA methylation changes and changes in gene expression, we first examined overlap between the DMGs and DEGs. As is shown in Figure 4c, a total of 677 nonredundant DEGs associated with DMRs (DMEGs) were identified, including 599 that were widely distributed across the 26 chromosomes of upland cotton and 78 DMEGs on 67 different scaffolds. Of these, 262 genes were hypomethylated with upregulated expression levels, and 106 genes were hypermethylated with downregulated expression levels. On the contrary, 142 upregulated and 179 downregulated genes were hypermethylated and hypomethylated in ZB versus ZBA, respectively ( Figure 4d). Subsequently, we comparatively analyzed the differential expression levels of all genes and genes located in hyper-or hypomethylated DMRs, and the results are shown with a boxplot. Surprisingly, both hypermethylated (red box) and hypomethylated (green box) genes exhibited marginally higher expression levels compared with all genes (blue box) ( Figure 4e).

GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) Enrichment Analysis of DMEGs
To understand the potential role of DNA methylation changes in cotton anther development, we first performed a gene ontology (GO) analysis of DMEGs. For the 248 hypermethylated DEGs (hyper-DEGs) in ZB versus ZBA, 202 of these genes were annotated to 1350 functional categories, including 729 biological processes (BPs), 195 cellular components (CCs), and 426 molecular function (MFs) (Supplementary Materials, Table S2). This mainly included BPs such as "ribonucleoside monophosphate metabolic process", "nucleoside metabolic process", and "glycosyl compound metabolic process". Besides, "photosystem I", "mitochondrial envelope", and "membrane protein complex" were the three most significant enriched CC terms. However, MFs related to "ribulose-bisphosphate carboxylase activity", "translation regulator activity", and "pheromone activity" were important functional GO categories that were involved (Supplementary Materials, Figure S7a Table S2). Of these, only the BP categories "organic substance metabolic process" and "primary metabolic process", along with "metabolic process", had the highest enrichment ratios (Supplementary Materials, Figure S7b).
To gain insight into the biological functions of DMEGs in ZB versus ZBA, a KEGG pathway enrichment analysis was also carried out. Among the hyper-DEGs in ZB versus ZBA, "photosynthesis" was the most enriched pathway, followed by "oxidative phosphorylation", "metabolic pathways", and "glyoxylate and dicarboxylate metabolism". The first two pathways were most significantly enriched in ZB versus ZBA, with a corrected P-value < 0.05 (Figure 5a and Supplementary Materials, Table  S3). Correspondingly, the hypo-DEGs in ZB versus ZBA were mainly assigned in pathways related to "circadian rhythm plant", "carotenoid biosynthesis", "biosynthesis of secondary metabolites", and "galactose metabolism" (Figure 5b and Supplementary Materials, Table S3).

GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) Enrichment Analysis of DMEGs
To understand the potential role of DNA methylation changes in cotton anther development, we first performed a gene ontology (GO) analysis of DMEGs. For the 248 hypermethylated DEGs (hyper-DEGs) in ZB versus ZBA, 202 of these genes were annotated to 1350 functional categories, including 729 biological processes (BPs), 195 cellular components (CCs), and 426 molecular function (MFs) (Supplementary Materials, Table S2). This mainly included BPs such as "ribonucleoside monophosphate metabolic process", "nucleoside metabolic process", and "glycosyl compound metabolic process". Besides, "photosystem I", "mitochondrial envelope", and "membrane protein complex" were the three most significant enriched CC terms. However, MFs related to "ribulosebisphosphate carboxylase activity", "translation regulator activity", and "pheromone activity" were important functional GO categories that were involved (Supplementary Materials, Figure S7a). Here, 351 of the 441 hypomethylated DEGs (hypo-DEGs) were annotated to 1923 functional categories, including 1113 BPs, 271 CCs, and 539 MFs (Supplementary Materials, Table S2). Of these, only the BP categories "organic substance metabolic process" and "primary metabolic process", along with "metabolic process", had the highest enrichment ratios (Supplementary Materials, Figure S7b).
To gain insight into the biological functions of DMEGs in ZB versus ZBA, a KEGG pathway enrichment analysis was also carried out. Among the hyper-DEGs in ZB versus ZBA, "photosynthesis" was the most enriched pathway, followed by "oxidative phosphorylation", "metabolic pathways", and "glyoxylate and dicarboxylate metabolism". The first two pathways were most significantly enriched in ZB versus ZBA, with a corrected P-value < 0.05 (Figure 5a and Supplementary Materials, Table S3). Correspondingly, the hypo-DEGs in ZB versus ZBA were mainly assigned in pathways related to "circadian rhythm plant", "carotenoid biosynthesis", "biosynthesis of secondary metabolites", and "galactose metabolism" (Figure 5b and Supplementary Materials, Table S3).

DNA Hypermethylation Maintains Oxidative Phosphorylation Homeostasis to Ensure the Normal Development of Cotton Anthers
Subsequently, hyper-DEGs involved in the "mitochondrial envelope" term in ZB versus ZBA were chosen for further analysis. A total of 15 hyper-DEGs were enriched in this GO term, of which nine were downregulated and six were upregulated in ZBA relative to ZB (Supplementary Materials, Figure S8). It is worth noting that the four genes with the largest fold changes (more than two times)

DNA Hypermethylation Maintains Oxidative Phosphorylation Homeostasis to Ensure the Normal Development of Cotton Anthers
Subsequently, hyper-DEGs involved in the "mitochondrial envelope" term in ZB versus ZBA were chosen for further analysis. A total of 15 hyper-DEGs were enriched in this GO term, of which nine were downregulated and six were upregulated in ZBA relative to ZB (Supplementary Materials, Figure S8). It is worth noting that the four genes with the largest fold changes (more than two times) were all downregulated in the CMS-D2 line ZBA, including Gh_A06G0349, Gh_A05G3831, Gh_A01G0843, and Gh_A11G3033, indicating that DNA hypermethylation may play a crucial role in maintaining the integrity of the mitochondrial envelope (Supplementary Materials, Figure S8).
Previous studies have demonstrated that ROS-dependent cellular metabolism and physiological processes play a crucial role in anther development [16,62,63], and unbalanced ROS metabolism causes male sterility [9]. Hence, we focused on DMEGs involved in the "oxidative phosphorylation" signaling pathway in ZB versus ZBA (Figure 4c and Supplementary Materials, Table S3). A total of 13 DMEGs were enriched in this pathway, and of these, 10 were found to be hypermethylated and three to be hypomethylated in ZB (Figure 6a and Supplementary Materials, Table S4).

Overview of DNA Methylome in Cotton CMS Line Decoded by WGBS
DNA methylation has been recognized as a new regulator of plant growth, development, and stress response [34,[64][65][66][67][68]. Some researchers have reported that dynamic changes in DNA methylation are involved in the regulation of photoperiod-and/or thermosensitive GMS in rice [69][70][71] and cotton [35,40]; GMS in wheat [38], cabbage [41], and tomato [42]; and CMS in rice [37] and maize [39]. However, the genome-wide cytosine methylation profile and the possible role of DNA methylation in cotton CMS have not been reported so far.
As a predominantly powerful technology for DNA methylation research, whole-genome bisulfite sequencing (WGBS) can enable the determination of methylation patterns at single-base resolution [45]. Recently, WGBS has been applied to decrypt an increasing number of plant methylomes, ranging from model plants such as Arabidopsis [27,44] and rice [68,72,73] to economically important crops such as maize [74][75][76], cotton [35,53,77], soybean [78,79], tomato [65], orange [66], and caster bean [80]. In this study, the global DNA methylation pattern was profiled in anthers of the CMS-D2 line ZBA and its near-isogenic maintainer line ZB using WGBS. This is the first single-baseresolution DNA methylome in the cotton CMS line that has been deciphered and used to study the epigenetic regulation mechanism of male sterility. The genome-wide DNA methylation pattern in anthers was found to be similar but slightly different from a recent methylome analysis in cotton leaves under different treatments [53]. Specifically, we observed higher levels of both CG (65.3- presented as the means ± standard deviation (SD). Vertical bars represent SD of the mean of at least three biological replicates. Asterisks indicate statistically significant differences between ZB and ZBA (** P < 0.01; *** P < 0.001, Student's t-test).
To investigate the effects of DNA methylation changes on gene expression, we examined the transcript levels of these 13 genes in ZB and ZBA using quantitative real-time polymerase chain reaction (qRT-PCR). Interestingly, all 10 genes associated with DNA hypermethylation in different genomic regions were upregulated in ZB. For the three hypomethylated genes, one promoter hypomethylated gene (Gh_A01G0418/GhNDUA9) was downregulated in ZB, whereas the other two gene body hypomethylated genes were upregulated (Gh_A10G0789/GhNDHK) or indistinguishable (Gh_D13G2512/GhVATL) (Figure 6b and Supplementary Materials, Table S4). In addition, these 13 DMEGs were also used for alignment with the protein-coding genes of the mitochondrial genome [25]. Interestingly, two possible mitochondrial targeted protein-coding genes (Gh_Sca006566G05/GhNDUS2 and Gh_D04G1994/GhATPAM) were downregulated in ZBA relative to ZB (Figure 6b). These hyper-DEGs may have been affected by the CMS-D2 cytoplasm. Our results here suggest that DNA methylation may have a positive regulatory role in the expression of many genes involved in the "oxidative phosphorylation" pathway in anthers. To further confirm this finding, both the H 2 O 2 and ATP contents in the anthers of ZB and ZBA plants were determined. As expected, the H 2 O 2 content in ZBA anthers was significantly higher than in ZB (Figure 6c), but the relative levels of ATP in the anthers of ZB and ZBA plants presented an opposite trend (Figure 6d). Therefore, we conclude that DNA hypermethylation may play an important role in maintaining the dynamic balance of ROS production and ATP synthesis during anther development.

Overview of DNA Methylome in Cotton CMS Line Decoded by WGBS
DNA methylation has been recognized as a new regulator of plant growth, development, and stress response [34,[64][65][66][67][68]. Some researchers have reported that dynamic changes in DNA methylation are involved in the regulation of photoperiod-and/or thermosensitive GMS in rice [69][70][71] and cotton [35,40]; GMS in wheat [38], cabbage [41], and tomato [42]; and CMS in rice [37] and maize [39]. However, the genome-wide cytosine methylation profile and the possible role of DNA methylation in cotton CMS have not been reported so far.
As a predominantly powerful technology for DNA methylation research, whole-genome bisulfite sequencing (WGBS) can enable the determination of methylation patterns at single-base resolution [45]. Recently, WGBS has been applied to decrypt an increasing number of plant methylomes, ranging from model plants such as Arabidopsis [27,44] and rice [68,72,73] to economically important crops such as maize [74][75][76], cotton [35,53,77], soybean [78,79], tomato [65], orange [66], and caster bean [80]. In this study, the global DNA methylation pattern was profiled in anthers of the CMS-D2 line ZBA and its near-isogenic maintainer line ZB using WGBS. This is the first single-base-resolution DNA methylome in the cotton CMS line that has been deciphered and used to study the epigenetic regulation mechanism of male sterility. The genome-wide DNA methylation pattern in anthers was found to be similar but slightly different from a recent methylome analysis in cotton leaves under different treatments [53]. Specifically, we observed higher levels of both CG (65.3-68.7%) and CHG (58.7-61.8%) methylation in cotton anthers (Figure 2b) relative to leaves (58.4-61.7% for mCG and 53.8-56.8% for mCHG), whereas the MLs in the CHH context were similar in anthers (18.8-21.8%) and leaves (19.5-24.2%) [53]. These differences may have been due to the different cotton germplasms and tissues used in these two studies.

Widespread Dynamic DNA Methylation and Its Association with Differential Gene Expression during Anther Development
More mCs and higher MLs were observed among the three sequence contexts in ZB compared to ZBA (Figure 2a,b). The DNA methylation increase in ZB relative to ZBA could have been due to an increased number of total mCs or increased MLs of existing mCs. Further analyses of DMCs showed that the increase in DNA methylation was mainly in the CHH context (Figure 3a). All of the above analyses indicated that CHH was significantly hypermethylated in ZB relative to ZBA. In the endosperm of the model plants of Arabidopsis and rice and also during tomato development, CG hypomethylation is often accompanied by local CHH hypermethylation [54,55,65]. However, CHH hypermethylation during sweet orange fruit development and ripening is not accompanied by CG and CHG demethylation [66]. Our results found that CHH hypermethylation at the chromosome level may be accompanied by CG and CHG demethylation during anther development (Supplementary Materials, Figure S1). These results indicate that the dynamic regulation of DNA methylation is critical for normal anther development, even if DNA methylation changes in opposite directions in different plants.
There were no significant differences in the number of hypermethylated (2298) and hypomethylated (2270) regions identified between ZB and ZBA (Figure 3e and Supplementary Materials, Figure S2a), which was similar to the DNA methylation pattern reported in cabbage GMS [41]. Consistently with previous studies in photoperiod-and thermosensitive male sterile rice [71] and cabbage GMS [41], more DMR-related genes in ZBA were found to be hypermethylated in most gene regions (Figure 3d,e). A comparative analysis of integrated methylomes and transcriptomes can reveal the relationship between DNA methylation dynamics and gene expression [81]. DNA methylation is generally considered to be a marker of transcriptional repression. However, fruit ripening-induced hypermethylation in citrus is highly correlated with gene activation [66]. In our study, some genes affected by DNA methylation were differentially expressed between ZB and ZBA, of which 396 genes were upregulated and 281 genes were downregulated (Figure 4c,d). This finding suggests an important role of DNA methylation not only in suppressing gene expression but also in the activation of some genes during anther development.

Possible Regulatory Role of DNA Methylation in Cotton CMS
The plant mitochondrial ETC, which contains a series of redox-active electron carriers, is considered to be a major site for intracellular ROS production and ATP synthesis [82]. Superoxide (O 2 •− ), as a proximal ROS, can be generated in significant quantities by reverse-electron transport at the mitochondrial matrix side of complex I (NADH dehydrogenase) [83] when the mitochondria are not making ATP and consequently have a high ∆p (protonmotive force) and a reduced coenzyme Q pool or a high NADH/NAD + ratio in the mitochondrial matrix [83,84]. Conversely, for mitochondria that are actively making ATP and consequently have a lower ∆p and NADH/NAD + ratio, the extent of O 2 •− production is far lower [84]. Recent research has shown that the expression of genes involved in redox homeostasis and energy metabolism is significantly modulated by DNA methylation in cotton anthers under high-temperature (HT) stress [35,40]. Our KEGG analysis results also found that hyper-DEGs in ZB versus ZBA were significantly enriched in the "oxidative phosphorylation" pathway ( Figure 5a and Supplementary Materials, Table S3). Interestingly, these hypermethylated genes (encoding the subunits of mitochondrial ETC complexes I and V) were all significantly upregulated in ZB (Figure 6a,b). These results indicate that DNA hypermethylation that mediates oxidative phosphorylation homeostasis may play a crucial role in anther development.
In the rice CMS-WA line, WA352 encoded by the mitochondrial CMS gene accumulates preferentially in the anther tapetum, thereby inhibiting nuclear-encoded mitochondrial protein COX11 (CYTOCHROME C OXIDASE 11) functions in peroxide metabolism, leading to ROS bursts and cytochrome c release, which causes premature tapetal PCD and consequent pollen abortion [16]. In our study, among the 13 DMEGs involved in the "oxidative phosphorylation" pathway, two possible mitochondrial targeted protein-coding genes (Gh_Sca006566G05/GhNDUS2 and Gh_D04G1994/GhATPAM) were downregulated in ZBA relative to ZB (Figure 6b). This result suggests that these two hyper-DEGs may be involved in the regulation of the CMS pathway in cotton.
ROS homeostasis is essential for normal anther and pollen development [40,62,85], whereas excessive accumulation of ROS in anthers leads to cell apoptosis and male sterility [9,35,[86][87][88]. In this study, a lower extent of H 2 O 2 production but a higher level of ATP synthesis was found in maintainer line ZB compared to ZBA (Figure 6c,d). This demonstrates that a ROS burst may occur during anther development in the CMS-D2 line ZBA. This finding is consistent with other cotton sterile lines, such as Zhong41A (CMS-D8) [9] and C2P5A [24]. A decrease in ATP but excessive ROS accumulation was found in a transformant expressing the rice CMS-associated gene orfH79 [89]. On the basis of previously published results and our findings, we speculate that disrupted DNA methylation in ZBA may disturb the homeostasis of ROS production and ATP synthesis in mitochondria by inhibiting the expression of genes in the "oxidative phosphorylation" pathway, resulting in a burst of ROS that prematurely switches on PCD, eventually leading to microspore abortion. However, the exact molecular mechanisms by which DNA methylation regulates the expression of specific genes involved in the CMS signaling pathway during anther development still require further investigation.

Plant Materials
The cotton CMS line ZBA (with a Gossypium harknessii cytoplasm (denoted S)) was developed through consecutive backcross procedures with the maintainer ZB containing a normal fertile upland cotton (AD1) cytoplasm (denoted N) as the recurrent male parent [8,90]. The genotypes of ZBA and ZB were designated as S (rf1rf1) and N (rf1rf1), respectively. Both ZBA and ZB had a similar nucleus genetic background, but with different cytoplasms, so they were a pair of near-isogenic lines of isonuclear alloplasmic type. All materials were developed and seeds were preserved in the Cotton Heterosis Utilization Laboratory (our research group), the Institute of Cotton Research of the Chinese Academy of Agricultural Sciences (ICR-CAAS). ZBA and ZB were planted in the summer of 2015 at Baibi East Experimental Farm, ICR-CAAS, Anyang, Henan Province, China (36 • 10 N, 114 • 35'E). Crop management practices followed local recommendations.
Our previous cytological observations showed that the male abortion of ZBA occurred roughly at the meiosis stage [91], corresponding with the growth of flower buds to a length of about 3 mm [92]. During the squaring stage, 3-mm-long buds were collected centrally and pooled from 50 representative plants of ZBA and ZB. The anthers were excised and then snap-frozen in liquid nitrogen and stored at −80 • C in a freezer for further use.

DNA and RNA Extraction, Quantification, and Qualification
Total genomic DNA was extracted from anthers according to a modified CTAB (cetyltriethylammnonium bromide) DNA extraction protocol [93]. Total RNA was extracted from the anthers of each sample using a Spectrum™ Plant Total RNA Kit (STRN50, Sigma-Aldrich, Saint Louis, MO, USA) following the manufacturer's instructions. DNA and RNA degradation and contamination were monitored on 1% (w/v) agarose gels. DNA and RNA purity was checked using a NanoPhotometer ® spectrophotometer (IMPLEN, Westlake Village, CA, USA). The DNA and RNA concentration was measured using a Qubit ® DNA and RNA Assay Kit in a Qubit ® 2.0 Fluorometer (Life Technologies, Carlsbad, CA, USA). RNA integrity was assessed using an RNA Nano 6000 Assay Kit with a Bioanalyzer 2100 system (Agilent Technologies, Palo Alto, CA, USA).

Bisulfite Treatment and Library Construction
About 5.2 µg of genomic DNA per sample spiked with 26 ng lambda DNA was fragmented by sonication to 200-300 bp with a Covaris S220 sonicator and then end-repaired and adenylated. Unmethylated lambda DNA was used as a control for evaluating the bisulfite conversion rate. Subsequently, cytosine-methylated barcodes were ligated to the sonicated DNA in accordance with the manufacturer's recommendations. These DNA fragments were treated twice with bisulfite using an EZ DNA Methylation-Gold™ Kit (Zymo Research, Irvine, CA, USA), and the resulting single-strand DNA fragments were PCR-amplified using KAPA HiFi HotStart Uracil + ReadyMix (2×). Next, the library concentration was quantified with a Qubit ® 2.0 Fluorometer (Life Technologies) and quantitative PCR, and the insert size was assayed on an Agilent Bioanalyzer 2100 system. Clustering of the index-coded samples was performed on a cBot Cluster Generation System using a TruSeq PE Cluster Kit v3-cBot-HS (Illumina, San Diego, CA, USA) according to the manufacturer's instructions. Finally, WGBS was conducted with approximately 30× sequencing depth per sample on the Illumina Hiseq 2500 platform (Novogene, Beijing, China), and 125 bp paired-end reads were generated. Image analysis and base calling were performed with an Illumina CASAVA pipeline to generate the final 125 bp paired-end reads. The WGBS raw reads were deposited with the National Center for Biotechnology Information (NCBI) under Sequence Read Archive (SRA) accession numbers SRR10314682 (ZB) and SRR10314680 (ZBA).

DNA Methylation Data Analysis
We first used FastQC to perform basic statistical analyses on the quality of the raw reads. The read sequences produced by the Illumina pipeline in FASTQ format were filtered using Trimmomatic software (version 0.35) [94] to generate high-quality clean reads. Then, the bisulfite-treated clean reads of ZB and ZBA were mapped to the TM-1 genome [50] using Bismark software (version 0.16.1) [95] with the default parameters. The paired-end reads that were uniquely aligned with the genome were reserved for further methylation analysis. The potentially methylated cytosine sites (mCs) were extracted using a "methylation extractor" and transformed to bigWig format for visualization using an IGV browser. To identify true mCs, we modeled the sum s + i,j of the methylated counts as a classic binomial (Bin) random variable with the methylation rate r i,j : s + i,j~B in (s + i,j + s − i,j , r i,j ). To calculate the methylation level (ML), the whole genome was divided into 10 kb bins with no overlap, and the sum of the methylated and unmethylated read counts in each window was calculated. The ML for each C site shows the fraction of mCs and is defined as ML (C) = reads (mC)/[reads (mC) + reads (C)]. Differentially methylated cytosines (DMCs) were identified using Fisher's exact test with a false discovery rate (FDR) multiple test correction, and DMCs with a P-value < 0.05 and a difference of ML >0.2 between two samples were considered candidate DMCs. Differentially methylated regions (DMRs) were identified using swDMR software (version 1.0.6, http://122.228.158.106/swDMR/), which uses a sliding-window approach with a window size of 1000 bp and a step length of 100 bp. Fisher's exact test with FDR multiple test correction was used to detect DMRs, in which regions with a corrected P-value < 0.05 and a difference of ML >0.1 were considered candidate DMRs. Subsequently, genes located in DMRs, called differentially methylated genes (DMGs), were also characterized.
Adapters and low-quality reads from the raw data were trimmed using Trimmomatic software (version 0.35) [94]. All remaining clean reads were mapped to the cotton reference genome using TopHat (version 2.0.9) [96]. Cuffdiff (version 2.1.1) was used to calculate the FPKM value of each gene and determine the differential expression using a model based on the negative binomial distribution [97]. Transcripts with an adjusted P-value < 0.05 were considered to be differentially expressed genes (DEGs) [98]. The GOseq R package [99] was used for GO enrichment analysis, and KOBAS software [100] was used to test the statistical enrichment of the DEGs or DMEGs in KEGG pathways. GO terms or KEGG pathways with a corrected P-value < 0.05 were considered significantly enriched.

Quantitative Real-Time Polymerase Chain Reaction (qRT-PCR) Analysis
For qRT-PCR, a total of 1 µg total RNA was first used for first-strand complementary DNA (cDNA) synthesis using a PrimeScript™ RT reagent Kit for Perfect Real Time (RR037A, Takara, Japan) according to the manufacturer's instructions. Then, the qRT-PCR analysis was performed using TransStart ® Top Green qPCR SuperMix (AQ131, TransGen Biotech, Beijing, China) on a Mastercycler ep realplex instrument (Eppendorf, Hamburg, Germany). Each reaction contained 2 µL cDNA template, 0.4 µL of each primer (10 µM), 2× TransStart ® Top Green qPCR SuperMix, and 0.4 µL Passive Reference Dye (50×) with ddH 2 O to make the final volume 20 µL. The reaction conditions were pre-denaturation at 94 • C for 30 s, followed by 42 cycles of denaturation at 94 • C for 5 s, annealing at 58 • C for 15 s, and extension at 72 • C for 20 s. A melting curve was also generated for each sample at the end of each run to determine the specificity of the amplified PCR product. Each gene in each sample was analyzed with three replicates and two technical replicates. The upland cotton Histone3 (GhHIS3) gene was used as an internal control with the primers GhHIS3-qRT-F (10 µM) and GhHIS3-qRT-R (10 µM), and the relative expression level of genes was calculated using the 2 −∆∆Ct method, as described in our previous study [90]. The results were obtained from three biologically independent experiments to ensure the reliability of the assay data. Gene-specific primers for qRT-PCR were designed using Oligo 7 Primer Analysis Software [101], were synthesized commercially (BioSune Biotechnology, Shanghai, China), and are listed in the Supplementary Materials, Table S5.

ROS and ATP Quantification
Anthers with a bud length of approximately 3 mm from both CMS line ZBA and maintainer line ZB plants were collected in 2 mL tubes for a determination of the ROS and ATP contents, with at least five biological replicates and two technical replicates for each sample. The amounts of ROS were measured using an H 2 O 2 Detection Kit (Beyotime, Shanghai, China) and are expressed as µM/g fresh anther. ATP content was measured using the luciferin-luciferase method [102] following the protocol of the ATP Detection Kit (Beyotime, Shanghai, China).

Statistical Analyses
Each graphical plot in this study represents the results of multiple independent experiments (n ≥ 3), and the values are expressed as the mean ± standard deviation (SD). Statistical significance analyses of gene expression levels were performed using two-tailed unpaired Student's t-tests, and a P-value < 0.05 was considered to be statistically significant.

Conclusions
Through integrated methylome and transcriptome analysis, a total of 4568 DMRs and 2096 DMGs were identified in anthers between the CMS-D2 line ZBA and its near-isogenic maintainer line ZB. Among the DMEGs, 396 genes were upregulated and 281 genes were downregulated. The hyper-DEGs were significantly enriched in the "oxidative phosphorylation" pathway, and these hypermethylated genes (encoding the subunits of mitochondrial ETC complexes I and V) were all significantly upregulated in ZB. Combined with our biochemical data, we propose that DNA hypermethylation is involved in regulating the homeostasis of ROS production and ATP synthesis to maintain the normal development of cotton anthers. Our results help to better understand the possible role of DNA methylation in cotton CMS and will accelerate the study of the molecular mechanisms of CMS in cotton.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/23/ 6070/s1. Figure S1: Levels of C, CG, CHG, and CHH methylation in 26 chromosomes. Figure S2: Volcano map of identified differentially methylated regions (DMRs) (a) and their length distribution (b) in ZB versus ZBA. Figure S3: Evaluation of RNA sequencing data quality. Figure S4: Heatmap of 13 selected cotton reference (housekeeping) genes with their respective FPKM values in ZB and ZBA. Figure S5: Numbers of differentially expressed genes (DEGs) in ZB versus ZBA. Figure S6: Comparative analysis of methylation densities of C-sites in different gene regions of the up-or downregulated DEGs, including promoter, exon, and intron regions. Figure S7: Gene ontology (GO) enrichment analysis of hyper-(a) or hypomethylated (b) DEGs in ZB versus ZBA. Figure S8: Heatmaps showing the MLs (a) and the expression levels (i.e., FPKM values) (b) of hyper-DEGs involved in the "mitochondrial envelope" term in ZB versus ZBA. Table S1: Output data description of WGBS for ZB and ZBA.  Table S4: Information on 13 DMEGs in the oxidative phosphorylation signaling pathway involved in anther development. Table S5: Primers used for qRT-PCR in this study.
Author Contributions: C.X. and J.W. conceived of and designed the experiments; L.G., T.Q., H.T., H.W., and X.Q. constructed the population materials and investigated the fertility of cotton plants; M.Z. performed the data analysis and most of the experiments; X.Z., B.Z., J.F., Z.Z., and T.L. provided bioinformatics analysis assistance to M.Z.; M.Z., C.X., J.W., and K.S. contributed to the preparation of the manuscript. All authors have read and approved the final manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (31621005).

Acknowledgments:
The authors are grateful to Yongfeng Zhang (ICR-CAAS, China) for helping to investigate the fertility of cotton plants, Guoyuan Liu (ICR-CAAS, China) for assistance with the bioinformatics analysis, and Lihua Ma (ICR-CAAS, China) for assistance with the microscopy.

Conflicts of Interest:
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.