A Methylation-Based Reclassification of Bladder Cancer Based on Immune Cell Genes

Simple Summary Bladder cancer (BC) development is highly related to immune cell infiltration. In this study, we aimed to construct a new classification of bladder cancer molecular subtypes based on immune-cell-associated CpG(Methylation) sites. The classification was accurate and stable. BC patients were successfully divided into three subtypes based on the immune-cell-associated CpG sites. The clinicopathologic features, distribution of immune cells, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, tumor purity, APC co_inhibition, APC co_stimulation, HLA, MHC class_I, Type I IFN_respons, Type II IFN response, and DNA stemness score (DNAss) presented significant differences among the three subgroups. The specific genomic alteration was also different across subgroups. High-level immune infiltration showed a correlation with high-level methylation. A lower RNA stemness score (RNAss) was associated with higher immune infiltration. Cluster 2 demonstrated a better response to chemotherapy. The anti-cancer targeted drug therapy results are different among the three subgroups. Abstract Background: Bladder cancer is highly related to immune cell infiltration. This study aimed to develop a new classification of BC molecular subtypes based on immune-cell-associated CpG sites. Methods: The genes of 28 types of immune cells were obtained from previous studies. Then, methylation sites corresponding to immune-cell-associated genes were acquired. Differentially methylated sites (DMSs) were identified between normal samples and bladder cancer samples. Unsupervised clustering analysis of differentially methylated sites was performed to divide the sites into several subtypes. Then, the potential mechanism of different subtypes was explored. Results: Bladder cancer patients were divided into three groups. The cluster 3 subtype had the best prognosis. Cluster 1 had the poorest prognosis. The distribution of immune cells, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, tumor purity, APC co_inhibition, APC co_stimulation, HLA, MHC class_I, Type I IFN Response, Type II IFN Response, and DNAss presented significant differences among the three subgroups. The distribution of genomic alterations was also different. Conclusions: The proposed classification was accurate and stable. BC patients could be divided into three subtypes based on the immune-cell-associated CpG sites. Specific biological signaling pathways, immune mechanisms, and genomic alterations were varied among the three subgroups. High-level immune infiltration was correlated with high-level methylation. The lower RNAss was associated with higher immune infiltration. The study of the intratumoral immune microenvironment may provide a new perspective for BC therapy.


Introduction
Recently, diverse immunotherapy methods have been proven to successfully treat numerous lethal cancers [1]. These included cytokine treatment, cellular therapy, immune checkpoint blockades, and therapeutic vaccines [2]. Immune checkpoint inhibitors showed remarkable anti-tumor functions in several human cancers, including programmed cell death protein 1 (PD-1), cytotoxic T lymphocyte antigen-4 (CTLA-4), and PD-1 ligand (PD-L1) antibodies [3][4][5]. The FDA has approved two cytokines as anti-tumor agents against kidney cancer and metastatic melanoma [6]. Preventive and therapeutic anticancer vaccines have a significant anti-tumor function in several cancers, such as hepatitis B virus vaccine [7], Sipuleucel-T, human papillomavirus vaccine [8], and GVAX vaccine [9,10]. There is a remarkable heterogeneity in the response rates to treatment across individuals, yet not all immunotherapy is successful in treating patients [2].
A previous study also revealed the interaction of various immune cells and signaling pathways between the tumor and immune cells [11]. There are several types of immunotherapy strategies to treat bladder cancer (BC); for instance, treating high-risk non-muscle invasive bladder cancer (NMIBC) with intravesical administration of the Bacillus Calmette-Guerin (BCG) [12]. BCG is a standard practical therapy in NMIBC [13], but, unfortunately, 25-45% of patients with high-risk papillary tumors or carcinoma in situ did not benefit from BCG therapy [14]. Several immune checkpoint inhibitors have been utilized to treat BC. Among them, atezolizumab, avelumab, durvalumab, nivolumabis, and pembrolizumab were recommended for patients with advanced or metastatic tumors [15,16]. In addition, the inhibition of CTLA-4, including ipilimumab and tremelimumab, was suggested to increase the immune response of BC [17]. Finally, intravesical interleukin 12 (IL-12) activates the immune system and weakens the status of immunosuppression in tumor cells [18]. However, not all patients have the same response to the above kinds of therapy [17]. Similarly, research reported that only approximately 20% of patients can benefit significantly from immunotherapy [19,20]. Therefore, accurate classification methods must be developed to help to enhance the optimal scheme of BC patients' responses to immunotherapy.
Tumor infiltrating lymphocytes (TIL) have an important role in the chemotherapy response and in enhancing the clinical effect in all subtypes of breast cancer [21]. Furthermore, two previous studies proved that high immune cell infiltration was associated with a favorable prognosis after chemotherapy [22,23]. Besides, two previous studies reported that the basal subtype of muscle invasive bladder cancer with immune infiltration had a sensitivity to chemotherapy [24,25].
The treatment response and prognosis of patients were predicted by immune cells with the current molecular stratification of BC patients [26]. In contrast to that study, our study had some differences as follows. The first difference is that we analyzed based on multi-omics, including DNA methylation, RNA, DNA mutations, and copy number variations. The second difference is that we divided the bladder cancers into three subtypes based on methylation. The third difference is that they did not show any clinical implications regarding their classification. In our study, BC was divided into three distinct subtypes based on immune cell-related methylation site profiles. The three methylation site subtypes are associated with different molecular features, cellular properties, and clinical outcomes. In fact, the classification of immune-related methylation site subtypes may help to enhance the optimal scheme of BC patients that are responsive to immunotherapy.

Three Subgroups Based on Differentially Methylated Sites (DMSs)
Seven hundred and eighty-two immune cell biomarker-associated genes were selected from previous studies [27], and 8703 corresponding immune cell biomarker-associated methylation sites were acquired. The parameter of infiltration was an adjusted p-value < 0.05 and |deltabeta| > 0.2. Seven hundred and fifteen DMSs were identified between normal samples and tumor samples ( Figure 1A). Probes revealing the p-value < 0.05 and deltabeta > 0.2 were defined as hypermethylated, and those with a p-value < 0.05 and deltabeta < 0.2 were defined as hypomethylated. A total of 553 DNA hypomethylation sites and 162 hypermethylation sites were obtained.

Classification of Methylation Subtypes of BC
The consensus clustering of 715 DMSs was classified into three subtypes ( Figure 1C). Cluster 1 showed mid-range-methylation, cluster 2 showed the highest methylation level, whereas cluster 3 had the lowest methylation. Principal component analysis (PCA) was utilized to check the stability of the consensus classification ( Figure 1D).
The overall survival (OS) curve of BC subsets was obtained using the Kaplan-Meier method ( Figure 2A). Cluster 1 had the poorest prognosis. Cluster 3 had the best prognosis. Next, we carried outonducted a log-rank test between each pair of subtypes and found a significant difference only between clusters 1 and 3 (p-value was 0.009). However, several studies reported that it was unnecessary that there was a significant difference between each pair of clusters [25,[28][29][30]. Furthermore, the sub-sequence analysis showed that the bio-mechanism had significance, and the clinical significance was different among the three subgroups. Therefore, we divided the BC into three clusters.
In our study, a barplot demonstrates the relationship between the clinical traits and the biological characteristics of the subtypes ( Figure 2B-H). Excluding age, the other clinicopathologic features had significant differences among the three subgroups. Our results show that Cluster 3 had more stage I, more low-grade types, and less T3.

Identifying Different Methylation Levels and Distinct Gene Expression Levels of the Different Subgroups
We compared the DNA methylation level and immune gene levels among the three subgroups using the chi-square test with p-value < 0.05. Figure 3A shows the differentially immune cell biomarker-associated methylation levels. Cluster 1 revealed a mid-range-methylation level, cluster 2 revealed the highest methylation level, and cluster 3 revealed the lowest methylation level. The results in Figure 3A are consistent with Figure 1C. The DNA methylation levels provided significant differences among the three subgroups. We also found significantly different immune gene levels across the three subgroups ( Figure 3B). The methylation expression in one subgroup was compared with the rest of the subgroups using the Wilcox test. Consequently, we found 14, 540, and 136 methylation sites with higher expression levels in cluster 1, cluster 2, and cluster 3, respectively. Fourteen methylation sites overlapped between subtypes cluster 1 and cluster 2. However, no overlapped methylation sites were identified in the remaining cluster-pairs.

Immune in Different Subgroups
In Figure 4A, cluster 1 had mid-range immune infiltration. The correlation with high immune infiltration is shown in cluster 2, unlike cluster 3, which had low immune infiltration. The immune infiltration was compared among the three subtypes, and there were remarkable differences among these subtypes ( Figure 4B). Evidently, the immune checkpoints demonstrated significant differences among these subtypes ( Figure 4E).

Tumor Microenvironment (TME)
The tumor microenvironment contains stromal cells, tumor cells, and immune cells. The higher the stromal score and immune score, the lower the purity of the tumor. As shown in Figure 4F-I, cluster 2 presented the highest stromal score, immune score, ESTIMATEScore, and the lowest purity of tumor. On the contrary, cluster 3 had the lowest stromal score, immune score, ESTIMATEScore, and the highest tumor purity.

Single Sample Gene Set Enrichment Analysis (ssGSEA)
The biomarkers of APC co_inhibition, APC co_stimulation, endothelial cells, fibroblasts, HLA, inflammation-promoting, MHC class_I, Type I IFN_Response, and Type II IFN Response were significantly different among these subtypes ( Figure 5).

Comparing with the Other Classification
As shown in Figure 6, cluster 3 had the highest luminal marker expression and the lowest squamous and neuronal differentiation marker expression. Cluster 2 had the highest basal and EMT Claudin marker expression.

DNAss and RNAss among Subgroups
The DNA hypermethylation of those promoter genes suppressed the gene expression, which, in turn, benefited the cancer cells. Therefore, down-regulation of those genes may lead to the occurrence of cancer stem and progenitor cells by DNA hypermethylation [31,32]. RNAss and DNAss were the lowest in cluster 2 in Figure 7.

Analysis of Mutations and CNVs among the Three Subgroups
A total of thirty immune-cell-associated genes with the highest mutation proportion in each subtype are shown in Figure 8A-C. A further fifty-eight immune-cell-associated genes were identified from the above thirty genes in each subgroup. Together, the results show that there was less overlap among the three subtypes ( Figure 8A-C). In additional, the mutations of ITGA9, ENG, EVI5, ATIC, and FZD2 in cluster 1 were significantly higher than those in other subtypes. The mutations of CTSZ, HOXA1, and KLRF1 in cluster 2 were significantly higher as well compared with those in other subtypes. Likewise, the mutations of DLC1, OSBPL1A, RRP12, C3AR1, MPZL1, and ITK in cluster 3 were significantly higher than those in other subtypes. TMB had a significant difference only between cluster 1 and cluster 2. ( Figure 8D). Finally, the CNV data were analyzed, and 391 normal tissue and 410 tumor tissue were extracted. In a comparison of CNV data of one group with the other two groups (cluster one, two, and three) ( Figure 8E), the results were as follows. The CNV data in one subgroup were compared with the other two subgroups. One gene with significant copy number gains was in cluster 1, and three genes with significant copy number losses were in cluster 1. Figure 8F shows four genes with significant copy number gains and one gene with significant copy number losses in cluster 2. In Figure 8G, there are two genes with significant copy number gains and four genes with significant copy number losses in cluster 3.

Clinical Implications Regarding Our Classification
We analyzed chemotherapy's impact on the three subgroups. Three hundred and ninety-eight cases were treated using chemotherapy in this study. Almost half of the unknown therapy information was in cluster 3 (Table 1), and cluster 3 had the highest luminal biomarker expression and lowest immune infiltration, which demonstrated that the luminal subtype had a good survival rate with and without neoadjuvant chemotherapy [33]. Therefore, we only compared the chemotherapy information and overall survival rate between cluster 1 and cluster 2. In this study, 80.2% of patients in cluster 2 received chemotherapy, and 95.9% of patients in cluster 1 received chemotherapy. Table 1. Response to chemotherapy. CPD represents clinical progressive disease, CR represents complete response, PR represents partial response, and SD represents stable disease. Cancers 2020, 12, x 9 of 22 Table 1. Response to chemotherapy. CPD represents clinical progressive disease, CR represents complete response, PR represents partial response, and SD represents stable disease. ☆Unknown represents "unknown therapy or unknown the response to therapy".

Discussion
In recent years, there has been increased interest in DNA methylation alteration as altered DNA methylation patterns are hallmarks of tumors. Typically, unmethylated promoters may change into densely methylated forms such as tumor suppressors which will facilitate gene silencing [32]. Other sequences may alter into hypomethylated forms in tumors, which results in the abnormal activation of genes that are usually suppressed by DNA methylation [34]. Hypermethylation events have also been reported to be biomarkers of human tumors, for an early examination of blood, urine, and other body fluids for prediction of the response and prognosis of treatment and for monitoring cancer recurrence [35].
To understand the mechanism of cancer, help guide therapy, and improve prognoses, it is vital to identify accurate subtypes. Several studies reported identifying subtypes based on DNA methylation, including colon adenocarcinoma [36], cervical cancer [37], glioblastoma [38], and bladder cancer [39]. This study divided BC into three distinct subtypes based on the immune-cellrelated methylation profiles ( Figure 1B-C). To check for the stability and probability of the classification, PCA was utilized to validate the stability of the classification ( Figure 1D), thus proving that the classification was stable and accurate. The three immune subtypes were related to different clinical results ( Figure 2). The methylation levels among the three subtypes were significantly different. The distribution of immune cells, genes corresponding to specific DNA methylation sites, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, APC_co_inhibition, APC_co_stimulation, HLA, MHC_class_I, Type_I_IFN_Response, and Type_II_IFN_Response had significant differences among the three subgroups. All were used to verify the stability and accuracy of the classification (Figures 4 and 5).
A meta-analysis proved that the DNA methylation trended toward a poor clinicopathological result [40]. However, they found no correlation with age [40], which was the same in our study. An abnormal promoter methylation level was correlated with clinicopathological profiles in BC [41]. Researchers found that higher methylation values of four genes (RASSF1A, CDH1, CDH13, and APC) were significantly associated with several traits of poorer outcome (tumor stage, growth pattern, muscle invasion, and tumor grade). Clearly, our study was similar to the previously mentioned Table 1. Response to chemotherapy. CPD represents clinical progressive disease, CR represents complete response, PR represents partial response, and SD represents stable disease. ☆Unknown represents "unknown therapy or unknown the response to therapy".

Discussion
In recent years, there has been increased interest in DNA methylation alteration as altered DNA methylation patterns are hallmarks of tumors. Typically, unmethylated promoters may change into densely methylated forms such as tumor suppressors which will facilitate gene silencing [32]. Other sequences may alter into hypomethylated forms in tumors, which results in the abnormal activation of genes that are usually suppressed by DNA methylation [34]. Hypermethylation events have also been reported to be biomarkers of human tumors, for an early examination of blood, urine, and other body fluids for prediction of the response and prognosis of treatment and for monitoring cancer recurrence [35].
To understand the mechanism of cancer, help guide therapy, and improve prognoses, it is vital to identify accurate subtypes. Several studies reported identifying subtypes based on DNA methylation, including colon adenocarcinoma [36], cervical cancer [37], glioblastoma [38], and bladder cancer [39]. This study divided BC into three distinct subtypes based on the immune-cellrelated methylation profiles ( Figure 1B-C). To check for the stability and probability of the classification, PCA was utilized to validate the stability of the classification ( Figure 1D), thus proving that the classification was stable and accurate. The three immune subtypes were related to different clinical results (Figure 2). The methylation levels among the three subtypes were significantly different. The distribution of immune cells, genes corresponding to specific DNA methylation sites, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, APC_co_inhibition, APC_co_stimulation, HLA, MHC_class_I, Type_I_IFN_Response, and Type_II_IFN_Response had significant differences among the three subgroups. All were used to verify the stability and accuracy of the classification (Figures 4 and 5).
A meta-analysis proved that the DNA methylation trended toward a poor clinicopathological result [40]. However, they found no correlation with age [40], which was the same in our study. An abnormal promoter methylation level was correlated with clinicopathological profiles in BC [41]. Researchers found that higher methylation values of four genes (RASSF1A, CDH1, CDH13, and APC) were significantly associated with several traits of poorer outcome (tumor stage, growth pattern, Unknown represents "unknown therapy or unknown the response to therapy". CPD represents clinical progressive disease, CR represents complete response, PR represents partial response, and SD represents stable disease. In patients of cluster 2, 59.7% reached a CR after chemotherapy, while in patients of cluster 1, only 32.9% reached a CR. We found that 62.3% of patients of cluster 2 had a response to chemotherapy, which included CR and PR; However, 53.2% of patients in cluster 1 had a response to chemotherapy. In addition, Kaplan-Meier analyses showed that cluster 2 had more improvement in the overall survival after chemotherapy (Table 2 and Figure 1). All the above results demonstrate that cluster 2 had a higher improvement in the overall survival after chemotherapy.

Discussion
In recent years, there has been increased interest in DNA methylation alteration as altered DNA methylation patterns are hallmarks of tumors. Typically, unmethylated promoters may change into densely methylated forms such as tumor suppressors which will facilitate gene silencing [32]. Other sequences may alter into hypomethylated forms in tumors, which results in the abnormal activation of genes that are usually suppressed by DNA methylation [34]. Hypermethylation events have also been reported to be biomarkers of human tumors, for an early examination of blood, urine, and other body fluids for prediction of the response and prognosis of treatment and for monitoring cancer recurrence [35].
To understand the mechanism of cancer, help guide therapy, and improve prognoses, it is vital to identify accurate subtypes. Several studies reported identifying subtypes based on DNA methylation, including colon adenocarcinoma [36], cervical cancer [37], glioblastoma [38], and bladder cancer [39]. This study divided BC into three distinct subtypes based on the immune-cell-related methylation profiles ( Figure 1B-C). To check for the stability and probability of the classification, PCA was utilized to validate the stability of the classification ( Figure 1D), thus proving that the classification was stable and accurate. The three immune subtypes were related to different clinical results (Figure 2). The methylation levels among the three subtypes were significantly different. The distribution of immune cells, genes corresponding to specific DNA methylation sites, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, APC_co_inhibition, APC_co_stimulation, HLA, MHC_class_I, Type_I_IFN_Response, and Type_II_IFN_Response had significant differences among the three subgroups. All were used to verify the stability and accuracy of the classification (Figures 4 and 5).
A meta-analysis proved that the DNA methylation trended toward a poor clinicopathological result [40]. However, they found no correlation with age [40], which was the same in our study. An abnormal promoter methylation level was correlated with clinicopathological profiles in BC [41]. Researchers found that higher methylation values of four genes (RASSF1A, CDH1, CDH13, and APC) were significantly associated with several traits of poorer outcome (tumor stage, growth pattern, muscle invasion, and tumor grade). Clearly, our study was similar to the previously mentioned studies (Figure 2). Cluster 2 had the highest methylation and several poor clinicopathological parameters, including less stage I and II, less M0, less T1 and T2, and no low-grade.
In this current study, different subtypes had different survival rates. This may vary due to the following reasons: (1) abnormal DNA methylation may lead to a poor prognosis in cancer patients [42]. The progression and prognosis of cancer may be affected by the hyper-methylation of DNA [43].
(2) Tumor cells in the microenvironment can express high levels of immunosuppressive cytokines to forbid T cell proliferation and activity while facilitating tumor development and progression [44,45]. Tumor-expressing specific molecules can be sufficient to induce immunosuppression and facilitate immune evasion [46]. Subtle changes in the compositions of immune cells can have different influences on tumor progression [47]. Previous studies reported that a high density of macrophages in the microenvironment was correlated with a poor prognosis in bladder cancer patients [48]. (3) Patients with the luminal subtype presented well with and without neoadjuvant chemotherapy [30,33]. In our study, cluster 3 demonstrated the lowest methylation level, low immune cell infiltration, more luminal biomarkers, and less neuronal differentiation biomarkers, which might indicate why it had a good survival rate ( Figures 1C, 3A, 4A and 6).
However, mid-range-methylation and mid-range immune infiltration were found in cluster 1, which had the worst survival. The highest methylation and the highest immune infiltration were found in cluster 2, which had intermediate survival. We might find the following reasons.
(1) Up-regulation of the VTCN1 expression in bladder cancer led to poor survival [49,50]. B7x (VTCN1) was remarkably overexpressed in many human cancers, and it repressed the antitumor immune effect and regulated the escape from immunosurveillance [51]. A high-level expression of CD80 and CD86 may result in a high survival benefit of patients with nasopharyngeal carcinoma [52]. The absence or low-level expression of CD80 and CD86 in cancers could be one of the mechanisms in which cancers escape immunosurveillance [52]. The checkpoints illustrated in Figure 4E. demonstrated significant differences among the three subgroups. Among them, VTCN1 (B7-H4) had the higher expression in cluster 1. On the contrary, CD80 and CD86 had the lower expressions in cluster 1. (2) The frequency of gene mutation and the CNV were different among the three groups. There was little overlap of mutant genes among the three subtypes, as shown in Figure 8A-C. These CNV genes among the three subgroups were completely different ( Figure 8E). (3) HLA plays an important role in the presentation of neoantigens [53,54]. Due to HLA loss, a tumor can escape immune monitoring [53,54]. (4) The downregulation of MHC class I expression also causes immune escape [55]. As shown in Figure 5, HLA and MHC class I had the highest expression in cluster 2.
(5) High immune infiltration improved the clinical outcomes from chemotherapy [23,56,57]. In this study, 80.2% of patients in cluster 2 received chemotherapy, and 95.9% of patients in cluster 1 received chemotherapy. Thus, all previously mentioned factors might cause a poorer overall survival rate in cluster 1 compared with cluster 2.
A previous study reported that muscle-invasive bladder cancer was divided into five subtypes based on the mRNA expression profiles [30]. The five subtypes were luminal-papillary, luminal-infiltrated, luminal, basal-squamous, and neuronal. The neuronal subtype had the poorest overall survival time [30]. However, the luminal-infiltrated subtype and basal-squamous subtype had medium-level overall survival times [30]. In the present study, cluster 3 had the highest luminal marker expression and the lowest basal and neuronal marker expression. Cluster 2 had the highest basal, immune, and EMT Claudin markers with more T-cell infiltration. The neuronal and squamous markers were the same in cluster 1 and cluster 2. Cluster 1 and cluster 2 with high neuronal marker expression demonstrated poorer survival times, as shown in Figure 6.
The relationship between methylation and immune infiltration is important. A previous study reported that the correlation between DNA methylation and gene expression in lung cancer was identified for approximately 750 genes [58]. They found that one third of these correlations were positive, which indicates the challenges in finding widespread and strong negative correlations between gene expression and genome-wide CpG methylation [58]. A previous study reported that the high methylation subgroup had low infiltration in skin cutaneous melanomas and breast cancer [59]. However, another study showed no distinct correlation between methylation and immune infiltration [60]. In addition, another study showed that one subgroup with low methylation had low infiltration [28]. Cong Liang et al. reported a high concordance between the methylation value and gene expression level, which predicted the immune infiltration levels in tumors [61]. All the above publications proved that the positive or negative correlation between immune infiltration and methylation level depended on the specific CpG sites. In the current study, cluster 2 with high methylation levels had a high immune infiltration (Figure 3).
The scores of stromal and immune cells were based on specific biomarkers associated with the infiltration of stromal and immune cells in the tumor samples. The stromal and immune scores formed ESTIMATE scores. These scores of stromal and immune cells were negatively correlated with the tumor purity. The ESTIMATE scores also had a negative correlation with the tumor purity [62,63]. In this study, cluster 2 had the highest immune infiltration and the highest stromal cells (Figure 4). Overall, cluster 2 had the highest stromal score, immune score, ESTIMATE scores, and the lowest tumor purity, in contrast to cluster 3 ( Figure 4F-I). The distribution of immune scores among the three subgroups was consistent with the distribution of immune cells ( Figure 4F-I).
Endothelial cells can remodel the local immune microenvironment and help tumor cells to escape immunosurveillance in different ways [64]. Endothelial cells not only release chemokines to promote leukocyte migration into tumor tissues, but also express adhesion proteins to facilitate peripheral leukocyte capture [65]. Thus, endothelial cells can forbid the activation and chemotaxis of immune cells and mediate inhibitory molecules to facilitate immune tolerance [66,67]. Endothelial cells also show an increased expression of PD-L1 to repress T cell activation [68][69][70]. In addition, FasL expression in endothelial cells promotes their ability to suppress the activation of CD8+ T cells, causing endothelial-cell-associated immune cell death and promoting tumor escape [71,72]. In the present study, the density of the endothelial cells in cluster 2 was the highest, and immune infiltration was the highest in cluster 2 ( Figure 5). This implies that the endothelial cells might help tumor cells to escape immunity.
Cancer cells interacted with cancer-associated macrophages and tumor-associated fibroblasts, which promote tumor progression in bladder cancer [73]. In the present study, the distribution of fibroblasts among the three subgroups was consistent with the macrophage distribution ( Figure 5). This indicates a correlation between fibroblasts and macrophages. This can be an effective part of a patient's cancer treatment regimen due to the numerous inflammatory molecules that play an important role in the progress and development of cancer [74]. The role of inflammatory molecules is far from being fully understood [74]. Chronic inflammation plays an important role in inducing aberrant methylation [75][76][77][78][79][80]. However, the molecular mechanisms are not yet understood [75][76][77][78][79][80]. A previous reviewer emphasized the importance of the inflammatory response in the recurrence risk and progression of BC [81]. The inflammatory cells and inflammatory cytokines in the chronic inflammatory microenvironment of solid tumors contributed to BC generation and progression via multiple mechanisms [81].
The proinflammatory cells, including immune cells, such as macrophages, myeloid-derived suppressor cells, regulatory T cells, dendritic cells, mast cells, neutrophils, and lymphocytes [81,82], belong to immune infiltration cells. In this study, the distribution of inflammation-promoting cells among the three subgroups was consistent with the distribution of immune cells ( Figure 5). This demonstrates that the inflammation-promoting cells and immune cells might affect each other. However, the precise role of the inflammation promotion requires further study.
A study divided triple-negative breast cancer into three subgroups based on immune profiling [83]: the immunity-high group had the most HLA genes with higher expression, the immunity-mid had medium expression, and the immunity-low group had HLA genes with lower expression. Similarly, the immunity-high group also had high MIC class I, Type I IFN response, Type II IFN response, and APC [29,83,84]. However, the immunity-low group had MIC class I and APC with a lower expression [29,83,84]. In the present study, as shown in Figure 5, we found the same as the above studies.
In the present study, most checkpoints had the highest expression levels in cluster 2. There are several factors that affect the expression of checkpoints. First, aberrant methylation causes abnormal mRNA expression. The CD28, CTLA4, CD80, and CD86 expression values are mediated by DNA methylation [85]. The positive or negative correlation between the expression levels of those checkpoints and methylation levels depends on the specific CpG sites [85]. However, a previous publication showed a significantly positive correlation between the PD-L1 promoter methylation level and protein expression level in advanced gastric cancer [86]. Secondly, the distribution of immune checkpoints varied in different subtypes. A publication showed that the basal-squamous subtype had high CD274 (PD-L1) and CTLA4 expression [30]. Thirdly, high immune infiltration may be one of the factors. Several previous studies reported high immune infiltration with high LD-1(PDCD1) expression [29,83,84]. Similarly, another study reported that tumoral B7-H3 (CD276) overexpression was correlated with high tumor-infiltrating cytotoxic lymphocytes [87]. Two articles showed PDCD1, CD274, PDCD1LG2, CD86, CTLA4, and CD80 overexpression in the subtype with high immune infiltration [29,84].
The DNA hypermethylation of those promoter genes suppressed gene expression, which in turn gave the cancer cells the maximum benefits. Therefore, down-regulation of those genes may lead to the occurrence of cancer stem and progenitor cells by DNA hypermethylation [31,32]. The range of scores was from 0 to 1. Zero indicates high differentiation, and one indicates undifferentiation [88]. However, in this study, the RNA stemness score and DNA stemness score were the lowest in cluster 2 ( Figure 7). This challenged the above findings in which down-regulation of those genes may lead to the occurrence of cancer stem and progenitor cells by DNA hypermethylation. The previous study found that for several tumor types, such as BLCA, LUSC, HNSC, and GBM, there was a negative correlation between DNAss and the leukocyte fraction and/or lower PD-L1 expression [88]. In this study, cluster 2 had the highest immune infiltration and a high-level expression of CD274 ( Figure 4E); however, cluster 3 had the lowest DNAss. This result in the current study is the same as the previous work. In this study, we also found that the lower RNAss was associated with higher immune infiltration and a higher-level expression of CD274 (Figures 4E and 7).
The high TMB, non-small-cell lung cancer group had more DNA methylation aberrations, including hypermethylation [89]. The correlation between TMB and the DNA methylation level was negative in head and neck squamous cell carcinomas [85]. Unlike the previous study, in the present study, TMB demonstrated a remarkable difference only between cluster 1 and cluster 2. (Figure 8D). There are two potential reasons for this. The first reason is that the correlation between TMB and the methylation level in different tumors is varied. The second reason is that the correlation between TMB and the methylation level in different subtypes is also well diversified. Further studies must be performed to verify this.
The correlation between mutation and DNA methylation is also important. A previous study reported that the hypomethylated blocks might promote mutation [90]. The methylation of cytosine can cause mutations to thymine [91]. However, another study reported that differential promoter methylation and somatic mutations interacted with each other in head and neck cancer [92]. The composition of the genes of mutations was different among the three subtypes. Figure 8. shows that there was less overlap among the three subtypes ( Figure 8A-C). The mutations of ITGA9, ENG, EVI5, ATIC, and FZD2 in cluster 1 were significantly higher than those in the other subtypes. These genes are the biomarkers of mast cells, plasmacytoid dendritic cells, Type 2 T helper cells, immature dendritic cells, and macrophages [27].
The mutations of CTSZ, HOXA1, and KLRF1 in cluster 2 were significantly higher than those in the other subtypes. These genes are the biomarkers of natural killer cells, CD56 bright natural killer cells, and gamma delta T cells [27]. The mutations of DLC1, OSBPL1A, RRP12, C3AR1, MPZL1, and ITK in cluster 3 were significantly higher than those in the other subtypes. These genes are the biomarkers of Type 2 T helper cells, eosinophils, effector memory CD8 T cells, activated CD8 T cells, and activated CD4 T cells [27]. These immune cells with mutant genes were different among the three subgroups. The mutant genes could be promising drug targets.
Hypomethylated loci in cancer often coordinate with DNA break hotspots and may therefore contribute to copy number changes [90]. In Figure 7E, the CNV data in one subgroup was compared with the other two subgroups. AKNA with significant copy number gains was in cluster 1, and this gene is the biomarker of activated B cells. PARVG, SIK1, and UPK3A with significant copy number losses were in cluster 1, and these genes are the biomarkers of MDSC, effector memory CD8 T cells, and monocytes [27]. Figure 7F shows CLTB, GEMIN6, SIRPA, and SIRPG with significant copy number gains. These genes are the biomarkers of immature dendritic cells, activated CD8 T cells, plasmacytoid dendritic cells, and central memory CD4 T cells. DYRK2 with significant copy number losses was in cluster 2, and this gene is the biomarker of CD56 dim natural killer cells [27]. Likewise, CSF1R and GUSB with significant copy number gains were in cluster 3 ( Figure 8G), and these genes are the biomarkers of T follicular helper cells and central memory CD8 T cells. CDC7, CHST12, CSF3R, and OGT with significant copy number losses were in cluster 3. These genes are the biomarkers of Type 2 T helper cells, T follicular helper cells, immature dendritic cells, and plasmacytoid dendritic cells [27]. These immune cells with mutant genes among the three subgroups were completely different. They also have a high chance as promising drug targets based on these CNV genes.
Triple-negative breast cancers with high tumor-infiltrating lymphocytes may show increased PD-L1 expression, which may be the reason that triple-negative breast cancers respond robustly to immune checkpoint inhibitor therapy [21]. Likewise, better responses to Atezolizumab were correlated with higher PD-L1 expression in the tumor-infiltrating leukocytes in BC [16]. In the current study, cluster 2 had the highest PD-L1 expression and cluster 1 had PD-L1 expression in the middle. This suggests that cluster 2 would be the most likely to respond to the PD-L1-Blocking Antibody and that cluster 1 might have a mid-range response to the PD-L1-Blocking Antibody.
Next, we analyzed chemotherapy's impact on the three subgroups. Wang et al. reported that high immune scores were associated with therapeutic benefits from chemotherapy [23]. Similarly, breast cancer with high immune infiltration responded to chemotherapy, with pathologic complete response rates of 42% and 40% in the training cohort and validation cohort, respectively [56]. In contrast, those tumors without any immune infiltration had pathologically complete response rates of 3% and 7% in the training cohort and validation cohort, respectively [56]. A study reported that all subtypes of breast cancer with tumor-infiltrating lymphocytes benefited from chemotherapy, in particular, triple-negative breast cancers (TN) with >50% lymphocytic infiltration [21].
The basal subtype of muscle-invasive bladder cancer with immune infiltration had a good response to chemotherapy [24,25]. One study reported the basal subtype with CD19+ tumor-infiltrating B-cells received a significant benefit from adjuvant chemotherapy [24]. Likewise, in the present study, cluster 2 had a greater response to chemotherapy compared with cluster 1, as shown in Table 1. Cluster 2 also had a better survival rate after chemotherapy, as shown in Figure 9 and Table 2. All the above results suggest that cluster 2 is associated with more improvement in the overall survival after chemotherapy. There are two reasons: the first is that cluster 2 has high immune cell infiltration; the second is that cluster 2 has more basal markers ( Figure 6). Then, we analyzed cluster 3. A previous study reported that the luminal subtype with low immune infiltration had a good survival with and without neoadjuvant chemotherapy [33]. In the present study, cluster 3 had the highest luminal marker expression and the lowest immune infiltration, and so this suggests that the cluster 3 might have a good survival rate with or without chemotherapy.
Finally, the three subgroups had specific methylation sites, specific DNA mutations, and CNV, as shown in Figure 8. All these will be promising targets for anti-cancer drug development. These results indicate that the anti-cancer targeted drug therapy are different among the three subgroups.
In conclusion, the classification was accurate and stable. BC patients were successfully divided into three subtypes based on the immune-cell-associated CpG sites. The three subgroups demonstrated different clinicopathologic features. The distribution of immune cells, level of expression of checkpoints, stromal score, immune score, ESTIMATEScore, tumor purity, APC co_inhibition, APC co_stimulation, HLA, MHC class I, Type I IFN response, and Type II IFN response demonstrated significant differences among the three subgroups. The distribution of genomic alterations was also different among the groups. High-level immune infiltration was correlated with high-level methylation. A lower RNAss was associated with higher immune infiltration and a higher level of expression of CD274. Cluster 2 was associated with a better response to chemotherapy. The anti-cancer targeted drug therapy are different among the three subgroups. The study of the intratumoral immune microenvironment may provide a new perspective for therapy in BC.

Data Pre-Processing
Four hundred and thirty-seven samples were used in this study. Methylation data using Illumina Human Methylation 450 arrays were obtained from UCSC Xena (https://xenabrowser.net/datapages/). DNAss, RNAss, and RNA-Seq from 430 BC samples and clinical data also were downloaded from the UCSC Xena website. The Masked Somatic Mutation data (MuTect2. Variant0. Maf) and the CNV data set (Masked Copy Number Segment, affymetrix snp 6.0) were collected from the TCGA website (https://portal.gdc.cancer.gov/repository). The CNV data comprised 814 samples. Due to the fact that the collected databases were public, the publishing policies of these databases were strictly obeyed by us, and ethical approval was not required.

Immune-Cell-Associated Gene Selection
Previous studies found that the DNA methylation sites in promoter regions strongly influenced gene expression [93,94]. The promoter regions were within 2 kb upstream to 0.5 kb downstream from the transcription start sites [93,94]. Immune-cell-associated biomarkers were obtained from previous studies (Table S1). Their corresponding methylation sites in promoter regions were obtained. The analysis showed that the exclusion probe criteria were as follows: (1) if the CpG site data had more than 70% of the samples missing, then the CpG sites were excluded from the analysis [95].
(2) Cross-reactive genome CpG sites were deleted [96]. (3) Probes on the X and Y chromosomes were excluded from the analysis [96]. The remaining sites were imputed with the k-nearest neighbors (KNN) imputation procedure [36].

Unsupervised Hierarchical Cluster Analysis
The methylation sites corresponding to immune-cell-associated genes were acquired. DMSs were identified between normal samples and bladder cancer samples with adjusted p-value < 0.05 and |deltabeta| > 0.2. Unsupervised hierarchical clustering was performed based on immune-cell-associated methylation data to identify subtypes of BC with the "sparcl" R software package (https://CRAN.Rproject.org/package=sparcl). The overall survival curve of the BC subsets was obtained using the Kaplan-Meier method and with the "survival" package in R software. PCA was performed to validate the classification. A barplot was established to demonstrate the correlation between the clinical traits and the biological characteristics of the subtypes. Chi-square tests were performed, and p values less than 0.05 were considered significant.

Single Sample Gene Set Enrichment Analysis (ssGSEA) Based on Immune Cells Biomarker
ssGSEA was used to quantify the infiltration of immune cells that were obtained from the previous study [27]. The ssGSEA ranked the genes based on their absolute expression in a sample with the "GSEABase" and "GSVA" R packages. The enrichment score was calculated by integrating the differences between the empirical cumulative distribution functions of the gene ranks [97,98]. Activated B cells, activated CD8 T cells, effector memory CD8 T cells, central memory CD8 T cells, activated CD4 T cells, effector memory CD4 T cells, central memory CD4 T cells, regulatory T cells, gamma delta T cells, immature B cells, memory B cells, type 17 T helper cells, T follicular helper cells, type 1 T helper cells, and type 2 T helper cells are adaptive immune cells. CD56 dim natural killer cells, CD56 bright natural killer cells, eosinophils, activated dendritic cells, immature dendritic cells, MDSCs, macrophages, monocytes, mast cells, plasmacytoid dendritic cells, natural killer cells, natural killer T cells, and neutrophils are innate immune cells. Immune cells were compared among subgroups using the Kruskal-Wallis test. Apart from this, immune checkpoints were selected from previous studies [29,84] in order to compare them among the subtypes. Then, the Kruskal-Wallis test was performed.

Tumor Microenvironment (TME)
An algorithm called ESTIMATE was used for the estimation of stromal and immune cells in malignant tumor tissues based on the expression data [63]. The ESTIMATE algorithm was obtained from the public source website (https://sourceforge.net/projects/estimateproject/) to estimate the stromal scores and immune scores based on specific biomarkers associated with the infiltration of stromal and immune cells in tumor samples [62]. The stromal scores and immune scores were analyzed separately to predict the level of stromal and immune cells in tumor tissue, and these were combined to reference the tumor purity by ssGSEA [62,99,100]. The stromal scores, immune scores, tumor purity, and ESTIMATE scores were calculated for each sample. Subsequently, they were compared among the subtypes.

Single Sample Gene Set Enrichment Analysis (ssGSEA)
The biomarkers of APC_co_inhibition, APC_co_stimulation, endothelial cells, fibroblasts, HLA, inflammation-promoting, MHC_class_I, Type_I_IFN_Response, and Type_II_IFN_Response were selected from similar studies [82,101] (Table S2). Luminal biomarkers, basal biomarkers, squamous biomarkers, neuronal-differentiation biomarkers, and EMT-Claudin biomarkers were obtained from a previous study [30] (Table S3). Single sample gene set enrichment analysis was used to rank the genes based on their absolute expression in a sample.

DNAss and RNAss among Subgroups
The differentiated phenotype was rapidly lost during cancer progression, and progenitor and stem-cell-like characteristics were acquired [102]. The DNA hypermethylation of those genes suppressed gene expression, which was significantly benefited by cancer cells. Therefore, down-regulation of those genes may lead to the occurrence of cancer stem and progenitor cells by DNA hypermethylation [31,32]. RNAss based on mRNA expression and DNAss based on DNA methylation were utilized to measure the tumor stemness [88]. The range of scores was from 0 to 1. Zero indicates high differentiation, and 1 indicates undifferentiation [88]. Finally, the DNAss and RNAss among the three subgroups were analyzed.

Analysis of Mutations and CNVs among Subgroups
The 'maftools' software package was utilized to analyze and visualize the immune cell biomarker-associated mutation data [103]. The immune cell biomarker-associated mutation data were compared between one group and the other groups using the chi-square test. A p-value of less than 0.05 was considered significant. The TMB, which indicates the density of tumor gene mutations, was compared among subtypes based on the immune cell biomarker-associated mutation data.
The immune cell biomarker-associated CNV data were analyzed. The genomic identification of significant targets in cancer (GISTIC) algorithm was utilized to classify the copy number variant genes with remarkable gains and losses [104,105]. The parameter thresholds were adjusted to 0.2 and −0.2 for genomic gains and losses, respectively [104,105]. Immune cell biomarker-associated copy number variant data were compared between one group with the remaining groups using a chi-square test. A p-value of < 0.01 was considered significant.

Conclusions
The classification was accurate and stable. BC patients were successfully divided into three subtypes based on the immune-cell-associated CpG sites. Specific biological signaling pathways, immune mechanisms, and genomic alterations were found to have variations among the three subgroups. High-level immune infiltration was correlated with high-level methylation. The lower RNAss was associated with higher immune infiltration.

Conflicts of Interest:
The authors declare that they have no competing interests.