Gene Expression Profiles Identify Biomarkers of Resistance to Decitabine in Myelodysplastic Syndromes

Myelodysplastic syndrome (MDS) is a clonal hematopoietic stem cell disease characterized by inefficient hematopoiesis and the potential development of acute leukemia. Among the most notable advances in the treatment of MDS is the hypomethylating agent, decitabine (5-aza-2′deoxycytidine). Although decitabine is well known as an effective method for treating MDS patients, only a subset of patients respond and a tolerance often develops, leading to treatment failure. Moreover, decitabine treatment is costly and causes unnecessary toxicity. Therefore, clarifying the mechanism of decitabine resistance is important for improving its therapeutic efficacy. To this end, we established a decitabine-resistant F-36P cell line from the parental F-36P leukemia cell line, and applied a genetic approach employing next-generation sequencing, various experimental techniques, and bioinformatics tools to determine differences in gene expression and relationships among genes. Thirty-eight candidate genes encoding proteins involved in decitabine-resistant-related pathways, including immune checkpoints, the regulation of myeloid cell differentiation, and PI3K-Akt signaling, were identified. Interestingly, two of the candidate genes, AKT3 and FOS, were overexpressed in MDS patients with poor prognoses. On the basis of these results, we are pursuing development of a gene chip for diagnosing decitabine resistance in MDS patients, with the goal of ultimately improving the power to predict treatment strategies and the prognosis of MDS patients.


Introduction
Myelodysplastic syndrome (MDS) is a hematologic disorder characterized by marrow failure and a risk of clonal progression. MDS, also known as pre-leukemia, is characterized by a high risk of transformation to acute myeloid leukemia (AML). The exact cause of MDS is uncertain in many cases, although preceding exposure to radiation therapy, chemotherapy, poisonous solvents, tobacco, or agricultural chemicals are risk factors in some cases [1]. Apart from allogeneic hematopoietic stem cell transplantation (HSCT), effective therapies are absent; thus, MDS has long been considered a difficult to control disease. Although allogeneic HSCT can be a curative treatment option for MDS, high treatment-related mortality limits the universal application of allogeneic HSCT in all patients with MDS. This has led to the development of the International Prognostic Scoring System (IPSS), a risk stratification by continuously exposing parental F-36P cells to gradually increasing concentrations of DEC (Sigma-Aldrich, St. Louis, MO, USA). The initial DEC concentration was 1 µmol/L and was then increased exponentially in steps until reaching 32 µmol/L. Selected cells were cultured in DEC-free medium for at least 2 weeks prior to experiments. F-36P and F-36P/DEC cells were incubated in RPMI-1640 medium (GenDEPOT, Katy, TX, USA) containing 5% fetal bovine serum (GenDEPOT), 1% penicillin (GenDEPOT), 5 ng/mL interleukin (IL)-3 (Sigma-Aldrich), and 2 mM glutamine (Sigma-Aldrich) at 37 • C in a humidified, 5% CO 2 atmosphere.

Cell Morphology and Measurement of Drug Sensitivity
F-36P and F-36P/DEC cells were observed under an inverted light microscope (Nikon, Melville, NY, USA) equipped with a 10× eyepiece and both 10× and 40× objective lenses. Images of each cell type were acquired using Motic Images Plus 2.0 software (Motic Inc., Xiamen, China). Cells were seeded into 96-well plates at 1 × 10 4 cells/well in 0.1 mL of complete medium. The proliferation and cytotoxicity of DEC were investigated using WST assay kits (LPS Solution, Daejeon, Korea). After incubation in 96-well plates, F-36P and F-36P/DEC cells were incubated without or with different concentrations of DEC (1.75, 3.5, 7.5, 15, and 30 µmol/L) for 72 h. Cell proliferation was measured by adding 20 µL of WST reagent to each well and then incubating for 1 h in a 37 • C incubator. The absorbance of wells at 450 nm was read using a Multiskan FC instrument (Thermo Fisher Scientific Inc., Waltham, MA, USA), and the percentage growth was calculated. Proliferation was assessed at 24, 48, and 72 h in wells without DEC. The method for measuring cell cytotoxicity at 72 h following the seeding of cells was identical to that used to measure cell proliferation. The concentration of DEC required to inhibit growth by 50% (half-maximal inhibitory concentration) at 72 h was used for calculation of IC 50 . The degree of resistance was scored relative to the IC 50 value. The IC 50 value of DEC was analyzed by probit analysis using SPSS software version 26.0 (IBM Inc., Armonk, NY, USA).

Fluorescence-Activated Cell Sorting (FACS) Analysis
A cell-cycle analysis was conducted using a BD Accuri C6 Plus flow cytometer (BD Biosciences, Franklin Lakes, NJ, USA). Briefly, cells were seeded into six-well plates at 2 × 10 4 cells/well in 2 mL. After 72 h, cells were harvested by centrifugation at 1000× g for 5 min at room temperature, washed with PBS, fixed overnight at 4 • C with ice-cold 75% ethanol, and incubated with 10 µg/mL RNase A (Sigma-Aldrich) and 20 µg/mL propidium iodide (Sigma-Aldrich) for 15 min at room temperature in the dark. Cells in G1, S, and G2/M phases were identified based on fluorescence intensity, and the cell-cycle distribution was analyzed using BD Accuri C6 Plus software version 1.0.23.1 (BD Biosciences).

NanoString Targeted Gene Expression
Total RNA samples were analyzed using the NanoString nCounter PanCancer Pathways Panel Analysis System (NanoString Technologies, Seattle, WA, USA). For each hybridization reaction, 100 ng total RNA (or any quantity that was present in a 5 µL aliquot of purified RNA if less than 100 ng) was used. nCounter raw data (RCC files) for each sample were imported into nSolver Analysis Software version 4.0 (NanoString Technologies) for review of quality control metrics (all passed). The spike-in positive control geometric mean normalization factor (without negative control background subtraction) was determined for each sample using nSolver. Raw counts from these genes, together with spike-in positive controls, were then used to perform both positive control and reference gene normalization of the raw data in nSolver. These data were then exported in CSV format for gene expression analyses. Count data were converted into counts per million mapped reads (CPM) using the fpm function in DESeq2. A pseudo count of 1 was added to all counts before calculating the log 2 value and the log 2 fold-change was calculated. Data mining and graphic visualization were performed using nCounter Advanced Analysis software (NanoString Technologies).

Gene and Pathway Enrichment Analyses of Differentially Expressed Genes (DEGs)
DAVID (Database for Annotation, Visualization and Integrated Discovery; https: //david.ncifcrf.gov/, accessed on 13 November 2020), an online biological information database, was used for the pathway enrichment analysis of DEGs. The Gene Ontology (GO) term enrichment analysis annotated by the DAVID database is composed of three attributes: molecular function (MF), biological process (BP), and cellular component (CC). Pathway enrichment analyses were conducted using the DAVID website tools, Kyoto Encyclopedia of Genes and Genomes (KEGG), and REACTOME; p-values < 0.05 were considered statistically significant.

Protein-Protein Network and Module Analysis
Protein-protein interaction (PPI) networks were mapped using Cytoscape (version 3.8.2; https://cytoscape.org/, accessed on 11 November 2020), a public-access software that can graphically edit, display, and analyze the network. Significant modules in PPI networks were identified using Molecular Complex Detection (MCODE), a plug-in app of Cytoscape designed to analyze densely connected regions by clustering a given network. Hub genes were identified using the cytoHubba analysis in Cytoscape. Analyses with ClueGO, a Cytoscape plug-in, were performed using databases updated in May 2021.

Quantitative Reverse Transcription-Polymerase Chain Reaction (qRT-PCR)
Total RNA was extracted using QIAzol reagent (Qiagen), then reverse transcribed into cDNA using amfiRivert reverse transcriptase (GenDEPOT) according to the manufacturer's instructions. cDNA was amplified by PCR on a Mic Real-Time PCR system (Bio Molecular Systems, Upper Coomera, QLD, Australia) using Luna Universal qPCR master mix (New England Biolabs Inc., Ipswich, MA, USA) and primer pairs specific to target genes (Table  S1). Expression data were normalized to the geometric mean of the housekeeping gene, GAPDH, to control the variability in expression levels and analyzed using the ∆CT and 2 −∆∆CT quantification method.

Validation of Genetic Alterations in Candidate Genes
Genetic alterations in candidate genes in the myeloid neoplasm dataset were analyzed using cBioPortal (http://cbioportal.org/, accessed on 1 October 2021), an online analysis platform for multidimensional cancer genomic data that provides a collective visualization of genes, samples, and data types.

Patient Enrollment and Treatment-Bone Marrow Samples
Bone marrow-derived blood samples were obtained from four MDS patients diagnosed at Seoul National University Hospital (SNUH). Patient samples were collected with each patient's informed consent after receiving approval from the Institutional Review Board of SNUH. MDS subtypes were classified according to the revised World Health Organization classification of myeloid neoplasm [24]. Patients were treated with DEC (20 mg/m 2 /d for 5 days every 4 weeks), and response to treatment and clinical outcome were evaluated according to revised International Working Group (IWG) response criteria. The patients used for the analysis were classified into two groups: (1) Patients 1 and 2, bone marrow samples obtained at initial diagnosis, pre-DEC treatment; and (2) Patients 3 and 4, samples obtained in complete response (CR) status after four cycles of DEC treatment.

Statistical Analysis
All experiments were performed in at least triplicate (n ≥ 3), and data are presented as means ± standard deviation (SD). Differences between two sample means were determined with a Student's t-test for independent samples using SPSS software version 26.0. Differences with p-values < 0.05 were considered statistically significant.

Establishment of the DEC-Resistant Cell Line, F-36P/DEC
To study the mechanism of resistance to DEC, we established an in vitro DEC-resistant cell line model from the parental cell line F-36P ( Figure 1A). The morphological differences between F-36P cells and the derived DEC-resistant F-36P/DEC cells were surveyed using an inverted light microscope at magnifications of 100× and 400×. As shown in Figure 1B, F-36P cells were detected as single cells under the light microscope, whereas F-36P/DEC cells were observed to aggregate. We further found that F-36P/DEC cells stably proliferated over 72 h ( Figure 1C). The IC 50 value for DEC was 17.27 µmol/L in F-36P cells and more than 30 µmol/L in F-36P/DEC cells. However, although we used DEC at concentrations up to 1000 µmol/L in the F-36P/DEC cell line, we were unable to accurately calculate an IC 50 value for these cells ( Figure 1D). To further explore the biological properties of F-36P/DEC cells, we compared the cell-cycle properties of F-36P/DEC cells with those of F-36P cells, finding that the proportion of cells in the G2/M phase of the cell cycle was significantly increased in F-36P/DEC cells ( Figure 1E). Collectively, these results indicate that the F-36P/DEC cell model generated for these studies is significantly resistant to DEC and has characteristics that differ from those of F-36P cells.

Identification of Genes That Are Differentially Expressed between F-36P/DEC and F-36P Cells
To identify genes associated with DEC resistance, we screened for candidate genes that were differentially expressed between F-36P and F-36P/DEC cells using a NanoString analysis [25]. The 189 differentially expressed genes (DEGs) obtained as a result of this analysis were expressed as a cluster heatmap [26] using nSolver ( Figure 2A and Table S2). For a cluster heatmap, hierarchical clustering using average linkage and a Euclidean distance metric was used. A volcano plot used to display the distribution of DEGs in F-36P/DEC relative to F-36P ( Figure 2B and Table S3) shows −log 10 (p-value) values and log 2 fold changes in DEGs. Statistically significant genes (p-value ≤ 0.05) are located at the top of the plot above the horizontal lines, with strongly differentially expressed genes falling on either side. Using a scatter plot to show the distribution of F-36P/DEC DEGs, we found that the larger the difference in expression level, the more the trend line deviated from a linear function ( Figure 2C). Collectively, these data allowed us to select more specific DEGs between F-36P/DEC and F-36P cells. We also performed a gene ontology (GO) analysis [27] of significantly enriched genes to obtain overview information about the function of protein products of our DEGs ( Table 1). The regulation of transcription, cell cycle, and proliferation pathways (biological process), nucleus and extracellular region (cellular compartment), and transcription factor binding and protein kinase activity (molecular function) were the GO properties most frequently associated with identified DEGs.  Results are presented as means ± SD (error bars) (n = 3; * p < 0.05, ** p < 0.01, *** p < 0.001 compared to the control).
linear function ( Figure 2C). Collectively, these data allowed us to select more specific DEGs between F-36P/DEC and F-36P cells. We also performed a gene ontology (GO) analysis [27] of significantly enriched genes to obtain overview information about the function of protein products of our DEGs ( Table 1). The regulation of transcription, cell cycle, and proliferation pathways (biological process), nucleus and extracellular region (cellular compartment), and transcription factor binding and protein kinase activity (molecular function) were the GO properties most frequently associated with identified DEGs.

Functional Classification of DEGs Associated with DEC Resistance of the Cell Line, F-36P/DEC
Protein-protein interaction (PPI) networks can reveal physical contacts between protein pairs and identify biological pathway clusters [28]. Using this approach, we generated a PPI network from 66 DEGs that contained 54 nodes and 198 edges ( Figure 3A and Table S4). The resulting network was still very complex; thus, we subsequently performed a minimal common oncology data elements (MCODE) analysis with specific modules [29]. The networks were also confirmed by cytoHubba analysis [30], which generated similar hub networks ( Figure 3B-E). This subnetwork analysis identified STAT3 (signal transducer and activator of transcription 3), RELN (reelin), IL11RA (interleukin-11 receptor, alpha subunit), and WNT3 (proto-oncogene protein Wnt-3), among others, as hub genes. A ClueGO and module analysis, which can improve biological interpretation by organizing separate GO pathway term networks [31,32], revealed various additional processes linked to DEC resistance, including immune cell activation, regulation of myeloid cell differentiation, transcriptional misregulation, PI3K-Akt signaling, and JAK-STAT signaling (Figure 4).

Validation of Candidates Identified by RNA-Seq Analysis
We next verified gene expression profiles using quantitative RT-PCR (qRT-PCR) analysis, selecting the top 38 genes with the highest fold-changes for validation. These top genes included those encoding proteins involved in immune cell activation (GFI1, IL12B, NFKB1, FOS), the regulation of myeloid cell differentiation (ETS1, ID2, KMT2E), transcriptional misregulation (BCL2A1, MLF1, RUNX1T1), PI3K-Akt signaling (AKT3, MET, RELN), and the JAK-STAT signaling pathway (BCL2, EGF, STAT3). The molecular properties of these gene products are summarized in the Supplementary Data (Table S5). As shown in Figure 5A, qRT-PCR analyses confirmed upregulation of our candidates in F-36P/DEC cells. Using the cBioPortal for cancer genomics, originally developed for interactive exploration of multidimensional cancer genomic datasets [33], we extracted a dataset from the clinical expression dataset of myeloid neoplasms containing 11 studies and 6940 patient samples ( Figure 5B). The majority of our candidates were found in the extracted dataset, consolidating the credibility of our candidate list. Thirty-eight genes have been shown to be associated with different types of alterations, most of which have been associated with mutations. In particular, 24 genes were associated with amplification (red bars). Therefore, these results suggest that most of our candidate genes are clinically relevant and allow reliable discovery. to DEC resistance, including immune cell activation, regulation of myeloid cell differentiation, transcriptional misregulation, PI3K-Akt signaling, and JAK-STAT signaling ( Figure  4).   Highly interconnected GO terms are presented. Terms in bold font indicate top GO terms. Gene names within subgroups were generated using ClueGO default settings. All GO terms shown are statistically significant (p < 0.05 with Bonferroni correction).

Validation of Candidates Identified by RNA-Seq Analysis
We next verified gene expression profiles using quantitative RT-PCR (qRT-PCR) analysis, selecting the top 38 genes with the highest fold-changes for validation. These top genes included those encoding proteins involved in immune cell activation (GFI1, IL12B, NFKB1, FOS), the regulation of myeloid cell differentiation (ETS1, ID2, KMT2E), transcriptional misregulation (BCL2A1, MLF1, RUNX1T1), PI3K-Akt signaling (AKT3, MET, RELN), and the JAK-STAT signaling pathway (BCL2, EGF, STAT3). The molecular properties of these gene products are summarized in the Supplementary Data (Table S5). As shown in Figure 5A, qRT-PCR analyses confirmed upregulation of our candidates in F- Highly interconnected GO terms are presented. Terms in bold font indicate top GO terms. Gene names within subgroups were generated using ClueGO default settings. All GO terms shown are statistically significant (p < 0.05 with Bonferroni correction). taset from the clinical expression dataset of myeloid neoplasms containing 11 studies and 6940 patient samples ( Figure 5B). The majority of our candidates were found in the extracted dataset, consolidating the credibility of our candidate list. Thirty-eight genes have been shown to be associated with different types of alterations, most of which have been associated with mutations. In particular, 24 genes were associated with amplification (red bars). Therefore, these results suggest that most of our candidate genes are clinically relevant and allow reliable discovery.

Comparison of Gene Expression in Bone Marrow from Patients with MDS
Four patients with MDS were enrolled in this study. The median age of patients at the time of diagnosis of MDS was 64 years (range, . MDS subtypes were MDS-EB1 and MDS-EB2, and IPSS risk categories at baseline were intermediate-2 and high risk ( Table 2). Patients were treated with a median of nine cycles of DEC (range, [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16], and all patients achieved an objective response (complete remission in three patients and partial remission in one patient). The duration of the response was less than 1 year in two patients (9 months in Patient 1 and 11 months in Patient 3); another patient (Patient 4) proceeded to HSCT after five cycles of DEC, but expired due to pneumonia 17 months later. In contrast to the relatively shorter survival of these patients, Patient 2 showed a significantly longer survival (52 months). IPSS-R was well correlated with the time to progression, with two patients with very high risk by IPSS-R progressing within 1 year, and two patients with intermediate and high risk by IPSS-R showing continued response to DEC after 2 years and more than 4 years, respectively. However, IPSS and the MDS subtypes MDS-EB1 and MDS-EB2 were not predictive of prognosis. We next performed qRT-PCR to assess gene expression in patient bone marrow samples, using bone marrow-derived samples before DEC treatment for Patients 1 and 2 and samples obtained after four cycles of DEC treatment for Patients 3 and 4. For this analysis, qRT-PCR results for each bone marrow sample, quantified using the 2 −∆∆CT method, were normalized to those of 10 normal bone marrow samples (defined as 1). This analysis confirmed differences in mRNA expression of DNA methyltransferase (DNMT) genes in patient bone marrow samples, specifically showing a slightly higher expression of DNMT1 and DNMT3A in Patient 2 compared with normal bone marrow, but no difference in DNMT3B. In other patients, the expression of DNMT genes was decreased compared to that of normal controls ( Figure 6A). Surprisingly, a comparison of the mRNA levels of 38 genes in bone marrow samples from patients showed that expression levels of AKT3 and FOS genes were similar to normal in Patient 2 but increased in the other patients ( Figure 6B and Table S6).

Discussion
Clinical outcomes of patients with MDS who experience HMA treatment failure are poor [34,35]. Thus, elucidating the resistance mechanism is important for overcoming this problem. Drug-resistant cell line models can be useful in vitro tools for understanding the mechanisms underlying clinical anticancer drug resistance, enabling the discovery of molecular alterations between a drug-resistant cell line and its drug-sensitive counterpart.

Discussion
Clinical outcomes of patients with MDS who experience HMA treatment failure are poor [34,35]. Thus, elucidating the resistance mechanism is important for overcoming this problem. Drug-resistant cell line models can be useful in vitro tools for understanding the mechanisms underlying clinical anticancer drug resistance, enabling the discovery of molecular alterations between a drug-resistant cell line and its drug-sensitive counterpart. Furthermore, acquired-resistance cell line models can play an important additional role in revealing the mechanisms of action of new anticancer drugs [36]. It has also been shown that the expression levels of certain cancer-related genes can be altered in DEC-resistant cell lines and that the regulation of gene expression levels can influence resistance [37]. To date, most research on DEC resistance has focused on resistant cells derived from a few cell lines originating from acute promyelocytic leukemia (HL-60) or chronic myeloid leukemia (K562 cells) [11,13,37]. However, DEC is not used to treat acute promyelocytic leukemia or chronic myeloid leukemia. Accordingly, data obtained from HL-60 or K562 cells cannot be expected to accurately represent the phenomenon of DEC resistance in actual MDS patients.
The DEC-resistant MDS cell line developed for this study was produced by continuously exposing F-36P cells to staged increases in DEC concentration. DEGs associated with DEC resistance were determined by biological and bioinformatic analyses of the gene expression profiles of parental F-36P and F-36P/DEC cells. On the basis of this expression profiling, we selected 38 candidate DEGs, including those involved in immune cell activation, the regulation of myeloid cell differentiation, transcriptional misregulation, PI3K-Akt signaling, and the JAK-STAT signaling pathway, for further analysis. These analyses identified pathways that overlap with previously reported pathways; however, the individual genes were quite different.
A number of genetic studies performed to date have identified groups of mutated genes that contribute to the pathogenesis of MDS and resistance to HMAs [20,38]. These genes were categorized according to a limited number of cellular processes, including epigenetic, RNA splicing and traditional transcriptional regulation, and signal transduction. Several large studies have evaluated the prognostic value of MDS-associated gene mutations across a broad cross-section of patients [39][40][41]. Somatic mutations in certain genes reproducibly predict patient prognosis. Across studies, mutations in genes encoding ETV6, RUNX1, TP53, EZH2, ASXL1, FLT3-ITD, and SRSF2 predict poor overall survival, whereas mutations in SF3B1 are associated with better clinical outcomes [40,42]. It is also known that mutations in TP53 and TET2 affect the prognosis of patients undergoing DEC therapy [17]. Considering the high incidence of mutations, it can be assumed that mutations play a role in MDS pathogenesis and resistance to HMAs. However, many such studies did not distinguish specific mutations that were closely related to an incomplete response to treatment. Therefore, it is reasonable to suggest that the decision to treat MDS patients with HMAs should not be based solely on the basis of mutation information.
We next looked for possible regulatory mechanisms governing the expression of these genes, focusing on FOS, a component of transcription factor AP-1 (activator protein 1), and PI3K-AKT signaling pathways, which play essential roles in hematological malignancies. We found that FOS transcriptionally induced the expression of genes encoding the coinhibitory immune checkpoint proteins PD-1 (programmed cell death protein-1) and PD-L1 (ligand for PD-1) through binding to enhancer regions of the respective gene promoters. It has additionally been shown that blockade of the FOS-mediated induction of PD-1 could be harnessed therapeutically to restore T cell-mediated antitumor responses [43]. Impairment of AKT3, a member of the AKT protein family, has also been linked to multiple myeloma, and mutations in the AKT3 gene have been frequently reported in de novo Philadelphia chromosome-positive AML. Accordingly, the presence of a mutated AKT3 gene is correlated with, and predicts, a poor clinical response in MDS patients [44]. Interestingly, another study confirmed that AKT3 protein levels are overexpressed in the plasma of MDS patients [45]. In DEC-induced DNA reprogramming, chimeric antigen receptor T (CAR T) cells, compared to the control group, differentially expressed genes including AKT3, and it is expected that this can enhance antitumor effects or reduce tumor recurrence [46]. We found that AKT3 and FOS genes were overexpressed in F36P/DEC cells, and clinical data even showed overexpression of these genes in MDS patients with poor prognosis.
In the current study, a bioinformatics analysis was used to identify biological networks, and in vitro experiments were conducted to verify bioinformatics findings. Our examination of the roles of certain molecules and the demonstration of the involvement of specific signaling pathways in the current study is expected to expand our understanding of the molecular mechanism of DEC resistance. However, our study is not without limitations, including the need to validate our findings in a larger group of patients and to perform in-depth experiments to confirm that the tendency toward the increased expression of genes identified in vitro is recapitulated in vivo. Future advances should yield more accurate data from bioinformatics analyses, confirming pathways that are significant for DEC resistance in MDS and potentially achieving a comprehensive understanding of this process.
In conclusion, we successfully established a good in vitro model with potential utility in understanding the molecular mechanisms and origin of DEC resistance. Using this model, we observed the increased expression of select genes from among 38 DEGs identified, including AKT3 and FOS, in F-36P/DEC cells and showed that they were associated with several pathways with potential involvement in DEC resistance, consistent with the results of our bioinformatics analysis. Our findings can serve as a reference for other researchers involved in investigating the development of resistance to DEC and provide tools that could aid in the application of combination therapeutics and biomarkers as therapies for MDS patients.