Next Article in Journal
Potassium-Hydroxide-Based Extraction of Nicotinamide Adenine Dinucleotides from Biological Samples Offers Accurate Assessment of Intracellular Redox Status
Previous Article in Journal
Gene Expression Profiling of Transcription Factors and Acclimation-Related Genes in Ribes spp.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

The Anti-Cancer Potential of Genistein: Single-Cell RNA Sequencing Analysis and Spatial Transcriptome Reveal That Genistein Targets HSD17B1 to Inhibit the Progression of Gastric Adenocarcinoma

1
The School of Molecular Sciences, The University of Western Australia, Perth, WA 6009, Australia
2
Biological Science Research Center, Southwest University, Chongqing 400715, China
3
The State Key Laboratory of Resource Insects, Southwest University, Chongqing 400715, China
*
Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(21), 10369; https://doi.org/10.3390/ijms262110369
Submission received: 26 August 2025 / Revised: 15 October 2025 / Accepted: 23 October 2025 / Published: 24 October 2025
(This article belongs to the Section Molecular Oncology)

Abstract

Genistein has anti-cancer effects, but its molecular targets in gastric adenocarcinoma (GA) are unclear. This study used single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) to explore genistein’s “drug-gene-cell” interactions in GA. GA- and genistein-related target genes were retrieved and intersected with differentially expressed genes identified from bulk transcriptomic data. Machine learning screened candidates, and survival analysis assessed prognosis. Molecular docking with genistein validated key genes, with molecular dynamics assessing binding stability. HSD17B1, EZH2, CCNB1, CCNB2, CDKN2A, and IGFBP6 were identified as key candidate genes with prognostic value for GA. Specifically, samples in the IGFBP6 high-expression group were associated with higher survival probability, whereas the opposite trend was observed for the other five genes. In addition, HSD17B1 was genistein’s main target in GA treatment, showing a strong binding affinity with genistein (binding energy of −8.1 kcal/mol). scRNA-seq analysis indicated that HSD17B1 was predominantly expressed in epithelial cells and was significantly involved during their malignant transformation (confirmed by ST). This study identified HSD17B1 as a critical target gene for genistein in GA treatment, emphasizing its roles in the malignant transformation of epithelial cells, thus providing a theoretical foundation for understanding the therapeutic mechanism of genistein in GA.

1. Introduction

Gastric cancer (GC) is the fifth most common and lethal gastrointestinal malignancy globally. According to the American Cancer Society, over 968,000 new GC cases were reported worldwide in 2022, resulting in nearly 660,000 deaths [1]. Gastric adenocarcinoma (GA) is the predominant subtype, accounting for 90% to 95% of GC cases. For patients with advanced-stage GA (clinical stage IV), the median overall survival is between 9 and 10 months, with a 5-year survival rate of less than 30% [2,3,4]. Despite its high incidence, research on therapeutic strategies for GA remains limited compared to other cancers.
Phytoestrogens, as naturally occurring bioactive compounds, have long been recognized for their antioxidant properties and potential role in cancer prevention [5]. These compounds share structural similarities with endogenous estrogen, enabling them to modulate estrogen receptor (ER) expression and potentially induce apoptosis through ER downregulation—a mechanism that has shown promise in breast cancer therapy [5]. Although the specific effects of phytoestrogens in GA have not yet been systematically investigated, emerging evidence from a cross-sectional study indicates a significant association between phytoestrogens and lung function, with this relationship appearing to be influenced by phytoestrogen subtypes, patient sex, and smoking status [6].
Genistein, or 5,7,4′-trihydroxyisoflavone, is a naturally occurring isoflavone (phytoestrogens) primarily found in the roots and seeds of leguminous plants. Genistein, though not a steroidal estrogen like estradiol, estriol, and estrone, shares structural similarities with them such as possessing aromatic rings and phenolic hydroxyl groups that enable interactions with estrogen receptors (Figure S1). This compound has demonstrated significant anti-inflammatory and anti-cancer effects across various malignancies, including colorectal, bladder, breast, prostate, and non-small cell lung cancers [7]. It inhibits the secretion of angiogenic factors under hypoxic conditions, thereby limiting tumor growth [8]. Additionally, genistein reduces cancer stem cell-like properties in GC cells, decreasing their self-renewal, drug resistance, and invasive capabilities [9,10]. Moreover, it enhances the therapeutic efficacy of conventional chemotherapeutic agents such as 5-fluorouracil (5-FU) and cisplatin, and sensitizes TRAIL-resistant GA cells to TRAIL-induced apoptosis [11,12]. In GA, genistein inhibits cellular proliferation by inducing cell cycle arrest, suppressing NF-κB activity, and upregulating apoptosis-associated proteins like caspase-3, thereby promoting dose- and time-dependent apoptosis in cancer cells [13,14,15]. However, the precise molecular target of genistein remains incompletely defined, and the underlying mechanisms of its pharmacological action require further exploration.
Single-cell RNA sequencing (scRNA-seq) is an advanced, high-throughput technique that enables transcriptomic profiling at the individual cell level, providing insights into gene expression heterogeneity within cell populations. This method offers significant advantages, including the ability to identify cellular diversity, reconstruct differentiation pathways, and detect rare cell types [16]. scRNA-seq is widely utilized in GA research to examine tumor cell diversity at the single-cell level, identifying various immune cell subsets, cancer-associated fibroblasts, and malignant epithelial groups. It also aids in reconstructing cell differentiation trajectories and uncovering key transcription factors and signaling pathways involved in tumor progression. These insights are instrumental in pinpointing potential immunotherapy targets and developing personalized treatment strategies [17,18]. Additionally, spatial transcriptomics (ST), which combines scRNA-seq with spatial localization, allows for the concurrent collection of transcriptomic and positional information from tissue sections. This approach enhances the study of the tissue microenvironment, cellular diversity, and spatial regulatory mechanisms with greater resolution and context [19,20]. Chen MM et al. combined ST with single-nucleus RNA sequencing to analyze 25 pancreatic ductal adenocarcinoma samples, clarifying the spatial niche differences in tumor microenvironment, neuro-related cell composition and interactions between high/low neural invasion cases, and providing spatial evidence for deciphering tumor-immune-nerve crosstalk and neural invasion mechanisms [21]. Similarly, the Ma H team integrated ST and scRNA-seq data from multiple GA samples and identified intratumoral heterogeneity based on spatially resolved gene expression patterns, which were associated with distinct immune microenvironments [22]. In this study, scRNA-seq and ST were employed to explore cellular heterogeneity in GA, with the goal of identifying key functional cell populations and disease-related genes, along with their spatial expression patterns within the tumor microenvironment.
This study integrated network pharmacology, bulk RNA-seq, and molecular docking to identify key genes targeted by genistein in GA, supported by molecular dynamics (MD) simulations. scRNA-seq and ST further characterized tumor heterogeneity and the microenvironment, revealing critical cell populations and spatially resolved gene expression patterns (Figure 1).

2. Results

2.1. The Candidate Genes Were Involved in Immune and Cancer-Related Pathways

Principal components analysis (PCA) of the GSE29998 dataset revealed a distinct separation between GA and control samples (Figure 2A). Differential expression analysis identified 2043 differentially expressed genes (DEGs), including 904 downregulated genes and 1139 upregulated genes in GA samples (Figure 2B,C). Pathway enrichment analysis showed that Gene Ontology (GO) terms were mainly enriched in stress response, chromosome organization, extracellular structure, immune activity, and ion transport, while Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways highlighted drug metabolism (cytochrome P450) and viral protein–cytokine interactions, indicating coordinated regulation of cellular response, immunity, and metabolism (Figure 2D–G).
The molecular structure of genistein is shown in Figure 3A. After deduplication and merging, genistein was predicted to interact with 596 drug target genes, while GA was associated with 1622 target genes. The intersection of drug target genes, GA target genes, and DEGs resulted in 29 candidate genes, including 18 upregulated genes (e.g., TYMS and TLR4) and 11 downregulated genes (e.g., KIT and CAPN9) (Figure 3B,C, Table S1). Chromosomal localization analysis revealed that EZH2, AKR1B10, COL1A2, and CYP3A4 were co-localized on chromosome 7 (Figure S2A). Correlation analysis indicated that most genes exhibited positive correlations, exemplified by TYMS and CCNB2 (Figure S2B). The protein–protein interaction (PPI) network demonstrated complex interactions among these genes at the protein level, such as CA9 interacting with CD28 (Figure S2C). GO annotation revealed significant enrichment of these candidate genes in 451 biological process (BP) terms, 22 molecular function (MF) terms, and 12 cellular component (CC) terms, with notable immune-related functions such as leukocyte proliferation, B cell activation, and lymphocyte proliferation (Figure 3D). KEGG pathway enrichment identified 38 significantly enriched pathways, including cancer-related pathways such as the p53 signaling pathway and proteoglycans in cancer (Figure 3E).

2.2. HSD17B1, EZH2, CCNB1, CCNB2, CDKN2A, and IGFBP6 Served as Candidate Key Genes

Among the machine learning models developed using candidate genes, the NB model showed superior performance, with the highest area under the curve (AUC) value (AUC = 1), minimal residual error, and greatest classification accuracy as indicated by the confusion matrix (Figure 4A–D, Table S2). Therefore, the Naive Bayes (NB) model was selected as the optimal predictive model, with 6 genes within the model identified as candidate key genes (HSD17B1, EZH2, CCNB1, CCNB2, CDKN2A, and IGFBP6). These candidate genes demonstrated high diagnostic accuracy for GA, with AUC values above 0.6 for each individual gene. A predictive model constructed from these genes in the GSE118916 dataset yielded an AUC greater than 0.9 (Figure 4E,F). Kaplan–Meier survival analysis revealed significant differences in survival outcomes among the expression groups of all candidate key genes (Figure S3).

2.3. Functional Enrichment Analysis of the Candidate Key Genes

To further explore the biological functions and potential mechanisms of the six hub genes, Gene Set Enrichment Analysis (GSEA) was performed using KEGG pathway data (Figure 5A–F). The analysis revealed significant enrichment of cell cycle-related pathways in multiple genes, including CCNB1, CCNB2, and EZH2, suggesting their involvement in regulating cell proliferation. CDKN2A was associated with several cancer-related pathways, including colorectal cancer, endometrial cancer, and pancreatic cancer, while HSD17B1 was enriched in metabolic and signaling pathways, such as galactose metabolism and MAPK signaling. IGFBP6 was primarily linked to gap junctions, glycerolipid metabolism, and mTOR signaling pathways. Notably, the cell cycle, olfactory transduction, and MAPK signaling pathways were recurrently enriched across several genes, indicating their potential role in tumor progression.

2.4. HSD17B1 Was Treated as a Key Gene

Molecular docking analysis between the candidate genes and genistein showed binding energies consistently below −5 kcal/mol, indicating a strong potential for molecular interaction (Figure 6A–F, Table S3). Specifically, genistein formed multiple hydrogen bonds with the amino acid residues of the proteins encoded by the candidate genes, with corresponding binding energies as follows: HSD17B1 (−8.1 kcal/mol), EZH2 (−7.3 kcal/mol), CCNB1 (−6.6 kcal/mol), CCNB2 (−8.0 kcal/mol), CDKN2A (−6.6 kcal/mol), and IGFBP6 (−6.7 kcal/mol). Since HSD17B1 exhibited the lowest docking energy with genistein, it was prioritized for MD simulation. The RMSD values of the HSD17B1-genistein complex were calculated to assess the stability of the complex structure during MD simulation (Figure 7A,B). The results indicated that after 20 ns of simulation, the RMSD values of the complex gradually stabilized, with fluctuations remaining below 0.1 nm, suggesting stable binding between HSD17B1 and genistein. Throughout the simulation, the RMS fluctuation, Rg, and buried SASA values of the HSD17B1-genistein complex converged, indicating structural stabilization. During the middle phase of the simulation, the complex displayed minimal RMSF variations, indicating improved residue-level stability (Figure 7C–E). The complex formed varying numbers of hydrogen bonds, ranging from 0 to 7, and the 2D and 3D Gibbs free energy landscapes revealed a distinct energy minimum along PC1 and PC2, collectively indicating a strong binding affinity (Figure 7F–H).

2.5. HSD17B1 Was Specifically Expressed in Epithelial Cells

Following data filtration, a total of 28,682 cells and 27,643 genes from the GSE264203 dataset were retained for subsequent scRNA-seq analysis (Figure S4A,B). The top 2000 highly variable genes (HVGs) were selected for PCA, and the first 20 principal components (PCs) were used for further clustering analysis (Figure 8A). Thus, 21 clusters were identified and annotated into 8 cell types: mast cells, T cells, neutrophils, macrophages, fibroblasts, epithelial cells, endothelial cells, and B cells (Figure 8B–E and Figure S4C). Figure 8F illustrates the distribution of intergroup DEGs across cell types, with HSD17B1 showing predominant expression in epithelial cells (Figure 8G). Subcellular localization analysis revealed that HSD17B1 was primarily expressed in the cytosol (Figure 8H and Figure S4D). Integration of inferCNV analysis showed that epithelial cells (clusters 7, 13, and 15) exhibited pronounced copy number alterations (CNAs), suggesting potential epithelial–mesenchymal transition (EMT) within the tumor microenvironment (Figure 8I).
Enrichment analysis revealed mostly inverse correlations between the HALLMARK pathways and all cell types (Figure S5). Notably, a divergent positive correlation was observed between KRAS signaling-DN and cell types, with a particularly strong association in fibroblasts (R = 0.77), indicating aberrant activation of pro-proliferative intracellular signaling, potentially linked to oncogene activation. Conversely, the p53 signaling pathway was consistently downregulated across all cell populations, suggesting possible inactivation of tumor suppressor mechanisms.

2.6. HSD17B1 Was Mainly Expressed During the Malignant Transformation of Epithelial Cells

To further explore cellular heterogeneity, epithelial cells were re-clustered into 10 distinct subclusters (Figure 9A). Integration with CytoTRACE scoring revealed that subclusters 1 and 4 were predominantly composed of tumor-derived epithelial cells, while the other subclusters represented normal epithelial populations (Figure 9B–D). Pseudotime trajectory analysis of epithelial cells revealed a stepwise differentiation process comprising five distinct stages, with an initial starting point, two divergent branches, and a transitional phase during mid-differentiation where normal cells begin to undergo malignant transformation (Figure 9E–H). Notably, HSD17B1 expression was upregulated at the final stage of differentiation, indicating its progressive elevation during the malignant transformation of normal cells into tumor cells (Figure 9I; expression of other candidate genes is shown in Figure S6). ST analysis of three samples (GSM7990475, GSM7990477, and GSM7990480) partitioned the samples into 14, 13, and 11 distinct clusters, respectively, and annotated them into 17, 13, and 11 cell types. Commonly identified cell types across the samples included epithelial cells, endothelial cells, macrophages, and fibroblasts (Figure 9J). Expression analysis showed that HSD17B1 was predominantly localized in epithelial cells, consistent with the scRNA-seq results, and its generally elevated expression suggested a potential association with tumor progression.

2.7. Heterogeneity of the Tumor Microenvironment (TME) by ST Analysis

ST sequencing results revealed the distribution characteristics of spatial transcript counts (nCount_Spatial) (Figure S7A). The spatial heatmap showed that there were significant regional differences in transcript counts within the tissues, and regions with high transcript counts were mainly concentrated in tumor cell-enriched areas. Meanwhile, the violin plot further verified the differences in overall transcript counts among different tissue samples and the grouped violin plot revealed that uneven distribution of transcript counts within each spatial cluster, indicating high heterogeneity within gastric cancer tissues. In the distribution of each spatial cluster in tissue sections, different spatial clusters exhibited distinct spatial localization characteristics in tissue sections: some clusters were mainly distributed in the tumor margin area, while others were localized in the tumor core area or stromal area (Figure S7B–D). This spatial distribution pattern indicated that tumor tissues are composed of multiple cell populations, and these cell populations maintain relatively independent histological characteristics in spatial structure.
Uniform manifold approximation and projection (UMAP) dimensionality reduction results demonstrated that each GC tissue sample could be divided into multiple cell clusters (Figure S8A). After cell type annotation, it was found that these cell clusters mainly included epithelial cells, fibroblasts, immune cells (e.g., B cells, T cells, and macrophages), with a small number of endothelial cells. Further cluster annotation results revealed that different cell types had a certain consistency among samples, but also showed inter-individual differences. Additionally, pseudotime analysis presented the dynamic distribution characteristics of different cell clusters in the trajectory graph (Figure S8B–D). The trajectory structure showed that epithelial cells were located at the starting position of the main differentiation path during tumor progression, while fibroblasts and immune cells aggregated in the later stage of the trajectory, suggesting that these two types of cells may play key roles in the formation of the TME and immune regulation. In addition, the three samples showed significant differences in trajectory structure and cell distribution pattern, further confirming the high heterogeneity of TME.

3. Discussion

Recent molecular studies have demonstrated that genistein, a soy isoflavone, affects signaling pathways associated with GA. It modulates inflammatory markers like NF-κB and cell cycle regulators such as Cyclin D and p53, while its tumor-suppressive effects are mediated through molecular markers like PCDH17 and SOX2 [13,23]. Cell-based studies have identified genistein-interacting molecules with potential anti-cancer properties [24], and bioinformatics approaches have been employed to further explore genistein’s molecular mechanisms in combating GA. In our integrative analysis, six candidate key genes (HSD17B1, CCNB1, CCNB2, EZH2, CDKN2A, and IGFBP6) were identified as potential targets for genistein’s therapeutic effect in GA.
Both CCNB1 and CCNB2, members of the cyclin B family, encode Cyclin B1 and Cyclin B2, which interact with CDK1 to regulate the G2/M transition, acting as essential drivers of mitotic entry [25,26]. Activation of upstream pathways such as PI3K/Akt enhances cell cycle progression by increasing Cyclin B expression or activity [27]. Overexpression of these cyclins has been consistently associated with heightened tumor proliferation, increased malignancy, and poor prognosis in GC [28]. Genistein inhibits mitosis by inducing cell cycle arrest and downregulating CCNB1 expression, further supporting its role as a therapeutic target [14,29].
EZH2, the catalytic subunit of Polycomb Repressive Complex 2 (PRC2), is a histone methyltransferase that catalyzes H3K27 trimethylation and transcriptional repression. As an oncogenic driver in various cancers, EZH2 promotes tumorigenesis and sustains cancer stem cell maintenance [30]. It has been recognized as a novel therapeutic target in GA [31]. Notably, cancer-derived exosomes facilitate glycolysis, proliferation, and metastasis through the miR-198/EZH2 axis [32]. In oral squamous cell carcinoma, genistein selectively induces tumor cell apoptosis by suppressing the PI3K/AKT–EZH2 signaling pathway, underscoring its potential as a targeted therapeutic agent [33].
CDKN2A (cyclin-dependent kinase inhibitor 2A) is a tumor suppressor gene that induces cell cycle arrest at the G1/S and G2/M checkpoints [34,35]. However, this function appears inconsistent with our findings, which may be attributed to mutations that result in functionally defective CDKN2A [36]. Deng C et al. demonstrated that CDKN2A mutations (e.g., deletions, point mutations) in GA suppress interferon-α/γ responses, inflammatory reactions, and other immune-related pathways, impairing anti-tumor immunity and correlating with shorter patient survival [36]. This suggests that the paradoxical association between high CDKN2A expression and poor prognosis in our study may result from the predominance of mutant CDKN2A in highly expressed cases (e.g., loss of p16-mediated cell cycle inhibitory function). Here, high expression merely represents a concomitant phenomenon of mutation, whereas the mutation itself is the key driver of disease progression and impairment of its tumor-suppressive function. Additionally, a study in renal carcinoma demonstrated that genistein suppresses tumor cell proliferation by enhancing CDKN2A expression through promoter hypomethylation [37], suggesting an epigenetic mechanism that may underlie CDKN2A activation in certain cancers.
IGFBP6, a member of the insulin-like growth factor binding protein family, activates the MAPK signaling pathway and enhances cell migration [38]. IGFBP6, a specific inhibitor of insulin-like growth factor II (IGF-II), exhibits significantly stronger binding capacity to IGF-II than to IGF-I and can effectively inhibit the proliferation of malignant tumors with high IGF-II expression [39]. Meanwhile, similar to other members of the IGFBP family, IGFBP6 can also regulate the activity of cancer cells directly without relying on the IGF signaling pathway. In vitro studies [39,40] have confirmed that this protein can block the angiogenesis response mediated by vascular endothelial growth factor. In conclusion, IGFBP6 exhibits dual functional characteristics in cancer regulation, and its specific mechanism of action remains to be further explored. While no previous evidence supports a direct interaction between IGFBP6 and genistein, our findings provide novel insights by identifying IGFBP6 as a potential target of genistein.
HSD17B1 (17beta-hydroxysteroid dehydrogenase type 1), identified as a key gene in this study, is central to our discussion. Belonging to the short-chain dehydrogenase/reductase (SDR) family, HSD17B1 plays a pivotal role in steroid metabolism [41]. It significantly enhances androgenic effects, although its catalytic efficiency for converting androstenedione to testosterone is lower than its conversion of estrone (E1) to estradiol (E2) in vitro [42,43]. Notably, transgenic overexpression of human HSD17B1 in mice results in a masculinized phenotype in females, highlighting its substantial impact on androgen-related pathways in vivo [44]. Given the higher incidence of GC in males than females, it is plausible that androgen-related risk factors may contribute to this sex disparity in disease prevalence [45]. Additionally, high HSD17B1 expression has been reported to correlate with poor prognosis in patients with GC [46], which aligns with our findings. At the single-cell level, HSD17B1 was predominantly expressed in malignant epithelial clusters, and ST revealed its localization in tumor-enriched regions, providing robust spatial and cellular context to its pathogenic role. Collectively, these multi-omics results validate HSD17B1 as a genistein-responsive gene and reinforce its biological relevance in GC progression, supporting its potential as a therapeutic target.
GSEA revealed that multiple candidate key genes were significantly enriched in pathways related to the cell cycle, olfactory transduction, and MAPK signaling. Dysregulation of the cell cycle is a hallmark of cancer initiation and progression [47]. Cyclins and cyclin-dependent kinases (CDKs), key regulators of cell cycle transitions, have emerged as promising therapeutic targets across various malignancies [48]. In GA, several cell cycle-related genes have demonstrated prognostic significance. Specifically, overexpression of ESPL1 and MCM5 has been significantly correlated with advanced tumor stages and disease progression, underscoring their potential utility as biomarkers for predicting clinical outcomes [49]. Though traditionally linked to sensory perception, olfactory transduction is increasingly implicated in tumor biology. This is through ectopic expression of olfactory receptors (ORs), GPCR-family proteins that activate intracellular cascades (e.g., cAMP, calcium flux, cytoskeletal remodeling), promoting cancer cell migration and invasion [50]. Notably, the functional role of ORs in tumors is not limited to pro-tumorigenic effects: activation of ORs within tumor cells holds the potential to substantially reduce tumor proliferation or even bring tumor cell growth to a complete halt [50]. Previous research has suggested that GC may be associated with the ectopic expression of ORs, which are aberrantly activated in non-olfactory tissues and participate in tumor-related signaling pathways [51]. Mechanistically, key molecules governing olfactory signal transduction have been identified as potential modulators of this OR-mediated tumor regulatory pathway. Specifically, genistein exerts its effects by altering the phosphorylation status of target proteins—a process indispensable for the proper functioning of olfactory signal transduction, which may in turn modulate OR-dependent regulatory effects on tumor biology. The MAPK signaling pathway, recurrently enriched in several genes, regulates diverse cellular processes such as growth, differentiation, and apoptosis, and is frequently dysregulated in GC [52]. These findings suggest that the candidate genes may contribute to tumor progression through the cell cycle, olfactory transduction, and MAPK signaling pathways, thereby elucidating the mechanism of genistein in treating GA.
The results from scRNA-seq and ST analyses identified epithelial cells as a key cell type in GA. Pseudotime analysis revealed a progressive upregulation of HSD17B1 expression specifically during the transition of epithelial cells from normal phenotypes to malignant tumor cells. Epithelial cells perform a variety of vital physiological functions in the human body, acting as a physical barrier to protect underlying tissues from external insults while also participating actively in processes such as absorption and secretion. By forming tight junctions, epithelial cells regulate selective permeability across tissue interfaces, thereby maintaining internal homeostasis [53]. Additionally, epithelial cells contribute to innate immune responses by detecting and responding to microbial invasions [54]. In GA, these critical functions are disrupted. The malignancy is driven by transformed epithelial cells that exhibit uncontrolled proliferation and metastatic potential, largely facilitated by EMT [55,56]. Key molecular drivers of EMT include non-coding RNAs such as the long non-coding RNA CHRF and microRNA miR-665 [57,58]. The stepwise upregulation of HSD17B1 during epithelial transformation suggests its potential involvement in GA pathogenesis, possibly through disruption of epithelial integrity or modulation of EMT pathways. Epithelial cells have been demonstrated to detect signals derived from microorganisms, allergens, and tissue surface damage, and subsequently relay these signals to immune cells [59]. Given the pivotal role of epithelial cells in GA, we further utilized ST analysis to investigate the heterogeneity within the GA immune microenvironment. Our findings indicate that GA tumor tissues consist of diverse cell populations and exhibit distinct histological features with well-defined spatial organization. Pseudotime analysis further underscores the high degree of heterogeneity present in the TME of GA. These findings further highlight the importance of in-depth exploration of the TME in clarifying the development mechanism of GA.
This study synergistically integrated network pharmacology and bioinformatics approaches to systematically characterize the therapeutic targets and multifaceted anti-tumor mechanisms of genistein in GA. scRNA-seq revealed unprecedented cellular heterogeneity and dynamic state transitions within tumor ecosystems, while ST mapping accurately localized molecular expression patterns within their native tissue architecture. The network pharmacology framework further constructed a comprehensive “drug-target-pathway-disease” interaction network, collectively enabling a systematic dissection of tumor pathogenesis and treatment at both single-cell resolution and spatially resolved dimensions.
Notably, this study primarily focuses on descriptive correlation analyses and has limitations in several critical areas: the cohort sizes of the scRNA-seq and ST datasets are relatively limited. This limitation may lead to insufficient representativeness of the research results and introduce potential sampling bias, thereby affecting the generalizability of the conclusions to a certain extent. Therefore, in future research, we plan to further expand the size of the sample cohort and enhance the reliability and extrapolation value of the research results through validation in a larger scope. The mechanistic interpretation of the observed molecular changes is preliminary; causal relationships have not been experimentally validated through targeted functional studies, and clinical translatability requires more robust evidence from patient-derived models. Additionally, incomplete characterization of tumor microenvironmental crosstalk and potential sampling biases may affect the generalizability of the data. Current research has not verified HSD17B1 as a genistein target via in vitro or in vivo functional experiments, and this missing verification leaves the drug-target causal relationship without direct experimental support. Although the association between HSD17B1 and malignant epithelial transformation has been fully supported by scRNA-seq and ST analyses, its relevance to therapeutic translation still requires verification through functional experiments. To address these gaps, future investigations should prioritize mechanistic validation, preclinical testing, and multicenter clinical studies to evaluate therapeutic efficacy across diverse populations. These efforts will significantly enhance the translational potential of our findings for precision oncology applications.

4. Materials and Methods

4.1. Data Collection

The scRNA-seq dataset GSE264203 (platform: GPL24676) contains malignant epithelial cells from 5 patients with GA [60], while 3 GA samples (GSM7990475, GSM7990477, and GSM7990480) from GSE251950 (platform: GPL24676) were selected for ST analysis [61]. The bulk RNA sequencing datasets GSE29998 (platform: GPL6947) [62] consists of 50 tumor samples and 49 controls, and GSE118916 (platform: GPL15207) [63] includes 15 pairs of gastric tumor and adjacent non-tumor (normal) tissues. All datasets were sourced from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/ (accessed on 11 October 2025)).

4.2. Acquisition of Drug Target Genes

Human protein targets for genistein were identified by querying the ChEMBL (https://www.ebi.ac.uk/chembl/ (accessed on 11 October 2025)), STITCH (http://stitch.embl.de/ (accessed on 11 October 2025)), and SwissTargetPrediction (http://www.swisstargetprediction.ch/ (accessed on 11 October 2025)) databases using its PubChem-derived chemical structure and SMILES notation (https://pubchem.ncbi.nlm.nih.gov/ (accessed on 11 October 2025)). The resulting target lists were merged and duplicates removed to identify drug target genes.

4.3. Identification of GA Target Genes

Disease-associated targets for GA were retrieved from established databases, including Online Mendelian Inheritance in Man (OMIM, https://omim.org/ (accessed on 11 October 2025)) and Genecards (https://www.genecards.org/ (accessed on 11 October 2025)), with a relevance score > 7. Duplicate entries were then removed to identify GA-related target genes. Principal component analysis (PCA) was performed on the GSE29998 dataset to assess the separation between sample groups. DEGs between GA and control samples in the GSE29998 dataset were identified using the limma (v3.64.1) package (|log2FC| > 1 and adjust. p < 0.05) [64].

4.4. Acquisition and Functional Analysis of Candidate Genes

To identify potential candidate genes related to GA treatment by genistein, the intersection of drug target genes, GA target genes, and DEGs was analyzed. A box plot was generated to visualize the expression patterns of these candidate genes across different experimental groups in the GSE29998 dataset. Chromosomal localization of candidate genes was determined using the circlize (v0.4.16) package [65]. Pearson correlation coefficients between these genes were calculated to assess their interrelationships. A PPI network was constructed using the GeneMANIA database (http://genemania.org/ (accessed on 11 October 2025)) to explore potential protein-level interactions among the candidate genes. To further elucidate the molecular functions and mechanisms of these genes, extensive functional enrichment analyses were conducted using the clusterProfiler (v4.16.0) package, incorporating GO and KEGG pathway annotations [66] (p < 0.05). The aPEAR (v1.0) package was employed to cluster the annotation results, facilitating the identification of underlying biological themes [67].

4.5. Machine Learning

To facilitate machine learning, the GSE29998 dataset was randomly divided into two subsets using a 7:3 split ratio, with the larger portion allocated for model training and the smaller portion reserved for validation. A range of machine learning algorithms, including Random Forest (RF), Extreme Gradient Boosting (XGBoost), Generalized Linear Model (GLM), Neural Network (NNET), Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), Gradient Boosting Machine (GBM), NB, Adaptive Boosting (ADA), Least Absolute Shrinkage and Selection Operator (LASSO), Bootstrap Aggregating (BAG), and Decision Tree (DT), was implemented based on candidate genes using the caret (v7.0.1) package [68]. All models adopted 5-fold repeated cross-validation (train Control (method = “repeatedcv”, number = 5)) to evaluate the robustness and generalization ability of the algorithms. The model training parameters were set to default values, and the binary classification threshold used the default 0.5 cutoff for probability prediction. The positive class was defined as the control group, which served as the reference for all performance evaluations and class discrimination. For model performance evaluation, a unified framework for residual assessment and variable importance interpretation was constructed using the DALEX package (v2.0) [69]. Meanwhile, the performance of each model was evaluated through ROC curves, residual box plots, reverse cumulative distribution of residual graphs, and confusion matrices. The model with the highest AUC value, minimal residuals, and highest accuracy was selected as the optimal model, with the genes identified by this model considered as candidate key genes. Additionally, ROC curves for each candidate key gene were constructed in the GSE29998 dataset to assess their diagnostic efficacy. The GSE118916 dataset served as an external validation cohort to construct ROC curves for the prediction model based on candidate key genes, further assessing their diagnostic performance. Subsequently, samples from all datasets on the website (http://kmplot.com/analysis/index.php?p=service&cancer=gastric (accessed on 11 October 2025)), except the GSE62254 dataset, were stratified into high- and low-expression groups based on the optimal cutoff expression levels for each candidate key gene, and survival differences between these groups were analyzed on the website [70].

4.6. GSEA Enrichment Analysis

GSEA was performed to explore the potential biological pathways associated with each candidate key gene. GSEA was conducted using the “clusterProfiler” R package (v4.16.0) [66] and the org.Hs.eg.db gene set. Enrichment scores were computed with 1000 permutations, and pathways with nominal p < 0.05 and |NES| > 1 were considered significantly enriched. The top five enriched KEGG pathways for each gene were visualized to highlight their biological relevance.

4.7. Molecular Docking

Molecular docking simulations between genistein and candidate key gene-encoded proteins were conducted using the online tool CB-dock2 (https://cadd.labshare.cn/cb-dock2/php/blinddock.php (accessed on 11 October 2025)) [71]. Tertiary protein structures were retrieved from the PDB database (https://www.rcsb.org (accessed on 11 October 2025)) or AlphaFold2 (https://alphafold.ebi.ac.uk (accessed on 11 October 2025)) in PDB format, while genistein’s structure was obtained in SDF format from PubChem (https://pubchem.ncbi.nlm.nih.gov/ (accessed on 11 October 2025)). Proteins were preprocessed by adding non-polar hydrogens, assigning Gasteiger charges, and converting them into PDBQT format. Ligand structures were converted from SDF to PDB, followed by transformation into PDBQT files with defined rotatable bonds. Binding affinities were evaluated based on the calculated binding free energy (ΔG ≤ −5.0 kcal/mol). Based on these findings, the gene exhibiting the lowest binding energy to genistein was identified as the key gene.

4.8. MD Simulation

MD simulations were performed in GROMACS 2020 using the docked complexes of the key gene and genistein [72]. The protein was constructed using the CHARMM36 force field, and ligand parameters were generated with the Sobtop tool based on the GAFF force field. Atomic types, bond angle parameters, and charge distributions were checked to confirm compatibility with GROMACS. In the simulation, the TIP3P water model was used, and Na+/Cl ions were added to neutralize charges and simulate physiological conditions. Simulations were carried out in an NPT ensemble with a 2-fs time step, applying LINCS hydrogen bond constraints, Particle Mesh Ewald (PME) electrostatics (1.2 nm cutoff), a 1.0 nm van der Waals cutoff, and neighbor list updates every 10 steps. Temperature (300 K) and pressure (1 bar) were controlled using the V-rescale thermostat and Berendsen barostat. The system underwent energy minimization using the steepest descent, with convergence achieved when the maximum atomic force fell below 1000 kJ/mol/nm. This was followed by stepwise equilibration: 100 ps in the NVT ensemble and 100 ps in the NPT ensemble at 300 K. During both the NVT and NPT equilibration stages, convergence was determined by monitoring the system’s energy, temperature, and pressure. The equilibration was considered complete and transitioned to the next stage only when the curves of these parameters stabilized, showing no systematic drift and remaining within a narrow random fluctuation range. Production simulations were performed for 100 ns (NPT), with trajectory snapshots saved every 10 ps. Trajectory analysis was conducted using VMD 1.9.4 and PyMOL 3.1, and structural evolution was assessed through root mean square deviation (RMSD), radius of gyration (Rg), root mean square fluctuation (RMSF), buried solvent accessible surface area (SASA), and hydrogen bond dynamics.

4.9. scRNA-Seq Data Processing

Data filtering was conducted using the Seurat (v5.3.0) package on the GSE264203 dataset [73]. Quality control filtering excluded cells with the following characteristics: detected genes < 200 or >7500, UMI counts < 600, mitochondrial gene percentage > 10%, or hemoglobin gene percentage > 0.1%. Following logarithmic normalization, the vst function was used to identify the top 2000 HVGs showing maximal inter-cellular variation. PCA was performed on these HVGs, with significant PCs selected through integrated assessment of elbow plot inflection points and JackStraw permutation p-values (p < 0.05). Using these components, cell clustering was generated via the FindNeighbors and FindClusters functions at resolution = 0.5, followed by t-SNE dimensionality reduction. Cluster identities were annotated by cross-referencing differentially expressed marker genes with established literature signatures [60]. The expression patterns of key genes across different cell types were systematically analyzed. Additionally, a systematic analysis of the subcellular localization of key genes was performed using the Genecards database. Tumor cells were distinguished from non-neoplastic cells using InferCNV (v1.20.0) [74], with immune cells from the same sample serving as the reference group. Cells designated as “Malignant” by the HMM model and exhibiting large-scale chromosomal aberrations were classified as malignant. The quantification of cellular HSD17B1 expression levels in each cell type was conducted through AUCell scoring, followed by stratification of cells into high- and low-expression subgroups using median AUCell scores as the threshold. Pathway enrichment analysis was then performed using single-sample GSEA (ssGSEA), systematically interrogating the Hallmark gene sets curated in MSigDB (https://www.gsea-msigdb.org/gsea/msigdb (accessed on 11 October 2025)). Correlations between cell types and pathways were determined, followed by visualization of pathways showing significant enrichment.

4.10. Secondary Clustering and Pseudotime Trajectory Analysis

Cell types exhibiting distinct expression patterns of key genes were selected for secondary clustering analysis. The CytoTRACE algorithm was then employed to compute stemness scores for cells, which were subsequently used to classify cells into tumor and normal categories.
Additionally, pseudotime trajectory analysis was conducted using the monocle (v2.36.0) package to further investigate the functional dynamics and developmental processes of these cell types or states during disease progression, as well as the expression patterns of key genes [75].

4.11. ST Analysis

ST analysis in GSE251950 was performed using the Seurat package (v4.3.0). For each sample, the gene-spot matrix was normalized with SCTransform to adjust for technical variability and stabilize feature variances. HVGs were identified using the FindVariableFeatures function, followed by data scaling with the ScaleData function. PCA was performed using the RunPCA function. Clustering was conducted via the Louvain community detection algorithm on a shared nearest-neighbor graph of the first 30 PCs constructed by the FindNeighbors and FindClusters functions (resolution parameter = 0.8). Cluster-specific marker genes were identified using the FindAllMarkers function (min.pct = 0.1, logfc.threshold = 1). Automated cell type annotation was carried out by integrating Seurat-derived cluster marker genes with reference-based predictions from SingleR (v2.10.0) [76]. The spatial distributions of clusters and key gene expression patterns were visualized using the SpatialFeaturePlot function.

4.12. TME Analysis in ST

10x Genomics ST data were imported using the Read10X_h5 function, while spatial images and positional information were loaded via the Read10X_Image function. Subsequently, a Seurat object was created, and spatial images were attached to this object to complete data preprocessing. Data normalization was performed using SCTransform, followed by dimensionality reduction via PCA. Neighbor search and cluster analysis were then conducted, and the results were visualized using UMAP.
DEGs were identified using the FindAllMarkers function and filtered based on log fold change (logFC) and adjusted p-values. Spatial feature plots and violin plots were generated to visualize the spatial distribution of distinct marker genes and their expression levels, respectively. Additionally, spatially variable gene analysis was performed to explore genes with expression patterns that vary across spatial locations. Automatic cell type annotation was implemented using the SingleR package combined with reference datasets from celldex. Each cell was assigned a corresponding cell type, and the annotation results were visualized on both UMAP plots and spatial distribution maps to confirm the spatial localization of different cell types.
Spatial trajectory construction was conducted using the monocle3 package, following the steps below: (1) The Seurat object was converted to a cell_data_set (CDS) object (the core data structure of monocle3) using as.cell_data_set, and Seurat-derived cluster information was added to the colData of the CDS object. (2) Cells were clustered using the cluster_cells function, and the cell trajectory graph was acquired via the learn_graph function. (3) Pseudotime values for each cell were calculated using the order_cells function. (4) The computed pseudotime information was added back to the original Seurat object, and spatial visualization plots were generated to display the spatial distribution of pseudotime.

4.13. Statistical Analysis

Statistical analyses were performed in R (v4.5.0) and Python (v3.8.10). Comparisons of continuous variables between groups were made using Wilcoxon tests, with statistical significance set at p < 0.05.

5. Conclusions

In conclusion, this study systematically identified therapeutic targets of genistein in GA through an integrated approach combining network pharmacology, transcriptomic analysis, molecular docking, and MD simulations. HSD17B1 was identified as a potential core target, with its specific expression pattern observed in epithelial cell subsets, highlighting its functional relevance in tumor progression. The integration of scRNA-seq and ST data provided a comprehensive cellular and spatial map of genistein’s potential action landscape in GA, contributing novel insights into the study of drug mechanisms in this context.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms262110369/s1.

Author Contributions

Conceptualization, X.W.; methodology, X.W.; software, X.W. and J.Z.; validation, X.W.; formal analysis, J.Z.; investigation, J.J.; resources, Y.W.; data curation, X.W. and J.Z.; writing—original draft preparation, X.W., J.Z. and J.J.; writing—review and editing, X.W., J.Z. and J.J.; visualization, X.W.; supervision, Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The scRNA-seq dataset GSE264203, GSE251950 and the bulk RNA sequencing datasets GSE29998, GSE118916 were sourced from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/ (accessed on 11 October 2025)).

Acknowledgments

We would like to sincerely thank the authors for their scientific contributions. All individuals mentioned have provided their consent to be acknowledged.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

GCGastric cancer
GAGastric adenocarcinoma
scRNA-seqSingle-cell RNA sequencing
STSpatial transcriptomics
RNA-seqRNA sequencing
MDMolecular dynamics
GEOGene Expression Omnibus
DEGsDifferentially expressed genes
PCAPrincipal component analysis
GOGene Ontology
KEGGKyoto Encyclopedia of Genes and Genomes
PPIProtein–protein interaction
MLMachine learning
RFRandom Forest
XGBoostExtreme Gradient Boosting
GLMGeneralized Linear Model
NNETNeural Network
SVMSupport Vector Machine
k-NNk-Nearest Neighbors
GBMGradient Boosting Machine
NBNaive Bayes
ADAAdaptive Boosting
LASSOLeast Absolute Shrinkage and Selection Operator
BAGBootstrap Aggregating
DTDecision Tree
ROCReceiver operating characteristic
AUCArea under the curve
GSEAGene set enrichment analysis
NESNormalized enrichment score
PDBProtein Data Bank
ΔGBinding free energy
RMSDRoot mean square deviation
RgRadius of gyration
RMSFRoot mean square fluctuation
SASASolvent accessible surface area
HVGsHighly variable genes
PCsPrincipal components
t-SNEt-distributed stochastic neighbor embedding
EMTEpithelial–mesenchymal transition
ssGSEASingle-sample GSEA
ORsOlfactory receptors
GPCRG protein-coupled receptor
CDKsCyclin-dependent kinases
CNAsCopy number alterations
HMMHidden Markov model

References

  1. Bray, F.; Laversanne, M.; Sung, H.; Ferlay, J.; Siegel, R.L.; Soerjomataram, I.; Jemal, A. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2024, 74, 229–263. [Google Scholar] [CrossRef]
  2. Choi, K.W.; Joo, M.; Kim, H.S.; Lee, W.Y. Synchronous triple occurrence of MALT lymphoma, schwannoma, and adenocarcinoma of the stomach. World J. Gastroenterol. 2017, 23, 4127–4131. [Google Scholar] [CrossRef]
  3. Xie, S.H.; Chen, H.; Lagergren, J. Causes of death in patients diagnosed with gastric adenocarcinoma in Sweden, 1970–2014: A population-based study. Cancer Sci. 2020, 111, 2451–2459. [Google Scholar] [CrossRef] [PubMed]
  4. Ajani, J.A.; Lee, J.; Sano, T.; Janjigian, Y.Y.; Fan, D.; Song, S. Gastric adenocarcinoma. Nat. Rev. Dis. Primers 2017, 3, 17036. [Google Scholar] [CrossRef] [PubMed]
  5. Zulkifli, M.F.; Eshak, Z.; Mokhtar, M.H.; Ismail, W.I.W. Labisia pumila var. alata Extract Induces Apoptosis Cell Death by Inhibiting the Activity of Oestrogen Receptors in MCF-7 Breast Cancer Cells. Int. J. Mol. Sci. 2025, 26, 3748. [Google Scholar]
  6. Zhang, Y.; Huang, T.; Wang, J.; Wang, G.; Luo, F. Roles of gender and smoking in the associations between urinary phytoestrogens and asthma/wheeze and lung function: Evidence from a cross-sectional study. BMJ Open Respir. Res. 2024, 11, e001708. [Google Scholar] [CrossRef] [PubMed]
  7. Sharifi-Rad, J.; Quispe, C.; Imran, M.; Rauf, A.; Nadeem, M.; Gondal, T.A.; Ahmad, B.; Atif, M.; Mubarak, M.S.; Sytar, O.; et al. Genistein: An Integrative Overview of Its Mode of Action, Pharmacological Properties, and Health Benefits. Oxid. Med. Cell. Longev. 2021, 2021, 3268136. [Google Scholar] [CrossRef] [PubMed]
  8. Xue, Y.; Bi, F.; Zhang, X.; Zhang, S.; Pan, Y.; Liu, N.; Shi, Y.; Yao, X.; Zheng, Y.; Fan, D. Role of Rac1 and Cdc42 in hypoxia induced p53 and von Hippel-Lindau suppression and HIF1alpha activation. Int. J. Cancer 2006, 118, 2965–2972. [Google Scholar] [CrossRef]
  9. Huang, W.; Wan, C.; Luo, Q.; Huang, Z.; Luo, Q. Genistein-inhibited cancer stem cell-like properties and reduced chemoresistance of gastric cancer. Int. J. Mol. Sci. 2014, 15, 3432–3443. [Google Scholar] [CrossRef]
  10. Cao, X.; Ren, K.; Song, Z.; Li, D.; Quan, M.; Zheng, Y.; Cao, J.; Zeng, W.; Zou, H. 7-Difluoromethoxyl-5,4′-di-n-octyl genistein inhibits the stem-like characteristics of gastric cancer stem-like cells and reverses the phenotype of epithelial-mesenchymal transition in gastric cancer cells. Oncol. Rep. 2016, 36, 1157–1165. [Google Scholar] [CrossRef]
  11. Jin, C.-Y.; Park, C.; Cheong, J.; Choi, B.T.; Lee, T.H.; Lee, J.-D.; Lee, W.H.; Kim, G.-Y.; Ryu, C.H.; Choi, Y.H. Genistein sensitizes TRAIL-resistant human gastric adenocarcinoma AGS cells through activation of caspase-3. Cancer Lett. 2007, 257, 56–64. [Google Scholar] [CrossRef]
  12. Wang, Z.; Gong, J.; Zhu, M.; Tang, W.; Cui, H. In vitro study on the inhibition of human gastric cancer BGC-823 cells by Genistein and 5-fluorouracil alone or in combination. J. Chongqing Med. Univ. 2012, 37, 1053–1058. [Google Scholar]
  13. Li, Y.-S.; Wu, L.-P.; Li, K.-B.; Liu, Y.-P.; Xiang, R.; Zhang, S.-B.; Zhu, L.-Y.; Zhang, L.-Y. Involvement of nuclear factor κB (NF-κB) in the downregulation of cyclooxygenase-2 (COX-2) by genistein in gastric cancer cells. J. Int. Med. Res. 2011, 39, 2141–2150. [Google Scholar] [CrossRef]
  14. Song, D.; Liu, Y.; Wang, X.; Yang, Y. Inhibitory effects of genistein on the synthesis of DNA and the protein expression of cyclin D1 in human gastric carcinoma cell-line. Wei Sheng Yan Jiu 2002, 31, 106–108. [Google Scholar] [PubMed]
  15. Zhou, H.B.; Chen, J.M.; Cai, J.T.; Du, Q.; Wu, C.N. Anticancer activity of genistein on implanted tumor of human SG7901 cells in nude mice. World J. Gastroenterol. 2008, 14, 627–631. [Google Scholar] [CrossRef] [PubMed]
  16. Baysoy, A.; Bai, Z.; Satija, R.; Fan, R. The technological landscape and applications of single-cell multi-omics. Nat. Rev. Mol. Cell Biol. 2023, 24, 695–713. [Google Scholar] [CrossRef] [PubMed]
  17. Zhang, M.; Hu, S.; Min, M.; Ni, Y.; Lu, Z.; Sun, X.; Wu, J.; Liu, B.; Ying, X.; Liu, Y. Dissecting transcriptional heterogeneity in primary gastric adenocarcinoma by single cell RNA sequencing. Gut 2021, 70, 464–475. [Google Scholar] [CrossRef]
  18. Wang, R.; Dang, M.; Harada, K.; Han, G.; Wang, F.; Pizzi, M.P.; Zhao, M.; Tatlonghari, G.; Zhang, S.; Hao, D.; et al. Single-cell dissection of intratumoral heterogeneity and lineage diversity in metastatic gastric adenocarcinoma. Nat. Med. 2021, 27, 141–151. [Google Scholar] [CrossRef]
  19. Gulati, G.S.; D’sIlva, J.P.; Liu, Y.; Wang, L.; Newman, A.M. Profiling cell identity and tissue architecture with single-cell and spatial transcriptomics. Nat. Rev. Mol. Cell Biol. 2025, 26, 11–31. [Google Scholar] [CrossRef]
  20. Longo, S.K.; Guo, M.G.; Ji, A.L.; Khavari, P.A. Integrating single-cell and spatial transcriptomics to elucidate intercellular tissue dynamics. Nat. Rev. Genet. 2021, 22, 627–644. [Google Scholar] [CrossRef]
  21. Chen, M.-M.; Gao, Q.; Ning, H.; Chen, K.; Gao, Y.; Yu, M.; Liu, C.-Q.; Zhou, W.; Pan, J.; Wei, L.; et al. Integrated single-cell and spatial transcriptomics uncover distinct cellular subtypes involved in neural invasion in pancreatic cancer. Cancer Cell 2025, 43, 1656–1676.e10. [Google Scholar] [CrossRef]
  22. Ma, H.; Srivastava, S.; Ho, S.W.T.; Xu, C.; Lian, B.S.X.; Ong, X.; Tay, S.T.; Sheng, T.; Lum, H.Y.J.; Ghani, S.A.B.A.; et al. Spatially Resolved Tumor Ecosystems and Cell States in Gastric Adenocarcinoma Progression and Evolution. Cancer Discov. 2025, 15, 767–792. [Google Scholar] [CrossRef]
  23. Tan, Z.; Pan, K.; Sun, M.; Pan, X.; Yang, Z.; Chang, Z.; Yang, X.; Zhu, J.; Zhan, L.; Liu, Y.; et al. CCKBR+ cancer cells contribute to the intratumor heterogeneity of gastric cancer and confer sensitivity to FOXO inhibition. Cell Death Differ. 2024, 31, 1302–1317. [Google Scholar] [CrossRef] [PubMed]
  24. Lee, S.H.; Lee, D.; Choi, J.; Oh, H.J.; Ham, I.-H.; Ryu, D.; Lee, S.-Y.; Han, D.-J.; Kim, S.; Moon, Y.; et al. Spatial dissection of tumour microenvironments in gastric cancers reveals the immunosuppressive crosstalk between CCL2+ fibroblasts and STAT3-activated macrophages. Gut 2025, 74, 714–727. [Google Scholar] [CrossRef] [PubMed]
  25. Holbrook, J.D.; Parker, J.S.; Gallagher, K.T.; Halsey, W.S.; Hughes, A.M.; Weigman, V.J.; Lebowitz, P.F.; Kumar, R. Deep sequencing of gastric carcinoma reveals somatic mutations relevant to personalized medicine. J. Transl. Med. 2011, 9, 119. [Google Scholar] [CrossRef]
  26. Li, L.; Zhu, Z.; Zhao, Y.; Zhang, Q.; Wu, X.; Miao, B.; Cao, J.; Fei, S. FN1, SPARC, and SERPINE1 are highly expressed and significantly related to a poor prognosis of gastric adenocarcinoma revealed by microarray and bioinformatics. Sci. Rep. 2019, 9, 7827. [Google Scholar] [CrossRef] [PubMed]
  27. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [Google Scholar] [CrossRef]
  28. Gu, Z.; Gu, L.; Eils, R.; Schlesner, M.; Brors, B. circlize Implements and enhances circular visualization in R. Bioinformatics 2014, 30, 2811–2812. [Google Scholar] [CrossRef]
  29. Yu, G.; Wang, L.G.; Han, Y.; He, Q.Y. clusterProfiler: An R package for comparing biological themes among gene clusters. Omics 2012, 16, 284–287. [Google Scholar] [CrossRef]
  30. Kerseviciute, I.; Gordevicius, J. aPEAR: An R package for autonomous visualization of pathway enrichment networks. Bioinformatics 2023, 39, btad672. [Google Scholar] [CrossRef]
  31. Kuhn, M. Building Predictive Models in R Using the caret Package. J. Stat. Softw. 2008, 28, 1–26. [Google Scholar] [CrossRef]
  32. Deng, Z.; Liu, H.; Chen, F.; Liu, Q.; Wang, X.; Wang, C.; Lyu, C.; Li, J.; Li, T. Developing an Interpretable Machine Learning Model for Early Prediction of Cardiovascular Involvement in Systemic Lupus Erythematosus. J. Inflamm. Res. 2025, 18, 8629–8641. [Google Scholar] [CrossRef]
  33. Győrffy, B. Integrated analysis of public datasets for the discovery and validation of survival-associated genes in solid tumors. Innovation 2024, 5, 100625. [Google Scholar] [CrossRef] [PubMed]
  34. Liu, Y.; Yang, X.; Gan, J.; Chen, S.; Xiao, Z.X.; Cao, Y. CB-Dock2: Improved protein-ligand blind docking by integrating cavity detection, docking and homologous template fitting. Nucleic Acids Res. 2022, 50, W159–W164. [Google Scholar] [CrossRef]
  35. Sinelnikova, A.; Spoel, D.V. NMR refinement and peptide folding using the GROMACS software. J. Biomol. NMR 2021, 75, 143–149. [Google Scholar] [CrossRef]
  36. Hao, Y.; Hao, S.; Andersen-Nissen, E.; Mauck, W.M., 3rd; Zheng, S.; Butler, A.; Lee, M.J.; Wilk, A.J.; Darby, C.; Zager, M.; et al. Integrated analysis of multimodal single-cell data. Cell 2021, 184, 3573–3587.e29. [Google Scholar] [CrossRef]
  37. Patel, A.P.; Tirosh, I.; Trombetta, J.J.; Shalek, A.K.; Gillespie, S.M.; Wakimoto, H.; Cahill, D.P.; Nahed, B.V.; Curry, W.T.; Martuza, R.L.; et al. Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma. Science 2014, 344, 1396–1401. [Google Scholar] [CrossRef]
  38. Trapnell, C.; Cacchiarelli, D.; Grimsby, J.; Pokharel, P.; Li, S.; Morse, M.; Lennon, N.J.; Livak, K.J.; Mikkelsen, T.S.; Rinn, J.L. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 2014, 32, 381–386. [Google Scholar] [CrossRef] [PubMed]
  39. Aran, D.; Looney, A.P.; Liu, L.; Wu, E.; Fong, V.; Hsu, A.; Chak, S.; Naikawadi, R.P.; Wolters, P.J.; Abate, A.R.; et al. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat. Immunol. 2019, 20, 163–172. [Google Scholar] [CrossRef]
  40. Gloesenkamp, C.R.; Nitzsche, B.; Ocker, M.; Di Fazio, P.; Quint, K.; Hoffmann, B.; Scherübl, H.; Höpfner, M. AKT inhibition by triciribine alone or as combination therapy for growth control of gastroenteropancreatic neuroendocrine tumors. Int. J. Oncol. 2012, 40, 876–888. [Google Scholar]
  41. Sun, Q. Study on the Interaction Between Soybean Isoflavones and Nucleic Acid G-Quadruplexes. Master’s Thesis, Tianjin University, Tianjin, China, 2007. [Google Scholar]
  42. Dai, P.; Xiong, L.; Wei, Y.; Wei, X.; Zhou, X.; Zhao, J.; Tang, H. A pancancer analysis of the oncogenic role of cyclin B1 (CCNB1) in human tumors. Sci. Rep. 2023, 13, 16226. [Google Scholar] [CrossRef] [PubMed]
  43. Shi, Q.; Wang, W.; Jia, Z.; Chen, P.; Ma, K.; Zhou, C. ISL1, a novel regulator of CCNB1, CCNB2 and c-MYC genes, promotes gastric cancer cell proliferation and tumor growth. Oncotarget 2016, 7, 36489–36500. [Google Scholar] [CrossRef]
  44. Hu, M.; Tao, P.; Wang, Y.; Zhu, C.; Ma, Y.; Liu, X.; Cai, H. Knockdown of CCNB2 inhibits the tumorigenesis of gastric cancer by regulation of the PI3K/Akt pathway. Sci. Rep. 2025, 15, 5703. [Google Scholar] [CrossRef] [PubMed]
  45. Schnittger, A.; De Veylder, L. The Dual Face of Cyclin B1. Trends Plant Sci. 2018, 23, 475–478. [Google Scholar] [CrossRef]
  46. Song, D.; Na, X.; Liu, Y.; Chi, X. Study on mechanisms of human gastric carcinoma cells apoptosis induced by genistein. Wei Sheng Yan Jiu 2003, 32, 128–130. [Google Scholar]
  47. Villanueva, M.T. Anticancer drugs: All roads lead to EZH2 inhibition. Nat. Rev. Drug Discov. 2017, 16, 239. [Google Scholar]
  48. Ramezankhani, R.; Solhi, R.; Es, H.A.; Vosough, M.; Hassan, M. Novel molecular targets in gastric adenocarcinoma. Pharmacol. Ther. 2021, 220, 107714. [Google Scholar] [CrossRef]
  49. Zheng, Y.; Li, P.; Ma, J.; Yang, C.; Dai, S.; Zhao, C. Cancer-derived exosomal circ_0038138 enhances glycolysis, growth, and metastasis of gastric adenocarcinoma via the miR-198/EZH2 axis. Transl. Oncol. 2022, 25, 101479. [Google Scholar] [CrossRef]
  50. Dev, A.; Sardoiwala, M.N.; Kushwaha, A.C.; Karmakar, S.; Choudhury, S.R. Genistein nanoformulation promotes selective apoptosis in oral squamous cell carcinoma through repression of 3PK-EZH2 signalling pathway. Phytomedicine 2021, 80, 153386. [Google Scholar] [CrossRef] [PubMed]
  51. Bian, Y.S.; Osterheld, M.C.; Fontolliet, C.; Bosman, F.T.; Benhattar, J. p16 inactivation by methylation of the CDKN2A promoter occurs early during neoplastic progression in Barrett’s esophagus. Gastroenterology 2002, 122, 1113–1121. [Google Scholar] [CrossRef]
  52. Alhejaily, A.; Day, A.G.; Feilotter, H.E.; Baetz, T.; Lebrun, D.P. Inactivation of the CDKN2A tumor-suppressor gene by deletion or methylation is common at diagnosis in follicular lymphoma and associated with poor clinical outcome. Clin. Cancer Res. 2014, 20, 1676–1686. [Google Scholar] [CrossRef] [PubMed]
  53. Deng, C.; Li, Z.-X.; Xie, C.-J.; Zhang, Q.-L.; Hu, B.-S.; Wang, M.-D.; Mei, J.; Yang, C.; Zhong, Z.; Wang, K.-W. Pan-cancer analysis of CDKN2A alterations identifies a subset of gastric cancer with a cold tumor immune microenvironment. Hum. Genom. 2024, 18, 55. [Google Scholar] [CrossRef]
  54. Xu, J.; Li, N.; Deng, W.; Luo, S. Discovering the mechanism and involvement of the methylation of cyclin-dependent kinase inhibitor 2A (CDKN2A) gene and its special locus region in gastric cancer. Bioengineered 2021, 12, 1286–1298. [Google Scholar] [CrossRef] [PubMed]
  55. Zhong, C.; Shi, K.; Li, P.; Qiu, X.; Wu, X.; Chen, S.; Liu, Y.; Li, F.; Zhao, Z.; Zhou, J.; et al. Single-cell sequencing analysis and bulk-seq identify IGFBP6 and TNFAIP6 as novel differential diagnosis markers for postburn pathological scarring. Burns 2024, 50, 107255. [Google Scholar] [CrossRef]
  56. Lu, H.; Yu, X.; Xu, Z.; Deng, J.; Zhang, M.J.; Zhang, Y.; Sun, S. Prognostic Value of IGFBP6 in Breast Cancer: Focus on Glucometabolism. Technol. Cancer Res. Treat. 2024, 23, 15330338241271998. [Google Scholar] [CrossRef]
  57. Zhang, C.; Lu, L.; Li, Y.; Wang, X.; Zhou, J.; Liu, Y.; Fu, P.; Gallicchio, M.A.; Bach, L.A.; Duan, C. IGF binding protein-6 expression in vascular endothelial cells is induced by hypoxia and plays a negative role in tumor angiogenesis. Int. J. Cancer 2012, 130, 2003–2012. [Google Scholar] [CrossRef] [PubMed]
  58. Järvensivu, P.; Saloniemi-Heinonen, T.; Awosanya, M.; Koskimies, P.; Saarinen, N.; Poutanen, M. HSD17B1 expression enhances estrogen signaling stimulated by the low active estrone, evidenced by an estrogen responsive element-driven reporter gene in vivo. Chem. Biol. Interact. 2015, 234, 126–134. [Google Scholar] [CrossRef]
  59. Poutanen, M.; Miettinen, M.; Vihko, R. Differential estrogen substrate specificities for transiently expressed human placental 17 beta-hydroxysteroid dehydrogenase and an endogenous enzyme expressed in cultured COS-m6 cells. Endocrinology 1993, 133, 2639–2644. [Google Scholar] [CrossRef]
  60. Peltoketo, H.; Luu-The, V.; Simard, J.; Adamski, J. 17beta-hydroxysteroid dehydrogenase (HSD)/17-ketosteroid reductase (KSR) family; nomenclature and main characteristics of the 17HSD/KSR enzymes. J. Mol. Endocrinol. 1999, 23, 1–11. [Google Scholar] [CrossRef]
  61. Saloniemi, T.; Welsh, M.; Lamminen, T.; Saunders, P.; Mäkelä, S.; Streng, T.; Poutanen, M. Human HSD17B1 expression masculinizes transgenic female mice. Mol. Cell. Endocrinol. 2009, 301, 163–168. [Google Scholar] [CrossRef]
  62. Yang, W.-J.; Zhao, H.-P.; Yu, Y.; Wang, J.-H.; Guo, L.; Liu, J.-Y.; Pu, J.; Lv, J. Updates on global epidemiology, risk and prognostic factors of gastric cancer. World J. Gastroenterol. 2023, 29, 2452–2468. [Google Scholar] [CrossRef]
  63. Chang, W.-C.; Huang, S.-F.; Lee, Y.-M.; Lai, H.-C.; Cheng, B.-H.; Cheng, W.-C.; Ho, J.Y.-P.; Jeng, L.-B.; Ma, W.-L. Cholesterol import and steroidogenesis are biosignatures for gastric cancer patient survival. Oncotarget 2017, 8, 692–704. [Google Scholar] [CrossRef]
  64. Kent, L.N.; Leone, G. The broken cycle: E2F dysfunction in cancer. Nat. Rev. Cancer 2019, 19, 326–338. [Google Scholar] [CrossRef] [PubMed]
  65. Otto, T.; Sicinski, P. Cell cycle proteins as promising targets in cancer therapy. Nat. Rev. Cancer 2017, 17, 93–115. [Google Scholar] [CrossRef] [PubMed]
  66. Wang, D.; Zhu, H.; Guo, M.; Fan, X.; Hu, S.; Yan, K.; Sun, J.; Wang, J.; Li, M.; Xiao, H.; et al. Expression and prognostic value of cell-cycle-associated genes in gastric adenocarcinoma. BMC Gastroenterol. 2018, 18, 81. [Google Scholar] [CrossRef] [PubMed]
  67. Tang, Y.; Tian, Y.; Zhang, C.X.; Wang, G.T. Olfactory Receptors and Tumorigenesis: Implications for Diagnosis and Targeted Therapy. Cell Biochem. Biophys. 2025, 83, 295–305. [Google Scholar] [CrossRef]
  68. Guo, X.; Yang, Z.; Zhi, Q.; Wang, D.; Guo, L.; Li, G.; Miao, R.; Shi, Y.; Kuang, Y. Long noncoding RNA OR3A4 promotes metastasis and tumorigenicity in gastric cancer. Oncotarget 2016, 7, 30276–30294. [Google Scholar] [CrossRef]
  69. Kyriakis, J.M.; Avruch, J. Mammalian MAPK signal transduction pathways activated by stress and inflammation: A 10-year update. Physiol. Rev. 2012, 92, 689–737. [Google Scholar] [CrossRef]
  70. Yuan, D.; Ma, Z.; Tuo, B.; Li, T.; Liu, X. Physiological Significance of Ion Transporters and Channels in the Stomach and Pathophysiological Relevance in Gastric Cancer. Evid.-Based Complement. Alternat. Med. 2020, 2020, 2869138. [Google Scholar] [CrossRef]
  71. Ghasemi, F.; Tessier, T.M.; Gameiro, S.F.; Maciver, A.H.; Cecchini, M.J.; Mymryk, J.S. High MHC-II expression in Epstein-Barr virus-associated gastric cancers suggests that tumor cells serve an important role in antigen presentation. Sci. Rep. 2020, 10, 14786. [Google Scholar] [CrossRef]
  72. Xu, X.; Chen, B.; Zhu, S.; Zhang, J.; He, X.; Cao, G.; Chen, B. Hyperglycemia promotes Snail-induced epithelial-mesenchymal transition of gastric cancer via activating ENO1 expression. Cancer Cell Int. 2019, 19, 344. [Google Scholar] [CrossRef] [PubMed]
  73. Zhou, W.; Chen, H.; Ruan, Y.; Zeng, X.; Liu, F. High Expression of TRIM15 Is Associated with Tumor Invasion and Predicts Poor Prognosis in Patients with Gastric Cancer. J. Investig. Surg. 2021, 34, 853–861. [Google Scholar] [CrossRef] [PubMed]
  74. Gong, J.; Wang, Y.; Shu, C. LncRNA CHRF promotes cell invasion and migration via EMT in gastric cancer. Eur. Rev. Med. Pharmacol. Sci. 2020, 24, 1168–1176. [Google Scholar] [PubMed]
  75. Zhang, M.; Wang, S.; Yi, A.; Qiao, Y. microRNA-665 is down-regulated in gastric cancer and inhibits proliferation, invasion, and EMT by targeting PPP2R2A. Cell Biochem. Funct. 2020, 38, 409–418. [Google Scholar] [CrossRef]
  76. Larsen, S.B.; Cowley, C.J.; Fuchs, E. Epithelial cells: Liaisons of immunity. Curr. Opin. Immunol. 2020, 62, 45–53. [Google Scholar] [CrossRef]
Figure 1. Study design.
Figure 1. Study design.
Ijms 26 10369 g001
Figure 2. Differential expression and functional enrichment analyses in the GSE29998 dataset. (A) Principal component analysis (PCA) results of samples in the GSE29998 dataset. (B) Heatmap of differentially expressed genes (DEGs) between GA and the control groups fom the GSE29998 dataset. (C) Volcano plot of DEGs between the GA group and the control group in the GSE29998 dataset. (DG) Cluster annotation results of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enriched by DEGs. (D), Biological Process (BP); (E), Cellular Component; (F), Molecular Function; (G), KEGG.
Figure 2. Differential expression and functional enrichment analyses in the GSE29998 dataset. (A) Principal component analysis (PCA) results of samples in the GSE29998 dataset. (B) Heatmap of differentially expressed genes (DEGs) between GA and the control groups fom the GSE29998 dataset. (C) Volcano plot of DEGs between the GA group and the control group in the GSE29998 dataset. (DG) Cluster annotation results of Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enriched by DEGs. (D), Biological Process (BP); (E), Cellular Component; (F), Molecular Function; (G), KEGG.
Ijms 26 10369 g002
Figure 3. Acquisition and functional analysis of candidate genes. (A) The molecular structure of genistein. (B) Venn diagram of drug target genes, GA target genes, and DEGs to identify candidate genes. (C) Expression box plot of candidate genes in GSE29998 dataset. * p < 0.05, ** p < 0.01, *** p < 0.001. (D) GO terms enriched by candidate genes. (E) KEGG pathways enriched by candidate genes.
Figure 3. Acquisition and functional analysis of candidate genes. (A) The molecular structure of genistein. (B) Venn diagram of drug target genes, GA target genes, and DEGs to identify candidate genes. (C) Expression box plot of candidate genes in GSE29998 dataset. * p < 0.05, ** p < 0.01, *** p < 0.001. (D) GO terms enriched by candidate genes. (E) KEGG pathways enriched by candidate genes.
Ijms 26 10369 g003
Figure 4. Identification of candidate key genes by machine learning algorithms. (A) Receiver Operating Characteristic (ROC) curves of 12 distinct machine learning algorithms. RF, Random Forest; SVM, Support Vector Machine; XGB, Extreme Gradient Boosting; GLM, Generalized Linear Model; GBM, Gradient Boosting Machine; KNN, k-Nearest Neighbors; NNET, Neural Network; LASSO, Least Absolute Shrinkage and Selection Operator; DT, Decision Tree; NB, Naive Bayes; ADA, Adaptive Boosting; BAG, Bootstrap Aggregating. (B) Residual box plot of 12 distinct machine learning algorithms. (C) Reverse cumulative distribution graph of 12 distinct machine learning algorithms. (D) Confusion matrix of 12 distinct machine learning algorithms. (E) ROC curves of 6 candidate key genes. AUC, area under the curve. (F) ROC curves of predictive model constructed from 6 candidate key genes within the GSE118916 dataset.
Figure 4. Identification of candidate key genes by machine learning algorithms. (A) Receiver Operating Characteristic (ROC) curves of 12 distinct machine learning algorithms. RF, Random Forest; SVM, Support Vector Machine; XGB, Extreme Gradient Boosting; GLM, Generalized Linear Model; GBM, Gradient Boosting Machine; KNN, k-Nearest Neighbors; NNET, Neural Network; LASSO, Least Absolute Shrinkage and Selection Operator; DT, Decision Tree; NB, Naive Bayes; ADA, Adaptive Boosting; BAG, Bootstrap Aggregating. (B) Residual box plot of 12 distinct machine learning algorithms. (C) Reverse cumulative distribution graph of 12 distinct machine learning algorithms. (D) Confusion matrix of 12 distinct machine learning algorithms. (E) ROC curves of 6 candidate key genes. AUC, area under the curve. (F) ROC curves of predictive model constructed from 6 candidate key genes within the GSE118916 dataset.
Ijms 26 10369 g004
Figure 5. Gene set enrichment analysis (GSEA) of candidate key genes. (A) HSD17B1, (B) CCNB1, (C) CCNB2, (D) CDKN2A, (E) EZH2, (F) IGFBP6.
Figure 5. Gene set enrichment analysis (GSEA) of candidate key genes. (A) HSD17B1, (B) CCNB1, (C) CCNB2, (D) CDKN2A, (E) EZH2, (F) IGFBP6.
Ijms 26 10369 g005
Figure 6. Molecular docking of 6 candidate key genes and genistein. (A) Molecular docking between HSD17B1 and genistein. (B) Molecular docking between CCNB1 and genistein. (C) Molecular docking between CCNB2 and genistein. (D) Molecular docking between EZH2 and genistein. (E) Molecular docking between CDKN2A and genistein. (F) Molecular docking between IGFBP6 and genistein.
Figure 6. Molecular docking of 6 candidate key genes and genistein. (A) Molecular docking between HSD17B1 and genistein. (B) Molecular docking between CCNB1 and genistein. (C) Molecular docking between CCNB2 and genistein. (D) Molecular docking between EZH2 and genistein. (E) Molecular docking between CDKN2A and genistein. (F) Molecular docking between IGFBP6 and genistein.
Ijms 26 10369 g006
Figure 7. Molecular dynamics (MD) simulation of HSD17B1 and genistein complex. (A,B) root mean square deviation (RMSD) of HSD17B1 and genistein complex. (C) Root mean square fluctuation (RMSF) of HSD17B1 and genistein complex. (D) Radius of gyration (Rg) of HSD17B1 and genistein complex. (E) Buried solvent accessible surface area (buried SASA) of HSD17B1 and genistein complex. (F) Hydrogen bond counts of HSD17B1 and genistein complex. (G,H) Gibbs energy landscape of HSD17B1 and genistein complex.
Figure 7. Molecular dynamics (MD) simulation of HSD17B1 and genistein complex. (A,B) root mean square deviation (RMSD) of HSD17B1 and genistein complex. (C) Root mean square fluctuation (RMSF) of HSD17B1 and genistein complex. (D) Radius of gyration (Rg) of HSD17B1 and genistein complex. (E) Buried solvent accessible surface area (buried SASA) of HSD17B1 and genistein complex. (F) Hydrogen bond counts of HSD17B1 and genistein complex. (G,H) Gibbs energy landscape of HSD17B1 and genistein complex.
Ijms 26 10369 g007
Figure 8. Identification of epithelial cells as key cell type. (A) Selection of top 2000 highly variable genes. (BC) Expression of marker genes in cell types (D) UMAP clustering plot of 21 clusters in the GSE264203 dataset. (E) UMAP plot illustrating the distribution of 8 cell types. (F) Distribution of intergroup DEGs across cell types. (G) The expression of HSD17B1 in each type of cell. (H) Subcellular localization results of expression of HSD17B1. (I) inferCNV score evaluating pronounced copy number alterations (CNAs) for each cell cluster.
Figure 8. Identification of epithelial cells as key cell type. (A) Selection of top 2000 highly variable genes. (BC) Expression of marker genes in cell types (D) UMAP clustering plot of 21 clusters in the GSE264203 dataset. (E) UMAP plot illustrating the distribution of 8 cell types. (F) Distribution of intergroup DEGs across cell types. (G) The expression of HSD17B1 in each type of cell. (H) Subcellular localization results of expression of HSD17B1. (I) inferCNV score evaluating pronounced copy number alterations (CNAs) for each cell cluster.
Ijms 26 10369 g008
Figure 9. Pseudotime trajectory analysis and spatial transcriptome (ST) analysis. (A) UMAP clustering plot of 10 distinct subclusters of epithelial cells. (B) CytoTRACE score of epithelial cells. (C) Distribution of tumor-derived epithelial cells and normal epithelial cells divided according to the established CytoTRACE score. (D) Box plot of CytoTRACE score of tumor-derived epithelial cells and normal epithelial cells. (E) Trajectory construction of highly variable genes. (F) Cell state distribution of epithelial cells in pseudotime trajectory analysis. (G) Distribution of tumor-derived epithelial cells and normal epithelial cells in pseudotime trajectory analysis. (H) Time trajectory of epithelial cells in pseudotime trajectory analysis. (I) The expression of HSD17B1 in pseudotime trajectory analysis. (J) Cluster annotation of spatial transcriptome (ST) analysis and the expression of HSD17B1 in GSM7990475 (top), GSM7990477 (middle), GSM7990480 (bottom) samples of GSE251950.
Figure 9. Pseudotime trajectory analysis and spatial transcriptome (ST) analysis. (A) UMAP clustering plot of 10 distinct subclusters of epithelial cells. (B) CytoTRACE score of epithelial cells. (C) Distribution of tumor-derived epithelial cells and normal epithelial cells divided according to the established CytoTRACE score. (D) Box plot of CytoTRACE score of tumor-derived epithelial cells and normal epithelial cells. (E) Trajectory construction of highly variable genes. (F) Cell state distribution of epithelial cells in pseudotime trajectory analysis. (G) Distribution of tumor-derived epithelial cells and normal epithelial cells in pseudotime trajectory analysis. (H) Time trajectory of epithelial cells in pseudotime trajectory analysis. (I) The expression of HSD17B1 in pseudotime trajectory analysis. (J) Cluster annotation of spatial transcriptome (ST) analysis and the expression of HSD17B1 in GSM7990475 (top), GSM7990477 (middle), GSM7990480 (bottom) samples of GSE251950.
Ijms 26 10369 g009
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, X.; Zhang, J.; Jiang, J.; Wang, Y. The Anti-Cancer Potential of Genistein: Single-Cell RNA Sequencing Analysis and Spatial Transcriptome Reveal That Genistein Targets HSD17B1 to Inhibit the Progression of Gastric Adenocarcinoma. Int. J. Mol. Sci. 2025, 26, 10369. https://doi.org/10.3390/ijms262110369

AMA Style

Wang X, Zhang J, Jiang J, Wang Y. The Anti-Cancer Potential of Genistein: Single-Cell RNA Sequencing Analysis and Spatial Transcriptome Reveal That Genistein Targets HSD17B1 to Inhibit the Progression of Gastric Adenocarcinoma. International Journal of Molecular Sciences. 2025; 26(21):10369. https://doi.org/10.3390/ijms262110369

Chicago/Turabian Style

Wang, Xianbing, Junyuan Zhang, Jiaying Jiang, and Yi Wang. 2025. "The Anti-Cancer Potential of Genistein: Single-Cell RNA Sequencing Analysis and Spatial Transcriptome Reveal That Genistein Targets HSD17B1 to Inhibit the Progression of Gastric Adenocarcinoma" International Journal of Molecular Sciences 26, no. 21: 10369. https://doi.org/10.3390/ijms262110369

APA Style

Wang, X., Zhang, J., Jiang, J., & Wang, Y. (2025). The Anti-Cancer Potential of Genistein: Single-Cell RNA Sequencing Analysis and Spatial Transcriptome Reveal That Genistein Targets HSD17B1 to Inhibit the Progression of Gastric Adenocarcinoma. International Journal of Molecular Sciences, 26(21), 10369. https://doi.org/10.3390/ijms262110369

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop