Article

Molecular Subtypes and Biomarkers of Ulcerative Colitis Revealed by Sphingolipid Metabolism-Related Genes: Insights from Machine Learning and Molecular Dynamics

1 Department of General Surgery, The First Affiliated Hospital of Dalian Medical University, Dalian 116000, China
2 Institute of Integrative Medicine, Dalian Medical University, Dalian 116000, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Curr. Issues Mol. Biol. 2025, 47(8), 616; https://doi.org/10.3390/cimb47080616 (registering DOI)
Submission received: 8 July 2025 / Revised: 28 July 2025 / Accepted: 30 July 2025 / Published: 4 August 2025

Abstract

Ulcerative colitis (UC) is a chronic inflammatory bowel disease associated with disrupted lipid metabolism. This study aimed to uncover novel molecular subtypes and biomarkers by integrating sphingolipid metabolism-related genes (SMGs) with machine learning approaches. Using data from the GEO and GeneCards databases, 29 UC-related SMGs were identified. Consensus clustering was employed to define distinct molecular subtypes of UC, and a diagnostic model was developed through various machine learning algorithms. Further analyses—including functional enrichment, transcription factor prediction, single-cell localization, potential drug screening, molecular docking, and molecular dynamics simulations—were conducted to investigate the underlying mechanisms and therapeutic prospects of the identified genes in UC. The analysis revealed two molecular subtypes of UC: C1 (metabolically dysregulated) and C2 (immune-enriched). A diagnostic model based on three key genes demonstrated high accuracy in both the training and validation cohorts. Moreover, the transcription factor FOXA2 was predicted to regulate the expression of all three genes simultaneously. Notably, mebendazole and NVP-TAE226 emerged as promising therapeutic agents for UC. In conclusion, SMGs are integral to UC molecular subtyping and immune microenvironment modulation, presenting a novel framework for precision diagnosis and targeted treatment of UC.

1. Introduction

Ulcerative colitis (UC) is a chronic, recurrent, and nonspecific inflammatory disorder primarily affecting the colon’s mucosa and submucosa. In recent years, both the incidence and prevalence of UC have steadily risen [1]. The etiology of UC remains unclear and multifactorial; however, increasing evidence suggests that, in addition to genetic factors, disturbances in the intestinal flora, the host immune system, environmental influences, abnormal lipid metabolism, and inflammation play a significant role [2,3]. Despite advancements in treatment, including drug and endoscopic therapies, clinical outcomes remain suboptimal [4]. The disease’s heterogeneity and its complex molecular and immunological mechanisms highlight the urgent need for novel biomarkers and therapeutic targets.
Sphingolipids, a class of bioactive lipids, are crucial for maintaining the integrity and function of cell membranes [5]. These molecules are composed of a polar head group, a fatty-acid chain, and a sphingosine backbone. They are involved in various biological processes, including signal transduction, lipid raft formation, and cell adhesion [6]. Previous studies have indicated that certain sphingolipid metabolites and the sphingosine 1-phosphate signaling pathway may play a role in regulating UC-related inflammation [7]. Additionally, sphingosine kinase 1 has been identified as a key regulator in dextran sulfate sodium-induced colitis, where its genetic knockout or pharmacological inhibition protects against intestinal injury and systemic inflammation [8]. Therefore, investigating the expression and function of sphingolipid-related genes in UC is critical for advancing both our understanding of the disease and the development of therapeutic strategies.
In research related to disease states, machine learning has emerged as a powerful tool for analyzing complex biological data. Techniques such as Random Forest (RF), Support Vector Machine (SVM), and LASSO regression can integrate and analyze large datasets to uncover patterns and relationships that traditional methods may overlook [9,10]. This study aims to employ machine learning to identify sphingolipid metabolism-related genes (SMGs) associated with UC, and to delineate sphingolipid metabolism-driven molecular subtypes and biomarkers. The goal is to enhance early diagnosis and improve treatment strategies for UC.

2. Methods

The study design flowchart is presented in Figure 1.

2.1. Raw Data Sources

From the GeneCards database (https://www.genecards.org/; accessed on 3 May 2025), 396 SMGs (relevance score > 10) were identified [11]. Gene expression datasets related to UC were retrieved from the GEO database (accessed on 3 May 2025), including GSE48958 [12], GSE75214 [13], GSE38713 [14], and GSE87466 [15]. These datasets were divided into a training group (GSE48958 and GSE75214) and validation sets (GSE38713 and GSE87466) to develop a robust and reliable UC prediction model. Supplementary Table S1 lists information on clinical characteristics of samples from the four datasets.

2.2. Integrated Analysis of UC-Related SMGs

To eliminate batch effects between arrays, the “sva” (version 3.42.0) R package was employed on the combined training matrix [16]. Differentially expressed genes (DEGs) between the UC and control groups were determined on the integrated dataset, with criteria set to an adjusted p-value < 0.05 and |log2FC| > 1. Statistical significance was assessed using moderated t-tests implemented in the “limma” (version 3.50.3) package, with multiple testing correction performed by the Benjamini–Hochberg method to control the false discovery rate (FDR). UC-related SMGs were identified by intersecting the SMGs with the DEGs using a Venn diagram.
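The filtering and intersection steps described above can be sketched in a few lines of Python. This is an illustrative sketch only: the study used the R packages "sva" and "limma", and the moderated t-test itself is not reproduced here. The function names, the gene labels (GENE_A and so on), and the toy statistics are all hypothetical and serve only to show how Benjamini–Hochberg adjustment, the adjusted p-value < 0.05 and |log2FC| > 1 thresholds, and the intersection with an SMG list fit together.

```python
from typing import Dict, List, Set, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(stats: Dict[str, Tuple[float, float]],
                fdr: float = 0.05, lfc: float = 1.0) -> Set[str]:
    """stats maps gene -> (log2 fold change, raw p-value)."""
    genes = list(stats)
    adj = benjamini_hochberg([stats[g][1] for g in genes])
    return {g for g, q in zip(genes, adj)
            if q < fdr and abs(stats[g][0]) > lfc}


# Hypothetical toy input; none of these values come from the study.
toy_stats = {
    "GENE_A": (2.1, 0.001),
    "GENE_B": (-1.4, 0.01),
    "GENE_C": (0.3, 0.002),   # significant but fold change too small
    "GENE_D": (1.8, 0.6),     # large fold change but not significant
}
toy_smgs = {"GENE_B", "GENE_C", "GENE_X"}  # hypothetical SMG list

degs = select_degs(toy_stats)
uc_related_smgs = degs & toy_smgs  # the Venn-diagram intersection
```

In this toy input only GENE_A and GENE_B pass both thresholds, and only GENE_B is also in the hypothetical SMG list.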

2.3. Functional Enrichment Analysis

The UC-related DEGs were uploaded to the STRING database (accessed on 10 May 2025) to explore protein–protein interaction (PPI) networks [17]. GeneMANIA (https://genemania.org/) (accessed on 10 May 2025) was used to rank genes based on functional testing [18]. Functional enrichment analysis of these DEGs was conducted through the Metascape database (accessed on 10 May 2025) [19]. Enrichment significance was assessed using the cumulative hypergeometric distribution, and p-values were adjusted for multiple comparisons using the Benjamini–Hochberg FDR correction. Enriched terms were filtered based on the criteria of p < 0.01, a minimum gene count ≥ 3, and an enrichment factor > 1.5.
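As a rough illustration of the enrichment criteria above, the cumulative hypergeometric p-value and the enrichment factor can be computed as follows. This is a sketch under stated assumptions, not a description of how Metascape works internally: the function names and the argument convention (N background genes, K genes annotated to the term, n genes in the input list, k of them annotated to the term) are chosen here for illustration, and the multiple-testing adjustment is omitted.

```python
from math import comb


def hypergeom_enrichment_p(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(N background, K in term, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def enrichment_factor(N: int, K: int, n: int, k: int) -> float:
    """Observed fraction of term genes in the list over the expected fraction."""
    return (k / n) / (K / N)


def passes_filter(N: int, K: int, n: int, k: int,
                  p_cut: float = 0.01, min_count: int = 3,
                  min_factor: float = 1.5) -> bool:
    """Apply the three thresholds quoted in the text to one term."""
    return (hypergeom_enrichment_p(N, K, n, k) < p_cut
            and k >= min_count
            and enrichment_factor(N, K, n, k) > min_factor)
```

For example, `passes_filter(20000, 100, 29, 5)` asks whether 5 of 29 input genes falling in a 100-gene term is enriched against a hypothetical background of 20,000 genes.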

2.4. Identification of Sphingolipid Metabolism-Related Molecular Subtypes in Patients with UC

To classify UC samples based on the expression levels of UC-related SMGs, the “ConsensusClusterPlus” (version 1.58.0) R package was applied [20]. Principal component analysis (PCA) was performed to validate the clustering results, followed by biological characteristic comparisons of the different subgroups. To assess the differences in immune cell infiltration levels between UC molecular subtypes, Wilcoxon’s rank-sum test was applied to each immune cell type. Statistical significance was determined using two-sided tests, and p-values were adjusted for multiple comparisons.

2.5. Identifying the Best Model Genes of UC-Related SMGs

To prevent overfitting, LASSO analysis was performed using the “glmnet” (version 4.1.8) R package, with 10-fold cross-validation used to determine the optimal penalty parameter λ [21,22]. RF analysis was conducted using the “randomForest” (version 4.7.1.1) R package (ntree = 500) [23]. The mean decrease in Gini index produced by the RF was used to assess feature importance, with genes having relative importance greater than one classified as characteristic genes. Additionally, SVM-RFE was implemented using the “e1071” (version 1.7.14) R package [24]. SVM-RFE, based on structural risk minimization, aims to minimize empirical error and maximize learning performance. The model genes selected for further analysis were those that intersected across the three methods. To compare the expression levels of model genes between control and UC groups, Wilcoxon’s rank-sum test was employed. Significance was determined using two-sided tests, and p-values were annotated as significance levels in the boxplots.
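The final selection step, keeping only genes chosen by all three methods, amounts to a set intersection. The sketch below assumes each method has already produced its own gene list. Only CAV1, PPARG, and SLC30A10 are taken from the text; every other gene label and every importance value is a made-up placeholder, and `consensus_features` is a hypothetical helper name.

```python
from typing import Iterable, List, Set


def consensus_features(*selections: Iterable[str]) -> List[str]:
    """Return the genes present in every selection, sorted by name."""
    sets: List[Set[str]] = [set(s) for s in selections]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Placeholder outputs of the three methods (illustrative values only).
lasso = ["CAV1", "PPARG", "SLC30A10", "LASSO_ONLY_1", "LASSO_ONLY_2"]
svm_rfe = ["CAV1", "PPARG", "SLC30A10", "SVM_ONLY_1"]
rf_importance = {
    "CAV1": 3.0, "PPARG": 2.5, "SLC30A10": 2.0,
    "RF_ONLY_1": 1.2, "RF_LOW_1": 0.4,
}
# Random forest rule from the text: keep genes with importance greater than one.
rf = [gene for gene, score in rf_importance.items() if score > 1.0]

model_genes = consensus_features(lasso, svm_rfe, rf)
```

Requiring agreement across methods trades sensitivity for stability: a gene selected by only one algorithm is dropped even if that algorithm ranks it highly.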

2.6. Gene Set Variation Analysis (GSVA) and Immunoinfiltration Analysis

To investigate pathway activity alterations in the model genes, GSVA was performed using the “GSVA” (version 1.42.0) R package [25]. For each pathway, a two-sided unpaired Student’s t-test was used to assess the difference in ssGSEA scores between the two groups. Pathways were considered significantly upregulated or downregulated based on the p-value (<0.05) and the direction of the t-statistic. Simultaneously, a comprehensive evaluation of immune cell composition—an essential aspect of UC research—was conducted using the CIBERSORT algorithm [26]. This efficient method facilitated the quantification of 22 distinct immune cell types. The correlations between gene expression levels and immune cell infiltration were assessed using Spearman’s rank correlation coefficient. The analysis was visualized with lollipop plots, where the circle size denotes the magnitude of the correlation, and the color represents the statistical significance of the association. A p-value threshold of 0.05 was used to define statistically significant correlations.
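Spearman's rank correlation, used above to relate gene expression to immune cell fractions, is the Pearson correlation of the ranks. A minimal standard-library sketch is given below. The function names are illustrative, ties receive average ranks, and the p-value calculation is left out.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold the expression of one model gene across samples and `y` the estimated fraction of one immune cell type in the same samples.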

2.7. Cell Culture and Quantitative Real-Time PCR Analysis

The Caco-2 cell line was cultured in MEM supplemented with 20% fetal bovine serum and 1% penicillin–streptomycin. Consistent with prior research, the cells were treated with LPS (1 µg/mL) for 24 h to induce a colitis model [27,28]. Total RNA was extracted using TRIzol reagent, and cDNA was synthesized using a reverse-transcription kit. Gene expression was measured by a fluorescent dye-based assay with SYBR Green I, and relative RNA expression was quantified with the ΔΔCt method. Expression levels of the target genes were compared between the two groups using a t-test. A p-value < 0.05 was considered statistically significant.
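The ΔΔCt calculation can be written out explicitly. The sketch below assumes the usual 2^(−ΔΔCt) form with roughly 100% amplification efficiency and a single reference (housekeeping) gene. The text does not name the reference gene, so the argument names here are generic placeholders.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in treated vs. control cells (2^-ddCt)."""
    # Normalise each target Ct to the reference gene in the same sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Difference of the normalised values between conditions.
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)
```

With made-up Ct values of 22 (target) and 18 (reference) in LPS-treated cells and 24 and 18 in controls, ΔΔCt is −2 and the function returns a four-fold increase.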

2.8. Prediction of Core Gene Transcription Factors (TFs), Cell Localization, and Single-Cell Profiling Analysis

TFs that potentially regulate the model genes were predicted using the TFTF online tool (accessed on 15 May 2025) [29], which integrates three major TF-target databases: JASPAR [30], GTRD [31], and ChIP_Atlas [32]. To explore the localization patterns of key genes in the colon, single-cell transcriptome data were obtained from the Human Protein Atlas (HPA) database (accessed on 15 May 2025).

2.9. Prediction of Potential Therapeutic Drugs for UC

The Connectivity Map (CMap, https://clue.io/) database (accessed on 15 May 2025) was used to explore functional linkages between genes, small-molecule drugs, and diseases [33,34]. UC-related DEGs were uploaded to the CMap database to predict potential therapeutic drugs. The identified small-molecule compounds were subsequently validated through molecular docking. The 3D structures of target proteins were retrieved from the Protein Data Bank (PDB), and docking was performed using the CB-Dock2 online server (https://cadd.labshare.cn/cb-dock2/index.php; accessed on 18 May 2025), which combines cavity detection with AutoDock Vina to automatically identify optimal binding pockets and docking poses. Molecular dynamics simulations were conducted using GROMACS 2021.3 with the CHARMM36 force field.

3. Results

3.1. Identification of DEGs Associated with UC and SMGs

Raw UC data were sourced from the GEO database, and after removing batch effects, the datasets were normalized (Figure S1A–D). “Limma” analysis was conducted on the UC cohort, identifying 551 DEGs, including 340 upregulated and 211 downregulated genes (Figure 2A). Further intersection of UC-related DEGs with SMGs revealed 29 shared genes for subsequent analysis (Figure 2B). The PPI network for these 29 DEGs was constructed using data from the STRING and GeneMANIA databases (Figure 2C,D). Functional enrichment analysis indicated that these genes were predominantly involved in interleukin-4 and interleukin-13 signaling, nutrient response, and lipid localization (Figure 2E).

3.2. Identification of SMG-Related Clusters in UC

To classify UC samples, a consensus clustering approach was applied based on the expression patterns of the 29 UC-SMGs. The results were most stable when the samples were divided into two clusters (Figure S1E,F). The expression levels of the 29 UC-related SMGs were then compared between the two clusters, C1 and C2, to assess molecular differences (Figure 3A). Next, GSVA was performed to identify potential biological and functional distinctions between the clusters. The results showed that C1 (the metabolically dysregulated subtype) was mainly associated with butanoate metabolism, glycine–serine metabolism, and lipid modification, whereas C2 (the immune-enriched subtype) was enriched in the positive regulation of T-cell-mediated cytotoxicity, the regulation of natural killer cell-mediated immunity, and the T-cell receptor signaling pathway (Figure 3B,C). Further comparison of immune cell infiltration between the two clusters revealed that T cells (CD4 memory), resting NK cells, macrophages (M0 and M1), activated dendritic cells, and neutrophils were more abundant in C1. In contrast, T cells (CD8), regulatory T cells (Tregs), and M2 macrophages were significantly more abundant in C2 (Figure 3D,E).

3.3. Construction of the Diagnostic Model for UC

Three machine learning models were developed based on the 29 UC-related SMGs. The LASSO regression approach identified 13 genes as potential diagnostic indicators (Figure 4A,B). Using the SVM-RFE method, nine genes were further selected as potential biomarkers from this set (Figure 4C,D). Additionally, nine genes with importance values greater than one, as determined by the RF algorithm, were included for further investigation (Figure 4E,F). By overlaying the results from all three methods on a Venn diagram, three genes—CAV1, PPARG, and SLC30A10—were identified as key diagnostic biomarkers (Figure 4G).

3.4. Evaluation of the Diagnostic Model

The chromosomal locations of the three model genes are shown in Figure 5A. Among them, CAV1 was highly expressed in patients with UC, while PPARG and SLC30A10 were downregulated (Figure 5B). These findings were confirmed in the validation set (Figure S2A,B). The receiver operating characteristic (ROC) curves indicated strong predictive values for all three genes (Figure 5C). Notably, the three-gene prediction model, with an AUC of 0.991, exhibited superior performance (Figure 5D). To further assess the model’s accuracy, the same analysis was conducted on the validation set, demonstrating high predictive value in this cohort as well (Figure 5E,F). Additionally, UC cellular models were generated by treating Caco-2 cell lines with LPS, and the expression levels of the three model genes were confirmed. The UC model was successfully established, as shown by significantly elevated levels of IL-6 and IL-1β in the model group compared to the control group (Figure 5G). The expression levels of the three model genes in the cellular model closely aligned with the bioinformatics findings (Figure 5G).
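For readers unfamiliar with the AUC reported above, it equals the probability that a randomly chosen UC sample receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name and the 1 = UC, 0 = control label coding are assumptions made for illustration, and this is not the software used in the study.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (labels: 1 or 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size. A rank-based formulation gives the same value more efficiently.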

3.5. Analysis of the Functional Enrichment and Immune Infiltration of the Model Genes

To explore the pathway enrichment differences in the three model genes, GSVA analysis was performed. CAV1 was found to be highly expressed in lipid metabolism pathways, including the diacylglycerol metabolic process, pyruvate metabolism, and inositol phosphate metabolism. In contrast, PPARG and SLC30A10 were downregulated in processes such as monoacylglycerol metabolism, retinol metabolism, nitrogen metabolism, and glycerophospholipid metabolism (Figure S3). Additionally, this study explored the relationship between each model gene and various immune cell types, revealing several interesting correlations. Notably, CAV1 was positively correlated with neutrophils, resting CD4 memory T cells, and M1 macrophages, while PPARG exhibited a negative correlation with M1 macrophages, neutrophils, and activated CD4 memory T cells (Figure 6A–H). Furthermore, SLC30A10 showed a negative correlation with activated dendritic cells, neutrophils, and M1 macrophages (Figure 6I–L). This study also performed a correlation analysis between the different immune cell types in patients with UC and examined gene–immune cell interactions (Figure 6M).

3.6. Screening of Transcription Factors (TFs) of Model Genes and Single-Cell Expression Analysis

TFs are proteins that regulate transcription by binding to DNA in a sequence-specific manner, ultimately controlling the expression of target genes and influencing biological phenotypes and pathological processes. To identify TFs associated with CAV1, PPARG, and SLC30A10, three datasets were merged (Figure S4A–C). The intersection of these datasets revealed a common TF, FOXA2, which could potentially regulate all three model genes (Figure S4D). Single-cell expression analysis from the HPA database showed distinct expression patterns for the three key genes in the colon. CAV1 was predominantly expressed in undifferentiated cells (Figure S4E), while PPARG and SLC30A10 were mainly localized in distal enterocytes (Figure S4F,G).

3.7. Screening of Potential Therapeutic Agents

The CMap database was employed to identify promising small-molecule drugs for UC treatment based on UC-related DEGs. The top 10 compounds with the highest potential for UC treatment are listed in Table 1. To validate their suitability, molecular docking analysis was performed between these ten drugs and the three model genes using AutoDock Vina, with results presented in Table 1. Mebendazole and NVP-TAE226 showed the lowest binding free energies (less than −8.0 kcal/mol) with PPARG and SLC30A10, respectively, indicating the most stable binding interactions. The drug binding poses and sites on the model genes are shown in Figure 7A,B. Mebendazole formed two hydrogen bonds with PPARG (Figure 7A), while NVP-TAE226 formed six hydrogen bonds with SLC30A10 (Figure 7B), stabilizing the drug–protein interaction.
To assess the stability of these complexes, 100 ns molecular dynamics simulations were carried out. Root mean square deviation (RMSD) and root mean square fluctuation (RMSF) analyses (Figure 7C–F) indicated rapid equilibration and minimal fluctuations in residue positions, suggesting structural stability of both complexes. Solvent-accessible surface area (SASA) and radius of gyration (Rg) analyses (Figure 7G–J) further confirmed the compactness of the complexes’ conformations. The PPARG–mebendazole complex maintained 1–3 hydrogen bonds, while the SLC30A10–NVP-TAE226 complex consistently formed 4–6 hydrogen bonds (Figure 7K,L), suggesting a stronger interaction in the latter. Free energy landscape (FEL) analysis (Figure 7M,N) revealed that both complexes stabilized in dominant, low-energy conformational states, with the SLC30A10 complex showing slightly more flexibility due to its transmembrane nature. PCA and the corresponding FELs (Figure 7O–R) supported the presence of coordinated motions and stable conformational basins, further supporting the stability of both complexes. Overall, the PPARG–mebendazole and SLC30A10–NVP-TAE226 complexes exhibited stable binding interactions, structural stability, and minimal conformational drift, supporting their potential as therapeutic candidates for UC.
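Two of the trajectory metrics above have simple definitions that are easy to state in code. The sketch below computes RMSD between two coordinate sets and the mass-weighted radius of gyration of one frame. It assumes the structures are already superimposed and that atoms are listed in the same order; trajectory tools normally perform a least-squares fit before computing RMSD, and that step is not reproduced here. The function names are illustrative and unrelated to the GROMACS command names.

```python
from math import sqrt
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(a: Sequence[Coord], b: Sequence[Coord]) -> float:
    """Root mean square deviation between matched, pre-aligned atoms."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
                for p, q in zip(a, b))
    return sqrt(total / len(a))


def radius_of_gyration(coords: Sequence[Coord],
                       masses: Optional[Sequence[float]] = None) -> float:
    """Mass-weighted RMS distance of atoms from their centre of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights if no masses are given
    total_mass = sum(masses)
    centre = tuple(sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
                   for i in range(3))
    weighted = sum(m * sum((c[i] - centre[i]) ** 2 for i in range(3))
                   for m, c in zip(masses, coords))
    return sqrt(weighted / total_mass)
```

A flat RMSD trace after an initial rise and a steady Rg are the usual signatures of the equilibration and compactness described in the text.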

4. Discussion

UC, a chronic, relapsing inflammatory bowel disease, is characterized by intestinal inflammation, mucosal damage, and fibrosis [35]. Although its incidence continues to rise, the underlying molecular mechanisms remain unclear, and treatment outcomes are suboptimal. Sphingolipids, a diverse group of structurally and biologically active lipids, are metabolized through an intricate network of enzymes. Research into the role of bioactive sphingolipids in signaling mechanisms has expanded beyond their initial involvement in PKC regulation to encompass a range of biological processes, including metabolism, apoptosis, cellular development, differentiation, proliferation, immunology, inflammation, and related diseases [36]. This study leverages machine learning and bioinformatics approaches to identify SMGs as novel diagnostic biomarkers and potential therapeutic targets.
Through a series of analyses combining UC and SMGs, 29 common genes were identified. Functional enrichment analysis revealed that these genes are primarily involved in lipid localization, response to nutritional signals, and interleukin-4 and interleukin-13 signaling. These findings suggest that the identified genes may serve as key links between sphingolipid metabolism and UC, with the enriched pathways offering insight into the molecular mechanisms by which sphingolipid metabolism influences UC. Consensus clustering further classified patients with UC into two subgroups: C1 (a metabolically dysregulated subtype) and C2 (an immune-enriched subtype). This stratification reflects the clinical heterogeneity of UC, with some patients exhibiting severe inflammation and immune cell accumulation, while others display metabolic disorder characteristics.
By integrating three machine learning methods, three model genes (CAV1, PPARG, and SLC30A10) were identified. The three-gene model demonstrated exceptional diagnostic performance, with an AUC of 0.991, highlighting its potential for non-invasive UC screening. The consistency of results across validation datasets and in vitro LPS-induced models further supports the reliability of these biomarkers. The identification of these genes as diagnostic biomarkers is consistent with their distinct roles in lipid metabolism and inflammation. CAV1, a membrane scaffolding protein found in lipid rafts and caveolae, plays a vital role in signal transduction, metabolism, endocytosis, and exocytosis [37,38,39]. By regulating the activation of mitogen-activated protein kinase family members, CAV1 suppresses pro-inflammatory cytokine production from macrophages and has been linked to the modulation of inflammation and innate immunity [39]. As the principal component of caveolae, structures rich in cholesterol and sphingolipids, CAV1 is involved in the dynamic regulation of cholesterol within cells. It regulates cholesterol distribution and transport on the cell membrane by binding to cholesterol [38]. In the present study, CAV1 was upregulated in UC and positively correlated with pro-inflammatory immune cells. Its overexpression may exacerbate mucosal inflammation by amplifying lipid-mediated signaling pathways, aligning with its involvement in lipid metabolic processes. In contrast, PPARG, a member of the PPAR family involved in regulating immune tolerance, metabolism, and inflammation [40], and SLC30A10, a zinc transporter implicated in lipid synthesis [41], were both downregulated in UC. Moreover, the expression of PPARG and SLC30A10 was negatively correlated with pro-inflammatory cells, including neutrophils and M1 macrophages, suggesting that their downregulation may promote inflammation by favoring pro-inflammatory cell infiltration and disrupting lipid homeostasis. These findings are consistent with prior studies linking PPAR deficiency to impaired mucosal repair [42] and SLC30A10 mutations to metabolic dysregulation [43].
Furthermore, shared TFs for these three genes were identified, with FOXA2 emerging as a key regulator potentially coordinating the expression of CAV1, PPARG, and SLC30A10. The FOX family of TFs plays a pivotal role in the differentiation and function of various cell types [44]. Expression of the cystic fibrosis transmembrane conductance regulator gene in intestinal epithelial cells depends on FOXA1/A2 [45,46]. FOXA2 also regulates the function of intestinal epithelial cells through a co-regulated gene network [47]. Deletion of FOXA1/A2 results in decreased intracellular adenosine 3′,5′-cyclic monophosphate levels, which are vital for ion and solute transport and other enterocyte processes [47]. These findings suggest that FOXA2 plays a pivotal role in intestinal epithelial differentiation and barrier integrity, and its dysregulation in UC could disrupt mucosal repair and lipid metabolism.
In this study, we identified mebendazole and NVP-TAE226 as promising therapeutic candidates for UC through integrative analysis using the CMap database and molecular docking. Molecular dynamics simulations further supported the stability and reliability of the predicted binding modes. Both the PPARG–mebendazole and SLC30A10–NVP-TAE226 complexes maintained stable interactions throughout the simulation, with low RMSD values, consistent hydrogen bonding, and favorable FELs, indicating strong and specific binding under simulated physiological conditions. Mebendazole, an FDA-approved benzimidazole, is safe for use in both children [48] and adults [49] and is effective against various intestinal helminths. In addition to its antiparasitic properties, mebendazole has shown anti-inflammatory and anti-fibrotic effects in several cell lines and animal models by downregulating key signaling pathways such as mitogen-activated protein kinase [50,51], nuclear factor-kappa B [52], cyclooxygenase 2 [53], and TGF-β [23]. It also reduces collagen release and alpha-smooth muscle actin levels, contributing to its anti-fibrotic effects [54]. A recent animal study demonstrated that mebendazole reduces inflammation and accelerates healing in UC mouse models [55]. Furthermore, it was also reported to induce M2-phenotype polarization of macrophages with anti-inflammatory properties, resulting in the production of anti-inflammatory modulators, including IL-10 and CD206, and acceleration of the healing process [56]. In a pilot study, the addition of mebendazole to mesalamine for the treatment of UC was a safe and potentially beneficial approach to improve mesalamine efficacy and reduce clinical symptoms [57]. NVP-TAE226, a focal adhesion kinase (FAK) inhibitor, has been explored for treating various malignant tumors [58]. FAK serves as a key signaling node downstream of integrin and growth factor receptors, and its inhibition can block pro-inflammatory and pro-fibrotic signal transduction [59]. The high affinity of mebendazole for PPARG may restore its anti-inflammatory signaling, while the interaction between NVP-TAE226 and SLC30A10 could inhibit the activation of FAK by the zinc transporter, collectively contributing to the therapeutic effects in UC.
In summary, this study presents an innovative integration of sphingolipid metabolism and UC transcriptomic data to identify two distinct molecular subtypes of UC using consensus clustering. Three machine learning algorithms were employed in parallel for robust identification of key genes, with cross-validation among models effectively reducing the risk of overfitting. Validation using independent cohorts and experimental data further reinforces the credibility of the findings. Moreover, by combining CMap-based drug screening with molecular docking and molecular dynamics simulations, we performed a comprehensive evaluation of potential therapeutic compounds, offering new insights and directions for the clinical treatment of UC. However, several limitations remain in this study. First, the use of publicly available databases resulted in small sample sizes in the validation cohorts. Therefore, further validation through larger-scale, multicenter randomized controlled trials is needed. Second, functional experiments are lacking: although LPS-induced Caco-2 cells reproduced the gene expression trends, future studies using animal models such as DSS-induced colitis in mice are warranted to validate the biological significance of the identified targets.

5. Conclusions

Through the integration of SMGs and transcriptomics data, this study revealed two molecular subtypes of UC with significant biological functional differences and established a novel three-gene diagnostic model with high accuracy. In addition, we identified mebendazole and NVP-TAE226 as potential candidates for the treatment of UC, providing a reference for the clinical management of UC.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/cimb47080616/s1.

Author Contributions

Conceptualization, methodology, investigation, writing—original draft, Q.L. and J.L. (Junchen Li); data curation, formal analysis, writing—original draft, S.L., Y.Z. and J.L. (Jifeng Liu); reviewing and revising papers, X.W. and G.L. Q.L. and J.L. (Junchen Li) contributed equally to this work. All authors fully participated in this work and agreed to take responsibility for all aspects of this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Acknowledgments

We thank Bullet Edits Limited for the linguistic editing and proofreading of the manuscript.

Conflicts of Interest

There are no conflicts of interest declared by any of the authors.

References

  1. Krugliak Cleveland, N.; Torres, J.; Rubin, D.T. What Does Disease Progression Look Like in Ulcerative Colitis, and How Might It Be Prevented? Gastroenterology 2022, 162, 1396–1408.
  2. Ding, X.; Yan, F.; Wang, W.; Qin, J.; Luo, L. Integration of transcriptomics and metabolomics identify biomarkers of aberrant lipid metabolism in ulcerative colitis. Int. Immunopharmacol. 2024, 131, 111865.
  3. Kobayashi, T.; Siegmund, B.; Le Berre, C.; Wei, S.C.; Ferrante, M.; Shen, B.; Bernstein, C.N.; Danese, S.; Peyrin-Biroulet, L.; Hibi, T. Ulcerative colitis. Nat. Rev. Dis. Primers 2020, 6, 74.
  4. Adams, S.M.; Close, E.D.; Shreenath, A.P. Ulcerative Colitis: Rapid Evidence Review. Am. Fam. Physician 2022, 105, 406–411.
  5. Zheng, W.; Kollmeyer, J.; Symolon, H.; Momin, A.; Munter, E.; Wang, E.; Kelly, S.; Allegood, J.C.; Liu, Y.; Peng, Q.; et al. Ceramides and other bioactive sphingolipid backbones in health and disease: Lipidomic analysis, metabolism and roles in membrane structure, dynamics, signaling and autophagy. Biochim. Biophys. Acta 2006, 1758, 1864–1884.
  6. Siow, D.; Sunkara, M.; Morris, A.; Wattenberg, B. Regulation of de novo sphingolipid biosynthesis by the ORMDL proteins and sphingosine kinase-1. Adv. Biol. Regul. 2015, 57, 42–54.
  7. Kunkel, G.T.; Maceyka, M.; Milstien, S.; Spiegel, S. Targeting the sphingosine-1-phosphate axis in cancer, inflammation and beyond. Nat. Rev. Drug Discov. 2013, 12, 688–702.
  8. Snider, A.J.; Orr Gandy, K.A.; Obeid, L.M. Sphingosine kinase: Role in regulation of bioactive sphingolipid mediators in inflammation. Biochimie 2010, 92, 707–715.
  9. Levi, M.; Lazebnik, T.; Kushnir, S.; Yosef, N.; Shlomi, D. Machine learning computational model to predict lung cancer using electronic medical records. Cancer Epidemiol. 2024, 92, 102631.
  10. Liu, Y.; Cai, C.; Xu, W.; Li, B.; Wang, L.; Peng, Y.; Yu, Y.; Liu, B.; Zhang, K. Interpretable Machine Learning-Aided Optical Deciphering of Serum Exosomes for Early Detection, Staging, and Subtyping of Lung Cancer. Anal. Chem. 2024, 96, 16227–16235.
  11. Liu, J.; Li, Y.; Ma, J.; Wan, X.; Zhao, M.; Zhang, Y.; Shang, D. Identification and immunological characterization of lipid metabolism-related molecular clusters in nonalcoholic fatty liver disease. Lipids Health Dis. 2023, 22, 124.
  12. Van der Goten, J.; Vanhove, W.; Lemaire, K.; Van Lommel, L.; Machiels, K.; Wollants, W.-J.; De Preter, V.; De Hertogh, G.; Ferrante, M.; Van Assche, G.; et al. Integrated miRNA and mRNA expression profiling in inflamed colon of patients with ulcerative colitis. PLoS ONE 2014, 9, e116117.
  13. Vancamelbeke, M.; Vanuytsel, T.; Farré, R.; Verstockt, S.; Ferrante, M.; Van Assche, G.; Rutgeerts, P.; Schuit, F.; Vermeire, S.; Arijs, I.; et al. Genetic and Transcriptomic Bases of Intestinal Epithelial Barrier Dysfunction in Inflammatory Bowel Disease. Inflamm. Bowel Dis. 2017, 23, 1718–1729.
  14. Planell, N.; Lozano, J.J.; Mora-Buch, R.; Masamunt, M.C.; Jimeno, M.; Ordás, I.; Esteller, M.; Ricart, E.; Piqué, J.M.; Panés, J.; et al. Transcriptional analysis of the intestinal mucosa of patients with ulcerative colitis in remission reveals lasting epithelial cell alterations. Gut 2013, 62, 967–976.
  15. Li, K.; Strauss, R.; Ouahed, J.; Chan, D.; Telesco, S.E.; Shouval, D.S.; Canavan, J.B.; Brodmerkel, C.; Snapper, S.B.; Friedman, J.R. Molecular Comparison of Adult and Pediatric Ulcerative Colitis Indicates Broad Similarity of Molecular Pathways in Disease Tissue. J. Pediatr. Gastroenterol. Nutr. 2018, 67, 45–52.
  16. Leek, J.T.; Johnson, W.E.; Parker, H.S.; Jaffe, A.E.; Storey, J.D. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics 2012, 28, 882–883.
  17. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646. [Google Scholar] [CrossRef]
  18. Warde-Farley, D.; Donaldson, S.L.; Comes, O.; Zuberi, K.; Badrawi, R.; Chao, P.; Franz, M.; Grouios, C.; Kazi, F.; Lopes, C.T.; et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010, 38, W214–W220. [Google Scholar] [CrossRef]
  19. Zhou, Y.; Zhou, B.; Pache, L.; Chang, M.; Khodabakhshi, A.H.; Tanaseichuk, O.; Benner, C.; Chanda, S.K. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 2019, 10, 1523. [Google Scholar] [CrossRef]
  20. Li, C.-L.; Wang, Q.; Wu, L.; Hu, J.-Y.; Gao, Q.-C.; Jiao, X.-L.; Zhang, Y.-X.; Tang, S.; Yu, Q.; He, P.-F. The PANoptosis-related hippocampal molecular subtypes and key biomarkers in Alzheimer’s disease patients. Sci. Rep. 2024, 14, 23851. [Google Scholar] [CrossRef]
  21. Bai, J.; Huang, J.H.; Price, C.P.E.; Schauer, J.M.; Suh, L.A.; Harmon, R.; Conley, D.B.; Welch, K.C.; Kern, R.C.; Shintani-Smith, S.; et al. Prognostic factors for polyp recurrence in chronic rhinosinusitis with nasal polyps. J. Allergy Clin. Immunol. 2022, 150, 352–361.e357. [Google Scholar] [CrossRef]
  22. Blanco, J.L.; Porto-Pazos, A.B.; Pazos, A.; Fernandez-Lozano, C. Prediction of high anti-angiogenic activity peptides in silico using a generalized linear model and feature selection. Sci. Rep. 2018, 8, 15688. [Google Scholar] [CrossRef]
  23. Rigatti, S.J. Random Forest. J. Insur. Med. 2017, 47, 31–39. [Google Scholar] [CrossRef]
  24. Yang, L.; Pan, X.; Zhang, Y.; Zhao, D.; Wang, L.; Yuan, G.; Zhou, C.; Li, T.; Li, W. Bioinformatics analysis to screen for genes related to myocardial infarction. Front. Genet. 2022, 13, 990888. [Google Scholar] [CrossRef]
  25. Hanzelmann, S.; Castelo, R.; Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinform. 2013, 14, 7. [Google Scholar] [CrossRef]
  26. Jiang, C.; Zhang, S.; Jiang, L.; Chen, Z.; Chen, H.; Huang, J.; Tang, J.; Luo, X.; Yang, G.; Liu, J.; et al. Precision unveiled: Synergistic genomic landscapes in breast cancer-Integrating single-cell analysis and decoding drug toxicity for elite prognostication and tailored therapeutics. Environ. Toxicol. 2024, 39, 3448–3472. [Google Scholar] [CrossRef]
  27. Li, J.; Chen, Y.; Yu, Q.; Li, S.; Zhang, X.; Cheng, Y.; Fu, X.; Li, J.; Zhu, L. Estrogen receptor β alleviates colitis in intestinal epithelial cells and activates HIF-1a and ATG-9a-mediated autophagy. Exp. Cell Res. 2025, 447, 114520. [Google Scholar] [CrossRef] [PubMed]
  28. Wang, Y.; Xu, T.; Wang, W. C9orf72 Alleviates DSS-Induced Ulcerative Colitis via the cGAS-STING Pathway. Immun. Inflamm. Dis. 2025, 13, e70139. [Google Scholar] [CrossRef] [PubMed]
  29. Wang, J. TFTF: An R-Based Integrative Tool for Decoding Human Transcription Factor-Target Interactions. Biomolecules 2024, 14, 749. [Google Scholar] [CrossRef]
  30. Rauluseviciute, I.; Riudavets-Puig, R.; Blanc-Mathieu, R.; Castro-Mondragon, J.A.; Ferenc, K.; Kumar, V.; Lemma, R.B.; Lucas, J.; Chèneby, J.; Baranasic, D.; et al. JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2024, 52, D174–D182. [Google Scholar] [CrossRef] [PubMed]
  31. Kolmykov, S.; Yevshin, I.; Kulyashov, M.; Sharipov, R.; Kondrakhin, Y.; Makeev, V.J.; Kulakovskiy, I.V.; Kel, A.; Kolpakov, F. GTRD: An integrated view of transcription regulation. Nucleic Acids Res. 2021, 49, D104–D111. [Google Scholar] [CrossRef] [PubMed]
  32. Zou, Z.; Ohta, T.; Miura, F.; Oki, S. ChIP-Atlas 2021 update: A data-mining suite for exploring epigenomic landscapes by fully integrating ChIP-seq, ATAC-seq and Bisulfite-seq data. Nucleic Acids Res. 2022, 50, W175–W182. [Google Scholar] [CrossRef]
  33. Subramanian, A.; Narayan, R.; Corsello, S.M.; Peck, D.D.; Natoli, T.E.; Lu, X.; Gould, J.; Davis, J.F.; Tubelli, A.A.; Asiedu, J.K.; et al. A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell 2017, 171, 1437–1452.e17. [Google Scholar] [CrossRef]
  34. Yang, C.; Zhang, H.; Chen, M.; Wang, S.; Qian, R.; Zhang, L.; Huang, X.; Wang, J.; Liu, Z.; Qin, W.; et al. A survey of optimal strategy for signature-based drug repositioning and an application to liver cancer. eLife 2022, 11, e71880. [Google Scholar] [CrossRef]
  35. Wangchuk, P.; Yeshi, K.; Loukas, A. Ulcerative colitis: Clinical biomarkers, therapeutic targets, and emerging treatments. Trends Pharmacol. Sci. 2024, 45, 892–903. [Google Scholar] [CrossRef]
  36. Pyne, N.J.; McNaughton, M.; Boomkamp, S.; MacRitchie, N.; Evangelisti, C.; Martelli, A.M.; Jiang, H.R.; Ubhi, S.; Pyne, S. Role of sphingosine 1-phosphate receptors, sphingosine kinases and sphingosine in cancer and inflammation. Adv. Biol. Regul. 2016, 60, 151–159. [Google Scholar] [CrossRef]
  37. Ni, K.; Wang, C.; Carnino, J.M.; Jin, Y. The Evolving Role of Caveolin-1: A Critical Regulator of Extracellular Vesicles. Med. Sci. 2020, 8, 46. [Google Scholar] [CrossRef] [PubMed]
  38. Razani, B.; Lisanti, M.P. Caveolins and caveolae: Molecular and functional relationships. Exp. Cell Res. 2001, 271, 36–44. [Google Scholar] [CrossRef] [PubMed]
  39. Wang, X.M.; Kim, H.P.; Song, R.; Choi, A.M. Caveolin-1 confers antiinflammatory effects in murine macrophages via the MKK3/p38 MAPK pathway. Am. J. Respir. Cell Mol. Biol. 2006, 34, 434–442. [Google Scholar] [CrossRef]
  40. Zuo, X.; Deguchi, Y.; Xu, W.; Liu, Y.; Li, H.S.; Wei, D.; Tian, R.; Chen, W.; Xu, M.; Yang, Y.; et al. PPARD and Interferon Gamma Promote Transformation of Gastric Progenitor Cells and Tumorigenesis in Mice. Gastroenterology 2019, 157, 163–178. [Google Scholar] [CrossRef] [PubMed]
  41. Claro da Silva, T.; Hiller, C.; Gai, Z.; Kullak-Ublick, G.A. Vitamin D3 transactivates the zinc and manganese transporter SLC30A10 via the Vitamin D receptor. J. Steroid Biochem. Mol. Biol. 2016, 163, 77–87. [Google Scholar] [CrossRef] [PubMed]
  42. Chen, L.; Jiao, T.; Liu, W.; Luo, Y.; Wang, J.; Guo, X.; Tong, X.; Lin, Z.; Sun, C.; Wang, K.; et al. Hepatic cytochrome P450 8B1 and cholic acid potentiate intestinal epithelial injury in colitis by suppressing intestinal stem cell renewal. Cell Stem Cell 2022, 29, 1366–1381.e1369. [Google Scholar] [CrossRef]
  43. Winslow, J.W.W.; Limesand, K.H.; Zhao, N. The Functions of ZIP8, ZIP14, and ZnT10 in the Regulation of Systemic Manganese Homeostasis. Int. J. Mol. Sci. 2020, 21, 3304. [Google Scholar] [CrossRef]
  44. van der Sluis, M.; Vincent, A.; Bouma, J.; Korteland-Van Male, A.; van Goudoever, J.B.; Renes, I.B.; Van Seuningen, I. Forkhead box transcription factors Foxa1 and Foxa2 are important regulators of Muc2 mucin expression in intestinal epithelial cells. Biochem. Biophys. Res. Commun. 2008, 369, 1108–1113. [Google Scholar] [CrossRef]
  45. Kerschner, J.L.; Gosalia, N.; Leir, S.H.; Harris, A. Chromatin remodeling mediated by the FOXA1/A2 transcription factors activates CFTR expression in intestinal epithelial cells. Epigenetics 2014, 9, 557–565. [Google Scholar] [CrossRef]
  46. Kerschner, J.L.; Harris, A. Transcriptional networks driving enhancer function in the CFTR gene. Biochem. J. 2012, 446, 203–212. [Google Scholar] [CrossRef]
  47. Gosalia, N.; Yang, R.; Kerschner, J.L.; Harris, A. FOXA2 regulates a network of genes involved in critical functions of human intestinal epithelial cells. Physiol. Genom. 2015, 47, 290–297. [Google Scholar] [CrossRef] [PubMed]
  48. Todorov, T.; Vutova, K.; Mechkov, G.; Georgiev, P.; Petkov, D.; Tonchev, Z.; Nedelkov, G. Chemotherapy of human cystic echinococcosis: Comparative efficacy of mebendazole and albendazole. Ann. Trop. Med. Parasitol. 1992, 86, 59–66. [Google Scholar] [CrossRef] [PubMed]
  49. Messaritakis, J.; Psychou, P.; Nicolaidou, P.; Karpathios, T.; Syriopoulou, B.; Fretzayas, A.; Krikos, F.; Matsaniotis, N. High mebendazole doses in pulmonary and hepatic hydatid disease. Arch. Dis. Child. 1991, 66, 532–533. [Google Scholar] [CrossRef]
  50. Simbulan-Rosenthal, C.M.; Dakshanamurthy, S.; Gaur, A.; Chen, Y.S.; Fang, H.B.; Abdussamad, M.; Zhou, H.; Zapas, J.; Calvert, V.; Petricoin, E.F.; et al. The repurposed anthelmintic mebendazole in combination with trametinib suppresses refractory NRASQ61K melanoma. Oncotarget 2017, 8, 12576–12595. [Google Scholar] [CrossRef] [PubMed]
  51. Younis, N.S.; Ghanim, A.M.H.; Saber, S. Mebendazole augments sensitivity to sorafenib by targeting MAPK and BCL-2 signalling in n-nitrosodiethylamine-induced murine hepatocellular carcinoma. Sci. Rep. 2019, 9, 19095. [Google Scholar] [CrossRef]
  52. Blom, K.; Senkowski, W.; Jarvius, M.; Berglund, M.; Rubin, J.; Lenhammar, L.; Parrow, V.; Andersson, C.; Loskog, A.; Fryknäs, M.; et al. The anticancer effect of mebendazole may be due to M1 monocyte/macrophage activation via ERK1/2 and TLR8-dependent inflammasome activation. Immunopharmacol. Immunotoxicol. 2017, 39, 199–210. [Google Scholar] [CrossRef]
  53. Williamson, T.; Bai, R.Y.; Staedtke, V.; Huso, D.; Riggins, G.J. Mebendazole and a non-steroidal anti-inflammatory combine to reduce tumor initiation in a colon cancer preclinical model. Oncotarget 2016, 7, 68571–68584. [Google Scholar] [CrossRef]
  54. Soto, H.; Massó, F.; Cano, S.; Díaz de León, L. Effects of mebendazole on protein biosynthesis and secretion in human-derived fibroblast cultures. Biochem. Pharmacol. 1996, 52, 289–299. [Google Scholar] [CrossRef]
  55. Eskandari, M.; Asgharzadeh, F.; Askarnia-Faal, M.M.; Naimi, H.; Avan, A.; Ahadi, M.; Vossoughinia, H.; Gharib, M.; Soleimani, A.; Naghibzadeh, N.; et al. Mebendazole, an anti-helminth drug, suppresses inflammation, oxidative stress and injury in a mouse model of ulcerative colitis. Sci. Rep. 2022, 12, 10249. [Google Scholar] [CrossRef]
  56. Wildenberg, M.E.; Levin, A.D.; Ceroni, A.; Guo, Z.; Koelink, P.J.; Hakvoort, T.B.M.; Westera, L.; Bloemendaal, F.M.; Brandse, J.F.; Simmons, A.; et al. Benzimidazoles Promote Anti-TNF Mediated Induction of Regulatory Macrophages and Enhance Therapeutic Efficacy in a Murine Model. J. Crohn’s Colitis 2017, 11, 1480–1490. [Google Scholar] [CrossRef]
  57. Eskandari, M.; Alkhafaji, A.H.; Al-Asady, A.M.; Naimi, H.; Ahmadzadeh, A.M.; Avan, A.; Vossoughinia, H.; Mehri, A.; Ryzhikov, M.; Khazaei, M.; et al. Mebendazole as an Adjunct Therapy with Mesalamine to Increase Efficacy and Maintenance Therapy for Ulcerative Colitis Patients: A Pilot Study. Curr. Pharm. Des. 2025. [Google Scholar] [CrossRef]
  58. Liu, T.J.; LaFortune, T.; Honda, T.; Ohmori, O.; Hatakeyama, S.; Meyer, T.; Jackson, D.; de Groot, J.; Yung, W.K. Inhibition of both focal adhesion kinase and insulin-like growth factor-I receptor kinase suppresses glioma proliferation in vitro and in vivo. Mol. Cancer Ther. 2007, 6, 1357–1367. [Google Scholar] [CrossRef]
  59. Zhang, J.; Gelman, I.H.; Qu, J.; Hochwald, S.N. Phosphohistidine signaling promotes FAK-RB1 interaction and growth factor-independent proliferation of esophageal squamous cell carcinoma. Oncogene 2023, 42, 449–460. [Google Scholar] [CrossRef]
Figure 1. Flow chart of this study.
Figure 2. Identifying UC-related SMGs: (A) volcano plot of gene expression differences between UC and control groups; (B) Venn diagram showing the intersection of SMGs and DEGs from UC; (C) PPI network illustrating interactions of the 29 intersecting genes; (D) GeneMANIA analysis of UC-related SMGs; (E) functional enrichment analysis of the UC-related SMGs.
Figure 3. Identification and characterization of two SMG-related molecular subtypes in UC: (A) heatmap showing the expression profiles of UC-SMGs across the two identified clusters (C1 and C2); (B) GSVA results showing differences in functional enrichment between C1 and C2, based on KEGG pathways; (C) GSVA results showing differences in functional enrichment between C1 and C2, based on GO gene sets; (D,E) immune cell infiltration patterns in C1 and C2 subtypes assessed by CIBERSORT, indicating subtype-specific immune characteristics. (* p < 0.05, ** p < 0.01, and *** p < 0.001).
Figure 4. Use of machine learning technology to construct a UC diagnostic model: (A,B) thirteen genes selected by LASSO regression. Different colors correspond to different variables; (C,D) nine genes identified using SVM-RFE; (E,F) six genes with importance ratings greater than 1 identified using RF. The black solid line shows the overall out-of-bag (OOB) error rate. The red dashed and green dotted lines represent the class-specific OOB error rates for the Control and UC groups, respectively; (G) Venn diagram displaying shared genes across the three machine learning models.
Figure 5. Diagnostic effect of the three-gene model on UC: (A) chromosomal locations of the three model genes; (B) box plots showing expression differences in CAV1, PPARG, and SLC30A10 between UC and normal samples in the training set; (C,D) ROC curves for the three genes and the model in the training set; (E,F) ROC curves for the three genes and the model in the validation set; (G) expression levels of inflammatory factors and the three model genes in the cell model. (ns, non-significant, * p < 0.05, ** p < 0.01, and *** p < 0.001).
Figure 6. Correlation analysis between model gene expression and immune cell infiltration in UC: (A) lollipop plot showing the correlations between CAV1 expression and various immune cell types; (B–D) individual correlation plots between CAV1 and neutrophils, resting CD4 memory T cells, and M1 macrophages, respectively; (E) lollipop plot showing correlations between PPARG and various immune cells; (F–H) detailed correlations of PPARG with M1 macrophages, activated CD4 memory T cells, and neutrophils; (I) lollipop plot showing correlations between SLC30A10 and various immune cells; (J–L) detailed correlations of SLC30A10 with M1 macrophages, neutrophils, and activated dendritic cells; (M) summary plot showing the correlations between the three model genes and immune cell subsets, along with inter-cellular correlations.
Figure 7. Molecular docking and molecular dynamics (MD) simulation analyses of PPARG–mebendazole and SLC30A10–NVP-TAE226 complexes to assess binding stability and interaction dynamics. Molecular docking models showing the binding poses of mebendazole with PPARG (A) and NVP-TAE226 with SLC30A10 (B); blue dashed lines indicate predicted hydrogen bonds at the active site. Root mean square deviation (RMSD) plots of the two protein–ligand complexes during 100 ns MD simulation, reflecting the overall structural stability of PPARG–mebendazole (C) and SLC30A10–NVP-TAE226 (D) systems. Root mean square fluctuation (RMSF) of protein residues in the PPARG–mebendazole (E) and SLC30A10–NVP-TAE226 (F) complexes, reflecting local flexibility upon ligand binding. Solvent-accessible surface area (SASA) of the PPARG–mebendazole (G) and SLC30A10–NVP-TAE226 (H) complexes during the simulation, reflecting potential conformational rearrangements. Radius of gyration (Rg) curves for the PPARG–mebendazole (I) and SLC30A10–NVP-TAE226 (J) complexes, assessing structural compactness. Number of hydrogen bonds formed between ligands and proteins in the PPARG–mebendazole (K) and SLC30A10–NVP-TAE226 (L) systems over time. Two- and three-dimensional free energy landscapes (FEL) illustrating conformational stability of the PPARG–mebendazole (M) and SLC30A10–NVP-TAE226 (N) complexes based on MD trajectories. Principal component analysis (PCA) covariance matrix heatmaps of the PPARG–mebendazole (O,P) and SLC30A10–NVP-TAE226 (Q,R) complexes, reflecting dynamic residue correlations.
Table 1. Potential drugs analyzed by CMap and molecular docking.
Pert_Iname	Moa	Target_Name	Norm_Cs	Free binding energy, CAV1 (kcal/mol)	Free binding energy, PPARG (kcal/mol)	Free binding energy, SLC30A10 (kcal/mol)
vemurafenib	RAF inhibitor	BRAF|CYP2C19|CYP3A4|CYP3A5|RAF1	−1.9033	−7	−8.7	−8
NVP-TAE226	Protein tyrosine kinase inhibitor	IGF1R|PTK2	−1.7499	−6.2	−8.7	−9.1
PD-160170	Neuropeptide receptor antagonist	NPY1R	−1.7369	−5.9	−7.8	−7.5
U-0126	MEK inhibitor	MAP2K1|MAP2K2|JAK2|AKT1|CHEK1|GSK3B|LCK|MAP2K7|MAPK1|MAPK11|MAPK12|MAPK14|MAPK8|PRKCA|RAF1|ROCK1|RPS6KB1|SGK1	−1.733	−6.2	−7.3	−6.9
selumetinib	MEK inhibitor	MAP2K1|MAP2K2	−1.7202	−4.9	−7.1	−7.1
amperozide	Dopamine receptor antagonist	HTR2A|DRD2|FAAH	−1.7026	−7	−8	−7.4
PD-158780	EGFR inhibitor	EGFR	−1.6983	−5.4	−8.2	−6.8
UNC-0321	Histone lysine methyltransferase inhibitor	EHMT2	−1.6751	−5.4	−7.8	−7
mebendazole	Tubulin inhibitor	TUBA1A|TUBB|TUBB4B	−1.674	−6.8	−8.9	−8
PP-2	Src inhibitor	SRC|LCK|ABL1|LYN|RIPK2	−1.6735	−5.3	−8	−7.2
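As a reading aid for Table 1, the following minimal Python sketch finds, for each model gene, the compound with the lowest (most favorable) docking energy. The values are transcribed from the table. The selection rule (lowest energy per target) and the names `DOCKING`, `TARGETS`, and `strongest_binders` are illustrative assumptions and not necessarily the authors' procedure.

```python
# Rows transcribed from Table 1: (compound, CAV1, PPARG, SLC30A10), in kcal/mol.
DOCKING = [
    ("vemurafenib", -7.0, -8.7, -8.0),
    ("NVP-TAE226", -6.2, -8.7, -9.1),
    ("PD-160170", -5.9, -7.8, -7.5),
    ("U-0126", -6.2, -7.3, -6.9),
    ("selumetinib", -4.9, -7.1, -7.1),
    ("amperozide", -7.0, -8.0, -7.4),
    ("PD-158780", -5.4, -8.2, -6.8),
    ("UNC-0321", -5.4, -7.8, -7.0),
    ("mebendazole", -6.8, -8.9, -8.0),
    ("PP-2", -5.3, -8.0, -7.2),
]
TARGETS = ("CAV1", "PPARG", "SLC30A10")


def strongest_binders(rows=DOCKING, targets=TARGETS):
    """Return {target: (lowest_energy, [compounds])}; ties keep table order.

    Assumption: a more negative docking energy means a more favorable pose.
    """
    best = {}
    for col, target in enumerate(targets, start=1):
        lowest = min(row[col] for row in rows)
        best[target] = (lowest, [row[0] for row in rows if row[col] == lowest])
    return best


if __name__ == "__main__":
    for target, (energy, compounds) in strongest_binders().items():
        print(f"{target}: {energy} kcal/mol -> {', '.join(compounds)}")
```

Under this rule the table gives mebendazole for PPARG and NVP-TAE226 for SLC30A10, which are the two complexes shown in Figure 7. For CAV1, vemurafenib and amperozide tie.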
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Li, Q.; Li, J.; Liu, S.; Zhang, Y.; Liu, J.; Wan, X.; Liang, G. Molecular Subtypes and Biomarkers of Ulcerative Colitis Revealed by Sphingolipid Metabolism-Related Genes: Insights from Machine Learning and Molecular Dynamics. Curr. Issues Mol. Biol. 2025, 47, 616. https://doi.org/10.3390/cimb47080616
