Article

A Dual-Gene Signature of PMAIP1 and GADD45A for Early Detection of Intrahepatic Cholangiocarcinoma in the Context of Primary Sclerosing Cholangitis

Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Int. J. Mol. Sci. 2026, 27(11), 4826; https://doi.org/10.3390/ijms27114826
Submission received: 1 April 2026 / Revised: 12 May 2026 / Accepted: 14 May 2026 / Published: 27 May 2026
(This article belongs to the Section Molecular Oncology)

Abstract

Primary sclerosing cholangitis (PSC) is a chronic inflammatory precursor associated with an increased risk of intrahepatic cholangiocarcinoma (ICC), yet identifying malignant features within the persistent inflammatory background remains challenging. In this study, a background-deviation framework was applied to explore malignant-associated determinants during PSC-associated cholangiocarcinogenesis. Single-cell RNA sequencing data from PSC, ICC tumor tissues, and adjacent non-tumor tissues were integrated, followed by functional enrichment, CellChat analysis, Monocle 2 pseudotime reconstruction, Non-negative Matrix Factorization (NMF), STRING/Cytoscape network analysis, and diagnostic signature construction using LASSO regression and exhaustive best subset selection. Single-cell profiling suggested disease-associated cellular remodeling, including cholangiocyte expansion in ICC samples. Functional and intercellular communication analyses indicated a putative transition from an immune-dominant PSC state toward a hyper-biosynthetic ICC-associated phenotype, accompanied by a possible MIF receptor-usage shift from CXCR4 to CD44. Monocle 2 and NMF further identified candidate malignant-associated trajectories and meta-programs, with MYC/TP63-related regulatory signals emerging as potential contributors. Based on these exploratory findings, best subset selection identified a two-gene transcriptomic candidate signature comprising PMAIP1 and GADD45A, which showed promising discriminative performance in internal cross-validation and an external tumor-versus-adjacent validation cohort. These findings provide a transcriptomic basis for further validation of PSC-associated cholangiocarcinogenesis and potential ICC surveillance markers.

1. Introduction

Intrahepatic cholangiocarcinoma (ICC) is a highly lethal malignancy arising from the biliary epithelium, characterized by rising global incidence and a 5-year survival rate below 20% [1,2]. Primary sclerosing cholangitis (PSC) is a chronic cholestatic liver disease defined by progressive biliary inflammation and fibrosis [3], serving as a major predisposing factor for ICC [4]. Clinical evidence indicates that the cumulative risk of hepatobiliary malignancy in patients with PSC reaches up to 20% over 30 years [5,6]. However, a critical clinical challenge remains: identifying early malignant transformation within a background of chronic inflammation is difficult, resulting in delayed diagnosis and poor prognosis.
Current understanding of the link between chronic inflammation and carcinogenesis posits that sustained pro-inflammatory signaling [7] creates a permissive microenvironment for tumor initiation [8]. However, the precise cellular evolutionary logic governing the transition from an inflammatory adaptive state to an overt malignant phenotype remains poorly defined. Previous genomic and transcriptomic studies have largely investigated PSC [9] and ICC [10] as separate entities, failing to comprehensively characterize the continuous dynamic changes that occur during disease progression. Crucially, it remains unclear whether malignancy arises as a de novo event replacing the inflammatory state, or if it evolves through the reprogramming of pre-existing inflammatory populations.
To address this issue, high-resolution mapping of cellular and transcriptional heterogeneity from inflammation-associated states to malignancy is needed. Single-cell RNA sequencing (scRNA-seq) [11] offers an opportunity to characterize cholangiocyte states and dissect transcriptional heterogeneity within complex tissue microenvironments. By integrating transcriptomic profiles across the PSC–ICC spectrum—comprising benign inflammatory tissues, paired adjacent non-tumor tissues, and established tumors—we aimed to define transcriptional patterns associated with inflammatory adaptation, peritumoral remodeling, and malignant progression. This strategy enabled us to distinguish “shared scaffolds” retained across disease states from “specific deviations” enriched in malignant-associated cholangiocyte programs. Here, the shared scaffold refers to conserved inflammatory and stress-response programs that are present across PSC, ICC-adjacent tissue, and ICC tumor tissue, whereas specific deviations denote malignant-enriched transcriptional, metabolic, or microenvironmental features that emerge beyond this common inflammatory background.
In this study, we performed an integrated single-cell transcriptomic analysis of PSC-associated cholangiocarcinogenesis and proposed a background-deviation framework. Rather than supporting a simple linear model in which inflammation is replaced by malignancy, our findings suggest that malignant-associated cholangiocyte states may emerge through the superimposition of hyper-biosynthetic and oncogenic stress programs upon a persistent inflammatory scaffold. Building on this framework, we identified PMAIP1 and GADD45A as a candidate dual-gene transcriptomic signature that captures malignant-associated deviation from the inflammatory background. This work provides a conceptual and molecular basis for further investigation of PSC-associated cholangiocarcinogenesis. The overall workflow for identifying malignant progression determinants and diagnostic signatures is summarized in Figure 1.

2. Results

2.1. Comparative Profiling of the Cellular Landscape Across the PSC–ICC Spectrum Through Single-Cell Analysis

To elucidate the cellular transitions associated with PSC-associated cholangiocarcinogenesis, an integrated single-cell RNA sequencing (scRNA-seq) dataset was constructed (Figure S1, Section 4). This dataset incorporated data from primary sclerosing cholangitis (PSC; n = 4) and intrahepatic cholangiocarcinoma (ICC; n = 3 tumors with n = 3 matched adjacent non-tumor tissues), capturing the inflammatory, peritumoral, and malignant states across the disease spectrum.
Cell types were identified based on canonical marker genes and the CellMarker database [12]. We visualized the resulting cellular landscape using Uniform Manifold Approximation and Projection (UMAP), defining eight major lineages: T cells, B cells, NK cells, macrophages, cholangiocytes, hepatocytes, endothelial cells, and fibroblasts (Figure 2A–C).
Cell-type composition showed disease-associated differences across PSC, ICC-adjacent tissue (ICC-Adj), and ICC tumor tissue (ICC-Tumor) (Figure 2D). PSC samples displayed an immune-enriched architecture, characterized by relatively higher proportions of macrophages and lymphoid populations, including T/NK cells and B cells. In contrast, cholangiocytes were more abundant in ICC-Tumor samples, whereas ICC-Adj samples showed a prominent T-cell component. Across the three tissue states, cholangiocyte abundance appeared to increase from PSC to ICC-Tumor, while macrophage abundance showed a decreasing pattern, and fibroblast proportions remained comparatively stable (Figure 2D and Figure S2).
Given that cholangiocytes are widely regarded as the putative cell of origin for ICC [13], their relative enrichment in ICC-Tumor samples provided a cellular basis for subsequent cholangiocyte-focused analyses. To further distinguish malignant-like cholangiocytes, CopyKAT-based CNV inference was applied to the cholangiocyte population. The resulting CNV heatmap showed broad CNV-like alterations in CopyKAT-defined malignant cholangiocytes, supporting their malignant-associated aneuploid annotation for downstream analyses (Figure 2E). Collectively, these dataset-level observations provided an initial single-cell context for investigating epithelial and microenvironmental remodeling across the PSC–ICC spectrum.

2.2. Functional Divergence and Metabolic Reprogramming of Cholangiocytes During Malignant Progression

Following the characterization of distinct cellular shifts across disease states, further investigation focused on the functional heterogeneity of cholangiocytes in PSC, ICC-adjacent, and ICC-Tumor tissues. Enrichment analysis was performed based on differentially expressed genes (DEGs) identified across these conditions (Tables S1 and S2; Figures S3–S6). Functional enrichment analysis suggested that NF-κB-related inflammatory programs, particularly HALLMARK_TNFA_SIGNALING_VIA_NFKB, were commonly represented across PSC [14] and ICC samples (Tables S3–S6; Figure S7). This finding indicates that pro-inflammatory signaling may persist as a shared inflammatory scaffold throughout the PSC–ICC spectrum. Consequently, inflammation-related transcriptional activity alone may be insufficient to distinguish malignancy-associated epithelial states from the benign inflammatory background (Figure 3A).
Despite this shared inflammatory background, cholangiocytes from different tissue states displayed distinct functional enrichment patterns (Figure 3B–D). In the PSC stage, cholangiocyte function was primarily associated with immune response interactions, characterized by the enrichment of “T cell receptor signaling” and “antigen processing and presentation”. This reflects a reactive state in which biliary epithelial cells actively participate in mucosal immunity. In stark contrast, ICC-Tumor cholangiocytes exhibited a fundamental transition toward a hyper-biosynthetic and bioenergetic phenotype. The malignant state was defined by the marked upregulation of ribosome biogenesis (e.g., “rRNA processing,” “ribosome large subunit assembly”) and oxidative phosphorylation. To provide quantitative support for this observation, module scores for predefined representative gene programs related to biosynthesis, energy metabolism, and oncogenic transcriptional regulation were calculated in cholangiocytes. These scores increased stepwise from PSC to ICC-adjacent and ICC-tumor cholangiocytes, supporting a malignant-associated hyper-biosynthetic and bioenergetic transcriptional state (Table S7). This enrichment of protein synthesis machinery and mitochondrial respiratory pathways suggests that the core feature of malignant transformation is driven by specific metabolic rewiring rather than a mere continuation of inflammation. Additionally, cholangiocytes in ICC-adjacent tissues displayed an intermediate “metabolic adaptation” phenotype, characterized by the enrichment of fatty acid degradation (Table S2).
Collectively, these findings support the background-deviation framework: PSC and ICC share a generalized inflammatory scaffold, whereas ICC-Tumor cholangiocytes exhibit additional malignant-associated biosynthetic and metabolic features superimposed upon this background.

2.3. Remodeling of Cholangiocyte-Centered Intercellular Communication Networks

While functional profiling delineates the intrinsic metabolic reprogramming characteristic of the malignant state, these cellular adaptations are inevitably shaped by extrinsic microenvironmental signals. To delineate putative extrinsic microenvironmental changes across PSC, ICC-Adj, and ICC-Tumor states, intercellular communication patterns were inferred using CellChat (Figure 4A–C). The analysis suggested a topological reconfiguration of the cellular communication network. In the PSC stage, the inferred network was mainly centered on interactions involving macrophages and fibroblasts. In ICC-Tumor, fibroblasts and cholangiocytes emerged as prominent communication hubs, suggesting enhanced epithelial–stromal communication in the malignant tissue context (Figures S8–S10).
Integration of global pathway-level analysis with ligand-receptor (L-R) profiling suggested two layers of microenvironmental remodeling (Tables S8–S10; Figure S11). First, a conserved signaling foundation was observed across the disease spectrum. A total of 34 signaling pathways, including MIF, Prostaglandin, and VEGF, were shared among PSC, adjacent, and tumor tissues (Figure 4D, Figures S12 and S13). At the ligand-receptor level, inflammatory interactions such as PGE2-PTGER4 and MIF-(CD74+CXCR4) were broadly detected across tissue states. These findings are consistent with the presence of a shared inflammatory scaffold (Figure 4D).
Superimposed on this shared background, ICC-Tumor tissues showed enrichment of tumor-associated pathways and a potential “receptor switch” pattern (Figure S14). The analysis identified 33 pathways preferentially enriched in the ICC tumor microenvironment, including invasive or tumor-associated modules such as ApoE, HGF, and TRAIL. This pattern was accompanied by preferential detection of the SPP1-CD44 and MDK-NCL signaling axes in tumor and adjacent tissues, while PSC-associated immune pathways, such as IL1 and CD45, were relatively attenuated.
Notably, a shift in cellular response to conserved MIF signaling was observed. Although the MIF ligand was broadly present, its inferred receptor usage appeared to shift from the inflammatory background-associated MIF-(CD74+CXCR4) axis toward the ICC-associated MIF-(CD74+CD44) axis. The detection of MIF-(CD74+CD44) signaling in adjacent non-tumor tissues may be consistent with field-like peritumoral remodeling [15,16], in line with the concept that tumor-adjacent tissues can harbor molecular alterations distinct from truly healthy tissues. However, this pattern may also reflect tumor-adjacent inflammatory or stromal responses, differences in tissue sampling, cell-state composition, or dataset-related batch effects between PSC and ICC cohorts. In summary, both metabolic and microenvironmental profiles support a “background-deviation” pattern, in which malignant-associated features are superimposed upon a conserved inflammatory scaffold through altered ligand-receptor usage.

2.4. Identification of Transcriptional Drivers and Regulatory Networks Governing Malignant Lineage Specification

Following the macroscopic delineation of specific malignant features and the shared inflammatory background, the dynamic transcriptional logic governing lineage specification was investigated. To identify the molecular drivers regulating the transition from a shared inflammatory state to a malignant fate, the developmental trajectory of cholangiocytes was reconstructed using Monocle 2 (Figures S15–S21). The analysis revealed that all cholangiocytes originate from a common transcriptional root (Figure 5A,B), regardless of their ultimate fate. This root is characterized by a “shared stress response,” defined by the transient induction of immediate-early genes (IEGs) [17], including JUN, CYR61, and CXCL family members. As cells progressed along the trajectory, this initial stress signature diminished, while lineage-specific markers, such as S100 family members and the antioxidant regulator PRDX1, gradually accumulated (Figures S22 and S23).
The trajectory bifurcated into two distinct termini: an “inflammatory adaptation branch,” predominantly populated by PSC cells, and a “malignant progression branch,” primarily composed of ICC-Tumor cells. To isolate the drivers directing this bifurcation, Branched Expression Analysis Modeling (BEAM) was applied (Figure 5C; Table S11). A gene module specifically activated along the malignant branch was identified, showing significant enrichment in ribosome biogenesis, oxidative phosphorylation, and cell cycle progression (Figure 5D). This provides a mechanistic link to the hyper-biosynthetic phenotype observed earlier, confirming the metabolic switch as an intrinsic developmental process rather than a passive byproduct.
Screening for upstream regulators within this module highlighted MYC- and TP63-associated regulatory signals as candidate contributors to the malignant-associated branch (Table S12). MYC [18] serves as a critical node linking extrinsic signals to internal metabolic reprogramming. Meanwhile, the induction of TP63 [19], along with stemness factors (KLF4) [20] and EMT regulators (SNAI2), suggests that malignant cholangiocytes acquire a dedifferentiated state to survive microenvironmental pressure. These trajectory-driving genes represent specific “markers” that distinguish malignancy from the inflammatory background, providing a high-confidence candidate pool for the construction of the diagnostic signature.

2.5. NMF Meta-Program Analysis Decouples Stable and Specific Malignant Transcriptional Programs

While pseudotime analysis reconstructed the dynamic evolutionary trajectory of malignancy, mathematical deconvolution was required to isolate stable, tumor-specific signals from the pervasive inflammatory background. To define fixed transcriptional meta-programs (MPs) characterizing each disease state and to differentiate malignant features from shared signals, Non-negative Matrix Factorization (NMF) was employed.
Six recurrent meta-programs were identified in the present dataset (Figure 6A; Tables S13–S18). Intriguingly, peritumoral cholangiocytes (derived from ICC-Adj) were not transcriptionally quiescent but exhibited a distinct stress-adaptive phenotype, predominantly characterized by MP4 and MP6. MP4 was enriched in the “unfolded protein response (UPR)” and “protein refolding,” suggesting that adjacent cells activate specific adaptation mechanisms to survive the perturbed peritumoral microenvironment. MP6 was defined by the high expression of MHC-I antigen presentation genes, indicating that these cells maintain immune visibility, which contrasts with the immune evasion typical of established tumors. These findings are consistent with the “shared stress response” identified in the pseudotime analysis, suggesting that adjacent tissues represent an intermediate state where cells experience environmental stress prior to full malignant reprogramming (Figure 6B and Figure S24).
Crucially, MP5 was identified as a malignancy-specific program, distinct from the inflammatory programs observed in PSC (Figure 6B and Figure S25). Functionally, MP5 was enriched in EGFR/ERBB signaling and the negative regulation of protein kinase activity. This meta-program represents the specific oncogenic module superimposed upon the background inflammatory stress. Additionally, MP3, enriched in nuclear division, was co-activated in ICC-Tumor, further supporting the proliferative capacity driven by the MYC regulatory network. Thus, MP5 provided an additional layer of support for the malignant-associated transcriptional patterns suggested by pseudotime analysis, although its stability requires further validation in larger datasets.

2.6. Integrative Identification and Validation of a Malignant-Specific Diagnostic Signature Within the PSC-ICC Spectrum

To bridge the gap between high-dimensional single-cell data and clinical applicability, an integrative approach was employed to develop a robust diagnostic signature capable of distinguishing early ICC from the pervasive inflammatory background of PSC. This strategy intersected lineage-driving genes identified from pseudotime analysis with the malignant-specific meta-program (MP5) derived from NMF deconvolution, resulting in a high-confidence pool of 28 candidate genes (Table S19). To prioritize these candidates by network connectivity, a Protein–Protein Interaction (PPI) network was constructed using STRING (v12.5) and Cytoscape (v3.10.4). The top five hub genes (SFN, PMAIP1, GADD45A, CDKN1A, and PLK3) were ranked by degree centrality and carried forward to downstream diagnostic modeling (Figure 6C; Table S20).
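The hub-gene ranking described above can be illustrated with a minimal sketch of degree centrality on an undirected network. This is not the STRING/Cytoscape workflow used in the study; the function name `top_hubs_by_degree` and the toy edge list with placeholder node names are assumptions introduced purely for illustration.

```python
from collections import defaultdict


def top_hubs_by_degree(edges, k=5):
    """Rank nodes of an undirected network by degree (number of distinct neighbours)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by degree (descending), then by name so ties are resolved deterministically
    ranked = sorted(neighbours, key=lambda n: (-len(neighbours[n]), n))
    return [(n, len(neighbours[n])) for n in ranked[:k]]


# Toy edge list with placeholder node names (hypothetical, not real interactions)
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
hubs = top_hubs_by_degree(toy_edges, k=2)
print(hubs)
```

In practice the edge list would come from an exported interaction table, and the tie-breaking rule would need to be stated explicitly, since several nodes can share the same degree.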
Subsequently, we applied an exhaustive “best subset selection” method based on multivariate logistic regression to optimize feature selection (Table S20; Figure S26). This approach identified a two-gene candidate transcriptomic signature comprising PMAIP1 and GADD45A. By leveraging the synergistic signaling of stress-response and pro-apoptotic regulators, the signature effectively captures the malignant “deviations” superimposed on the inflammatory baseline. These genes undergo specific transcriptional reconfiguration during PSC-associated cholangiocarcinogenesis, reflecting core oncogenic mechanisms. In the discovery cohort, internal leave-one-out cross-validation (LOOCV) yielded an AUC of 0.917 (Figure 6D), suggesting promising discriminative performance within the available dataset while indicating that the model performance should be interpreted in the context of the discovery-cohort design. The PMAIP1/GADD45A signature was further evaluated in the independent external cohort GSE107943. In this dataset, the control background comprised adjacent non-tumor tissues rather than inflammatory PSC samples; therefore, this analysis was interpreted as a tumor-versus-adjacent validation rather than direct validation in a PSC surveillance setting. Within this context, the signature showed high discriminative performance, with an AUC of 0.947 (Figure 6E). This result supports preliminary cross-dataset tumor-discriminative potential.
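The enumeration step of exhaustive best subset selection and the AUC metric reported above can be sketched as follows. This is a simplified illustration under stated assumptions: each subset is scored by the unweighted sum of its features, which stands in for the multivariate logistic regression with leave-one-out cross-validation used in the study, and the feature names `g1` to `g3` and all values are invented toy data.

```python
from itertools import combinations


def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_subset(data, labels, features):
    """Score every non-empty feature subset; return (best AUC, best subset)."""
    best = (-1.0, ())
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            # Assumption: a plain sum replaces the fitted logistic model
            scores = [sum(row[f] for f in subset) for row in data]
            value = auc(scores, labels)
            if value > best[0]:  # on ties, the earlier (smaller) subset is kept
                best = (value, subset)
    return best


# Invented toy data: two controls (label 0) followed by two tumors (label 1)
toy_data = [
    {"g1": 1, "g2": 0, "g3": 5},
    {"g1": 0, "g2": 1, "g3": 1},
    {"g1": 2, "g2": 0, "g3": 3},
    {"g1": 0, "g2": 2, "g3": 0},
]
toy_labels = [0, 0, 1, 1]
result = best_subset(toy_data, toy_labels, ["g1", "g2", "g3"])
print(result)
```

With five hub genes the search covers only 31 non-empty subsets, which is why an exhaustive search is feasible here. Keeping the smaller subset on ties favors parsimony, a design choice that matters when sample sizes are small.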
To assess model robustness and benchmark against a standard algorithm, we compared the signature with a LASSO logistic regression model (Figure S27). Although LASSO selected GADD45A and CDKN1A as predictors, this combination yielded a lower AUC of 0.788 in the external validation dataset (Figure S28). The higher external AUC of the best subset model (0.947 versus 0.788) supports the diagnostic value of the PMAIP1/GADD45A combination. This signature may reflect a biologically plausible stress-superimposition pattern associated with malignancy. Nevertheless, its generalizability and potential utility for monitoring malignant progression in high-risk PSC patients require further evaluation in independent PSC-specific and longitudinal cohorts.

3. Discussion

This study delineates the transcriptional landscape underlying the PSC-ICC spectrum and provides support for the “background-deviation” framework. Malignant-associated progression was characterized by a phenotypic shift toward hyperbiosynthesis and microenvironmental remodeling involving the MIF-CD44 axis. Trajectory analysis highlighted MYC/TP63-related regulatory signals potentially associated with this deviation, while NMF helped decouple malignant-associated transcriptional programs from pervasive inflammatory signals. Protein network analysis further refined the biological connectivity of candidate drivers and prioritized hub genes for downstream modeling. This integrative workflow ultimately identified PMAIP1 and GADD45A as a candidate transcriptomic signature that characterizes malignancy as a stress-superimposition state rather than a mere escalation of inflammation.
The existing literature supports the critical role of PMAIP1 [21] and GADD45A in stress responses and validates their capacity to distinguish malignancy. PMAIP1 encodes NOXA, a BH3-only pro-apoptotic protein [22], and its transcriptional upregulation may serve as an indicator of metabolic and hypoxic stress. Our functional analysis revealed a hyper-biosynthetic phenotype in ICC, a state inevitably accompanied by the Unfolded Protein Response (UPR) and hypoxia. Consistent with this, studies indicate that PMAIP1 transcription is rigorously induced by endoplasmic reticulum (ER) stress and hypoxia [23]. Although malignant cells may attenuate PMAIP1 protein levels via the SAG-UPS system [24] to evade apoptosis, the elevated PMAIP1 transcript levels observed here may still reflect persistent “oncogenic stress” at the mRNA level. Thus, PMAIP1 should be interpreted in this study as a transcriptomic stress indicator that helps distinguish metabolically burdened malignant-associated cells from the inflammatory background.
Complementing this, GADD45A functions as a specific monitor for genomic instability [25]. Our trajectory analysis identified the MYC/TP63 network as the driver of the malignant branch, and MYC-driven proliferation [26] is a well-established inducer of replication stress. The sharp upregulation of GADD45A serves as a direct cellular response to this replication pressure [27] and DNA damage [28]. Unlike the chronic, low-level damage response observed in benign PSC, the elevation of GADD45A in the signature may reflect genomic stress associated with malignant transformation. In summary, the PMAIP1/GADD45A signature characterizes a transcriptomic pattern combining metabolic and genomic stress, thereby helping distinguish malignant-associated deviation from the inflammatory background.
Beyond its direct diagnostic utility, the identification of the PMAIP1/GADD45A signature offers a conceptual refinement to the current understanding of PSC-associated carcinogenesis. Traditionally, clinical surveillance of PSC has relied on detecting cellular proliferation or structural dysplasia [29,30]. However, within a chronic inflammatory context, benign reactive biliary lesions frequently exhibit high proliferative indices, limiting the specificity of such markers [31]. The synergistic upregulation of PMAIP1 and GADD45A indicates that malignant cholangiocytes operate under intense metabolic and genomic stress, which is distinct from the homeostatic stress of benign inflammation. Consequently, this study proposes a shift in early detection strategies: moving from monitoring non-specific inflammatory progression to targeting the specific molecular footprints of oncogenic stress.
Importantly, this stress-based signature also provides a clinically interpretable link to therapeutic vulnerabilities associated with malignant stress adaptation. PMAIP1/NOXA places this marker within the mitochondrial apoptosis-regulatory network, and this connection is translationally relevant because BH3-mimetic strategies have already shown activity in cholangiocarcinoma-related models. For example, obatoclax, a pan-BCL-2 family antagonist with activity against MCL1, induced Bax activation and apoptosis in cholangiocarcinoma cells and produced antitumor effects in an orthotopic cholangiocarcinoma model [32]. In addition, recent evidence in intrahepatic cholangiocarcinoma showed that inhibition of the stress-response regulator HSF1 could be potentiated by the Bcl-xL/Bcl-2/Bcl-w inhibitor ABT-263 (navitoclax), further supporting the therapeutic relevance of apoptosis-threshold modulation in this disease context [33]. In parallel, GADD45A is closely linked to DNA-damage and replication-stress responses, connecting this signature to DNA damage response-directed treatment strategies. Consistently, homologous recombination deficiency has been reported in a subset of biliary tract cancers, supporting the potential use of PARP inhibitors and DNA-damaging regimens [34]. Moreover, combined PARP/ATR inhibition with olaparib and ceralasertib/AZD6738 has shown antitumor activity in biliary tract cancer models [35]. Together, these findings suggest that the PMAIP1/GADD45A signature not only distinguishes malignant-associated stress from the chronic inflammatory background, but it also highlights apoptosis-threshold and DNA-damage-response pathways as clinically relevant axes for therapeutic exploration in PSC-associated cholangiocarcinogenesis.
Our study has several limitations. First, the integrated scRNA-seq analysis was based on four PSC samples, three ICC tumor samples, and three adjacent non-tumor samples. Although this dataset enabled an exploratory comparison across inflammatory, peritumoral, and malignant tissue states, the observed differences in cell-type composition, pseudotime trajectories, NMF-derived meta-programs, and CellChat-inferred interactions may be influenced by inter-individual heterogeneity, tissue sampling variation, dataset-related technical effects, and incompletely available clinical covariates. In particular, the curated metadata revealed an imbalanced sex distribution, with all PSC samples from GSE247128 derived from male patients and the ICC samples from GSE138709 derived from both female and male patients (Table S22). Moreover, age, PSC disease duration, and detailed medication history were incompletely available. Given the limited patient-level sample size and incomplete covariate annotation, formal adjustment for these potential confounders was not statistically robust. Therefore, these single-cell findings should be interpreted as exploratory patterns rather than definitive causal evidence of PSC-to-ICC progression. Second, the external validation cohort consisted of ICC tumor and adjacent non-tumor tissues rather than PSC inflammatory tissues. Thus, the external AUC should be viewed as preliminary evidence of tumor-discriminative potential across datasets, but not as a direct validation of clinical surveillance performance in PSC patients. Third, the PMAIP1/GADD45A signature was constructed from transcriptomic data, and protein-level validation by immunohistochemistry or functional assays remains necessary before clinical translation.
In conclusion, this study proposes a background-deviation framework in which malignant-associated stress programs are superimposed upon a persistent inflammatory scaffold. The PMAIP1/GADD45A signature represents a candidate transcriptomic feature linked to this model, providing a molecular basis for further validation of PSC-associated cholangiocarcinogenesis.

4. Materials and Methods

4.1. Data Acquisition and Integration

Transcriptomic datasets were retrieved from the Gene Expression Omnibus (GEO; National Center for Biotechnology Information, U.S. National Library of Medicine, Bethesda, MD, USA) (https://www.ncbi.nlm.nih.gov/geo/, accessed on 6 March 2026). To profile the cellular landscape of PSC-associated cholangiocarcinogenesis, scRNA-seq data from primary sclerosing cholangitis (PSC; GSE247128) and intrahepatic cholangiocarcinoma (ICC; GSE138709) were integrated, yielding 10 samples encompassing PSC inflammatory tissues (n = 4), ICC-Tumor tissues (n = 3), and adjacent non-tumor tissues of intrahepatic cholangiocarcinoma (ICC-Adj; n = 3) (Table S23). Datasets were included if they were derived from human liver or biliary tissues, had clearly annotated disease or tissue origins, and provided expression matrices suitable for downstream analysis. Samples lacking clear tissue annotation, disease-state information, or usable expression data were excluded. Available patient- and sample-level demographic and clinical metadata for these GEO scRNA-seq datasets were manually curated from GEO SOFT files, BioSample/SRA records, and the Supplementary Materials of the original publications, and they are summarized in Table S21. The curated variables included sample identity, disease group, tissue type, sex, age, disease duration, medication or treatment status, and other available clinical or pathological characteristics. Variables not reported in the public GEO/SRA metadata or the corresponding Supplementary Materials were marked as “not available”. For external validation, an independent bulk RNA-seq cohort (GSE107943; 57 samples, including 30 ICC tumor samples and 27 adjacent non-tumor samples) comprising paired ICC tumor and adjacent tissues was analyzed. Data processing was conducted in R software (v4.4.1; R Foundation for Statistical Computing, Vienna, Austria) using Seurat (v5.3.0; Satija Lab, New York Genome Center, New York, NY, USA). 
Quality control was performed by excluding cells with fewer than 300 or more than 4500 detected genes, or with more than 15% mitochondrial counts [36].
The 15% mitochondrial threshold was selected as a permissive cutoff to avoid excessive depletion of hepatobiliary epithelial cells under inflammatory and tumor-associated stress conditions. In PSC and ICC tissues, cholangiocytes may exhibit increased mitochondrial transcription associated with metabolic stress, oxidative phosphorylation, hypoxia-related responses, and tissue dissociation-associated stress. Therefore, an overly stringent mitochondrial threshold may remove biologically relevant stress-associated cholangiocytes.
To evaluate whether the mitochondrial threshold influenced the key marker-level findings, a focused sensitivity analysis was further performed using a stricter 10% mitochondrial cutoff. Specifically, the annotated single-cell object was subsetted to retain only cells with mitochondrial gene percentages ≤ 10%. Cell retention, cholangiocyte distribution across disease groups, CopyKAT-based malignant/non-malignant cholangiocyte composition, and PMAIP1/GADD45A expression patterns were then compared between the original 15% object and the strict 10% subset. A simple dual-gene expression score, calculated as the mean normalized expression of PMAIP1 and GADD45A at the single-cell level, was used only for this sensitivity analysis. The corresponding mitochondrial-threshold assessment and focused 10% cutoff sensitivity results are provided in Figure S29 and Table S24. After log-normalization, doublets were identified and removed using DoubletFinder (v2.0.6; University of California, San Francisco, San Francisco, CA, USA) [37].
Cell-cycle scores were calculated based on S- and G2/M-phase markers and regressed out during scaling to mitigate proliferation-associated variation.
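As an illustration of the quality-control rule described above, the following minimal Python sketch applies the stated thresholds (300 to 4500 detected genes, at most 15% mitochondrial counts) and the stricter 10% cutoff used in the sensitivity analysis. The study itself used Seurat in R; the `CellQC` class, the `passes_qc` helper, and the toy cells are hypothetical and serve only to make the filtering logic explicit.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical container, not a Seurat structure)."""
    barcode: str
    n_genes: int       # number of detected genes
    pct_mito: float    # percentage of counts from mitochondrial genes


def passes_qc(cell, min_genes=300, max_genes=4500, max_pct_mito=15.0):
    """Keep cells with 300-4500 detected genes and <= max_pct_mito % mitochondrial counts."""
    return min_genes <= cell.n_genes <= max_genes and cell.pct_mito <= max_pct_mito


# Toy cells, invented for illustration only.
cells = [
    CellQC("c1", 250, 5.0),    # too few genes
    CellQC("c2", 1200, 12.0),  # kept at 15%, dropped at 10%
    CellQC("c3", 5000, 3.0),   # too many genes
    CellQC("c4", 2000, 18.0),  # too much mitochondrial signal
    CellQC("c5", 900, 8.0),    # kept under both cutoffs
]

kept_15 = [c.barcode for c in cells if passes_qc(c)]
kept_10 = [c.barcode for c in cells if passes_qc(c, max_pct_mito=10.0)]
print(kept_15, kept_10)
```

Comparing `kept_15` with `kept_10` mirrors the cell-retention comparison between the original 15% object and the strict 10% subset.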

4.2. Batch Correction and Cell Type Annotation

Batch effects across datasets were corrected using Harmony (v1.2.3; Broad Institute, Cambridge, MA, USA) [38] on the top 30 principal components derived from 3000 highly variable genes [39]. Dimensionality reduction was performed with Uniform Manifold Approximation and Projection (UMAP) [40], and clustering was conducted using the Seurat functions FindNeighbors and FindClusters (resolution = 0.3). Cell types were annotated based on established marker genes with reference to the CellMarker database (CellMarker 2.0; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China) [12].

4.3. Identification of Malignant Cells

Malignant cholangiocytes were identified using CopyKAT (v1.1.0; Navin Lab, University of Texas MD Anderson Cancer Center, Houston, TX, USA) [41], which infers large-scale copy number variation (CNV) profiles from scRNA-seq data via Bayesian segmentation. Cholangiocytes were classified into aneuploid and diploid categories according to inferred CNV patterns [42] (Figure S30). In the primary binary classification used for downstream analyses, aneuploid cholangiocytes were defined as malignant, whereas non-aneuploid cholangiocytes were treated as non-malignant. Cells with available CopyKAT CNA profiles were used to generate the CNV heatmap (Figure 2E). CopyKAT-derived CNV burden was calculated from the inferred CNA matrix as a quantitative measure of CNV deviation, and sensitivity analyses were performed by evaluating PMAIP1/GADD45A expression across CNV-burden-based threshold settings (Figure S31; Tables S25–S27).
InferCNV (v1.27.0; Broad Institute of MIT and Harvard, Cambridge, MA, USA) was further used as an orthogonal CNV inference approach. Non-tumor cholangiocytes from PSC inflammatory tissues, adjacent non-tumor tissues, and ICC-Tumor samples were selected as lineage-matched reference cells, while ICC-tumor cholangiocytes were treated as observation cells. InferCNV-derived gene-level CNV burden was calculated as the mean absolute deviation of the final InferCNV expression matrix from the neutral baseline (Figure S32; Table S28).
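The CNV burden described above can be sketched as a per-cell mean absolute deviation from a neutral baseline. The Python example below is a minimal sketch under stated assumptions: the neutral baseline is taken to be 1.0, the function name `cnv_burden` is hypothetical, and the toy values are invented rather than taken from the study.

```python
def cnv_burden(values, baseline=1.0):
    """Mean absolute deviation of per-gene CNV-modified expression from a neutral baseline.

    `baseline=1.0` is an assumption made for this sketch.
    """
    if not values:
        raise ValueError("at least one gene-level value is required")
    return sum(abs(v - baseline) for v in values) / len(values)


# Toy per-cell profiles (invented): one near-neutral cell and one with gains and losses.
cell_profiles = {
    "reference_like": [1.00, 1.01, 0.99, 1.00],
    "aberrant_like": [1.20, 0.80, 1.15, 0.85],
}

burden = {cell: cnv_burden(vals) for cell, vals in cell_profiles.items()}
print(burden)
```

A higher burden indicates a larger overall departure from the neutral state, which is the quantity compared across threshold settings in the sensitivity analyses.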

4.4. Pseudotime Trajectory and Transcriptional Regulation Analysis

Pseudotime trajectory analysis was performed using Monocle 2 (v2.10.0; Trapnell Lab, University of Washington, Seattle, WA, USA) [43] to reconstruct the evolutionary lineage of cholangiocytes. The DDRTree algorithm was used for dimensionality reduction and cell ordering. To identify key molecular drivers of lineage fate (inflammatory vs. malignant), Branched Expression Analysis Modeling (BEAM) [43] was applied. Genes with significant branch-dependent expression patterns (q-value < 1 × 10⁻⁴) were identified.
To uncover the regulatory network driving the malignant trajectory, a list of human transcription factors (TFs) [44] was retrieved from the org.Hs.eg.db annotation package (v3.22.0; Bioconductor, Buffalo, NY, USA) using the AnnotationDbi package (v1.66.0; Bioconductor, Buffalo, NY, USA). The select function was used to query genes associated with the Gene Ontology term “DNA-binding transcription factor activity” (GO:0003700) [44]. These TFs were then intersected with the BEAM-significant genes to identify key regulatory drivers.

4.5. Non-Negative Matrix Factorization (NMF) Analysis

To decouple stable malignant transcriptional programs from the inflammatory background, Non-negative Matrix Factorization (NMF) [45] was performed using the NMF R package (v0.28; CRAN, R Foundation for Statistical Computing, Vienna, Austria). The analysis was conducted independently on four subgroups (PSC non-malignant, ICC-Tumor malignant, ICC-Tumor non-malignant, and ICC-adjacent non-malignant cholangiocytes) using the “brunet” algorithm. Factorization rank (k) was tested from 4 to 9, with 30 runs (nrun = 30) for each k. To assess the stability of NMF decomposition across the tested ranks, consensus-based diagnostic metrics were calculated, including the cophenetic correlation coefficient, dispersion, reconstruction residuals, silhouette width, silhouette consensus, and sparseness of the W and H matrices. The corresponding rank-stability diagnostic plots and numerical metrics are provided in Figure S33 and Table S29, respectively. Robust programs were defined based on recurrent top-gene overlap across different ranks within the same subgroup using Jaccard similarity, and meta-programs (MPs) [46] were identified by hierarchical clustering of these robust programs based on Jaccard similarity. The robust-program clustering structure was further evaluated using hierarchical clustering and within- versus between-meta-program Jaccard similarity analysis (Figure S33). The top 50 genes with the highest feature scores in each MP were defined as the core gene signatures.
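To make the robust-program criterion concrete, the following Python sketch computes Jaccard similarity between top-gene sets and flags programs that recur at a different factorization rank. It is a simplified sketch: the 0.5 threshold, the `recurrent_programs` helper, and the toy gene sets are assumptions for illustration, since the exact overlap cutoff is not stated here.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two gene collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recurrent_programs(programs, threshold=0.5):
    """Return keys of programs matched by a program from another rank.

    `programs` maps (rank, program_index) -> set of top genes.
    The threshold is a hypothetical value chosen for this sketch.
    """
    robust = []
    for (rank_a, idx_a), genes_a in programs.items():
        for (rank_b, _idx_b), genes_b in programs.items():
            if rank_a != rank_b and jaccard(genes_a, genes_b) >= threshold:
                robust.append((rank_a, idx_a))
                break
    return robust


# Toy top-gene sets (invented), keyed by (factorization rank, program index).
programs = {
    (4, 1): {"MYC", "TP63", "SFN", "PLK3"},
    (5, 2): {"MYC", "TP63", "SFN", "CDKN1A"},
    (5, 3): {"ALB", "APOA1", "TTR", "FGB"},
}

similarity = jaccard(programs[(4, 1)], programs[(5, 2)])  # 3 shared / 5 total
robust = recurrent_programs(programs)
print(similarity, robust)
```

The same pairwise Jaccard values could then serve as the similarity input for hierarchical clustering of robust programs into meta-programs.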

4.6. Functional Enrichment and Cell-Cell Communication

Biological functions of differentially expressed genes (DEGs) and NMF meta-programs were annotated using Over-Representation Analysis (ORA) via the clusterProfiler package (v4.8.1; Bioconductor, Buffalo, NY, USA) [47]. Gene sets were retrieved from Gene Ontology (GO) [48], Kyoto Encyclopedia of Genes and Genomes (KEGG) [49], and MSigDB Hallmark collections [50]. Intercellular communication networks were inferred using CellChat (v2.1.2; Jin Lab, Wuhan University, Wuhan, China) [51]. Global network topology and interaction strength were compared across PSC, ICC-Adjacent, and ICC-Tumor groups. A focused analysis was performed on the shift in ligand–receptor pair usage by cholangiocytes during disease progression.
To quantitatively evaluate the hyper-biosynthetic and bioenergetic phenotype, module scores for representative biosynthetic, bioenergetic, and MYC-associated transcriptional programs were calculated in cholangiocytes using the AddModuleScore function in Seurat. Pairwise group differences among PSC, ICC-adjacent, and ICC-tumor cholangiocytes were assessed using the Wilcoxon rank-sum test with Benjamini–Hochberg correction.

4.7. Construction and Validation of the Diagnostic Signature

Twenty-eight high-confidence genes, identified from the intersection of pseudotime lineage drivers and tumor-specific meta-programs (MP5), were mapped to the STRING database (v12.5; STRING Consortium; confidence score > 0.4) [52] to construct a Protein–Protein Interaction (PPI) network. Topological centrality was calculated using the CytoHubba plugin (v0.1; Cytoscape App Store) within Cytoscape software (v3.10.4; Cytoscape Consortium, Seattle, WA, USA) [53] via the Degree algorithm, prioritizing the five highest-degree hub genes—CDKN1A, GADD45A, PLK3, PMAIP1, and SFN—for subsequent diagnostic modeling. The corresponding degree centrality scores were 7, 4, 4, 4, and 4, respectively (Table S30).
To avoid statistical inflation caused by treating individual cells as independent observations, diagnostic modeling was performed at the sample level rather than the single-cell level. The expression values of the candidate genes were aggregated into sample-level pseudo-bulk profiles by calculating the average expression of cholangiocytes within each patient sample. This strategy reduced single-cell dropout noise and ensured that model training was based on biological samples rather than individual cells.
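The sample-level aggregation described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the study; the function name `pseudobulk_mean`, the sample identifiers, and the expression values are hypothetical.

```python
from collections import defaultdict


def pseudobulk_mean(cells):
    """Average per-cell expression within each sample.

    `cells` is an iterable of (sample_id, {gene: normalized_expression}) pairs.
    A gene absent from a cell's dictionary contributes zero to that sample's mean.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for sample_id, expr in cells:
        counts[sample_id] += 1
        for gene, value in expr.items():
            sums[sample_id][gene] += value
    return {
        sample: {gene: total / counts[sample] for gene, total in genes.items()}
        for sample, genes in sums.items()
    }


# Toy cholangiocytes (invented values) from two hypothetical samples.
cells = [
    ("ICC_T1", {"PMAIP1": 2.0, "GADD45A": 1.0}),
    ("ICC_T1", {"PMAIP1": 4.0, "GADD45A": 3.0}),
    ("PSC_1", {"PMAIP1": 0.0, "GADD45A": 1.0}),
]

profiles = pseudobulk_mean(cells)
print(profiles)
```

Each resulting profile contributes one observation to model fitting, so the effective sample size equals the number of biological samples rather than the number of cells.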
The primary diagnostic signature was developed using an exhaustive “best subset selection” method based on multivariate logistic regression. Utilizing the glm function (family = binomial) in R, we systematically evaluated all possible dual- and triple-gene combinations from the five-gene hub pool. The Akaike Information Criterion (AIC) served as the selection metric to optimize model fit while minimizing complexity. The diagnostic Risk Score (RS) was calculated as follows:
$$\mathrm{RS} = \beta_0 + \beta_1 \times \mathrm{Exp}_{\mathrm{PMAIP1}} + \beta_2 \times \mathrm{Exp}_{\mathrm{GADD45A}}$$
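The search space and scoring rule can be illustrated with a short Python sketch. It enumerates all dual- and triple-gene combinations from the five hub genes, states the AIC formula, and evaluates the risk score for one sample. The coefficients and expression values below are hypothetical placeholders rather than fitted estimates, and the model fitting itself (performed with glm in R) is not reproduced; the conversion to a probability assumes the default logit link of a binomial model.

```python
import math
from itertools import combinations

HUB_GENES = ["CDKN1A", "GADD45A", "PLK3", "PMAIP1", "SFN"]


def candidate_gene_sets(genes, sizes=(2, 3)):
    """All gene combinations of the requested sizes (exhaustive search space)."""
    return [combo for k in sizes for combo in combinations(genes, k)]


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is preferred."""
    return 2 * n_params - 2 * log_likelihood


def risk_score(expr, intercept, coefs):
    """Linear predictor: intercept plus the coefficient-weighted expression values."""
    return intercept + sum(beta * expr[gene] for gene, beta in coefs.items())


def predicted_probability(rs):
    """Inverse logit of the risk score (assumes a logit link)."""
    return 1.0 / (1.0 + math.exp(-rs))


candidates = candidate_gene_sets(HUB_GENES)  # 10 pairs + 10 triples

# Hypothetical coefficients and sample-level expression, for illustration only.
example_coefs = {"PMAIP1": 0.5, "GADD45A": 0.25}
example_expr = {"PMAIP1": 2.0, "GADD45A": 4.0}
rs = risk_score(example_expr, intercept=-1.0, coefs=example_coefs)
print(len(candidates), rs, predicted_probability(rs))
```

With five hub genes the exhaustive search covers only 20 candidate models, which keeps best subset selection computationally trivial.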
To benchmark the model’s performance, a comparative signature was constructed using Least Absolute Shrinkage and Selection Operator (LASSO) [54] logistic regression via the glmnet R package (v4.1.8; Stanford University, Stanford, CA, USA). The optimal penalty parameter (λ) was determined through Leave-One-Out Cross-Validation (LOOCV) to minimize binomial deviance.
Model stability and generalizability were assessed using LOOCV in the discovery cohort and an independent external validation dataset (GSE107943). Performance was quantified through Receiver Operating Characteristic (ROC) [55] curve analysis using the pROC R package (v1.18.5; CRAN, R Foundation for Statistical Computing, Vienna, Austria), with the Area Under the Curve (AUC) serving as the evaluation metric and an effectiveness threshold set at AUC > 0.7. To reduce overfitting risk in the small discovery cohort, feature selection was restricted to biologically prioritized hub genes, model complexity was controlled by AIC, and model performance was evaluated using both internal cross-validation and external validation.

4.8. Statistical Analysis

All statistical analyses were performed using R software (v4.4.1). Comparisons of continuous variables, including gene expression levels and pathway scores, between two groups were conducted using the Wilcoxon rank-sum test. p values were adjusted for multiple testing using the Benjamini–Hochberg method. Similarity between transcriptional programs in the NMF analysis was evaluated using the Jaccard similarity coefficient.
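For readers unfamiliar with the Benjamini–Hochberg adjustment, the following Python sketch shows the step-up procedure on a toy vector of p values. It is intended to follow the same logic as the standard BH adjustment; the function name and the input values are illustrative assumptions, not results from this study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p values (invented).
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
print(adj_p)
```

Each adjusted value is the smallest false discovery rate at which the corresponding test would be declared significant.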
The diagnostic performance of the PMAIP1/GADD45A signature was evaluated using receiver operating characteristic (ROC) curve analysis, with the area under the curve (AUC) used as the primary performance metric. The 95% confidence intervals of AUC values were calculated using the DeLong method. Sensitivity, specificity, and accuracy were calculated at the optimal cutoff determined by the Youden index, and the results are summarized in Table S31.
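The AUC and Youden-index calculations can be sketched in Python as follows. This is a simplified illustration, not the pROC implementation: the AUC is computed from pairwise comparisons of positive and negative scores, candidate cutoffs are taken directly from the observed scores, and the DeLong confidence interval is omitted. The scores and labels are invented.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A sample is called positive when its score is >= cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best


# Toy risk scores (invented): label 1 = tumor, 0 = non-tumor.
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.3]
labels = [0, 0, 1, 1, 1, 0, 0]

auc = roc_auc(scores, labels)
j_stat, cutoff, sensitivity, specificity = youden_cutoff(scores, labels)
print(auc, cutoff, sensitivity, specificity)
```

Sensitivity and specificity reported at the Youden-optimal cutoff are derived from the same data used to choose that cutoff, so they are best read alongside the cross-validated and external estimates.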
For the focused 10% mitochondrial cutoff sensitivity analysis, PMAIP1 and GADD45A expression levels, as well as the mean PMAIP1/GADD45A dual-gene expression score, were compared between CopyKAT-defined malignant-like and non-malignant ICC-tumor cholangiocytes using the Wilcoxon rank-sum test. p values reported as 0 due to numerical underflow in R were presented as p < 2.2 × 10⁻¹⁶. Unless otherwise stated, a two-sided p < 0.05 was considered statistically significant.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms27114826/s1.

Author Contributions

B.Y. and Y.M. designed the study, performed experiments, and drafted the manuscript; B.Y., S.G., Q.Z. and Y.Y. analyzed the sequencing data; R.J., Y.S. and Z.H. conducted bioinformatics validation; Z.W. and J.L. supervised the study and revised the manuscript critically for important intellectual content. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 82474682) for the project “The modification of non-negative tensor factorization for uncovering the spatial-temporal network evolution mechanism in ‘same treatment for different diseases’”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are openly available in the GEO repository (https://www.ncbi.nlm.nih.gov/geo/, accessed on 6 March 2026) under accession numbers GSE247128, GSE138709, and GSE107943.

Acknowledgments

We sincerely acknowledge all contributors for their invaluable efforts in the preparation of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PSC         Primary sclerosing cholangitis
ICC         Intrahepatic cholangiocarcinoma
scRNA-seq   Single-cell RNA sequencing
ICC-Adj     Intrahepatic cholangiocarcinoma-adjacent non-tumor tissue
NMF         Non-negative Matrix Factorization
PPI         Protein–protein interaction
UMAP        Uniform Manifold Approximation and Projection
CNV         Copy number variation
BEAM        Branched Expression Analysis Modeling
TFs         Transcription factors
MPs         Meta-programs
DEGs        Differentially expressed genes
ORA         Over-Representation Analysis
GO          Gene Ontology
KEGG        Kyoto Encyclopedia of Genes and Genomes
AIC         Akaike Information Criterion
RS          Risk Score
LOOCV       Leave-One-Out Cross-Validation
ROC         Receiver Operating Characteristic
AUC         Area Under the Curve
UPR         Unfolded protein response
ER          Endoplasmic reticulum

References

  1. Zhang, H.; Yang, T.; Wu, M. Intrahepatic cholangiocarcinoma: Epidemiology, risk factors, diagnosis and surgical management. Cancer Lett. 2016, 379, 198–205.
  2. Banales, J.M.; Marin, J.J.G.; Lamarca, A.; Rodrigues, P.M.; Khan, S.A.; Roberts, L.R.; Cardinale, V.; Carpino, G.; Andersen, J.B.; Braconi, C.; et al. Cholangiocarcinoma 2020: The next horizon in mechanisms and management. Nat. Rev. Gastroenterol. Hepatol. 2020, 17, 557–588.
  3. Weismüller, T.J.; Wedemeyer, J.; Kubicka, S.; Strassburg, C.P.; Manns, M.P. The challenges in primary sclerosing cholangitis – aetiopathogenesis, autoimmunity, management and malignancy. J. Hepatol. 2008, 48, S38–S57.
  4. Bergquist, A.; von Seth, E. Epidemiology of cholangiocarcinoma. Best. Pract. Res. Clin. Gastroenterol. 2015, 29, 221–232.
  5. Ali, A.H.; Bi, Y.; Machicado, J.D.; Garg, S.; Lennon, R.J.; Zhang, L.; Takahashi, N.; Carey, E.J.; Lindor, K.D.; Buness, J.G.; et al. The long-term outcomes of patients with immunoglobulin G4-related sclerosing cholangitis: The Mayo Clinic experience. J. Gastroenterol. 2020, 55, 1087–1097.
  6. Bakhshi, Z.; Hilscher, M.B.; Gores, G.J.; Harmsen, W.S.; Viehman, J.K.; LaRusso, N.F.; Gossard, A.A.; Lazaridis, K.N.; Lindor, K.D.; Eaton, J.E. An update on primary sclerosing cholangitis epidemiology, outcomes and quantification of alkaline phosphatase variability in a population-based cohort. J. Gastroenterol. 2020, 55, 523–532.
  7. Caronni, N.; La Terza, F.; Vittoria, F.M.; Barbiera, G.; Mezzanzanica, L.; Cuzzola, V.; Barresi, S.; Pellegatta, M.; Canevazzi, P.; Dunsmore, G.; et al. IL-1β(+) macrophages and the control of pathogenic inflammation in cancer. Trends Immunol. 2025, 46, 403–415.
  8. Greten, F.R.; Grivennikov, S.I. Inflammation and Cancer: Triggers, Mechanisms, and Consequences. Immunity 2019, 51, 27–41.
  9. Poch, T.; Krause, J.; Casar, C.; Liwinski, T.; Glau, L.; Kaufmann, M.; Ahrenstorf, A.E.; Hess, L.U.; Ziegler, A.E.; Martrus, G.; et al. Single-cell atlas of hepatic T cells reveals expansion of liver-resident naive-like CD4(+) T cells in primary sclerosing cholangitis. J. Hepatol. 2021, 75, 414–423.
  10. Zhang, M.; Yang, H.; Wan, L.; Wang, Z.; Wang, H.; Ge, C.; Liu, Y.; Hao, Y.; Zhang, D.; Shi, G.; et al. Single-cell transcriptomic architecture and intercellular crosstalk of human intrahepatic cholangiocarcinoma. J. Hepatol. 2020, 73, 1118–1130.
  11. Bennett, H.M.; Stephenson, W.; Rose, C.M.; Darmanis, S. Single-cell proteomics enabled by next-generation sequencing or mass spectrometry. Nat. Methods 2023, 20, 363–374.
  12. Hu, C.; Li, T.; Xu, Y.; Zhang, X.; Li, F.; Bai, J.; Chen, J.; Jiang, W.; Yang, K.; Ou, Q.; et al. CellMarker 2.0: An updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 2023, 51, D870–D876.
  13. Vijgen, S.; Terris, B.; Rubbia-Brandt, L. Pathology of intrahepatic cholangiocarcinoma. Hepatobiliary Surg. Nutr. 2017, 6, 22–34.
  14. Hua, H.; Zhao, Q.Q.; Kalagbor, M.N.; Yu, G.Z.; Liu, M.; Bian, Z.R.; Zhang, B.B.; Yu, Q.; Xu, Y.H.; Tang, R.X.; et al. Recombinant adeno-associated virus 8-mediated inhibition of microRNA let-7a ameliorates sclerosing cholangitis in a clinically relevant mouse model. World J. Gastroenterol. 2024, 30, 471–484.
  15. Shah, S.C.; Itzkowitz, S.H. Colorectal Cancer in Inflammatory Bowel Disease: Mechanisms and Management. Gastroenterology 2022, 162, 715–730.e3.
  16. Gadaleta, E.; Thorn, G.J.; Ross-Adams, H.; Jones, L.J.; Chelala, C. Field cancerization in breast cancer. J. Pathol. 2022, 257, 561–574.
  17. Lau, L.F.; Nathans, D. Expression of a set of growth-related immediate early genes in BALB/c 3T3 cells: Coordinate regulation with c-fos or c-myc. Proc. Nat. Acad. Sci. USA 1987, 84, 1182–1186.
  18. Yang, H.; Liu, T.; Wang, J.; Li, T.W.; Fan, W.; Peng, H.; Krishnan, A.; Gores, G.J.; Mato, J.M.; Lu, S.C. Deregulated methionine adenosyltransferase α1, c-Myc, and Maf proteins together promote cholangiocarcinoma growth in mice and humans. Hepatology 2016, 64, 439–455, Erratum in Hepatology 2017, 66, 1707. https://doi.org/10.1002/hep.29549.
  19. Wang, S.; Tong, H.; Su, T.; Zhou, D.; Shi, W.; Tang, Z.; Quan, Z. CircTP63 promotes cell proliferation and invasion by regulating EZH2 via sponging miR-217 in gallbladder cancer. Cancer Cell Int. 2021, 21, 608.
  20. Takahashi, K.; Yamanaka, S. Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 2006, 126, 663–676.
  21. Huang, M.; Lan, T.; Chen, X.; Chen, R.; Ding, X.; Tai, W.C.; Wong, S.C.; Chan, L.W. Functional Role of NOXA in Hypoxia-Mediated PD-L1 Inhibitor Response in Hepatocellular Carcinoma. Int. J. Mol. Sci. 2025, 26, 4766.
  22. Busche, S.; John, K.; Wandrer, F.; Vondran, F.W.R.; Lehmann, U.; Wedemeyer, H.; Essmann, F.; Schulze-Osthoff, K.; Bantel, H. BH3-only protein expression determines hepatocellular carcinoma response to sorafenib-based treatment. Cell Death Dis. 2021, 12, 736.
  23. Wang, L.; Hu, T.; Shen, Z.; Zheng, Y.; Geng, Q.; Li, L.; Sha, B.; Li, M.; Sun, Y.; Guo, Y.; et al. Inhibition of USP1 activates ER stress through Ubi-protein aggregation to induce autophagy and apoptosis in HCC. Cell Death Dis. 2022, 13, 951.
  24. Chang, S.C.; Choo, W.Q.; Toh, H.C.; Ding, J.L. SAG-UPS attenuates proapoptotic SARM and Noxa to confer survival advantage to early hepatocellular carcinoma. Cell Death Discov. 2015, 1, 15032.
  25. Chen, M.; Wu, H.L.; Wong, T.S.; Chen, B.; Gong, R.H.; Wong, H.L.X.; Xiao, H.; Bian, Z.; Kwan, H.Y. Combination of Wogonin and Artesunate Exhibits Synergistic anti-Hepatocellular Carcinoma Effect by Increasing DNA-Damage-Inducible Alpha, Tumor Necrosis Factor α and Tumor Necrosis Factor Receptor-Associated Factor 3-mediated Apoptosis. Front. Pharmacol. 2021, 12, 657080.
  26. Maiani, E.; Milletti, G.; Nazio, F.; Holdgaard, S.G.; Bartkova, J.; Rizza, S.; Cianfanelli, V.; Lorente, M.; Simoneschi, D.; Di Marco, M.; et al. AMBRA1 regulates cyclin D to guard S-phase entry and genomic integrity. Nature 2021, 592, 799–803.
  27. Hou, C.Y.; Suo, Y.H.; Lv, P.; Yuan, H.F.; Zhao, L.N.; Wang, Y.F.; Zhang, H.H.; Sun, J.; Sun, L.L.; Lu, W.; et al. Aristolochic acids-hijacked p53 promotes liver cancer cell growth by inhibiting ferroptosis. Acta Pharmacol. Sin. 2025, 46, 208–221.
  28. Zhou, H.; Zhu, L.; Zhang, Y.; Chen, L.; Gou, D.M.; Zhang, H.; Hua, R.; Song, J.; Qiu, C.; Yao, F.W.; et al. Tissue Factor Pathway Inhibitor 2 Enhances Hepatocellular Carcinoma Chemosensitivity by Activating CCAR2-GADD45A-Mediated DNA Damage Repair. Int. J. Biol. Sci. 2025, 21, 4629–4646.
  29. Lieshout, R.; Kamp, E.J.C.A.; Verstegen, M.M.A.; Doukas, M.; Dinjens, W.N.M.; Köten, K.; IJzermans, J.N.M.; Bruno, M.J.; Peppelenbosch, M.P.; van der Laan, L.J.W.; et al. Cholangiocarcinoma cell proliferation is enhanced in primary sclerosing cholangitis: A role for IL-17A. Int. J. Cancer 2023, 152, 2607–2614.
  30. Villard, C.; Friis-Liby, I.; Rorsman, F.; Said, K.; Warnqvist, A.; Cornillet, M.; Kechagias, S.; Nyhlin, N.; Werner, M.; Janczewska, I.; et al. Prospective surveillance for cholangiocarcinoma in unselected individuals with primary sclerosing cholangitis. J. Hepatol. 2023, 78, 604–613.
  31. Ilyas, S.I.; Eaton, J.E.; Gores, G.J. Primary Sclerosing Cholangitis as a Premalignant Biliary Tract Disease: Surveillance and Management. Clin. Gastroenterol. Hepatol. 2015, 13, 2152–2165.
  32. Smoot, R.L.; Blechacz, B.R.; Werneburg, N.W.; Bronk, S.F.; Sinicrope, F.A.; Sirica, A.E.; Gores, G.J. A Bax-mediated mechanism for obatoclax-induced apoptosis of cholangiocarcinoma cells. Cancer Res. 2010, 70, 1960–1969.
  33. Cigliano, A.; Gigante, I.; Serra, M.; Vidili, G.; Simile, M.M.; Steinmann, S.; Urigo, F.; Cossu, E.; Pes, G.M.; Dore, M.P.; et al. HSF1 is a prognostic determinant and therapeutic target in intrahepatic cholangiocarcinoma. J. Exp. Clin. Cancer Res. 2024, 43, 253.
  34. Okawa, Y.; Iwasaki, Y.; Johnson, T.A.; Ebata, N.; Inai, C.; Endo, M.; Maejima, K.; Sasagawa, S.; Fujita, M.; Matsuda, K.; et al. Hereditary cancer variants and homologous recombination deficiency in biliary tract cancer. J. Hepatol. 2023, 78, 333–342.
  35. Nam, A.R.; Yoon, J.; Jin, M.H.; Bang, J.H.; Oh, K.S.; Seo, H.R.; Kim, J.M.; Kim, T.Y.; Oh, D.Y. ATR inhibition amplifies antitumor effects of olaparib in biliary tract cancer. Cancer Lett. 2021, 516, 38–47.
  36. Luecken, M.D.; Theis, F.J. Current best practices in single-cell RNA-seq analysis: A tutorial. Mol. Syst. Biol. 2019, 15, e8746.
  37. McGinnis, C.S.; Murrow, L.M.; Gartner, Z.J. DoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors. Cell Syst. 2019, 8, 329–337.e4.
  38. Korsunsky, I.; Millard, N.; Fan, J.; Slowikowski, K.; Zhang, F.; Wei, K.; Baglaenko, Y.; Brenner, M.; Loh, P.R.; Raychaudhuri, S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 2019, 16, 1289–1296.
  39. Stuart, T.; Butler, A.; Hoffman, P.; Hafemeister, C.; Papalexi, E.; Mauck, W.M., III; Hao, Y.; Stoeckius, M.; Smibert, P.; Satija, R. Comprehensive Integration of Single-Cell Data. Cell 2019, 177, 1888–1902.e21.
  40. Becht, E.; McInnes, L.; Healy, J.; Dutertre, C.A.; Kwok, I.W.H.; Ng, L.G.; Ginhoux, F.; Newell, E.W. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol. 2018, 37, 38–44.
  41. Gao, R.; Bai, S.; Henderson, Y.C.; Lin, Y.; Schalck, A.; Yan, Y.; Kumar, T.; Hu, M.; Sei, E.; Davis, A.; et al. Delineating copy number and clonal substructure in human tumors from single-cell transcriptomes. Nat. Biotechnol. 2021, 39, 599–608.
  42. Ben-David, U.; Amon, A. Context is everything: Aneuploidy in cancer. Nat. Rev. Genet. 2020, 21, 44–62.
  43. Qiu, X.; Mao, Q.; Tang, Y.; Wang, L.; Chawla, R.; Pliner, H.A.; Trapnell, C. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 2017, 14, 979–982.
  44. Lambert, S.A.; Jolma, A.; Campitelli, L.F.; Das, P.K.; Yin, Y.; Albu, M.; Chen, X.; Taipale, J.; Hughes, T.R.; Weirauch, M.T. The Human Transcription Factors. Cell 2018, 172, 650–665, Erratum in Cell 2018, 175, 598–599. https://doi.org/10.1016/j.cell.2018.09.045.
  45. Gaujoux, R.; Seoighe, C. A flexible R package for nonnegative matrix factorization. BMC Bioinform. 2010, 11, 367.
  46. Puram, S.V.; Tirosh, I.; Parikh, A.S.; Patel, A.P.; Yizhak, K.; Gillespie, S.; Rodman, C.; Luo, C.L.; Mroz, E.A.; Emerick, K.S.; et al. Single-Cell Transcriptomic Analysis of Primary and Metastatic Tumor Ecosystems in Head and Neck Cancer. Cell 2017, 171, 1611–1624.e24.
  47. Wu, T.; Hu, E.; Xu, S.; Chen, M.; Guo, P.; Dai, Z.; Feng, T.; Zhou, L.; Tang, W.; Zhan, L.; et al. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. Innovation 2021, 2, 100141.
  48. Gene Ontology Consortium. The Gene Ontology resource: Enriching a GOld mine. Nucleic Acids Res. 2021, 49, D325–D334.
  49. Kanehisa, M.; Furumichi, M.; Sato, Y.; Kawashima, M.; Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2023, 51, D587–D592.
  50. Liberzon, A.; Birger, C.; Thorvaldsdóttir, H.; Ghandi, M.; Mesirov, J.P.; Tamayo, P. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 2015, 1, 417–425.
  51. Jin, S.; Guerrero-Juarez, C.F.; Zhang, L.; Chang, I.; Ramos, R.; Kuan, C.H.; Myung, P.; Plikus, M.V.; Nie, Q. Inference and analysis of cell-cell communication using CellChat. Nat. Commun. 2021, 12, 1088.
  52. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646.
  53. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
  54. Friedman, J.; Hastie, T.; Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 2010, 33, 1–22.
  55. Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J.C.; Müller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77.
Figure 1. Schematic workflow for identifying malignant progression determinants and diagnostic signatures in PSC-associated cholangiocarcinogenesis.
Figure 2. Single-cell transcriptomic landscape of PSC and ICC tissues. (A) UMAP visualization of all sequenced cells, color-coded by identified cell lineages. (B) UMAP plot of all cells, color-coded by sample origins. (C) Dot plot showing the expression of canonical marker genes used to define each cell type. The color intensity represents the average expression level, and the dot size indicates the percentage of cells expressing the marker. (D) Bar plot illustrating the relative proportions of each cell type across three disease states: PSC, ICC-Adj, and ICC-Tumor. (E) CopyKAT-inferred CNV heatmap of cholangiocytes. Columns represent individual cholangiocytes and rows represent genomic bins ordered by chromosomal position. Top annotations indicate disease group and CopyKAT-based binary malignancy status.
Figure 3. Functional landscape and stage-specific divergence of cholangiocytes during PSC-associated cholangiocarcinogenesis. (A) Bubble plot illustrating the integrated enrichment of Hallmark and KEGG pathways across PSC, ICC-Adj, and ICC-Tumor states. Conserved inflammatory pathways, particularly the NF-κB-signaling pathway (TNFA_SIGNALING_VIA_NFKB), are identified as a persistent “inflammatory scaffold” across all disease stages. The dot size represents the gene count, and the color intensity indicates the significance level. (B–D) Functional enrichment analysis showing the specific GO (Biological Process, Cellular Component, and Molecular Function) and KEGG terms for cholangiocytes in PSC (B), ICC-Adj (C), and ICC-Tumor (D) tissues.
Figure 4. Alterations in intercellular communication networks and ligand-receptor interactions during PSC-associated cholangiocarcinogenesis. (A–C) Global cell–cell communication networks in PSC (A), ICC-Adj (B), and ICC-Tumor (C) tissues. The lines indicate interactions between cell types, where line thickness represents the interaction strength and number of ligand-receptor pairs. (D) Upset plot showing the shared and stage-specific signaling pathways among the three disease states. The intersection highlights the conserved signaling core, while the unique sets represent functional shifts specific to malignant progression. (E) Dot plot illustrating the ligand-receptor pairs of cholangiocytes acting as either senders or receivers across the three disease states. The dot size represents the communication probability, and the color intensity indicates the computed p-value.
Figure 5. Pseudotime trajectory and branch-specific functional determinants of cholangiocyte malignant progression. (A) Monocle 2 pseudotime trajectory of all cholangiocytes, with cells ordered according to their developmental progress. The color gradient represents the evolution of pseudotime. (B) Distribution of cholangiocytes from different sample groups along the pseudotime trajectory. Individual cells are color-coded by their identified cell types within each disease state. (C) BEAM (Branched Expression Analysis Modeling) heatmap displaying the kinetic changes in gene expression at the branching point. Fate 1 represents the inflammatory non-malignant lineage, while Fate 2 signifies the malignant cholangiocyte lineage. (D) GO and KEGG enrichment analysis of genes associated with the transition toward the malignant lineage (Fate 2). The results highlight the functional drivers of malignant progression.
Figure 6. Identification of malignant meta-programs and construction of the diagnostic signature for PSC-associated cholangiocarcinogenesis. (A) Non-negative Matrix Factorization (NMF) analysis identifies six distinct meta-programs (MPs) and their corresponding functional enrichment profiles. The heatmap displays the relative expression of top-ranked genes for each MP, with associated biological functions annotated alongside. (B) Relative expression levels of the six identified MPs across four defined groups: PSC non-malignant, ICC-Adj non-malignant, ICC-Tumor non-malignant, and ICC-Tumor malignant cholangiocytes (as identified by CopyKAT). (C) Protein-Protein Interaction (PPI) network of the top-ranked genes from malignant-specific MPs, visualized via STRING and Cytoscape. Hub genes were prioritized by degree centrality; the five highest-degree hubs were CDKN1A (degree = 7), GADD45A (degree = 4), PLK3 (degree = 4), PMAIP1 (degree = 4), and SFN (degree = 4). (D,E) Performance and validation of the dual-gene diagnostic signature (PMAIP1 and GADD45A) derived from the best subset selection method. ROC curves illustrate the diagnostic accuracy (AUC) in both the discovery (internal) cohort (D) and the independent validation (external) cohort (E).
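The degree-based hub ranking described for panel (C) can be illustrated with a minimal sketch. It assumes the network is treated as an undirected simple graph, so that a node's degree is its number of distinct neighbours. The function name `rank_hubs` and the toy edge list are hypothetical and are not taken from the STRING network shown in the figure.

```python
from collections import Counter


def rank_hubs(edges, top_n=5):
    """Return the top_n (node, degree) pairs of an undirected simple graph."""
    # Collapse duplicate edges regardless of orientation and drop self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}

    # Each remaining edge contributes one to the degree of both endpoints.
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1

    # Sort by descending degree, breaking ties alphabetically for stability.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical toy edges for illustration only (not the study's PPI network).
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "A"),  # ("B", "A") repeats ("A", "B") and is ignored
    ("D", "E"),
]
print(rank_hubs(toy_edges, top_n=3))
```

Ties in degree, such as the four genes with degree = 4 in panel (C), are ordered alphabetically in this sketch; any other tie-breaking rule would be equally compatible with the caption.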
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yao, B.; Ma, Y.; Guan, S.; Zheng, Q.; Yu, Y.; Jia, R.; Shi, Y.; Hou, Z.; Wang, Z.; Liu, J. A Dual-Gene Signature of PMAIP1 and GADD45A for Early Detection of Intrahepatic Cholangiocarcinoma in the Context of Primary Sclerosing Cholangitis. Int. J. Mol. Sci. 2026, 27, 4826. https://doi.org/10.3390/ijms27114826

AMA Style

Yao B, Ma Y, Guan S, Zheng Q, Yu Y, Jia R, Shi Y, Hou Z, Wang Z, Liu J. A Dual-Gene Signature of PMAIP1 and GADD45A for Early Detection of Intrahepatic Cholangiocarcinoma in the Context of Primary Sclerosing Cholangitis. International Journal of Molecular Sciences. 2026; 27(11):4826. https://doi.org/10.3390/ijms27114826

Chicago/Turabian Style

Yao, Bei, Yiming Ma, Shuang Guan, Qiguang Zheng, Yanan Yu, Ran Jia, Yinli Shi, Zhiyong Hou, Zhong Wang, and Jun Liu. 2026. "A Dual-Gene Signature of PMAIP1 and GADD45A for Early Detection of Intrahepatic Cholangiocarcinoma in the Context of Primary Sclerosing Cholangitis" International Journal of Molecular Sciences 27, no. 11: 4826. https://doi.org/10.3390/ijms27114826

APA Style

Yao, B., Ma, Y., Guan, S., Zheng, Q., Yu, Y., Jia, R., Shi, Y., Hou, Z., Wang, Z., & Liu, J. (2026). A Dual-Gene Signature of PMAIP1 and GADD45A for Early Detection of Intrahepatic Cholangiocarcinoma in the Context of Primary Sclerosing Cholangitis. International Journal of Molecular Sciences, 27(11), 4826. https://doi.org/10.3390/ijms27114826

