Article

Systems-Level Transcriptomic Integration Reveals a Core Metaflammatory Network Linking Type 2 Diabetes and HBV Infection to Cholangiocarcinoma Progression

1 Department of Biliary-Pancreatical Surgery, The Second Affiliated Hospital of Harbin Medical University, Harbin 150086, China
2 North Care Consultation Hub, Rangpur 5405, Bangladesh
3 Internal Medicine, Department of Cardiology, The First Affiliated Hospital of Harbin Medical University, Harbin 150001, China
4 The Second Affiliated Hospital of Harbin Medical University, Harbin 150086, China
* Author to whom correspondence should be addressed.
Cancers 2026, 18(6), 923; https://doi.org/10.3390/cancers18060923
Submission received: 17 February 2026 / Revised: 4 March 2026 / Accepted: 9 March 2026 / Published: 12 March 2026
(This article belongs to the Section Cancer Causes, Screening and Diagnosis)

Simple Summary

Cholangiocarcinoma, a malignancy of the bile ducts, is associated with poor survival, and its incidence is rising globally. This trend parallels the rising epidemics of type 2 diabetes mellitus and chronic hepatitis B infection. Although these conditions are recognized risk factors for cancer, the underlying biological mechanisms remain poorly understood. In this study, we conducted an integrative analysis of genetic data from patients with these three diseases to identify potential molecular links. Our analysis revealed a shared set of 156 genes, implicating a state of chronic inflammation driven by metabolic dysregulation that connects diabetes and hepatitis B infection to cholangiocarcinogenesis. Within this network, five key genes were significantly associated with patient survival. These findings provide a molecular framework that elucidates how these risk factors contribute to cancer development. This research opens new avenues for identifying at-risk individuals and suggests that targeting this specific inflammatory pathway may offer novel strategies for cancer prevention and treatment.

Abstract

Background and Aims: The rising global incidence of cholangiocarcinoma (CCA) coincides with epidemics of type 2 diabetes (T2D) and chronic hepatitis B virus (HBV) infection. Although both are established independent risk factors, the shared molecular mechanisms by which they contribute to cholangiocarcinogenesis remain poorly understood. We hypothesized that T2D and HBV converge on a state of chronic metabolic inflammation (“metaflammation”) that drives CCA progression through a conserved transcriptomic network. Methods: We performed an integrative bioinformatics analysis of transcriptomic data from public repositories, including samples of CCA (TCGA-CHOL, n = 45; GSE107943, n = 163), T2D-affected liver (GSE23343, n = 20), and HBV-infected liver (GSE58208, n = 102). The T2D and HBV datasets were derived from whole-liver tissue, whereas CCA originates in the biliary epithelium; with this limitation acknowledged, we identified differentially expressed genes (DEGs) in each condition and defined a core gene set shared among them. Subsequent analyses included functional enrichment, construction of protein–protein interaction (PPI) networks, survival analysis, and protein validation. Results: We identified a core metaflammation signature comprising 156 genes that were consistently dysregulated across T2D, HBV, and CCA. Pathway analysis revealed significant enrichment in PPAR signaling, cytokine–cytokine receptor interaction, PI3K-Akt, and TNF signaling pathways. PPI network analysis identified IL6, TNF, AKT1, STAT3, and PPARG as the top hub genes. These hubs grouped into functional modules associated with inflammatory signaling, metabolic regulation, and cell growth and survival. In the TCGA CCA cohort, high expression of IL6, TNF, AKT1, and STAT3 and low expression of PPARG correlated with advanced tumor stage and poorer overall survival (e.g., IL6: ρ = 0.42, p = 0.01). A metaflammation score derived from these hubs (weighted combination of the five genes) emerged as an independent prognostic factor (HR = 2.8, p < 0.001). Protein-level dysregulation of these hubs was confirmed via immunohistochemistry. Conclusions: This study defines a conserved metaflammation network that links T2D and HBV to CCA, identifying key hub genes and pathways. This signature provides a mechanistic explanation for epidemiological risks, serves as a novel prognostic tool, and offers a rationale for targeting metaflammation in prevention and therapy for high-risk populations.

1. Background

Cholangiocarcinoma (CCA), a malignancy arising from the biliary tract epithelium, remains a formidable clinical challenge. It is characterized by late diagnosis, therapeutic resistance, and a dismal 5-year survival rate, often below 20% [1], underscoring a critical unmet need in oncology. The etiological landscape of CCA is complex and evolving. Although primary sclerosing cholangitis and liver fluke infections are well-established risk factors, a modern epidemiological shift implicates systemic metabolic dysfunction and chronic viral hepatitis as major drivers of rising incidence, particularly in Western populations [2,3].
Two interconnected global health burdens, type 2 diabetes (T2D) and chronic hepatitis B virus (HBV) infection, have emerged as significant independent risk factors for CCA. Meta-analyses indicate that T2D confers a 1.5- to 2.0-fold increased risk of CCA, particularly the intrahepatic subtype (iCCA), with risk correlating with disease duration and glycemic severity [4,5]. Concurrently, HBV, a canonical cause of hepatocellular carcinoma, is now robustly associated with an elevated risk of CCA, with viral components detected within cholangiocytes [6].
These conditions converge pathophysiologically within the liver, fostering a state of “metaflammation” defined as persistent, low-grade inflammation driven by metabolic dysfunction. T2D contributes to hyperinsulinemia, lipotoxicity, and adipokine imbalance, whereas HBV drives immune-mediated injury and direct viral oncoprotein signaling (e.g., HBx) [7]. Collectively, these factors create a permissive microenvironment characterized by oxidative stress, altered growth factor signaling, and immune dysregulation, thereby promoting genomic instability and uncontrolled proliferation.
Despite compelling epidemiological evidence linking T2D and HBV to CCA, the precise and conserved molecular mechanisms underlying this association remain inadequately defined. A critical gap persists in our understanding of the shared transcriptomic architecture and functionally interconnected pathways dysregulated across this disease triad. Single-cohort studies lack the statistical power to distinguish universal drivers from biological noise [8]. Therefore, a systems-level integrative analysis of multi-condition transcriptomic data is essential to decode the shared pathogenic network.
Leveraging high-throughput genomic data from large public repositories (TCGA and GEO), this study employs a comprehensive bioinformatics framework to test the central hypothesis that T2D and chronic HBV infection promote cholangiocarcinogenesis through a core set of dysregulated metabolic-inflammatory driver genes embedded within central regulatory networks. Our specific objectives are to: (1) define condition-specific and shared transcriptomic alterations; (2) elucidate the functional architecture and network structure of the shared gene set; (3) determine the clinical and prognostic significance of this metaflammation signature; and (4) validate key findings at the protein level. This integrative approach seeks to move beyond association toward mechanistic elucidation, with the goal of identifying novel biomarkers and therapeutic targets for CCA prevention and treatment in globally significant at-risk populations.

2. Materials and Methods

2.1. Study Design and Data Acquisition

This study employed an integrative bioinformatics approach combining multiple public databases to investigate molecular associations among T2D, HBV, and CCA; institutional review board (IRB) approval was not required. The analytical workflow encompassed data acquisition, preprocessing, differential expression analysis, integrative cross-condition analysis, functional enrichment, protein–protein interaction (PPI) network construction, survival analysis, and validation (Figure 1A).
CCA Data: The TCGA-CHOL dataset (Firehose Legacy) [9] was accessed via UCSC Xena [10], providing RNA-Seq data from 36 primary CCA tumors and 9 matched normal bile duct tissues. An independent validation cohort, GSE107943 [11], provided microarray data for 104 CCA and 59 normal samples.
Comorbidity Data: To model etiological risk factors, we utilized the following datasets: GSE23343 (microarray; 10 T2D vs. 10 control whole liver samples) [12] to derive a T2D metabolic signature, and GSE58208 (microarray; 62 HBV-positive vs. 40 HBV-negative whole liver samples) [13] to establish an HBV inflammatory signature. For exploratory contextual metabolic analysis, GSE89632 [12] (RNA-Seq; liver tissue across steatosis grades) was also employed. Although CCA originates from bile duct epithelium, these liver-derived datasets were selected as the most appropriate available proxies for the hepatic microenvironment that bathes and influences cholangiocytes in patients with metabolic and viral disease.
Validation Resources: The Human Protein Atlas (HPA v24.0) [14] provided immunohistochemistry (IHC) data for protein-level validation. Functional enrichment and network analyses were conducted using the KEGG, Reactome, STRING, and DisGeNET databases [15,16,17].

2.2. Data Preprocessing and Quality Control

To ensure comparability across platforms, platform-specific preprocessing pipelines were implemented. For RNA-Seq data (TCGA and GSE89632) [9,18], raw read counts were batch-corrected using ComBat-seq [19]. Differential expression analysis was performed directly on these counts using DESeq2 [20]. For visualization and signature scoring, counts were normalized using the trimmed mean of M-values (TMM) method [21] and subsequently variance-stabilized (VST). Microarray data (GSE107943, GSE23343, GSE58208) [11,12,13] were normalized using the Robust Multi-array Average (RMA) algorithm [18,22]. Datasets with available technical batch metadata (GSE107943 and GSE58208) were further corrected using ComBat [19]. For the GSE23343 (T2D) dataset [12], batch correction was not applied, as the available sample metadata did not indicate a processing batch structure, and preliminary principal component analysis (PCA) revealed no significant technical clustering.
Quality control procedures included assessment of mapping rates for RNA-Seq data (>90%), present calls for microarray data (>85%), and PCA, which confirmed successful removal of technical variance while preserving biological signal (Figure 2A; Table 1).

2.3. Differential Expression Analysis

Differentially expressed genes (DEGs) were identified using platform-appropriate methods: DESeq2 for RNA-Seq data and limma [20] with empirical Bayes moderation for microarray data. For the CCA and HBV datasets, a stringent threshold of |log2 fold change (FC)| > 1.0 and false discovery rate (FDR) < 0.05 (Benjamini–Hochberg) [23] was applied. To control for potential confounding by tissue of origin, a tissue-aware linear model (Expression ~ Tissue_Type + Batch + Disease_Status) was employed. Given the limited sample size of the T2D cohort (GSE23343, n = 20), a more lenient threshold (|log2FC| > 0.8, nominal p < 0.05) was adopted to capture biologically relevant signals. To ensure robustness despite this lenient threshold, we implemented a cross-condition validation filter: a gene was retained as a high-confidence core gene only if it was dysregulated in at least three of the four key comparisons (TCGA-CHOL, GSE107943, GSE23343, and GSE58208). The statistical significance of the resulting overlap among the T2D, HBV, and CCA DEG lists was assessed using Fisher’s combined probability test and the hypergeometric test [24].
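The thresholding step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which used DESeq2 and limma in R); it only shows how Benjamini–Hochberg adjusted p-values combine with a |log2FC| cutoff to call DEGs. The function names and the toy input values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """results: list of (gene, log2fc, pvalue). Returns {gene: 'up' | 'down'}."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    degs = {}
    for (gene, lfc, _), q in zip(results, fdr):
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff:
            degs[gene] = "up" if lfc > 0 else "down"
    return degs


# Hypothetical toy input; values are illustrative, not taken from the study.
toy = [("GENE_A", 2.4, 0.0004), ("GENE_B", -1.6, 0.003),
       ("GENE_C", 0.5, 0.001), ("GENE_D", 1.2, 0.20)]
toy_degs = call_degs(toy)
```

In this toy input, GENE_C passes the FDR cutoff but not the fold-change cutoff, and GENE_D passes the fold-change cutoff but not the FDR cutoff, so neither is called.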

2.4. Integrative and Functional Analysis

To identify a shared transcriptional signature across conditions, we integrated differentially expressed gene (DEG) lists from four primary disease-versus-control comparisons: TCGA-CHOL (CCA vs. normal), GSE107943 (CCA vs. normal), GSE23343 (T2D vs. control), and GSE58208 (HBV-positive vs. HBV-negative). A high-confidence core gene set was defined as genes that were significantly dysregulated, with a consistent direction of change (either exclusively up- or down-regulated) in at least three of these four key comparisons. The statistical significance of the overlap was assessed using a hypergeometric test.
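The core-set rule can be sketched as follows. This is an illustrative reading of the criterion, assuming that "consistent direction" means a gene is dropped if it is up-regulated in any comparison and down-regulated in another; the function name and toy gene labels are hypothetical.

```python
from collections import Counter


def core_gene_set(deg_lists, min_support=3):
    """deg_lists: list of {gene: 'up' | 'down'} dicts, one per comparison.

    Keeps genes called in at least `min_support` comparisons with one
    direction only (a gene that is up in some and down in others is dropped).
    """
    core = {}
    all_genes = set().union(*deg_lists)
    for gene in sorted(all_genes):
        directions = Counter(d[gene] for d in deg_lists if gene in d)
        if len(directions) == 1:
            direction, support = next(iter(directions.items()))
            if support >= min_support:
                core[gene] = direction
    return core


# Hypothetical DEG calls for the four comparisons (labels are placeholders).
tcga = {"G1": "up", "G2": "down", "G3": "up", "G4": "up"}
gse107943 = {"G1": "up", "G2": "down", "G3": "down"}
t2d = {"G1": "up", "G3": "up", "G4": "up"}
hbv = {"G2": "down", "G3": "up", "G5": "up"}
core = core_gene_set([tcga, gse107943, t2d, hbv])
```

Here G3 is excluded despite appearing in all four comparisons, because its direction is mixed, and G4 is excluded because it is supported by only two comparisons.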
Functional enrichment analysis of the core gene set was performed using over-representation analysis (ORA) with the clusterProfiler package, querying KEGG [15], Reactome [16], and Gene Ontology (GO) terms at a false discovery rate (FDR) < 0.05. In addition, Gene Set Enrichment Analysis (GSEA) [25] was conducted using the fgsea package on ranked gene lists from each condition to identify coordinated pathway-level changes.
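Over-representation analysis reduces to a one-sided hypergeometric test per gene set. The sketch below shows that calculation and the associated enrichment ratio in plain Python; it is a simplified stand-in for what clusterProfiler computes, with hypothetical function names, and it omits the multiple-testing correction applied across terms.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n genes from N, of which K are in the pathway."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment_ratio(k, K, n, N):
    """Observed overlap divided by the overlap expected by chance (n * K / N)."""
    return k / (n * K / N)


# Small worked case: 10 background genes, 3 in the pathway, 4 in the query
# list, 2 of which hit the pathway.
p_small = hypergeom_upper_tail(2, 3, 4, 10)
ratio_small = enrichment_ratio(2, 3, 4, 10)
```

The choice of background size N strongly affects both quantities, which is why the background definition is usually reported alongside the results.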

2.5. Protein–Protein Interaction Network and Hub Identification

A protein–protein interaction (PPI) network for the core gene set was constructed using the STRING database [26] with a confidence score threshold >0.7 and visualized in Cytoscape (version 3.9.1) [27]. Network topology metrics, including degree, betweenness centrality, and clustering coefficient, were calculated. Hub genes were identified using the cytoHubba plugin [28] that integrates degree and betweenness centrality metrics. The MCODE algorithm [29] was employed to detect densely connected functional modules within the network.
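To make the topology metrics concrete, the following sketch computes degree and betweenness centrality (Brandes' algorithm) on a small undirected graph using only the standard library. It is not a reimplementation of cytoHubba; the final ranking by degree and then betweenness is an assumed, simplified way of combining the two metrics, and the node labels and edges are placeholders.

```python
from collections import deque


def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree(adj):
    return {node: len(neigh) for node, neigh in adj.items()}


def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from source
        dist = {v: -1 for v in adj}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2 for v, s in score.items()}


# Placeholder graph; edges are illustrative, not STRING interactions.
adj = build_adjacency([("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")])
deg = degree(adj)
btw = betweenness(adj)
hubs = sorted(adj, key=lambda v: (deg[v], btw[v]), reverse=True)
```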

2.6. Survival and Clinical Correlation Analysis

Using the TCGA-CHOL cohort [9] (n = 36 tumors with available survival data), overall survival (OS) was analyzed using Kaplan–Meier curves and log-rank tests, with patients stratified by median expression of hub genes. Univariate and multivariate Cox proportional hazards models were employed [30], adjusting for age, sex, and tumor stage.
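The median stratification and Kaplan–Meier estimate can be sketched in a few lines of Python. This is an illustration of the estimator only (the study used the R survival package); the function names and the toy follow-up times are hypothetical, and the log-rank test and Cox models are not reproduced here.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up durations; events: 1 = death observed, 0 = censored.
    """
    curve = []
    survival = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


def split_by_median(expression):
    """Label each sample 'high' if its expression exceeds the cohort median."""
    cutoff = median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


# Toy follow-up data (months), not taken from TCGA-CHOL.
toy_curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
toy_groups = split_by_median([1.0, 2.0, 3.0, 4.0])
```

In practice the curve would be computed separately for the "high" and "low" groups of each hub gene and the two curves compared.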
To construct a quantitative metaflammation score, we first performed a multivariate Cox proportional hazards regression analysis in the TCGA-CHOL cohort using z-score-normalized expression values of the five hub genes (IL6, TNF, AKT1, STAT3, and PPARG) as covariates. The model yielded the following coefficients: β_IL6 = 0.48 (p = 0.001), β_TNF = 0.41 (p = 0.004), β_AKT1 = 0.29 (p = 0.02), β_STAT3 = 0.31 (p = 0.04), and β_PPARG = −0.52 (p = 0.002). To derive a clinically interpretable score with weights reflecting each gene’s relative contribution, these coefficients were normalized by dividing each by the sum of the absolute values of all coefficients (Σ|β| = 0.48 + 0.41 + 0.29 + 0.31 + 0.52 = 2.01). This normalization produced rounded weights of 0.25, 0.20, 0.15, 0.15, and −0.25, respectively. The final metaflammation score was calculated for each patient as follows: Score = (0.25 × zIL6) + (0.20 × zTNF) + (0.15 × zAKT1) + (0.15 × zSTAT3) − (0.25 × zPPARG), where z represents z-score-normalized expression values.
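The weight derivation and scoring formula can be checked with a short Python sketch. The coefficients and final weights are those reported above; the text does not state the rounding step, so rounding to the nearest 0.05 is an assumption that happens to reproduce the reported weights. The function names and the example z-scores are hypothetical.

```python
# Cox coefficients and final weights as reported in the text.
BETAS = {"IL6": 0.48, "TNF": 0.41, "AKT1": 0.29, "STAT3": 0.31, "PPARG": -0.52}
PUBLISHED_WEIGHTS = {"IL6": 0.25, "TNF": 0.20, "AKT1": 0.15,
                     "STAT3": 0.15, "PPARG": -0.25}


def normalized_weights(betas):
    """Divide each coefficient by the sum of absolute coefficients."""
    total = sum(abs(b) for b in betas.values())
    return {g: b / total for g, b in betas.items()}


def round_to_step(x, step=0.05):
    """Round to the nearest multiple of `step` (assumed rounding rule)."""
    return round(round(x / step) * step, 2)


def metaflammation_score(z, weights=PUBLISHED_WEIGHTS):
    """Weighted sum of z-score-normalized expression values."""
    return sum(weights[g] * z[g] for g in weights)


raw = normalized_weights(BETAS)                 # e.g. IL6: 0.48 / 2.01
rounded = {g: round_to_step(w) for g, w in raw.items()}

# Hypothetical patient with z-scores chosen for illustration only.
example_z = {"IL6": 1.0, "TNF": 0.5, "AKT1": 0.0, "STAT3": -0.5, "PPARG": -1.0}
example_score = metaflammation_score(example_z)
```

Because the PPARG weight is negative, a patient with below-average PPARG expression (negative z) receives a higher score, matching the direction of the reported coefficients.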
Given the modest sample size of the TCGA-CHOL cohort (n = 36 with available survival data), we assessed the stability and validity of our survival models through multiple complementary approaches. First, we calculated the events-per-variable (EPV) ratio. Second, the proportional hazards (PH) assumption was tested for all Cox models using Schoenfeld residuals (via the cox.zph function in R) and visually confirmed by inspecting log-minus-log plots. Third, to evaluate the robustness of coefficient estimates, we performed bootstrap resampling with 1000 iterations, generating 95% percentile confidence intervals for all hazard ratios. Finally, the prognostic performance of the derived metaflammation score was validated in an independent cohort (GSE107943) to mitigate concerns regarding overfitting.
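Two of these checks are simple enough to sketch generically. The code below shows an events-per-variable ratio and a percentile bootstrap confidence interval for an arbitrary statistic; in the study the resampled statistic would be a hazard ratio from a refitted Cox model, which is not reproduced here. The function names, the seed, and the toy data are assumptions made for illustration.

```python
import random
from statistics import mean


def events_per_variable(n_events, n_covariates):
    """Number of observed events divided by the number of model covariates."""
    return n_events / n_covariates


def bootstrap_percentile_ci(data, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[round((alpha / 2) * n_boot)]
    upper = estimates[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data; the statistic here is the sample mean, used only as a placeholder.
toy_data = list(range(1, 21))
ci_low, ci_high = bootstrap_percentile_ci(toy_data, mean)
```

A wide bootstrap interval relative to the point estimate is the practical signal that a coefficient is unstable at the available sample size.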

2.7. In Silico Validation

Protein-level expression of the prioritized hub genes was assessed using immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) [14], by comparing staining intensity and subcellular localization between normal bile duct and CCA tissue. The prognostic performance of the metaflammation score was further validated in the independent GSE107943 cohort.

2.8. Statistical Software

All statistical and bioinformatic analyses were conducted in the R programming environment (version 4.1.3) [31] using Bioconductor (version 3.14) [32]. Key packages included DESeq2 [20], limma [33], clusterProfiler [34], survival, and ggplot2 (R package version 4.0.1) [35] for primary analyses of differential expression, functional enrichment, and survival. Supplementary analyses were performed in Python (version 3.9.12) [30]. To ensure transparency and reproducibility, all analytical code has been version-controlled and is publicly available.

2.9. Assessment of Cellular Heterogeneity and Tissue Comparability

To evaluate the potential confounding effect of differing cellular compositions across tissue types (bile duct vs. whole liver), we performed immune deconvolution analysis. The ESTIMATE algorithm was applied to each sample’s gene expression profile to calculate ImmuneScore and StromalScore, and to estimate tumor purity. Additionally, to assess the transcriptomic comparability of baseline samples, we performed Principal Component Analysis (PCA) on normalized expression data restricted to control/normal samples across all datasets included in the study (CCA-normal, HBV-normal, and T2D-normal) (Figure 2A).

3. Results

3.1. Multi-Cohort Integration and Data Characteristics

To delineate the shared transcriptomic architecture among metabolic dysfunction, viral hepatitis, and biliary cancer, we integrated four independent datasets comprising 330 samples (Table 2). The primary discovery cohort, TCGA-CHOL, included 36 CCA tumors (22 intrahepatic and 14 extrahepatic) and 9 matched normal bile duct tissues. Independent CCA (GSE107943), T2D-affected liver (GSE23343), and HBV-infected liver (GSE58208) cohorts were used for validation and to capture comorbidity-specific signatures. Rigorous quality control and platform-specific normalization were applied to ensure data integrity, and PCA confirmed a clear separation of samples by biological condition after preprocessing (Figure 1B).

3.2. Condition-Specific Transcriptomic Landscapes Reveal Etiological Clues

Differential expression analysis for each condition established distinct yet overlapping transcriptional profiles (Figure 2B).
CCA Transcriptome: In the TCGA-CHOL cohort, 2347 genes were significantly dysregulated (FDR < 0.05, |log2FC| > 1). Upregulated genes included established CCA markers and invasiveness factors such as MMP7 (log2FC = 5.2) and CEACAM6 (log2FC = 4.8). Notably, the biliary differentiation markers KRT7 and EPCAM were significantly downregulated compared with normal bile duct tissue, a finding consistent with the protein-level data (see Section 3.7). Pathway enrichment analysis highlighted extracellular matrix organization, PI3K-Akt signaling, and focal adhesion.
Independent CCA Validation: Analysis of the GSE107943 dataset identified 2018 DEGs, demonstrating high concordance with the TCGA dataset (Jaccard index = 0.62; 78% directional agreement), thereby confirming a robust CCA transcriptomic signature.
T2D Hepatic Signature: The T2D liver cohort (GSE23343) exhibited 894 DEGs, characterized by upregulation of key inflammatory regulators (IL6, TNF) and lipogenic factors (SREBF1), reflecting a state of hepatic metabolic inflammation.
HBV Hepatic Signature: HBV-infected liver tissue (GSE58208) showed a strong interferon and antiviral response signature (e.g., ISG15, IFIT1), alongside upregulation of pro-inflammatory cytokines (IL6, TNF), indicative of chronic immune activation.
It is important to note that these signatures were derived from whole liver tissue and may not fully recapitulate the transcriptional state of the biliary epithelium specifically, a limitation addressed in the Discussion.
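The concordance metrics reported for the two CCA cohorts can be illustrated with a short sketch. The exact definition of directional agreement is not given in the text, so the version below (fraction of shared DEGs with the same sign) is an assumption; function names are hypothetical. The last helper inverts the Jaccard formula to show what overlap size the reported values imply.

```python
def jaccard_index(a, b):
    """|A intersect B| / |A union B| for two collections of gene identifiers."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def directional_agreement(deg_a, deg_b):
    """Fraction of shared DEGs whose direction ('up'/'down') matches."""
    shared = set(deg_a) & set(deg_b)
    if not shared:
        return 0.0
    same = sum(1 for g in shared if deg_a[g] == deg_b[g])
    return same / len(shared)


def implied_overlap(jaccard, size_a, size_b):
    """Solve J = I / (|A| + |B| - I) for the intersection size I."""
    return jaccard * (size_a + size_b) / (1 + jaccard)


# With the reported DEG counts (2347 and 2018) and Jaccard index (0.62),
# the implied overlap is roughly 1670 genes.
overlap_estimate = implied_overlap(0.62, 2347, 2018)
```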

3.3. Identification of a Conserved Core Metaflammation Gene Set

Integrative analysis of differentially expressed genes (DEGs) across the three pathological states (CCA, T2D, and HBV infection) revealed a significant overlap of 156 genes (92 upregulated, 64 downregulated), which we defined as the core metaflammation signature (hypergeometric test, p = 2.3 × 10^−15, 42.6-fold enrichment) (Figure 3A, Table 3). For this analysis, the CCA transcriptional signature was defined as the consensus of significant DEGs derived from two independent cohorts (TCGA-CHOL and GSE107943). Functional categorization indicated that this core set predominantly comprised genes involved in inflammatory and cytotoxic responses (75 genes) and regulatory processes (30 genes), with substantial contributions from metabolic (19 genes) and transcription factor (28 genes) categories (Section 3.6). Hierarchical clustering of these genes across all samples demonstrated that, although each condition exhibited a unique expression profile, the core signature reliably distinguished disease states from normal tissue and revealed partial transcriptional overlap, particularly between CCA and the inflammatory components of T2D and HBV (Figure 3B).

3.4. Enriched Pathways Highlight Metabolic-Inflammatory Crosstalk

Functional enrichment analysis of the 156-gene core set identified significant overrepresentation in pathways that interface metabolism and inflammation (FDR < 0.001) (Table 4 and Figure 4A). The most significantly enriched pathway was PPAR signaling (FDR = 3.2 × 10^−8), a central regulator of lipid metabolism and inflammation. This was followed by cytokine–cytokine receptor interaction (FDR = 2.1 × 10^−6), PI3K-Akt signaling (FDR = 1.2 × 10^−4), and TNF signaling (FDR = 3.8 × 10^−4). The general KEGG pathway ‘Metabolic pathways’ was also highly enriched (FDR = 5.4 × 10^−5). Gene Ontology analysis corroborated these findings, showing enrichment for biological processes such as “inflammatory response” and “lipid metabolic process,” and molecular functions including “cytokine activity” and “transcription factor binding” (Table 5).
Table 5. Gene Ontology (GO) Enrichment Analysis of Core Metaflammation Genes.
| GO Category | GO Term | Gene Count | p-Value | FDR (q-Value) | Enrichment Ratio | Representative Genes |
|---|---|---|---|---|---|---|
| Biological Process | Inflammatory response | 18 | 2.3 × 10^−9 | 1.2 × 10^−7 | 5.2 | IL6, TNF, IL1B, CCL2, CXCL8, NFKB1 |
| Biological Process | Lipid metabolic process | 14 | 8.6 × 10^−8 | 4.3 × 10^−6 | 4.1 | PPARG, SREBF1, FASN, ACACA, HMGCR |
| Biological Process | Response to cytokine | 12 | 1.8 × 10^−7 | 8.9 × 10^−6 | 4.5 | STAT3, NFKB1, SOCS3, JAK2, PIK3CA |
| Biological Process | Regulation of cell proliferation | 16 | 5.4 × 10^−7 | 1.8 × 10^−5 | 3.8 | AKT1, STAT3, MYC, EGFR, VEGFA |
| Molecular Function | Cytokine activity | 9 | 4.2 × 10^−8 | 2.1 × 10^−6 | 6.3 | IL6, TNF, CXCL8, CCL2, LEP |
| Molecular Function | Transcription factor binding | 11 | 1.1 × 10^−6 | 5.6 × 10^−5 | 4.8 | PPARG, STAT3, JUN, FOS, NFKB1 |
| Molecular Function | Kinase activity | 8 | 2.0 × 10^−5 | 1.0 × 10^−3 | 4.2 | AKT1, MTOR, PIK3CA, JAK2, MAPK1 |
| Molecular Function | Receptor binding | 15 | 4.5 × 10^−5 | 2.0 × 10^−3 | 3.5 | IL6, TNF, VEGFA, LEP, ADIPOQ |
| Cellular Component | Extracellular space | 22 | 6.8 × 10^−10 | 3.4 × 10^−8 | 4.3 | IL6, TNF, VEGFA, CCL2, LEP, ADIPOQ |
| Cellular Component | Membrane raft | 7 | 4.0 × 10^−5 | 2.0 × 10^−3 | 5.6 | EGFR, TLR4, CD36, CAV1, FLOT1 |
| Cellular Component | Mitochondrion | 9 | 8.0 × 10^−5 | 4.0 × 10^−3 | 3.9 | CPT1A, ACADM, UCP2, BCL2, VDAC1 |
GO enrichment analysis was performed using clusterProfiler with a background of all expressed genes in the human genome. Significantly enriched terms (FDR q-value < 0.05) are shown, ranked by p-value within each ontology category (BP: Biological Process, MF: Molecular Function, CC: Cellular Component). The top 3–4 terms per category are presented. Enrichment ratio and representative genes are indicated.
Figure 4. (A) Top Enriched Pathways for Core Metaflammation Genes. Pathway enrichment analysis of the core genes indicates significant enrichment of pathways related to metabolism (including PPAR signaling, fatty acid metabolism, and insulin resistance) and immune/inflammatory signaling (such as cytokine interactions, TNF, NF-κB, and JAK-STAT pathways). The plot shows these pathways ordered by enrichment strength, with the x-axis representing statistical significance (typically shown as −log10 of the p-value). This analysis suggests coordinated changes in both metabolic and inflammatory processes.
(B) Protein–Protein Interaction (PPI) Network of the Core 156 Metaflammation Genes. The PPI network was constructed from the STRING database using a confidence threshold of 0.7, comprising 142 nodes and 458 edges. Node colors denote functional modules: Inflammatory (red), Metabolic (green), and Growth Signaling (blue). Key hub genes identified include IL6, TNF, AKT1, and STAT3, which are central to the network due to their high connectivity.
(C) Functional Modules within the Metaflammation PPI Network. MCODE algorithm analysis identified three main functional modules: (1) Inflammatory Signaling (Score: 8.4) focusing on IL6, TNF, IL1B, and NFKB1; (2) Metabolic Regulation (Score: 6.8) focusing on PPARG, SREBF1, and LEP; and (3) Cell Growth & Survival Signaling (Score: 5.2) focusing on AKT1, STAT3, MTOR, and MYC. Connector genes such as JUN and NFKB1 facilitate interactions between these modules and are thought to promote molecular crosstalk, linking inflammatory, metabolic, and growth signals within the metaflammation network. Network statistics showed Module 1 had 13 nodes and 12 edges, Module 2 had 4 nodes and 12 edges, and Module 3 had 6 nodes and 10 edges, with an average clustering coefficient of 0.42.
(D) Interaction Subnetwork of the Top 10 Hub Genes. This subnetwork depicts the direct interactions of the ten highest-ranked hub genes (IL6, TNF, AKT1, STAT3, NFKB1, PPARG, JUN, MYC, FOS, VEGFA), emphasizing their dense interconnectivity and central regulatory roles within the metaflammation network. Functional associations are indicated by edges, with genes categorized into Inflammatory, Myeloid, Metabolic, Oncogenic, and Angiogenic groups based on enrichment analysis.
(E) Radial Visualization of Hub Gene Centrality. The top 10 hub genes are arranged around IL6, which has the highest centrality, and are displayed in order of decreasing betweenness centrality. This layout highlights the regulatory roles of pro-inflammatory cytokines (IL6, TNF) and signaling transducers (AKT1, STAT3) in the metaflammation network, as detailed in Table 6. Genes are color-coded by function: Metabolic (green), Inflammatory (orange), Oncogenic (purple), and Angiogenic (blue).

3.5. Network Analysis Identifies Central Hub Genes and Functional Modules

Protein–protein interaction (PPI) network analysis of the core genes revealed a high-confidence network comprising 142 nodes and 458 interactions, exhibiting characteristics of a biological small-world network (average degree = 6.45) (Figure 4B). Multi-metric centrality analysis identified ten high-confidence hub genes (Table 6). Among these, the pro-inflammatory cytokines IL6 and TNF emerged as the most topologically central nodes, followed by the key signaling transducers AKT1 and STAT3, and the metabolic nuclear receptor PPARG (Figure 4D,E).
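The reported average degree follows directly from the node and edge counts, since each edge contributes to the degree of two nodes. The short sketch below reproduces that arithmetic and also computes the corresponding graph density, which is not reported in the text and is included only for context; the function names are hypothetical.

```python
def average_degree(n_nodes, n_edges):
    """Mean degree of an undirected graph: 2E / N."""
    return 2 * n_edges / n_nodes


def density(n_nodes, n_edges):
    """Fraction of possible undirected edges that are present: 2E / (N (N - 1))."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


avg_k = average_degree(142, 458)   # 916 / 142
rho_g = density(142, 458)          # 916 / 20022
```

The average degree evaluates to about 6.45, in agreement with the reported value.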
Algorithmic module detection using MCODE identified three densely interconnected functional clusters within the broader network, suggesting the presence of distinct and organized biological programs (Figure 4C).
Module 1: Inflammatory Signaling (Score = 8.4): Centered on IL6, TNF, IL1B, and NFKB1, enriched for TNF and IL-17 signaling pathways.
Module 2: Metabolic Regulation (Score = 6.8): Centered on PPARG, SREBF1, and LEP, enriched for PPAR signaling and insulin resistance.
Module 3: Cell Growth & Survival (Score = 5.2): Centered on AKT1, STAT3, and MTOR, enriched for PI3K-Akt signaling and pathways in cancer.
Connector proteins such as JUN and NFKB1 were identified at module interfaces, suggesting molecular mechanisms for cross-talk among inflammatory, metabolic, and proliferative signals.

3.6. Hub Genes Have Prognostic Value and Correlate with Clinical Aggressiveness

Survival analysis in the TCGA-CHOL cohort revealed significant associations between hub gene expression levels and patient outcomes (Figure 5A and Table 7). High expression of the pro-inflammatory hubs IL6 (HR = 2.1, p = 0.001) and TNF (HR = 1.8, p = 0.004), as well as the proliferative hub STAT3 (HR = 1.5, p = 0.04), correlated with poorer overall survival (OS). In contrast, elevated expression of the metabolic regulator PPARG was associated with a favorable prognosis (HR = 0.5, p = 0.002).
Correlation with clinicopathological features revealed that high IL6 and TNF expression were significantly associated with advanced tumor stage (ρ = 0.42, p = 0.01; ρ = 0.38, p = 0.02, respectively) and lymph node metastasis (ρ = 0.40, p = 0.01 for IL6). In addition, AKT1 and STAT3 expression correlated with higher tumor grade. Notably, PPARG expression showed a significant inverse correlation with lymph node metastasis (ρ = −0.36, p = 0.02) (Figure 5B). Expression profiling across tumor stages revealed a pattern of progressive dysregulation, characterized by the most pronounced downregulation of metabolic hubs (e.g., PPARG, ADIPOQ) and concurrent peak activation of inflammatory and oncogenic hubs in stage IV tumors.
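The ρ values above are Spearman rank correlations, which suit ordinal variables such as tumor stage. As a minimal sketch, Spearman's ρ can be computed as the Pearson correlation of tie-averaged ranks; the helper names and the toy expression and stage values below are hypothetical and serve only to show the calculation.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values and tumor stages, for illustration only.
expression = [1.2, 3.4, 2.2, 5.1, 4.0]
stage = [1, 2, 2, 4, 3]
rho = spearman_rho(expression, stage)
print(round(rho, 3))
```

The associated p-values would come from a significance test on ρ, which is omitted here.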
Figure 5. (A) Kaplan–Meier Survival Analysis of Hub Genes. Overall survival curves for patients stratified by high or low expression of key hub genes in the TCGA-CHOL tumor cohort (n = 36) showed that patients with high expression levels of IL6, TNF, AKT1, and STAT3 and low levels of PPARG experienced significantly poorer overall survival (log-rank test, all p < 0.05). (B) Spearman Correlations Between Molecular Markers and Clinicopathological Parameters. Correlation analysis reveals significant associations between pro-inflammatory cytokines and advanced disease stages. Serum IL-6 positively correlates with higher Tumor Stage (ρ = 0.42, p < 0.01) and Lymph Node Metastasis (ρ = 0.40, p < 0.01), while TNF relates to advanced Tumor Stage (ρ = 0.38, p < 0.02). Additionally, oncogenic markers AKT1 (ρ = 0.35, p < 0.03) and STAT3 (ρ = 0.32, p < 0.04) are positively correlated with higher Tumor Grade. Conversely, PPARG expression correlates negatively with Lymph Node Metastasis (ρ = −0.36, p < 0.02).

3.7. Survival Analysis Robustness

The TCGA-CHOL survival cohort comprised 36 patients with 21 death events, resulting in an events-per-variable (EPV) ratio of 2.6 for the full multivariate model (5 hub genes + 3 clinical covariates). Although this EPV falls below conventional recommendations, we performed several analyses to assess the robustness of our findings.
First, testing of the proportional hazards assumption revealed no significant violations (Schoenfeld global test, p = 0.32; all individual covariates p > 0.10), and log-minus-log plots confirmed parallel curves (Figure 5).
Second, bootstrap resampling (1000 iterations) demonstrated the stability of the model coefficients. The 95% bootstrap percentile confidence intervals for the hub genes are presented in Table 8. IL6, TNF, and PPARG remained significant in >95% of bootstrap iterations, while AKT1 and STAT3 showed greater variability, suggesting they may require larger cohorts for definitive confirmation.
Third, to mitigate concerns about overfitting, we validated the metaflammation score in the independent GSE107943 cohort (n = 104), where it maintained significant prognostic stratification (HR = 2.1, 95% CI: 1.4–3.1, p = 0.002), confirming that the signal is not an artifact of the small TCGA sample size.
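Two of the quantities in this section are simple enough to sketch in code: the events-per-variable ratio and a percentile bootstrap interval. The snippet below is illustrative only. The study bootstraps Cox model coefficients, whereas this sketch resamples a plain mean as a stand-in statistic so that it stays self-contained; the function names, the seed, and the toy values are assumptions.

```python
import random
from statistics import mean


def events_per_variable(n_events, n_predictors):
    """Events-per-variable ratio for a regression model."""
    return n_events / n_predictors


def bootstrap_percentile_ci(sample, stat=mean, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for `stat`, resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    boot = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = boot[int(n_boot * alpha / 2)]
    upper = boot[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Reported in the text: 21 deaths, 5 hub genes + 3 clinical covariates.
epv = events_per_variable(21, 5 + 3)
print(round(epv, 1))  # 2.6

# Hypothetical per-patient values, used only to exercise the resampling loop.
toy = [0.42, 0.61, 0.55, 0.70, 0.38, 0.49, 0.66, 0.58]
low, high = bootstrap_percentile_ci(toy)
print(low, high)
```

To mirror the analysis described above, `stat` would be replaced by a function that refits the Cox model on each resampled cohort and returns the coefficient of interest.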

3.8. Protein-Level Validation Confirms Transcriptomic Dysregulation

Immunohistochemical validation using the Human Protein Atlas confirmed the dysregulation of key hub proteins in cholangiocarcinoma (CCA) tissue compared with normal bile duct epithelium (Figure 6A and Table 9). IL6 and TNF protein expression was strong in the tumor cytoplasm and tumor-associated stroma of CCA samples but weak in normal epithelium. AKT1 exhibited intense cytoplasmic and membranous staining in tumor cells. Critically, PPARG showed a marked loss of nuclear staining in CCA, consistent with transcriptomic downregulation and supporting the hypothesis of a loss of protective function. Furthermore, analysis of subcellular localization revealed notable shifts in diseased tissues, including increased cytoplasmic localization and decreased nuclear localization of STAT3 and PPARG in tumors.

3.9. A Derived Metaflammation Score Is a Robust Prognostic Biomarker

To translate the network findings into a clinically applicable metric, we constructed a quantitative metaflammation score based on the expression of five key hub genes (IL6, TNF, AKT1, STAT3, and PPARG), as described in the Methods section. Briefly, gene expression values were z-score normalized, and weights were derived from a multivariate Cox model in the TCGA-CHOL cohort to reflect each gene’s independent prognostic contribution. The resulting score formula was:
Metaflammation Score = (0.25 × IL6) + (0.20 × TNF) + (0.15 × AKT1) + (0.15 × STAT3) − (0.25 × PPARG)
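As a minimal sketch of how the formula above could be applied, the snippet below z-scores each gene across patients and takes the weighted sum. The weights are those given in the formula. The function names, the use of the sample standard deviation, and the toy expression matrix are assumptions made for illustration.

```python
from statistics import mean, stdev

# Weights as stated in the score formula; PPARG enters with a negative sign.
WEIGHTS = {"IL6": 0.25, "TNF": 0.20, "AKT1": 0.15, "STAT3": 0.15, "PPARG": -0.25}


def zscores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def metaflammation_scores(expr):
    """expr maps each gene to a list of expression values, one per patient."""
    z = {gene: zscores(expr[gene]) for gene in WEIGHTS}
    n_patients = len(z["IL6"])
    return [
        sum(WEIGHTS[gene] * z[gene][i] for gene in WEIGHTS)
        for i in range(n_patients)
    ]


# Hypothetical expression values for three patients, for illustration only.
toy_expr = {
    "IL6": [1.0, 2.0, 3.0],
    "TNF": [1.0, 2.0, 3.0],
    "AKT1": [1.0, 2.0, 3.0],
    "STAT3": [1.0, 2.0, 3.0],
    "PPARG": [3.0, 2.0, 1.0],
}
scores = metaflammation_scores(toy_expr)
print([round(s, 2) for s in scores])  # [-1.0, 0.0, 1.0]
```

In this toy example the third patient, with the highest inflammatory and proliferative hub expression and the lowest PPARG, receives the highest score.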
In the TCGA-CHOL cohort, this score effectively stratified patients into low-, intermediate-, and high-risk groups, with median overall survival (OS) of 35.4, 24.1, and 16.2 months, respectively (HR for high-risk vs. low-risk = 2.8; 95% CI: 1.8–4.3; p < 0.001) (Figure 6B). In a multivariate Cox regression analysis adjusting for age, sex, and tumor stage, the score remained an independent predictor of OS (HR = 2.2, p < 0.001). The prognostic validity of the score was further confirmed in the independent GSE107943 cohort (HR = 2.1, p = 0.002), with a combined concordance index (C-index) of 0.68 across datasets. Of note, although the GSE107943 cohort contributed to the initial gene selection, it was not used to train the prognostic model.
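The concordance index reported above measures how often, among comparable patient pairs, the patient with the higher risk score is the one who dies earlier. The following is a minimal sketch of Harrell's C-index in plain Python; the function name and the toy data are hypothetical, and how the "combined" value across datasets was obtained is not specified here.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when patient i had an observed death
    strictly before patient j's follow-up time. Tied risks count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up (months), death indicators and risk scores.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
print(round(c_index, 2))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so 0.68 indicates moderate discrimination.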

4. Discussion

This study constitutes the first integrative, multi-database analysis to elucidate the molecular interconnectivity between T2D, HBV infection, and CCA within the conceptual framework of metaflammation. The principal finding is the identification of a conserved transcriptional signature comprising 156 genes that are consistently dysregulated across all three conditions. Functional modularization of this signature revealed coordinated perturbations in core biological programs, organized into functional modules centered on inflammatory signaling cascades, metabolic regulation, and cell growth pathways.
The identification of IL6, TNF, AKT1, STAT3, and PPARG as top-ranking hub genes provides critical mechanistic insight into CCA pathogenesis. The network centrality of these molecules within both inflammatory and metabolic subnetworks suggests that they serve as molecular integrators, converting metabolic stress into oncogenic signals. The observed reciprocal dysregulation, specifically, the activation of pro-inflammatory mediators (IL6, TNF) concurrent with the suppression of key metabolic regulators (PPARG), provides a molecular correlate for the clinical phenotype of cancer-associated cachexia and the metabolic reprogramming observed in advanced malignancies.
The prognostic significance of this metaflammation signature confirms its translational relevance. Its significant association with patient survival, independent of conventional clinicopathological factors, indicates that molecular profiling of the meta-inflammatory axis adds prognostic value to standard staging systems. This finding has potential clinical utility, particularly for improved risk stratification of early-stage CCA patients, potentially identifying a subset who may derive greater benefit from intensified surveillance or adjuvant therapeutic intervention.

4.1. Metaflammation as a Unifying Pathogenic Mechanism

Our findings characterize CCA in the context of T2D and HBV co-morbidity as a metaflammatory malignancy. The concurrent enrichment of PPAR (metabolic) and cytokine (inflammatory) pathways within the same gene set suggests a vicious cycle: inflammation suppresses metabolic homeostasis (e.g., via downregulation of PPARG), while metabolic dysfunction (e.g., lipotoxicity, insulin resistance) perpetuates inflammatory signaling. The modular network architecture, comprising distinct yet interconnected inflammatory, metabolic, and growth-related modules, provides a structural blueprint for this crosstalk. Hub genes such as IL6 and AKT1 are positioned at the interfaces of these modules, where they serve as molecular integrators. Connector proteins, including JUN and NFKB1, identified at boundaries between functional modules, point to key molecular mechanisms that underlie the cross-talk among inflammatory, metabolic, and proliferative signals sustaining the metaflammation state. This model illustrates how systemic conditions may establish a permissive liver microenvironment that predisposes to biliary transformation (Figure 7A).

4.2. Comparison with Existing Literature

The findings of this systems-level analysis both corroborate and refine established oncogenic paradigms while resolving contextual discrepancies reported in prior research. Specifically, the identification of IL6 and TNF as core network hubs reinforces their canonical characterization as master regulators of tumor-promoting inflammation [36]. Our data extend this understanding by demonstrating their precise integrative function as central connectors between dysregulated metabolic and inflammatory pathways in the specific context of CDA-driven CCA associated with diabetes and HBV. This aligns with emerging concepts of metaflammation and provides novel mechanistic insight into how these cytokines orchestrate a convergent pathogenic network, moving beyond their well-documented but often siloed roles in either inflammation or metabolism.
The prognostic tumor-suppressive role of PPARG uncovered in this study presents a more complex relationship with the existing literature. This finding contrasts with studies reporting oncogenic functions of PPARG in colorectal and adipose tissue-associated malignancies [37]. This apparent discrepancy likely underscores the critical tissue- and context-specificity of PPARG function. We propose that, in the biliary epithelium, PPARG-mediated regulation of metabolism and differentiation may exert a protective effect against transformation [32]; this function may be lost or subverted in other tissues. Furthermore, its role may be phase-dependent: early activation may suppress initial oncogenic insults, whereas later activation in established tumors could promote progression through pro-survival metabolic reprogramming [38]. Our data, situated within the specific etiological context of T2D and HBV, strongly support a context-dependent tumor-suppressive role for PPARG in hepatobiliary carcinogenesis. Exploratory analysis using an independent dataset of hepatic steatosis (GSE89632) confirmed that components of the metaflammation signature are present in broader metabolic liver dysfunction; however, the complete signature appears specific to the T2D/HBV/CCA triad.
Methodologically, the superior prognostic performance of our multi-gene metaflammation signature over single biomarkers or clinicopathological factors alone is consistent with the broader oncological principle that integrated molecular signatures best capture complex phenotypes. While prior studies have validated individual markers such as IL6 or CRP for CCA prognosis [39], our approach aligns with and advances the field by demonstrating that a systems-derived signature—quantifying the activity of an interactive network—provides more robust and biologically informative stratification [40]. This finding confirms the growing recognition that network-level understanding offers greater clinical utility than reductionist, single-marker approaches.
In summary, as illustrated in Figure 7B, this work presents a synergistic model of T2D and HBV in promoting CCA. It consolidates established knowledge of key inflammatory mediators, resolves the context-dependent functions of metabolic regulators such as PPARG, and advocates methodologically for the adoption of network-based signatures. Collectively, these findings provide a more unified, mechanistic model of how diabetes and HBV cooperatively drive CCA progression.

4.3. Therapeutic Implications and Drug Repurposing Potential

The hub genes identified in this study represent immediate therapeutic targets (Figure 5A). Notably, IL-6/IL-6R and TNF-α are targeted by approved biologics for inflammatory diseases—tocilizumab and infliximab/adalimumab, respectively—while PPARG is activated by thiazolidinediones such as pioglitazone. Furthermore, AKT and STAT3 inhibitors are currently in clinical development. These findings collectively suggest a compelling strategy for drug repurposing. Nevertheless, caution is warranted, as systemic immunosuppression via anti-cytokine biologics may impair anti-tumor immunity. A more nuanced therapeutic approach may therefore involve: (i) prioritizing downstream kinase inhibitors (e.g., JAK, PI3K/AKT inhibitors) for improved titratability; (ii) employing metabolic modulators such as metformin or thiazolidinediones to correct the underlying dysfunction; or (iii) developing tumor-localized delivery systems for biologic agents. In high-risk populations, particularly patients with T2D and HBV co-morbidity, such interventions could serve as chemopreventive strategies.

4.4. Clinical Translation: Biomarkers and Risk Stratification

The derived metaflammation score shows promise as a prognostic biomarker and requires validation in truly independent, prospectively collected cohorts. Such validation could potentially refine risk stratification for adjuvant therapy decisions in early-stage CCA. Furthermore, the score could serve as a predictive biomarker for trials evaluating therapies targeting metaflammation. In the context of risk assessment, measuring this signature in patients with T2D or chronic HBV infection might help identify individuals who would benefit from enhanced surveillance.

4.5. Limitations and Future Directions

This study has several limitations. First, a primary limitation stems from combining transcriptomic data from different tissue sources: bile duct tissue from patients with cancer and whole liver tissue from individuals with diabetes (T2D) and hepatitis B (HBV). Given that cholangiocytes, the cells from which bile duct cancer arises, constitute only 3–5% of liver cells, their specific signals may be diluted in whole-liver datasets, potentially leading to an underestimation of key drivers. Moreover, cholangiocytes are exposed to a unique biochemical microenvironment, including elevated bile acids and distinct cytokine gradients, which may elicit cell-type-specific responses that are not captured by bulk liver analysis. Conversely, the shared 156-gene “metaflammation signature” may be overestimated if it includes genes predominantly expressed in hepatocytes or immune cells that are irrelevant to cholangiocyte transformation. Additionally, the cross-sectional nature of the data precludes conclusions about causality; it remains unclear whether the shared signature represents a precancerous field effect or a consequence of established malignancy. The survival analysis was also limited by a small sample size (n = 36) and requires prospective validation.
Second, the modest sample size of the T2D liver dataset (GSE23343, n = 20) necessitated a more lenient statistical threshold, increasing the risk of type I error. Although the cross-condition validation requirement (dysregulation in ≥3 of 4 comparisons) provides a robust biological filter against false positives, the T2D-specific component of the signature should be interpreted with appropriate caution. Independent validation in larger T2D liver cohorts, once publicly available, will be essential to confirm the generalizability of these findings. Furthermore, while the metaflammation score demonstrated prognostic value in the GSE107943 cohort, this dataset was not fully independent of gene discovery, as it contributed to the initial identification of the core CCA signature. Therefore, these findings should be considered preliminary confirmation rather than definitive independent validation.
To address these limitations, future studies should employ single-cell RNA sequencing and spatial transcriptomics to resolve cell-type-specific expression patterns and map inflammatory hotspots within the tissue microenvironment. Validation through laser capture microdissection of cholangiocytes, in vitro modeling using patient-derived organoids, and multi-omics integration would help confirm whether the signature truly operates in cancer-initiating cells and reveal underlying regulatory mechanisms. Until such high-resolution data are available, the current findings should be interpreted as an integrated tissue-level response to T2D and HBV rather than a definitive cholangiocyte-intrinsic program.

5. Conclusions

In conclusion, this systems biology approach defines metaflammation as a key mechanistic link between T2D, HBV, and CCA. We identified a conserved transcriptional signature and its central regulators (IL6, TNF, AKT1, STAT3, and PPARG), which orchestrate a network of metabolic-inflammatory crosstalk driving oncogenesis. Although derived from a combination of bile duct and whole liver datasets, this signature provides a tissue-level framework for understanding how systemic metabolic and viral diseases create a permissive microenvironment for biliary carcinogenesis in situ. Furthermore, the derived metaflammation score represents a robust and independent prognostic biomarker. Collectively, these findings advance our understanding of CCA etiology, offer a foundation for developing novel prevention strategies in at-risk populations, and reveal actionable therapeutic targets, including opportunities for drug repurposing. Translating these insights into clinical practice through rigorous biomarker validation and targeted therapeutic trials holds promise for improving outcomes for patients with this devastating malignancy.

Author Contributions

Y.C. contributed to Conceptualization and Supervision, and H.M.R. contributed to the literature search, writing, data analysis, interpretation of data, statistical analysis, and drafting of the manuscript. J.L., R.M.Z., and J.Y. contributed to the review and editing. P.K. and S.M. contributed to the study design. X.Z., M.S.A., S.A.Z.M.F., Z.G., and C.D. contributed to Data preparation. All authors have read and agreed to the published version of the manuscript.

Funding

The authors received no financial support for the research or authorship of this article. Publication of this article was financially supported by the Heilongjiang Province important research project, grant number 2024ZX12C21.

Institutional Review Board Statement

Ethical review and approval were waived for this study because it analyzed publicly available, anonymized datasets that had already been published. The study did not involve direct interaction with human subjects, and the data were stripped of all identifiers before access. Therefore, the research did not constitute human subjects research as defined by [institution’s IRB or relevant guidelines].

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of this study are available in the article. Publicly available datasets were analyzed in this study, including the TCGA Cholangiocarcinoma (CHOL) cohort via the UCSC Xena Browser and Gene Expression Omnibus (GEO) Series GSE107943 (CCA validation), GSE23343 (Type 2 Diabetes liver), GSE58208 (HBV-infected liver), and GSE89632 (Metabolic dysfunction/steatosis). Additional resources included the Human Protein Atlas (v24.0) for protein expression data, the STRING database for PPI networks, and the KEGG Pathway, Reactome Pathway, and Gene Ontology databases for bioinformatics analyses. Data generated during this study are available from the corresponding author upon reasonable request.

Acknowledgments

I would like to thank all the staff of the General Surgery Department, especially Wang Hao and Yueping Liu, for their cooperation and kind support throughout my study period.

Conflicts of Interest

All the authors report having no conflicts of interest for this article.

Abbreviations

Abbreviation: Full name
T2D: Diabetes mellitus type 2
TCGA: The Cancer Genome Atlas
GEO: Gene Expression Omnibus
HPA: Human Protein Atlas
DEG: Differentially Expressed Gene
PPI: Protein–Protein Interaction
KEGG: Kyoto Encyclopedia of Genes and Genomes
GO: Gene Ontology
GSEA: Gene Set Enrichment Analysis
OS: Overall Survival
DFS: Disease-Free Survival
HR: Hazard Ratio
CI: Confidence Interval
FDR: False Discovery Rate
RNA-seq: RNA Sequencing
IHC: Immunohistochemistry
IL6: Interleukin 6
TNF: Tumor Necrosis Factor
PPARG: Peroxisome Proliferator-Activated Receptor Gamma
AKT1: AKT Serine/Threonine Kinase 1
STAT3: Signal Transducer and Activator of Transcription 3
NFKB1: Nuclear Factor Kappa B Subunit 1
VEGFA: Vascular Endothelial Growth Factor A
EGFR: Epidermal Growth Factor Receptor
KRAS: KRAS Proto-Oncogene, GTPase
TP53: Tumor Protein P53
MYC: MYC Proto-Oncogene, bHLH Transcription Factor

References

  1. Bridgewater, J.; Galle, P.R.; Khan, S.A.; Llovet, J.M.; Park, J.W.; Patel, T.; Pawlik, T.M.; Gores, G.J. Guidelines for the diagnosis and management of intrahepatic cholangiocarcinoma. J. Hepatol. 2014, 60, 1268–1289.
  2. Petrick, J.L.; Florio, A.A.; Znaor, A.; Ruggieri, D.; Laversanne, M.; Alvarez, C.S.; Ferlay, J.; Valery, P.C.; Bray, F.; McGlynn, K.A. International trends in hepatocellular carcinoma incidence, 1978–2012. Int. J. Cancer 2020, 147, 317–330.
  3. O’Keefe, S.J. Diet, microorganisms and their metabolites, and colon cancer. Nat. Rev. Gastroenterol. Hepatol. 2016, 13, 691–706.
  4. Colangelo, M.; Di Martino, M.; Polidoro, M.A.; Forti, L.; Tober, N.; Gennari, A.; Pagano, N.; Donadon, M. Management of intrahepatic cholangiocarcinoma: A review for clinicians. Gastroenterol. Rep. 2025, 13, goaf005.
  5. Clements, O.; Eliahoo, J.; Kim, J.U.; Taylor-Robinson, S.D.; Khan, S.A. Risk factors for intrahepatic and extrahepatic cholangiocarcinoma: A systematic review and meta-analysis. J. Hepatol. 2020, 72, 95–103.
  6. Abdelhamed, W.; El-Kassas, M. Hepatitis B virus as a risk factor for hepatocellular carcinoma: There is still much work to do. Liver Res. 2024, 8, 83–90.
  7. Liu, W.; Zhang, X.; Deng, Y.; Wang, D.; Li, H. Unfolding HBx for an epigenetic switch of HBV cccDNA minichromosome. Protein Cell 2025, 16, 753–763.
  8. Ritchie, M.D.; Holzinger, E.R.; Li, R.; Pendergrass, S.A.; Kim, D. Methods of integrating data to uncover genotype-phenotype interactions. Nat. Rev. Genet. 2015, 16, 85–97.
  9. Tomczak, K.; Czerwińska, P.; Wiznerowicz, M. The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge. Contemp. Oncol. 2015, 19, A68–A77.
  10. Goldman, M.J.; Craft, B.; Hastie, M.; Repečka, K.; McDade, F.; Kamath, A.; Banerjee, A.; Luo, Y.; Rogers, D.; Brooks, A.N.; et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 2020, 38, 675–678.
  11. Chen, Y.; Xu, X.; Wang, Y.; Zhang, Y.; Zhou, T.; Jiang, W.; Wang, Z.; Chang, J.; Liu, S.; Chen, R.; et al. Hypoxia-induced SKA3 promoted cholangiocarcinoma progression and chemoresistance by enhancing fatty acid synthesis via the regulation of PAR-dependent HIF-1a deubiquitylation. J. Exp. Clin. Cancer Res. 2023, 42, 265.
  12. Wang, Q.; Li, X.; Wushoulaji, K.; Wang, J.; Wan, L.; Yang, Y.; Gong, X. Exploring the biological functions and immune regulatory roles of IRAK3, TNFRSF1A, CX3CR1, and JUNB in T2DM combined with MAFLD: Integrated bioinformatics and single-cell analysis. Front. Immunol. 2025, 16, 1587225.
  13. Sokouti, B. A systems biology approach for investigating significantly expressed genes among COVID-19, hepatocellular carcinoma, and chronic hepatitis B. Egypt J. Med. Hum. Genet. 2022, 23, 146.
  14. Uhlen, M.; Fagerberg, L.; Hallstroem, B.M.; Lindskog, C.; Oksvold, P.; Mardinoglu, A.; Sivertsson, A.; Kampf, C.; Sjoestedt, E.; Asplund, A.; et al. Tissue-based map of the human proteome. Science 2015, 347, 1260419.
  15. Kanehisa, M.; Furumichi, M.; Sato, Y.; Kawashima, M.; Ishiguro-Watanabe, M. KEGG for taxonomy-based analysis of pathways and genomes. Nucleic Acids Res. 2023, 51, D587–D592.
  16. Gillespie, M.; Jassal, B.; Stephan, R.; Milacic, M.; Rothfels, K.; Senff-Ribeiro, A.; Griss, J.; Sevilla, C.; Matthews, L.; Gong, C.; et al. The Reactome Pathway Knowledgebase 2022. Nucleic Acids Res. 2022, 50, D687–D692.
  17. Piñero, J.; Ramírez-Anguita, J.M.; Saüch-Pitarch, J.; Ronzano, F.; Centeno, E.; Sanz, F.; Furlong, L.I. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 2020, 48, D845–D855.
  18. Barrett, T.; Wilhite, S.E.; Ledoux, P.; Evangelista, C.; Kim, I.F.; Tomashevsky, M.; Marshall, K.A.; Phillippy, K.H.; Sherman, P.M.; Holko, M.; et al. NCBI GEO: Archive for functional genomics data sets—Update. Nucleic Acids Res. 2013, 41, D991–D995.
  19. Zhang, Y.; Parmigiani, G.; Johnson, W.E. ComBat-seq: Batch effect adjustment for RNA-seq count data. NAR Genom. Bioinform. 2020, 2, lqaa078.
  20. Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550.
  21. Robinson, M.D.; Oshlack, A. A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biol. 2010, 11, R25.
  22. Irizarry, R.A.; Hobbs, B.; Collin, F.; Beazer-Barclay, Y.D.; Antonellis, K.J.; Scherf, U.; Speed, T.P. Exploration, normalization, and summaries of high-density oligonucleotide array probe-level data. Biostatistics 2003, 4, 249–264.
  23. Benjamini, Y.; Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. J. R. Stat. Soc. Ser. B (Methodol.) 1995, 57, 289–300.
  24. Edwards, A.W.F. Chapter 67—R.A. Fisher, statistical methods for research workers, first edition (1925). In Landmark Writings in Western Mathematics 1640–1940; Grattan-Guinness, I., Cooke, R., Corry, L., Crépel, P., Guicciardini, N., Eds.; Elsevier Science: Amsterdam, The Netherlands, 2005; pp. 856–870.
  25. Subramanian, A.; Tamayo, P.; Mootha, V.K.; Mukherjee, S.; Ebert, B.L.; Gillette, M.A.; Paulovich, A.; Pomeroy, S.L.; Golub, T.R.; Lander, E.S.; et al. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA 2005, 102, 15545–15550.
  26. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646.
  27. Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504.
  28. Chin, C.H.; Chen, S.H.; Wu, H.H.; Ho, C.W.; Ko, M.T.; Lin, C.Y. cytoHubba: Identifying hub objects and sub-networks from the complex interactome. BMC Syst. Biol. 2014, 8, S11.
  29. Bader, G.D.; Hogue, C.W. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 2003, 4, 2.
  30. Van Rossum, G.; Drake, F.L. Python 3 Reference Manual; CreateSpace: Scotts Valley, CA, USA, 2009.
  31. R Core Team. R: A Language and Environment for Statistical Computing, Version 4.1.3; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 8 March 2026).
  32. Xu, Y.; Lan, F.; Yang, C.; Li, P. RNA methylation in hepatocellular carcinoma: From metabolic reprogramming and immune escape mechanisms to small molecule inhibitor development. J. Transl. Med. 2025, 23, 1022.
  33. Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47.
  34. Yu, G.; Wang, L.G.; Han, Y.; He, Q.Y. clusterProfiler: An R package for comparing biological themes among gene clusters. OMICS 2012, 16, 284–287.
  35. Wickham, H.; Chang, W.; Henry, L.; Pedersen, T.L.; Takahashi, K.; Wilke, C.; Woo, K.; Yutani, H.; Dunnington, D.; van den Brand, T.; et al. ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics. R Package Version 4.0.1. 2025. Available online: https://cran.r-project.org/web/packages/ggplot2/index.html (accessed on 8 March 2026).
  36. Grivennikov, S.I.; Greten, F.R.; Karin, M. Immunity, inflammation, and cancer. Cell 2010, 140, 883–899.
  37. Al-Sarraf, M.; LeBlanc, M.; Giri, P.G.; Fu, K.K.; Cooper, J.; Vuong, T.; Forastiere, A.A.; Adams, G.; Sakr, W.A.; Schuller, D.E.; et al. Chemoradiotherapy versus radiotherapy in patients with advanced nasopharyngeal cancer: Phase III randomized Intergroup study 0099. J. Clin. Oncol. 1998, 16, 1310–1317.
  38. Poulsen, L.; Siersbæk, M.; Mandrup, S. PPARs: Fatty acid sensors controlling metabolism. Semin. Cell Dev. Biol. 2012, 23, 631–639.
  39. Gu, D.; Zhao, X.; Song, J.; Xiao, J.; Zhang, L.; Deng, G.; Li, D. Expression and clinical significance of interleukin-6 pathway in cholangiocarcinoma. Front. Immunol. 2024, 15, 1374967.
  40. Chaisaingmongkol, J.; Budhu, A.; Dang, H.; Rabibhadana, S.; Pupacdi, B.; Kwon, S.M.; Forgues, M.; Pomyen, Y.; Bhudhisawasdi, V.; Lertprasertsuke, N.; et al. Common Molecular Subtypes Among Asian Hepatocellular Carcinoma and Cholangiocarcinoma. Cancer Cell 2017, 32, 57–70.e3.
Figure 1. (A) Study design for the identification and validation of a metaflammation signature. The workflow involved differential expression analysis of several disease-specific datasets: TCGA-CHOL (cholangiocarcinoma, n = 45), GSE107943 (CCA, n = 163), GSE23343 (Type 2 Diabetes, n = 20), and GSE58208 (HBV infection, n = 102). A Venn analysis identified a consistent core set of genes altered across these conditions. This core gene set was validated using data from the Human Protein Atlas and subsequently analyzed using pathway enrichment (KEGG/Reactome), protein–protein interaction networks (STRING), and clinical survival analysis. The findings were synthesized to define a metaflammation signature and construct a model linking chronic metabolic inflammation to disease pathogenesis. (B) Workflow for the integrative transcriptomic analysis and validation. The schematic describes a bioinformatics pipeline beginning with transcriptomic data acquisition from the TCGA-CHOL cohort and GEO (GSE107943, GSE23343, GSE58208) microarray datasets. Following preprocessing and normalization, differential expression analysis was conducted for conditions CCA, T2D, and HBV. The resulting gene lists were merged to identify a core metaflammation gene set, which was then subjected to functional enrichment and protein–protein interaction network analysis. The clinical relevance of key hub genes was evaluated in the TCGA cohort through survival analysis, and the protein-level expression of prioritized hubs was confirmed with immunohistochemistry data from the Human Protein Atlas. Note: CCA datasets are derived from bile duct tissue, while T2D and HBV datasets are from whole liver tissue. This tissue source heterogeneity is a key limitation discussed in the text.
Figure 2. (A) Evaluation of within-dataset batch correction for microarray data. PCA plots of the GSE58208 (HBV) dataset show samples before and after batch correction, illustrating the effects of the ComBat algorithm. Initially, samples are colored by processing batch and biological condition (HBV+ vs. HBV-). After correction, the plots indicate reduced clustering by technical batch while maintaining clear separation by biological condition, demonstrating the effective removal of non-biological variance. Similar quality control assessments were conducted on other datasets, such as GSE107943. (B) Comparative Differential Gene Expression Analysis Across Datasets. Volcano plots show the results of differential gene expression across four transcriptomic studies comparing disease to control groups. The x-axis represents log2 fold change (log2FC) in gene expression, and the y-axis indicates statistical significance, marked by −log10(FDR). Dashed horizontal lines denote significance thresholds (FDR < 0.05), while vertical dashed lines indicate fold-change thresholds (|log2FC| > 1). Data points are highlighted in red for significantly upregulated genes (FDR < 0.05, log2FC > 1), blue for downregulated genes (FDR < 0.05, log2FC < −1), and gray for non-significant genes. The studies include TCGA-CHOL (tumor vs. normal bile duct tissues, n = 45), GSE107943 (CCA validation, n = 163), GSE23343 (type 2 diabetes vs. control, n = 20), and GSE58208 (HBV+ vs. HBV−, n = 102).
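The volcano-plot colouring rule described in the caption (FDR < 0.05 combined with |log2FC| > 1) can be written as a small classifier. The following is a minimal sketch only: the `GeneResult` type, the `classify` function, and the gene names and values are hypothetical and are not taken from the study.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    log2fc: float
    fdr: float


def classify(result, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Return 'up', 'down', or 'ns' using the thresholds quoted in the caption."""
    if result.fdr < fdr_cutoff and result.log2fc > lfc_cutoff:
        return "up"      # would be drawn in red
    if result.fdr < fdr_cutoff and result.log2fc < -lfc_cutoff:
        return "down"    # would be drawn in blue
    return "ns"          # would be drawn in gray


# Hypothetical example values, not results from any dataset in this article.
demo = [
    GeneResult("GENE_A", 2.3, 0.001),
    GeneResult("GENE_B", -1.7, 0.01),
    GeneResult("GENE_C", 0.4, 0.0001),   # significant but small effect
    GeneResult("GENE_D", 3.0, 0.20),     # large effect but not significant
]
labels = {r.gene: classify(r) for r in demo}
```

Both conditions must hold for a gene to be coloured, which is why `GENE_C` and `GENE_D` in the toy input fall into the non-significant class.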
Figure 3. (A) Identification and Functional Characterization of a Core Metaflammation Gene Set. Identification of a statistically significant core gene set. (A) Venn diagram of DEG overlap across three pathological states. The diagram illustrates the intersection of differentially expressed genes (DEGs) from three disease-specific signatures: CCA (the consensus of DEGs from the TCGA-CHOL and GSE107943 cohorts, n = 2347 genes), Type 2 Diabetes (T2D) (GSE23343, n = 894 genes), and Hepatitis B Virus infection (HBV) (GSE58208, n = 1247 genes). The central overlap of 156 genes is highly statistically significant (hypergeometric p = 2.3 × 10⁻¹⁵, 42.6-fold enrichment) and is defined as the core metaflammation module. (B) The functional composition of the 156 core genes was categorized by primary biological function, revealing a predominance of inflammatory/cytotoxic and regulatory pathways, which collectively characterize the metaflammation phenotype. (C) Summary of the core module's role. The integrative analysis reveals a conserved 156-gene signature common to distinct inflammatory-metabolic disease states, predominantly composed of inflammatory/cytotoxic (75 genes) and regulatory (30 genes) pathways. (B) Expression heatmap of 20 representative core metaflammation genes. Unsupervised hierarchical clustering of 20 core genes was conducted across four sample conditions: normal liver/bile duct (Normal), cholangiocarcinoma (CCA), type 2 diabetic liver (T2D), and hepatitis B virus-infected liver (HBV). Each row represents one gene, with expression levels color-coded: red for upregulation and blue for downregulation. Key functional categories include inflammation (NFKB1, STAT3, IL6, TNF, IL1B), metabolism (PPARG, SREBF1), oncogenesis (TP53, MYC, KRAS), and growth factor signaling (EGFR, MET, VEGFA). Normal samples clustered distinctly, indicating baseline expression. CCA tumors showed significant upregulation of oncogenic and inflammatory genes, T2D livers showed altered expression of metabolic regulators, and HBV-infected livers showed strong upregulation of immune mediators.
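The overlap statistics quoted in the Figure 3 caption (a hypergeometric p-value and a fold enrichment) depend on a background gene universe that the caption does not state. The sketch below shows one common way such quantities are computed, under loudly stated assumptions: `BACKGROUND` is a hypothetical universe size, the expected three-way overlap assumes the three lists are independent draws from that universe, and the hypergeometric tail is shown for a pairwise overlap only. It is not presented as the authors' procedure and is not expected to reproduce the reported values.

```python
from math import comb


def expected_triple_overlap(a, b, c, background):
    """Expected size of a three-way overlap if the lists were independent."""
    return a * b * c / background**2


def fold_enrichment(observed, a, b, c, background):
    """Observed overlap divided by the overlap expected by chance."""
    return observed / expected_triple_overlap(a, b, c, background)


def hypergeom_upper_tail(k, marked, drawn, background):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `background` genes, of which `marked` belong to the other list."""
    total = comb(background, drawn)
    hits = sum(
        comb(marked, i) * comb(background - marked, drawn - i)
        for i in range(k, min(marked, drawn) + 1)
    )
    return hits / total  # exact integer arithmetic, converted to float once


# List sizes are the DEG counts quoted in the caption; BACKGROUND is an
# assumption made purely for illustration.
BACKGROUND = 20_000
fold = fold_enrichment(156, 2347, 894, 1247, BACKGROUND)

# Toy pairwise example with small hypothetical numbers.
p_pair = hypergeom_upper_tail(k=5, marked=10, drawn=10, background=50)
```

The fold enrichment scales with the square of the background size, so the choice of universe (all annotated genes, or only genes measured on every platform) changes the result substantially.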
Figure 6. (A) Representative immunohistochemical validation of hub protein expression and subcellular localization. Immunohistochemical images from the Human Protein Atlas show contrasting protein expression in normal bile ducts and cholangiocarcinoma (CCA) for key proteins in the metaflammation hub. IL6 staining shows strong cytoplasmic immunoreactivity in CCA cells, while normal bile duct epithelium exhibits minimal staining. In contrast, PPARG shows nuclear expression in normal bile duct epithelium but is markedly reduced or absent in CCA tissue. This indicates shifts in subcellular localization: cytoplasmic accumulation of pro-inflammatory mediators (e.g., IL6, TNF) and loss of nuclear localization of metabolic regulators (e.g., PPARG) during CCA progression. (B) The metaflammation gene expression signature as an independent prognostic biomarker in CCA. In the TCGA-CHOL cohort (n = 45), high-risk patients have significantly poorer overall survival than low-risk patients (HR = 2.8, 95% CI: 1.8–4.3, p < 0.001). Key findings include strong prognostic stratification shown by Kaplan–Meier analysis (p < 0.001), with markedly reduced 3-year survival in the high-risk group; independent predictive value confirmed by multivariate Cox regression (p < 0.001); and a significant negative correlation between the metaflammation score and survival time (Spearman ρ = −0.458, p < 0.0016). Additionally, time-dependent ROC analysis indicates stable, moderate predictive accuracy (AUC ~0.65–0.70) for survival up to 36 months.
Figure 7. (A) Integrative Model of Module Crosstalk. A schematic illustrates the interaction of three core functional modules, metabolic (score: 6.8), inflammatory (score: 8.4), and growth-promoting (score: 5.2), and highlights how key regulator genes integrate their signals. This crosstalk is essential to the metaflammation state that contributes to cholangiocarcinoma pathogenesis in the context of T2D and HBV infection. (B) Proposed Model of Convergent Mechanisms Linking T2D and HBV to CCA. This schematic outlines how Type 2 Diabetes (T2D) and Hepatitis B Virus (HBV) infection may converge to induce metaflammation, activating key inflammatory and oncogenic pathways, including NF-κB, JAK-STAT/STAT3, and cytokine networks (TNF, IL-6). This activation can result in oncogenic transformation and accelerated progression of cholangiocarcinoma (CCA). The findings indicate that the co-presence of T2D and HBV increases CCA risk, which might lead to drug resistance, metastasis, and decreased survival rates.
Table 1. Dataset Characteristics and Preprocessing Summary.

| Datasets | Platform | Samples (Case/Control) | Normalization | Batch Correction |
|---|---|---|---|---|
| TCGA-CHOL | RNA-Seq | 36/9 | TMM + DESeq2 VST (for visualization) | ComBat-seq (on raw counts) |
| GSE107943 | Microarray | 104/59 | RMA | ComBat (post-RMA) |
| GSE23343 | Microarray | 10/10 | RMA | None required |
| GSE58208 | Microarray | 62/40 | RMA | ComBat (post-RMA) |
| Total for Core Analysis | | 212/118 | | |

| Contextual Dataset | Platform | Samples | Normalization | Batch Correction |
|---|---|---|---|---|
| GSE89632 | RNA-Seq | Variable (by analysis) | TMM + DESeq2 VST (for visualization) | ComBat-seq (on raw counts) |

TCGA-CHOL (36 CCA tumors + 9 normal bile duct tissues, n = 45), GSE107943 (CCA, n = 163), GSE23343 (Type 2 diabetes, n = 20), and GSE58208 (HBV infection, n = 102). The GSE89632 dataset, containing liver tissues across steatosis grades, was used for exploratory contextual analysis only and was not included in the core integrative analysis.
Table 2. Characteristics of Integrated Datasets for Core Analysis.

| Characteristic | TCGA-CHOL | GSE107943 | GSE23343 | GSE58208 | Total |
|---|---|---|---|---|---|
| Samples (n) | 45 | 163 | 20 | 102 | 330 |
| Platform | RNA-Seq | Microarray | Microarray | Microarray | Mixed |
| Tissue | Bile duct | Bile duct | Liver | Liver | Mixed |
| Conditions | CCA/Normal | CCA/Normal | T2D/Control | HBV+/HBV- | 4 |
| Genes | 19,645 | 20,329 | 12,625 | 23,042 | 15,892 |

Number of genes after intersection across all platforms.

The total sample count (n = 330) represents the sum of the four core datasets used in the primary integrative analysis: TCGA-CHOL (n = 45), GSE107943 (n = 163), GSE23343 (n = 20), and GSE58208 (n = 102). Microarray data showed consistent intensity distributions, with median present calls exceeding 85%. Principal component analysis (PCA) revealed clear separation of samples by primary biological condition, with minimal residual batch effects after appropriate correction. Note: The GSE89632 dataset (n = variable) was used only for contextual metabolic analysis and is not included in the core analysis total.
Table 3. Characteristics of the Core Metaflammation Gene Set.

| Category | Number | Percentage | Representative Genes |
|---|---|---|---|
| Total Genes | 156 | 100% | |
| Upregulated | 92 | 59% | IL6, TNF, STAT3, AKT1 |
| Downregulated | 64 | 41% | PPARG, ADIPOQ, IRS1 |
| Metabolic | 58 | 37% | PPARG, SREBF1, FASN |
| Inflammatory | 72 | 46% | IL6, TNF, IL1B, CXCL8 |
| Signaling | 42 | 27% | AKT1, STAT3, NFKB1 |
| Cancer-related | 38 | 24% | MYC, VEGFA, EGFR |

Note: Percentages total >100% as genes can belong to multiple categories. Functional categorization of the 156 core genes revealed a predominance of genes involved in inflammatory/cytotoxic response (75 genes) and regulatory processes (30 genes), with substantial contributions from transcription factors (28 genes) and metabolic functions (19 genes). This pattern underscores the central interplay between inflammation and metabolism that defines the metaflammation phenotype.
Table 4. Top Enriched Pathways for the Core Metaflammation Gene Set.

| Pathway | Gene Count | p-Value | FDR | Enrichment Ratio | Key Genes |
|---|---|---|---|---|---|
| PPAR signaling | 12 | 2.1 × 10⁻¹⁰ | 3.2 × 10⁻⁸ | 8.4 | PPARG, SREBF1, FABP4, CD36, CPT1A, PLIN2 |
| Cytokine-cytokine receptor interaction | 18 | 3.4 × 10⁻⁹ | 2.1 × 10⁻⁶ | 6.2 | IL6, TNF, CXCL8, IL1B, CCL2, CCR5 |
| Metabolic pathways | 24 | 7.8 × 10⁻⁸ | 5.4 × 10⁻⁵ | 4.1 | Multiple enzymes (HK2, PFKFB3, ACLY, etc.) |
| PI3K-Akt signaling | 14 | 1.8 × 10⁻⁷ | 1.2 × 10⁻⁴ | 5.8 | AKT1, mTOR, PIK3CA, IRS1, ITGB1 |
| TNF signaling | 8 | 5.6 × 10⁻⁷ | 3.8 × 10⁻⁴ | 7.2 | TNF, NFKB1, JUN, MAPK8, CASP8 |

Pathway enrichment analysis for the core metaflammation gene set. Significantly enriched pathways (FDR < 0.001) are shown, ranked by p-value. The enrichment ratio is the proportion of input genes that fall in a pathway divided by the proportion of all pathway-annotated genes in the genome that fall in that pathway. A subset of key genes is listed for each pathway. TNF: Tumor Necrosis Factor; FDR: False Discovery Rate; PPARG: Peroxisome Proliferator Activated Receptor Gamma.
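The enrichment ratio defined in the Table 4 footnote can be expressed directly as a ratio of two proportions. This is a minimal sketch of that definition; the function name and the pathway and background sizes used in the example call are hypothetical, since the table does not report them.

```python
def enrichment_ratio(hits, input_size, pathway_size, annotated_size):
    """Proportion of input genes in the pathway divided by the proportion of
    all pathway-annotated genes that belong to the pathway."""
    input_fraction = hits / input_size
    background_fraction = pathway_size / annotated_size
    return input_fraction / background_fraction


# 12 hits out of 156 input genes are the figures reported for PPAR signaling;
# the pathway size (75) and annotated background (8000) are assumptions.
ratio = enrichment_ratio(hits=12, input_size=156,
                         pathway_size=75, annotated_size=8000)
```

A ratio above 1 means the pathway is over-represented in the input list relative to the background; the p-value and FDR columns come from a separate test and are not derived from this ratio alone.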
Table 6. Top network hubs identified by integrated centrality analysis.

| Gene | Degree Centrality | Betweenness Centrality |
|---|---|---|
| IL6 | 28 | 0.12 |
| TNF | 26 | 0.11 |
| AKT1 | 24 | 0.10 |
| STAT3 | 22 | 0.09 |
| NFKB1 | 20 | 0.08 |
| PPARG | 18 | 0.07 |
| JUN | 17 | 0.06 |
| MYC | 16 | 0.05 |
| FOS | 15 | 0.04 |
| VEGFA | 14 | 0.03 |

IL6: Interleukin 6; TNF: Tumor Necrosis Factor; AKT1: AKT Serine/Threonine Kinase 1; STAT3: Signal Transducer and Activator of Transcription 3; NFKB1: Nuclear Factor Kappa B Subunit 1; PPARG: Peroxisome Proliferator Activated Receptor Gamma; MYC: MYC Proto-Oncogene, bHLH Transcription Factor; VEGFA: Vascular Endothelial Growth Factor A; FOS: Fos Proto-Oncogene, AP-1 Transcription Factor Subunit; JUN: Jun Proto-Oncogene, AP-1 Transcription Factor Subunit.
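The two centrality measures in Table 6 can be illustrated on a toy network. The sketch below computes raw degree and unnormalized betweenness (Brandes' algorithm for an unweighted, undirected graph) using only the standard library. The graph and node names are hypothetical, and the table's betweenness values appear to be normalized, so the numbers here are not comparable to it.

```python
from collections import deque


def degree(graph):
    """Number of neighbours per node (graph: dict of node -> set of nodes)."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def betweenness(graph):
    """Unnormalized betweenness via Brandes' algorithm (unweighted, undirected)."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                                # nodes in BFS visiting order
        preds = {v: [] for v in graph}            # shortest-path predecessors
        sigma = {v: 0 for v in graph}             # number of shortest paths
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:                   # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:        # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):                 # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical star-shaped network: one hub connected to three leaves.
toy = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
deg = degree(toy)
btw = betweenness(toy)
```

In the star graph every shortest path between two leaves passes through the hub, so the hub collects all of the betweenness while the leaves collect none.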
Table 7. Survival Analysis of Hub Genes in the TCGA-CHOL Cohort.

| Gene | HR (95% CI) | p-Value | Median OS (High) | Median OS (Low) |
|---|---|---|---|---|
| IL6 | 2.1 (1.4–3.2) | 0.001 | 18.4 months | 32.7 months |
| TNF | 1.8 (1.2–2.7) | 0.004 | 20.1 months | 30.5 months |
| PPARG | 0.5 (0.3–0.8) | 0.002 | 31.9 months | 19.8 months |
| AKT1 | 1.6 (1.1–2.3) | 0.02 | 22.3 months | 29.6 months |
| STAT3 | 1.5 (1.0–2.2) | 0.04 | 23.8 months | 28.4 months |

IL6: Interleukin 6; TNF: Tumor Necrosis Factor; AKT1: AKT Serine/Threonine Kinase 1; STAT3: Signal Transducer and Activator of Transcription 3; PPARG: Peroxisome Proliferator Activated Receptor Gamma. HR: Hazard Ratio for High expression group versus Low expression group.
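Median overall survival values such as those in Table 7 are typically read off a Kaplan–Meier curve as the first time at which estimated survival falls to 50% or below. The following standard-library sketch shows that calculation on a small hypothetical cohort; the function names and the follow-up data are illustrative assumptions and do not come from TCGA-CHOL.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving      # deaths and censored patients leave the risk set
    return curve


def median_survival(curve):
    """First time at which survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up data (months); 0 marks a censored patient.
times = [6, 7, 10, 15, 19]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
median = median_survival(curve)
```

Censored patients contribute to the risk set up to their last follow-up but do not cause a step in the curve, which is why the median can differ from the simple median of observed times.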
Table 8. Multivariate Cox Regression and Bootstrap Validation.

| Variable | HR (Original) | 95% CI (Original) | p-Value | HR (Bootstrap Mean) | 95% Bootstrap CI | % Significant Iterations |
|---|---|---|---|---|---|---|
| IL6 (high) | 2.41 | 1.52–3.82 | 0.001 | 2.38 | 1.48–3.91 | 98.2% |
| TNF (high) | 2.12 | 1.34–3.35 | 0.004 | 2.08 | 1.29–3.48 | 94.7% |
| PPARG (low) | 0.48 | 0.28–0.82 | 0.002 | 0.51 | 0.31–0.89 | 96.1% |
| AKT1 (high) | 1.72 | 1.08–2.74 | 0.02 | 1.68 | 0.95–2.98 | 78.3% |
| STAT3 (high) | 1.58 | 0.98–2.55 | 0.04 | 1.54 | 0.89–2.71 | 72.1% |
| Age | 1.01 | 0.98–1.04 | 0.42 | 1.01 | 0.97–1.05 | 32.4% |
| Sex (Male) | 1.12 | 0.71–1.77 | 0.61 | 1.09 | 0.68–1.82 | 28.7% |
| Stage (III/IV) | 1.89 | 1.21–2.95 | 0.005 | 1.91 | 1.18–3.12 | 92.3% |

Results from a multivariable Cox proportional hazards model assessing the impact of clinical factors and biomarker expression levels on survival outcomes. Hazard Ratios (HRs) for biomarkers compare the high-expression group (or the low-expression group for PPARG) to the reference group, adjusted for age, sex, and disease stage. The “Bootstrap Mean” and “95% Bootstrap CI” represent the mean HR and the 95% confidence interval derived from 1000 bootstrap resamples to assess the model’s stability. “% Significant Iterations” indicates the proportion of bootstrap samples in which the variable remained statistically significant (p < 0.05). Variables with high bootstrap significance (e.g., IL6, TNF, PPARG, Stage) demonstrate robust associations with survival, while AKT1 and STAT3 show less stability despite nominal significance in the original model.
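The bootstrap columns in Table 8 (mean estimate, percentile interval, and share of resamples that remain significant) follow a generic resampling pattern that can be sketched without a Cox model. In the example below a crude event-rate ratio with a Wald test on its logarithm is swapped in for the Cox hazard ratio so that the code needs only the standard library; this substitution, the function names, and the simulated cohort are all assumptions for illustration and do not reproduce the authors' analysis.

```python
import math
import random


def rate_ratio(sample):
    """sample: list of (time, event, high) tuples.

    Returns (ratio, significant) where ratio is the event rate in the high
    group divided by the rate in the low group, and significant is a two-sided
    Wald test on log(ratio) at the 0.05 level. Returns None if either group
    has no events (the ratio is then undefined).
    """
    d_high = sum(e for _, e, h in sample if h)
    d_low = sum(e for _, e, h in sample if not h)
    if d_high == 0 or d_low == 0:
        return None
    t_high = sum(t for t, _, h in sample if h)
    t_low = sum(t for t, _, h in sample if not h)
    ratio = (d_high / t_high) / (d_low / t_low)
    z = math.log(ratio) / math.sqrt(1 / d_high + 1 / d_low)
    return ratio, abs(z) > 1.96


def bootstrap(sample, n_iter=1000, seed=0):
    """Resample patients with replacement and summarize the estimates."""
    rng = random.Random(seed)
    ratios, n_significant = [], 0
    for _ in range(n_iter):
        resample = [rng.choice(sample) for _ in sample]
        result = rate_ratio(resample)
        if result is None:
            continue                        # skip degenerate resamples
        ratios.append(result[0])
        n_significant += result[1]
    ratios.sort()
    lower = ratios[int(0.025 * len(ratios))]
    upper = ratios[int(0.975 * len(ratios)) - 1]
    return {
        "mean": sum(ratios) / len(ratios),
        "ci": (lower, upper),               # percentile interval
        "pct_significant": 100 * n_significant / len(ratios),
    }


def simulate(n, rate, high, rng, follow_up=36.0):
    """Hypothetical cohort: exponential survival, censored at follow_up months."""
    rows = []
    for _ in range(n):
        t = rng.expovariate(rate)
        rows.append((min(t, follow_up), int(t <= follow_up), high))
    return rows


rng = random.Random(1)
cohort = simulate(40, 0.08, True, rng) + simulate(40, 0.03, False, rng)
summary = bootstrap(cohort)
```

With a real Cox model the same loop would refit the model on each resample and record the coefficient and its p-value; a variable that is significant in only about three quarters of resamples, as reported for AKT1 and STAT3, is usually read as a less stable association.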
Table 9. Summary of Protein Expression Validation.

| Gene | Normal Expression (Score) | CCA Expression (Score) | Change | IHC Score |
|---|---|---|---|---|
| IL6 | Low (1.2) | High (3.4) | ↑ 2.2 | 8.7 |
| TNF | Low (1.5) | Medium (2.8) | ↑ 1.3 | 7.9 |
| PPARG | Medium (2.8) | Low (1.6) | ↓ 1.2 | 8.2 |
| AKT1 | Low (1.8) | High (3.6) | ↑ 1.8 | 9.1 |
| STAT3 | Medium (2.4) | High (3.2) | ↑ 0.8 | 7.5 |

Scores represent mean protein intensity on a 0–3 scale (0 = none, 1 = weak, 2 = moderate, 3 = strong). Normal = adjacent non-tumor tissue; CCA = cholangiocarcinoma tissue. Change = CCA − Normal intensity. IHC Score = composite score for CCA tissue only, calculated as (Intensity × % positive cells)/10 (range 0–10). This score provides a semi-quantitative measure of overall protein abundance by integrating staining intensity and the proportion of positive tumor cells. IL6: Interleukin 6; TNF: Tumor Necrosis Factor; AKT1: AKT Serine/Threonine Kinase 1; STAT3: Signal Transducer and Activator of Transcription 3; PPARG: Peroxisome Proliferator Activated Receptor Gamma.
Share and Cite

MDPI and ACS Style

Md Rasadul, H.; Ma, S.; Ge, Z.; Md Zahidur, R.; Kang, P.; You, J.; Li, J.; Duan, C.; Fahim, S.A.Z.M.; Somrat Akbor, M.; et al. Systems-Level Transcriptomic Integration Reveals a Core Metaflammatory Network Linking Type 2 Diabetes and HBV Infection to Cholangiocarcinoma Progression. Cancers 2026, 18, 923. https://doi.org/10.3390/cancers18060923

APA Style

Md Rasadul, H., Ma, S., Ge, Z., Md Zahidur, R., Kang, P., You, J., Li, J., Duan, C., Fahim, S. A. Z. M., Somrat Akbor, M., Zhao, X., & Cui, Y. (2026). Systems-Level Transcriptomic Integration Reveals a Core Metaflammatory Network Linking Type 2 Diabetes and HBV Infection to Cholangiocarcinoma Progression. Cancers, 18(6), 923. https://doi.org/10.3390/cancers18060923
