1. Introduction
Bladder cancer is a heterogeneous disease with high prevalence and recurrence rates [
1]. According to an estimate by the American Cancer Society, in the United States, approximately 84,000 new bladder cancer cases will be diagnosed and 17,000 deaths will occur in the year 2021 (cancer facts and figure, American Cancer Society). The diagnosis and treatment of bladder cancer is based on a sequence of clinical stage with two distinct entities that include a low-risk, non-muscle invasive variant (NMIBC) and a highly aggressive muscle invasive (MIBC) subtype [
2]. At initial diagnosis, about 70–80% of NMIBC tumors are confined to the epithelium or sub-epithelial connective tissue [
3]. These tumors are frequently managed by surgery and/or immunotherapy based on their risk of progression assessment. Meanwhile, the best option for patients with loco-regionally advanced (intermediate to high-risk) disease remains standard chemotherapy. However, these treatments often become refractory over time with a high recurrence rate of 50–70% over a 5-year period and subsequent progression to MIBC [
4,
5]. To date, there has been no reliable means of prediction of disease progression other than clinical decision making models, risk estimates tables and nomograms [
6]. The utility of these existing prediction models are less accurate as these classifiers are unable to provide information on the underlying biological properties that likely drive tumor behavior. Therefore, exploring molecular pathways and gene regulatory networks that provide information about disease progression may provide a new approach to predict prognosis and improvements in the management of bladder cancer.
An integral part of understanding the molecular basis of bladder cancer progression includes analysis of differentially expressed genes in NMIBC (low-risk) versus MIBC (high-risk) phenotype. At the molecular level, alterations in genes result in unrestricted cell proliferation, reduced cell death, invasion, and metastasis [
7]. Specific alterations in gene regulatory network could affect the biologic behavior of the tumor that may influence patient’s survival. In fact, patients with aggressive disease require careful supervision, particularly because the treatment and pathogenesis of MIBC and NMIBC differ and conventional histopathological evaluation is inadequate to precisely predict the behavior of high-risk NMIBC [
8]. Thus, there is a clear need of predictive biomarkers that can differentiate progressive from non-progressive NMIBC. Analysis of The Cancer Genome Atlas (TCGA) identified a distinct gene expression profile between NMIBC and MIBC [
9,
10]; however, limited prognostic values of these genes established the requirement for new molecular markers of bladder cancer outcome.
To examine the prognostic significance of genes and designate them as potential biomarkers for bladder cancer, we conducted sequential analyses on high-throughput sequencing data obtained from three datasets (GSE154261, GSE57813, and GSE37317) of bladder cancer. In the first step, differentially expressed genes (DEGs) were identified that are common among the three databases and analyzed using ingenuity pathway analysis (IPA). In the second step, we used Metascape to explore the major pathways of DEG enrichment in bladder cancer [
11]. In the third step, the protein interaction network between DEGs was generated using the STRING online tool and demonstrated employing the Cytoscape software. Next, the Gene Expression Pro-filing Interactive Analysis 2 (GEPIA2) [
12], and human protein atlas datasets were utilized to explore DEGs and their association with bladder cancer prognosis. Lastly, we discover single gene and its key biological role using the UALCAN, cBioPortal, and STRING online tools.
In the analysis, we identified 1516 DEGs commonly associated with bladder cancer progression. The DEGs aligned with functional interactome analysis comprising of 40 independent knowledge database revealed overrepresentation of cell cycle and cell-cycle check point associated genes in bladder cancer. We further validated the subset of selected core genes using independent datasets from TCGA and GEO databases and explored their potential as prognostic biomarkers in clinical use.
4. Discussion
Predicting the risk of bladder cancer progression is often challenging because of histologic heterogeneity [
13] and clinical characteristics of tumors. Therefore, there is a pressing need for identification of precise and novel biomarkers with prognostic significance. Our comprehensive analysis of databases identified 1516 differentially expressed genes (DEGs) between NMIBC and MIBC. The IPA analysis of individual dataset showed diverse overrepresented canonical pathways. In GSE154261 dataset, a total of 6078 DEGs were identified with significant enrichment in canonical pathways that include molecular mechanisms of cancer, mitochondrial dysfunction, protein ubiquitination pathway, oxidative phosphorylation, sirtuin signaling pathway, EIF2 signaling, senescence pathway, mTOR signaling, HER-2 signaling, estrogen receptor signaling, IGF-1 signaling, and hepatic fibrosis pathways. The GSE57813 dataset showed 7589 DEGs exhibiting RAR activation, molecular mechanism of cancer, aryl hydrocarbon receptor signaling, tight junction signaling and death receptor signaling, osteoarthritic pathway, PXR pathway, D-myo-inositol tetrakisphosphate biosynthesis, calcium transport, 3-phosphoinoside degradation and biosynthesis as overrepresented pathways. In the GSE37317 dataset, 7975 DEGs were noted and the pathways significantly overrepresented were molecular mechanism of cancer, hepatic fibrosis, glucocorticoid receptor signaling, and cardiac hypertrophy signaling, Huntington disease signaling, senescence pathway, hereditary breast cancer signaling, colorectal cancer metastasis, NRF2-mediated oxidative stress response, estrogen receptor signaling, axonal guidance signaling, and aryl hydrocarbon receptor signaling. Furthermore, the GO annotation and KEGG pathway enrichment analysis revealed pathways enriched in cell cycle/cell cycle checkpoint, signaling by receptor tyrosine kinase, nervous system development, cellular response to external stimuli and regulation of transferase activity, and autophagy. Previous studied showed that dysregulation of cell cycle events were closely associated with cell growth, anabolism, and cell proliferation, which comprise major hallmarks of cancer [
14]. Therefore, investigating these signaling networks and disease-specific pathways are important to understand bladder cancer progression.
Several previous studies involved only one dataset for analysis. Nonetheless, a recent study by Xu et al. [
15] analyzed three databases, while the present study collected more public dataset for analysis. We focused on transcript levels of DEGs in low-risk NMIBC and high-risk MIBC tissues. To explore the difference between the non-invasive versus invasive group, we identified the DEGs and analyzed by GO and KEGG pathway enrichment analysis. The enrichment analysis was constructed to demonstrate the nexus among key signaling pathways. It appears that the risk scores were predominantly related to the regulation of cell cycle. Interestingly the cell-cycle signaling pathway showed its close liaison with nervous system development, and positive regulation of transferase activity. Furthermore, the PPI network identified, mitotic spindle checkpoint, involvement of APC-CDC20, FOXM1 transcription factor network, phosphorylation of Emi-1, endogenous TLR signaling, CFCF, ARF, and c-MYC pathways, hypoxia and oxygen homeostasis as some key differentially expressed pathways. Besides, our analysis highlighted several attributed genes and their associated signaling network involved in bladder cancer prognosis, including the family of ubiquitin-conjugating enzymes (
UBE2C,
UBE2D1, and
UBE2E1), regulatory protein involved in mitosis (
CCNB1), cell division cycle protein ubiquitin ligase (
CDC16,
CDC23, and
CDC26) that are components of the multiprotein APC complex. Other proteins involved in cell cycle regulation are the subunits of the anaphase-promoting complex (
ANAPC1,
ANAPC2,
ANAPC4,
ANAPC5,
ANAPC7,
ANAPC10, and
ANAPC11), cyclin-dependent kinases (
CDK1 and
CDK27), genes encoding kinases involved in the spindle checkpoint function (
BUB1B and
BUB3), mitotic spindle assembly checkpoint (
MAD2L1) and S100 calcium and RAGE receptor binding protein (
S100A8). The above data tempted to explore the gene regulatory network, which may give insight to understand its role in disease progression. The results of gene network analysis revealed the interaction of genes with anaphase-promoting complex/cyclosome (APC/C), which function together with one of homologous coactivators, CDC20. APC/C is an evolutionary conserved multi-subunit E3 ubiquitin ligase consisting of CDC16, CDC20, CDC23, CDC26, and CDC27 subunits that function to regulate progression of cells through the mitotic phase of the cell cycle. A defect in APC/C gene extends mitosis, inhibits drug-induced segregation errors in chromosomes, and diminishes naturally occurring lagging chromosomes in malignant cells. Indeed, somatic mutation in one the subunit CDC27 reduces chromosome segregation errors [
16]. Additionally, it was demonstrated that the transcriptional regulation of CDC20 is mediated by FOXM1/transcription factor network, which is in accordance with our gene regulatory network data. BUB1B/MAD3L and BUB3 were also identified as molecules involved in the regulating gene network [
17].
BUB1B gene encodes a kinase involved in spindle checkpoint function and plays a role in the inhibition APC/C by blocking the binding of
CDC20 to APC/C, delaying the onset of anaphase and ensuring proper chromosome segregation [
18,
19]. Regarding
CDKN2A and its interaction with
MAGEA6 very little is known so far, moreover previous studies identified higher expression of MAGEA6 in early stage bladder cancer [
20]. Cancer patients with high MAGEA6 expression exhibit poor prognosis. Conclusively, high expression of MAGEA6 in primary lesions could represent a promising prognostic biomarker [
20,
21].
The PPI network analysis of three databases identified the following subsets of DEGs, including
CDC20,
CDKN2A,
CTSV,
FOXM1,
KRT23,
MAGEA6, and
S100A9, which were significantly upregulated in bladder cancer. Moreover, collagen type I and 3 alpha chain (
COL1A2 and
COL3A1), decorin (
DCN), fibronectin (
FN1), regulator of G protein signaling (
RGS2), and secreted phosphoprotein (
SPP1) were significantly downregulated in NMIBC specimens, compared to MIBC tissues. In addition, some of the genes were in line with previously published studies [
22,
23,
24]. The cell division cycle protein 20 (CDC20) homolog is a key regulator of cell division that is encoded by the CDC20 gene in humans that combines with the APC/C, and subsequently controls securin degradation. Loss of securin promotes the degradation of cohesion, interferes in sister chromatid segregation, and transition of cells from the G2/M phase to G1 phase of the cell cycle [
25]. Abnormal expression of CDC20 promotes premature anaphase, resulting in mitotic anomalies and aneuploidy, which may promote carcinogenesis. Studies have demonstrated that CDC20 overexpression decreases overall survival and recurrence-free survival time in bladder cancer patients [
22]. The molecular mechanism(s) of its overexpression is still unclear in bladder cancer, and further studies are required to establish it as a candidate biomarker.
CDKN2A, also referenced as cyclin-dependent kinase inhibitor 2A, encodes p16 protein that regulates cell cycle and is ubiquitously expressed in numerous tissues and cell types [
26,
27]. p16 inhibits CDK4 and CDK6 [
28,
29], the cyclin dependent kinases thereby activating the retinoblastoma (RB) family of proteins and its check on E2F transcription factors [
30,
31], which block progression of cells from G1 to S-phase of the cell cycle [
32,
33]. CDKN2A can activate tumor suppressor p53 either by binding and obstructing MDM2 protein to p53 or directly inhibiting RB protein during the G1-phase cell cycle [
34,
35]. Previous studies have demonstrated genomic alterations in CDKN2A in 20 to 60% of bladder carcinomas, which are associated with either RB1 deletions or E2F amplification [
36,
37]. Overexpression of CDKN2A is significantly observed in genomically unstable tumors associated with shorter progression-free survival in bladder cancer patients [
38].
The CTSV gene encodes CTSL protein, which is a member of the peptidase C1 family, is a lysosomal cysteine proteinase that degrade extracellular matrix and basement membrane components such as type IV collagen, laminin, fibronectin, and proteoglycans of the bladder [
24]. CTSL promoter activity and synthesis is driven by tumor secreted cytokines, and its upregulation is reported in a wide range of malignancies, including bladder cancer. High expression of CTSL correlates with metastatic aggressiveness and poor patient prognosis [
39]. Evidence shows that CTSL expression may be linked to cancer grade and stage [
40]. Studies have proposed CTSL as an independent predictor of bladder cancer and invasiveness in patients with a history of urothelial carcinoma [
41,
42].
FOXM1, a member of the forkhead gene family, is a regulator of embryogenesis and numerous developmental processes [
43]. FOXM1 is primarily expressed in proliferating cells or induced by growth factor release. FOXM1 is designated as a proto-oncogene in most cancers [
44], as it exclusively expressed in dividing cells and is involved in angiogenesis, cell migration and epithelial-to–mesenchymal transition. Relative research considers that FOXM1 plays an essential role in the phenotype determination and the development of molecular bladder cancer subtypes [
45,
46]. FOXM1 has been designated as a prognostic marker for bladder cancer and an independent predictor for overall survival and disease-specific survival in muscle-invasive bladder cancer [
46]. The FOXM1 signaling network and its regulators, including FOXO3, PI3K, and AKT, remains putative drug targets in bladder cancer which requires additional work.
Human keratin 23 gene (KRT23) belongs to the family of type I keratins and 60–65% similarity within the alpha-helical rod domain [
47]. Previous studies have reported abnormal expression of KRT23 in numerous tumor tissues, including bladder cancer. KRT23 is localized in the Golgi apparatus in the cytoplasm and is involved in the development and migration of various types of human cancers [
48]. An association has been established between KRT23 and Smad4 in malignant cells, promoting Smad4-dependent upregulation and subsequent migration of cancer cells. However, the mechanism through which KRT23 regulates epithelial-to-mesenchymal transition in malignant cells remains unclear. Precise bio-informatics analysis provides evidence that KRT23 may upsurge epithelial-mesenchymal transition through regulation of TGF-β signaling pathway [
49]. Additional studies are needed, especially in bladder cancer.
MAGEA6 (MAGE family member A6) is a protein coding gene associated with melanoma and osseous dysplasia [
20]. MAGEA6 is overexpressed in a variety of human cancers, including bladder carcinoma [
50]. Studies have shown that MAGEA6-TRIM28 ligase complex can degrade AMPK, a master cellular energy sensor and regulator, through ubiquitination of the alpha catalytic subunit (PRKAA1) resulting in overall reduction of AMPK protein levels in tumors [
51]. Downregulation of AMPK by MAGEA6 led to significant decrease in autophagy and upregulation of mTOR, facilitating tumor growth. High MAGE6A expression is associated with poor clinical outcomes [
52]. However, studies elucidating the tumorigenic role of MAGE6A in bladder cancer need further exploration.
S100A9 is a member of the low-molecular calcium binding S100 protein family implicated in regulating cell proliferation [
53]. S100A9 is induced by various mitogens including VEGF-A, TGFβ and TNFα, increasing motility and promoting metastatic process. S100A9 supports proliferation and invasion of malignant cells which is dependent on receptor of advanced glycation end products (RAGE) and MAPK signaling cascades [
54]. Studies have investigated the role of S100A9 in tumorigenesis [
55]; however, the underlying molecular mechanism(s) in bladder cancer remains unknown.
We selected seven genes differentially regulated in NMIBC versus MIBC and their expression was analyzed in various bladder cancer cells representing stage-specific disease progression. Among the seven genes analyzed,
CDKN2A and
MAGEA6 showed a progressive increase in the transcript with disease severity, whereas
CTSV,
FOXM1 and
KRT23 gene transcript levels were higher in the carcinoma and metastatic carcinoma cell line. Irrespective of the tumor type,
CDC20 and
CTSV showed higher expression in transitional cell carcinoma. Previous studies have shown abnormal expression of CDC20 in bladder cancer tissue that may serve as a potential diagnostic biomarker for NMIBC [
56]. In addition, CDKN2A, is also designated as a molecular risk factor for tumor progression in NMIBC [
38]. Our data suggest that overexpression of CDKN2A and MAGEA6 are associated with Ta/T1 stage NMIBC. The mRNA expression of seven genes were further confirmed from TCGA database demonstrating that the expression of these genes were markedly higher in bladder cancer compared with normal tissues. Indeed, the expression of CDC20, FOXM1, and MAGEA6 showed higher levels in tumors compared with normal tissues. Further, it will be interesting to validate the above set of genes in stage-specific manner in bladder cancer.
In order to validate the prognostic significance of the seven selected genes in low-risk NMIBC versus high-risk MIBC tumors, overall survival (OS) and progression free survival (PFS) were estimated using logistic and Cox regression. The results suggested that high expressions of these genes were associated with the malignant progression of bladder cancer. Furthermore, the mRNA expression levels of the subset genes were all significantly higher in MIBC tissues compared with NMIBC, which demonstrated that these seven genes played important roles in the progression of bladder cancer. The effect of expression of these genes were analyzed on survival and tumor aggressiveness in patients with bladder cancer. The univariate analysis of gene expression showed that CDC20 was predictive of OS (p = 0.014). Indeed, univariate analysis of gene expression on PFS using the Cox model showed the CDC20 was predictive of PFS (p = 0.021). Univariate analysis of gene expression data on tumor aggressiveness (NMIBC versus MIBC) using logistic regression showed genes, including CDKN2A, CTSV, FOXM1, and KRT23, were predictive of tumor aggressiveness.
The study has the following limitations and strengths. Firstly, there are a lack of in vivo experiments to validate the mechanisms of the genes in driving bladder cancer. Secondly, the sample size for validation analysis was small; therefore, some potential assays could not be performed. Lastly, NMIBC exhibit a heterogeneous subtype, thus multivariate analysis is needed to control the confounding effects, which requires substantial sample size. Therefore, further research with large sample size is warranted, which remains top priority on our research agenda. The strength of the study is the identification of a subset of seven genes, which are prognostic for bladder cancer. Furthermore, a combination of three gene set has 95.5% sensitivity and 100% specificity to differentiate between high-risk and low-risk disease subtype.