RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer

Alwadi, Diaaidden; Felty, Quentin; Doke, Mayur; Roy, Deodutta; Yoo, Changwon; Deoraj, Alok

doi:10.3390/ijms26125463

Open AccessArticle

RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer

by

Diaaidden Alwadi

¹,

Quentin Felty

¹

,

Mayur Doke

²

,

Deodutta Roy

¹

,

Changwon Yoo

³ and

Alok Deoraj

^1,*

¹

Department of Environmental Health Sciences, Florida International University, Miami, FL 33199, USA

²

Diabetes Research Institute, University of Miami, Miami, FL 33136, USA

³

Biostatistics Department, Florida International University, Miami, FL 33199, USA

^*

Author to whom correspondence should be addressed.

Int. J. Mol. Sci. 2025, 26(12), 5463; https://doi.org/10.3390/ijms26125463

Submission received: 9 May 2025 / Revised: 31 May 2025 / Accepted: 2 June 2025 / Published: 6 June 2025

(This article belongs to the Special Issue Environmental Epigenome and Endocrine Disrupting Chemicals)

Download

Browse Figures

Versions Notes

Abstract

This study analyzes publicly available RNA-seq data to comprehensively include the complex heterogeneity of prostate cancer (PCa) etiology. It combines prostate and prostate cancer (PCa) cell lines, representing primary PCa cells, Gleason scores, ages, and PCa of different racial origins. Additionally, some cell lines were exposed to endocrine-disrupting chemicals (EDCs). The research aims to identify hub genes and transcription factors (TFs) of the prostate carcinogenesis pathway as molecular targets for clinical investigations to elucidate EDC-induced aggressiveness and to develop potential biomarkers for their exposure risk assessments. PCa cells rely on androgen receptor (AR)-mediated signaling to survive, develop, and function. Fifteen various RNA-seq datasets were normalized for distribution, and the significance (p-value < 0.05) threshold of differentially expressed genes (DEGs) was set based on |log2FC| ≥ 2 change. Through integrated bioinformatics, we applied cBioPortal, UCSC-Xena, TIMER2.0, and TRRUST platforms, among others, to associate hub genes and their TFs based on their biologically meaningful roles in aggressive prostate carcinogenesis. Among all RNA-Seq datasets, we found 75 overlapping DEGs, with BUB1B (32%) and CCNB1 (29%) genes exhibiting the highest degree of mutation, amplification, and deletion. EDC-associated CCNB1, BUB1B, and CCNA2 in PCa cells exposed to EDCs were consistently shown to be associated with high Gleason scores (≥4 + 3) and in the >60 age group of patients. Selected TFs (E2F4, MYC, and YBX1) were also significantly associated with DEGs (NCAPG, MKI67, CCNA2, CCNB1, CDK1, CCNB2, AURKA, UBE2C, BUB1B) and influenced the overall survival (p-value < 0.05) of PCa cases. This is one of the first comprehensive studies combining 15 publicly available RNA-seq datasets to demonstrate the association of EDC-associated hub genes and their TFs aligning with the aggressive carcinogenic pathways in the higher age group (>60 years) of patients. The findings highlight the potential of these hub genes as candidates for further studies to develop molecular biomarkers for assessing the EDC-related PCa risk, diagnosing PCa aggressiveness, and identifying therapeutic targets.

Keywords:

androgen receptor; differentially expressed genes; endocrine-disrupting chemicals; environmental health risk assessment; environmental carcinogenesis; prostate cancer; RNA-seq; transcription factors

1. Introduction

Prostate cancer (PCa) is the second leading cause of cancer-related death in men and the most commonly diagnosed cancer in the United States. Consequently, the incidence of PCa cases diagnosed has risen from 3.9% to 8.2% over the last decade [1]. In 2022, the number of estimated new cases in the USA was 268,490 (rated first at 27% among the top 10 cancers), and assessed deaths were numbered at 34,500 (rated second at 11% among the top 10 cancers) for PCa in men [1,2,3]. The normal prostate gland consists of basal and luminal cells enclosed by fibromuscular stroma. During the development of PCa, the ratio of luminal to basal cell percentages is significantly changed, with the luminal cells including more than 99% of the cancer [4]. The growth of PCa is facilitated by the hormones interacting with the androgen receptor (AR), whose dysregulated gene expressions are linked to the development of PCa with a Gleason score ≥ (4 ± 3) [5]. AR consists of four domains: (1) N-terminal, (2) DNA-binding, (3) hinge region, and (4) ligand-binding that binds with androgens, such as Testosterone and Dihydrotestosterone (DHT) [6]. Normal physiological functions, prostate carcinogenesis, and androgen deprivation therapy (ADT) affect AR-mediated signaling, which also plays a significant role in treating PCa [7]. AR signaling (AR activity) can stimulate or suppress the growth of PCa based on the level of androgen, which has resulted in the development of the ADT technique [7,8]. Present PCa examinations, including computed tomography (CT), digital rectal examination, magnetic resonance imaging (MRI), and ultrasound detection, overlook middle- and high-risk lesions in tissue, and these diagnostic methods exhibit limitations in the biochemical detection of PCa recurrence [9,10]. Detection of PSA is also prone to false positives resulting from benign prostatic hyperplasia, and its levels are frequently altered by different triggers (inflammation or sexual activity), which result in overdiagnosis [9,11,12,13]. Distinct protein expressions and genes between malignant and normal prostate tissues have been recorded, which provided resources for diagnostic, prognostic, and biomarker identification [10,14,15,16,17,18,19,20,21,22,23,24,25,26]. Epigenetic modifications, gene mutations, and microRNA expression in carcinogenesis have also been researched for biomarker findings [14,15,16,17,18,19,20,21,22,23,24,25,26]. Many researchers have discovered gene expression profiles that can help elucidate the PCa prognosis by analyzing single or multiple microarray datasets. For example, BUB1, CDK1, and EZH2 [14], EPCAM, TWIST1, CD38 and VEGFA [15], PTEN (Phosphatase and Tensin Homolog) [16], ISG15 and CST2 [17], BUB1, KIF2C, CDC20, and PBK [18], LMNB1, TK1, RACGAP1 and ZWINT [19], CKS2, TK1, and RRM2 [20], FGFR1, FGF13 and CCND1 [21], BZRAP1-AS1 and KRT8 [22], MP2, PPARG and PRKAR2B [23], ABCC4 and SLPI [24], CENPA, KIF20A and CDCA8 [25], and MCM4, CENPI, and KNTC1 [26] genes have been suggested to be potential biomarkers. Recently, there has been a significant increase in public RNA sequencing datasets (RNA-seq), which can be analyzed for gene set enrichment (GSE).

However, most of the publicly available microarray and RNA-seq-based datasets comprise clinically valuable samples or experimental types, often lacking information on the environmental exposure effects of single or multiple factors. The availability of high-throughput gene microarray data and RNA-seq data from various cancer cell types, treatments, and cancer tissue specimens, along with comparative toxicogenomic datasets, provides opportunity to simultaneously fill knowledge gaps in the molecular pathology of cancers due to environmental exposures at both the transcriptional and translational levels [27,28]. The prognosis of PCa can be significantly influenced by EDCs (or xenobiotics) in the environment. EDCs have been linked with poor PCa prognosis [29,30,31,32,33]. EDCs may act as agonists or antagonists for AR; they can directly bind to AR, alter gene expression, interfere with steroidogenesis, or induce epigenetic changes. Our earlier studies demonstrated the association of environmental phenols and parabens with patient-reported PCa diagnoses [34], with 22 environmental chemicals, including 17 EDCs, potentially impacting PCa carcinogenesis and its aggressiveness [35]. EDCs have also been shown to significantly influence DNA mutations, chromosomal abnormalities, and inflammation [5,36].

This study aims to capture the range of heterogeneity present within prostate cancer cells. It includes a comprehensive analysis of the GSE of 15 RNA-seq datasets, which combine prostate and PCa cell lines. The pool represents RNA seq data from primary PCa cells, PCa cells with high Gleason scores that suggest the level of PCa aggressiveness, and PCa cells of different racial origins. These cell lines were either treated with androgen modulators, such as R1881 (an androgen agonist), used to investigate the impact of the TFs, or exposed to xenobiotic EDCs. The pooled GSE enabled the evaluation of molecular signatures associated with EDC exposure and the prognosis of aggressive PCa. By examining comprehensive RNA-seq data and GSE, the objective of the study is to understand the potential roles of EDC-associated hub genes and their TFs in the aggressiveness of prostate cancer (PCa). These genes and their TFs may serve as biomarkers for assessing environmental health risks associated with EDC exposures. The findings may also pave the way for further research to validate their potential as biomarkers to improve the prognosis of aggressive prostate cancer cases.

2. Results

2.1. Discovery of Differentially Expressed Genes (DEGs) in RNA-Seq for PCa

Fifteen expression data series (GSE) RNA-seq datasets (GSE70466, GSE151290, GSE128749, GSE120660, GSE13587, GSE136272, GSE218556, GSE64529, GSE5590, GSE128339, GSE109021, GSE200879, GSE103512, GSE104131, GSE200167) included 41 platforms for PCa that were normalized using DESeq2, and the post-normalized value is demonstrated by all by 41 box plots of all datasets as shown in Figure S1. Our data contained 364 total samples of cell lines: 272 tumors and 92 normal prostate cell samples. The RNA-seq datasets were divided into four groups: cell lines treated with R1881 treatment, cell lines exposed to EDCs (Bisphenol A, DHT, Methyl testosterone, Estrogen, Testosterone propionate, DDE, and triclosan), cell lines with various Gleason scores and cells with MYC (TF) functional analysis, and cell lines with different ages and races. A total number of 10,241 DEGs were obtained from RNA-seq datasets assessed in the analysis based on selected criteria of the cutoff standards of adjusted p-value < 0.05 and [log2FC] > +2, [log2FC] < −2 as illustrated in volcano plots from OriginPro 9.1 shown in Figure S2.

In the next step, we generated a Venn diagram of four groups of overlapping genes from RNA-seq datasets of PCa cell lines with (1) R1881 treatments (418 genes), (2) exposure datasets (116 genes), (3) Gleason scores + MYC (TF) (2235 genes), and (4) age/race (7472 genes). This result is shown in Figure 1. To obtain the most significant overlapping DEGs from all RNA-seq datasets, we generated a Venn diagram showing 75 (0.8%) genes, including 37 upregulated and 38 downregulated genes. These identified 75 genes were considered for further analysis.

2.2. RNA-Seq DEG Enrichment Analysis: GO, KEGG, and TFs

The top five Gene Ontology (GO) terms in biological process (BP), molecular function (MF), and cellular component (CC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) terms in each category were recognized from functional analysis of upregulated and downregulated genes, as shown in Tables S1 and S2. The most significant enrichment function analysis terms for upregulated genes are cell division for BP, cyclin-dependent protein kinase holoenzyme complex for CC, and cyclin-dependent protein serine/threonine kinase regulator activity for MF, as shown in Table S1. Moreover, the most significant enrichment functional analysis terms for downregulated genes are cell division for BP, spindle for CC, and ATP binding for MF, as shown in Table S2. The KEGG pathway functional analysis indicated that these upregulated and downregulated genes, as shown in Tables S1 and S2, were significantly enriched in cancer, prostate cancer, cell cycle, and p53 signaling pathways. TF prediction analysis was performed by database for annotation, visualization, and integrated discovery (DAVID.6.8). In Table 1, we can see the top 10 TFs associated with upregulated and downregulated DEGs. The five most significant TFs, namely MYC, NFY, STAT, CEBP, and YY1, were found to be associated with upregulated genes, while MYC, NFY, NFAT, TATA, and IRF7 were identified as the significantly associated TFs for the downregulated genes. Their significance was determined using a False Discovery Rate (FDR) and p-value < 0.05, as shown in Table 1. MYC (MYC Proto-Oncogene, BHLH Transcription Factor) and NFY (Nuclear transcription factor Y) were the most significantly associated TFs with regulated genes.

2.3. Protein–Protein Interaction (PPI) Networks and Module Selection Analysis

PPI networks are critical in comprehending gene functions. Therefore, we constructed PPI networks for DEGs for 38 upregulated and 37 downregulated genes separately to analyze them using the Retrieval of Interacting Genes/Proteins (STRING) database tool. For upregulated genes, there were 38 nodes (genes), 366 edges, an average node degree of 19.3, and a local clustering coefficient of 0.78, as shown in Figure 2A. For downregulated genes, there were 37 nodes (genes), 282 edges, an average node degree of 15.2, and a local clustering coefficient of 0.88, as shown in Figure 2B. Both networks expressed PPI enrichment at p-value < 1.0⁻¹⁶, which was statistically significant. PPI network modules were generated by Molecular Complex Detection (MCODE). There are four modules of the PPI network identified from RNA-seq. For upregulated genes, Module-1 was connected with a score of 16.3 and contained 25 nodes/genes and 612 edges, and Module-2 was associated with a score of 2.4, including 3 nodes/genes with 3 edges, as shown in Supplementary Figure S3A. ACTA2 and FN1 were identified as clustered and TGFBR2 as seed (highlighted with square shapes) in the clustering analysis in Module-2. In terms of downregulation, Module-1 was connected with a score of 17.6 and contained 23 nodes/genes and 413 edges, and Module-2 was associated with a score of 4.0, including 7 nodes/genes with 18 edges, as shown in Figure S3B. TGFB2, CCL5, CXCL10, ICAM1, IGF1, and IL6R are described as clustered and IL7R as seed (highlighted with square shapes) in the clustering analysis in Module-2. The genes recognized in the Module-2 network, containing three (Figure S3A) and seven (Figure S3B) clustered proteins, were represented as significant genes for the subsequent analysis. The MCODE research for Module-2 of the top clusters containing essential proteins is underlined in the node and responsible for generating the clusters.

2.4. Hub Gene Identification

Hub genes were identified based on the CytoHubba topological measures MCC (maximal clique centrality), DMNC (density of maximum neighborhood component), MNC (maximum neighborhood component), Degree, EPC (edge percolated component), and (MCC ꓵ DMNC ꓵ MNC ꓵ Degree ꓵ EPC), as illustrated in Table 2 and Table 3. The hub genes were screened from the PPI network utilizing the CytoHubba tool’s degree rank technique. The topological data of the top 10 upregulated group of genes (NCAPH, RAD54L, E2F7, MKI67, NDC80, MELK, ESPL1, CCNA2, NCAPG, CCNB1) and top 10 downregulated group of genes (TOP2A, BUB1B, CCNB2, AURKA, UBE2C, CCNB1, KIF15, DLGAP5, CENPE, CDC20) are visualized in networks, as shown in Supplementary Figures S4A and S4B, respectively. To distinguish the identified genes based on CytoHubba topological measures, they were categorized into upregulated and downregulated hub gene groups for further analysis. In the next step, to explore the potential mechanism of gene interaction networks in PCa, we used GeneMANIA to generate gene-related PPI networks. The network of hub genes is displayed in an orange circle, and associated genes are expressed in a blue circle, as shown in Figure 3. GeneMANIA indicated that the 10 upregulated hub genes are linked with 20 genes, which are ranked based on their respective scores (SMC4:0.028, SMC2:0.024, FOXP4:0.023, NCAPD2:0.022, SPC25:0.017, CDK1:0.017, GTSE1: 0.017, E2F8:0.017, HMMR:0.016, ASPM:0.015, CKS2GTSE1:0.015, TOP2A:0.014, CENPE:0.014, CCNF:0.014, DLGAP5:0.014, TROAP:0.014, PLK1:0.013, CDC25C:0.013, BUB1B:0.013, KIF23:0.013) (Figure 3A). Similarly, downregulated hub genes are linked with 20 genes (FZR1:0.025, MAD2L1:0.022, TPX2:0.020, CCNA2:0.017, CCNF: 0.017, CDK1:0.017, CENPF:0.017, BUB1:0.016, KIF23:0.016, NDC80:0.016, GTSE1:0.016, MKI67:0.015, CHFR:0.015, TROAP:0.015, ANAPC11:0.015, CDC25C:0.014, KIF11:0.014, PLK1:0.014, CKS2:0.014, HAAO:0.014) as shown in Figure 3B.

2.5. Genetic Alteration and Expression of Hub Genes

We analyzed the genetic alterations (such as missense mutations, amplifications, deep deletions, high and low mRNA expressions, and no alterations) of each hub gene in predicted networks using the Oncoprint element of the cBio Cancer Genomics Portal (cBioPortal) across 26 PCa studies. This analysis included a total of 10,998 samples. It was found that some genetic alterations affected the upregulated hub genes in a certain percentage of PCa samples. The percentage of affected samples is shown in parentheses next to each upregulated hub gene. The affected upregulated hub genes are CCNB1 (29%), CCNA2 (28%), ESPL1 (26%), NCAPH (24%), RAD54L (22%), MELK (23%), NCAPG (20%), MKI67 (18%), E2F7 (13%), and NDC80 (8%). A visual summary of genetic alterations is presented in Figure 4A. Similarly, some genetic alterations affected the downregulated hub genes in a certain percentage of PCa samples. The percentage of affected samples is shown in parentheses next to each downregulated hub gene. The affected downregulated hub genes are BUB1B (32%), CDC20 (29%), CCNB1 (29%), AURKA (28%), CENPE (27%), DLGAP5 (26%), UBE2C (22%), TOP2A (18%), CCNB2 (15%), and KIF15 (14%). A visual summary of genetic alterations is presented in Figure 4B. Upregulated BUB1B and downregulated CCNB1 hub genes exhibit mutations in 32% and 29% of PCa samples, respectively, indicating their high prevalence. Furthermore, genetic alterations such as high amplification and deletion coincide with these mutations in the PCa samples of the oldest age groups, as depicted in Figure 4. The differential expression of hub genes in PCa and normal prostate tissues was analyzed using the University of California Santa Cruz Xena (UCSC-Xena) database (accessed on 17 November 2022) with data obtained from TCGA (The Cancer Genome Atlas), MSK (Memorial Sloan Kettering Cancer Center), ICGC (International Cancer Genome Consortium), GDC (Genomic Data Commons), and GTEx (Genotype-Tissue Expression), as shown in Figure S5. The gene expression was measured using RNA-seq RSEM, and the norm_count was used for analysis. The data was log2(fpkm-uq + 1) z-transformed, with the Y unit as log2(norm_count + 1), including Y data linear transform by subtracting the mean and dividing std dev (z-score). The analysis confirmed that 20 hub genes between PCa and normal prostate tissues were significantly altered in PCa tissues (p-value < 0.05) as shown in violin plots of upregulated and downregulated genes in Figure S5. Hierarchical clustering in a heatmap of hub genes could distinguish PCa samples from normal samples, as shown in Figure 5. The heatmap of the hub genes in PCa clusters showed significant expression compared to the non-cancer groups by UCSC-Xena (Figure 5A,B). Also, the hub genes were highly expressed as shown by a high Gleason score (Figure 5). The results were adjusted, and the color parameters were modified: max color 100% saturation of log2(norm_count + 1) = 12.6 (red) and min color 100% saturation of log2(norm_count + 1) = 2.89 (blue). After comparing PCa samples with normal tissues, the results indicated that upregulated (CCNB1, CCNA2, ESPL1, NCAPH, RAD54L, MELK, NCAPG) and downregulated (BUB1B, CDC20, CCNB1, AURKA, CENPE, UBE2C, TOP2A, CCNB2) genes were associated with PCa carcinogenesis, high Gleason scores ≥ (4 + 3), and age ≥ 55 years.

2.6. Hub Gene and Immune Cell Infiltration Estimations

The tumor-infiltrating immune cells are shown to be associated with prognosis and response to different functions and compositions in cancer, including in various stages of PCa [10]. To investigate the relationship between hub gene expressions and tumor-infiltrating immune cells in PCa tissues, we utilized the Tumor Immunity Evaluation Resource (TIMER2.0) database (accessed on 27 November 2023) to analyze the association of hub gene expressions and the levels of various immune cell infiltrations. The results indicate a positive association (p < 0.05) between all the upregulated groups of hub genes and the following immune cell populations: CD8+ T cells (RAD54L, E2F7), CD4+ T cells (NCAPH, RAD54L, E2F7, NDC80, MELK, ESPL1, MKI67, CCNA2, NCAPG), B cells (MELK, NDC80), neutrophils (NCAPH, RAD54L, E2F7, NDC80, MELK, ESPL1, MKI67, CCNA2, NCAPG, CCNB1), and macrophages (NCAPH, RAD54L, E2F7, NDC80, ESPL1, CCNA2). These results are illustrated in Figure S6A. The results also indicate that all the downregulated groups of hub genes were positively associated (p-value < 0.05) with the following immune cells: CD4+ T cells (TOP2A, CCNB1, AURKA, UBE2C, KIF15, RAD54L, CENPE, CDC20), neutrophils (TOP2A, CCNB1, CCNB2, AURKA, UBE2C, CCNB1, KIF15, RAD54L, CENPE, CDC20), and macrophages (TOP2A, AURKA, KIF15, CENPE). These findings are depicted in Figure S6B. We discovered that several hub genes were significantly associated with immune cells based on partial Spearman’s correlation and a p-value <0.05. The genes for neutrophils included NCAPH (0.344, p-value = 5.5 × 10⁻¹³), E2F7 (0.391, p-value = 1.18 × 10⁻¹⁶), ESPL1 (0.355, p-value = 2.25 × 10⁻¹²), MKI67 (0.298, p-value = 5.3 × 10⁻¹⁰), CCNA2 (0.277, p-value = 8.93 × 10⁻⁹), TOP2A (0.279, p-value = 6.43 × 10⁻¹⁰), NCAPG (0.1.93, p-value = 7.19 × 10⁻⁵), CCNB1 (0.198, p-value = 4.71 × 10⁻⁵), BUB1B (0.225, p-value = 3.4 × 10⁻⁶), CCNB2 (0.154, p-value = 1.6 × 10⁻³), AURKA (0.201, p-value = 3.79 × 10⁻⁵), KIF15 (0.287, p-value = 2.5 × 10⁻⁹), CENPE (0.376, p-value = 1.99 × 10⁻¹⁵), and CDC20 (0.178, p-value = 2.62 × 10⁻⁴). For CD4+ T cells, the genes were NDC80 (0.315, p-value = 5.29 × 10⁻¹¹), CCNA2 (0.272, p-value = 1.64 × 10⁻⁸), UBE2C (0.132, p-value = 7.15 × 10⁻³), KIF15 (0.258, p-value = 9.62 × 10⁻⁸), and DLGAP5 (0.257, p-value = 1.01 × 10⁻⁷). Lastly, for macrophages, the gene was ESPL1 (0.311, p-value = 9.3 × 10⁻¹¹). The results also showed that NCAPH, MELK, CCNA2, NCAPG, and KIF15 gene expressions were associated with overall higher infiltration of immune cells to PCa. However, they negatively correlated with CD8+ T cells (Figure S6). The results demonstrate the influence of identified hub genes on the immune microenvironment of PCa.

We utilized TIMER2.0 on TCGA, ICGC, GTEx, and GDC to analyze the differential expression of hub genes in relation to prostate cancer progression. To determine the association of these genes with overall survival (OS), we used Kaplan–Meier curves, as shown in Figure S7. For survival analysis, we used a model that included immune cell infiltrations, gene expression, and the TIMER algorithm. The results indicated that all genes of the upregulated and downregulated groups were significantly associated (p < 0.05, Hazard Ratio, Z-score) with immune cell infiltrations (CD8+ T cells, CD4+ T cells, B cells, and macrophages), except for CCNA2, as demonstrated in Figure S7A,B. Specifically, RAD54L, E2F7, MKI67, NDC80, and MELK from the upregulated group and TOP2A, BUB1B, CCNB1, UBE2C, and CCNB1 from the downregulated group were found to be the most highly associated with immune cell infiltrations.

2.7. GlueGO and GluePedia

GlueGO and GluePedia were employed to visualize the enrichment results for hub genes. It was discovered that specific hub genes are associated with numerous pathways, potentially regulating the PCa prognosis. The results show that the genes of the upregulated group (CCNB1, ESPL1, MKI67, NDC8) are associated with regulation of chromosome segregation, negative regulation of sister chromatid segregation, mitotic sister chromatid separation, metaphase/anaphase transition of the mitotic cell cycle, and regulation of mitotic metaphase/anaphase transition, as shown in Figure S8. In addition, the results also indicate that genes of the downregulated group (BUB1B, CDC20, CCNB1, AURKA, CENPE, DLGAP5, UBE2C, and TOP2A) are associated with positive regulation of mitotic nuclear division and ubiquitin–protein ligase activities, which are involved in the regulation of mitotic cell cycle transition, chromosome separation, the establishment of chromosome localization, mitotic sister chromatid segregation, anaphase-promoting complex, metaphase/anaphase transition of the mitotic cell cycle, regulation of mitotic metaphase/anaphase transition, and negative regulation of ubiquitin–protein ligase activity involved in the mitotic cell cycle, as shown in Figure S8.

2.8. Hub Gene Redundancy and Dependency Analysis

Based on the Cancer DepMap results, eight genes are considered essential in both CRISPR/Cas9 and RNAi, and four genes are unique and essential only in CRISPR/Cas9. Four of these genes (NDC80, CCNA2, NCAPG, ESPL1) from the upregulated group are common to both CRISPR/Cas9 and RNAi and can be potential diagnostic biomarkers and therapeutic targets. Meanwhile, four other genes (CCNB1, MKI67, NCAPG, E2F7) from the downregulated groups are unique to CRISPR/Cas9, which can also be potential diagnostic biomarkers and therapeutic targets. These findings are shown in Figure S9A,B. The predictability analysis has shown that certain genes are highly important for the Core Omics module. The genes that belong to the upregulated group, including NCAPH (3.2%), E2F7 (4.7%), NDC80 (5.9%), and NCAPG (2.1%), as well as the genes from the downregulated group, such as TOP2A (2.8%), BUB1B (3.1%), AURKA (1.0%), UBE2C (2.5%), and DLGAP5 (1.7%), appear to be essential genes. The predictability analysis has identified NDC80 (5.9%) and UBE2C (13.2%) as the genes with the highest relative importance; these findings are shown in Figure S10.

2.9. Enrichr and Hub Gene Enrichment Analysis

Enrichr (v3.1) and ggplot2 R packages (v 3.5.1) were used to highlight the functions and pathways of hub genes. The results showed that the upregulated genes in PCa were involved in the cell cycle, oocyte meiosis, the p53 signaling pathway, the FoxO signaling pathway, mitotic sister chromatid segregation, positive regulation of chromosomal regulation, condensed chromosomes, and cyclin-dependent protein serine/threonine kinase regulator activity (Figure S11A). The downregulated genes were associated with the cell cycle, mitosis, mitotic metaphase/anaphase transition, positive regulation of mitotic nuclear division, and microtubule cytoskeleton microtubule binding (Figure S11B).

2.10. Regulatory TFs and Their Associated Network with Hub Genes

The TRRUST V.2 database was used (accessed on 27 November 2023) to identify the critical regulatory TFs associated with hub genes. Table S3 displays the top ten TFs associated with hub genes of the upregulated group, which are E2F4, MYC, RBL1, E2F3, ARNT, YBX1, IRF1, ESR1, E2F1, SP1, and TP53. In contrast, Table S4 shows the top ten TFs associated with genes of the downregulated group, which are YBX1, MYC, E2F4, MED1, E2F3, E2F1, TP53, OTX2, PTTG1, and IRF1. YBX1, MYC, and E2F4 are the three unique TFs associated with both the upregulated and downregulated groups of genes. The HumanNet functional gene network validated protein-encoding genes constructed by a modified Bayesian integration and revealed that E2F4 is associated with 26 genes, 6 of which were discovered in this study (AR, AURKB, CCNB1, CCND1, MYC, TOPBP1) (Figure 6A). Similarly, MYC is associated with 113 genes; 5 genes (CCNA2, CCNB1, CDC20, CCND2, UBE2C) were discovered in this study, and YBX1 is associated with 33 genes; 6 genes (AR, CCNB1, CDC20, CCND2, E2F1, TOP2A) were discovered in this study, as shown in Figure 6A. Figure 6B displays the PPI network through iRefIndex among E2F4, MYC, and YBX1. The diseases/pathways associated with E2F4, MYC, and YBX1 are malignant neoplasm of the prostate (DOID:10283) and prostate carcinoma (DOID:10286) and show the highest significance (p-values 6.53⁻¹¹ and 5.20⁻¹¹, respectively), as shown in Table S5. The KEGG pathways and diseases connected with the specified E2F4, MYC, and YBX1 (KEGG term: Term ID: p-value) are as follows: (cell cycle: hsa04110: 2.20⁻⁹), (pathways in cancer: hsa05200: 9.05⁻⁹), (FoxO signaling pathway: hsa04068: 3.95⁻⁶), (prostate cancer: hsa05215: 2.14–6), (PI3K-Akt signaling pathway: hsa04151: 5.68⁻⁶), and (p53 signaling pathway: hsa04115: 1.92⁻⁵). This is shown in Table S6.

2.11. Comparison of Phenotypes for Hub Genes and TFs

The results from Phenolyzer demonstrated that the MYC (TF) has the most significant associations with PCa cases, and the hub genes of the upregulated group (RAD54L, E2F7, CCNB1, CCNA2) and the downregulated group of genes (TOP2A, CDC20, UBE2C, BUB1B, CCNB2, AURKA) as shown in Figure S12. SIGnaling Network Open Resource (SIGNOR 3.0) was used to construct a causal network by connecting the hub genes that have been associated with PCa and mutated in patients, as shown in Figure S13. The results show 27 Seed Entities with the most genomic alterations in PCa and included hub genes (AR, UBE2C, TOP2A, UBE2C, CCNA2, CENPE), TFs (MYC, E2F4, YBX1), and signaling pathways (PI3K-AKT, PIP3, P53, FoxO).

3. Discussion

PCa is one of the most common cancers and the principal cause of cancer mortality in men in the USA [1,2,3]. However, until now, confirmed biological/molecular biomarkers, molecular pathways, and signaling pathways underpinning PCa prognosis and development have yet to be completely understood [14,15,16,17,18,19,20,21]. In addition, the complex interacting risk factors for PCa include age, race/ethnicity, family history of PCa, Gleason score, diet, and environmental pollutants such as EDCs mimicking hormones of the endogenous endocrine system [23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39]. There are only a handful of molecular pathological studies that investigate gene–environment interaction to assess the contributions of environmental risk factors, including EDC or xenobiotic exposures, that render PCa cases more aggressive. Our earlier findings have implicated environmental phenols, parabens, and EDCs with a set of hub genes and their TFs in poor PCa prognosis using NHANES and the comparative toxicogenomic databases [34,35]. Therefore, in this study, we expanded the scope to validate the hub genes and their TFs by applying available PCa RNA-seq datasets. RNA-seq datasets offered a real-time snapshot of interacting genes and their TFs with the environmental risk factors, including EDCs, in the aggressive PCa pathways. We included 15 RNA-seq datasets from PCa cell lines representing synthetic androgen R1881 treatment, exposure to EDCs, the role of MYC (TF), high-grade Gleason scores ≥(4 + 3), different ages, and race/ethnicity. Using our approach, we have attempted to capture the heterogeneity of prostate cancer cells comprehensively. This helped us unravel the association of selected hub genes and secondary genes and associated TFs, as well as their alignment in the carcinogenic signaling pathways leading to aggressive PCa prognosis.

We collected gene expressions from RNA-seq dataset profiles from the Expression Omnibus database/National Center for Biotechnology Information (GEO/NCBI) database, (accessed on 27 November 2023) which contained 364 samples: 272 tumors and 92 normal prostate tissue samples. A total number of 10,241 DEGs were obtained; the overlapping genes among the GSE cell lines + R1881, EDC exposure datasets, Gleason scores + MYC (TF), and age/race/ethnicity are 418, 116, 2235, and 7472 genes, respectively. The most significant common genes were numbered at 75 (37 upregulated and 38 downregulated), which were determined as hub genes by CytoHubba topological measures for further analysis (Table 1 and Table 2). The GO enrichment analysis of upregulated and downregulated DEGs demonstrated that GO terms BP, MF, and CC were principally involved in cell division, mitotic cell cycle phase transition, DNA repair, mitotic sister chromatid segregation, and cellular response to ionizing radiation [40]. In addition, mitotic failure can potentially be a significant mechanism of immediate DNA damage [41]. These DEGs also included the selected EDC-associated hub genes from our earlier studies [34,35]. In the KEGG pathway analysis, the top five connected pathways aligned with cancer, prostate cancer, the cell cycle, the p53 signaling pathway, and the FoxO signaling pathway. KEGG enrichment analysis reveals that DEGs are involved in the main pathways of cancer, prostate cancer, the cell cycle, the p53 signaling pathway [42], and the FoxO signaling pathway [43]. Consequently, all the above functional analyses and alignment of DEGs in pathways verified that our results are associated with the progression of PCa. The most significant TFs identified for upregulated and downregulated genes are Nuclear transcription factor Y (NFY) and MYC [35]. We demonstrate that MYC overexpression significantly decreases AR transcriptional signaling, which impacts the targeted genes controlled by the AR protein in prostate luminal cells [44,45].

We visualized numerous genomic alteration possibilities of the hub genes across a set of PCa samples utilizing OncoPrint employing a query for alterations. Most of the alterations were identified as amplifications and deep deletions (Figure 4). Recurrent abnormalities in genes and pathways related to PCa cancer were identified. The most expected mutations are in CCNB1 (29%) and BUB1B (32%). This suggests that amplification- and deletion-based mutations were associated with hub gene overexpression in PCa tissues, and modifications were noticed in the main domains of hub genes. Higher expression levels of CCNB1 in PCa tissues may indicate its influence on poor prognosis and its role as a better candidate as a biomarker for aggressive PCa [46]. BUB1B, identified as one of the EDC-associated hub genes, is an essential mitotic checkpoint kinase determined as the top-scoring kinase by RNA transcription. BUB1B is a protein that plays a crucial role in cell division, specifically ensuring accurate chromosome segregation during mitosis. It is an element of the mitotic spindle checkpoint. This mechanism prevents cells from dividing when there are errors or abnormalities in the alignment of chromosomes on the spindle fibers [47,48]. Chromosomal instability, caused by mutations or dysregulation by EDC exposure to genes like BUB1B, can contribute to the progression or poor prognosis of PCa.

It is understood that cancers are formed not only by cancer cells but also by stroma and immune cells [49]. Immune cells play a crucial role in the development and progression of PCa. PCa can trigger different immune responses in the body, and the interactions between immune and cancer cells are complex. If detected, immune cells such as T cells can identify and attack cancer cells. The immune cells can infiltrate the PCa. Their presence within the cancer microenvironment can have both positive and negative outcomes. They can help prevent cancer growth by identifying and attacking cancer cells. On the other hand, cancers can create mechanisms to evade immune detection, leading to an insufficient immune response [18]. Lately, lymphocytic immune cell infiltration in PCa tissue has been presented as an immune mechanism of PCa progression. Therefore, targeted immune cell movements can be a possible early diagnostic opportunity for PCa. Our study indicates a significant positive prognostic influence of upregulated (NCAPH, RAD54L, E2F7, NDC80, MELK, ESPL1, MKI67, CCNA2, NCAPG) and downregulated (TOP2A, CCNB1, CCNB2, AURKA, UBE2C, CCNB1, KIF15, RAD54L, CENPE, CDC20) genes associated with the majority of the tumor-infiltrating immune cells, also indicating the above genes as potential biomarkers which are primarily expressed in PCa cells. The current study revealed that an increase in neutrophils, macrophages, and T cell populations was also associated with RAD54L, E2F7, MKI67, NDC80, MELK, TOP2A, BUB1B, CCNB1, UBE2C, and CCNB1 in PCa cell lines. These increased immune cells connected with the above genes are speculated to be involved in the progression of PCa and the OS of the patients (Figure S7).

The above results are compatible with other research findings [10,49]. GlueGO and GluePedia enrichment analysis results showed that CCNB1, ESPL1, MKI67, NDC8, BUB1B, CDC20, CCNB1, AURKA, CENPE, DLGAP5, UBE2C, and TOP2A genes were associated with cell cycle and apoptosis, activation of the p53 signaling pathway, tumor suppression, and cell cycle arrest [50,51], alongside a mitotic checkpoint serine kinase that has been documented in PCa [52]. Furthermore, some of these immune cells’ infiltration-associated hub genes and TFs have been shown to be EDC-associated [34,35].

The redundancy analysis showed that the genes NDC80, CCNA2, NCAPG, ESPL1, OP2A, BUB1B, AURKA, and CDC20 may be prognostic indicators of PCa. These eight genes are also differentially expressed in PCa cell lines by protein-level validation. The nine genes NCAPH, E2F7, NDC80, NCAPG, TOP2A, BUB1B, AURKA, UBE2C, and DLGAP5 are essential in both CRISPR/Cas9 knockout and the RNAi predictability of relative importance, suggesting that the above genes are unique and have potential to be diagnostic biomarkers for PCa cell lines [53]. The enrichment analysis for hub genes for PCa shows that the upregulated genes were mainly affected in the cell cycle, oocyte meiosis, the p53 signaling pathway, and the FoxO signaling pathway, which are all closely associated with cancer and PCa [40,41,49]. The downregulated hub genes were particularly enriched in the Cell cycle, mitotic cell cycle, regulation of mitotic metaphase/anaphase transition, and positive regulation of mitotic nuclear division, which are reported in different cancers, including PCa [54,55].

The key regulatory TFs identified from the TRRUST V.2 database and associated with both upregulation and downregulation of hub genes are YBX1, MYC, and E2F4. In PCa, MYC is overexpressed in earlier phases of the disease and functions as a critical driver of disease progression and tumorigenesis [54]. MYC overexpression is marked in up to 40% of metastatic PCa patients [55] and is connected with poor survival [56]. E2F4 is enriched in differentiated and nonproliferating cells and recreates essential functions in suppressing proliferation-associated hub genes [57]. The level of E2F4 protein was increased significantly in the nuclei of PCa cells [58] and acted as a transcriptional suppressor by reducing TGF-β in the survival expression in PCa epithelial cells [59]. YBX-1 controls AR expression, transcriptionally associated with a splicing mechanism [60]. Thus, YBX-1 seems to be an effective target to inhibit the growth of PCa even at the AR expression stage, where novel AR-targeting agents and medicines are inefficient [61]. The identified YBX1, MYC, and E2F4 TFs were associated mainly with DOIDs (malignant neoplasm of the prostate and prostate carcinoma) and KEGG pathways for disease-enriched cell cycle, pathways in cancer, the FoxO signaling pathway, prostate cancer, the PI3K-Akt signaling pathway, and the p53 signaling pathway, which are considered to be associated with cancer prognosis. Causal interaction analysis of hub genes in DisNor showed 27 Seed Entities with the most genomic alterations in PCa impact. The most significant findings are validated with our results for genes UBE2C, TOP2A, UBE2C, CCNA2, and CENPE, as well as TFs (MYC, E2F4, YBX1) and signaling pathways (PI3K-AKT, PIP3, P53, FoxO).

While acknowledging the limitations of our research, we interpret our results with caution. As with our previous studies [34,35], the current study also relies on mining public databases for bioinformatics analysis, and the data quality was not evaluated. Our approach to using the publicly available RNA-seq data allowed us to maximize the sample size and capture the heterogeneous mix of cells representing different origins, stages, and EDC treatments. However, these datasets may be susceptible to batch effects, platform-dependent variability, and RNA library preparation protocols. Publicly available datasets may vary in annotation consistency and quality control, which may have an effect on downstream data processing. We integrated multiple bioinformatics platforms, such as cBioPortal, UCSC Xena, TIMER2.0, and TRRUST, which are essential for building a robust and biologically meaningful analysis. However, using multiple tools increases the risk of redundancy, overfitting, or inconsistent interpretations if not handled carefully. As demonstrated, each tool had a distinct focus to provide non-redundant data types, which support complementary insights rather than duplication. For example, cBioPortal was used for clinical and genomic alteration data (e.g., mutations, CNVs, survival); UCSC-Xena was used for multi-omics visualization (expression, methylation, clinical features); TIMER2.0 was used for immune cell infiltration analysis using transcriptomic data; and TRRUST was used to map the curated transcription factor–target gene regulatory networks. This analysis enabled us to distinguish the type of evidence from each source based on its strength, without overlapping functions.

Our research strongly suggests the potential roles of EDC-associated hub genes in aggressive PCa pathways. By integrating different RNA-seq datasets, we unraveled hub genes and their interactions in a heterogeneous pool of PCa samples, leading to the aggressive PCa form. This observation reinforces the potential of the identified EDC-associated hub genes and their associated TFs as predictive biomarkers for assessing the environmental health risks of EDC exposures and as potential targets for aggressive PCa diagnosis and treatment. We found that some EDC-associated hub genes and TFs aligned with PCa pathways repeatedly appeared in our previous and current analyses [34,35]. This observation positions these hub genes as strong biomarker candidates for EDC risk assessment studies or clinical prospective studies. Addressing the translation of identified hub genes and TFs into clinically actionable biomarkers or therapeutic targets within the context of patient samples or functional assays is beyond the immediate scope of the current manuscript. However, this will be a crucial direction for future research.

4. Materials and Methods

We used comprehensive RNA-seq datasets and integrated bioinformatics and statistical procedures to analyze the datasets to investigate the hub genes and TFs based on their function roles(s) in regulating biological pathways and PCa prognoses.

4.1. Omics Data Acquisition

To acquire multiple GSE omics datasets for RNA-seq data related to PCa, we accessed and downloaded datasets from GEO/NCBI (http://www.ncbi.nlm.nih.gov/geo, accessed on 15 May 2023 [62]. The datasets incorporated in this research were established on controls following certain benchmarks: (1) the datasets included RNA-seq data only, (2) the origin of the samples was normal prostate or PCa tissue, (3) studies were conducted exclusively in Homo sapiens samples, (4) studies clearly described the sample selection protocol, (5) datasets had publicly available raw data, (6) the dataset included more than six samples (at least three controls and three cases), (7) samples were not treated with medication, radiation, or chemotherapy before sampling, (8) samples were from the USA and European countries, and (9) studies used Affymetrix, Illumina, or Agilent platforms. The following RNA-seq datasets that fit the criteria were identified and downloaded from GEO: GSE70466 [63], GSE151290 [64], GSE128749 [65], GSE120660 [66], GSE135879 [67], GSE136272 [68], GSE218556 [69], GSE64529 [70], GSE5590 [71], GSE128339 [72], GSE109021 [73], GSE200879 [74], GSE103512 [49], GSE104131 [75], and GSE200167 [76]. The characteristics of RNA-seq databases, including platforms, sample size, and description information in this study, are shown in Table 4. The RNA-seq databases were downloaded from GEO for 364 samples, including 272 tumors and 92 normal prostate tissue samples.

4.2. Data Processing

DEGs in PCa and control samples were determined by employing GEO2R/DESeq2 (http://www.ncbi.nlm.nih.gov/geo/geo2r, accessed on 15 May 2023) and the R package DESeq2 package (v 1.40.2) [77]. The Benjamini and Hochberg FDR methods were implemented in acquiring DEGs for multiple RNA-seq datasets [78]. We used significance cutoffs to identify the DEGs. The selection threshold for the DEGs was specified based on |log2FC| (fold change) ≥ 2 and p-value < 0.05. The RNA-seq datasets of each microarray dataset and across the 15 GSE profiles were normalized for distribution by DESeq2 [77]. We utilized OriginPro 9.1 (scientific data analysis and graphing software) [79] to create volcano plots of the DEGs of up- and downregulated genes. Furthermore, Venn diagrams of the overlapping DEGs were output by using the Venny 2.0 (https://bioinfogp.cnb.csic.es/tools/venny/, accessed on 15 June 2023) [80] and InteractiVenn (http://www.interactivenn.net/, accessed on 15 June 2023) [81] web-based tools.

4.3. Statistical Analysis

The GEOquery/DESeq2 package for the R platform was utilized to download the RNA-seq datasets, and the GEOquery R package merges GEO data into R data and R packages (Version. 3.6.0) [82]. Each of the 41 RNA-seq datasets was submitted for normalization, background correction, and summarization. Therefore, the following tools were utilized: Affymetrix-derived datasets—affy [83]; Illumina-derived datasets—lumi; and Agilent with R-package (Version 3.6.0) limma, [84,85]. DESeq2 conducts comparisons on actual submitter-supplied processed data utilizing the R packages GEO-query and linear models for RNA-seq dataset analysis (limma) from the Bioconductor project. Bioconductor provides access to statistical and graphical methods for analyzing genomic data based on the R programming language. The adjusted p-values and FDR (Benjamini and Hochberg) were utilized to balance the results of statistically significant DEGs and to reduce the possibility of false-positive errors.

4.4. Identification of DEGs and Pathway Enrichment Analysis

DAVID.6.8 (https://david.ncifcrf.gov/, accessed on 15 July 2023) is a pathway enrichment and functional analysis tool for high-throughput sequencing of genomic and proteomic datasets that include BP, CC, and MF [86,87]. DAVID.6.8 was utilized based on the KEGG (https://www.genome.jp/kegg/, accessed on 15 July 2023) [88] and GO (http://www.geneontology.org, accessed on 15 July 2023) [89] analyses of the identified genes, with p ≤ 0.05 considered as the cutoff criterion for determining enrichment. Together, the GO and KEGG pathway analyses were used to examine BP, MF, and CC associated with DEGs, their potential enrichment, and their processes in PCa pathways. For creating PPI networks of the DEGs, we used the STRING database (http://string-db.org/, accessed on 15 July 2023) to develop the primary PPI networks (version 11.5 metasearch engine) [90]. The parameter analysis criteria in STRING were a degree of confidence of 0.40, human species, local gene fusion databases, a clustering coefficient of 0.78, and a PPI enrichment p-value < 1.0⁻¹⁶. STRING functions as an online access point for analyzing associations between mixed genes/proteins on a genome-wide scale, Gene Ontology, clustering, and centralities, which is valuable for understanding the PPI network and functions.

4.5. Biological System Analysis and Module Network Mining

Cytoscape software (version 3.10.0), an open-source platform (https://cytoscape.org/, accessed on 15 June 2023) was utilized for module network analysis and selection. Cytoscape provides complex omics datasets through analysis and visualization functions of genes/proteins through enormous and dynamic apps. Also, Cytoscape apps include a network of biological functionality and specialized programming approaches for producing complex and evolving workflows [91]. Therefore, PPI networks were obtained from STRING and imported into the Cytoscape for network and module analysis [92]. Networks were analyzed regarding clustering and GO, aiming to discover the most topologically relevant nodes/genes and molecular pathways associated with PPI networks [93]. A Cytoscape plug-in, Molecular Complex Detection (MCODE), was used to discover significant clustering modules and PPI networks to select intensively connected ones [94]. The following previously defined criteria in MCODE were implemented to determine significant modules in PPI networks: degree cutoff = 2; node score cutoff = 0.2; K-Core = 2; max depth = 100 [95]. To discover the most hub genes in the PPI networks and modules, we implemented the Cytoscape plug-in CytoHubba. CytoHubba scored and ranked the nodes using algorithms based on eleven topological analysis methods for duplicated measurements to strengthen the observation of the interactions [96]. In our study, we retained results with precise prediction regarding MCC, DNMC, EPC, degree, and MNC [35,96]. In the next step, to generate an interaction network to assess hub gene/protein lists from identified modules, GeneMANIA was used to create and rank hub genes and associated genes for functional experiments and molecular interaction networks [97]. GeneMANIA is established on co-expression networks, physical interactions, genetic interactions, co-localization pathways, and predicted and shared protein domain information [98,99]. The top 20 overlapping hub genes based on these five topological methods from CytoHubba were selected for additional bioinformatics analysis utilizing GeneMANIA, which includes a comprehensive set of datasets from BioGRID, GEO, I2D, and Pathway Commons, in addition to nine organisms distinguishing functional genomics datasets [97,98,99]. In our study, we selected all networks and Homo sapiens, as defined previously [35]. Hub genes and TFs identified from the RNA-seq dataset will be implemented for further validation, verification, and evaluation/analysis on various bioinformatics data tools for signature-related genes in the prognosis of PCa.

4.6. Genetic Alteration in Hub Genes

The cBioPortal, V 5.3.13, is an open-access online tool (http://www.cbioportal.org, accessed on 30 July 2023) integrating the datasets from large-scale genomic projects including, but not limited to, TCGA, MSK, and ICGC, a resource for examining a multidimensional cancer genomics dataset [100]. The cBioPortal was utilized (according to online instructions) to investigate the visualization and comparison of genetic alterations of the hub genes involved in the signature. The PCa datasets from 26 studies containing 10,998 samples were chosen to analyze the genomic profile changes, retaining mutations, gene alterations (GISTIC: genomic identification of significant targets in cancer), and microarray mRNA expression (Z-scores). Genetic alteration co-occurrence and mutual exclusivity between each enquired hub genes were defined by p value, log2 odds ratio, and q value, where a p value < 0.05 was considered significant. Oncoprints describing genomic alteration mutations and the number of samples were generated employing a heatmap through cBioPortal Oncoprint [101,102].

4.7. Visualizing the Heatmap and Hub Gene Expressions

UCSC-Xena (http://xena.ucsc.edu/ accessed on 30 July 2023) is a cancer genomics visualization tool, and datasets in UCSC-Xena include but are not limited to TCGA, ICGC, GTEx, and GDC. UCSC-Xena was used for hierarchical clustering heatmaps and gene expression of hub genes [103]. For heatmaps, highly expressed genes were calculated by (log2(norm_count + 1), Gleason score [rated from 6 (dark yellow) to 10 (light yellow)], and age [rated from 41 (light gray) to 10 (dark gray)]. Gene expressions were calculated by log2(fpkm-uq + 1) and z-transformed based on the RNAseq RSEM norm_count, and p values < 0.05 were viewed as significant [35,103].

4.8. Hub Genes’ Association with Immune Cell Infiltration

We used the TIMER2.0 database (http://timer.cistrome.org/, accessed on 30 July 2023) to construct reliable immune infiltration estimations. TIMER2.0, utilizing an R-package (3.6.0), integrates six algorithms that have been systematically benchmarked, including TIMER, xCell, CIBERSORT, MCP-counter, quanTIseq, and EPIC [104]. TIMER2.0 is a comprehensive resource for identifying immune cell infiltration in cancers utilizing RNA-seq expression profiling datasets [105,106]. In the current study, hub genes’ expression was associated with five kinds of immune infiltration (CD4+ T cells, CD8+ T cells, B cells, neutrophils, and macrophages), which were correlated and assessed in the “Gene” module of “TIMER.” In addition, survival analysis was performed by TIMER2.0 for hub genes and immune infiltration cells by applying the following criteria:

Model Survivor analysis = Immune Infiltrates + Gene Expression + TIMMER (algorithm).

Kaplan–Meier curve parameters included a split expression of patients at 50%, a split infiltration of patients at 50%, and time in months. Cox proportional regression hazard mode was used for the model section, and infiltration level was divided into low and high groups with identified HRs; a p-value < 0.05 was deemed statistically significant [104,105,106].

4.9. ClueGO and CluePedia: Functional Enrichment Analysis

ClueGO (Version 2.5.10) and CluePedia (Version 1.5.10) were employed for functional enrichment analysis and pathway annotation networks for hub genes and TFs. ClueGO and CluePedia present the possibility of estimating enrichment analysis in two-sided tests established based on the hypergeometric distribution and utilized in combination with Bonferroni tests with a significance of p < 0.05. ClueGO combines GO terms and KEGG/BioCarta pathways, constructs a functionally categorized GO pathway, and visualizes the network [107,108]. It can analyze one or two lists of genes and comprehensively visualize functionally grouped terms. The parameter was restricted to facilitate the figure and only demonstrate the key pathway.

4.10. Redundancy Analyses of Hub Genes

We utilized Cancer DepMap (https://depmap.org/portal/, accessed on 30 July 2023) to determine the functional hub genes critical for PCa prognosis and survival. Cancer DepMap is established on the RNAi and CRISPR-Cas9 knockout databases [109,110]. CRISPR/Cas9 is a gene-editing technology that applies to direct RNA to match a selected target gene and Cas9 enzyme (CRISPR-associated protein-9, an endonuclease), which generates a double-stranded DNA break, allowing transformations to the genome [111]. RNA interference (RNAi) is a technique for silencing gene expression, which is the consequence of the degradation of RNA into short RNAs that activate ribonucleases to target mRNA [112]. DepMap includes cancer types based on their contribution to cancer mortality and their cancer cell line representation in the documented dataset. In our study, we utilized DepMap to (1) conduct redundancy analysis for hub genes, (2) identify the essential function of indicated hub genes in PCa total cell line survival, and (3) identify the CERES dependency score of hub genes in PCa cell lines.

4.11. Hub Gene Enrichment Analysis—Enrichr

Numerous experimentally verified online TF networks are available to evaluate TFs’ interaction with hub genes. This study used Enrichr (https://maayanlab.cloud/Enrichr/, accessed on 30 July 2023) to obtain BP, CC, MF, and KEGG pathways and identify TFs [113,114]. Enrichr includes WikiPathways, Chip Enrichment Analysis (ChEA), Reactome, and the BioCarta database [115]. The ggplot2 R package was employed to generate figures for Enrichr analysis. p-value < 0.05 was selected as the criterion.

4.12. TF Association Network with Hub Genes

TRRUST Version 2 (https://www.grnpedia.org/trrust/, accessed on 30 July 2023) is an available database of human TFs, gene interactions, and regulatory networks [116]. The TRRUST V.2 database was utilized to obtain the TFs regulating hub genes through BP, KEGG, and disease pathways. The species was selected as Homo sapiens, and a p-value <0.05 was set as the significance threshold [117]. Moreover, we implemented a functional network of the most significant TFs associated with hub genes via HumanNet, which includes a probabilistic functioning gene network of 19,000 validated protein-encoding genes of Homo sapiens. Each relation in HumanNet has a linked log-likelihood score (LLS) that calculates the probability of an interaction designating a true functional connection between two genes [118]. Finally, we created a PPI network of interactions with the most significant TFs associated with hub genes through iRefIndex, which is a compressed PPI database containing molecular interactions derived from various sources such as BioGRID, BIND, DIP, and HPRD [119].

4.13. Causal Association and Protein Interactions and Phenotypes with DEGs

Phenolyzer (https://phenolyzer.wglab.org/, accessed on 30 July 2023) was used to perform a cross-sectional comparison of hub genes and phenotypes, which would provide indicators for measuring expression across cancers [120]. Phenolyzer was used to research various forms of PCa—prostatic neoplasms, familial PCa, prostatic hyperplasia, and PCa neoplasia—which are clinical phenotype terms that are essentially related to each other. Then, we implemented SIGNOR 3.0 (https://signor.uniroma2.it, accessed on 30 July 2023), a public dataset of static maps that capture causal information and interactions that can be tailored, pruned and developed to create dynamic and predictive models [119]. Each signaling association is annotated with an effect of up- and downregulation and with the mechanisms in which indices (e.g., transcriptional activation, binding, and phosphorylation) generate the regulation of the target proteins/genes. SIGNOR 3.0 guides several exterior resources that characterize the use of the data in SIGNOR 3.0 to extract biologically suitable information in three biological parts: Disnor (disease pathways and disease-associated genes), Myo-REG (cell signaling interactions), and CancerGeneNet (genes repeatedly mutated in hallmark cancer phenotypes) [121,122].

5. Conclusions

In conclusion, the interactions among hub genes, TFs, and environmental risk factors, particularly endocrine-disrupting chemicals (EDCs), reveal a complex interplay that emphasizes the multifactorial nature of prostate cancer (PCa) development and prognosis. To our knowledge, this represents one of the first comprehensive RNA-seq analyses, integrating data from 15 gene expression enrichment (GSE) datasets focused on PCa. These include PCa cell lines treated with synthetic androgens, exposed to selected EDCs, and utilized to explore the role of MYC (a TF) in carcinogenic processes. The dataset captures a heterogeneous range of prostate and PCa cells, including various ages and races, as well as high-grade Gleason scores of ≥7 (4 + 3). Our study employed advanced bioinformatics tools to analyze gene expression profiles through GSE, focused on PCa-related pathways. We validated hub genes (NCAPG, MKI67, CCNA2, CCNB1, CDK1, CCNB2, AURKA, UBE2C, BUB1B) and their associated TFs (MYC, E2F4, YBX1) within the aggressive PCa pathway, particularly among older patients. Survival analysis, aligned with high Gleason scores and age, indicates that these hub genes and their TFs are essential to the cellular signaling pathways implicated in aggressive PCa prognosis. These findings underscore the potential of these hub genes as candidates for future studies aimed at developing molecular biomarkers for predicting and assessing EDC-related PCa risk, diagnosing PCa aggressiveness, and identifying therapeutic targets.

Supplementary Materials

The following supporting information, Figures S1–S13 and Tables S1–S6, can be downloaded at: https://www.mdpi.com/article/10.3390/ijms26125463/s1.

Author Contributions

D.A. and A.D. conceptualized the research and analyzed the results. D.A. designed the study, analyzed the RNA-seq and epidemiologic data, and compiled the gene network analysis data. M.D. analyzed and verified RNA-seq data analysis. C.Y. and Q.F. verified and examined the data analysis. D.A. and A.D. developed the draft of the manuscript, which was further improved, reviewed, and finalized by Q.F. and D.R. The manuscript draft was further enhanced and finished by Q.F., C.Y. and D.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated during the current study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors acknowledge the support from the Department of Environmental Health Sciences, Stempel College of Public Health and Social Work, to Diaaidden Alwadi to conduct his graduate research. The authors also acknowledge the committee members, Muhammad Hossain and Juan Liuzzi, for their advice in shaping the research direction. Additionally, the authors thank Roberto Lucchini for his suggestions on improving the manuscript.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Abbreviations

ADT	Androgen Deprivation Therapy
AR	Androgen Receptor
BP	Biological Process
cBioPortal	cBio Cancer Genomics Portal
CC	Cellular Component
ChEA	Chip Enrichment Analysis
CT	Computed Tomography
DAVID	Database for Annotation, Visualization, and Integrated Discovery
DEGs	Differentially Expressed Genes
DHT	Dihydrotestosterone
DOIDs	Disease Ontology Identifiers
DNMC	Degree, Density of Maximum Neighborhood Component
E2F4	E2F Transcription Factor 4
EDCs	Endocrine-Disrupting Chemicals
EPC	Edge Percolated Component
FDR	False Discovery Rate
FPKM	Fragments Per Kilobase of transcript per Million
GDC	Genomic Data Commons
GO	Gene Ontology
GEO	Expression Omnibus database
GISTIC	Genomic Identification of Significant Targets in Cancer
GTEx	Genotype-Tissue Expression
ICGC	International Cancer Genome Consortium
KEGG	Kyoto Encyclopedia of Genes and Genomes
KM	Kaplan–Meier
MCC	Maximal Clique Centrality
MCODE	Molecular Complex Detection
MF	Molecular Function
MNC	Maximum Neighborhood Component
MRI	Magnetic Resonance Imaging
MSK	Memorial Sloan Kettering Cancer Center
MYC	MYC Proto-Oncogene, BHLH Transcription Factor
NFY	Nuclear transcription factor Y
NCBI	National Center for Biotechnology Information
NHANES	National Health and Nutrition Examination Survey
PCa	Prostate Cancer
PEI	Pathway Enrichment Analysis
PPI	Protein–Protein Interaction
PSA	Prostate-Specific Antigen
SIGNOR	SIGnaling Network Open Resource
STRING	Retrieval of Interacting Genes/Proteins
TCGA	The Cancer Genome Atlas
TFs	Transcription Factors
TIMER 2.0	Tumor Immunity Evaluation Resource 2.0
TRRUST	Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining
UCSC_TFBS	University of California, Santa Cruz—transcription factor binding sites
UCSC-Xena	University of California Santa Cruz—Xena
YBX1	Y box binding protein 1

References

Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2022. CA Cancer J. Clin. 2021, 71, 7–33. [Google Scholar] [CrossRef] [PubMed]
Viale, P.H. The American Cancer Society’s Facts & Figures: 2022 Edition. J. Adv. Pract. Oncol. 2020, 11, 135–136. [Google Scholar] [PubMed]
Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J. Clin. 2021, 71, 209–249. [Google Scholar] [CrossRef] [PubMed]
Packer, J.R.; Maitland, N.J. The molecular and cellular origin of human prostate cancer. Biochim. Biophys. Acta Mol. Cell Res. 2016, 1863, 1238–1260. [Google Scholar] [CrossRef]
Tagai, E.K.; Miller, S.M.; Kutikov, A.; Diefenbach, M.A.; Gor, R.A.; Al-Saleem, T.; Chen, D.Y.; Fleszar, S.; Roy, G. Prostate Cancer Patients’ Understanding of the Gleason Scoring System: Implications for Shared Decision-Making. J. Cancer Educ. 2018, 34, 441–445. [Google Scholar] [CrossRef]
Messner, E.A.; Steele, T.M.; Tsamouri, M.M.; Hejazi, N.; Gao, A.C.; Mudryj, M.; Ghosh, P.M. The androgen receptor in prostate cancer: Effect of structure, ligands and spliced variants on therapy. Biomedicines 2020, 8, 422. [Google Scholar] [CrossRef]
Centenera, M.M.; Selth, L.A.; Ebrahimie, E.; Butler, L.M.; Tilley, W.D. New Opportunities for Targeting the Androgen Receptor in Prostate Cancer. Cold Spring Harb. Perspect. Med. 2018, 8, a030478. [Google Scholar] [CrossRef]
Cai, C.; He, H.H.; Chen, S.; Coleman, I.; Wang, H.; Fang, Z.; Chen, S.; Nelson, P.S.; Liu, X.S.; Brown, M.; et al. Androgen Receptor Gene Expression in Prostate Cancer Is Directly Suppressed by the Androgen Receptor Through Recruitment of Lysine-Specific Demethylase 1. Cancer Cell 2011, 20, 457–471. [Google Scholar] [CrossRef]
Saltman, A.; Zegar, J.; Haj-Hamed, M.; Verma, S.; Sidana, A. Prostate cancer biomarkers and multiparametric MRI: Is there a role for both in prostate cancer management? Ther. Adv. Urol. 2021, 13, 1756287221997186. [Google Scholar] [CrossRef]
Wang, R.; Xiao, Y.; Pan, M.; Chen, Z.; Yang, P. Integrative Analysis of Bulk RNA-Seq and Single-Cell RNA-Seq Unveils the Characteristics of the Immune Microenvironment and Prognosis Signature in Prostate Cancer. J. Oncol. 2022, 2022, 1–28. [Google Scholar] [CrossRef]
Draisma, G.; Etzioni, R.; Tsodikov, A.; Mariotto, A.; Wever, E.; Gulati, R.; Feuer, E.; de Koning, H. Lead Time and Overdiagnosis in Prostate-Specific Antigen Screening: Importance of Methods and Context. J. Natl. Cancer Inst. 2009, 101, 374–383. [Google Scholar] [CrossRef] [PubMed]
Catalona, W.J. Prostate Cancer Screening. Med. Clin. N. Am. 2018, 102, 199–214. [Google Scholar] [CrossRef]
Moyer, V.A. Screening for prostate cancer: U.S. preventive services task force recommendation statement. Ann. Intern. Med. 2012, 157, 120–134. [Google Scholar] [CrossRef]
Gu, P.; Yang, D.; Zhu, J.; Zhang, M.; He, X. Bioinformatics analysis identified hub genes in prostate cancer tumorigenesis and metastasis. Math. Biosci. Eng. MBE 2021, 18, 3180–3196. [Google Scholar] [CrossRef]
Endo, T.; Uzawa, K.; Suzuki, H.; Tanzawa, H.; Ichikawa, T. Characteristic gene expression profiles of benign prostatic hypertrophy and prostate cancer. Int. J. Oncol. 2009, 35, 499–509. [Google Scholar] [PubMed]
Sun, J.; Li, S.; Wang, F.; Fan, C.; Wang, J. Identification of key pathways and genes in pten mutation prostate cancer by bioinformatics analysis. BMC Med. Genet. 2019, 20, 191. [Google Scholar] [CrossRef]
Song, F.; Zhang, Y.; Pan, Z.; Hu, X.; Yi, Y.; Zheng, X.; Wei, H.; Huang, P. Identification of novel key genes associated with the metastasis of prostate cancer based on bioinformatics prediction and validation. Cancer Cell Int. 2021, 21, 559. [Google Scholar] [CrossRef]
Liu, K.; Chen, Y.; Feng, P.; Wang, Y.; Sun, M.; Song, T.; Tan, J.; Li, C.; Liu, S.; Kong, Q.; et al. Identification of Pathologic and Prognostic Genes in Prostate Cancer Based on Database Mining. Front. Genet. 2022, 13, 854531. [Google Scholar] [CrossRef] [PubMed]
Lu, W.; Ding, Z. Identification of key genes in prostate cancer gene expression profile by bioinformatics. Andrologia 2019, 51, e13169. [Google Scholar] [CrossRef]
Song, Z.; Chao, F.; Zhuo, Z.; Ma, Z.; Li, W.; Chen, G. Identification of hub genes in prostate cancer using robust rank aggregation and weighted gene co-expression network analysis. Aging 2019, 11, 4736–4756. [Google Scholar] [CrossRef]
Wang, Y.; Wang, J.; Yan, K.; Lin, J.; Zheng, Z.; Bi, J. Identification of core genes associated with prostate cancer progression and outcome via bioinformatics analysis in multiple databases. PeerJ 2020, 2020, e8786. [Google Scholar] [CrossRef] [PubMed]
Tong, Y.; Song, Y.; Deng, S. Combined analysis and validation for DNA methylation and gene expression profiles associated with prostate cancer. Cancer Cell Int. 2019, 19, 50. [Google Scholar] [CrossRef] [PubMed]
Tan, J.; Jin, X.; Wang, K. Integrated Bioinformatics Analysis of Potential Biomarkers for Prostate Cancer. Pathol. Oncol. Res. 2017, 25, 455–460. [Google Scholar] [CrossRef]
He, Z.; Duan, X.; Zeng, G. Identification of potential biomarkers and pivotal biological pathways for prostate cancer using bioinformatics analysis methods. PeerJ 2019, 2019, e7872. [Google Scholar] [CrossRef]
Song, Z.; Huang, Y.; Zhao, Y.; Ruan, H.; Yang, H.; Cao, Q.; Liu, D.; Zhang, X.; Chen, K. The Identification of Potential Biomarkers and Biological Pathways in Prostate Cancer. J. Cancer 2019, 10, 1398–1408. [Google Scholar] [CrossRef] [PubMed]
Yu, P.; Dai, Y.; Zhuang, T.; Yue, X.; Chen, Y.; Wang, X.; Duan, X.; Ping, Y.; Xie, Y.; Cao, Y.; et al. Identification and Validation of Three Hub Genes Involved in Cell Proliferation and Prognosis of Castration-Resistant Prostate Cancer. Oxidative Med. Cell. Longev. 2022, 2022, 8761112. [Google Scholar] [CrossRef]
Roy, D.; Morgan, M.; Yoo, C.; Deoraj, A.; Roy, S.; Yadav, V.; Garoub, M.; Assaggaf, H.; Doke, M. Integrated Bioinformatics, environmental epidemiologic and genomic approaches to identify environmental and molecular links between endometriosis and breast cancer. Int. J. Mol. Sci. 2015, 16, 25285–25322. [Google Scholar] [CrossRef]
Deoraj, A.; Yoo, C.; Roy, D. Integrated bioinformatics biostatistics and molecular epidemiologic approaches to study how the environment and genes work together to affect the development of complex chronic diseases. In Gene-Environment Interaction Analysis: Methods in Bioinformatics and Computational Biology; Anno, S., Ed.; Pan Stanford Publishing Pte. Ltd.: New York, NY, USA, 2016; pp. 151–191. [Google Scholar]
Prins, G.S. Endocrine Disruptors and Prostate Cancer Risk. Endocr. Relat. Cancer 2008, 15, 649–656. [Google Scholar] [CrossRef] [PubMed]
Tarapore, P.; Ying, J.; Ouyang, B.; Burke, B.; Bracken, B.; Ho, S.-M. Exposure to Bisphenol a Correlates with Early-Onset Prostate Cancer and Promotes Centrosome Amplification and Anchorage-Independent Growth in Vitro. PLoS ONE 2014, 9, e90332. [Google Scholar] [CrossRef]
Golden, R.; Gandy, J.; Vollmer, G. A Review of the Endocrine Activity of Parabens and Implications for Potential Risks to Human Health. Crit. Rev. Toxicol. 2005, 35, 435–458. [Google Scholar] [CrossRef]
Ho, S.M.; Rao, R.; To, S.; Schoch, E.; Tarapore, P. Bisphenol A and Its Analogues Disrupt Centrosome Cycle and Microtubule Dynamics in Prostate Cancer. Endocr. Relat. Cancer 2017, 24, 83–96. [Google Scholar] [CrossRef]
De Falco, M.; Laforgia, V. Combined Effects of Different Endocrine-Disrupting Chemicals (EDCs) on Prostate Gland. Int. J. Environ. Res. Public Health 2021, 18, 9772. [Google Scholar] [CrossRef]
Alwadi, D.; Felty, Q.; Roy, D.; Yoo, C.; Deoraj, A. Environmental Phenol and Paraben Exposure Risks and Their Potential Influence on the Gene Expression Involved in the Prognosis of Prostate Cancer. Int. J. Mol. Sci. 2022, 23, 3679. [Google Scholar] [CrossRef]
Alwadi, D.; Felty, Q.; Yoo, C.; Roy, D.; Deoraj, A. Endocrine Disrupting Chemicals Influence Hub Genes Associated with Aggressive Prostate Cancer. Int. J. Mol. Sci. 2023, 24, 3191. [Google Scholar] [CrossRef]
Jarred, R.A.; Keikha, M.; Dowling, C.; McPherson, S.J.; Clare, A.M.; Husband, A.J.; Pedersen, J.S.; Frydenberg, M.; Risbridger, G.P. Induction of Apoptosis in Low to Moderate-Grade Human Prostate Carcinoma by Red Clover-derived Dietary Isoflavones. Cancer Epidemiol. Biomark. Prev. 2002, 11, 1689–1696. [Google Scholar]
Doultsinos, D.; Mills, I.G. Derivation and application of molecular signatures to prostate cancer: Opportunities and challenges. Cancers 2021, 13, 495. [Google Scholar] [CrossRef]
Lakshman, M.; Xu, L.; Ananthanarayanan, V.; Cooper, J.; Takimoto, C.H.; Helenowski, I.; Pelling, J.C.; Bergan, R.C. Dietary genistein inhibits metastasis of human prostate cancer in mice. Cancer Res. 2008, 68, 2024–2032. [Google Scholar] [CrossRef]
Artacho-Cordón, F.; Fernández, M.F.; Frederiksen, H.; Iribarne-Durán, L.M.; Jiménez-Díaz, I.; Vela-Soria, F.; Andersson, A.M.; Martin-Olmedo, P.; Peinado, F.M.; Olea, N.; et al. Environmental Phenols and Parabens in Adipose Tissue from Hospitalized Adults in Southern Spain. Environ. Int. 2018, 119, 203–211. [Google Scholar] [CrossRef]
Tai, K.-Y.; Shiah, S.-G.; Shieh, Y.-S.; Kao, Y.-R.; Chi, C.-Y.; Huang, E.; Lee, H.-S.; Chang, L.-C.; Yang, P.-C.; Wu, C.-W. DNA methylation and histone modification regulate silencing of epithelial cell adhesion molecule for tumor invasion and progression. Oncogene 2007, 26, 3989–3997. [Google Scholar] [CrossRef]
Bonkhoff, H. Estrogen receptor signaling in prostate cancer: Implications for carcinogenesis and tumor progression. Prostate 2018, 78, 2–10. [Google Scholar] [CrossRef]
Wan, J.; Zhang, J.; Zhang, J. Expression of P53 and Its Mechanism in Prostate Cancer. Oncol. Lett. 2018, 16, 378–382. [Google Scholar] [CrossRef]
Habrowska-Górczyńska, D.E.; Kozieł, M.J.; Kowalska, K.; Piastowska-Ciesielska, A.W. FOXO3A and Its Regulators in Prostate Cancer. Int. J. Mol. Sci. 2021, 22, 12530. [Google Scholar] [CrossRef]
Carabet, L.A.; Lallous, N.; Leblanc, E.; Ban, F.; Morin, H.; Lawn, S.; Ghaidi, F.; Lee, J.; Mills, I.G.; Gleave, M.E.; et al. Computeraided drug discovery of Myc-Max inhibitors as potential therapeutics for prostate cancer. Eur. J. Med. Chem. 2018, 160, 108–119. [Google Scholar] [CrossRef]
Qiu, X.; Boufaied, N.; Hallal, T.; Feit, A.; de Polo, A.; Luoma, A.M.; Larocque, J.; Zadra, G.; Xie, Y.; Gu, S.; et al. MYC drives aggressive prostate cancer by disrupting transcriptional pause release at androgen receptor targets. Nat. Commun. 2021, 13, 2559. [Google Scholar] [CrossRef]
Xie, B.; Wang, S.; Jiang, N.; Li, J.J. Cyclin B1/CDK1-regulated mitochondrial bioenergetics in cell cycle progression and tumor resistance. Cancer Lett. 2019, 443, 56–66. [Google Scholar] [CrossRef]
Feng, T.; Wei, D.; Li, Q.; Yang, X.; Han, Y.; Luo, Y.; Jiang, Y. Four Novel Prognostic Genes Related to Prostate Cancer Identified Using Co-Expression Structure Network Analysis. Front. Genet. 2021, 12, 584164. [Google Scholar] [CrossRef]
Qing, Y.; Wang, Y.; Hu, C.; Zhang, H.; Huang, Y.; Zhang, Z.; Ma, T.; Zhang, S.; Li, K. Evaluation of Notch Family Genes’ Expression and Prognostic Value in Prostate Cancer. Transl. Androl. Urol. 2022, 11, 627–642. [Google Scholar] [CrossRef]
Brouwer-Visser, J.; Cheng, W.-Y.; Bauer-Mehren, A.; Maisel, D.; Lechner, K.; Andersson, E.; Dudley, J.T.; Milletti, F. Regulatory T-Cell Genes Drive Altered Immune Microenvironment in Adult Solid Cancers and Allow for Immune Contextual Patient Subtyping. Cancer Epidemiol. Biomark. Prev. 2018, 27, 103–112. [Google Scholar] [CrossRef]
Wang, X.; Simpson, E.R.; Brown, K.A. P53: Protection against Tumor Growth beyond Effects on Cell Cycle and Apoptosis. Cancer Res. 2016, 76, 1668. [Google Scholar] [CrossRef]
Zeng, X.; Shi, G.; He, Q.; Zhu, P. Screening and Predicted Value of Potential Biomarkers for Breast Cancer Using Bioinformatics Analysis. Sci. Rep. 2021, 11, 20799. [Google Scholar] [CrossRef]
Gao, J.; Xia, R.; Chen, J.; Gao, J.; Luo, X.; Ke, C.; Ren, C.; Li, J.; Mi, Y. Inhibition of Esophageal-Carcinoma Cell Proliferation by Genistein via Suppression of JAK1/2-STAT3 and AKT/MDM2/P53 Signaling Pathways. Aging 2020, 12, 6240–6259. [Google Scholar] [CrossRef]
Engeland, K. Cell Cycle Arrest through Indirect Transcriptional Repression by P53: I Have a Dream. Cell Death Differ. 2017, 25, 114–132. [Google Scholar] [CrossRef]
Kumar, A.; Coleman, I.; Morrissey, C.; Zhang, X.; True, L.D.; Gulati, R.; Etzioni, R.; Bolouri, H.; Montgomery, B.; White, T.; et al. Substantial Interindividual and Limited Intraindividual Genomic Diversity among Tumors from Men with Metastatic Prostate Cancer. Nat. Med. 2016, 22, 369–378. [Google Scholar] [CrossRef]
Ribeiro, F.R.; Jerónimo, C.; Henrique, R.; Fonseca, D.; Oliveira, J.; Lothe, R.A.; Teixeira, M.R. 8Q Gain Is an Independent Predictor of Poor Survival in Diagnostic Needle Biopsies from Prostate Cancer Suspects. Clin. Cancer Res. 2006, 12, 3961–3970. [Google Scholar] [CrossRef]
DuPree, E.L.; Mazumder, S.; Almasan, A. Genotoxic Stress Induces Expression of E2F4, Leading to Its Association with P130 in Prostate Carcinoma Cells. Cancer Res. 2004, 64, 4390–4393. [Google Scholar] [CrossRef]
Taylor, W.R.; Schonthal, A.H.; Galante, J.; Stark, G.R. p130/E2F4 binds to and represses the cdc2 promoter in response to p53. J. Biol. Chem. 2001, 276, 1998–2006. [Google Scholar] [CrossRef]
Yang, J.; Song, K.; Krebs, T.L.; Jackson, M.W.; Danielpour, D. RB/E2F4 and Smad2/3 Link Survivin to TGF-β-Induced Apoptosis and Tumor Progression. Oncogene 2008, 27, 5326–5338. [Google Scholar] [CrossRef]
Shiota, M.; Fujimoto, N.; Imada, K.; Yokomizo, A.; Itsumi, M.; Takeuchi, A.; Kuruma, H.; Inokuchi, J.; Tatsugami, K.; Uchiumi, T.; et al. Potential Role for YB-1 in Castration-Resistant Prostate Cancer and Resistance to Enzalutamide through the Androgen Receptor V7. J. Natl. Cancer Inst. 2016, 108, djw005. [Google Scholar] [CrossRef]
Montes, M.; Becerra, S.; Sánchez-Álvarez, M.; Suñé, C. Functional Coupling of Transcription and Splicing. Gene 2012, 501, 104–117. [Google Scholar] [CrossRef]
Wyatt, A.W.; Gleave, M.E. Targeting the Adaptive Molecular Landscape of Castration-resistant Prostate Cancer. EMBO Mol. Med. 2015, 7, 878–894. [Google Scholar] [CrossRef]
Barrett, T.; Troup, D.B.; Wilhite, S.E.; Ledoux, P.; Rudnev, D.; Evangelista, C.; Kim, I.F.; Soboleva, A.; Tomashevsky, M.; Edgar, R. NCBI GEO: Mining tens of millions of expression profiles—Database and tools update. Nucleic Acids Res. 2007, 35, D760–D765. [Google Scholar] [CrossRef] [PubMed]
Poluri, R.T.; Paquette, V.; Allain, É.P.; Lafront, C.; Joly-Beauparlant, C.; Weidmann, C.; Droit, A.; Guillemette, C.; Pelletier, M.; Audet-Walsh, É. KLF5 and NFYA Factors as Novel Regulators of Prostate Cancer Cell Metabolism. Endocr.-Relat. Cancer 2021, 28, 257–271. [Google Scholar] [CrossRef]
Poluri, R.T.K.; Beauparlant, C.J.; Droit, A.; Audet-Walsh, É. RNA sequencing data of human prostate cancer cells treated with androgens. Data Brief 2019, 25, 104372. [Google Scholar] [CrossRef]
Jividen, K.; Kedzierska, K.Z.; Yang, C.; Szlachta, K.; Ratan, A.; Paschal, B.M. Genomic analysis of DNA repair genes and androgen signaling in prostate cancer. BMC Cancer 2018, 18, 960. [Google Scholar] [CrossRef] [PubMed]
Yang, C.; Jividen, K.; Kamata, T.; Dworak, N.; Oostdyk, L.; Remlein, B.; Pourfarjam, Y.; Kim, I.; Du, K.; Abbas, T.; et al. Androgen signaling uses a writer and a reader of ADP-ribosylation to regulate protein complex assembly. Nat. Commun. 2021, 12, 2705. [Google Scholar] [CrossRef]
Nyquist, M.D.; Corella, A.; Mohamad, O.; Coleman, I.; Kaipainen, A.; Kuppers, D.A.; Lucas, J.M.; Paddison, P.J.; Plymate, S.R.; Nelson, P.S.; et al. Molecular determinants of response to high-dose androgen therapy in prostate cancer. JCI Insight 2019, 4, e129715. [Google Scholar] [CrossRef]
Brzezinka, K.; Nevedomskaya, E.; Lesche, R.; Haegebarth, A.; Ter Laak, A.; Fernández-Montalván, A.E.; Eberspaecher, U.; Werbeck, N.D.; Moenning, U.; Siegel, S.; et al. Characterization of the menin-MLL interaction as therapeutic cancer target. Cancers 2020, 12, 201. [Google Scholar] [CrossRef]
Cato, L.; Neeb, A.; Sharp, A.; Buzón, V.; Ficarro, S.B.; Yang, L.; Muhle-Goll, C.; Kuznik, N.C.; Riisnaes, R.; Rodrigues, D.N.; et al. Development of bag-1L as a therapeutic target in androgen receptor-dependent prostate cancer. eLife 2023, 6, e27159. [Google Scholar] [CrossRef] [PubMed]
Metzger, E.; Willmann, D.; McMillan, J.; Forne, I.; Metzger, P.; Gerhardt, S.; Petroll, K.; Von Maessenhausen, A.; Urban, S.; Schott, A.; et al. Assembly of methylated KDM1A and CHD1 drives androgen receptor-dependent transcription and translocation. Nat. Struct. Mol. Biol. 2016, 23, 132–139. [Google Scholar] [CrossRef]
Lin, S.; Wei, H.; Maeder, D.; Franklin, R.B.; Feng, P. Profiling of zinc-altered gene expression in human prostate normal vs. cancer cells: A time course study. J. Nutr. Biochem. 2009, 20, 1000–1012. [Google Scholar] [CrossRef]
Zhang, H.; Gordon, R.; Li, W.; Yang, X.; Pattanayak, A.; Fowler, G.; Zhang, L.; Catalona, W.J.; Ding, Y.; Xu, L.; et al. Genistein treatment duration effects biomarkers of cell motility in human prostate. PLoS ONE 2019, 14, e0214078. [Google Scholar] [CrossRef] [PubMed]
Rooney, J.P.; Chorley, B.; Kleinstreuer, N.; Corton, J.C. Identification of Androgen Receptor Modulators in a Prostate Cancer Cell Line Microarray Compendium. Toxicol. Sci. 2018, 166, 146–162. [Google Scholar] [CrossRef] [PubMed]
Firlej, V.; Soyeux, P.; Nourieh, M.; Huet, E.; Semprez, F.; Allory, Y.; Londono-Vallejo, A.; de la Taille, A.; Vacherot, F.; Destouches, D. Overexpression of Nucleolin and Associated Genes in Prostate Cancer. Int. J. Mol. Sci. 2022, 23, 4491. [Google Scholar] [CrossRef]
Teslow, E.A.; Bao, B.; Dyson, G.; Legendre, C.; Mitrea, C.; Sakr, W.; Carpten, J.D.; Powell, I.; Bollig-Fischer, A. Exogenous IL-6 induces mRNA splice variant MBD2_v2 to promote stemness in TP53 wild-type, African American PCa cells. Mol. Oncol. 2018, 12, 1138–1152. [Google Scholar] [CrossRef] [PubMed]
Wei, Z.; Wang, S.; Xu, Y.; Wang, W.; Soares, F.; Ahmed, M.; Su, P.; Wang, T.; Orouji, E.; Xu, X.; et al. MYC reshapes CTCF-mediated chromatin architecture in prostate cancer. Nat. Commun. 2023, 14, 1787. [Google Scholar] [CrossRef]
Love, M.I.; Huber, W.; Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014, 15, 550. [Google Scholar] [CrossRef]
Rahman, M.H.; Rana, H.K.; Peng, S.; Hu, X.; Chen, C.; Quinn, J.M.W.; Moni, M.A. Bioinformatics and machine learning methodologies to identify the effects of central nervous system disorders on glioblastoma progression. Brief. Bioinform. 2021, 22, bbaa365. [Google Scholar] [CrossRef]
Seifert, E. OriginPro 9.1: Scientific data analysis and graphing software-software review. J. Chem. Inf. Model. 2014, 54, 1552. [Google Scholar] [CrossRef]
Oliveros, J.C. Venny. An Interactive Tool for Comparing Lists with Venn’s Diagrams. (2007–2015). Available online: https://bioinfogp.cnb.csic.es/tools/venny/index.html (accessed on 27 November 2023).
Heberle, H.; Meirelles, V.G.; da Silva, F.R.; Telles, G.P.; Minghim, R. InteractiVenn: A web-based tool for the analysis of sets through Venn diagrams. BMC Bioinform. 2015, 16, 169. [Google Scholar] [CrossRef]
Davis, S.; Meltzer, P.S. GEOquery: A bridge between the Gene Expression Omnibus (GEO) and BioConductor. Bioinformatics 2007, 23, 1846–1847. [Google Scholar] [CrossRef]
Gautier, L.; Cope, L.; Bolstad, B.M.; Irizarry, R.A. Affy—Analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 2004, 20, 307–315. [Google Scholar] [CrossRef] [PubMed]
Ritchie, M.E.; Phipson, B.; Wu, D.; Hu, Y.; Law, C.W.; Shi, W.; Smyth, G.K. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015, 43, e47. [Google Scholar] [CrossRef]
Du, P.; Kibbe, W.A.; Lin, S.M. lumi: A pipeline for processing Illumina microarray. Bioinformatics 2008, 24, 1547–1548. [Google Scholar] [CrossRef]
Huang, D.W.; Sherman, B.T.; Tan, Q.; Collins, J.R.; Alvord, W.G.; Roayaei, J.; Stephens, R.; Baseler, M.W.; Lane, H.C.; Lempicki, R.A. The DAVID Gene Functional Classification Tool: A novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 2007, 8, R183. [Google Scholar] [CrossRef]
Dennis, G., Jr.; Sherman, B.T.; Hosack, D.A.; Yang, J.; Gao, W.; Lane, H.C.; Lempicki, R.A. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 2003, 4, P3. [Google Scholar] [CrossRef]
Kanehisa, M.; Furumichi, M.; Sato, Y.; Ishiguro-Watanabe, M.; Tanabe, M. KEGG: Integrating viruses and cellular organisms. Nucleic Acids Res. 2021, 49, D545–D551. [Google Scholar] [CrossRef] [PubMed]
Botstein, D.; Cherry, J.M.; Ashburner, M.; Ball, C.A.; Blake, J.A.; Butler, H.; Davis, A.P.; Dolinski, K.; Dwight, S.S.; Eppig, J.T.; et al. Gene Ontology: Tool for the unification of biology. Nat. Genet. 2000, 25, 25–29. [Google Scholar]
Szklarczyk, D.; Gable, A.L.; Lyon, D.; Junge, A.; Wyder, S.; Huerta-Cepas, J.; Simonovic, M.; Doncheva, N.T.; Morris, J.H.; Bork, P.; et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019, 47, D607–D613. [Google Scholar] [CrossRef]
Otasek, D.; Morris, J.H.; Bouças, J.; Pico, A.R.; Demchak, B. Cytoscape Automation: Empowering workflow-based network analysis. Genome Biol. 2019, 20, 185. [Google Scholar] [CrossRef]
Saito, R.; Smoot, M.E.; Ono, K.; Ruscheinski, J.; Wang, P.; Lotia, S.; Pico, A.R.; Bader, G.D.; Ideker, T. A travel guide to Cytoscape plugins. Nat. Methods 2012, 9, 1069–1076. [Google Scholar] [CrossRef]
Shannon, P.; Markiel, A.; Ozier, O.; Baliga, N.S.; Wang, J.T.; Ramage, D.; Amin, N.; Schwikowski, B.; Ideker, T. Cytoscape: A software Environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13, 2498–2504. [Google Scholar] [CrossRef] [PubMed]
Omranian, S.; Nikoloski, Z.; Grimm, D.G. Computational identification of protein complexes from network interactions: Present state, challenges, and the way forward. Comput. Struct. Biotechnol. J. 2022, 20, 2699–2712. [Google Scholar] [CrossRef] [PubMed]
Bandettini, W.P.; Kellman, P.; Mancini, C.; Booker, O.J.; Vasu, S.; Leung, S.W.; Wilson, J.R.; Shanbhag, S.M.; Chen, M.Y.; Arai, A.E. MultiContrast Delayed Enhancement (MCODE) improves detection of subendocardial myocardial infarction by late gadolinium enhancement cardiovascular magnetic resonance: A clinical validation study. J. Cardiovasc. Magn. Reson. 2012, 14, 83. [Google Scholar] [CrossRef]
Chin, C.; Chen, S.; Wu, H.; Ho, C.; Ko, M.; Lin, C. CytoHubba: Identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 2014, 8, S11. [Google Scholar] [CrossRef]
Franz, M.; Rodriguez, H.; Lopes, C.; Zuberi, K.; Montojo, J.; Bader, G.D.; Morris, Q. GeneMANIA update 2018. Nucleic Acids Res. 2018, 46, W60–W64. [Google Scholar] [CrossRef]
Montojo, J.; Zuberi, K.; Rodriguez, H.; Kazi, F.; Wright, G.; Donaldson, S.L.; Morris, Q.; Bader, G.D. GeneMANIA Cytoscape plugin: Fast gene function predictions on the desktop. Bioinformatics 2010, 26, 2927–2928. [Google Scholar] [CrossRef] [PubMed]
Warde-Farley, D.; Donaldson, S.L.; Comes, O.; Zuberi, K.; Badrawi, R.; Chao, P.; Franz, M.; Grouios, C.; Kazi, F.; Lopes, C.T.; et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010, 38, W214–W220. [Google Scholar] [CrossRef]
Cao, Z.; Zhang, S. An integrative and comparative study of pan-cancer transcriptomes reveals distinct cancer common and specific signatures. Sci. Rep. 2016, 6, 33398. [Google Scholar] [CrossRef]
Cerami, E.; Gao, J.; Dogrusoz, U.; Gross, B.E.; Sumer, S.O.; Aksoy, B.A.; Jacobsen, A.; Byrne, C.J.; Heuer, M.L.; Larsson, E.; et al. The cBio Cancer Genomics Portal: An open platform for exploring multidimensional cancer genomics data. Cancer Discov. 2012, 2, 401–404. [Google Scholar] [CrossRef]
Gu, Z.; Eils, R.; Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 2016, 32, 2847–2849. [Google Scholar] [CrossRef]
Goldman, M.J.; Craft, B.; Hastie, M.; Repečka, K.; McDade, F.; Kamath, A.; Banerjee, A.; Luo, Y.; Rogers, D.; Brooks, A.N.; et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat. Biotechnol. 2020, 38, 675–678. [Google Scholar] [CrossRef] [PubMed]
Li, T.; Fu, J.; Zeng, Z.; Cohen, D.; Li, J.; Chen, Q.; Li, B.; Liu, X.S. TIMER2.0 for analysis of tumor-infiltrating immune cells. Nucleic Acids Res. 2020, 48, W509–W514. [Google Scholar] [CrossRef] [PubMed]
Li, T.; Fan, J.; Wang, B.; Traugh, N.; Chen, Q.; Liu, J.S.; Li, B.; Liu, X.S. TIMER: A web server for comprehensive analysis of tumor-infiltrating immune cells. Cancer Res. 2017, 77, e108–e110. [Google Scholar] [CrossRef]
Li, B.; Severson, E.; Pignon, J.; Zhao, H.; Li, T.; Novak, J.; Jiang, P.; Shen, H.; Aster, J.C.; Rodig, S.; et al. Comprehensive analyses of tumor immunity: Implications for cancer immunotherapy. Genome Biol. 2016, 17, 174. [Google Scholar] [CrossRef]
Bindea, G.; Mlecnik, B.; Hackl, H.; Charoentong, P.; Tosolini, M.; Kirilovsky, A.; Fridman, W.; Pagès, F.; Trajanoski, Z.; Galon, J. ClueGO: A Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 2009, 25, 1091–1093. [Google Scholar] [CrossRef]
Bindea, G.; Galon, J.; Mlecnik, B. CluePedia Cytoscape plugin: Pathway insights using integrated experimental and in silico data. Bioinformatics 2013, 29, 661–663. [Google Scholar] [CrossRef] [PubMed]
Hahn, W.C. Abstract IA17: Defining cancer dependency maps. Cancer Res. 2020, 80, IA17. [Google Scholar] [CrossRef]
Dwane, L.; Behan, F.M.; Gonçalves, E.; Lightfoot, H.; Yang, W.; van der Meer, D.; Shepherd, R.; Pignatelli, M.; Iorio, F.; Garnett, M.J. Project Score database: A resource for investigating cancer cell dependencies and prioritizing therapeutic targets. Nucleic Acids Res. 2021, 49, D1365–D1372. [Google Scholar] [CrossRef]
Redman, M.; King, A.; Watson, C.; King, D. What is CRISPR/Cas9? Arch. Dis. Child. Educ. Pract. Ed. 2016, 101, 213–215. [Google Scholar] [CrossRef]
Agrawal, N.; Dasaradhi, P.V.N.; Mohmmed, A.; Malhotra, P.; Bhatnagar, R.K.; Mukherjee, S.K. RNA Interference: Biology, Mechanism, and Applications. Microbiol. Mol. Biol. Rev. 2003, 67, 657–685. [Google Scholar] [CrossRef]
Chen, E.Y.; Tan, C.M.; Kou, Y.; Duan, Q.; Wang, Z.; Meirelles, G.V.; Clark, N.R.; Ma’ayan, A. ENRICHR: Interactive and Collaborative HTML5 Gene List Enrichment Analysis Tool. BMC Bioinform. 2013, 14, 128. [Google Scholar] [CrossRef] [PubMed]
Kuleshov, M.V.; Jones, M.R.; Rouillard, A.D.; Fernandez, N.F.; Duan, Q.; Wang, Z.; Koplev, S.; Jenkins, S.L.; Jagodnik, K.M.; Lachmann, A.; et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016, 44, W90–W97. [Google Scholar] [CrossRef]
Xie, Z.; Bailey, A.; Kuleshov, M.V.; Clarke, D.J.; Evangelista, J.E.; Jenkins, S.L.; Lachmann, A.; Wojciechowicz, M.L.; Kropiwnicki, E.; Jagodnik, K.M.; et al. Gene Set Knowledge Discovery with ENRICHR. Curr. Protoc. 2021, 1, e90. [Google Scholar] [CrossRef]
Han, H.; Shim, H.; Shin, D.; Shim, J.E.; Ko, Y.; Shin, J.; Kim, H.; Cho, A.; Kim, E.; Lee, T.; et al. TRRUST: A reference database of human transcriptional regulatory interactions. Sci. Rep. 2015, 5, 11432. [Google Scholar] [CrossRef] [PubMed]
Han, H.; Cho, J.; Lee, S.; Yun, A.; Kim, H.; Bae, D.; Yang, S.; Kim, C.Y.; Lee, M.; Kim, E.; et al. TRRUST v2: An expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 2018, 46, D380–D386. [Google Scholar] [CrossRef] [PubMed]
Lee, I.; Blom, U.M.; Wang, P.I.; Shim, J.E.; Marcotte, E.M. Prioritizing candidate disease genes by network-based boosting of genome-wide association data. Genome Res. 2011, 21, 1109–1121. [Google Scholar] [CrossRef] [PubMed]
Razick, S.; Magklaras, G.; Donaldson, I.M. iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinform. 2008, 9, 405. [Google Scholar] [CrossRef]
Yang, H.; Robinson, P.N.; Wang, K. Phenolyzer: Phenotype-based prioritization of candidate genes for human diseases. Nat. Methods 2015, 12, 841–843. [Google Scholar] [CrossRef]
Surdo, P.L.; Iannuccelli, M.; Contino, S.; Castagnoli, L.; Licata, L.; Cesareni, G.; Perfetto, L. SIGNOR 3.0, the SIGnaling network open resource 3.0: 2022 update. Nucleic Acids Res. 2023, 51, D631–D637. [Google Scholar]
Perfetto, L.; Briganti, L.; Calderone, A.; Perpetuini, A.C.; Iannuccelli, M.; Langone, F.; Licata, L.; Marinkovic, M.; Mattioni, A.; Pavlidou, T.; et al. SIGNOR: A database of causal relationships between biological entities. Nucleic Acids Res. 2016, 44, D548–D554. [Google Scholar] [CrossRef]

Figure 1. Venn diagram analysis of 4 groups of DEGs in PCa cell lines across RNA-seq datasets represented in blue (A), yellow (B), green (C), and red (D) panels. Panel (A) displays 418 DEGs common to cell lines treated with R1881 treatment (AR agonist). Panel (B) shows 116 DEGs shared among cell lines with EDC exposures, which also included AR modulators. Panel (C) illustrates 2235 DEGs shared between Gleason scores and cells overexpressing MYC (TF) for functional analysis. Panel (D) represents 7472 DEGs overlapping between age and race. A total of 75 (0.8%) overlapping genes are among the RNA-seq datasets.

Figure 2. PPI network analysis of 75 DEGs using STRING. (A) features a network of 38 upregulated genes (nodes) with 366 edges, an average node degree of 19.3, and a local clustering coefficient of 0.78. (B) shows a network of 37 downregulated genes with 282 edges, an average node degree of 15.2, and a local clustering coefficient of 0.88. These networks are derived from RNA-seq datasets with PPI enrichment p-value < 1.0⁻¹⁶.

Figure 3. The figure illustrates the GeneMANIA-generated networks of hub genes (orange circles), their associated genes (blue circles), and the weight percentages of various interaction types (physical, co-expression, co-localization, genetic interaction, pathway, and shared protein domain). (A) shows the networks of 20 upregulated hub genes, and (B) shows 20 downregulated hub genes, each ranked and sized according to their positive real-valued link weight score.

Figure 4. Genetic alteration analysis of hub genes from CBioPortal. The figure summarizes genetic alterations in upregulated (A) and downregulated (B) hub genes based on data from CBioPortal, encompassing 26 studies with 10,998 samples. Genetic alteration: red is amplification, blue is deep deletion, light red is high mRNA, light blue is low mRNA, and gray is no alteration. Mutation spectrum: adenine (A), cytosine (C), guanine (G), and thymine (T).

Figure 5. Heatmap analysis of 20 hub genes from TCGA, ICGC, and GTEx. The heatmap visualizes the expression of 20 hub genes, with data sourced from TCGA, ICGC, and GTEx via UCSC-Xena. Color saturation levels correspond to maximum and minimum log2-transformed expression values (red for high expression, blue for low). Gleason scores range from 6 (black) to 10 (yellow), and ages range from 41 (dark gray) to 78 (light gray). Panel (A) represents the 10 upregulated and Panel (B) represents the 10 downregulated groups of hub genes.

Figure 6. Network construction of protein-encoding TF genes using TRRUST V.2 database. (A): Functional E2F4, MYC, and YBX1 gene networks in HumanNet, constructed through modified Bayesian integration. (B): Protein–protein interaction (PPI) networks involving E2F4, MYC, and YBX1, mapped using iRefIndex.

Table 1. List of top 10 TFs associated with the number of DEGs identified from RNA-seq datasets for PCa.

Category	#	TFs	# of Genes	p-Value	FDR ²
Upregulated UCSC_TFBS ¹	1	MYC	28	6.48 × 10⁻⁵	0.011
	2	NFY	21	1.03 × 10⁻³	0.086
	3	STAT	33	5.52 × 10⁻³	0.222
	4	CEBP	31	7.13 × 10⁻³	0.222
	5	YY1	34	9.82 × 10⁻³	0.243
	6	EVI1	17	1.16 × 10⁻²	0.266
	7	HSF1	22	1.13 × 10⁻²	0.266
	8	CDPCR3HD	26	1.40 × 10⁻²	0.266
	9	STAT5A	26	1.68 × 10⁻²	0.266
	10	PBX1	19	1.58 × 10⁻²	0.269
Downregulated UCSC_TFBS	1	MYC	22	1.60 × 10⁻²	1.00
	2	NFY	19	2.70 × 10⁻²	1.00
	3	NFAT	23	2.90 × 10⁻²	1.00
	4	TATA	20	3.10 × 10⁻²	1.00
	5	IRF7	28	6.20 × 10⁻²	1.00
	6	MEF2	18	6.60 × 10⁻²	1.00
	7	FREAC3	18	9.10 × 10⁻²	1.00
	8	ISRE	23	9.80 × 10⁻²	1.00
	9	CDP	19	6.80 × 10⁻²	1.00
	10	NF1	18	9.90 × 10⁻²	1.00

¹ UCSC_TFBS: University of California, Santa Cruz—transcription factor binding sites. ² FDR: False Discovery Rate.

Table 2. Ten hub genes were selected based on CytoHubba topological measures for upregulated genes by associating the five most precise and predicted topological measures in the PPI network.

MCC	DMNC	MNC	Degree	EPC	MCC ꓵ DMNC ꓵ MNC ꓵ Degree ꓵ EPC
NCAPH	NCAPH	NCAPH	NCAPH	NCAPH	NCAPH
RAD54L	RAD54L	RAD54L	RAD54L	RAD54L	RAD54L
E2F7	E2F7	E2F7	E2F7	E2F7	E2F7
MKI67	MKI67	MKI67	MKI67	MKI67	MKI67
NDC80	NDC80	NDC80	NDC80	NDC80	NDC80
MELK	MELK	MELK	MELK	MELK	MELK
ESPL1	ESPL1	ESPL1	ESPL1	ESPL1	ESPL1
CCNA2	CCNA2	CCNA2	CCNA2	CCNA2	CCNA2
CKS2	GTSE1	ASPM	CKS2	GTSE1
ORC1	CHFR	FZR1	ORC1	CHFR
PLK1	DTL	DTL	ANAPC11	DTL
MCM10	DTL	MCM10	PLK1	MCM10
HELLS	HELLS	ANAPC11	HELLS	HELLS
NCAPG	CCNB1	NCAPG	NCAPG	NCAPG	NCAPG
CCNB1	NCAPG	CCNB1	CCNB1	CCNB1	CCNB1
PLK4	ANAPC11	PLK4	PLK4	FZR1
HAAO	EXO1	EXO1	FZR1	SMC4
HAAO	SPC25	FOXP4	FZR1	EXO1
MCM10	DTL	FOXP4	PLK1	MCM10
CKS2	GTSE1	ASPM	CKS2	GTSE1

Table 3. Ten hub genes were selected based on CytoHubba topological measures for downregulated genes by associating the five most precise and predicted topological measures with the PPI network.

MCC	DMNC	MNC	Degree	EPC	MCC ꓵ DMNC ꓵ MNC ꓵ Degree ꓵ EPC
TOP2A	TOP2A	TOP2A	TOP2A	TOP2A	TOP2A
CCNB1	CDC20	ASPM	GTSE1	ASPM
CDC20	BUB1B	CDC20	CDC20	CDC20	CDC20
BUB1B	HMMR	BUB1B	BUB1B	BUB1B	BUB1B
HMMR	TTK	HMMR	HMMR	GTSE1
RRM2	TTK	RRM2	RRM2	RRM2
CDK1	BUB1	GTSE1	CDK1	CDK1
TTK	CENPF	TTK	TTK	GTSE1
BUB1	CDKN3	BUB1	BUB1	BUB1	BUB1
CENPF	PKMYT1	CENPF	CENPF	CENPF	CENPE
CDKN3	CCNB2	CDKN3	CDKN3	CDKN3
CCNB2	KIF15	CCNB2	CCNB2	CCNB2	CCNB2
KIF15	AURKA	KIF15	KIF15	KIF15	KIF15
AURKA	KIF11	AURKA	AURKA	AURKA	AURKA
KIF11	DLGAP5	KIF11	KIF11	KIF11
DLGAP5	ZWINT	DLGAP5	DLGAP5	DLGAP5	DLGAP5
ZWINT	CENPE	ZWINT	GTSE1	ZWINT
CENPE	SPC25	CENPE	CENPE	CENPE
UBE2C	UBE2C	UBE2C	UBE2C	UBE2C	UBE2C
ASPM	GTSE1	ASPM	ASPM	GTSE1

Table 4. Overview of the specified RNA-seq dataset linked with PCa included in the study.

Group					Cell Lines
Group	#	GEO Profile	Platform	Annotation Platform	Total	PCa	Control	References
Cell lines treated with R1881 treatment	1	GSE70466	GPL16791	Illumina HiSeq 2500	6	3	3	[63]
	2	GSE151290	GPL16791	Illumina HiSeq 2500	16	8	8	[64]
	3	GSE128749	GPL11154	Illumina HiSeq 2000	11	5	6	[65]
	4	GSE120660	GPL16791	Illumina HiSeq 2500	21	12	9	[66]
	5	GSE135879	GPL16791	Illumina HiSeq 2500	12	6	6	[67]
	6	GSE136272	GPL16791	Illumina HiSeq 2500	12	6	6	[68]
Cell lines with EDC exposure	7	GSE218556	GPL24676	Illumina NovaSeq 6000	6	3	3	[69]
	8	GSE64529	GPL11154	Illumina HiSeq 2000	6	3	3	[70]
	9	GSE5590	GPL2986	Illumina HiSeq 2500	6	3	3	[71]
	10	GSE128339	GPL8842	Illumina NovaSeq 6000	34	24	10	[72]
	11	GSE109021	GPL10558	Illumina HiSeq 2000	18	15	3	[73]
Cell lines with Gleason scores and Cells with MYC overexpression	12	GSE200879	GPL32170	Illumina HiSeq 2500	124	115	9	[74]
	13	GSE103512	GPL13158	Illumina NovaSeq 6000	57	50	7	[49]
Cell lines between Age and Race	14	GSE104131	GPL16791	Illumina HiSeq 2500	29	16	13	[75]
Cell lines between Age and Race	15	GSE200167	GPL24676	Illumina NovaSeq 6000	6	3	3	[76]
Total Samples					364	272	92

GEO profile description: GSE70466: normal prostate cells vs. androgen-sensitive LNCaP. GSE151290: LNCaP controls vs. LNCaP-R1881. GSE128749: LNCaP-R1881 vs. LAPC4-R1881. GSE120660: PC3-R1881 vs. VCaP-R1881, PC3-R1881 vs. LNCaP-R1881, and LNCaP-R1881 vs. VCaP-R1881. GSE135879: PC3 controls vs. PC3-R1881, LAPC4 controls vs. LAPC4-R1881, VCaP controls vs. VCaP-R1881, and LNCaP controls vs. LNCaP-R1881. GSE136272: VCaP controls vs. VCaP-R1881 and LNCaP controls vs. LNCaP-R1881. GSE218556: controls vs. cells treated with DHT. GSE64529: Control LNCaP cells vs. DHT-treated LNCaP cells. GSE5590: Normal human prostate + zinc and PC3 + zinc. GSE128339: Normal prostate cells vs. control, PC3 adenocarcinoma, and PC3 adenocarcinoma + Genistein. GSE109021: Identification of AR with exposure conditions (Bisphenol A, DHT, Methyl testosterone, Estrogen, Testosterone propionate, DDE, and triclosan) for LAPC-4 cells (examined in antagonist mode) treated with R1881. GSE200879: Control cells vs. PCa Gleason scores. GSE103512: Control cells differing in age of PCa cases. GSE104131: Control cells differing in race/ethnicity of PCa (African American men (AAM) and European American men (EAM)). GSE200167: MYC (TF) in the chromatin interactions of PCa (22Rv1 controls vs. 22Rv1 MYC).

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Alwadi, D.; Felty, Q.; Doke, M.; Roy, D.; Yoo, C.; Deoraj, A. RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer. Int. J. Mol. Sci. 2025, 26, 5463. https://doi.org/10.3390/ijms26125463

AMA Style

Alwadi D, Felty Q, Doke M, Roy D, Yoo C, Deoraj A. RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer. International Journal of Molecular Sciences. 2025; 26(12):5463. https://doi.org/10.3390/ijms26125463

Chicago/Turabian Style

Alwadi, Diaaidden, Quentin Felty, Mayur Doke, Deodutta Roy, Changwon Yoo, and Alok Deoraj. 2025. "RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer" International Journal of Molecular Sciences 26, no. 12: 5463. https://doi.org/10.3390/ijms26125463

APA Style

Alwadi, D., Felty, Q., Doke, M., Roy, D., Yoo, C., & Deoraj, A. (2025). RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer. International Journal of Molecular Sciences, 26(12), 5463. https://doi.org/10.3390/ijms26125463

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

RNA-Seq Uncovers Association of Endocrine-Disrupting Chemicals with Hub Genes and Transcription Factors in Aggressive Prostate Cancer

Abstract

1. Introduction

2. Results

2.1. Discovery of Differentially Expressed Genes (DEGs) in RNA-Seq for PCa

2.2. RNA-Seq DEG Enrichment Analysis: GO, KEGG, and TFs

2.3. Protein–Protein Interaction (PPI) Networks and Module Selection Analysis

2.4. Hub Gene Identification

2.5. Genetic Alteration and Expression of Hub Genes

2.6. Hub Gene and Immune Cell Infiltration Estimations

2.7. GlueGO and GluePedia

2.8. Hub Gene Redundancy and Dependency Analysis

2.9. Enrichr and Hub Gene Enrichment Analysis

2.10. Regulatory TFs and Their Associated Network with Hub Genes

2.11. Comparison of Phenotypes for Hub Genes and TFs

3. Discussion

4. Materials and Methods

4.1. Omics Data Acquisition

4.2. Data Processing

4.3. Statistical Analysis

4.4. Identification of DEGs and Pathway Enrichment Analysis

4.5. Biological System Analysis and Module Network Mining

4.6. Genetic Alteration in Hub Genes

4.7. Visualizing the Heatmap and Hub Gene Expressions

4.8. Hub Genes’ Association with Immune Cell Infiltration

4.9. ClueGO and CluePedia: Functional Enrichment Analysis

4.10. Redundancy Analyses of Hub Genes

4.11. Hub Gene Enrichment Analysis—Enrichr

4.12. TF Association Network with Hub Genes

4.13. Causal Association and Protein Interactions and Phenotypes with DEGs

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI