Identification of Key Pathways and Genes in the Dynamic Progression of HCC Based on WGCNA

Hepatocellular carcinoma (HCC) is a devastating disease worldwide. Though many efforts have been made to elucidate the process of HCC, its molecular mechanisms of development remain elusive due to its complexity. To explore the stepwise carcinogenic process from pre-neoplastic lesions to the end stage of HCC, we employed weighted gene co-expression network analysis (WGCNA) which has been proved to be an effective method in many diseases to detect co-expressed modules and hub genes using eight pathological stages including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early and early, advanced HCC and very advanced HCC. Among the eight consecutive pathological stages, five representative modules are selected to perform canonical pathway enrichment and upstream regulator analysis by using ingenuity pathway analysis (IPA) software. We found that cell cycle related biological processes were activated at four neoplastic stages, and the degree of activation of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules enriched in energy metabolism, especially oxidative metabolism, and the expression value of the genes decreased only at four neoplastic stages. The brown module, enriched in protein ubiquitination and ephrin receptor signaling pathways, correlated mainly with the very early stage of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, correlated with the cirrhotic stage only. The high degree hub genes were identified based on the protein-protein interaction (PPI) network and were verified by Kaplan-Meier survival analysis. The novel five high degree hub genes signature that was identified in our study may shed light on future prognostic and therapeutic approaches. Our study brings a new perspective to the understanding of the key pathways and genes in the dynamic changes of HCC progression. These findings shed light on further investigations.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common cancers worldwide and the second leading cause of global cancer-related death accounting for around 11% of all cancer deaths [1]. Chronic viral hepatitis, metabolic disease, autoimmune hepatitis are major risk factors for HCC development. Hepatitis C viruses (HCV) can cause acute and chronic infections which can lead to liver cirrhosis and hepatocellular carcinoma. Cirrhosis is the most important risk factor for developing HCC, and the majority of viral-associated HCC cases develop from liver cirrhosis, so the clarity of hepatitis viral infections in HCC development is necessary for the treatment of HCC [2]. It is estimated that there are 140 million infections with hepatitis C worldwide and the most-affected regions are Central and East Asia and North Africa. HCV-associated carcinogenesis can lead to increased hepatocyte proliferation and steatosis, oxidative stress, mitochondrial damage and induction of reactive oxygen species (ROS) [3]. Patients who have progressed to liver cirrhosis and poor liver function commonly develop HCC [4].
At present, the potentially curative options for HCC patients are radiofrequency ablation, liver transplantation and tumor resection. However, many factors affect the suitability, including tumor stage, deficiency of donors, graft rejection and opportunistic infections due to immunosuppression, etc. [5]. The liver cancer stage is one of the most important factors in choosing treatment options and predicting a patient's outlook. For early stage HCC without cirrhosis, liver resection or liver transplantation is available, but recurrence is frequently occurred. At the end stage of HCC, patients often have less than three months survival. It is necessary to understand the mechanism of progression from cirrhosis to HCC.
Weighted gene co-expression network analysis (WGCNA) has been proved to be an effective method to detect co-expressed modules and hub genes, microRNAs and lncRNAs (long non-coding RNAs) in many aspects [6][7][8][9][10][11][12][13][14][15]. WGCNA can group genes into a model or network based on pairwise correlations between genes due to their similar expression profile, and these models can correlate to different stages of HCC. First, the absolute value of the correlation of paired genes was used to define the gene co-expression network. Next, an adjacency matrix is used to define the strength with wich genes are connected to each other. A soft thresholding parameter was employed to construct a weighted network. WGCNA uses the topological overlap measure (TOM) as a proximity measure to cluster genes into network modules that combine the adjacency of two genes and the connection strengths with which these two genes interact with other neighbor genes. The genes inside a module can be summarized with the module eigengene, which is defined as the first principal component of the expression profiles. To find the modules related to clinical traits of interest, the correlation is calculated between module eigengenes and all clinical traits. The correlation between genes and module eigengenes was used to identify intramodular hub genes [16].
In order to explore the dynamic changes during the development of HCC, we analyzed 75 tissue samples representing the stepwise carcinogenic process from pre-neoplastic lesions to HCC including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early HCC, early HCC, advanced HCC and very advanced HCC stages by using WGCNA (Table S1). To the best of our knowledge, this is the first report describing the dynamics of gene expression changes during the development of HCC, though there are some studies about each stage of HCC based on the dataset [17,18].
In this study, we constructed a gene co-expression network based on WGCNA and identified 25 modules during the progression of HCC. The modules were correlated with the progression of HCC. We also performed canonical pathway analysis by Ingenuity pathway analysis (IPA) on five modules closely related to different HCC stages. The turquoise module enriched in the cell cycle which was activated only at four neoplastic stages, and the degree of the activity of the cell cycle corresponded to the deterioration degree of HCC. The orange and yellow modules enriched in energy metabolism, especially oxidative metabolism, and decreased only at four neoplastic stages. The brown module enriched in protein ubiquitination and ephrin receptor-signaling pathways which decrease at very early stages of HCC only. The darkred module enriched in hepatic fibrosis/hepatic stellate cell activation, which isactivated at the cirrhotic stage only. Then, we performed upstream regulator analysis and detected some regulators such as PPARA (Peroxisome Proliferator Activated Receptor Alpha ) and PLG (Plasminogen). We also identified the high degree genes in every module using the cytohubba plugin based on cytoscape [19].

Data Processing
The gene expression dataset GSE6764 provided by Wurmbach, E. et al. [17] (https://www.ncbi. nlm.nih.gov/geo/query/acc.cgi?acc=GSE6764) was downloaded from the Gene Expression Omnibus (GEO) database [20]. In total, 75 tissue samples were divided into eight consecutive pathological stages from pre-neoplastic lesions to HCC including normal, cirrhosis without HCC, cirrhosis, low-grade dysplastic, high-grade dysplastic, very early HCC, early HCC, advanced HCC and very advanced HCC. All tissue samples are hybridized on the human U133 plus 2.0 array (Affymetrix, Santa Clara, CA, USA). The Robust Multi-array Average (RMA) algorithm was performed to process the raw files. In order to filter the features exhibiting little variation across the above samples, the nsFilter algorithm was used to filter the data for the subsequent WGCNA [16,21].

Construction of Weighted Gene Co-Expression Networks and Identification of Modules Associated with Different Stages of Hepatocellular Carcinoma
From thousands of genes, the interesting gene modules can be identified by WGCNA, and then, the intramodular connectivity and gene significance based on the correlation of a gene expression profile with a sample trait were used to identify key genes in HCC for further validation. WGCNA is a freely accessible R package for the construction of weighted gene co-expression networks [22]. The above filtered data were used to construct the network. Three different ways can be selected to construct the network and identify modules according to different needs. In our study, the one-step function was used for network construction and detection of consensus modules.

Interaction Analysis of Co-Expression Modules
To further evaluate the co-expression similarity of all the modules, the eigengenes adjacency based on their correlation was calculated. The interaction relationship among different co-expression modules was performed by the flashClust function [23]. A heat map was used for visualization of the correlations of each module.

Functional Enrichment Analysis of Genes in Every Module
Hub gene is a loosely defined term which is an abbreviation of "highly connected gene". The genes inside co-expression modules have high connectivity and the genes within the same module may play similar roles. We filtered the hub genes in each module according to the intra-modular connectivity and correlation with module eigengenes. To identify the biological function of the significant modules and the relationship between the modules and different stages, we extracted the top ranked genes with the strongest connections within each module to perform canonical pathways analysis in selected modules using of IPA.

The Ingenuity Pathway Analysis Upstream Regulator Analysis
The co-expressed genes participating in the same biological process or disease may be regulated by the same or similar regulators especially transcription factors (TF). In order to explain the biological activities of each module, we identified the upstream transcriptional regulators in each module with a p value of overlap <0.01.

Protein-Protein Interaction Network Construction and Analysis for Selected Modules
The top ranked genes in every module are thought to be hub genes. In order to identify the high degree genes which play a critical role in the protein-protein network (PPI), the Cytohubba plugin based on Cytoscape was used to perform the network analysis [19], and the high degree genes were identified.

Kaplan-Meier Survival Analysis
Publicly available data and tools were employed to perform the survival analysis using the The Cancer Genome Atlas (TCGA)-liver cancer data which contained 361 samples with the high degree hub genes as input.
For the duplicated genes, all probe sets/records will be averaged per sample using quantile-normalized data. The maximum risk groups were selected for the survival analysis. All the details are described in the tutorial provided on the SurvExpress website [24].

Data Processing
A total of 75 tissue sample raw files (.CEL format) were downloaded from the NCBI (National Center for Biotechnology Information). The raw files were converted to expression data using the RMA algorithm based on R language including background correction, normalization and summarization. There were a total of 16,383 probes for further WGCNA analysis after nsFilter processing.

Construction of Weighted Gene Co-Expression Network Identification of Modules Associated with Different Stages of HCC
The network was constructed from the filtered probes and twenty-five modules were identified. We have chosen the soft threshold power 8 to define the adjacency matrix based on the criterion of approximate scale-free topology ( Figure 1), with minimum module size 30, the module detection sensitivity deepSplit 2, and cut height for merging of modules 0.2 which means that the modules whose eigengenes are correlated above 0.8 will be merged ( Figure 2A).

Kaplan-Meier Survival Analysis
Publicly available data and tools were employed to perform the survival analysis using the The Cancer Genome Atlas (TCGA)-liver cancer data which contained 361 samples with the high degree hub genes as input.
For the duplicated genes, all probe sets/records will be averaged per sample using quantilenormalized data. The maximum risk groups were selected for the survival analysis. All the details are described in the tutorial provided on the SurvExpress website [24].

Data Processing
A total of 75 tissue sample raw files (.CEL format) were downloaded from the NCBI (National Center for Biotechnology Information). The raw files were converted to expression data using the RMA algorithm based on R language including background correction, normalization and summarization. There were a total of 16,383 probes for further WGCNA analysis after nsFilter processing.

Construction of Weighted Gene Co-Expression Network Identification of Modules Associated with Different Stages of HCC
The network was constructed from the filtered probes and twenty-five modules were identified. We have chosen the soft threshold power 8 to define the adjacency matrix based on the criterion of approximate scale-free topology ( Figure 1), with minimum module size 30, the module detection sensitivity deepSplit 2, and cut height for merging of modules 0.2 which means that the modules whose eigengenes are correlated above 0.8 will be merged ( Figure 2A).   Gene dendrogram obtained by clustering the dissimilarity based on consensus Topological Overlap with the corresponding module colors indicated by the color row. Each colored row represents a color-coded module which contains a group of highly connected genes. A total of 25 modules were identified. (B) Dendrogram of consensus module eigengenes obtained by WGCNA on the consensus correlation. The red line is the merging threshold, and groups of eigengenes below the threshold represent modules whose expressions profiles should be merged due to their similarity. (C) Heatmap plot of the adjacencies of modules. Red represents high adjacency (positive correlation) and blue represents low adjacency (negative correlation).

Correlation between each Module
We can find that some of the modules had similar expression profiles. In order to figure out the interactions among these 25 co-expressed modules, the connectivity of eigengenes was analyzed. As shown in Figure 2B,C, we performed a cluster analysis. In general, 25 clusters were grouped into two (A) Gene dendrogram obtained by clustering the dissimilarity based on consensus Topological Overlap with the corresponding module colors indicated by the color row. Each colored row represents a color-coded module which contains a group of highly connected genes. A total of 25 modules were identified. (B) Dendrogram of consensus module eigengenes obtained by WGCNA on the consensus correlation. The red line is the merging threshold, and groups of eigengenes below the threshold represent modules whose expressions profiles should be merged due to their similarity. (C) Heatmap plot of the adjacencies of modules. Red represents high adjacency (positive correlation) and blue represents low adjacency (negative correlation).

Correlation between each Module
We can find that some of the modules had similar expression profiles. In order to figure out the interactions among these 25 co-expressed modules, the connectivity of eigengenes was analyzed. As shown in Figure 2B,C, we performed a cluster analysis. In general, 25 clusters were grouped into two clusters, and each cluster contains three branches. Combined with Figure 3, there was a significant difference among the 25 modules. However, no pair of modules below the threshold (0.2) was merged. There are multiple modules related to one or more tumor stages. For instance, the turquoise, light green, green yellow, blue, salmon, pink and purple modules were related to four tumor stages, especially the turquoise module, which is strongly related to the development of HCC; the red and brown modules were negatively related to the very early stage; the tan modules were specifically related to the very early stage; the magenta, darkred, cyan, black and green modules were related to the pre-tumor stage; the midnight blue, orange, yellow and light cyan modules were negatively related to four HCC stages. clusters, and each cluster contains three branches. Combined with Figure 3, there was a significant difference among the 25 modules. However, no pair of modules below the threshold (0.2) was merged. There are multiple modules related to one or more tumor stages. For instance, the turquoise, light green, green yellow, blue, salmon, pink and purple modules were related to four tumor stages, especially the turquoise module, which is strongly related to the development of HCC; the red and brown modules were negatively related to the very early stage; the tan modules were specifically related to the very early stage; the magenta, darkred, cyan, black and green modules were related to the pre-tumor stage; the midnight blue, orange, yellow and light cyan modules were negatively related to four HCC stages.

Functional Enrichment Analysis
Five modules intimately related to different HCC stages were selected for the canonical pathway analysis by IPA, and the hub genes in the five modules are listed in the supplementary table. As shown in Figure 4, the enriched pathways including cell cycle, mitosis, DNA damage-induced 14-3-3σ signaling, G2/M DNA damage checkpoint regulating and GADD45 signaling, mainly in the turquoise module, indicated that the cell cycle-related pathways changed initially at very early HCC and changed more significantly as the disease worsened, which is in agreement with previous studies [25][26][27]. The biological activity enriched in the brown module reduced suddenly at the very early stage of HCC, which included the protein ubiquitination pathway, etc. The pathways in the darkred module were activated in the cirrhotic tissue samples and decreased in four stages of HCC which mainly include hepatic fibrosis/hepatic stellate cell activation. The orange and yellow modules were

Functional Enrichment Analysis
Five modules intimately related to different HCC stages were selected for the canonical pathway analysis by IPA, and the hub genes in the five modules are listed in the supplementary table. As shown in Figure 4, the enriched pathways including cell cycle, mitosis, DNA damage-induced 14-3-3σ signaling, G2/M DNA damage checkpoint regulating and GADD45 signaling, mainly in the turquoise module, indicated that the cell cycle-related pathways changed initially at very early HCC and changed more significantly as the disease worsened, which is in agreement with previous studies [25][26][27]. The biological activity enriched in the brown module reduced suddenly at the very early stage of HCC, which included the protein ubiquitination pathway, etc. The pathways in the darkred module were activated in the cirrhotic tissue samples and decreased in four stages of HCC which mainly include hepatic fibrosis/hepatic stellate cell activation. The orange and yellow modules were all inactivated at four stages of HCC, as shown in Figure 4; the enrichment pathways participated mainly in metabolism. The energy metabolism changed significantly. The process related to oxidative metabolism decreased in HCC progression which indicated the alternation of the supply method of energy. all inactivated at four stages of HCC, as shown in Figure 4; the enrichment pathways participated mainly in metabolism. The energy metabolism changed significantly. The process related to oxidative metabolism decreased in HCC progression which indicated the alternation of the supply method of energy.

The Ingenuity Pathway Analysis Upstream Regulator Analysis
Upstream regulator analysis was performed for the genes from selected modules using IPA. As shown in Table 1, several kinds of upstream regulators were predicted including transcription regulators, transporters, microRNA, growth factors and enzymes, etc. The ligand-dependent nuclear receptor PPARA was identified in turquoise and yellow modules at the same time with different target molecules. PLG and GPD1 are both hub genes in the orange and yellow module respectively.

The Ingenuity Pathway Analysis Upstream Regulator Analysis
Upstream regulator analysis was performed for the genes from selected modules using IPA. As shown in Table 1, several kinds of upstream regulators were predicted including transcription regulators, transporters, microRNA, growth factors and enzymes, etc. The ligand-dependent nuclear receptor PPARA was identified in turquoise and yellow modules at the same time with different target molecules. PLG and GPD1 are both hub genes in the orange and yellow module respectively.

PPI Network Construction and Analysis of Selected Modules
The co-expression networks of top ranked genes for four selected modules, including the turquoise, brown, orange and yellow module were constructed as shown in Figure 5. The high degree genes calculated by the cytohubba plugin are shown in a "v" shape. GINS1, NEK2, BUB1B, KIF11 and TOP2A were identified in the turquoise module, which was enriched in cell cycle-related processes. MUT, AZGP1, HBB, HBA1, HBA2, HBD, SUCLA2, ACADM and UQCRC2 were identified in the yellow and orange modules, which were enriched in oxidative metabolism. KBP1A, ARPC4, HSP90AB1 and ENO1 were high degree genes involved in the early stage of HCC. , brown (C) and orange (D) modules. The density of the ellipse corresponds to module membership (kME) values. The network was constructed using Cytoscape 3.4 software. The genes with a v-shape represent the high-degree genes from cytohubba.

Kaplan-Meier Survival Analysis
Kaplan-Meier Survival analysis was performed to determine the relationship between the expression of high degree hub genes and the survival time of HCC patients.
Three cell-cycle related genes (GINS, BUB1B and TOP2A), one oxidative metabolism-related gene (ACADM) and one early stage-related gene (ARPC4) were significantly correlated with high risk, poor prognosis and shorter overall survival, respectively, as shown in Figure 6. The result showed that the identified genes were able to distinguish the high risk from low risk patients effectively. The network was constructed using Cytoscape 3.4 software. The genes with a v-shape represent the high-degree genes from cytohubba.

Kaplan-Meier Survival Analysis
Kaplan-Meier Survival analysis was performed to determine the relationship between the expression of high degree hub genes and the survival time of HCC patients.
Three cell-cycle related genes (GINS, BUB1B and TOP2A), one oxidative metabolism-related gene (ACADM) and one early stage-related gene (ARPC4) were significantly correlated with high risk, poor prognosis and shorter overall survival, respectively, as shown in Figure 6. The result showed that the identified genes were able to distinguish the high risk from low risk patients effectively.

Discussion
As a global health problem, HCV can cause infections which lead to liver cirrhosis and HCC. The multivariate analysis of risk factors for HCC has been extensively documented [2,3,28]. The main objectives of the study were to gain molecular insights into the progression of HCC and to identify and predict the candidate gene groups associated with the stepwise carcinogenic process from preneoplastic lesions to HCC by constructing a gene co-expression network using WGCNA. The significantly changed modules that correlated with different stages of HCC were identified, and the genes in the same module were extracted to perform pathway enrichment analysis. Then, the upstream regulators and hub genes were identified by using integrated bioinformatics methods including cytoscape and IPA.
Modules changed significantly at four neoplastic stages included the turquoise, orange and yellow modules. The cell cycle-associated turquoise module changed significantly in all stages of carcinogenesis, and the change was exacerbated as the disease worsened. All above results suggest

Discussion
As a global health problem, HCV can cause infections which lead to liver cirrhosis and HCC. The multivariate analysis of risk factors for HCC has been extensively documented [2,3,28]. The main objectives of the study were to gain molecular insights into the progression of HCC and to identify and predict the candidate gene groups associated with the stepwise carcinogenic process from pre-neoplastic lesions to HCC by constructing a gene co-expression network using WGCNA. The significantly changed modules that correlated with different stages of HCC were identified, and the genes in the same module were extracted to perform pathway enrichment analysis. Then, the upstream regulators and hub genes were identified by using integrated bioinformatics methods including cytoscape and IPA.
Modules changed significantly at four neoplastic stages included the turquoise, orange and yellow modules. The cell cycle-associated turquoise module changed significantly in all stages of carcinogenesis, and the change was exacerbated as the disease worsened. All above results suggest that gene expression associated with cell cycle changed at the very early stage of HCC. It is well known that the cell cycle, the process of cell progression and division lies at the heart of cancer. Cells use a complex set of kinases to control the complex steps in the cell cycle. Once the regulatory process malfunctions, uncontrolled cell proliferation occurs. Checkpoints are important quality control measures, which ensure proper cell cycle events. Therefore, regulation of the cell cycle could be used to cure cancer especially the cell cycle checkpoint [29][30][31][32][33]. It can be seen that from Figure 3, the activation of the genes associated with the cell cycle occurred at the very early stage of HCC.
All this indicated that the abnormal cell cycle-associated pathways in the transcriptome play a critical role in the initiation and development of HCC. The orange and yellow modules, enriched in energy metabolism, especially oxidative metabolism, decreased only in four neoplastic stages. In addition, the biological process decreased significantly in the very advanced stage. All this suggested the switch of an altered energy source, which is consistent with recent studies [34][35][36][37][38][39][40]. Otto Warburg thought that the core metabolic signature of cancer cells is a high glycolytic flux, and the prime cause of cancer is the replacement of respiration of oxygen by glycolysis due to defective mitochondrial respiration [40]. As the energy and redox currencies of the cell, ATP and NADH are key factors for tumor survival, growth, and expansion, which makes them the core for therapeutic exploitation. Our findings are consistent with this opinion. We found a decrease in the tricarboxylic acid (TCA) cycle, fatty acid ß-oxidation and oxidative phosphorylation in all stages of HCC, which indicated a decrease in oxygen consumption. What is the primary origin of cancer; genetic mutations or energetic imbalances? We are not sure.
According to the PPI network analysis from the selected models, some high-degree hub genes were identified which played critical roles in the network. For the turquoise module, GINS1, TOP2A, KIF11, BUB1B and NEK2 were identified asthe high degree genes. For yellow and orange modules, MUT, AZGP1, HBA1, HBB, HBD, HBA2, ACADM, UQCRC2 and SUCLA2 were high degree genes.
GINS1 (also known as PSF1) is a subunit of the GINS (Go, lchi, Nii, San) complex which drives the unwinding of DNA in front of the replication fork [41]. Many studies have shown that GINS1 is up-regulated in several cancer types including lung, colon, prostate, colorectal and breast cancers [41][42][43][44][45][46][47]. It is reported that GINS1 is expressed widely in early embryogenesis in mice, stem and progenitor cells, etc. [42,48,49]. All the above showed a close relationship between GINS1 and the cell cycle. From previous results, high GINS1 expression correlated with a more aggressive phenotype as well as worse prognosis in HCC patients [50]. Through the analysis of HCC tissue, TOP2A expressions were correlated with advanced histological grading, microvascular invasion and an early age onset of the malignancy [51,52] which indicated the prognostic value of TOP2A in HCC. As for KIF11 and BUB1B, there are relatively few reports on their function in HCC, but all of them are related to the cell cycle [53][54][55][56][57][58][59][60][61]. Amazingly, HBA1, HBB, HBD and HBA2 all appeared in the yellow and orange modules, which are related to energy metabolism. It is well known that the tumor cells have a high glycolytic rate compared to normal cells even if oxygen is sufficient [62]. It is assumed that impaired mitochondrial respiration may play a vital role in the process [63].There are some reports on the relationship between hemoglobin and HCC, but the mechanism is not well understood [64,65]. Are there some correlations between the deregulation of hemoglobin and altered energy metabolism? This is worth further study. Increasing evidence has shown that the decrease of zinc-alpha2-glycoprotein (AZGP1) is associated with poor prognosis and more aggressive tumors in HCC [66,67]. And according to a recent study, AZGP1 plays a role in regulating the PTEN/Akt and CD44 pathways [61]. As important enzymes in the electron transport chain (ETC), the roles of UQCRC2 and SUCLA2 are not known yet.
The brown module was enriched in protein ubiquitination and ephrin receptor signaling pathways which decrease at the very early stage of HCC only. Eph receptors belong to receptor tyrosine kinase families and participate in many cancers. A previous study has shown that EphrinA2 suppressed apoptosis, rather than accelerated proliferation and facilitated cancer cell survival [68]. In the present study, ephrin signaling deregulated at the very early stage, and upregulated increasingly with the progression of HCC. According to the network analysis of the brown module, ARPC4, HSP90AB1, ENO1 were identified as high degree hub genes. As a subunit of the actin-related protein 2/3 complex (ARP2/3), ARPC4 may contribute to the development of HCC [69]. It can be inferred that some biological processes occurred at all stages of HCC and some at a certain stage, which shows the complexity of the development of HCC. The darkred module, enriched in hepatic fibrosis/hepatic stellate cell activation, that was correlated with the cirrhotic stage only, will be further studied in the future.
The survival analysis showed that the five novel high-degree hub gene signatures identified in our selected modules which changed significantly may shed light on future prognostic and therapeutic approaches.
In conclusion, we have presented a novel approach using WGCNA to explore the dynamic changes during the stepwise carcinogenic process from pre-neoplastic lesions to HCC including eight stages. According to the network constructed by WGCNA, 25 modules were identified, and five modules were selected to be analyzed in detail. We found that the turquoise module, enriched in the cell cycle-related process, was activated at all stages of HCC, and the yellow and orange module, enriched in aerobic metabolism, was inactivated in four stages of HCC. Some hub genes with a high degree were identified. All of the above may shed light on the understanding of pathways and genes underlying HCV-associated disease and the application of prognostic and predictive markers for HCC, which needed further analysis.