Multi-Level Biological Network Analysis and Drug Repurposing Based on Leukocyte Transcriptomics in Severe COVID-19: In Silico Systems Biology to Precision Medicine

The coronavirus disease 2019 (COVID-19) pandemic causes many morbidity and mortality cases. Despite several developed vaccines and antiviral therapies, some patients experience severe conditions that need intensive care units (ICU); therefore, precision medicine is necessary to predict and treat these patients using novel biomarkers and targeted drugs. In this study, we proposed a multi-level biological network analysis framework to identify key genes via protein–protein interaction (PPI) network analysis as well as survival analysis based on differentially expressed genes (DEGs) in leukocyte transcriptomic profiles, discover novel biomarkers using microRNAs (miRNA) from regulatory network analysis, and provide candidate drugs targeting the key genes using drug–gene interaction network and structural analysis. The results show that upregulated DEGs were mainly enriched in cell division, cell cycle, and innate immune signaling pathways. Downregulated DEGs were primarily concentrated in the cellular response to stress, lysosome, glycosaminoglycan catabolic process, and mature B cell differentiation. Regulatory network analysis revealed that hsa-miR-6792-5p, hsa-let-7b-5p, hsa-miR-34a-5p, hsa-miR-92a-3p, and hsa-miR-146a-5p were predicted biomarkers. CDC25A, GUSB, MYBL2, and SDAD1 were identified as key genes in severe COVID-19. In addition, drug repurposing from drug–gene and drug–protein database searching and molecular docking showed that camptothecin and doxorubicin were candidate drugs interacting with the key genes. In conclusion, multi-level systems biology analysis plays an important role in precision medicine by finding novel biomarkers and targeted drugs based on key gene identification.


Introduction
Nowadays, our world has experienced the coronavirus disease 2019 (COVID- 19) pandemic, causing numerous morbid and mortal cases. The disease is caused by severe infection of acute respiratory syndrome coronavirus-2 (SARS-CoV-2). The virus is a positivesense single-strand RNA β-coronavirus classified in the Coronaviridae family, which also

Materials and Methods
The overall process of identifying key genes, novel biomarkers, and candidate drugs using several levels of the biological network is summarized in Figure 1. All our proposed methods were dry experiments or in silico studies based on wet experimental data acquisition from databases. First, the leukocyte transcriptomic profiles from Gene Expression Omnibus (GEO) datasets [50] were downloaded to indicate an overall immune status in severe COVID-19 patients compared to controls. Common differentially expressed genes (DEGs) were identified by considering statistical criteria described in Section 2.1 Data Collection and Preprocessing. The functional enrichment analysis of upregulated and downregulated DEGs were conducted using Metascape [63]. Second, STRING v11.0 [64] was used to construct the PPI network based on the common DEGs. Network clustering was conducted using the Molecular Complex Detection (MCODE) plugin in Cytoscape [65]. The degree and betweenness centrality were calculated using Network Analyzer in Cytoscape to find hub and bottleneck genes in the PPI network. Additionally, the survival analysis from Gene Expression Profiling Interactive Analysis (GEPIA2) [66], using acute myeloid leukemia (LAML) as a cell type model, was operated to identify key genes from the hub and bottleneck genes. Third, MicroRNA Enrichment Turned Network (MIENTUR-NET) [67,68] was used to construct regulatory networks and identify novel biomarkers. Finally, drugs resulting from drug-gene and drug-protein interaction databases were studied by molecular docking. methods were dry experiments or in silico studies based on wet experimental data acquisition from databases. First, the leukocyte transcriptomic profiles from Gene Expression Omnibus (GEO) datasets [50] were downloaded to indicate an overall immune status in severe COVID-19 patients compared to controls. Common differentially expressed genes (DEGs) were identified by considering statistical criteria described in Section 2.1 Data Collection and Preprocessing. The functional enrichment analysis of upregulated and downregulated DEGs were conducted using Metascape [63]. Second, STRING v11.0 [64] was used to construct the PPI network based on the common DEGs. Network clustering was conducted using the Molecular Complex Detection (MCODE) plugin in Cytoscape [65]. The degree and betweenness centrality were calculated using Network Analyzer in Cytoscape to find hub and bottleneck genes in the PPI network. Additionally, the survival analysis from Gene Expression Profiling Interactive Analysis (GEPIA2) [66], using acute myeloid leukemia (LAML) as a cell type model, was operated to identify key genes from the hub and bottleneck genes. Third, MicroRNA Enrichment Turned Network (MIEN-TURNET) [67,68] was used to construct regulatory networks and identify novel biomarkers. Finally, drugs resulting from drug-gene and drug-protein interaction databases were studied by molecular docking. Figure 1. Diagram summarizes the process of identifying key genes, novel biomarkers, and candidate drugs using multi-levels of biological network analyses. There are four principal data and networks, including transcriptomics data, protein-protein interaction network, miRNA-mRNA interaction regulatory network, and drug-protein interaction network towards precision medicine.

Data Collection and Preprocessing
Two gene expression datasets (GSE164805 and GSE154998) were downloaded from GEO DataSets (https://www.ncbi.nlm.nih.gov/geo/, accessed on 14 January 2022) [69]. Diagram summarizes the process of identifying key genes, novel biomarkers, and candidate drugs using multi-levels of biological network analyses. There are four principal data and networks, including transcriptomics data, protein-protein interaction network, miRNA-mRNA interaction regulatory network, and drug-protein interaction network towards precision medicine.

Data Collection and Preprocessing
Two gene expression datasets (GSE164805 and GSE154998) were downloaded from GEO DataSets (https://www.ncbi.nlm.nih.gov/geo/, accessed on 14 January 2022) [69]. Both datasets are leukocyte transcriptomic profiles collected from peripheral blood samples in severe COVID-19 patients compared to non-COVID-19 controls. The gene expression method in GSE164805 was conducted based on the microarray technique, while GSE154998 measured the transcriptomic profiles via the RNA sequencing (RNA-Seq) method [70,71]. The complete data sets, consisting of false discovery rate (FDR) q-value and log2fold change (log2 FC), were manipulated using R package 'dplyr' [72]. The DEGs were filtered based on genes expression having the FDR < 0.05 and absolute log2 FC (|log2 FC|) > 1. DEGs that met the criteria in both data sets were common DEGs that were used for further analysis.

Topological and Network Clustering Analysis of the PPI Network
In Cytoscape, the Network Analyzer plugin was performed to calculate global topological parameters, for instance, average degree, diameter, radius, average clustering coefficient, average shortest path length, and network density. Local topological parameters, such as degree, closeness, betweenness, and clustering coefficient, were also computed. Moreover, network clustering was conducted using MCODE plugin [79] in Cytoscape. The plugin was used by default settings, for example, a degree cut-off: 2, node score cut-off: 0.2, k-core: 2, and max depth: 100. An MCODE score cut-off for cluster selection was greater than 5.

Regulatory Network Construction and Novel Biomarkers Identification
The gene sets in each MCODE cluster were inputted in MIENTURNET (http://userver. bio.uniroma1.it/apps/mienturnet/, accessed on 22 January 2022) [67], an online-based software, to construct microRNA (miRNA)-mRNA regulatory networks. The software was used to find miRNA-mRNA interactions based on miRTarBase, a miRNA-target database validated from experimental data [68]. miRNAs with interaction FDR q-value less than 0.05 were considered novel biomarkers in severe COVID-19.

Identification of Hub and Bottleneck Genes
Degree and betweenness centrality were measured using the Network Analyzer plugin in the Cytoscape to find the hub and bottleneck genes in the PPI network. Given a network called G, let A be a non-weight adjacency matrix of network G. Degree centrality (C D ) is the number of adjacent nodes interacting with interested node i, according to this equation where A ij is a value of matrix A of node i and j, respectively. In biological networks, the high-degree nodes are hub genes playing a crucial role in the network function due to numerous interactions. Nodes in the PPI network having degree centrality greater than the 95th percentile were considered hub genes.
Betweenness centrality (C B ) is the summation of the ratio between the shortest path of node u and v that pass through node i. The betweenness centrality is calculated based on this equation where σ uv is a total number of the shortest path between node u and v and σ uv (i) is the number of the shortest path between node u and v that pass through node i. Nodes with betweenness more than 95th percentile were bottleneck genes in the PPI network. The bottleneck nodes play an important function in forming the bridges controlling the flow of information in the network.

Finding Key Genes Using Survival Analysis
Because there is no powerful tool to validate and predict the gene essentiality in severe COVID-19 recently, we applied the cancer survival analysis to find key genes from the PPI network. The key genes in severe COVID-19 were identified based on the hub and bottleneck genes by using GEPIA2 (http://gepia2.cancer-pku.cn/#index, accessed on 25 January 2022) [66]. GEPIA2 provides the single gene essentiality in several cancer types by using survival and gene expression analysis based on The Cancer Genome Atlas (TCGA) [80] and Genotype-Tissue Expression (GTEx) data [81]. As earlier described, the myeloid-lineage leukocytes such as macrophages, neutrophils, and NK cells play a vital role in COVID-19-associated cytokine storm by releasing the excessive proinflammatory cytokines. Hence, LAML was used as a cell type model to find the key genes related to immune-induced severe COVID-19. The survival analysis was performed by the Kaplan-Meier method, which considers these parameters such as log-rank p-value and hazard ratio (HR) with 95% confidence interval. The key genes were inputted to find targeted drugs in these drug-gene interaction databases, for example, DrugBank database (https://go.drugbank.com/, accessed on 30 January 2022) [82], Therapeutic Target Database (TTD) (http://db.idrblab.net/ttd/, accessed on 30 January 2022) [83], Comparative Toxicogenomics Databases (CTD) (http:// ctdbase.org/, accessed on 30 January 2022) [84], and GeneCards (https://www.genecards. org/, accessed on 30 January 2022) [85]. The selected drugs were confirmed the interaction significance using the STITCH v5.0 database (http://stitch.embl.de/, accessed on 30 January 2022) [86], a drug-protein interaction database, by considering a confidence score greater than 0.400 (medium confidence). The confidence score is calculated based on both experimental and computational evidence, similar to the STRING database. Drugs that met the criteria were considered candidate-targeted drugs.

Molecular Docking of Potential Drugs against B-Myb
Molecular docking was performed to elucidate the interaction between drug candidates and a target protein named B-Myb. This protein is encoded from MYBL2, an essential gene in the network analysis. The crystal structure of B-Myb was received from PDB (https://www. rcsb.org/, accessed on 12 June 2022) [87] using PDB ID: 6C48 from the study [88]. The function of B-Myb is activated via binding between the LXXLL motif located in the B-Myb transactivation domain and the KIX domain of coactivator p300 to form a transcriptional module [89][90][91][92]. Thus, the motif containing L688, R687, G686, L685, and L684 residues was set as the binding site. Several studies have shown that plumbagin, a natural naphthoquinone binding at this motif, can cause B-Myb/p300 interaction interference [93][94][95]; therefore, plumbagin was used as a reference ligand in the docking study to compare with candidate drugs such as doxorubicin and camptothecin. The three compounds were individually docked into B-Myb using HDOCK server (http://hdock.phys.hust.edu.cn/, accessed on 12 June 2022) [96] and AutoDock VinaXB, a docking program using a genetic algorithm [97]. The ionized states of B-Myb were configured at pH 7.4 using PROPKA3.1 [98], while ChemAxon [99] was used to check the pKa value of the compounds. The binding affinity of the candidate drugs was calculated and compared to plumbagin. The 3D and 2D structures demonstrating the drug-protein interactions were visualized using the UCSF Chimera package [100] and the LigPlot [101].

Identification of Common DEGs
The common DEGs were filtered from the transcriptomics data based on microarray and RNA-Seq dataset (see Material and Methods) by considering FDR q-value < 0.05 and |log2 FC| > 1. There were 6692 and 1129 DEGs found by microarray technique and RNA-Seq technology, respectively. Figure 2a displays the Venn diagram representing the common DEGs from both datasets. In total, 384 common DEGs were identified, having 39 upregulated and 221 downregulated DEGs; however, the remaining common DEGs (124 genes) had both upregulation and downregulation because their expression pattern was contradictory between the two datasets. Figure 2b shows the correlation heatmap of the common DEGs between the two datasets. The gene list of the common DEGs is shown in Table S1 in Supplementary Materials.

Functional Enrichment Analysis of Up-and Downregulated DEGs
The functional enrichment analysis using Metascape of the DEGs is shown in Figure  3. In the upregulated DEGs, the terms were primarily enriched relevantly to viral innate immune response and cell cycle regulation (Figure 3a). For instance, IFN-α and IFN-β were type I IFN (IFN-I) predominant in the viral innate immune response. In addition, anaphase-promoting complex/cyclosome (APC/C), a cell cycle regulator complex, and chromosome segregation were enhanced in leukocytes during severe COVID-19. Other increased functional terms, such as regulation of binding and endopeptidase activity,

Functional Enrichment Analysis of Up-and Downregulated DEGs
The functional enrichment analysis using Metascape of the DEGs is shown in Figure 3. In the upregulated DEGs, the terms were primarily enriched relevantly to viral innate immune response and cell cycle regulation (Figure 3a). For instance, IFN-α and IFN-β were type I IFN (IFN-I) predominant in the viral innate immune response. In addition, anaphasepromoting complex/cyclosome (APC/C), a cell cycle regulator complex, and chromosome segregation were enhanced in leukocytes during severe COVID-19. Other increased functional terms, such as regulation of binding and endopeptidase activity, were also found in the upregulated DEGs. Moreover, the functional enrichment in the downregulated DEGs was mainly associated with cellular response to stress, lysosome, protein localization (the processes establishing and maintaining proteins at specific locations), glycosaminoglycan catabolic pathway, mature lymphocyte differentiation, positive regulation of intracellular protein transportation, negative regulation of protein modification, response to hyperoxia, and adaptive immune response (Figure 3b).

PPI Network Construction, Topological Analysis, and Cluster Detection
From the PPI network construction without the neighboring node expansion via STRING v11.0, there were 85 components with 384 nodes and 861 edges. The largest component containing 288 nodes and 848 edges was extracted for topological analysis and identifying clusters and key genes. The edge list information for the component is also provided in Table S2 in Supplementary Materials. Global topological parameters calculated from the Network Analyzer plugin in Cytoscape are illustrated in Table 1. Moreover, local topological parameters in each node in the network are summarized in Table S3 in Supplementary Materials.

PPI Network Construction, Topological Analysis, and Cluster Detection
From the PPI network construction without the neighboring node expansion via STRING v11.0, there were 85 components with 384 nodes and 861 edges. The largest component containing 288 nodes and 848 edges was extracted for topological analysis and identifying clusters and key genes. The edge list information for the component is also provided in Table S2 in Supplementary Materials. Global topological parameters calculated from the Network Analyzer plugin in Cytoscape are illustrated in Table 1. Moreover, local topological parameters in each node in the network are summarized in Table S3 in Supplementary Materials. The largest network visualized by STRING v11.0 is shown in Figure 4. The results analyzed by the STRING revealed that the average node degree, expected number of edges, and average local clustering coefficient were 5.89, 544, and 0.461, respectively. Additionally, a PPI enrichment p-value was less than 10 −16 , indicating that the proteins have interactions with each other more than by chance.  The network probably provided the small-world effect, such as several biological networks, because it had a low value of the mean shortest path length (mspl = 4.33) even though there was a moderate average clustering coefficient (acc = 0.28). Furthermore, the degree distribution plot illustrated in Figure 5a shows the power-law property, indicating the strong negative association between logarithmic scales of degree and its probability (R 2 = 0.86). On the other hand, the clustering coefficient versus degree plot (Figure 5b) shows no relationship between the clustering coefficient and degree (R 2 = 0.12). These be- The network probably provided the small-world effect, such as several biological networks, because it had a low value of the mean shortest path length (mspl = 4.33) even though there was a moderate average clustering coefficient (acc = 0.28). Furthermore, the degree distribution plot illustrated in Figure 5a shows the power-law property, indicating the strong negative association between logarithmic scales of degree and its probability (R 2 = 0.86). On the other hand, the clustering coefficient versus degree plot (Figure 5b) shows no relationship between the clustering coefficient and degree (R 2 = 0.12). These behaviors suggested that the network had scale-free properties. There were three clusters identified from the MCODE plugin with the score of more than 5: MCODE 1, 2, and 3. Topological parameters of the clusters were described in Supplementary Table S4. Most MCODE 1 and 3 cluster members were upregulated DEGs, while MCODE2's cluster members were downregulated DEGs. Functional enrichment results of each cluster are illustrated in Table 2    There were three clusters identified from the MCODE plugin with the score of more than 5: MCODE 1, 2, and 3. Topological parameters of the clusters were described in Supplementary Table S4. Most MCODE 1 and 3 cluster members were upregulated DEGs, while MCODE2's cluster members were downregulated DEGs. Functional enrichment results of each cluster are illustrated in Table 2    There were three clusters identified from the MCODE plugin with the score of more than 5: MCODE 1, 2, and 3. Topological parameters of the clusters were described in Supplementary Table S4. Most MCODE 1 and 3 cluster members were upregulated DEGs, while MCODE2's cluster members were downregulated DEGs. Functional enrichment results of each cluster are illustrated in Table 2 Figure 7 illustrates miRNA-mRNA interaction networks constructed based on the three MCODE clusters. The interactions were statistically significant at FDR q-value < 0.05. There were five novel candidate biomarkers analyzed from the regulatory networks, for instance, hsa-miR-6792-5p, hsa-let-7b-5p, hsa-miR-34a-5p, hsa-miR-92a-3p, and hsa-miR-146a-5p. The further statistical and interaction data of regulatory networks in MCODE 1, 2, and 3 are explained in Figure S2 and Tables S5-S7 in Supplementary Materials. There were three miRNAs interacting with the mRNAs in MCODE 1 (Figure 7a): hsa-miR-6792-5p, hsalet-7b-5p, and hsa-miR-34a-5p. In addition, hsa-miR-92a-3p and hsa-miR-146a-5p interacted with the mRNAs in MCODE 2 and 3, respectively. miRNA regulates gene expression via mRNA binding and increases mRNA degradation or activation [102,103]. A change in miRNA levels can indicate gene expression status; therefore, miRNA measurement can be applied to predict severe COVID-19 based on the effect on gene expression patterns.

Finding Potential miRNAs as Novel Biomarkers in Regulatory Networks
were three miRNAs interacting with the mRNAs in MCODE 1 (Figure 7a): hsa-miR-6792-5p, hsa-let-7b-5p, and hsa-miR-34a-5p. In addition, hsa-miR-92a-3p and hsa-miR-146a-5p interacted with the mRNAs in MCODE 2 and 3, respectively. miRNA regulates gene expression via mRNA binding and increases mRNA degradation or activation [102,103]. A change in miRNA levels can indicate gene expression status; therefore, miRNA measurement can be applied to predict severe COVID-19 based on the effect on gene expression patterns. The blue triangular nodes are miRNAs. The red and green circular nodes represent upregulated and downregulated DEGs, respectively. In comparison, gray nodes represent genes having both upregulation and downregulation.

Key Gene Identification and Survival Analysis
There were 19 and 15 genes being hub and bottleneck, respectively. Tables S8 and S9 in Supplementary Materials reveal topological parameters of the hub and bottleneck The blue triangular nodes are miRNAs. The red and green circular nodes represent upregulated and downregulated DEGs, respectively. In comparison, gray nodes represent genes having both upregulation and downregulation.

Key Gene Identification and Survival Analysis
There were 19 and 15 genes being hub and bottleneck, respectively. Tables S8 and S9 in Supplementary Materials reveal topological parameters of the hub and bottleneck genes, such as degree, betweenness, closeness, and clustering coefficient. Furthermore, Figure S3 in Supplementary Materials shows a Venn diagram of nodes being the hub and bottleneck genes. Seven genes were both hub and bottleneck: AURKB, CD44, CDC25A, DDX58, DICER1, POLR2B, and RPL7. Table 3 displays the biological function of the hub and bottle genes. Most hub genes were involved in cell proliferation and differentiation, such as cell cycle regulation, hematopoiesis, antiapoptotic process, DNA replication and transcription, and ribosomal synthesis. Additionally, the bottleneck genes in the PPI network mainly play an essential role in inflammation, antiviral and innate immune activation, oxidative stress prevention, and biomolecule metabolisms, for instance, lymphocyte and macrophage activation, viral recognition, mitochondrial protein transportation, protein and glycosaminoglycan degradation, and heme catabolism.  The survival analysis using GEPIA2 based on the LAML model in the TCGA database of the 27 hub and bottleneck genes revealed that only MYBL2 provided significant overall survival (log-rank p-value < 0.05) and a high hazard ratio (HR = 1.7); however, there were three genes that were nearly significant overall survival and high hazard ratio, for example, CDC25A (log-rank p-value = 0.064 and HR = 1.7), GUSB (log-rank p-value = 0.057 and HR = 0.58), and SDAD1 (log-rank p-value = 0.082 and HR = 1.6). Figure 8 displays Kaplan-Meier overall survival analysis of the significant and almost significant genes. The overall survival analysis of other hub and bottleneck genes is illustrated in Figure S4 in Supplementary and Materials.

Finding Candidate Targeted Drugs
MYBL2, the significant key gene obtained from the survival analysis, was inputted to the drug-gene interaction databases: DrugBank database [82], TTD [83], CTD [84], and GeneCards [85]. The almost significant key genes, such as CDC25A, GUSB, and SDAD1, were also used to find drug-gene interactions. The result showed 35 FDA-approved drugs interacting with the key genes, as illustrated in Table S10 in Supplementary Materials. STITCH v5.0 database [86] was used to confirm the result from the search. MYBL2 was the only key gene having drug-protein interaction. The STITCH result revealed that doxorubicin and camptothecin interact with MYBL2, as shown in Figure 9.
overall survival (log-rank p-value < 0.05) and a high hazard ratio (HR = 1.7); however, there were three genes that were nearly significant overall survival and high hazard ratio, for example, CDC25A (log-rank p-value = 0.064 and HR = 1.7), GUSB (log-rank p-value = 0.057 and HR = 0.58), and SDAD1 (log-rank p-value = 0.082 and HR = 1.6). Figure 8 displays Kaplan-Meier overall survival analysis of the significant and almost significant genes. The overall survival analysis of other hub and bottleneck genes is illustrated in Figure S4 in Supplementary and Materials.

Finding Candidate Targeted Drugs
MYBL2, the significant key gene obtained from the survival analysis, was inputted to the drug-gene interaction databases: DrugBank database [82], TTD [83], CTD [84], and GeneCards [85]. The almost significant key genes, such as CDC25A, GUSB, and SDAD1, were also used to find drug-gene interactions. The result showed 35 FDA-approved drugs interacting with the key genes, as illustrated in Table S10 in Supplementary Materials. STITCH v5.0 database [86] was used to confirm the result from the search. MYBL2 was the only key gene having drug-protein interaction. The STITCH result revealed that doxorubicin and camptothecin interact with MYBL2, as shown in Figure 9.  Figure  10b). The former program showed that either plumbagin or candidate drugs interacted with the three crucial residues associated with the motif, i.e., L685, R687, and L688. Through the interaction with the active site of B-Myb, doxorubicin and camptothecin produced an HDOCK score of −119.59 and −88.15 kcal mol −1 , relatively outperforming plumbagin's (−64.19 kcal mol −1 ). The strong binding affinity of doxorubicin was supported by two hydrogen bonds formed with the two positively charged residues, R682 and R687. Conversely, only one hydrogen bond binding to residue R687 was detected in the reference ligand and camptothecin. The obtained data were in accordance with the AutoDock VinaXB results. All compounds can bind to the critical residues L685, R687, and L688 with binding affinities of −4.1, −5.6, and −5.5 kcal mol −1 for plumbagin, doxorubicin, and camptothecin, respectively. Again, there were hydrogen bonds between the candidate drugs and B-Myb through R682 and R687 residues. In contrast, no hydrogen bond formation was identified in the case of the reference ligand. Figure 9. Drug-protein interaction network of the candidate drugs targeting MYBL2 resulted from STITCH v5.0. The black, green, and red edges represent protein-protein, drug-protein, and drugdrug interactions. Figure 9. Drug-protein interaction network of the candidate drugs targeting MYBL2 resulted from STITCH v5.0. The black, green, and red edges represent protein-protein, drug-protein, and drugdrug interactions. Figure 10 displays the molecular docking results of the studied compounds binding to the LXXLL motif by the HDOCK webserver ( Figure 10a) and AutoDock VinaXB (Figure 10b). The former program showed that either plumbagin or candidate drugs interacted with the three crucial residues associated with the motif, i.e., L685, R687, and L688. Through the interaction with the active site of B-Myb, doxorubicin and camptothecin produced an HDOCK score of −119.59 and −88.15 kcal mol −1 , relatively outperforming plumbagin's (−64.19 kcal mol −1 ). The strong binding affinity of doxorubicin was supported by two hydrogen bonds formed with the two positively charged residues, R682 and R687. Conversely, only one hydrogen bond binding to residue R687 was detected in the reference ligand and camptothecin. The obtained data were in accordance with the AutoDock VinaXB results. All compounds can bind to the critical residues L685, R687, and L688 with binding affinities of −4.1, −5.6, and −5.5 kcal mol −1 for plumbagin, doxorubicin, and camptothecin, respectively. Again, there were hydrogen bonds between the candidate drugs and B-Myb through R682 and R687 residues. In contrast, no hydrogen bond formation was identified in the case of the reference ligand.

Discussion
Finding novel biomarkers, key genes, and candidate targeted drugs is necessary to predict, treat, and follow severe COVID-19 patients. This study conducted various types of biological network analysis, such as regulatory and protein-protein interaction networks, based on common DEGs from microarray data and RNA-Seq data for the transcriptomics data of severe CPVID-19 patients. The functional enrichment analysis of the upregulated and downregulated DEGs was operated to discover the disease's underlying molecular mechanisms. We also detected a network community in the PPI network. Novel biomarkers were discovered via miRNA identification in the regulatory networks constructed based on the MCODE modules. In addition, the key genes in the PPI network

Discussion
Finding novel biomarkers, key genes, and candidate targeted drugs is necessary to predict, treat, and follow severe COVID-19 patients. This study conducted various types of biological network analysis, such as regulatory and protein-protein interaction networks, based on common DEGs from microarray data and RNA-Seq data for the transcriptomics data of severe CPVID-19 patients. The functional enrichment analysis of the upregulated and downregulated DEGs was operated to discover the disease's underlying molecular mechanisms. We also detected a network community in the PPI network. Novel biomarkers were discovered via miRNA identification in the regulatory networks constructed based on the MCODE modules. In addition, the key genes in the PPI network were found by finding the hub and bottleneck nodes using degree and betweenness centrality measurement and were validated by the overall survival analysis of the LAML model. Finally, drug repurposing was performed by drug-gene and drug-protein interaction database searching and molecular docking based on the key genes.
We identified 384 common DEGs that met the two datasets, and the number of upregulated and downregulated genes were 39 and 221, respectively. The remaining 124 DEGs had both upregulation and downregulation. The functional enrichment result of the upregulated DEGs revealed that the terms were generally involved in antiviral and innate immune response and cell cycle regulation. The processes and pathways were concordant with immune responses to infectious diseases. In host response to infections, immune-related, inflammatory-related, and leukocyte proliferation and differentiation genes are overexpressed to eradicate pathogens [104][105][106][107]; however, excessive immune and inflammatory responses can cause uncontrolled self-tissue injury, leading to severe complications and increased morbid and mortal cases. Furthermore, the enrichment analysis of the downregulated DEGs mainly concentrated in the cellular response to stress, lysosome, mature lymphocyte differentiation, negative regulation of protein modification, glycosaminoglycan catabolic pathway, response to hyperoxia, and adaptive immune response. Numerous studies have shown that impaired lymphocyte differentiation and adaptive immune activation are found in severe COVID-19, resulting in delayed viral clearance and persistent proinflammatory cytokine release [108][109][110][111][112]. ARDS and severe pneumonia are also found in severe COVID-19, causing hypoxia. Hence, genes related to the hyperoxia response were downregulated. In addition, negative protein modification regulation expression was reduced to increase the proinflammatory cytokine and antiviral protein production and release. Decreased glycosaminoglycan degradation can promote SAR-CoV, MERS-CoV, and SARS-CoV-2 to infect host cells. For example, some studies have revealed that the viruses use S protein binding with heparan sulfate proteoglycans (HSPGs) to enter the host cells in the disease's early stage [113][114][115][116].
The PPI network constructed by the STRING database based on the common DEGs, same as other biological networks, had the scale-free property. The scale-free property was proved by the strong relationship between degree and degree probability in degree distribution and the independence of the clustering coefficient and degree. Furthermore, the network likely provided the small-world effect because it had a low average shortest path length and moderate average clustering coefficient. The PPI network cluster detection using the MCODE algorithm showed three clusters with high MCODE scores: MCODE 1, 2, and 3. MCODE 1 was the upregulated gene cluster mainly enriched in cell proliferation and cell cycle. MCODE 2, the downregulated gene set, was primarily concentrated in ribosomal synthesis and protein translation regulation. Furthermore, MCODE 3 was centered on antiviral and innate immune responses; therefore, the enrichment terms of each cluster were according to the terms found in upregulated and downregulated DEGs.
The regulatory networks from the MCODE clusters showed that five miRNAs, hsa-miR-6792-5p, hsa-let-7b-5p, hsa-miR-34a-5p, hsa-miR-92a-3p, and hsa-miR-146a-5p, were the novel candidate biomarkers. hsa-miR-6792-5p, hsa-let-7b-5p, and hsa-miR-34a-5p interacted upregulated mRNAs related to cell proliferation and differentiation in MCODE 1 cluster. In MCODE 2, downregulated mRNAs involved in protein translation regulation were associated with hsa-miR-92a-3p. Moreover, hsa-miR-146a-5p interacted with upregulated antiviral and innate immune mRNAs in MCODE 3. miRNAs are small, non-coding RNAs that play an essential role in controlling gene expression via binding mRNA and then increase mRNA cleavage or translation dependent on their properties [103,117]. miRNAs also have a role in clinical applications such as diagnostic markers and therapeutic targets [118,119]. Because miRNAs are stable and detectable in serum and plasma, they are applied as biomarkers for diagnosis [120]. Several studies have revealed miRNA expression based on viral proteins, host membrane receptors, and proinflammatory cytokines [121][122][123][124]; however, no study has reported the relationship between the five miRNAs acquired from the regulatory network analysis and COVID-19. Thus, they could play a crucial role in novel diagnostic biomarkers and therapeutic agents in severe COVID-19 and further investigation of their roles should be needed.
There were 27 hub or bottleneck genes from high degree and betweenness value selection. The hub genes were mainly involved in cell proliferation and differentiation, while the bottleneck genes were focused on antiviral and innate immune responses. We also found the four key genes, such as CDC25A, GUSB, MYBL2, and SDAD1, from the overall survival analysis based on the LAML model. CDC25A is a cell cycle and apoptosis regulator that plays a vital role in many cancers' progression, for example, breast, esophageal, lung, colorectal, prostate, and ovarian cancer [125,126]. Furthermore, in viral infection, a study performing Sendai virus-infected cell line showed that upregulated CDC25A suppressed IFN-β activation while knockdown of CDC25A increased IFN-β stimulation [127]. The result suggested that CDC25A could participate in impaired viral innate immunity and increase viral survival. GUSB is a hydrolase enzyme for glycosaminoglycan degradation [128]. As described earlier, declined glycosaminoglycan degradation can promote the coronaviruses to enter the host cells. As a result, GUSB can play a central role in COVID-19 progression. B-Myb, encoded from MYBL2, is a transcription factor in the MYB family that plays an essential role in cell proliferation, differentiation, apoptosis, and tumorigenesis [129]. It is used as a prognostic marker in many cancer types, such as hepatocellular carcinoma, gallbladder, colorectal, and breast cancer [130][131][132][133]. Interestingly, a weighted gene co-expression network analysis in the COVID-19 study reported that MYBL2 was one of 52 hub genes from the network analysis [134]. This result suggested the important role of MYBL2 in numerous biological networks. There are a few studies on the role of SDAD1. It probably plays a role in ribosomal biogenesis and tumorigenesis [135]. There is no report about a relationship between its expression and COVID-19. Hence, further studies on the biological roles of SDAD1 are needed.
The candidate targeted drug discovery came from searching in the four drug-gene interaction databases and the drug-protein interaction database based on the four key genes. The result indicated that doxorubicin and camptothecin had interacted with MYBL2. The drug-protein interactions can be investigated by molecular docking. No 3D structure of B-Myb in complex with known inhibitor is currently available. The involvement between the key residues and binding site in B-Myb's LXXLL motif, a multifunctional binding sequence in transcriptional regulation [136], was reported [93][94][95]. The b-Myb activity was inhibited by blocking the KIX domain of the B-Myb interaction partner, which was p300 [89], through natural [137][138][139] and small compounds [140]; however, identifying compounds that inhibit directly on B-Myb rather than p300 has not been revealed. In this work, the molecular docking results from the HDOCK webserver and AutoDock VinaXB showed that the two drug candidates, doxorubicin and camptothecin, had physical interactions with B-Myb. This evidence was supported by a reduction in cell proliferation in cancer cell lines having MYBL2 overexpression without proving apparent mechanisms [141,142]. We then proposed the possible mechanism from our study that their direct interactions with B-Myb could be involved in the decreased cellular activity of upregulated MYBL2 cells. Additionally, the candidate drugs demonstrated binding interaction and susceptibility with B-Myb significantly greater than plumbagin, the reference ligand; therefore, doxorubicin and camptothecin could be potential candidates to combat COVID-19.
There is other evidence to support that the candidate drugs could play an important role in severe COVID-19 treatment. Doxorubicin is a chemotherapeutic agent treating various types of cancer [143]. A study of structural bioinformatics revealed that doxorubicin proved the significant binding energy with SARS-CoV-2 main protease in the molecular docking [144]. This result suggested that doxorubicin could be a potential drug to treat severe COVID-19. Camptothecin is a natural product extracted from the Chinese happy tree (Camptotheca acuminata) [145]. It is used as a chemotherapeutic agent in cancer treatment by inhibiting DNA replication [146,147]. Camptothecin also has antiviral activity by inhibiting viral replication [148][149][150]. A study on transcriptomic profile in COVID-19 using bioinformatics showed that camptothecin could reverse the gene signature in COVID-19 [151]. In addition, the evidence from a molecular docking study uncovered that camptothecin formed hydrogen bonds with SARS-CoV-2 S protein to prevent the binding between S protein and ACE2 receptor [152]. The results indicated that camptothecin could play a vital role in COVID-19 treatment.
We studied the biological networks and structural biology to identify the key genes, novel biomarkers, and candidate targeted drugs based on leukocyte transcriptomic profiles; however, the immunopathology of severe COVID-19 is the interaction between immune cells and respiratory cells. Analysis of peripheral white blood cell gene expression can lose some proinflammatory cytokine information. Performing lung transcriptomic profiles for biological network construction is our suggestion for future research. Single-cell methods should be conducted to identify key genes and targeted drugs in each cell type. Advanced computational chemical methods such as molecular mechanics and molecular dynamics should also be included to simulate drug-protein interactions. Moreover, machine learning approaches are needed to deal with the big data of transcriptomic profiles to find the important features and predict key genes, novel biomarkers, and candidate-targeted drugs more widely and precisely.

Conclusions
Our study performed the multi-level biological network analysis from peripheral white blood cell transcriptomic profiles in severe COVID-19 patients. We found that the upregulated genes were enriched in cell proliferation and innate immune responses while the downregulated genes were concentrated in lymphocyte differentiation, adaptive immune response, and glycosaminoglycan degradation. The regulatory network analysis of the PPI network clusters provided novel diagnostic biomarkers from miRNAs. The key genes in severe COVID-19 were also identified via topological and survival analysis. These key genes play a significant role in leukocyte proliferation, antiviral activity, and viral proliferation. Furthermore, the candidate drugs targeting the key genes were found from database searching and evaluated with molecular docking. Nonetheless, other biomarkers, key genes, and candidate-targeted drugs were not found and need further investigation; therefore, advanced experimental and computational tools should be integrated to find new biomarkers and target treatments more precisely and personally.
Supplementary Materials: The following supporting information can be found and downloaded at: https://www.mdpi.com/article/10.3390/jpm12071030/s1. Table S1: Common differentially expressed genes (DEGs) from the two GEO datasets; Table S2: Edgelist of the largest PPI network; Table S3: Local topological parameters in each node in the PPI network; Table S4: Local topological parameters of the three MCODE clusters; Table S5: Statistical and interaction data of regulatory networks in MCODE 1; Table S6: Statistical and interaction data of regulatory networks in MCODE 2; Table S7: Statistical and interaction data of regulatory networks in MCODE 3; Table S8: Topological parameters of the 19 hub genes in the PPI network; Table S9: Topological parameters of the 15 bottleneck genes in the PPI network; Table S10: Drug-gene interaction data of the key genes; Figure  S1: Functional enrichment analysis of genes in each MCODE module; Figure S2: Bar graph of the number of interactions of miRNAs in each gene in MCODE modules using MIENTURNET based on miRTarBase database; Figure S3: Venn diagram of the key genes in the degree and betweenness centrality; Figure S4: Kaplan-Meier overall survival analysis of the hub and bottleneck genes in severe COVID-19.

Data Availability Statement:
The data generated in this study is available in this article.