Exploring the Novel Computational Drug Target and Associated Key Pathways of Oral Cancer

Oral cancer (OC) is a serious health concern with a high fatality rate. The oral cavity has seven kinds of OC, including the lip, tongue, and floor of the mouth, as well as the buccal, hard palate, alveolar, retromolar trigone, and soft palate. The goal of this study is to identify new biomarkers and important pathways that might serve as diagnostic biomarkers and therapeutic candidates in OC. OC-related datasets were collected from the publicly available Gene Expression Omnibus (GEO) repository. The GSE74530, GSE23558, and GSE3524 microarray datasets were collected for analysis. Minimum cut-off criteria of |log fold-change (FC)| > 1 and adjusted p < 0.05 were applied to identify the upregulated and downregulated differentially expressed genes (DEGs) in the three datasets. Only the DEGs common to all three datasets were retained for further analysis. Gene ontology (GO) and pathway analyses were performed to explore the functional behavior of the DEGs. Protein-protein interaction (PPI) networks were then built to identify the most active genes, and a clustering algorithm was applied to identify densely connected regions of the PPI network. TF-miRNA networks were also constructed to study OC-associated DEGs in depth. Finally, the top-ranked genes from the PPI networks were used for drug signature analysis. After applying the filtration and cut-off criteria, 2508, 3377, and 670 DEGs were found for GSE74530, GSE23558, and GSE3524, respectively, and 166 DEGs were common to all three datasets. GO annotation indicated that most of the DEGs were associated with the type I interferon signaling pathway. KEGG pathway analysis indicated that the common DEGs are related to the cell cycle and influenza A. The PPI network comprised 88 nodes and 492 edges, and CDC6 had the highest number of connections. Four clusters were identified in the PPI network. The drug signatures doxorubicin and resveratrol showed high significance with respect to the hub genes.
We anticipate that our bioinformatics research will aid in the definition of OC pathophysiology and the development of new therapies for OC.

arrange vast volumes of data. An extensive network-based bioinformatics methodology was constructed in this study to reveal the influence of OC on gene expression patterns and how these could contribute to promoting other illnesses. First, we examined the gene expression patterns of three datasets and, after filtering by the minimum criteria, took all common genes forward for further analysis. The protein interaction network was the major outcome of this study, and it formed the basis for the hub gene, GO term, disease pathway, and cluster analyses. All the steps of the study are shown in Figure 1.
Figure 1. Overview of the study methodology. A transcriptomic comparative analysis was performed between the OC datasets. Initially, three OC datasets were exported from the GEO open repository; then GEO2R was used to normalize the datasets. The three normalized datasets were compared with each other to find common DEGs. This comparison yielded a total of 166 DEGs, which were used for further analysis, including the PPI network, GO analysis, pathway analysis, module network, and hub genes. Finally, using the hub genes, drug compounds and the TF-miRNA co-regulatory network were extracted.

Dataset Consideration and DEG Identification
The GEO database [19] (https://www.ncbi.nlm.nih.gov/geo/, Access Date 1 August 2022) was used to extract the microarray datasets. GEO is a publicly accessible gene expression collection with over 94,000 datasets and over 2 million samples [20], and was founded by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine [21][22][23]. Three microarray datasets, with accession numbers GSE74530, GSE23558, and GSE3524, were extracted for OC. The GSE74530 dataset is based on a single platform, GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array, for Homo sapiens, and comprises 6 normal and 6 tumor tissue samples. The GSE23558 dataset is based on the GPL6480 platform, Agilent-014850 Whole Human Genome Microarray 4x44K G4112F, and comprises 27 tumor and 5 normal tissue samples. The GSE3524 dataset comprises 16 tumor and 4 normal tissue samples and is based on a single platform, GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. The three datasets were filtered with the minimum criteria of adjusted p-value < 0.05 and |logFC| > 1, and the genes common to the three datasets were identified using the online Venn tool (http://bioinformatics.psb.ugent.be/webtools/Venn/, Access Date 1 August 2022) [24].
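The filtering and comparison step can be illustrated with a minimal sketch. It assumes each dataset has already been exported as rows of (gene symbol, log fold-change, adjusted p-value); the function names and the toy rows are hypothetical and are not taken from the study's data or tooling.

```python
from typing import Dict, Iterable, Set, Tuple

# Each record is (gene_symbol, log_fold_change, adjusted_p_value).
Record = Tuple[str, float, float]


def filter_degs(records: Iterable[Record],
                lfc_cutoff: float = 1.0,
                padj_cutoff: float = 0.05) -> Dict[str, str]:
    """Return {gene: 'up' | 'down'} for records passing both cut-offs."""
    degs: Dict[str, str] = {}
    for gene, logfc, padj in records:
        if abs(logfc) > lfc_cutoff and padj < padj_cutoff:
            degs[gene] = "up" if logfc > 0 else "down"
    return degs


def common_degs(*deg_sets: Dict[str, str]) -> Set[str]:
    """Genes that pass the filter in every dataset."""
    if not deg_sets:
        return set()
    shared = set(deg_sets[0])
    for degs in deg_sets[1:]:
        shared &= set(degs)
    return shared


# Toy rows with invented values, for illustration only.
toy_a = [("GENE1", 2.3, 0.001), ("GENE2", -1.4, 0.02), ("GENE3", 0.4, 0.001)]
toy_b = [("GENE1", 1.8, 0.010), ("GENE2", -0.9, 0.01), ("GENE4", 3.0, 0.200)]
toy_c = [("GENE1", 1.2, 0.030), ("GENE2", -2.0, 0.04)]

shared = common_degs(filter_degs(toy_a), filter_degs(toy_b), filter_degs(toy_c))
print(sorted(shared))
```

In this sketch a gene counts as common when it passes the filter in every dataset, regardless of whether its direction of change agrees across datasets; a stricter variant could also require matching direction.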

GO and Pathway Enrichment Analysis
The biological activities of the common DEGs were evaluated by functional analysis. GO and pathway analyses were performed through ClueGO, a Cytoscape app that extracts representative functional biological information from long lists of genes or proteins. The functional enrichment analysis was based on the most recent publicly accessible data from several annotation and ontology resources, which ClueGO can retrieve automatically. To make the analysis easier, predefined options for term selection are supplied. The results are shown as networks in which GO terms and pathways are grouped according to their biological function [25]. GO terms are divided into three categories: biological process (BP), molecular function (MF), and cellular component (CC). Three databases (KEGG, WikiPathways, and Reactome) were used to extract the pathway-related data.
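As a rough illustration of what an over-representation test measures, the sketch below computes a one-sided hypergeometric p-value for a single term. It is a generic textbook test written for this example, not ClueGO's implementation; the function name and the toy numbers are assumptions, and multiple-testing correction is left out.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): probability of seeing at least k annotated genes in a
    list of n genes, when K of the N background genes carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 20 background genes, 5 annotated to the term,
# a gene list of 4, of which 2 carry the term.
p_value = hypergeom_enrichment_p(k=2, n=4, K=5, N=20)
print(round(p_value, 4))
```

A small p-value suggests that the term appears in the gene list more often than expected by chance given the background.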

PPI Network Construction and Cluster Algorithm Implementation
The PPI network is a visual representation of protein-protein interactions, which helps to identify critical genes and potential biomarkers [26]. The protein network for the OC datasets was constructed from the physical connections between proteins in the STRING database [27] through the free NetworkAnalyst web tool [28]. A minimum confidence score cutoff of 0.70 was used to build the PPI network. Cytoscape is a free and open-source software tool for network visualization, data integration, and analysis. Its development has mostly focused on the modeling demands of systems biology, but it has also been used in other fields [29]. The MCODE plugin [30] was used to identify densely connected regions of the PPI network, with the basic parameters degree cutoff = 2, k-core = 2, maximum depth = 100, and node score cutoff = 0.2.
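The confidence cutoff and the k-core parameter can be sketched on a toy graph as follows. This is only a simplified illustration: it keeps edges with a score of at least 0.70 and then prunes to a 2-core, and it does not reproduce the MCODE scoring and seeding procedure. The edge list and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, float]  # (protein_a, protein_b, confidence score)
Graph = Dict[str, Set[str]]


def build_network(edges: Iterable[Edge], min_score: float = 0.70) -> Graph:
    """Keep undirected edges whose confidence meets the cutoff."""
    adj = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def k_core(adj: Graph, k: int = 2) -> Graph:
    """Repeatedly drop nodes with fewer than k remaining neighbors."""
    core = {node: set(nbrs) for node, nbrs in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in [n for n, nbrs in core.items() if len(nbrs) < k]:
            for nbr in core.pop(node):
                core[nbr].discard(node)
            changed = True
    return core


# Toy edges with invented scores, for illustration only.
toy_edges = [("A", "B", 0.90), ("B", "C", 0.80), ("A", "C", 0.95),
             ("C", "D", 0.75), ("D", "E", 0.40)]

network = build_network(toy_edges)
core = k_core(network, k=2)
print(sorted(network), sorted(core))
```

Here the D-E edge falls below the cutoff, so E never enters the network, and D is pruned from the 2-core because it has only one neighbor left.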

Hub Genes Identification and Analysis
Hub genes are highly connected proteins in the PPI network that can play a significant role in identifying potential therapeutic biomarkers [31]. Three widely used algorithms, Degree [32], Maximal Clique Centrality (MCC) [33], and Maximum Neighborhood Component (MNC) [34], were used to identify the significant genes in the PPI network through the cytoHubba plugin of Cytoscape.
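Two of these rankings are simple enough to sketch directly. The example below assumes an undirected graph stored as an adjacency dictionary without self-loops; it computes Degree and MNC (the size of the largest connected component among a node's neighbors) and intersects the top-ranked nodes. MCC is omitted because it requires maximal clique enumeration. The toy graph and helper names are hypothetical, and this is not the cytoHubba code.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def degree_scores(adj: Graph) -> Dict[str, int]:
    """Number of direct neighbors of each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def mnc_scores(adj: Graph) -> Dict[str, int]:
    """Size of the largest connected component among each node's neighbors."""
    scores: Dict[str, int] = {}
    for node, nbrs in adj.items():
        unseen = set(nbrs)
        best = 0
        while unseen:
            stack = [unseen.pop()]
            size = 0
            while stack:
                current = stack.pop()
                size += 1
                linked = adj[current] & unseen  # stay inside the neighborhood
                unseen -= linked
                stack.extend(linked)
            best = max(best, size)
        scores[node] = best
    return scores


def top_k(scores: Dict[str, int], k: int) -> List[str]:
    """Highest-scoring nodes; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


# Toy graph, for illustration only.
toy_graph: Graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A", "E"},
    "E": {"D"},
}

degree = degree_scores(toy_graph)
mnc = mnc_scores(toy_graph)
hubs = set(top_k(degree, 3)) & set(top_k(mnc, 3))
print(degree, mnc, sorted(hubs))
```

Taking the intersection of the top-ranked nodes is one possible way to combine rankings; other choices, such as a union or a rank average, would give different candidate lists.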

Computational Drug Signature Identification and Analysis
Therapeutic targets for the chosen hub DEGs were identified using the Drug Signatures Database (DSigDB) [35]. DSigDB is a gene set repository for gene set enrichment analysis (GSEA) that connects drugs and chemicals to their target genes. DSigDB presently includes 22527 gene sets with 17389 distinct compounds covering 19531 genes. The database allows users to search for, browse through, and download drugs, chemicals, and gene sets. For drug repurposing and translational research, DSigDB gene sets may be used in GSEA software to relate gene expression to drugs and compounds [35]. A p-value threshold of 0.01 and an overlap gene count of >= 9 were used as cutoff criteria for finding pharmaceutical targets.
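The cutoff step can be sketched as a simple filter over enrichment results. The example assumes each result row carries a compound name, a p-value, and the set of overlapping genes, and it reads the p-value criterion as strictly below 0.01, which is an assumption. The compound and gene names are placeholders, not DSigDB entries.

```python
from typing import List, NamedTuple, Set


class Signature(NamedTuple):
    compound: str
    p_value: float
    overlap: Set[str]  # query genes found in the compound's gene set


def select_signatures(rows: List[Signature],
                      max_p: float = 0.01,
                      min_overlap: int = 9) -> List[Signature]:
    """Keep rows passing both cutoffs, most significant first."""
    kept = [r for r in rows
            if r.p_value < max_p and len(r.overlap) >= min_overlap]
    return sorted(kept, key=lambda r: r.p_value)


# Placeholder rows with invented values, for illustration only.
nine_genes = {f"G{i}" for i in range(1, 10)}
ten_genes = {f"G{i}" for i in range(1, 11)}
toy_rows = [
    Signature("compound_A", 0.0004, nine_genes),
    Signature("compound_B", 0.0001, {"G1", "G2", "G3", "G4"}),
    Signature("compound_C", 0.03, ten_genes),
    Signature("compound_D", 0.002, ten_genes),
]

selected = [row.compound for row in select_signatures(toy_rows)]
print(selected)
```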

TF-miRNA Co-Regulatory Network and Analysis
The TF-miRNA co-regulatory network was extracted from the RegNetwork repository (http://www.regnetworkweb.org/, Access Date 1 August 2022) and may help to reveal novel information about OC. MicroRNA (miRNA) is a form of non-coding RNA molecule that regulates gene expression post-transcriptionally. According to recent research, miRNAs have a crucial role in tumor growth, differentiation, and apoptosis [36,37]. A number of studies have linked miRNA targets to tumor development in oral cancer [38]. Some miRNAs are antiapoptotic, whereas others promote apoptosis. Transcription factors (TFs) are frequent regulators of genes and are primarily responsible for transcriptional regulation. Cancer subtypes may be identified using data from networks of miRNAs, TFs, and mRNAs, which sheds light on the processes that control each cancer subtype.

166 Common Genes Were Found
Three datasets were extracted from the GEO repository of the NCBI open-source database. Initially, the GSE74530, GSE23558, and GSE3524 datasets showed 22187, 19563, and 9617 DEGs, respectively. After filtering with the minimum criteria, 2508, 3377, and 670 DEGs remained for GSE74530, GSE23558, and GSE3524, respectively. A comparative analysis was used to identify the common DEGs. A total of 166 common DEGs were found, as shown in Figure 2.

Figure 2. The gene expression datasets of OC were analyzed to identify the common differentially expressed genes (DEGs) between the datasets. A total of 166 genes were regarded as the common DEGs between the datasets of OC.

The ClueGO Analysis for Gene Ontology and Pathway Enrichment
The GO annotation indicated that the terms type I interferon signaling pathway, cellular response to type I interferon, response to type I interferon, mitotic sister chromatid segregation, sister chromatid segregation, regulation of mitotic metaphase/anaphase transition, etc. are highly significant in BP. Further, the terms spindle, mitotic spindle, chromosome, centromeric region, kinetochore, chromosomal region, spindle midzone, condensed chromosome, contractile actin filament bundle, stress fiber, condensed chromosome kinetochore, etc. are highly associated with the CC. The MF is related to the terms peptidase activator activity, integrin binding, ATP-dependent microtubule motor activity, peptidase activator activity involved in the apoptotic process, microtubule motor activity, transmembrane receptor protein tyrosine kinase activity, motor activity, DNA replication origin binding, non-membrane spanning protein tyrosine kinase activity, chaperone binding, etc. (Table 1, Figure 3).
The KEGG pathways showed that the common DEGs are connected with the cell cycle, influenza A, progesterone-mediated oocyte maturation, Epstein-Barr virus infection, oocyte meiosis, measles, serotonergic synapse, hepatitis C, ECM-receptor interaction, small cell lung cancer, etc. Furthermore, the Reactome pathways showed that the common DEGs are mostly connected with expression of IFN-induced genes, CDK1 phosphorylates CDCA5 (sororin) at centromeres, formation of Cyclin B: Cdc2 complexes, CDK1: CCNB phosphorylates, CDK1 phosphorylates, ISGylation of host proteins, etc. The WikiPathways resource revealed that the common DEGs were associated with the cell cycle, overview of nanoparticle effects, non-genomic actions of 1,25 dihydroxy vitamin D3, DNA replication, type II interferon signaling (IFNG), the human immune response to tuberculosis, type I interferon induction and signaling during SARS-CoV-2 infection, host-pathogen interaction of human coronaviruses-Interferon induction, etc. (Table 2, Figure 4).

PPI Network Analysis and Cluster Algorithm Implementation
The PPI network was the most significant outcome of this research study. The STRING database was used to construct the PPI network, which was then edited in the Cytoscape application. The PPI network comprised a total of 88 nodes and 492 edges, where nodes represent genes and edges represent the connections between them (Figure 5). In addition, the MCODE algorithm was applied to identify densely connected regions of the PPI network. The MCODE plugin reported four such regions (modules) (Figure 6). The first module contained 26 nodes and 269 edges, among which most of the hub genes were interconnected; the second module contained 13 nodes and 70 edges; and the third and fourth modules each contained three nodes and three edges.

Doxorubicin and Resveratrol as Significant Drug Signatures
The DSigDB database was used to uncover the computational drug signature for the hub genes. There were many signatures that showed significant connectivity with the hub genes (Table 4). Among them, doxorubicin and resveratrol showed high significance. Doxorubicin is a commonly utilized therapy drug for many forms of cancer, according to a large number of research studies [39][40][41][42]. This medication binds to DNA and inhibits topoisomerase II activity, preventing the DNA double helix from resealing and stopping replication. Long-term replication halting triggers molecular pathways that lead to cell demise. While it is an effective anticancer drug, its usage has been limited because of the accompanying adverse effects, which include permanent myocardial damage and deadly congestive heart failure [42].

TF-miRNA Co-Regulatory Network and Analysis
MicroRNAs (miRNAs) and transcription factors (TFs) are essential regulators of gene expression [43]. MiRNAs and TFs may have a dual regulatory role in OC. After collecting the hub genes from the PPI network, we created a full TF-miRNA co-regulatory network by combining predicted and experimentally validated TF and miRNA targets. The RegNetwork repository was used to create the TF-miRNA co-regulatory network from the hub genes. The network has 131 nodes and 153 edges, including 63 TF candidate nodes, 8 hub nodes, and 60 miRNA candidate nodes (Figure 8). hsa-miR-590-3p is the most significant miRNA target and is connected with three hub genes and one TF gene. Four TF genes, MYB, SP1, NFYA, and MYC, were highly connected with the hub genes in the TF-miRNA co-regulatory network.
Figure 8. A TF-miRNA co-regulatory network was created using hub genes. In the figure, octagonal red nodes refer to hub genes, rectangular nodes indicate miRNAs, and green (circular) and yellow (diamond-shaped) nodes are TF genes. According to the figure, MYC, MYB, SP1, and NFYA are significant TF genes; on the other hand, hsa-miR-590-3p is an important miRNA target.

Discussion
Oral cancer is a major health issue that has a high morbidity and fatality rate. Early identification and prevention are critical in reducing the global incidence of oral cancer [44]. We examined gene expression patterns in three microarray datasets of OC patients using a network-based method, and discovered molecular targets that could be exploited as cancer biomarkers, and could also provide critical details regarding their influence on the evolution of diseases or disorders. Expression profiling using high-throughput microarray datasets was shown to be a useful resource for discovering biomarker candidates for a number of disorders in the domains of biomedical and computational biology [45]. The 166 common DEGs had comparable expression across three datasets, according to the OC transcriptomic analysis. The biological significance of the 166 common DEGs was investigated using gene ontology and pathway analysis methodologies based on p-values to gain insight into the etiology of OC.
The GO is a structured annotation framework, based on a general conceptual model, that makes genes and their relationships easier to interpret. It draws on biological knowledge about gene activities and regulation accumulated over time across a range of ontological areas [46]. The GO database was used as an annotation source for three types of GO analysis: BP (biological programs accomplished by multiple molecular activities), CC (the cellular locations in which gene products act), and MF (molecular-level activities of gene products) [47]. Group-wise analysis through ClueGO showed that mitotic sister chromatid segregation, regulation of mitotic metaphase/anaphase transition, type I interferon signaling pathway, positive regulation of cell cycle phase transition, response to interferon-beta, response to interferon-alpha, macrophage differentiation, etc. are significant BP terms. Of these, mitotic sister chromatid segregation is significantly associated with the common DEGs. The term groups chromosome, centromeric region, spindle, stress fiber, CMG complex, nuclear replication fork, cornified envelope, etc. are associated with the CC. The MF-related term groups are ATP-dependent microtubule motor activity, peptidase activator activity, transmembrane receptor protein tyrosine kinase activity, collagen binding, integrin binding, etc.
Pathway analysis is an effective approach for reflecting an organism's behavior through internal alterations. KEGG, Reactome, and WikiPathways were used to compile the pathways of the common DEGs. The pathway term groups Cell cycle, Influenza A, ECM-receptor interaction, Leishmaniasis, Epithelial cell signaling in Helicobacter pylori infection, etc. were found in KEGG. The Reactome pathway groups connected with the common DEGs include Formation of Cyclin B: Cdc2 complexes, CDK1 phosphorylates CDCA5 (Sororin) at centromeres, Association of Nek2A with MCC: APC/C, DNA polymerase alpha: primase binds at the origin, Kinesins move along microtubules consuming ATP, etc. In addition, Type I interferon induction and signaling during SARS-CoV-2 infection, Type II interferon signaling (IFNG), DNA Replication, Non-genomic actions of 1,25 dihydroxy vitamin D3, etc. are the pathway groups found in WikiPathways.
Using the common DEGs, a PPI network was created to understand the biological characteristics in depth and to explore disease biomarkers [48]. Using the three ranking methods (Degree, MCC, and MNC), eight hub DEGs were identified, namely CCNB1, BUB1, CCNB2, TTK, MAD2L1, CDK1, BUB1B, and AURKA, for their potential role as therapeutic biomarkers [49]. CDK1 is a highly conserved serine/threonine kinase. With roughly 70 regulatory targets, it plays a critical role in cell cycle control. CDK1 directly phosphorylates a variety of target substrates in order to govern cell transcription and progression in response to various stimuli [50]. CDKs and their modulators have been found to be abnormally active in a variety of cancers. CDK deficiency results in aberrant cell proliferation and genomic instability [51]. All human malignancies are reported to be influenced by the D-cyclin-CDK4/6-INK4-Rb pathway [52]. CDK1 has been reported to be overexpressed in a variety of malignancies, including pancreatic adenocarcinoma, hepatoma, colorectal carcinoma, and head and neck cancers [53,54], indicating that it plays a significant role in cell-cycle regulation and cancer formation.
hsa-miR-590-3p is an important target in the TF-miRNA co-regulatory network. Previous studies have reported that hsa-miR-590-3p may play a significant role as a therapeutic biomarker in pancreatic cancer [54], colorectal cancer [55], prostate cancer [56], and breast cancer [57]. No study has yet reported that hsa-miR-590-3p may play any role in OC. The TF genes MYB, SP1, NFYA, and MYC showed significance in the TF-miRNA co-regulatory network. The use of modern molecular biology and gene modification methods in vitro and in vivo over the last decade has shown the significance of c-MYB in several forms of cancer. Breast [58], ovarian [59], colorectal [60], and colon [61] carcinoma are all examples of cancers in which MYB plays a key role in cancer generation and development. A high level of MYB expression is thought to be linked to a halt in cellular differentiation as well as to continuing proliferation, which leads to oncogenicity [62]. A further experiment revealed that c-Myb may directly decrease miR-1258 production by binding to its promoter. Furthermore, a study found a negative correlation between c-Myb and miR-1258 expression in OSCC tissues. Taken together, c-Myb was shown to be responsible for miR-1258 downregulation in OSCC [63]. According to recent findings, SP1 appears to have a role in cancer growth, invasion, and metastasis [64]. SP1 accelerated the cell cycle from G1 to S phase, promoting cell proliferation [65]. These findings suggest that SP1 enhances cancer progression by altering cell proliferation and invasion. Overexpression of SP1 was found to contribute to OSCC development in prior research, suggesting that SP1 might be a possible therapeutic target in OSCC [66].
Doxorubicin, often called Adriamycin, is an anthracycline antibiotic derived from Streptomyces peucetius. Doxorubicin inhibits topoisomerase II, intercalates DNA, and produces free radicals, resulting in cell death or growth inhibition [67]. Doxorubicin has been widely employed for the treatment of various types of tumors [68,69] due to its broad-spectrum anti-tumor action and affordable cost. Doxorubicin resistance, on the other hand, is common in advanced cancers with a poor prognosis [70]. Doxorubicin is typically used for breast cancer treatment [71][72][73], while some studies have claimed that it might play an important role in the treatment of colon cancer [74], thyroid cancer [75], and oral cancer [76].
One study looked at the effects of resveratrol on numerous targets, such as tubulin, protein kinase C alpha (PKC), phosphodiesterase-4D, human oral cancer cell line proteins, DNA sequences containing AATT/TTAA segments, and lysine-specific demethylase 1 [77]. Resveratrol administration inhibited the rate of cell growth in OSCC cell lines (0-1.5 g/mL) in a concentration- and time-dependent manner [78]. Cell cycle analysis demonstrated that resveratrol administration increased the number of cells in the G2/M phase, with a subsequent decrease in the G1 phase, in a time-dependent way [79].
The findings of this study have not been validated against a gold-standard benchmark database, which is a significant weakness of this research work. Another significant weakness is that the ontological terms and pathways have not been directly linked to OC.

Conclusions
The biological areas and regulatory components discovered were briefly addressed in this work, which may hasten clinical activity against other OC-related illnesses. Our study's strength is that it is the largest transcriptome investigation into OC. The transcriptomic analysis yielded shared ontological entities, pathways, illness connections, and transferrin genes. The genes shared between the OC datasets were identified, which led to additional molecular results and demonstrated the interaction of the DEGs. This research could aid in the development of therapeutic targets and therapies in the future.

Conflicts of Interest:
The authors declare no conflict of interest.