A Network-Based Bioinformatics Approach to Identify Molecular Biomarkers for Type 2 Diabetes that Are Linked to the Progression of Neurological Diseases

Neurological diseases (NDs) are progressive disorders, the progression of which can be significantly affected by a range of common diseases that present as comorbidities. Clinical studies, including epidemiological and neuropathological analyses, indicate that patients with type 2 diabetes (T2D) have worse progression of NDs, suggesting pathogenic links between NDs and T2D. However, finding causal or predisposing factors that link T2D and NDs remains challenging. To address these problems, we developed a high-throughput network-based quantitative pipeline using agnostic approaches to identify genes expressed abnormally in both T2D and NDs, and thereby some of the shared molecular pathways that may underpin T2D and ND interaction. We employed gene expression transcriptomic datasets from control and disease-affected individuals and identified differentially expressed genes (DEGs) in tissues of patients with T2D and NDs when compared to unaffected control individuals. One hundred and ninety-seven DEGs (99 up-regulated and 98 down-regulated in affected individuals) that were common to both the T2D and the ND datasets were identified. Functional annotation of these DEGs revealed the involvement of significant cell signaling-associated molecular pathways. The overlapping DEGs (i.e., seen in both T2D and ND datasets) were then used to extract the most significant GO terms. We validated these results with gold benchmark databases and literature searching, which identified which genes and pathways had been previously linked to NDs or T2D and which are novel. Using protein-protein interaction analysis, we identified hub proteins in the pathways (including DNM2, DNM1, MYH14, PACSIN2, TFRC, PDE4D, ENTPD1, PLK4, CDC20B, and CDC14A) that have not previously been described as playing a role in these diseases.
To reveal the transcriptional and post-transcriptional regulators of the DEGs, we used transcription factor (TF) interaction analysis and DEG-microRNA (miRNA) interaction analysis, respectively. We thus identified the following TFs as important in driving expression of our T2D/ND common genes: FOXC1, GATA2, FOXL1, YY1, E2F1, NFIC, NFYA, USF2, HINFP, MEF2A, SRF, NFKB1, PDE4D, CREB1, SP1, HOXA5, SREBF1, TFAP2A, STAT3, POU2F2, TP53, PPARG, and JUN. MicroRNAs that affect expression of these genes include mir-335-5p, mir-16-5p, mir-93-5p, mir-17-5p, and mir-124-3p. Thus, our transcriptomic data analysis identifies novel potential links between ND and T2D pathologies that may underlie comorbidity interactions, links that may include potential targets for therapeutic intervention. In sum, our neighborhood-based benchmarking and multilayer network topology methods identified novel putative biomarkers that indicate how T2D and these neurological diseases interact, and pathways that, in the future, may be targeted for treatment.

Introduction
Type 2 diabetes (T2D) is a global health burden that affects hundreds of millions of people [1]. It is characterized by glucose dyshomeostasis, hyperglycaemia and insulin resistance, with predisposing factors that include obesity, poor quality diet, insufficient physical activity and genetic factors [2,3]. These factors interact to cause failure of circulating glucose level regulation, which can result in an inability to supply sufficient insulin and eventual beta-cell loss that exacerbates the condition [4,5]. Glucotoxicity caused by chronic hyperglycaemia induces injury in many cell types, hepatocytes and pancreatic cells in particular [6]. Hyperglycaemia classically causes vascular disease, damaging blood vessels and leading to a range of cardiovascular diseases. In addition, hyperglycaemia has a number of long-term effects that exacerbate impairments of central nervous system (CNS) function and cognitive function [7]. Metabolic changes seen in T2D patients lead to chronic CNS inflammation that contributes to neurodegeneration [8]. Other T2D-associated metabolic disturbances are associated with atrophy in several regions of the brain (e.g., the hippocampus) that in turn is associated with cognitive impairment [9]. The brain is a very insulin-sensitive organ, so insulin resistance itself can affect memory and learning [10]. Indeed, glucose levels affect neuronal maintenance, neurogenesis, neurotransmitter regulation, cell survival and synaptic plasticity [11]. Moreover, it is notable that neurodegenerative diseases are accompanied by high production of inflammatory mediators, oxidative stress, deoxyribonucleic acid (DNA) damage, and mitochondrial dysfunction, which in turn also contribute to the degenerative cascade and exacerbate insulin resistance [12]. T2D is also associated with excessive immune system activation [13].
Epidemiological studies show a particularly strong association between T2D and Alzheimer's disease (AD) [14]. AD is characterized by the accumulation of β-amyloid (Aβ) into neuritic plaques and the presence of intracellular aggregates of tau protein in neurofibrillary tangles, together with amylin deposition, synaptic loss, neuroinflammation, and neuronal death [14]. Although its etiology remains unclear, genetic predisposition and aging are strong risk factors for AD [22]. Moreover, Aβ and tau are present in the pancreas and insulin-sensitive tissues, where they can induce peripheral insulin resistance or disrupt insulin secretion, which may contribute to the incidence of AD [14].
A wealth of evidence indicates a link between T2D and amyotrophic lateral sclerosis (ALS) [15]. ALS is a disorder characterized by progressive muscular atrophy, cognitive impairment, and pyramidal deficit, due to the degeneration of upper and lower motor neurons [23]. Motor neuron loss in this disease is associated with mutations in the Cu/Zn superoxide dismutase (SOD1) gene. ALS patients have altered lipid and glucose metabolism with increased energy consumption [24] and hypermetabolism [25]. Nevertheless, the biological mechanisms linking T2D to ALS remain unclear, even though the two diseases clearly share important risk factors, including environmental factors, higher body mass index (BMI), elevated cholesterol level and hyperlipidemia [26].
T2D also shows associations with cerebral palsy (CP), a neurodevelopmental disorder characterized by permanent movement-related disabilities, evident in early life, due to abnormalities in the brain centres that control movement and balance [16]. Developmental issues in the fetus that lead to CP may be linked to maternal T2D [27], and maternal obesity and T2D might raise the risk of CP occurring [16]. Notably, children with CP may develop T2D as adults. The etiology of CP is unclear, but perinatal factors such as chorioamnionitis and hypoxic-ischemic encephalopathy, as well as brain injury occurring during the perinatal and postnatal periods, may contribute to CP [28].
T2D is also linked to epilepsy (ED) [17], a group of neurological diseases characterized by epileptic seizures. The precise relationship between T2D and epilepsy remains unclear. It is known that epilepsy or seizures are associated with autoimmune or inflammatory disorders and with pro-inflammatory processes [29]. Additionally, hyperglycemia may exert adverse effects on the central nervous system that lead to ED [30]. Known risk factors, including degenerative brain disorders, head injuries, stroke and dementia, are considered to predispose to ED [31].
Huntington's disease (HD) is a neurodegenerative disorder caused by an expanded CAG repeat in exon 1 of the huntingtin gene (HTT), which encodes the huntingtin protein, and is characterized by progressive disturbances of mood and motor function, and by cognitive dysfunction [32]. The pathogenetic mechanisms behind HD include misfolding and aggregation of the huntingtin protein, oxidative stress, impaired mitochondrial metabolism, excitotoxicity in affected brain regions, and impairment of the ubiquitin-proteasome system [32]. Although HD is primarily considered a rare neurodegenerative disorder, insulin resistance and impaired glucose metabolism contribute to its development [33].
Multiple sclerosis (MS) is a chronic inflammatory and progressive immune-mediated disease of the central nervous system (CNS), characterized by a selective and coordinated inflammatory destruction of the myelin sheath, with damage to the axon [34]. Inflammation, demyelination, and axonal degeneration are associated with MS. Moreover, insulin resistance may induce inflammatory responses and oxidative stress, and could exacerbate cognitive impairments in individuals with MS [19]. Though the links between T2D and the risk of MS are unclear, there are common genetic and environmental factors that contribute to MS [34].
Parkinson's disease (PD) is clinically characterized by severe motor symptoms that include postural instability, resting tremor, muscular rigidity, and slowness of movement, and pathologically characterized by the preferential loss of dopaminergic neurons [20]. PD features intracellular inclusions, known as Lewy bodies, rich in fibrillar α-synuclein (αS), a protein thought to be involved in synaptic vesicle recycling and docking [35]. Several epidemiological studies have shown that PD patients have impaired insulin signaling, insulin resistance, and hyperglycaemia, which suppress dopaminergic neuronal activity and decrease dopamine turnover, and so may contribute to the progression of PD [36].
In sum, there is good evidence of pathologically and clinically significant relationships between T2D and many NDs, but the association has not been widely examined. As the etiologies of T2D and NDs are complex and their risk factors tend to overlap, the biological basis and molecular mechanisms that underlie this link are still not well understood. Finding interactions between T2D and NDs is very difficult, but is of great interest in medical endocrinology. T2D and NDs are very complex diseases in terms of their clinical presentations, and because of this they are hard to study by conventional hypothesis-driven endocrinology research, despite their high clinical importance. Moreover, there is still a lack of bioinformatics studies addressing the relationship between T2D and NDs. The aim of this study was to identify such links between T2D and NDs, since understanding their nature could bring important insights into the mechanisms that underlie these diseases. This led us to employ a bioinformatics pipeline to analyze gene expression data from studies of disease-affected tissues for clues to the nature of the relationship between T2D and NDs.
Here, we focus on finding ND-associated differentially expressed genes (DEGs), molecular pathways and putative discriminative biomarkers that are common to both T2D and NDs. We subsequently validated the results against the literature and against gold benchmark, experimentally validated databases that include dbGaP, OMIM and OMIM Expanded.

Datasets Employed in This Study
We queried the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) for datasets [37]. The queries for each disease returned a number of datasets, but most were discarded for one of the following reasons: a sample size below our cutoff of at least 7; the lack of two conditions such as control vs. case or control vs. treated; duplication of another dataset; unsuitable formatting or an irrelevant experimental focus; RNA-seq rather than microarray data; or a non-human source organism. We selected the datasets that would minimise bias and noise for this type of analysis. This process identified 8 datasets, one each for T2D, AD, ALS, CP, ED, HD, MS, and PD, that were appropriate for our study.
We analyzed human gene expression datasets with accession numbers GSE23343 [38], GSE28146 [39], GSE833 [40], GSE31243 [41], GSE22779 [42], GSE1751 [43], GSE38010 [44], and GSE19587 [45], each containing control and disease-affected individuals. The T2D dataset (GSE23343) contained gene expression data obtained from liver biopsies of 10 T2D hyperglycemic patients and 7 normoglycemic controls using Affymetrix Human Genome U133 Plus 2.0 arrays. The AD dataset (GSE28146) was microarray data (also Affymetrix U133 Plus 2.0 arrays) on RNA from snap-frozen brain tissue in which white matter was excluded by laser capture methods so that only CA1 hippocampal gray matter was collected. The ALS dataset (GSE833), employing Affymetrix HuGeneFL Hu6800 arrays, was a study of postmortem spinal cord gray matter samples from 7 ALS patients and 4 controls. The CP dataset (GSE31243) was generated from 40 Affymetrix Human Genome U133A 2.0 microarrays of hamstring muscle samples from 20 controls (taken during tissue reconstruction) and 20 CP patients. The ED dataset (GSE22779) was a gene expression profile of peripheral blood mononuclear cells from 12 healthy controls and 4 individuals with epilepsy in a study of in vivo glucocorticoid treatment; only data from blood cells extracted before glucocorticoid treatment were used. The HD dataset (GSE1751) was taken from peripheral blood cells from 14 healthy controls and 12 symptomatic HD-affected individuals along with 5 presymptomatic HD patients; this also used U133A arrays. The MS dataset (GSE38010) comprised U133A array data from 5 MS patient brain lesions (identified histologically) compared with 2 brain tissue samples from control individuals. The PD dataset (GSE19587) was an analysis of affected brain areas from 12 postmortem brains of PD patients and 10 control samples of unaffected brain tissue using Affymetrix U133A Plus 2.0 arrays.

Preprocessing and Identification of Differentially Expressed Genes
We acquired gene expression microarray datasets from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO). All these datasets were generated by comparing diseased tissue against controls to identify differentially expressed genes (DEGs) associated with their respective pathology. To make the mRNA expression data from different platforms uniform and to avoid the problems of differing experimental systems, we normalized the gene expression data, comprising disease state and control data, by applying the Z-score transformation ($Z_{ij}$) to each ND gene expression profile using the following equation:

$$Z_{ij} = \frac{g_{ij} - \bar{g}_{i}}{\sigma_{i}}$$

where $g_{ij}$ is the expression magnitude of gene $i$ in sample $j$, and $\bar{g}_{i}$ and $\sigma_{i}$ are the mean and standard deviation of the expression of gene $i$ across samples. Such a transformation allows direct comparisons of gene expression values across samples and diseases. The datasets were normalized using the Robust Multi-Array Average expression measure (version 1.30.1) as implemented in the "affy" package (version 1.56.0) of the Bioconductor platform (version Rx64 3.3.0) in R. We performed the analysis of the microarray data using Linear Models for Microarray Data (Limma) [46]. An unpaired t-test statistic was used to identify genes differentially expressed in patients compared to normal samples. Moreover, to determine the statistical significance between groups, a two-way analysis of variance (ANOVA) with the false discovery rate (FDR) test was performed. Based on standard statistical criteria, we chose thresholds of an absolute log2 fold change (logFC) of at least 1 and a t-test p-value below 0.01: p-value < 0.01 and logFC > 1 for up-regulated genes, and p-value < 0.01 and logFC < −1 for down-regulated genes. Genes with significantly different expression were thus selected. Gene symbols and names were extracted for each disease, and gene symbol records with null or missing data were discarded. In this way we identified the unique genes that were over- or under-expressed in the NDs and in T2D.
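The normalization and thresholding steps described above can be illustrated with a minimal Python sketch. It is not the R/limma pipeline used in this study: the function names `zscore_rows` and `classify_deg` and the toy gene name are hypothetical, the sample standard deviation (n − 1 denominator) is an assumption, and the logFC and p-values are assumed to have been computed elsewhere (for example by limma).

```python
from statistics import mean, stdev

def zscore_rows(expression):
    """Z-score each gene across its samples: Z_ij = (g_ij - mean_i) / sd_i.

    expression: dict mapping a gene symbol to a list of expression values,
    one per sample. Uses the sample standard deviation (an assumption).
    """
    normalized = {}
    for gene, values in expression.items():
        mu = mean(values)
        sigma = stdev(values)
        if sigma == 0:
            # A gene with constant expression carries no signal; map it to zeros.
            normalized[gene] = [0.0] * len(values)
        else:
            normalized[gene] = [(v - mu) / sigma for v in values]
    return normalized

def classify_deg(log_fc, p_value, fc_cutoff=1.0, p_cutoff=0.01):
    """Label a gene 'up', 'down', or 'ns' using the thresholds quoted in the text."""
    if p_value < p_cutoff and log_fc > fc_cutoff:
        return "up"
    if p_value < p_cutoff and log_fc < -fc_cutoff:
        return "down"
    return "ns"

# Toy data with a hypothetical gene name, for illustration only.
toy_z = zscore_rows({"GENE_A": [1.0, 2.0, 3.0]})
```

Because the cutoffs are strict inequalities, a gene with logFC exactly equal to 1 is not selected in this sketch.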
We then compared the DEGs from the T2D dataset pairwise with those of our AD, ALS, CP, ED, HD, MS, and PD datasets to find DEGs common to T2D and each ND. Genes with the greatest magnitude of up- and down-regulation were selected from those common to the individual disease and T2D. We then applied neighborhood benchmarking and topological methods to show the associations between genes and diseases. A gene-disease network (GDN) was built to identify gene-disease connections, in which nodes can be either diseases or genes; such a network is represented as a bipartite graph with T2D at its center, and was constructed using Cytoscape v3.6.1 [47]. Diseases are associated when they share at least one significantly dysregulated gene. For gene-disease association, we consider a set of human diseases, denoted by $D$, and a set of human genes, denoted by $G$, and ask whether gene $g \in G$ is associated with disease $d \in D$. Moreover, if $G_i$ and $G_j$ are the sets of significantly up- and down-regulated genes associated with diseases $D_i$ and $D_j$, respectively, then the number of shared dysregulated genes ($n_{ij}^{g}$) associated with both $D_i$ and $D_j$ is defined as follows [48]:

$$n_{ij}^{g} = N(G_i \cap G_j)$$

The co-occurrence is the number of common genes between two diseases in the GDN, and common neighbors were identified using the Jaccard coefficient method [48], where the edge prediction score (association score) for the node pair is:

$$E(i, j) = \frac{N(G_i \cap G_j)}{N(G_i \cup G_j)}$$

where $G$ is the set of nodes and $E$ is the set of edges. We also applied the R software packages comoR [49] and POGO [50] to cross-check disease comorbidity associations.
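The shared-gene count and the Jaccard association score defined above reduce to set operations. The following Python sketch shows this on toy data; the function names and gene labels are hypothetical and are not taken from the study's R code.

```python
def shared_genes(genes_i, genes_j):
    """Return the dysregulated genes common to two diseases (G_i ∩ G_j)."""
    return set(genes_i) & set(genes_j)

def jaccard_score(genes_i, genes_j):
    """Association score E(i, j) = |G_i ∩ G_j| / |G_i ∪ G_j|."""
    set_i, set_j = set(genes_i), set(genes_j)
    union = set_i | set_j
    if not union:
        # Two empty gene sets share nothing; define the score as 0.
        return 0.0
    return len(set_i & set_j) / len(union)

# Toy DEG sets with hypothetical gene labels, for illustration only.
t2d_degs = {"GENE_A", "GENE_B", "GENE_C"}
nd_degs = {"GENE_B", "GENE_C", "GENE_D"}

common = shared_genes(t2d_degs, nd_degs)   # the two genes shared by both sets
score = jaccard_score(t2d_degs, nd_degs)   # 2 shared / 4 in the union

# Bipartite gene-disease edges: one edge per (disease, gene) pair.
gdn_edges = [("T2D", g) for g in sorted(t2d_degs)] + [("ND", g) for g in sorted(nd_degs)]
```

In the bipartite representation, two diseases are linked exactly when at least one gene node is adjacent to both of them, which is the condition `len(common) >= 1` here.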

Identification of Molecular Pathway and Gene Ontology
To obtain further insights into the molecular pathways and gene ontology (GO) terms of T2D that overlap with AD, ALS, CP, ED, HD, MS, and PD, we performed gene set enrichment analysis on the overlapping DEGs with EnrichR [51]. Pathways are central to organism responses to stimuli, and pathway-based analysis is a recently developed approach to understanding how complex diseases may be related to each other through their underlying molecular mechanisms [52]. GO is a conceptual model for the representation of gene functions and their relationship to gene regulation [53]. We considered 7 pathway databases, namely KEGG [54], Reactome [55], NCI-Nature [56], WikiPathways [57], BioCarta [58], Panther [59] and HumanCyc [60], and one GO domain, Biological Process (BP).
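To make the idea of pathway over-representation concrete, the sketch below computes a one-sided hypergeometric p-value for the overlap between a query gene list and a pathway gene set. This is a generic stand-in for the kind of statistic that enrichment tools are built around, not a reproduction of EnrichR's scoring; the function name, its parameters, and the numbers in the example are all hypothetical.

```python
from math import comb

def enrichment_p_value(n_background, n_pathway, n_query, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    Probability of seeing at least n_overlap pathway genes when n_query genes
    are drawn without replacement from n_background genes, of which n_pathway
    belong to the pathway.
    """
    max_overlap = min(n_pathway, n_query)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total

# Hypothetical example: 10 background genes, a 5-gene pathway, and a 5-gene
# query list that falls entirely inside the pathway.
p_all_in = enrichment_p_value(10, 5, 5, 5)

def is_enriched(p_value, cutoff=0.05):
    """Apply the p < 0.05 curation threshold quoted in the text."""
    return p_value < cutoff
```

In practice a multiple-testing correction would normally be applied across all pathways tested; that step is omitted here for brevity.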

Protein-Protein Interactions Analysis
Protein-protein interaction (PPI) networks represent the physical contacts between two or more protein molecules and are essential to all cell processes [61]. We generated PPI networks based on the physical interactions of the proteins encoded by the DEGs, using information from the STRING database [62] via Network Analyst with a confidence score of 900; proteins are represented by nodes and protein interactions by edges. Highly interacting proteins were identified from the PPI analysis using topological parameters, for example a degree greater than 15.
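The degree-based selection of highly interacting proteins can be sketched in a few lines of Python. The sketch assumes that the confidence score acts as a minimum cutoff on each scored edge (score ≥ 900) and that the degree is the number of distinct interaction partners; the function name, the edge format, and the protein labels are hypothetical, and the actual analysis was carried out with STRING and Network Analyst.

```python
from collections import defaultdict

def hub_proteins(edges, min_score=900, min_degree=15):
    """Return proteins whose degree exceeds min_degree, highest degree first.

    edges: iterable of (protein_a, protein_b, score) tuples. Edges scoring
    below min_score and self-loops are ignored (an assumption of this sketch).
    """
    neighbors = defaultdict(set)
    for protein_a, protein_b, score in edges:
        if score < min_score or protein_a == protein_b:
            continue
        neighbors[protein_a].add(protein_b)
        neighbors[protein_b].add(protein_a)
    hubs = [p for p, partners in neighbors.items() if len(partners) > min_degree]
    # Sort by descending degree, then alphabetically for a stable order.
    return sorted(hubs, key=lambda p: (-len(neighbors[p]), p))

# Toy network with hypothetical labels: one protein with 16 high-confidence
# partners, one low-confidence edge that is filtered out, and one extra edge.
toy_edges = [("HUB_1", f"PARTNER_{k}", 950) for k in range(16)]
toy_edges.append(("HUB_1", "WEAK_PARTNER", 400))
toy_edges.append(("PARTNER_0", "PARTNER_1", 990))

toy_hubs = hub_proteins(toy_edges)
```

Storing neighbours in sets means that an interaction reported twice, or in both directions, is counted once.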

Transcription Factors-microRNA Interactions Analysis
We studied the DEG-transcription factor (TF) and DEG-microRNA (miRNA) interactions to identify the regulatory biomolecules (i.e., TFs and miRNAs) that regulate the DEGs of interest at the transcriptional and post-transcriptional level. We used the JASPAR database to analyze the DEG-TF interactions [63], and took miRNA-DEG interactions from TarBase [64] and miRTarBase [65]. The topological analysis was performed via Network Analyzer in Cytoscape [47] and Network Analyst [66]. TFs were selected from the DEG-TF network if their degree was ≥20, and miRNAs were selected from the DEG-miRNA network if their degree was ≥15.
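Selecting regulators by degree amounts to counting, for each TF or miRNA, the number of distinct DEGs it targets in the bipartite regulator-gene network. A minimal Python sketch under that assumption follows; `select_regulators` and all regulator and gene labels are hypothetical, and the interaction pairs would in practice come from JASPAR, TarBase, or miRTarBase.

```python
from collections import Counter

def select_regulators(interactions, min_degree):
    """Keep regulators that target at least min_degree distinct genes.

    interactions: iterable of (regulator, target_gene) pairs.
    Returns a dict mapping each retained regulator to its degree.
    """
    unique_pairs = set(interactions)  # count each regulator-gene pair once
    degree = Counter(regulator for regulator, _target in unique_pairs)
    return {reg: deg for reg, deg in degree.items() if deg >= min_degree}

# Toy DEG-TF network with hypothetical labels: TF_X targets 20 genes (one pair
# is listed twice), TF_Y targets only 2.
toy_tf_edges = [("TF_X", f"GENE_{k}") for k in range(20)]
toy_tf_edges += [("TF_Y", "GENE_0"), ("TF_Y", "GENE_1"), ("TF_X", "GENE_0")]

selected_tfs = select_regulators(toy_tf_edges, min_degree=20)
```

The same function would apply to a DEG-miRNA edge list with `min_degree=15`.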

An Overview of the Analytical Approach
Our network-based, systematic and quantitative pipeline for evaluating gene expression in human disease comorbidities is summarized in Figure 1. The integrated pipeline was implemented in the R language, and the code is available on request. We developed the pipeline using GEOquery [67] for downloading GEO datasets and for expression set class transformation, limma [46] for identifying differentially expressed genes from microarray data, and genefilter [68] for filtering genes. The version of the software and packages used was R version 3. Our approach employs gene expression analyses, disease-gene association networks, signaling pathway mechanisms, gene ontology (GO) data, protein-protein interaction (PPI) networks, DEG-transcription factor (TF) interaction analysis, and DEG-microRNA (miRNA) interaction analysis to identify putative discriminatory biomarkers between T2D and NDs. Furthermore, we incorporated three gold benchmark verified datasets, OMIM (www.omim.org), OMIM Expanded, and dbGaP (www.ncbi.nlm.nih.gov/gap), to retrieve genes associated with known diseases and relevant disorders, in order to validate the proof of principle of our network-based approach.

Gene Expression Analysis
To identify and investigate the gene expression effects of T2D that may influence the progression of NDs, we analyzed the gene expression microarray data collected from the National Center for Biotechnology Information (NCBI). Based on significant p-values, we found 1320 DEGs for T2D with a false discovery rate (FDR) of 0.01 and an absolute logFC of 1 using the R Bioconductor package Limma. Similarly, we identified the most significant DEGs for each ND after statistical analysis: 1606 DEGs in AD, 2901 in ALS, 588 in CP, 1887 in ED, 1338 in HD, 7463 in MS and 1558 in PD, as shown in Table 1. The cross-comparison analysis also identified common DEGs between T2D and each ND. We found that T2D shares 5, 5, 11, 15, 7, 35 and 21 significantly up-regulated genes and 12, 25, 6, 16, 7, 29 and 3 significantly down-regulated genes with AD, ALS, CP, ED, HD, MS, and PD, respectively. To find statistically significant associations between T2D and the NDs, we built up- and down-regulated diseasome relationship networks centered on T2D, in which a link is drawn between a disease and a gene when mutations in that gene are known to lead to the specific disease, and two diseases are considered comorbid if they share associated genes (Figures 2 and 3).
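As a quick consistency check, the per-disease counts of shared DEGs reported above can be summed and compared with the totals quoted in the abstract (99 up-regulated, 98 down-regulated, 197 overall). The short Python sketch below does only this arithmetic; the variable names are illustrative.

```python
diseases = ["AD", "ALS", "CP", "ED", "HD", "MS", "PD"]

# Numbers of DEGs shared with T2D, in the order listed in the text.
shared_up = [5, 5, 11, 15, 7, 35, 21]
shared_down = [12, 25, 6, 16, 7, 29, 3]

per_disease = {
    name: {"up": up, "down": down, "total": up + down}
    for name, up, down in zip(diseases, shared_up, shared_down)
}

total_up = sum(shared_up)
total_down = sum(shared_down)
total_shared = total_up + total_down

# The disease sharing the most dysregulated genes with T2D in these counts.
most_shared = max(per_disease, key=lambda name: per_disease[name]["total"])
```

These sums count disease-gene pairs, so a gene shared between T2D and more than one ND would contribute once per disease.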

Pathway and Functional Association Analysis
We performed pathway analysis to identify how complex diseases are interrelated through their underlying molecular mechanisms. We performed gene set enrichment analysis with EnrichR [51], testing the DEGs common to T2D and each ND against 7 pathway databases. We also performed regulatory analysis to gain more insight into the molecular pathways involved in these comorbidities. We pinpointed overrepresented pathways among the DEGs common to T2D and NDs and classified them into functional categories. Pathways deemed significantly enriched in the common DEG sets were reduced by manual curation to include only those with a p-value below 0.05. The significant pathways retrieved by EnrichR for the DEGs of T2D and NDs are shown in Table 2.
Table 2. Pathways common to T2D and the NDs revealed by the commonly expressed genes. These include significant pathways common to (a) T2D and AD (b) T2D and ALS (c) T2D and CP (d) T2D and ED (e) T2D and HD (f) T2D and MS and (g) T2D and PD.
Among the identified pathways, we found the following: the cytokine-cytokine receptor interaction pathway, associated with adaptive inflammatory host defenses, cell growth, differentiation, and cell death [54]; the glycosphingolipid biosynthesis pathway, associated with the expression of amphipathic lipids that are abundant in the nervous system [54]; the ubiquitin proteasome pathway, associated with immune response and inflammation, neural and muscular degeneration, morphogenesis of neural networks, and response to stress and extracellular modulators [59]; the ionotropic glutamate receptor pathway, associated with the mediation of the majority of excitatory synaptic transmission throughout the central nervous system and with synaptic plasticity [59]; the glutamate neurotransmitter release cycle, which maintains the neurotransmitter glutamate in the central nervous system [55]; the transmission across chemical synapses pathway, associated with communication between neurons and muscle or gland cells [55]; the glutamatergic synapse pathway, associated with the regulation of several neuronal functions, such as neuronal migration, excitability, plasticity, long-term potentiation (LTP) and long-term depression (LTD) [57]; the neuronal system pathway, which concerns communication and functional connections among the at least 100 billion neurons of the nervous system [55]; the cell adhesion molecules (CAMs) pathway, which has a vital role in the development and maintenance of the nervous system [54]; the electric transmission across gap junctions pathway, associated with communication between neurons in the nervous system [55]; the transmission across electrical synapses pathway, associated with the mechanical conductive link between two neighboring neurons [55]; the spinal cord injury pathway, associated with the loss of muscle function, sensation, or autonomic function in parts of the body [57]; the neurotrophic factor-mediated Trk receptor signaling pathway, associated with neuronal differentiation, survival and growth [56]; the neurotrophin signaling pathway, associated with differentiation and survival of neural cells [54]; the adipocytokine signaling pathway, associated with inflammation, coagulation, fibrinolysis, insulin resistance, diabetes and atherosclerosis [54]; and the brain-derived neurotrophic factor (BDNF) signaling pathway, associated with growth, differentiation, plasticity, and survival of neurons. BDNF is also implicated in various neuronal disorders such as Alzheimer's disease and Huntington's disease [57].
To obtain further insights into their molecular roles and biological significance, the enriched common DEG sets were processed by GO methods using EnrichR, which identifies related biological processes (BP) so that they can be grouped into functional categories. The list of processes and terms was then curated to include only those terms with a p-value below 0.05. The cell processes thus identified are summarized in Table 3.

Protein-Protein Interactions (PPIs) Analysis
Using our enriched common disease gene sets, we constructed putative PPI networks from the 159 distinct DEGs with the web-based visualization resource STRING via Network Analyst, using a confidence score of 900, as shown in Figure 4. The PPIs make up the so-called interactome of the organism, in which anomalous PPIs cause multiple diseases. Two diseases are considered related when they share one or more associated protein subnetworks. Highly interacting proteins were identified from the PPI analysis using topological parameters, for example a degree greater than 15. The simplified PPI networks were generated with the Cyto-Hubba plugin [69] to identify the most significant hub proteins, as shown in Table 4, and the topological parameters were determined by Network Analyzer in Cytoscape [47], as shown in Figure 5.
These data provide evidence that PPI subnetworks exist in our enriched gene sets and confirm the inclusion of relevant functional pathways. The identified hub proteins could be useful therapeutic targets, although their roles need further characterization. A summary of the hub proteins is shown in Table 4.
Figure 5. The simplified PPI network built with the significantly dysregulated genes common to type 2 diabetes (T2D) and neurological diseases (NDs) to identify the 10 most significant hub proteins, marked in red, orange and yellow.
Table 4. Summary of hub proteins identified by protein-protein interaction analysis encoded by DEGs that are common to type 2 diabetes (T2D) and neurological diseases (NDs).

Identification of Transcriptional and Post-Transcriptional Regulators of the Differentially Expressed Genes
TFs are proteins that regulate transcription and gene expression in all living organisms, and they play a vital role in all cellular processes [70]. miRNAs are short RNA species involved in the post-transcriptional regulation of gene expression. They are important biological regulators of, for instance, neuronal differentiation, neurogenesis, and synaptic plasticity, and they play vital roles in neurodegenerative diseases [71].
The miRNAs mir-335-5p, mir-16-5p, mir-93-5p, mir-17-5p, and mir-124-3p were identified, providing a more detailed picture of how the DEGs are regulated at the post-transcriptional level. A summary of the transcriptional and post-transcriptional regulatory biomolecules of the differentially expressed genes that are common to T2D and NDs is shown in Table 5.

Validating Potential Targets Using Gold Benchmark Databases and Literature
First, to validate our identified potential targets, we used the OMIM, OMIM Expanded, and dbGaP datasets, which collect curated and validated gene-disease association data from the literature. In the validation, we present the combined relations from the OMIM, OMIM Expanded, and dbGaP databases. To evaluate the validity of our work, we provided the statistically significant DEGs (genes common to T2D and the NDs) to the online tool EnrichR [51] and collected the enriched genes and their corresponding neurological disease names from the OMIM, OMIM Expanded, and dbGaP databases. To find significant NDs, we applied manual curation with a p-value threshold of 0.05. Diseases such as cancer and infectious diseases were then removed from the list because they are not of interest in this study.
We also validated the identified potential targets by searching the biomedical literature for genes used clinically as biomarkers for any of the NDs. Van Cauwenberghe et al. [72] identified the MS4A2 gene as associated with AD, and Munshi et al. [73] identified the CR1 gene as associated with AD. Eykens et al. [74] identified the APOE gene as associated with ALS, and Fahey et al. [75] identified the TENM1 gene as associated with CP. The UBE3A and CHD2 genes are associated with ED [76,77]. Arning et al. [78] identified the UCHL1 gene as associated with HD, Baranzini et al. [79] identified the HLA-DRB1 gene as associated with MS, and Redenšek et al. [80] identified the HLA-DQB1 gene as associated with PD. This indicates that the significant genes from our analyses of NDs match existing records. We then used Cytoscape to construct a Gene-Disease Network (GDN) from the genes and their associated neurological diseases found in the gold benchmark databases and the literature. In this network, a link between a gene and a disease indicates that a mutation in the gene is known to lead to that disease; the network is shown in Figure 8.
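As a rough illustration of how such a gene-disease network can be assembled, the sketch below builds a bipartite edge list from the literature-derived gene-disease pairs named in this paragraph and writes it as tab-separated lines in Cytoscape's simple interaction format (SIF). The relation label `associated_with` and the helper names are hypothetical, and the full network in Figure 8 also includes database-derived associations that are not reproduced here.

```python
from collections import defaultdict

# Gene-disease pairs as cited in the text (disease abbreviations as used there).
gene_disease_pairs = [
    ("MS4A2", "AD"), ("CR1", "AD"), ("APOE", "ALS"), ("TENM1", "CP"),
    ("UBE3A", "ED"), ("CHD2", "ED"), ("UCHL1", "HD"),
    ("HLA-DRB1", "MS"), ("HLA-DQB1", "PD"),
]

def build_network(pairs):
    """Map each disease to the set of genes linked to it."""
    disease_to_genes = defaultdict(set)
    for gene, disease in pairs:
        disease_to_genes[disease].add(gene)
    return dict(disease_to_genes)

def to_sif(pairs, relation="associated_with"):
    """Format edges as 'source<TAB>relation<TAB>target' lines (SIF)."""
    return [f"{gene}\t{relation}\t{disease}" for gene, disease in pairs]

network = build_network(gene_disease_pairs)
sif_lines = to_sif(gene_disease_pairs)

for disease in sorted(network):
    print(disease, sorted(network[disease]))
```

The `sif_lines` list can be written to a text file and imported into Cytoscape as a network, with genes and diseases then styled as two node classes.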

Discussion
T2D and NDs are complex diseases, but we have attempted here to take advantage of this complexity by examining the pathways and interactions that link type 2 diabetes (T2D) and neurological diseases, given their great clinical importance. T2D is known to affect neurological diseases, but how it does so is generally unclear (though some vascular mechanisms are usually considered), and the question is very hard to study by hypothesis-driven biochemical or endocrinological research. We therefore employed well-established bioinformatics methods and analytical approaches that examine functional disease overlaps in genes and pathways. These provide an important but agnostic tool for identifying new factors that play a part in these comorbidity interactions and that, by implication, may be important pathogenic mechanisms for these and other related diseases. We studied microarray gene expression datasets from publicly available repositories using a network-based bioinformatics pipeline. We identified DEGs common to T2D and NDs and used these disease-common DEGs to construct diseasome networks that provide insights into the interactions of these comorbidities. The DEGs also enabled identification of associated dysregulated molecular pathways and related GO terms. One technical point is that the large number of pathways and GO categories obtained was first filtered using a p-value threshold of 0.05 and then reduced by manual curation. We identified pathways by investigating cell proteins (i.e., gene products) and their interactions across seven pathway databases.
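The idea of disease-common DEGs can be sketched as a simple set operation. The example below is a minimal illustration, not the pipeline used in this study: the gene names, disease labels, and fold-change values are hypothetical, and requiring the same direction of dysregulation in both diseases is an assumption of the sketch.

```python
# Hypothetical DEG tables: gene -> log2 fold change (illustrative values only).
t2d_degs = {"GENE_A": 1.8, "GENE_B": -1.2, "GENE_C": 2.1, "GENE_D": -0.9}
nd_degs = {
    "ND_1": {"GENE_A": 1.1, "GENE_B": -1.6, "GENE_E": 1.4},
    "ND_2": {"GENE_C": -1.3, "GENE_D": -1.0, "GENE_A": 1.5},
}

def direction(log_fc):
    """Label a fold change as up- or down-regulated."""
    return "up" if log_fc > 0 else "down"

def common_degs(reference, other):
    """Return genes dysregulated in the same direction in both DEG tables."""
    shared = {}
    for gene in reference.keys() & other.keys():
        if direction(reference[gene]) == direction(other[gene]):
            shared[gene] = direction(reference[gene])
    return shared

# One overlap set per neurological disease, each compared against T2D.
overlap = {nd: common_degs(t2d_degs, degs) for nd, degs in nd_degs.items()}

for nd, genes in sorted(overlap.items()):
    print(nd, dict(sorted(genes.items())))
```

Here `GENE_C` is dropped for `ND_2` because it is up-regulated in the T2D table but down-regulated in the ND table. The number of shared genes per disease pair is the kind of quantity that can serve as an edge weight in a diseasome network.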
In addition to the pathways and GO terms, we investigated interactors of the products of our DEGs of interest using protein-protein interaction (PPI) analysis and hub protein identification, as well as DEG-TF and DEG-miRNA interactions, which have not previously been studied for these diseases. These analyses provide information about the molecules that could be key drivers of pathogenesis for these comorbidities. We used the STRING database, which contains known protein-protein interactions, to identify the PPIs for the products of our genes of interest, and we considered only experimentally verified PPI data, not predicted PPIs. We reconstructed the PPI network from the DEGs common to T2D and NDs and identified the central hub proteins using topological parameters. Among the hub proteins we identified, dynamin (DNM1) has been implicated in the central nervous system [81] and DNM2 is involved in Charcot-Marie-Tooth neuropathy [82]. However, no role for MYH14 in diabetes or neurodegenerative diseases has been reported. PACSIN2 expression has been found to be upregulated in diabetic kidney disease [83]. Borie et al. studied the polymorphism of the TFRC gene in Parkinson's disease [84]. Rahman et al. identified PDE4D as commonly expressed in blood cells and brain tissues in AD [85]. A mutation of ENTPD1 has been identified in spastic paraplegia type 64 in individuals with suspected neurodegenerative disease [86]. To date, no role for PLK4, CDC20B, or CDC14A in NDs has been reported.
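Hub identification from a PPI network can be sketched as a ranking of nodes by a topological score. The example below assumes node degree as that score, which is only one of several possible topological parameters and may differ from the measure used in this study. The edge list and protein labels are hypothetical placeholders, not STRING data.

```python
from collections import defaultdict

# Hypothetical undirected PPI edges (placeholder protein labels).
edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"),
    ("P4", "P5"), ("P5", "P6"), ("P1", "P5"),
]

def degree_table(edge_list):
    """Count distinct interaction partners per protein, ignoring self-loops."""
    neighbours = defaultdict(set)
    for a, b in edge_list:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(adjacent) for node, adjacent in neighbours.items()}

def top_hubs(edge_list, n):
    """Return the n highest-degree proteins; ties are broken by name."""
    degrees = degree_table(edge_list)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))[:n]

hubs = top_hubs(edges, 2)
print(hubs)
```

With a real network, the same ranking with `n = 10` would give a list comparable in form to the 10 hub proteins reported here, although other centrality measures can reorder the candidates.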
Transcription factors (TFs) are critical determinants of the transcription of their target genes, so their levels may serve as potential biomarkers for neurodegenerative diseases. In this study, we used TF-mRNA interaction networks to identify TFs that regulate the DEGs and are relevant to the pathogenesis of T2D and NDs. TF association has also been used by Rahman et al. in a network-based method to profile the expression of DEGs associated with AD; they identified a number of AD-associated TFs, including JUN, YY1, E2F1, FOXC1, GATA2, SRF, USF2, PPARG, FOXL1, and NFIC [87], consistent with the present study. In contrast to TFs, microRNAs (~22 nt long) act post-transcriptionally to regulate expression. They are single-stranded RNAs that bind target mRNAs, leading to cleavage of the target and reduced expression. miRNAs have advantages as non-invasive biomarkers because they can be detected in body fluids such as urine and saliva, which makes them potentially attractive as biomarkers. Indeed, some miRNAs show good potential as biomarkers for neurodegenerative diseases [88]. We therefore studied DEG-miRNA interaction networks to identify relevant miRNAs as potential targets for NDs. Among the identified miRNAs, miRNA-335 was particularly associated with AD [71]. In addition, miRNA-16 was reported to be involved in apoptosis in neural cells [89], and down-regulation of this miRNA is involved in the accumulation of amyloid precursor protein (APP) in AD [90]. miRNA-124 is abundant in neural cells and has known involvement in NDs; reduced expression of this miRNA is associated with AD, PD, and HD [88]. There is no evidence for an association of miR-17-5p with NDs, although it can both enhance and suppress tumour development depending on the cellular context [91].
The above indicates that our approach has the potential to reveal some of the important mechanisms that underlie disease pathogenesis, to provide novel hypotheses about disease mechanisms, and to identify new biomarkers. Such genetic data analyses will be a key element in the development of predictive medicine and in elucidating the mechanisms that connect T2D and NDs, and they may indicate possible new drug targets. Nevertheless, our study has some limitations. There is as yet no clinical confirmation of the roles of the proteins encoded by our identified genes of interest. Furthermore, the number of samples was low for some of the diseases analyzed, so the datasets used to determine the common DEGs may not fully sample the disease-associated genes. Thus, further experimentation is needed to properly evaluate the biological significance of the potential target candidates identified in this study.

Conclusions
The present study analyzed transcriptomics datasets of T2D and neurodegenerative diseases, employing a multi-omics approach to decode the overlapping genes that were differentially expressed in both T2D and NDs. Gene set enrichment analysis revealed significantly enriched dysregulated pathways. Integration of the overlapping DEGs with different biomolecular interaction networks yielded 10 hub proteins from protein-protein interaction analysis, regulatory TFs from DEG-TF interaction analysis, and miRNAs from DEG-miRNA interaction analysis. All of these hub genes and pathways are novel, that is, they have not previously been shown to be important in these diseases or in the disease interactions, and most of the TFs and miRNAs identified are also novel. The present study thus presents molecular signatures at the protein level (i.e., hub proteins and TFs), the RNA level (i.e., mRNAs and miRNAs), and the pathway and GO levels, but further studies are needed to establish them as biomarkers. These results indicate differentially expressed genes of T2D that may be key to the progression of NDs and may give new insights into these diseases. They also point the way to identifying mechanistic links between T2D and various NDs that may explain the association of these NDs with T2D. This study also suggests that T2D shares with NDs several common multifactorial degenerative biological processes that contribute to neuronal death, which may, in turn, lead to functional impairment. Additionally, we believe that our high-throughput transcript analysis of tissues using rigorous agnostic approaches could allow the discovery of disease-modifying therapeutic targets. Treatments aimed at attenuating the identified dysregulated pathways have the potential to ameliorate neurological dysfunction in patients with T2D.