Common Genetic Aberrations Associated with Metabolic Interferences in Human Type-2 Diabetes and Acute Myeloid Leukemia: A Bioinformatics Approach

Type-2 diabetes mellitus (T2D) is a chronic metabolic disorder associated with an increased risk of developing solid tumors and hematological malignancies, including acute myeloid leukemia (AML). However, the genetic background underlying this predisposition remains elusive. We herein aimed to explore the genetic variants, related transcriptomic changes and disturbances in metabolic pathways shared by T2D and AML, utilizing bioinformatics tools and repositories, as well as publicly available clinical datasets. Our approach revealed that rs11709077 and rs1801282 on PPARG, rs11108094 on USP44, rs6685701 on RPS6KA1 and rs7929543 on AC118942.1 are susceptibility SNPs common to the two diseases and, together with 64 other co-inherited proxy SNPs, may affect the expression patterns of metabolic genes, such as USP44, METAP2, PPARG, TIMP4 and RPS6KA1, in adipose tissue, skeletal muscle, liver, pancreas and whole blood. Most importantly, a set of 86 AML/T2D common susceptibility genes was found to be significantly associated with metabolic cellular processes, including purine, pyrimidine, and choline metabolism, as well as insulin, AMPK, mTOR and PI3K signaling. Moreover, the whole blood of AML patients was found to exhibit deregulated expression of certain T2D-related genes. Our findings support the existence of common metabolic perturbations in AML and T2D that may account for the increased risk for AML in T2D patients. Future studies may focus on the elucidation of these pathogenetic mechanisms in AML/T2D patients, as well as on the assessment of certain susceptibility variants and genes as potential biomarkers for AML development in the setting of T2D. Detection of shared therapeutic molecular targets may reinforce the need for repurposing metabolic drugs in the therapeutic management of AML.


Introduction
Type-2 diabetes mellitus (T2D) is a chronic metabolic disorder, nowadays considered a global epidemic, with ever-increasing prevalence and high cardiovascular mortality rates [1]. Metabolic disturbances in T2D are associated with chronic hyperglycemia due to deficient insulin secretion by pancreatic β-cells and decreased insulin sensitivity in the skeletal muscle, liver, and adipose tissue [2]. During the last two decades, 85 genome-wide association studies (GWAS) have revealed 1894 single-nucleotide polymorphisms (SNPs) in 1294 genes involved in the aforementioned processes [3]. Interestingly, it was recently shown that certain T2D susceptibility genes exhibit deregulated mRNA expression in the peripheral blood of patients and predisposed individuals, possibly mirroring the aberrant regulation in disease-target organs [4].

Figure 1. (A) Venn diagrams reporting the number of common and specific SNPs significantly associated with AML or T2D, based on data downloaded from the NHGRI-EBI GWAS Catalog. (B) Violin plots depicting the impact of the five common SNPs on the expression levels of associated or other genes, in disease-associated tissues (subcutaneous or visceral adipose tissue, skeletal muscle, liver, pancreas, whole blood) (GTEx portal, May 2021). NES: normalized effect size.

Table 1. Information about the five common SNPs associated with both AML and T2D, as obtained upon search in the NHGRI-EBI Catalog of genome-wide association studies (GWAS) (May 2021) [3]. Variant ID, chromosomal location, cytogenetic region, mapped genes, risk alleles, p-values detected in each study, study accession numbers and the corresponding traits are reported.

Two of these SNPs (rs11709077, rs1801282) lie in the PPARG (peroxisome proliferator-activated receptor gamma) gene, with the following p-values: for rs11709077, 5 × 10⁻¹¹ for AML and 2 × 10⁻³⁶ for T2D, and for rs1801282, 5 × 10⁻¹¹ for AML and 2 × 10⁻¹⁹ for T2D. Another common SNP, rs6685701, is found in the gene encoding the ribosomal protein S6 kinase A1 (RPS6KA1) and exhibits a significant association with AML (p = 6 × 10⁻¹⁸) and T2D (p = 1 × 10⁻⁸). USP44 (Ubiquitin Specific Peptidase 44) also bears an SNP (rs11108094) significantly related to both AML and T2D development (p = 2 × 10⁻¹⁰ and 6 × 10⁻¹⁰, respectively). Last, rs7929543, located in AC118942.1 (NADPH oxidase 4 pseudogene), is also significantly associated with both AML (p = 7 × 10⁻⁹) and T2D (p = 2 × 10⁻⁹). It is important to note that all SNPs are in non-coding regions except rs1801282, which is a missense variant in PPARG, also known as Pro12Ala. The more common C allele encodes the Pro amino acid at the SNP position [20].

To investigate whether these genetic variants affect the expression levels of associated or other genes in disease-related tissues (adipose, skeletal muscle, liver, pancreas, whole blood), we searched for eQTLs through the GTEx and Blood eQTL Browser databases [21,22]. All results obtained are reported in Table 2. Moreover, graphical data from the GTEx portal are shown in Figure 1B; corresponding data from Blood eQTL Browser were not available.
Rs11709077 (allele: G/A; minor allele: A) and rs1801282 (G/C; minor: G), on the PPARG gene, were found to affect the mRNA expression levels of SYN2 (synapsin II) in the skeletal muscle (Figure 1B and Table 2) and whole blood (Table 2). In the skeletal muscle, the presence of the minor alleles correlates with increased SYN2 expression (normalized effect size (NES): 0.35 and 0.36 for rs11709077 and rs1801282, respectively) (Figure 1B and Table 2), whereas in the whole blood, they are correlated with decreased levels (z-score: −3.61, for both) (Table 2). In addition, rs1801282 was found to negatively impact the expression of the GATA3 transcription factor in whole blood (z-score = −4.54) (Table 2) and of TIMP4 (TIMP metallopeptidase inhibitor 4) (NES = −0.21) in visceral adipose tissue (Figure 1B and Table 2). The rs11108094 variant (C/A; minor allele: A) on USP44 was associated with decreased expression of METAP2 (methionine aminopeptidase 2) in subcutaneous and visceral adipose tissue (NES: −0.64 and −0.55, respectively) (Figure 1B and Table 2). Finally, in visceral adipose tissue, rs6685701 (A/G; minor allele: G) in RPS6KA1 negatively affects its own expression levels (NES: −0.099), while rs7929543 (A/C; minor allele: C) on AC118942.1 is positively associated with the expression levels of RP11-347H15.5 (clone-based (Vega) gene) (NES: 0.53) (Figure 1B and Table 2).

Table 2. eQTL associated with the five common disease susceptibility SNPs described in AML and/or T2D target tissues, as well as with their 64 proxies, as deposited in the GTEx project and Blood eQTL Browser. The SNP ID, SNP alleles, associated and affected genes and tissue(s), as well as corresponding p-values and the effect sizes, are reported.

Proxy SNPs of the Five Common AML/T2D Susceptibility SNPs
Apart from the SNPs directly identified to be associated with a disease, other co-inherited SNPs may also lead to its development [23]. Based on this, we searched for the proxy SNPs of the five common AML/T2D susceptibility SNPs, utilizing the LDLink tool [24]. The selection criterion for a proxy SNP was to possess a squared correlation measure (R²) of LD greater than or equal to 0.8. Data are shown in Figure 2 and Table 3. Sixty-six (66) unique proxy SNPs that lie in the USP44, METAP2, PPARG, TIMP4, FOLH1 (folate hydrolase 1), AC118942.1 and RPS6KA1 genes were identified; some of them were detected as proxies for more than one of the five common SNPs. Through this analysis, it was also revealed that two of the common AML/T2D susceptibility SNPs (rs1801282, rs11709077) on the PPARG gene were mutual proxy SNPs (Table 3; bold/italics highlighted). Moreover, Venn diagram analysis revealed that one of the 64 SNPs (rs11519597) is an AML-specific disease susceptibility SNP, while two of them (rs71304101, rs17036160) are T2D-specific disease susceptibility SNPs (data not shown).

Figure 2. Regional LD plots of five commonly associated SNPs generated using the LDLink web tool (May 2021). Each dot represents the pairwise LD level between two individual SNPs. X-axis depicts the chromosomal coordinates. Left y-axis represents the pairwise R² value with the query variant; an R² threshold greater than or equal to 0.8 was considered as a cut-off for selected proxies (blue dashed line). Right y-axis indicates the combined recombination rate (cM/Mb) from HapMap. Recombination rate is the rate at which the association between the two loci is changed. It combines the genetic (cM) and physical positions (Mb) of the marker by an interactive plot.
Furthermore, to pinpoint possible deregulation at the mRNA level attributed to the 64 proxy SNPs, we performed analysis using the GTEx and Blood eQTL databases for the identification of eQTLs in disease-affected tissues (Table 2).

Common Susceptibility Genes in AML and T2D
Beyond the identification of specific genetic variants associated with both AML and T2D, we proceeded to the detection of common susceptibility genes between the two disorders. Analysis using combined data from the GWAS Catalog and the GTEx portal showed that 86 genes bear SNPs that have been significantly associated with the development of both diseases, as per GWAS performed (Figure 3A). These include the five genes with common SNPs and another 81 genes bearing disease-specific SNPs. Notably, most of the genes contain a significantly higher number of SNPs associated with AML compared to T2D (Table 4).
To investigate whether these genes comprise eGenes, which have at least one eQTL located near the gene of origin (cis-eQTL) acting upon them, affected by AML- or T2D-specific SNPs in disease-target tissues, we searched through the GTEx portal and the Blood eQTL Browser. Analysis using Venn diagrams identified AML- or T2D-specific SNPs/eQTLs in certain susceptibility genes in adipose tissue, skeletal muscle, liver, pancreas and/or whole blood (Figure 3B). In adipose tissue, 6517 eQTLs on common AML/T2D susceptibility genes were detected, of which 79 were AML-specific and 8 T2D-specific. In skeletal muscle, 4220 were identified (28 AML-specific and 5 T2D-specific). In liver, 602 were detected (7 AML-specific and none T2D-specific). In pancreas, 3507 were found (36 AML-specific and 5 T2D-specific). Finally, in whole blood, 7187 were identified (55 AML-specific and 10 T2D-specific). A complementary analysis of the same data revealed the distribution of the AML- or T2D-SNPs/eQTLs in disease-target tissues and identified common and tissue-specific ones (Figure 3C and Table 5). All identified eQTLs affecting the 86 common disease susceptibility genes are included in Supplementary Table S2.

Figure 3. Common and disease-specific SNPs and eQTLs per target tissue. Venn diagrams reporting: (A) the number of common and disease-specific susceptibility genes between AML and T2D, (B) the numbers of AML- or T2D-specific SNPs that act as eQTLs upon the expression of common AML/T2D susceptibility genes, in adipose, skeletal muscle, liver, pancreas and whole blood, (C) the number of tissue-specific and common AML- or T2D-SNPs. Analysis was performed combining data from the NHGRI-EBI Catalog of GWAS and GTEx portal.

Table 4. Common genes with common or different disease susceptibility SNPs for AML and T2D, as analyzed using data downloaded from the NHGRI-EBI Catalog of human GWAS [3] (May 2021). Columns: gene symbol, full gene name, AML SNPs, T2D SNPs.

Table 5. AML- or T2D-specific SNPs that act as eQTLs on the 86 common AML/T2D susceptibility genes in a tissue-specific manner, as analyzed via the GTEx portal [21] (May 2021).

Pathway Analysis of the Proteins Encoded by the Common AML/T2D Susceptibility Genes
To investigate the possible involvement of the 86 common susceptibility genes in molecular networks correlated with both disorders, the developed gene/protein panel was further processed through the STRING and KEGG databases [25,26]. The following eGenes found to be affected by the five common susceptibility SNPs as well as by their proxies in disease-affected tissues were included in the analysis: DHDDS (Dehydrodolichyl Diphosphate Synthase Subunit), GATA3, METAP2, RP11-347H15.5, RPS6KA1, SYN2, TIMP4. The corresponding protein-protein interaction (PPI) network is depicted in Figure 4A. Analysis revealed that numerous proteins of the above set are significantly involved in metabolic pathways, including pyrimidine, purine and choline metabolism, in mTOR, AMPK, PI3K-Akt and insulin signaling, as well as in pathways deposited as related to AML (FDR < 0.05 for all) (Figure 4B and Table 6).
Differently colored nodes designate various genes/proteins involved in one or more pathways. Edges represent protein-protein associations: either known interactions, predicted interactions or other associations. All regulated pathways revealed in this analysis are included in Supplementary Table S3.

Table 6. Selected pathways significantly regulated by the set of 86 AML/T2D susceptibility genes plus seven eGenes affected by the five common AML/T2D susceptibility SNPs and their proxies, as analyzed upon processing in the STRING and KEGG databases [25,26]. Pathway IDs and description, number of susceptibility genes involved, number of background genes, their names as well as statistics (strength, FDR and log10 FDR) for each pathway are reported.

Investigation of Aberrant mRNA Expression of T2D-Deregulated Genes in an AML Cohort
The second aim of the study was to investigate the possible deregulation of T2D-related metabolic mechanisms in AML patients. To this end, we selected a panel of genes previously reported to be deregulated in T2D patients [4] (CAPN10, CDK5, CDKN2A, IGF2BP2, KCNQ1, THADA, TSPAN8) and explored their mRNA levels in peripheral blood samples from AML versus non-cancerous individuals utilizing RNA-seq data and the TNMplot web tool [27]. Significantly increased mRNA levels of CAPN10, CDK5, CDKN2A, IGF2BP2 and THADA, as well as significantly decreased levels of KCNQ1 and TSPAN8, were found in 151 AML patients compared to 407 normal individuals tested (Mann-Whitney p < 0.0004 for all). The percentage (%) of AML samples that displayed up- or downregulated expression for each of the above genes, at each quantile cut-off value (minimum, 1st quartile, median, 3rd quartile, maximum), as well as the specificity (the ratio of the number of AML samples to the sum of AML and non-cancerous samples over or below each given cut-off), are depicted in Figure 5.
To search for AML-specific SNPs on these deregulated genes, we used data obtained from the NHGRI-EBI Catalog of GWAS. It was found that the rs10832134 (chromosomal location: 11:2481256), rs12576156 (11:2477588) and rs11523905 (11:2477029) variants lie in KCNQ1 (p = 3 × 10⁻¹⁵ for all), while the rest of the deregulated genes have not been identified to bear AML-related SNPs. Investigation for their proxies revealed three proxy SNPs (rs12574553, rs757092, rs7126330) for rs10832134 and five proxy SNPs (rs73419519, rs7937273, rs7928116, rs179395, rs7542142) for rs12576156, all of them in KCNQ1. No proxies were found for rs11523905 (data not shown). Out of these, the proxy SNP rs12574553 (allele C/T) constitutes an eQTL for KCNQ1; the minor allele leads to the downregulation of mRNA levels in whole blood [21].

Discussion
Today, there is a well-accepted epidemiological link between T2D and cancer development [5]. However, compared with other types of human neoplasia, the association between T2D and hematological malignancies is less explored. Among them, AML represents one of the most intriguing morbidities for further investigation due to its increasing rates and relatively poor prognosis and response to treatment [10,28]. Accumulating clinical evidence connecting metabolic syndrome parameters (including BMI and T2D) to AML [9,11–16], together with corresponding in vitro data [17–19], highlights the need for investigation of the underlying mechanisms implicating genetic predisposition, which may regulate metabolic abnormalities.
In this study, we first aimed at the description of the possible common genetic background shared by the two disorders. Processing of the thousands of AML- and T2D-associated SNPs deposited in the GWAS NHGRI-EBI Catalog uncovered five SNPs that are significantly linked to both diseases (Table 1). Two of them (rs11709077, rs1801282) lie in the PPARG gene, the first gene reproducibly associated with T2D [29,30]. The gene encodes the PPAR-γ receptor, a molecular target of thiazolidinediones (insulin-sensitizing antidiabetic drugs); gene variants affecting its transcription levels in adipose tissue are associated with insulin sensitivity [29,30]. Although there are no data directly linking PPARG with AML, it is worth mentioning that the protein is implicated in the TGF-beta and mTOR signaling pathways, both associated with cancer development [31–33]. Our analyses also indicated that rs11709077 and rs1801282 on PPARG affect the expression of SYN2 (Synapsin II), with the minor alleles being associated with increased expression in skeletal muscle and decreased expression in whole blood (Table 2, Figure 1); however, there is not yet any evidence connecting SYN2 with T2D or AML.
The missense variant rs1801282 was additionally found to negatively regulate the expression of the tissue inhibitor of metalloproteinases 4 (TIMP4) in visceral adipose tissue. The TIMP family has been associated with several cancers [34], but no information about its relation to T2D is available yet. Another interesting observation regards the negative impact of rs1801282 on GATA3 in whole blood. GATA3 is a transcription factor with a multi-faceted role in hematopoiesis [35], while related genetic and epigenetic aberrations are strongly associated with AML development, prognosis and response to therapy [36,37]. Regarding T2D, GATA3 is considered an anti-adipogenic factor and a potential molecular therapeutic target for insulin resistance, through restoration of adipogenesis and amelioration of inflammation [38,39].
Rs6685701, located in the gene encoding the ribosomal protein S6 kinase A1 (RPS6KA1 or P90S6K), was found to be associated with lower expression levels of that gene in visceral adipose tissue. The protein belongs to the family of serine/threonine kinases that govern various cellular processes, and it acts downstream of ERK (MAPK1/ERK2 and MAPK3/ERK1) signaling [33]. In murine models of T2D, RPS6KA1 has been implicated in impaired glucose homeostasis in pancreatic β-cells, muscle and liver cells [40,41], which is improved upon sitagliptin (DPP-4 inhibitor; antidiabetic drug) administration [42]. Using an in vivo model of leukemia, RPS6KA1 has been shown to promote the self-renewal of hematopoietic stem cells and disease progression through the regulation of the mTOR pathway [43]. More importantly, it was very recently reported that RPS6KA1 may be a strong indicator of overall survival in AML patients, while aberrations in the miR-138-5p/RPS6KA1 axis are associated with poor prognosis among patients [44].
The rs11108094 variant in USP44 (ubiquitin-specific peptidase 44) was also recognized as a common susceptibility variant for AML and T2D, which acts as an eQTL downregulating the expression of METAP2 (methionyl aminopeptidase 2) in subcutaneous and visceral adipose tissue. The USP44 protein is implicated in protein metabolism and ubiquitin-mediated proteasome-dependent proteolysis. More importantly, METAP2 is involved in the metabolism of fat-soluble vitamins [33]. Its inhibition results in weight loss in obese rodents, dogs and humans, and it has been proposed as a therapeutic target against obesity [45]. On the other hand, METAP2 inhibitors have been shown to induce apoptosis in leukemic cell lines [46], which renders them potent therapeutic agents also for leukemia. Lastly, the rs7929543 variant on the AC118942.1 pseudogene was identified as an eQTL influencing the expression of the RP11-347H15.5 pseudogene in visceral adipose tissue. The involvement of this deregulation in possible pathogenetic processes for both diseases might be part of the complex underlying genetic-molecular mechanisms.
To describe the network of genetic variants' inheritance more extensively, we developed a panel of 64 unique proxy SNPs associated with the five common AML/T2D ones (Table 2). Interestingly, these proxies are found to lie within and/or be eQTLs for the aforementioned genes (PPARG, SYN2, TIMP4, GATA3, RPS6KA1, USP44, METAP2, AC118942.1, RP11-347H15.5) in disease-target tissues. A new eGene added to the panel was DHDDS, which is downregulated in whole blood by SNPs on RP11-347H15.5. The gene encodes the dehydrodolichyl diphosphate synthase subunit and is involved in pathways of protein metabolism and in N-glycan biosynthesis [33]. However, no direct data connecting the gene with neoplasias or diabetes have been reported to date.
Next, we identified a panel of 86 common AML/T2D susceptibility genes using the GWAS NHGRI-EBI Catalog (Figure 3). Several SNPs specific for each disease were found to impact the expression patterns of some of these common susceptibility genes in affected tissues, suggesting their possible functional involvement in disease development (Table 5). Pathway analysis revealed that the AML/T2D gene set regulates a series of metabolic pathways, with the highest significance observed for pyrimidine and purine metabolism. Although neither AML nor T2D is purely a disorder of pyrimidine and/or purine metabolism, there are data supporting their implication in the development of each disease. The effect of insulin on their regulation in the diabetic liver has been known for decades [47,48]. Nevertheless, it was very recently described that the signatures of purine metabolites, including betaine metabolites, branched-chain amino acids, aromatic amino acids, acylglycine derivatives and nucleic acid metabolites, are associated with hyperglycemia or insulin resistance [49,50]. While there is no recent evidence regarding a possible role for purine and pyrimidine metabolites in leukemia, older studies support the notion that reciprocal alterations in the phenotype of specific enzymes may occur in leukemia cells [51,52].
Choline metabolism is another pathway that emerged through gene set enrichment analysis. Indeed, its upregulation in malignant transformation is well described [53], while the serum metabolomic signature of AML patients includes parameters of aberrant choline metabolism [54]. A group of metabolic pathways, including those of carbohydrates, lipids, nucleotides, amino acids, glycans, cofactors, vitamins, biosynthesis of terpenoids, polyketides and other secondary metabolites [25], as well as signaling pathways related to metabolic disturbances and the development of neoplasia and T2D, such as mTOR, AMPK, PI3K-Akt and insulin signaling pathways, were also among the ontologies significantly regulated by the AML/T2D gene set. Analysis also revealed an association with a pathway category deposited as "Acute Myeloid Leukemia", which refers to ERK, PI3K and JAK-STAT signaling and transcription regulation pathways including mutated RUNX1 and the fusion genes AML1-ETO, PML-RARA and PLZF-RARA [33].
Finally, exploration through clinical datasets revealed that certain T2D-related genes, previously shown to be deregulated in T2D individuals [4], also exhibit altered transcript levels in AML patients. Expression levels of THADA (thyroid adenoma-associated protein), IGF2BP2 (insulin-like growth factor 2 mRNA binding protein 2), CDKN2A (cyclin-dependent kinase inhibitor 2A) and CDK5 (cyclin-dependent kinase 5) were upregulated, while levels of KCNQ1 (potassium voltage-gated channel subfamily Q member 1) were downregulated in the peripheral blood of AML patients compared to normal subjects. IGF2BP2, CDKN2A, CDK5 and KCNQ1 are known to be implicated in the mass development, proliferation, and insulin secretory function of β-cells, and in metabolic processes in T2D-affected tissues [3,20,55,56]. As for THADA, despite its association with T2D susceptibility, there are no data yet related to its involvement in the disease's pathogenesis and/or metabolic pathways [4]. However, chromosomal aberrations engaging this gene are observed in benign thyroid adenomas [57]. CAPN10 (calpain 10) shows increased whereas TSPAN8 (Tetraspanin 8) exhibits decreased mRNA levels in AML versus non-cancerous individuals, a trend opposite to what was observed in T2D versus healthy subjects. CAPN10 plays important roles in the translocation of glucose transporter 4 (GLUT4), secretion of insulin and apoptotic processes in pancreatic cells [57], while TSPAN8 has been described as a prognostic indicator for patients with certain solid tumors [58,59], but not for hematological malignancies.
In summary, this study provides, for the first time, evidence for a strong genetic network that is related to aberrations in metabolic processes and molecular pathways, shared between AML and T2D. Even though the metabolic vulnerability of AML cells and aberrant metabolic pathways observed in AML patients [54,60] have increasingly gained the attention of the research community, the genetic background leading to these metabolic disturbances had not yet been investigated. Data emerging from our study revealed that: (i) specific genetic variants (SNPs) associated with both AML and T2D, as well as their co-inherited proxy SNPs, mostly specific for each disease rather than common, can alter the gene expression patterns in disease-target tissues; (ii) common susceptibility genes and genes with altered expression may be linked to the development of AML or T2D through common (such as PPARG) or different mechanisms (such as GATA3) and (iii) common susceptibility genes can regulate metabolic pathways, which may be implicated in the pathogenetic mechanisms leading to the development of the two disorders. It should be noted, however, that the study has certain limitations, including that it exclusively analyzed in silico data and the fact that other parameters affecting the gene expression, such as epigenetic mechanisms, were not explored. Moreover, in the case of certain genes and their SNPs, i.e., those of PPARG and GATA3, their specific implication in AML and/or T2D development is not well documented. Therefore, it is yet difficult to provide a plausible explanation regarding their possible impact as risk factors for AML in the context of T2D. Lastly, it needs to be clarified that, although some of the reported SNPs are associated with certain genes involved in AML (such as RPS6KA1 and METAP2), the latter are not considered driver genes for AML initiation.
Despite these limitations, significant evidence emerging from this study can be further explored in future basic and clinical studies. For example, the common susceptibility genes revealed can be evaluated for their potential to serve as prognostic biomarkers of AML development in cohorts of T2D individuals. Moreover, in-depth exploration of the described metabolic pathways and involved genes may lead to a better understanding of the pathogenetic basis of the increased risk for AML development observed in individuals with T2D. Finally, detailed investigation of the common therapeutic targets identified may suggest that repurposing of metabolic drugs (e.g., DPP-4 inhibitors targeting RPS6KA1 or thiazolidinediones targeting PPAR-γ) could be exploited as a novel therapeutic strategy to enhance the anti-leukemic armamentarium.

Study Design
Our study was performed along two axes. (A) Detection of common genetic variants and deregulated pathways in T2D and AML: We first created a panel of SNPs associated with AML or T2D, upon an in-depth search in the NHGRI-EBI Catalog of published GWAS [3], to detect common disease susceptibility genes. Their proxy SNPs were also detected using the LDLink web tool [24]. To assess the possible impact of the common susceptibility SNPs and their proxies on gene mRNA expression, a combined search in the Genotype-Tissue Expression (GTEx) project [21] and the Blood eQTL Browser [22] was performed. Moreover, a panel of mutual genes bearing common or disease (AML or T2D)-specific SNPs was processed through pathway analysis using the STRING (Search Tool for the Retrieval of Interacting Genes/Proteins) database [26], to reveal associated molecular networks and biological processes. (B) Investigation of possible deregulated expression of T2D susceptibility genes in AML cohorts: A panel of T2D susceptibility genes that were previously described to exhibit aberrant mRNA levels in diabetic patients was explored for possible deregulated expression also in AML patients, using the TNMplot tool [27].

Development of the AML and T2D Susceptibility SNP Panels and Detection of Common SNPs
The panels of total susceptibility genes specific for AML and T2D were developed upon an in-depth search in the NHGRI-EBI GWAS Catalog [3]. All populations were considered for assessment. Common disease susceptibility genes were detected, generating Venn diagrams with the Draw-Venn-Diagrams online tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) (May 2021). A genome-wide statistically significant p-value lower than or equal to 5 × 10⁻⁸ was applied to detect the SNPs that were significantly associated with the diseases. Data regarding the prevalence of the SNPs of interest in the general population were obtained from the gnomAD browser [61].
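As an illustration only (not the pipeline used in this study), the thresholding and Venn step can be sketched as a filter followed by a set intersection. The record layout and the helper names `significant_snps` and `venn_partition` are hypothetical; the p-values for the five shared SNPs are those reported in the Results, whereas the rows named `example_*` are made-up placeholders.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold stated in the text


def significant_snps(associations, threshold=GENOME_WIDE_P):
    """Return the set of rsIDs whose p-value is at or below the threshold."""
    return {rsid for rsid, p_value in associations if p_value <= threshold}


def venn_partition(set_a, set_b):
    """Split two sets into (only_a, common, only_b), as a two-set Venn diagram does."""
    return set_a - set_b, set_a & set_b, set_b - set_a


# (rsID, p-value) pairs. Shared SNPs use the p-values reported in the text;
# rows named "example_*" are hypothetical placeholders for illustration.
aml = [
    ("rs11709077", 5e-11), ("rs1801282", 5e-11), ("rs6685701", 6e-18),
    ("rs11108094", 2e-10), ("rs7929543", 7e-9),
    ("example_aml_only", 1e-9), ("example_not_significant", 1e-5),
]
t2d = [
    ("rs11709077", 2e-36), ("rs1801282", 2e-19), ("rs6685701", 1e-8),
    ("rs11108094", 6e-10), ("rs7929543", 2e-9),
    ("example_t2d_only", 3e-12),
]

aml_only, common, t2d_only = venn_partition(significant_snps(aml),
                                            significant_snps(t2d))
print(sorted(common))
```

The same partition applies unchanged to gene symbols instead of rsIDs when common susceptibility genes, rather than SNPs, are of interest.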

Detection of Proxy SNPs
Proxy SNPs of disease susceptibility SNPs of interest were detected utilizing the LDLink tool [24]. LDLink interactively explores proxy and putatively functional variants/SNPs for a query/tag variant (±500 kilobases). The tool provides information about: (a) a squared correlation measure (R²) of linkage disequilibrium (LD); proxy SNPs are considered those having ≥80% possibility of coinheritance with the tag SNP, which corresponds to an R² value ≥ 0.8, and (b) the combined recombination rate (cM/Mb) from HapMap; the recombination rate is the rate at which the association between the two loci is changed. It combines the genetic (cM) and physical positions (Mb) of the marker by an interactive plot.
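For readers unfamiliar with the R² measure, the sketch below shows the standard textbook definition for two biallelic loci, r² = D² / (pA(1 − pA) pB(1 − pB)) with D = pAB − pA·pB, followed by the ≥ 0.8 selection rule. It is not LDLink code; the function names and the candidate list (including its R² values) are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


def select_proxies(candidates, r2_cutoff=0.8):
    """Keep the rsIDs whose r^2 with the query variant meets the cut-off."""
    return [rsid for rsid, r2 in candidates if r2 >= r2_cutoff]


# Two alleles that always travel together are in complete LD (r^2 close to 1).
print(ld_r_squared(p_ab=0.3, p_a=0.3, p_b=0.3))

# Hypothetical candidates with made-up r^2 values.
candidates = [("proxy_a", 0.98), ("proxy_b", 0.80), ("proxy_c", 0.79)]
print(select_proxies(candidates))
```

Because the cut-off is inclusive, a candidate with R² exactly equal to 0.8 is retained, matching the "greater than or equal to 0.8" rule stated for the LD plots.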

Detection of Expression Quantitative Trait Loci (eQTLs)
Expression quantitative trait loci (eQTLs), which explain variations in mRNA expression levels, related to the SNPs of interest were explored utilizing the GTEx portal and the Blood eQTL Browser [21,22]. Analysis was focused on the expression patterns in the total target tissues of the two diseases (as per their availability in the databases). These included adipose tissue (subcutaneous, visceral), skeletal muscle, liver, pancreas and whole blood.
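A minimal sketch of how such eQTL records can be tabulated and summarized by tissue is given below. The NES values are the GTEx effect sizes reported in the Results; the `Eqtl` record type and the helpers `direction` and `by_tissue` are hypothetical, and the sign of the NES is read as the direction of the minor-allele effect, following the description in the Results.

```python
from collections import namedtuple

# Hypothetical record type for one SNP-gene-tissue eQTL association.
Eqtl = namedtuple("Eqtl", "snp gene tissue nes")

# NES values as reported in the Results for the five common SNPs (GTEx).
EQTLS = [
    Eqtl("rs11709077", "SYN2", "skeletal muscle", 0.35),
    Eqtl("rs1801282", "SYN2", "skeletal muscle", 0.36),
    Eqtl("rs1801282", "TIMP4", "visceral adipose", -0.21),
    Eqtl("rs11108094", "METAP2", "subcutaneous adipose", -0.64),
    Eqtl("rs11108094", "METAP2", "visceral adipose", -0.55),
    Eqtl("rs6685701", "RPS6KA1", "visceral adipose", -0.099),
    Eqtl("rs7929543", "RP11-347H15.5", "visceral adipose", 0.53),
]


def direction(nes):
    """Translate the sign of a normalized effect size into a label."""
    if nes > 0:
        return "up"
    if nes < 0:
        return "down"
    return "none"


def by_tissue(eqtls, tissue):
    """List (snp, gene, direction) for every eQTL recorded in one tissue."""
    return [(e.snp, e.gene, direction(e.nes)) for e in eqtls if e.tissue == tissue]


for row in by_tissue(EQTLS, "visceral adipose"):
    print(row)
```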

Pathway Analysis
Analysis through the STRING [26] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [25] databases was performed to detect protein-protein interactions possibly regulated by a panel including: (i) proteins encoded by genes that bear disease susceptibility SNPs in both AML and T2D as well as (ii) proteins encoded by genes that are commonly affected by different AML-specific and T2D-specific SNPs. To filter significantly regulated pathways, a false discovery rate (FDR) < 0.05 was set as cut-off.
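To make the FDR cut-off concrete, the following sketch applies the Benjamini-Hochberg adjustment to a list of raw enrichment p-values and keeps the pathways with FDR < 0.05. This is an assumption about how the correction works in general, written from the published procedure and not taken from STRING's code; the pathway names and p-values are hypothetical placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_pathways(pathway_p, fdr_cutoff=0.05):
    """Map pathway name -> FDR for pathways below the cut-off."""
    names = list(pathway_p)
    fdr = benjamini_hochberg([pathway_p[name] for name in names])
    return {name: q for name, q in zip(names, fdr) if q < fdr_cutoff}


# Hypothetical raw p-values for four placeholder pathways.
raw = {"pathway_A": 0.001, "pathway_B": 0.01, "pathway_C": 0.03, "pathway_D": 0.2}
print(significant_pathways(raw))
```

With these placeholder inputs, the three smallest p-values remain below the 0.05 cut-off after adjustment and the fourth does not.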

Investigation of the Expression Patterns of T2D-Deregulated Genes in AML Clinical Cohorts
To explore possible variations in the mRNA expression levels of previously described T2D-deregulated genes [4] in patients with AML, the TNMplot tool was used [27]. In more detail, the analysis processed RNA-seq data from 151 AML patients versus 407 non-cancerous individuals, available in the database. The tool compared the expression levels of each gene in the two groups using the Mann-Whitney non-parametric test, reporting the p-value of significance and the fold-change between groups. Other information included (a) the percentage (%) of AML samples that exhibited up- or downregulated expression of query genes compared to non-cancerous samples, at each quantile cut-off value (minimum, 1st quartile, median, 3rd quartile, maximum), and (b) the specificity, defined as the ratio of the number of AML samples to the sum of AML and non-cancerous samples over or below each given cut-off.
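The quantities described above can be sketched on toy numbers as follows. This is not TNMplot code: the function names are hypothetical, the expression values are invented, and only the U statistic is computed (in practice a library routine such as `scipy.stats.mannwhitneyu` would supply the p-value). The specificity follows the definition given in the text for an upregulated gene, i.e., samples above a cut-off.

```python
def mann_whitney_u(x, y):
    """U statistic for x versus y: count of pairs with x_i > y_j, ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def fraction_above(values, cutoff):
    """Fraction of samples whose expression exceeds the cut-off."""
    return sum(v > cutoff for v in values) / len(values)


def specificity_above(tumor, normal, cutoff):
    """Tumor samples above the cut-off divided by all samples above it."""
    t = sum(v > cutoff for v in tumor)
    n = sum(v > cutoff for v in normal)
    return t / (t + n) if (t + n) else float("nan")


# Invented expression values for one upregulated gene.
tumor = [5, 6, 7, 8]
normal = [1, 2, 3, 6]

print(mann_whitney_u(tumor, normal))        # 14.5
print(fraction_above(tumor, 4))             # 1.0
print(specificity_above(tumor, normal, 4))  # 0.8
```

For a downregulated gene the comparisons would be reversed, counting samples below the cut-off instead of above it.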