Integrative Identification of Genetic Loci Jointly Influencing Diabetes-Related Traits and Sleep Traits of Insomnia, Sleep Duration, and Chronotypes

Accumulating evidence suggests a relationship between type 2 diabetes mellitus and sleep problems. A comprehensive study is needed to decipher whether shared polygenic risk variants exist between diabetic traits and sleep traits. Methods: We integrated summary statistics from different genome-wide association studies and investigated overlap in single-nucleotide polymorphisms (SNPs) associated with diabetes-related traits (type 2 diabetes, fasting glucose, fasting insulin, and glycated hemoglobin) and sleep traits (insomnia symptoms, sleep duration, and chronotype) using a conditional/conjunctional false discovery rate approach. Pleiotropic genes were further evaluated by differential expression analysis, and we assessed the effects of their expression patterns on type 2 diabetes by Mendelian randomization (MR) analysis. Results: We observed extensive polygenic pleiotropy between diabetic traits and sleep traits. Fifty-eight independent genetic loci jointly influenced the risk of type 2 diabetes and the sleep traits of insomnia, sleep duration, and chronotype. The strongest shared locus between type 2 diabetes and sleep traits was FTO (lead SNP rs8047587). This locus was shared among type 2 diabetes (z score, 16.19; P = 6.29 × 10^−59) and two sleep traits, sleep duration (z score, −6.66; P = 2.66 × 10^−11) and chronotype (z score, 7.42; P = 1.19 × 10^−13). Two of the pleiotropic genes, ENSA and PMPCA, were validated to be differentially expressed in type 2 diabetes, and PMPCA showed a slight protective effect on type 2 diabetes in MR analysis. Conclusions: Our study provided evidence for the polygenic overlap between diabetic traits and sleep traits, in which the expression of PMPCA may play a crucial role, and supports a hazardous effect of being an “evening” person on diabetes risk.


Introduction
Type 2 diabetes mellitus is a complex disease induced by a combination of environmental and genetic factors. It is estimated that approximately 463 million adults aged 20-79 years suffer from diabetes globally, a number expected to surge to 700 million by 2045 [1,2]. Complications of diabetes seriously affect the physical health of patients, lead to a heavy burden of disability and mortality, and consume substantial social resources [1]. Genome-wide association studies (GWAS) have identified more than 500 susceptibility loci that demonstrate a robust association with type 2 diabetes [3]. In contrast to the tremendous strides in GWAS research, progress on the conundrum of "missing heritability" in type 2 diabetes has been slow and arduous. The identified genetic variants explain only 19% of the familial clustering of type 2 diabetes [4,5].
An extensive overview of pleiotropy and genetic architecture showed that 90% of trait-associated loci overlap with loci from multiple traits [6]. Combining GWAS from multiple phenotypes provides insights into genetic pleiotropy and could elucidate shared pathobiology [7]. The conjunctional false discovery rate (conjFDR), an extension of the conditional false discovery rate (condFDR), is one such approach. It boosts GWAS discovery by leveraging auxiliary genetic information: in cross-trait analysis, overlapping SNP associations between separate GWAS are used to rerank the test statistics in a primary phenotype conditional on the associations in a secondary phenotype [8,9]. This method is a model-free strategy for the analysis of GWAS summary statistics inspired by the empirical Bayes statistical framework, which is designed for situations with dense elements, such as the large number of small genetic effects seen in polygenic traits and disorders [8,9].
Accumulating evidence suggests that sleep traits, such as insomnia and chronotype, may have important effects on the development of type 2 diabetes. Insomnia disorder is the second-most prevalent mental disorder, with prevalence estimates ranging from 10% (adults) to 22% (elderly), and is characterized by lasting problems falling asleep or waking up in the night or early morning, with subjective repercussions for daytime functioning [10]. The adverse effect of insomnia on type 2 diabetes risk was verified by multiple observational studies and Mendelian randomization studies [11-13]. A 12-day inpatient General Clinical Research Center study found that sleep restriction significantly reduces insulin sensitivity [14], and simple sleep interventions such as sleep extension are associated with improvements in fasting insulin sensitivity [15]. In addition to the above epidemiological evidence, GWAS have provided new insights into the complex genetic mechanisms linking type 2 diabetes and sleep traits. Polygenic risk scores for sleep duration obtained from GWAS summary statistics are associated with an increased likelihood of various metabolic traits [16]. There is also a correlation between genetic risk factors for insomnia and the risk of type 2 diabetes (rg = 0.20) [17]. The chronotype of an individual refers to the specific entrainment and/or activity-rest preference of that individual in a given 24-h day [18]. It can be denoted as circadian topology or diurnal preference and may manifest as measures of the timing of actual sleep-wake behaviors or preference for sleep-wake timing under idealized conditions [19]. Early risers who are preferentially active in the mornings are said to have a morning chronotype and are often dubbed larks, whereas late risers with more nocturnal activities have an evening chronotype and are popularly dubbed owls. The literature suggests that circadian rhythms are important to weight regulation and metabolism. Suggested mechanisms include dietary behavior, appetite-stimulating hormones, and glucose metabolism [20]. Therefore, identifying genetic influences shared with sleep traits can provide biological insights into type 2 diabetes and uncover shared biological underpinnings. A comprehensive study is needed to decipher whether shared polygenic risk variants exist between diabetic traits and sleep traits, which is essential to unveil the genetic mechanisms of type 2 diabetes and to promote early prevention and therapy.
In this study, we investigated the polygenic overlap between type 2 diabetes and sleep traits using the conjFDR approach and focused on pleiotropic genes. In order to better understand type 2 diabetes pathophysiology, we also included other diabetes-related traits, namely fasting glucose (FG), glycated hemoglobin (HbA1c), and fasting insulin (FI). We further assessed whether the pleiotropic genes were enriched in particular pathways and whether their expression patterns affected type 2 diabetes.

Study Participants
GWAS results in the form of summary statistics on type 2 diabetes were acquired from Mahajan et al.'s work [21]. In that study, 403 independent association signals were detected by conditional analyses at each of the genome-wide significant risk loci for type 2 diabetes (except at the major histocompatibility complex (MHC) region). Summary-level data are available at the DIAGRAM consortium (http://diagram-consortium.org/, accessed on 13 November 2020). European-specific meta-analysis summary-level results for fasting glucose (FG), glycated hemoglobin (HbA1c), and fasting insulin (FI) were acquired from a trans-ancestral meta-analysis, which aggregated genome-wide association studies comprising up to 281,416 individuals without diabetes (70% European ancestry) [22]. These European-specific results were downloaded through the MAGIC website (https://www.magicinvestigators.org/, accessed on 13 November 2020) and used for subsequent analysis.
Summary statistics for sleep traits were obtained from Jansen et al.'s study [23]. The freely available meta-analytic sleep traits (insomnia symptoms, sleep duration, and chronotype) represent results partly provided by the UK Biobank Study (www.ukbiobank.ac.uk, accessed on 13 November 2020) [24].
The UK Biobank collected a single self-reported measure of each sleep trait at baseline. Insomnia symptoms were assessed by asking, "Do you have trouble falling asleep at night or do you wake up in the middle of the night?", with responses "Never/rarely", "Sometimes", "Usually", or "Prefer not to answer". Those who responded "Prefer not to answer" were treated as missing. Insomnia cases (n = 109,402) were defined as participants who answered this question with "Usually", while participants answering "Never/rarely" or "Sometimes" were defined as controls (n = 277,131). "Usually have trouble falling asleep at night or waking up in the middle of the night" may be the most important part of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) and International Classification of Sleep Disorders (ICSD) diagnostic criteria for insomnia disorder, so this definition of insomnia symptoms from the self-reported measure was validated to be closer to the DSM-5 and ICSD diagnostic criteria than the commonly used Insomnia Severity Index (ISI) or Pittsburgh Sleep Quality Index (PSQI). Additionally, the UK Biobank insomnia phenotype previously showed excellent sensitivity (98%) and specificity (96%) in differentiating cases that consistently met both the ISI and PSQI criteria from controls that were consistently below both the ISI and PSQI cutoff scores [10]. Thus, we used this phenotype as a proxy for insomnia. Sleep duration, obtained from 384,317 individuals, was a quantitative variable assessed by asking, "About how many hours sleep do you get in every 24 h? (please include naps)". Chronotype ("Morning/evening person (chronotype)"; data-field 1180, n = 345,552) was assessed by the question "Do you consider yourself to be?" with one of six possible answers: "Definitely a 'morning' person", "More a 'morning' than 'evening' person", "More an 'evening' than a 'morning' person", "Definitely an 'evening' person", "Do not know", or "Prefer not to answer", which were coded as 2, 1, −1, −2, 0, and missing, respectively. Summary-level data are available at https://ctg.cncr.nl/software/summary_statistics (accessed on 13 November 2020).
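The phenotype coding described above can be summarized in a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical and are not part of the UK Biobank or the original GWAS pipelines; only the response labels and numeric codes are taken from the text.

```python
# Illustrative phenotype coding; names are hypothetical, codes follow the text.
CHRONOTYPE_CODES = {
    "Definitely a 'morning' person": 2,
    "More a 'morning' than 'evening' person": 1,
    "More an 'evening' than a 'morning' person": -1,
    "Definitely an 'evening' person": -2,
    "Do not know": 0,
    "Prefer not to answer": None,  # treated as missing
}


def code_chronotype(answer):
    """Map a questionnaire answer to its numeric chronotype code (None = missing)."""
    return CHRONOTYPE_CODES[answer]


def insomnia_status(answer):
    """Return 1 for a case, 0 for a control, and None for a missing response."""
    normalized = answer.strip().lower()
    if normalized == "usually":
        return 1
    if normalized in ("never/rarely", "sometimes"):
        return 0
    if normalized == "prefer not to answer":
        return None
    raise ValueError(f"unexpected response: {answer!r}")
```

Under this coding, positive chronotype values indicate a morning preference and negative values an evening preference.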

Conditional Quantile-Quantile (Q-Q) Plots
We constructed conditional Q-Q plots to assess pleiotropic enrichment between diabetes-relevant traits and sleep traits. Conditional Q-Q plots compare the association with the primary phenotype (e.g., type 2 diabetes) across all single-nucleotide polymorphisms (SNPs) and within SNPs stratified by their association with the secondary phenotype (e.g., insomnia). Successive leftward deflections from the null distribution in conditional Q-Q plots denote the existence of pleiotropic enrichment. Spurious enrichment was controlled by random pruning, selecting one random SNP per linkage disequilibrium (LD) block (defined by LD r² > 0.1) and averaging over 100 iterations.
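As a minimal sketch of how the curves of a conditional Q-Q plot can be computed, the following Python function stratifies SNPs by their secondary-trait p-value and returns, for each stratum, the empirical and nominal −log10 quantiles of the primary trait. The function name and the default thresholds are assumptions made for illustration; random LD pruning is not implemented here, and this is not the code used in the study.

```python
import numpy as np


def conditional_qq(p_primary, p_secondary, thresholds=(1.0, 0.1, 0.01, 0.001)):
    """Return {threshold: (x, y)} for a conditional Q-Q plot.

    x is the empirical -log10(q) within the stratum and y is the nominal
    -log10(p) of the primary trait, so enrichment appears as a leftward
    deflection from the diagonal. Thresholds are illustrative assumptions.
    """
    p_primary = np.asarray(p_primary, dtype=float)
    p_secondary = np.asarray(p_secondary, dtype=float)
    curves = {}
    for threshold in thresholds:
        # Keep primary-trait p-values of SNPs passing the secondary-trait cutoff.
        stratum = np.sort(p_primary[p_secondary <= threshold])
        n = stratum.size
        if n == 0:
            continue
        # Empirical CDF within the stratum: the i-th smallest p-value has q = i / n.
        empirical_q = np.arange(1, n + 1) / n
        curves[threshold] = (-np.log10(empirical_q), -np.log10(stratum))
    return curves
```

Plotting each returned pair (x, y) together with the line y = x shows whether strata with smaller secondary-trait p-values depart earlier from the null.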

Identification of Pleiotropic Loci
We identified specific loci jointly involved with diabetes-relevant traits and sleep traits according to a condFDR statistical framework (https://github.com/precimed/pleiofdr) (accessed on 25 August 2021) [25]. CondFDR is an extension of the standard FDR. It incorporates information from GWAS summary statistics of a secondary phenotype to adjust the significance level of associations with the primary phenotype. We denoted the condFDR for phenotype 1 given phenotype 2 as FDR(trait1|trait2), which is defined as the posterior probability that a given SNP is null for the first phenotype given that the p-values for both phenotypes are as small as or smaller than the observed ones. Based on condFDR, we computed the conjunctional false discovery rate (conjFDR), denoted as FDR(trait1&trait2), the conservative estimate of which was given by the maximum of FDR(trait1|trait2) and FDR(trait2|trait1). It is defined as the posterior probability that an SNP is null for either phenotype or both simultaneously, given that its p-values for associations with both phenotypes are as small as or smaller than the observed ones. SNPs with a conjFDR value less than 0.01 were considered shared loci. Based on the 1000 Genomes Project LD structure, the significant SNPs identified were clustered into LD blocks at the LD r² > 0.1 level.
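To make the definitions concrete, the sketch below implements a simple empirical estimator of the conditional FDR, p1 / F(p1 | p2), where F is the empirical conditional distribution of primary-trait p-values among SNPs whose secondary-trait p-value is at least as small as the observed one, and takes the conjunctional FDR as the maximum of the two conditional FDRs. It is a brute-force illustration with quadratic running time and hypothetical function names; it is not the pleioFDR implementation, which may differ in how the conditional distribution is estimated.

```python
import numpy as np


def cond_fdr(p1, p2):
    """Empirical conditional FDR of trait 1 given trait 2 for every SNP."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    result = np.empty_like(p1)
    for i in range(p1.size):
        # SNPs at least as strongly associated with trait 2 as SNP i.
        stratum = p2 <= p2[i]
        # Empirical conditional CDF of trait-1 p-values (never zero: SNP i counts).
        conditional_cdf = np.mean(p1[stratum] <= p1[i])
        result[i] = min(1.0, p1[i] / conditional_cdf)
    return result


def conj_fdr(p1, p2):
    """Conservative conjunctional FDR: maximum of the two conditional FDRs."""
    return np.maximum(cond_fdr(p1, p2), cond_fdr(p2, p1))


def shared_snps(p1, p2, threshold=0.01):
    """Boolean mask of SNPs with conjFDR below the threshold used in the text."""
    return conj_fdr(p1, p2) < threshold
```

Because the estimator takes the maximum over both directions, a SNP is flagged as shared only when it is supported in both traits.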

Functional Annotation
The significant SNPs identified were annotated by SNPNexus (https://www.snpnexus.org/v4/) (accessed on 26 August 2021) [26]. SNPs were annotated for deleteriousness (CADD score) and potential regulatory function (RegulomeDB score) [27,28]. A CADD score above 12.37 was taken as the threshold for a potentially pathogenic variant [27]. The RegulomeDB score is based on information from eQTLs and chromatin marks and ranges from 1a to 7, with lower scores indicating an increased likelihood of having a regulatory function. In order to clarify the biological mechanism behind the pleiotropic genes, we conducted pathway enrichment analysis in the Kyoto Encyclopedia of Genes and Genomes (KEGG) dataset [29].

Expression Analysis of Pleiotropic Genes
In order to evaluate whether the identified pleiotropic genes are differentially expressed, we used the publicly available expression dataset GSE184050 from the Gene Expression Omnibus (https://www.ncbi.nlm.nih.gov/geo/) (accessed on 13 September 2021) database. GSE184050 compared changes in gene expression using two longitudinally collected blood samples from subjects who transitioned to type 2 diabetes between the time points against those who did not, with a novel analytical network approach. A total of 116 individual samples (50 from type 2 diabetes cases and 66 from healthy controls) were submitted to the analysis. RNA was extracted, amplified, reverse transcribed, labeled, and sequenced with an Illumina HiSeq 2000 (GPL11154). We scaled the original data, deleted outliers defined as values more than 3 standard deviations from the mean, and applied a Benjamini-Hochberg multiple-testing correction with a significance threshold of p < 0.05.
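The two preprocessing steps mentioned above, outlier removal at 3 standard deviations and Benjamini-Hochberg correction, can be sketched as follows. The function names are hypothetical, and it is an assumption of this sketch that outliers are defined on z-scores of a single vector of expression values; the text does not specify whether filtering was applied per gene or per sample.

```python
import numpy as np


def drop_outliers(values, n_sd=3.0):
    """Remove values lying more than n_sd sample standard deviations from the mean."""
    values = np.asarray(values, dtype=float)
    z_scores = (values - values.mean()) / values.std(ddof=1)
    return values[np.abs(z_scores) <= n_sd]


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted
```

A gene would then be called differentially expressed when its adjusted p-value falls below 0.05.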

Mendelian Randomization Study
In order to investigate causal associations between the expression pattern of pleiotropic genes and type 2 diabetes, we used eQTLGen 2019 results comprising all cis and some trans regions of gene expression in whole blood to perform a two-sample Mendelian randomization study. The eQTLGen consortium was set up to identify the downstream consequences of trait-related genetic variants. The consortium incorporates 37 datasets, with a total of 31,684 individuals [30]. We defined acceptable instrumental variables via three main assumptions: they were associated with the relevant risk factor (relevance assumption), they and the outcome had no common cause (independence assumption), and the outcome was not affected by them except via the risk factor (exclusion restriction assumption) [31]. Genetic instrumental variables for eQTL summary statistics of pleiotropic genes were acquired from OpenGWAS, developed by the MRC IEU OpenGWAS project, which also contributes the TwoSampleMR package (https://github.com/mrcieu/TwoSampleMR) (accessed on 16 October 2021) and MR-Base [32]. The open-access OpenGWAS database is scalable, open-source, high-performance, and cloud-based, importing and publishing complete GWAS metadata and summary datasets for the scientific community. The import pipeline matches these datasets to dbSNP and the reference sequence of the human genome, produces summary reports, and standardizes the results and metadata formats.
We used the widely accepted inverse-variance weighted (IVW) method for the main analysis to estimate the causal effect of pleiotropic gene expression on type 2 diabetes. The IVW estimate is calculated by regressing the variant-outcome association coefficients on the variant-exposure association coefficients, weighting each variant by the inverse variance of the association between the instrument and the outcome [33].
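The IVW estimate described above corresponds to a weighted regression through the origin, which has a closed form. The following is a minimal fixed-effect sketch with a hypothetical function name; it is not the TwoSampleMR implementation, whose standard-error calculation may differ.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate and its standard error.

    Equivalent to regressing variant-outcome effects on variant-exposure
    effects through the origin with weights 1 / se_outcome**2.
    """
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    beta = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return beta, standard_error
```

For a binary outcome such as type 2 diabetes, the estimate is on the log-odds scale, so exp(beta) gives the odds ratio per unit change in the exposure.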

Assessment of Pleiotropic Enrichment
We observed successive increments of SNP enrichment for diabetes-relevant phenotypes as a function of the significance of the associations with sleep traits (Figure 1). For a given nominal p-value for each diabetes-relevant trait, an earlier departure from the null line indicates a greater proportion of true associations. Gradual leftward shifts with decreasing nominal p-values for sleep traits indicate that the proportion of nonnull SNPs varies considerably across different levels of association with sleep traits, which can be interpreted as polygenic overlap between these phenotypes. Type 2 diabetes showed obvious pleiotropic enrichment with sleep traits. All diabetes-relevant phenotypes showed significant pleiotropy with chronotype.

Functional Annotation of Pleiotropic Genes
Five SNPs (rs10881959, rs11039358, rs2236950, rs12485697, rs1296328) had CADD scores greater than 12.37, suggesting that they might be deleterious variants (Supplemental Table S4). One SNP (rs174555), shared among FG, HbA1c, and sleep duration, had a RegulomeDB score of 1f, indicating that it likely affects binding sites (Supplemental Table S5). At the false discovery rate 0.05 level, KEGG pathway enrichment analysis found that HSD17B12, FADS2, and FADS1 were significantly enriched in the biosynthesis of unsaturated fatty acids (hsa01040), of which FADS2 and FADS1 were the genes overlapping SNP rs174555.

Discussion
In the current study, we observed extensive polygenic pleiotropy between diabetic traits and sleep traits using conjFDR analysis. Fifty-eight independent genetic loci jointly influenced the risk of type 2 diabetes and the sleep traits of insomnia, sleep duration, and chronotype. Two of the pleiotropic genes, ENSA and PMPCA, were validated to be differentially expressed in type 2 diabetes, and PMPCA showed a slight protective effect on type 2 diabetes in MR analysis. Our study provides integrative evidence of a shared genetic mechanism between diabetes and sleep traits.
The strongest shared locus between type 2 diabetes and sleep traits was FTO (lead SNP rs8047587), a well-known gene associated with body mass index, obesity risk, and type 2 diabetes. However, the association between FTO and sleep traits has not been well characterized. Prats-Puig et al. showed that TT homozygotes for the FTO SNP exhibited nominal associations between decreasing sleep duration and increasing BMI, waist circumference, visceral fat, and HOMA-IR (all p < 0.05) in 297 asymptomatic children aged 5-9 years [34]. It is worth noting that FTO is predominantly expressed in the brain. Disruption of Fto in mice showed diet- or obesity-related changes in expression in the hypothalamus [35,36]. Abundant evidence supports multiple possible roles of the central nervous system in body weight regulation [37], and our study emphasizes the role of sleep in this regulatory process.
Two notable pleiotropic genes were ENSA (lead SNP rs2055975) and PMPCA (lead SNP rs10747046), which were differentially expressed in type 2 diabetes cases. ENSA is expressed in brain and endocrine tissues and was proposed as a candidate gene for type 2 diabetes. It encodes alpha-endosulfine, which, like sulfonylureas, can block ATP-sensitive potassium (K(ATP)) channels and stimulate insulin release in beta cells [38]. A considerable decrease in alpha-endosulfine could result in a decrease in neurotransmitter release associated with cognition [38]. In our study, PMPCA showed a slight protective effect on type 2 diabetes and lowered FG. The literature on the direct role of PMPCA in diabetes is sparse, while a homozygous mutation in PMPCA has been reported to be crucial for autosomal recessive cerebellar ataxia [39,40]. PMPCA encodes the α-subunit of mitochondrial processing peptidase (MPP), a heterodimeric enzyme responsible for the cleavage of nuclear-encoded mitochondrial precursor proteins after import into mitochondria [41]. As previously reported, mitochondrial dysfunction leads to impairment of insulin sensitivity by reducing the activity of AMPK, an important cellular fuel sensor and regulator [42]. Agents addressing impaired mitochondrial function were thought to have the greatest potential for supporting a substantial improvement of glycemic and body weight control in the growing population with type 2 diabetes [43]. This may partly explain the pleiotropy of PMPCA in type 2 diabetes and sleep traits.
Our study showed that both PMPCA and INPP5E were significantly associated with chronotype, in the opposite direction to their association with type 2 diabetes, which suggests that people who are more of an "evening" than a "morning" person have a higher risk of developing type 2 diabetes. This is consistent with the latest systematic review and a cross-sectional study showing that evening chronotype was associated with a worse cardiometabolic risk profile and a higher risk of diabetes, cancer, and depression [44,45]. Recent research showed that circadian rhythm disruption perturbed glucose homeostasis through disruption of pancreatic β cell function and loss of circadian transcriptional and epigenetic identity [46]. However, the opposite result was found in MR analysis, in which the IVW estimate indicated that a morning chronotype had an adverse effect on type 2 diabetes (OR = 1.37, 95% CI: 1.09-1.72, p = 0.0068). On the one hand, this unexpected and non-robust MR result suggests that MR findings should be validated more widely with multiple methods. On the other hand, Reis-Canaan's study showed that most morning chronotype individuals were elderly thin males with lower consumption of omega-6 and omega-3, sodium, zinc, thiamine, pyridoxine, and niacin, whereas evening individuals were younger, had higher BMI, and had higher consumption of the studied micronutrients [47]. This indicates that the association between diabetes and chronotype is highly complex. These findings should be interpreted with caution, and further well-designed studies should be conducted.
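As a consistency check on the reported MR result, the p-value can be approximately recovered from the odds ratio and its 95% confidence interval, assuming a normal approximation on the log-odds scale. The function below is an illustrative sketch with a hypothetical name, not part of the original analysis.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Recover log(OR), its standard error, the z score and two-sided p-value.

    Assumes a symmetric 95% confidence interval on the log-odds scale.
    """
    log_or = math.log(odds_ratio)
    standard_error = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z_score = log_or / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2.0))
    return log_or, standard_error, z_score, p_value
```

Calling p_from_or_ci(1.37, 1.09, 1.72) gives a z score of about 2.7 and a two-sided p-value close to the reported 0.0068, so the reported estimate, interval, and p-value are mutually consistent.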
Our research had some limitations. First, overlapping participants between the investigated GWAS may inflate the cross-trait enrichment in the condFDR statistical framework. To control for false positives, we therefore chose a stringent threshold (conjFDR < 0.01) instead of the default parameter setting (0.05). Another limitation is the use of self-reported sleep symptoms rather than clinical diagnostic criteria. Measurement errors and recall bias would result in misclassification of case status, especially for insomnia, for which we used self-reported insomnia complaints as a proxy. Although a previous study showed that the UK Biobank insomnia phenotype is predictive of insomnia disorder, with little confounding by comorbidity [10], large-scale summary statistics based on a precise clinical diagnostic definition of insomnia are desirable for subsequent studies. Finally, more fundamental work is required to uncover the biological mechanisms underlying the relationship between diabetes and sleep traits.

Conclusions
Our study provided evidence for the polygenic overlap between diabetic traits and sleep traits, in which the expression of PMPCA may play a crucial role, and supports a hazardous effect of being an "evening" person on diabetes risk.
Author Contributions: D.C. conceived the study and undertook project leadership. In addition, D.C., T.W. and Y.W. were the guarantors of this work. Y.M. wrote the first draft of the manuscript, analyzed data, and interpreted the results. Z.Z., X.L., Z.Y., K.D. and H.X. were involved in the data collection. All authors contributed to the drafting and critical revision of the manuscript. All authors have read and agreed to the published version of the manuscript.
Institutional Review Board Statement: The present study did not follow a prespecified analysis plan or protocol. Ethics approval was not required for this study because the data are publicly available, deidentified, summary-level data. The patients/participants provided their written informed consent to participate in the original studies.

Informed Consent Statement: Not applicable.
Data Availability Statement: The datasets analyzed during the current study were derived from the following public domain resources: summary statistics for type 2 diabetes are available from the DIAGRAM consortium (http://diagram-consortium.org/, accessed on 13 November 2020), and summary statistics for the diabetes-related traits are available from the Meta-Analyses of Glucose and Insulin-related traits Consortium (https://magicinvestigators.org/, accessed on 13 November 2020). Summary-level data for sleep traits are available at https://ctg.cncr.nl/software/summary_statistics (accessed on 13 November 2020).