Association of Polygenic Variants with Type 2 Diabetes Risk and Their Interaction with Lifestyles in Asians

Over the last several decades, type 2 diabetes (T2DM) has increased considerably among Asians. The pathophysiology of T2DM in Asians is closely linked to low insulin secretion, low β-cell mass, and an inability to compensate for insulin resistance. We hypothesized that genetic variants associated with lower β-cell mass and function, in combination with unhealthy lifestyle factors, significantly raise T2DM risk among Asians. This hypothesis was explored in participants aged over 40, who were categorized into T2DM (case; n = 5383) and control (n = 53,318) groups. The genetic variants associated with a higher risk of T2DM were selected from a genome-wide association study in a city hospital-based cohort and confirmed in a replication study in the Ansan/Ansung plus rural cohorts. Interacting genetic variants were identified with generalized multifactor dimensionality reduction analysis, and polygenic risk score (PRS)-nutrient interactions were examined. The 8-SNP model was associated with an approximately 10-fold higher T2DM risk, a stronger association than that of the 20-SNP model, which included all T2DM-linked SNPs with p < 5 × 10⁻⁶. The SNPs in the models were primarily involved in pancreatic β-cell growth and survival. The PRS of the 8-SNP model interacted with three lifestyle factors: energy intake relative to the estimated energy requirement (EER), Western-style diet (WSD), and smoking status. Among participants with High-PRS, fasting serum glucose concentrations were much higher with low energy intake relative to the EER than with high intake, and with high WSD intake than with low WSD intake. Among participants with High-PRS, they were also higher in smokers than in non-smokers. In conclusion, the genetic impact on T2DM risk mainly involved the regulation of pancreatic β-cell mass and function, and the PRS interacted with lifestyles. These results highlight the interaction between genetic impacts and lifestyles in precision nutrition.


Introduction
Type 2 diabetes mellitus (T2DM) is a chronic metabolic disease characterized by elevated blood glucose levels. Two factors combine in the etiology of T2DM: inadequate insulin secretion by pancreatic β-cells and the inability of insulin-sensitive tissues to respond to insulin, namely, insulin resistance [1]. The etiology of T2DM differs somewhat between Caucasians and Asians [1]. In a highly insulin-resistant state, Caucasians develop hyperinsulinemia, which delays T2DM development, whereas Asians are susceptible to T2DM because of low insulin secretion from a small islet mass [2]. The low β-cell mass and function may be associated with ancestral lifestyles: the ancestors of Asians were insulin-sensitive owing to high grain and fiber intake and high physical activity and therefore did not need to secrete large amounts of insulin. However, significant lifestyle changes, including drastic shifts in diet and a more sedentary life over the last three decades, are believed to have led to elevated insulin resistance, which markedly increases the T2DM risk in Asians [1]. Therefore, the rapid increase in insulin resistance due to Westernized lifestyles remarkably elevates T2DM incidence in Asians.
The ethnic differences in the development of T2DM are also linked to the interaction of genetic and environmental factors. Genome-wide association studies (GWAS) have demonstrated that some genetic variants are involved in T2DM in both Caucasians and Asians, and the genes linked to T2DM are similar [3,4]. However, Asian and European populations exhibit substantial differences in the allele frequencies of IGF2BP2, CDKAL1, JAZF1, SLC30A8, HHEX, TCF7L2, EXT2, and FTO [4]. The biological actions of these genetic variants are similar across ethnic populations, so the ethnic differences in genetic impact stem from the substantial differences in the allele frequencies of the T2DM-related genes [5]. Therefore, genetic differences also need to be considered when formulating therapeutic strategies for T2DM in Asians. Increasing insulin secretion, preserving β-cell mass, and improving glucose utilization by peripheral tissues are the target outcomes of treatment, and the interaction of the genetic variants with lifestyles should be considered in the therapy.
In addition to the genetic differences, food and nutrient intake affect T2DM. The relationship between carbohydrates, fat, protein intake, and T2DM varies with the proportions of their intake, food sources, and cooking types. However, genetics play an essential role in the metabolism of these macronutrients. For example, a high carbohydrate diet is associated with a higher risk of metabolic syndrome, especially hypertriglyceridemia, but not hyperglycemia, in Korean adults [6,7]. It is shown to be related to the interaction of the APOA5, EFCAB4B, ZNF259, and APOBEC1 genetic variants with a diet [8,9]. However, plasma vitamin C concentrations and dietary vitamin C intake, but not dietary fiber, improve hyperglycemia in Asian and European populations [10,11]. Therefore, the interaction of the lifestyle factors with genetic variants related to T2DM needs to be studied in each ethnic group.
Studies of genetic variation have been conducted on relatively small populations, especially in Asians. More studies are needed to explore the impact of genetic variation on T2DM etiology and the interactions of genetic variants with lifestyle. The paucity of large, well-controlled studies of the interactive effects of genetic variants with diet and lifestyle among Asians limits the ability to develop precision medicine treatments for diabetes, and only a few studies have examined the interactions of genetic variants with lifestyles. The present study aimed to investigate the genetic variants associated with T2DM and their interaction with lifestyle factors in large hospital-based cohorts of Asians. The results of this study can provide a potential etiological basis for the development of T2DM and its interaction with lifestyles. These results can be applied to a precision medicine approach to the prevention and treatment of T2DM after they have been confirmed by dietary intervention trials.

Subjects
Participants over 40 years who had volunteered in a large city hospital-based cohort as part of the Korean Genome and Epidemiology Study (KoGES) conducted during 2010-2014 were included in the study [12]. The institutional review board (IRB) of the Korean National Institute of Health and Hoseo University approved the KoGES (KBP-2015-055) and the present study (HR-034-01). All participants signed a written informed consent form. Because the participants with severe disease states were not included in KoGES, all participants were included for the present study. The parameters influencing T2DM were adjusted for the analysis.

Demographic, Anthropometric, and Biochemical Parameters of the Participants
At the initial visit, demographic parameters including age, gender, education, income, and place of residence were obtained using survey questionnaires. Height and weight were measured with the participant barefoot and wearing a light gown using a well-calibrated digital weight and height scale (Inbody, Cheonan, Korea) [13,14]. Body mass index (BMI) was calculated by dividing body weight (kg) by the square of height (m²). Waist circumference was measured with a tape measure (Stanley, New Britain, CT, USA) around the abdomen at two finger-widths above the umbilicus with the participant in a relaxed state. The appendicular skeletal muscle and fat mass were calculated using a machine learning prediction model from the Ansan/Ansung cohort [15].
Blood was collected after more than 12 h fasting in a blood collecting tube with and without heparin and ethylenediaminetetraacetic acid (EDTA). The fasting plasma glucose and HbA1c concentrations were assessed using a Hitachi 7600 Automatic Analyzer (Hitachi, Tokyo, Japan) and an automatic analyzer (ZEUS 9.9; Takeda, Tokyo, Japan). Insulin resistance was estimated using the prediction model generated by the homeostatic model assessment of insulin resistance (HOMA-IR) from a previous study [16].
The frequencies and amounts of alcohol and coffee consumption, physical activity, and smoking history were collected during a health interview. The daily alcohol intake (g/day) was calculated by multiplying the consumption frequencies and amounts [17]. Smoking status was classified into current smokers (at least 20 cigarettes in the past six months), past smokers (more than 20 cigarettes in the lifetime but no smoking for at least six months), and never smokers [17]. Weekly coffee intake was assessed by multiplying drinking frequencies by the amount and was categorized into three groups by tertiles. Regular exercise was defined as more than 30 min of moderate physical activity on three or more days per week.

T2DM Definition
According to the diagnosis guidelines of the American Diabetes Association [18], participants who had a fasting plasma glucose level ≥126 mg/dL, HbA1c ≥6.5%, or were currently using anti-diabetic medication were considered cases of T2DM (case; n = 5383), and the other participants comprised the control group (n = 53,318). The cases did not have type 1 diabetes (T1DM) or maturity-onset diabetes of the young (MODY) and had developed T2DM after age 25.
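The case definition above is a simple "any one of three criteria" rule, which can be sketched as follows. This is a minimal illustration, not the study's code: the `Participant` record and its field names are hypothetical, and the fasting glucose threshold of 126 is assumed to be in mg/dL, as in the ADA criteria.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative, not from KoGES."""
    fasting_glucose_mg_dl: float
    hba1c_percent: float
    on_antidiabetic_medication: bool


def is_t2dm_case(p: Participant) -> bool:
    """Return True when at least one of the three criteria is met."""
    return (
        p.fasting_glucose_mg_dl >= 126.0  # assumed unit: mg/dL
        or p.hba1c_percent >= 6.5
        or p.on_antidiabetic_medication
    )
```

A participant with normal glucose and HbA1c who takes anti-diabetic medication is still counted as a case, which is why medication use is part of the rule.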

Estimation of Usual Food Intake Using a Semi-Quantitative Food Frequency Questionnaire (SQFFQ)
The usual food intake during the last 12 months was estimated for each participant using an SQFFQ designed for Korean diet patterns, and its reproducibility and accuracy were validated with three-day food records during the four Korean seasons [19]. The SQFFQ was composed of 106 food items that Koreans commonly consume, and the food frequencies were classified into never or seldom, once per month, two to three times per month, once or twice weekly, three or four times weekly, five or six times weekly, daily, twice daily, and ≥3 times daily. The food consumed at a meal was scored as more than, equal to, or less than the regular portion size visualized using photographs of the 106 foods. The participants recorded the frequencies and the portion size of the 106 food items in the SQFFQ. The daily food intake (g/day) was estimated by multiplying the median of the weekly consumed frequencies by portion sizes. The daily energy, carbohydrates, fats, proteins, vitamins, and mineral intakes were assessed from the daily food intake using the Can-Pro 2.0 nutrient assessment software designed by the Korean Nutrition Society.
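The conversion from SQFFQ answers to a daily intake can be sketched as below. The weekly frequency assigned to each category (midpoints, with a month taken as four weeks) and the portion multipliers are assumptions made for illustration; the source does not state the values used in the study.

```python
# Assumed weekly frequency for each SQFFQ category (midpoints; month = 4 weeks).
WEEKLY_FREQUENCY = {
    "never or seldom": 0.0,
    "once per month": 0.25,
    "two to three times per month": 0.625,
    "once or twice weekly": 1.5,
    "three or four times weekly": 3.5,
    "five or six times weekly": 5.5,
    "daily": 7.0,
    "twice daily": 14.0,
    ">=3 times daily": 21.0,
}

# Assumed multipliers relative to the regular portion shown in the photographs.
PORTION_MULTIPLIER = {"less": 0.5, "equal": 1.0, "more": 1.5}


def daily_intake_g(frequency: str, portion: str, regular_portion_g: float) -> float:
    """Daily intake (g/day) of one food item from its frequency and portion answers."""
    per_day = WEEKLY_FREQUENCY[frequency] / 7.0
    return per_day * PORTION_MULTIPLIER[portion] * regular_portion_g
```

For example, an item eaten three or four times weekly in a larger-than-regular portion of a 100 g serving gives `daily_intake_g("three or four times weekly", "more", 100.0)`, that is, 75 g/day under these assumed values.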

Dietary Patterns by Principal Component Analysis (PCA)
The 106 food items in the SQFFQ were categorized into 30 predefined food groups, as previously described [20]. The dietary patterns were classified using the PCA of the 30 predefined groups based on eigenvalues >1.5, and four dietary patterns satisfied these eigenvalues in a PCA [21]. The patterns from the 30 predefined groups were classified using the orthogonal rotation procedure (varimax), and food groups with ≥0.40 factor-loading values were considered the predominant contributors to the assigned dietary pattern [22]. The foods in each dietary pattern were as provided in Supplemental Table S1. According to the foods included in each group, the dietary patterns were named the Korean balanced diet (KBD), plant-based diet (PBD), Western-style diet (WSD), or rice-main diet (RMD) [23,24].
The dietary inflammatory index (DII) was estimated from the prediction equation by multiplying food and nutrient intakes by their dietary inflammatory weights, as reported previously. The equation included energy, 32 nutrients, four food products, four spices, and caffeine [25]. However, garlic, ginger, saffron, and turmeric were absent in the SQFFQ and excluded from the DII calculation. After multiplying the dietary inflammatory scores of 38 foods and nutrients by daily intake, DII was calculated with the sum of their scores divided by 100.
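The DII arithmetic described above (weighted sum of intakes divided by 100, with items missing from the questionnaire left out) can be sketched as follows. The function name, the item keys, and any weights passed to it are hypothetical; the published inflammatory weights are not reproduced here.

```python
def dietary_inflammatory_index(daily_intake, inflammatory_weight):
    """Sum of weight x intake over items present in both mappings, divided by 100.

    Items that have a weight but no recorded intake (for example, spices
    absent from the questionnaire) are simply skipped.
    """
    shared = daily_intake.keys() & inflammatory_weight.keys()
    return sum(inflammatory_weight[k] * daily_intake[k] for k in shared) / 100.0
```

Negative weights stand for anti-inflammatory items and positive weights for pro-inflammatory items, so a lower index indicates a less inflammatory diet.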

Genotyping Using a Korean Chip and Quality Control
The Center for Genome Science at the Korea National Institute of Health genotyped the participants in the Ansan/Ansung and city hospital-based cohorts. In brief, genomic DNA was isolated from whole blood, and genotypes were measured using a Korean Chip (Affymetrix, Santa Clara, CA, USA) designed to examine disease-related SNPs in Koreans [26]. The genotypes were imputed based on the 1000-genome sequence or the Korean HapMap data [27]. The genotyping accuracy was estimated using a Bayesian learning algorithm for Robust General Linear Models (RGLMs) [28]. The inclusion criteria for genotyping accuracy, missing genotype call rate, and heterozygosity were ≥98%, <4%, and <30%, respectively, and the data showed no gender bias. Genetic variants were included when they satisfied p > 0.05 for Hardy-Weinberg equilibrium (HWE) and a minor allele frequency (MAF) > 5% [28]. The quality of the GWAS data was checked with Manhattan and QQ plots drawn using the "fastman" library in R. The Manhattan plot displayed the negative logarithms of the association p-values of the genetic variants for T2DM. A QQ plot is a probability plot that shows how well the observed data distribution fits the theoretical distribution; here it displayed the quantile distribution of observed p-values (y-axis) against that of expected p-values (x-axis) and was used to assess the quality of the genotypes from the GWAS. A lambda value close to 1 indicated that the GWAS results were free of inflation. The pathways linked to the genetic variants associated with T2DM with a Bonferroni-corrected p < 0.05 were selected using the MAGMA gene-set analysis in SNP2GENE of the FUMA web application, available through the git repository (https://github.com/Kyoko-wtnb/FUMA-webapp/, accessed on 8 March 2022).
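The lambda value read from the QQ plot is commonly computed as the median of the observed 1-df chi-square statistics divided by the median expected under the null (about 0.455). The sketch below uses that standard definition with the Python standard library only; it is an illustration, not the calculation performed by the "fastman" library, and the function name is hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation_factor(p_values):
    """Lambda = median observed chi-square (1 df) / median expected under the null.

    Each two-sided p-value (0 < p <= 1) is converted back to a 1-df
    chi-square statistic via the squared normal quantile.
    """
    chi2 = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Uniformly distributed p-values give a lambda of 1, and systematically small p-values push it above 1, which is the pattern the QQ plot is meant to reveal.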

Genotype-Tissue Expression (GTEx) of Genetic Mutations and Distribution of Identified Tissue/Organ-Specific Expressed SNPs
The Genotype-Tissue Expression (GTEx) analysis of the genetic variants used the source code of the FUMA web application. Normalized gene expression, expressed as reads per kilobase of transcript per million reads mapped (RPKM), was obtained from GTEx for 53 tissue types. A total of 56,320 genes were available in GTEx, and we retained the genes with an average RPKM greater than or equal to 1 in at least one tissue type.
In GENE2FUNC, a heatmap of prioritized genes showed two expression values: (i) the average log2(RPKM + 1) per tissue per gene, where RPKM was Winsorized at 50, allowing a comparison of expression levels between genes and tissue types, and (ii) the mean of the normalized expression (zero mean of log2(RPKM + 1)) per tissue per gene, allowing a comparison of expression levels within tissue types within genes. Genes were tested against these differentially expressed gene (DEG) sets by a hypergeometric test to assess whether the prioritized genes were overrepresented in the DEG sets for specific tissue types. Hierarchical clustering was performed using the Python "scipy" package (using the "average" method).
Figure 1 represents the best genetic model selection process with SNP-SNP interactions. The GWAS was conducted to explore genetic variants associated with T2DM risk using the T2DM (n = 5383) and control groups (n = 53,318) in the urban hospital-based cohort (p < 5 × 10⁻⁵). Of these, 287 genetic variants did not meet the MAF (<5%) and HWE (p < 0.05) criteria, and 300 gene names were identified from the 4330 genetic variants using g:Profiler (https://biit.cs.ut.ee/gprofiler/snpense, accessed on 9 February 2022). The SNPs with high D' values (D' ≥ 0.2) were excluded because they provided the same information on the genetic impact. Subsequently, 89 genetic variants remained that met the linkage disequilibrium (LD) criterion (D' < 0.2) using Haploview 4.2 in PLINK, so the potential genetic variants on the same chromosome were not strongly correlated. In addition, 43 genetic variants were removed due to the inability to identify gene names. Among the genetic variants, 29 whose gene names were not identified were eliminated (n = 29).
Figure 1. Flow chart for the generation of polygenic risk scores (PRS) associated with increased type 2 diabetes risk and the interactions between PRS and lifestyles. Type 2 diabetes was defined as a fasting plasma glucose level ≥126 mg/dL, HbA1c ≥6.5%, or current use of anti-diabetic medication. T2DM (case; n = 5383) and control (n = 53,318).
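The two heatmap expression values described for GENE2FUNC can be sketched as follows. This is a simplified illustration rather than FUMA's code: the nested-dictionary input layout and function names are hypothetical, Winsorization is assumed to mean capping RPKM at 50, and the zero-mean step is assumed to centre each gene on its mean across all samples.

```python
from math import log2
from statistics import mean

WINSOR_CAP = 50.0  # RPKM values above this cap are clipped, per the text


def average_log2_expression(samples):
    """{gene: {tissue: [RPKM, ...]}} -> {gene: {tissue: mean log2(capped RPKM + 1)}}."""
    return {
        gene: {
            tissue: mean([log2(min(v, WINSOR_CAP) + 1.0) for v in values])
            for tissue, values in tissues.items()
        }
        for gene, tissues in samples.items()
    }


def zero_mean_expression(samples):
    """Per-tissue mean of log2(RPKM + 1) after centring each gene at zero."""
    result = {}
    for gene, tissues in samples.items():
        logged = {t: [log2(v + 1.0) for v in vals] for t, vals in tissues.items()}
        gene_mean = mean([x for vals in logged.values() for x in vals])
        result[gene] = {t: mean(vals) - gene_mean for t, vals in logged.items()}
    return result
```

The first value keeps absolute levels, so different genes can be compared; the second removes each gene's overall level, so only its tissue-to-tissue contrast remains.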

Selection of the Genetic Variants That Influence the T2DM Risk and the Best Model with SNP-SNP Interactions
Ten genetic variants were selected from the 20 SNPs for the best model with SNP-SNP interactions involved in T2DM risk using generalized multifactor dimensionality reduction (GMDR). The selection criteria were p < 0.001 for the sign test of the testing balanced accuracy (TEBA) and a cross-validation consistency (CVC) of 10 in the exhaustive search type. Model 1 was adjusted for age, gender, residence area, education, and income, and model 2 was additionally adjusted for energy intake, alcohol intake, regular exercise, and smoking status. Ten-fold cross-validation was used for CVC because the sample size was larger than 1000 [29].
The PRS for the best model was calculated by summing the number of risk alleles (genetic risk score) of each selected SNP in the best gene-gene interaction model [30,31]. If the risk allele of a genetic variant was A, the genetic scores of "AA", "AG", and "GG" were 2, 1, and 0, respectively. The PRS values in the three-, eight-, and 20-SNP models were divided into three categories, Low-PRS, Medium-PRS, and High-PRS: 0-3, 4-5, and ≥6, respectively, in the three-SNP model; 2-7, 8-10, and ≥11 in the eight-SNP model; and 10-19, 20-24, and ≥25 in the 20-SNP model (p < 5 × 10⁻⁶). Out of the best models that met the criteria for the sign test p-value and CVC, the model with the lowest SNP number (eight-SNP model) was used to examine the interaction with the lifestyle parameters.
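The risk-allele counting and the eight-SNP cutoffs given above can be sketched as follows. The function names and the SNP identifiers in the usage note are hypothetical, and scores below 2 are assumed to fall into the lowest category.

```python
def risk_allele_count(genotype: str, risk_allele: str) -> int:
    """Number of risk alleles in a two-letter genotype, e.g. 'AG' with risk 'A' -> 1."""
    return genotype.count(risk_allele)


def polygenic_risk_score(genotypes, risk_alleles):
    """Unweighted PRS: sum of risk-allele counts over the SNPs in the model.

    genotypes:    {snp_id: two-letter genotype} for one participant
    risk_alleles: {snp_id: risk allele} for the SNPs in the model
    """
    return sum(
        risk_allele_count(genotypes[snp], allele)
        for snp, allele in risk_alleles.items()
    )


def prs_category_eight_snp(prs: int) -> str:
    """Cutoffs stated for the eight-SNP model: 2-7, 8-10, and >=11."""
    if prs >= 11:
        return "High-PRS"
    if prs >= 8:
        return "Medium-PRS"
    return "Low-PRS"  # assumption: scores below 2 are also treated as low
```

With a model of three placeholder SNPs whose risk alleles are A, C, and T, the genotypes AA, CG, and GG would contribute 2, 1, and 0, for a score of 3.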

Statistical Analysis
The statistical analysis was performed using SAS (version 9.3; SAS Institute, Cary, NC, USA). The sample size of 58,701 participants (T2DM prevalence of about 10%) was sufficient to detect significance at α = 0.05, a power (1 − β) of 0.99, and an odds ratio of 1.05 in the logistic analysis using the G*Power calculator. The descriptive statistics for categorical variables were calculated to show the frequency distributions, and their statistical differences with T2DM were analyzed using the chi-square test. Descriptive statistics of the continuous variables were presented as means with standard deviations adjusted for the covariates. The statistical differences according to gender and T2DM group were analyzed using a two-way analysis of covariance (ANCOVA). When the ANCOVA was significant, multiple comparisons among the gender and T2DM groups were conducted using Tukey's test.
The association of T2DM with metabolic parameters was analyzed using logistic regression analysis after adjustment for covariates. The results are shown with the adjusted odds ratios (ORs) and 95% confidence intervals (CI) of each metabolic parameter. Two different models were included according to the covariates. Model 1 included age, residence area, survey year, BMI, education, and income as covariates. Model 2 was calculated with covariates of model 1 plus the energy intakes, physical activity, smoking status, and alcohol and coffee consumption.
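To make the reported quantities concrete, the sketch below computes an odds ratio and its 95% CI from a 2 × 2 table using the Wald interval on the log scale. It is deliberately simpler than the analysis described above: the study's ORs came from covariate-adjusted logistic regression in SAS, whereas this example is unadjusted, and the function name and counts are illustrative only.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, level=0.95):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

An interval that includes 1 means the association is not significant at the chosen level, which is how the ORs with 95% CIs in the tables are read.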
Two-way ANCOVA was used to analyze the interactions between T2DM and lifestyle-related parameters after they were categorized into high or low groups according to the dietary reference intake [24] or the 30th percentile of each variable. The two-way ANCOVA model included the main effects of the T2DM and lifestyle-related parameters, their interaction effect, and covariates. The ORs and 95% CI of T2DM with lifestyle-related parameters were also assessed with logistic regression analysis in the high and low groups of the lifestyle-related parameters. The differences in the T2DM proportion according to the PRS groups were analyzed using the χ² test in the low and high groups of the lifestyle-related parameters.

Demographic and Lifestyle Characteristics
The baseline characteristics of the T2DM and control groups are presented by gender in Table 1 because men (13.0%) exhibited a much higher T2DM incidence than women (7.15%). The participants in the T2DM group were older, less educated, and had a lower income than those in the control group in both genders (Table 1). The T2DM risk was 1.88 times higher in participants aged over 55 years and was 0.646 and 0.748 times as high in those with ≥high school education and those earning ≥USD 2000/month, respectively (Table 1). Participants in the T2DM group had a higher BMI, waist circumference, and fat mass than those in the control group, and these factors raised the risk of T2DM by 1.69, 1.92, and 1.58 times, respectively. However, the skeletal muscle mass index (SMI) showed a relation opposite to that of BMI and was inversely linked with T2DM by 0.77 times (Table 1). As expected, fasting serum glucose and HbA1c concentrations were higher in the T2DM group than in the control group. The incidence of insulin resistance was higher in the T2DM group than in the control group, and insulin resistance raised the risk of T2DM by 58.81 times (Table 1).
Table 1 notes: Values represent adjusted means and standard errors or adjusted odds ratios (ORs) and 95% confidence intervals (CI). Covariates included age, sex, education, income, energy intake (percentage of estimated energy requirement), residence areas, daily activity, alcohol intake, and smoking status. Skeletal muscle mass index (SMI) was calculated by dividing the limb skeletal muscle mass (kg) by the square of the height (m²). The cutoffs for the logistic regression analysis were as follows: 55 years for age, 25 kg/m² for BMI, 90 cm in men and 85 cm in women for waist circumference, the 75th percentile for SMI, and 25% in men and 32% in women for fat mass. ** Significant differences by genders at p < 0.01, *** p < 0.001. +++ Significant differences by T2DM at p < 0.001. ### Significant interaction between genders and obesity at p < 0.001. ‡‡ Significantly different from the control group in the χ² test in each gender at p < 0.01, ‡‡‡ at p < 0.001. a,b,c,d Different superscripts indicate significant differences among the groups by Tukey test at p < 0.05.
Nutrient intake was also related to T2DM risk. Energy intake was slightly lower in the T2DM group than in the control group only in men, and energy intake was not associated with T2DM risk in all participants (Table 2). Carbohydrate and protein intake did not differ between the T2DM and control groups in either gender (Table 2). Fat intake was higher in the T2DM group than in the control group only in women, and it was not associated with T2DM risk. Calcium intake in women was lower in the T2DM group than in the control group, while vitamin D intake showed the same trend as calcium intake in both genders (Table 2). However, the intake of these two nutrients was not significantly associated with T2DM risk. Fiber intake did not differ between the T2DM and control groups. Unlike fiber intake, vitamin C and flavonoid intakes were lower in the T2DM group than in the control group, and their intakes were inversely associated with T2DM risk (Table 2). DII was also higher in the T2DM group than in the control group, especially in women, and it was positively associated with T2DM risk.
Table 2 (surviving row, footnote 4): 13.3 ± 0.03 c, 13.4 ± 0.05 b, 13.6 ± 0.02 a, 13.6 ± 0.04 a; *** + #; 1.054 (0.978-1.136).
Table 2 notes (footnotes 3-13): 3, 15 En% for fat; 4, 14 En% for protein; 5, 20 g for fiber; 6, 500 mg for calcium; 7, 100 mg for vitamin C; 8, 10 µg for vitamin D; 9, −25 for dietary inflammatory scores; 10, 45 mg for flavonoids; 11, 70th percentiles of each dietary pattern; 12, 20 g for alcohol; and 13, moderate exercise for 150 min/week. *** Significant differences by genders at p < 0.001. + Significant differences by type 2 diabetes (T2DM) at p < 0.05, ++ at p < 0.01, +++ at p < 0.001. # Significant interaction between genders and obesity at p < 0.05, ## at p < 0.01, ### at p < 0.001. ‡ Significantly different from the control group in the χ² test in each gender at p < 0.05, ‡‡ at p < 0.01, ‡‡‡ at p < 0.001. a,b,c,d Different superscripts indicate significant differences among the groups by Tukey test at p < 0.05.
The proportion of participants with a high KBD intake was higher in the T2DM group than in the control group in men, but the opposite was seen in women. There was no association between the KBD and T2DM risk. PBD was inversely associated with T2DM risk, while WSD was positively associated with it (Table 2). The proportion of participants with a high PBD intake was lower in the T2DM group than in the control group only in women (Table 2). There was no difference in T2DM incidence with RMD in either gender. Alcohol intake was inversely linked to T2DM risk (Table 2). The proportion of participants engaging in moderate exercise was higher in the T2DM group than in the control group in both genders, and exercise was positively associated with T2DM risk, which was related to the exercise recommendation given to T2DM patients. The proportion of smokers was higher in the T2DM group than in the control group in both genders, and smoking was positively associated with T2DM risk (Table 2).

Polygenetic Variants with Their Interaction Related to the T2DM Risk
The overall statistical association of genetic variants with T2DM is shown as a Manhattan plot (Figure 2A), representing the distribution of genetic variants according to their statistical significance. The Q-Q plot shows the quantile distribution of the log of observed p values versus the quantile distribution of the log of expected p values, and the lambda value, the genomic inflation factor, was 1.073, indicating that there was no bias or inflation in the genetic variants from the GWAS for T2DM (Figure 2B).
According to the selection procedure for the genetic variants influencing T2DM risk, 49 SNPs were selected that satisfied the criteria, including p < 5 × 10⁻⁶ for the GWAS, D' < 0.2 in LD, p ≥ 0.05 in HWE, and ≥0.05 in MAF. Ten genetic variants were selected in the SNP-SNP interaction: rs7631705_UBE2E2, rs35612982_CDKAL1, rs2191349_DGKB, rs61160304_PAX4, rs13266634_SLC30A8, rs7034200_GLIS3, rs10811661_CDKN2A/B, rs12764758_IDE, rs60808706_KCNQ1, and rs11651052_HNF1B (Table 3). The characteristics of the genetic variants are listed in Table 3. These ten selected genetic variants were significantly associated with T2DM risk in the city hospital-based and Ansan/Ansung plus rural cohorts at p < 5 × 10⁻⁷ and p < 0.05, respectively. The alleles of rs35612982_CDKAL1, rs61160304_PAX4, rs7034200_GLIS3, rs12764758_IDE, and rs11651052_HNF1B were positively associated with T2DM risk, while the rest of the alleles were inversely linked to it (Table 3).
In the GMDR analysis, models 3, 4, 8, 9, and 10 met the criteria of the sign test (p < 0.05) and CVC (10/10), suggesting that they were candidates for the best model (Table 4). The PRS for the candidate models was calculated, and the association of the PRS with T2DM was determined. The four-SNP model included CDKAL1_rs35612982, CDKN2A/B_rs10811661, KCNQ1_rs60808706, and GLIS3_rs7034200, while the eight-SNP model included these four genetic variants plus UBE2E2_rs7631705, HNF1B_rs11651052, SLC30A8_rs13266634, and PAX4_rs61160304 (Table 4). The PRS of the four-SNP model was associated with T2DM by 6.1 and 5.8 times in models 1 and 2 with different covariates, respectively, while that of the eight-SNP model was associated with T2DM by 10.5 and 9.3 times, respectively (Figure 3). The PRS of SNP models 9 and 10 were linked with T2DM by about 6.0 times (data not shown). Furthermore, the PRS of all 20 genetic variants with gene names and p < 5 × 10⁻⁶ showed a significant association with T2DM by 6.0 times (Figure 3). These results showed that the eight-SNP model was optimal for predicting T2DM risk.
Table 4. Generalized multifactor dimensionality reduction (GMDR) results from multi-locus interaction with genes related to β-cell function for type 2 diabetes risk.

GTEx and Frequency of Tissue/Organ-Specific Expression
The genes harboring the selected genetic variants influencing T2DM risk were expressed in various tissues, including the brain, adipose tissue, adrenal gland, blood vessel, breast tissue, colon, esophagus, heart, kidney, liver, lung, salivary gland, skeletal muscles, nerve, and ovary (Figure 4). Nine genes of the selected genetic variants were included in the GTEx dataset (Figure 4). In Figure 4, expression of the corresponding gene was highest in the red cells, followed by the blue cells, for the risk and non-risk alleles. UBE2E2 was expressed in most tissues but had low expression in the pancreas and blood (Figure 4). The SLC30A8 risk allele exhibited low expression in most tissues but not in the pancreas. The KCNQ1 risk allele had a higher expression in the pancreas, thyroid, and adrenal gland than the non-risk allele and was relatively highly expressed in the heart and kidney (Figure 4). The IDE risk allele was expressed in most tissues but not in the brain and heart, while CDKAL1 expression was relatively low in most tissues and very low in the brain. The expression of the DGKB and GLIS3 risk alleles was very low in most tissues (Figure 4).

Metabolism Related to the Genetic Variants for T2DM Risk
A MAGMA gene-set analysis was performed for the curated gene sets and Gene Ontology (GO) terms obtained from MSigDB. Table 5 displays the gene sets that were significant after Bonferroni correction, or the top 10 gene sets when more than ten were significant, in contrast to GENE2FUNC, which examines enrichment only among prioritized genes. The genetic variants influencing T2DM from the GWAS were mainly related to the regulation of pancreatic β-cell development, which is linked to maturity-onset diabetes of the young (MODY; Table 5). The β-value represented the cumulative association of genetic variants with the specified pathway. As shown in Table 5, the pathways of the regulation of gene expression in endocrine-committed NEUROG3+ progenitor cells, regulation of β-cell development, and pancreatic endocrine progenitor cells are involved in β-cell development and regeneration. These results suggest that the genetic variants linked to T2DM risk by GWAS were mainly related to β-cell mass regulation, which was also linked to MODY. Moreover, genetic variants related to T2DM risk were associated with the negative regulation of insulin secretion. However, the related genes were not among the top 20 SNPs ranked by p-value. Therefore, T2DM in Koreans might be genetically close to MODY and mainly related to smaller β-cell mass.
1 The number of genes related to type 2 diabetes risk. 2 The resulting coefficient from a fit between genetic variants for type 2 diabetes with the pathway. 3 SD, standard deviation of beta. 4 p value for the beta for type 2 diabetes. 5 p value with Bonferroni correction for the beta for type 2 diabetes.
PAX6, paired box protein 6; HNF, hepatocyte nuclear factor; NeuroD1, neuronal differentiation 1; FGF10, fibroblast growth factor 10; ONECUT3, one cut homeobox 3; PDX1, pancreatic and duodenal homeobox 1; ADR2α, adrenergic α2a; CRHR2, corticotropin-releasing hormone receptor 2; KLF7, Kruppel-like factor 7; PDE1c, phosphodiesterase 1C; UCP2, uncoupling protein 2.

Interaction of PRS with Lifestyle Factors to Influence T2DM Risk
Energy intake interacted with PRS influencing T2DM risk, and PRS was positively associated with T2DM risk in high and low energy intakes by 3.59 and 2.96 times, respectively (Table 6). Fasting serum glucose concentrations increased in High-PRS compared to Low-PRS in low and high energy intake groups, but interestingly, they were lower in all the PRS groups in the low energy intake groups compared to the high groups (Table 6, Figure 5A). These results may be due to the lower energy intake in T2DM patients.
Figure 5. (C) Adjusted means and standard errors of the participants according to PRS categories by smoking status (a cutoff value: smoking). Covariates included age, gender, education, income, energy intake (percentage of estimated energy requirement), occupation, residence area, regular exercise, alcohol intake, and smoking status. a,b,c Different letters on the bar indicated significant differences among the groups in Tukey's test at p < 0.05.
Among the four dietary patterns, only WSD interacted with PRS for T2DM risk (p < 0.035; Table 6); KBD, PBD, and RMD did not interact with it. However, this interaction did not remain significant after Bonferroni correction. PRS was positively associated with T2DM risk in high and low WSD intakes by 3.13 and 3.33 times, respectively (Table 6). However, fasting serum glucose was lower in the High-WSD group than in the Low-WSD group in all PRS groups (Figure 5B).
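To make the Bonferroni criterion concrete, the short sketch below divides the nominal significance level by the number of tests and checks whether a given p-value stays below the corrected threshold. The number of tests used here (four, one per dietary pattern) is an assumption for illustration; the study may have corrected over a different number of comparisons.

```python
# Illustrative sketch: Bonferroni correction for several interaction tests.
# The number of tests (four dietary patterns) is an assumption, not the study's.


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha=0.05, n_tests=1):
    """Return True if the p-value remains significant after correction."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# With four tests, the corrected threshold is 0.05 / 4.
threshold = bonferroni_threshold(0.05, 4)
# A nominal p-value of 0.035 is significant alone but not after correction.
nominal_significant = passes_bonferroni(0.035, n_tests=1)
corrected_significant = passes_bonferroni(0.035, n_tests=4)
```

Under this assumption, a p-value of 0.035 is below 0.05 but above the corrected threshold, which matches the pattern described for the WSD interaction.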
Among the other lifestyle factors, alcohol intake and exercise did not interact with PRS to affect T2DM risk, but smoking status did (Table 6). These results suggest that smoking status affected T2DM risk in the participants with High-PRS. Serum glucose concentrations were much higher in the High-PRS group than in the Low-PRS group, especially in smokers (Figure 5C). Therefore, participants with High-PRS should be advised not to smoke, and low energy intake did not reduce T2DM risk.
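The stratified comparisons above can be pictured with a simple calculation of an odds ratio within each lifestyle stratum. The sketch below computes a crude odds ratio with a Wald 95% confidence interval from a 2×2 table of High-PRS versus Low-PRS by case status. It is a simplification: the associations reported in Table 6 are covariate-adjusted, which would require logistic regression, and all counts below are hypothetical values chosen only to make the example run.

```python
import math

# Illustrative sketch: crude odds ratio (High-PRS vs. Low-PRS) per stratum.
# This is NOT the covariate-adjusted analysis of the study; counts are hypothetical.


def odds_ratio_with_ci(exposed_cases, exposed_controls,
                       unexposed_cases, unexposed_controls, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table."""
    cells = (exposed_cases, exposed_controls, unexposed_cases, unexposed_controls)
    if any(c <= 0 for c in cells):
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases
    )
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts per stratum:
# (High-PRS cases, High-PRS controls, Low-PRS cases, Low-PRS controls)
strata = {
    "non-smokers": (120, 880, 60, 940),
    "smokers": (90, 410, 30, 470),
}
results = {name: odds_ratio_with_ci(*counts) for name, counts in strata.items()}

for name, (or_value, lower, upper) in results.items():
    print(f"{name}: OR = {or_value:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A larger odds ratio in one stratum than in the other is what an interaction looks like descriptively; a formal test would add a PRS × lifestyle product term to an adjusted regression model.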

Discussion
The etiology of T2DM in Asians is somewhat different from that of Caucasians [1,2]. T2DM is characterized by increased insulin resistance and β-cell dysfunction. Although both factors are hallmarks of T2DM, recent studies suggest ethnic differences in β-cell function, wherein Asians have less β-cell functional capacity than Caucasians [1,2]. These findings indicate that genetic variants related to β-cell function and mass may influence T2DM risk differently in the two populations. The genetic variants related to T2DM in Caucasians are mainly related to insulin resistance [32], but those in Asians, including Chinese, Japanese, and Koreans, are involved in insulin secretion (GLP1R, PAX4, HNF4A, SLC30A8, HHEX, CDKAL1, CDKN2A/B, and GCKR) [33–36]. The present study showed that the genetic variants related to T2DM risk were involved in pancreatic β-cell development, growth, and insulin secretion and were linked to MODY genes. The best model for genetic variant-genetic variant interaction included CDKAL1_rs35612982, CDKN2A/B_rs10811661, KCNQ1_rs60808706, GLIS3_rs7034200, UBE2E2_rs7631705, HNF1B_rs11651052, SLC30A8_rs13266634, and PAX4_rs61160304. The High-PRS of the eight-SNP model was associated with T2DM by about 10 times. According to genetic variants, the gene expression in the 7-SNP model was not provided in GTEx. However, their gene expression was different in various tissues. Most genes except IDE and UBE2E2 exhibited low expression in most tissues. However, KCNQ1, SLC30A8, and IDE with the risk alleles were relatively highly expressed in the pancreas compared to the non-risk alleles.
MODY is monogenic diabetes caused by a single gene mutation in one of HNF-1α, HNF-1β, HNF-4α, glucokinase, PAX4, KLF11, neurogenic differentiation 1 (neuroD1), or insulin (INS). MODY typically develops before the age of 25 [37,38]. It is linked to defects in pancreatic islet cell development and insulin secretion in young individuals, and it is linked to non-obesity, similar to Asian T2DM [38]. The present study demonstrated that Korean T2DM involved variants of several genes also related to MODY. However, T2DM in this study was not a monogenic disease like MODY, and the participants with T2DM developed diabetes after age 25. Therefore, Asian T2DM was related to low β-cell function and mass, and some of the common genetic variants were in genes also linked to MODY. PAX4 and HNF1β genetic variants were included in the best model, while those of HNF-1α, HNF-1β, KLF11, neuroD1, and glucokinase were additionally involved at p < 0.00001 [39]. Unlike MODY single gene mutations, the combination of genetic variants of several MODY-related genes induced T2DM in Korean adults. These results suggest that the etiology of Korean T2DM is associated with specific genetic predispositions that impair β-cell function and mass, and this theory may be extended to all Asians.
Furthermore, the genetic variants that raise the risk of T2DM were related to β-cell development and survival and insulin secretion in the present study. The study supports the evolving evidence that the higher risk of T2DM in Asians is connected to their smaller pancreatic β-cell mass. The genetic variants that affect pancreatic β-cell mass by modulating β-cell development and survival result in impaired glucose metabolism to induce T2DM [40]. KCNQ1 mutations are reported to reduce insulin secretion and decrease pancreatic β-cell mass [41]. Although the mechanism remains unclear, KCNQ1 mutations are believed to increase cyclin-dependent kinase inhibitor 1C (CDKN1C) through epigenetic modification [42]. Along with MODY genes, KCNQ1 mutations could influence T2DM by decreasing pancreatic β-cell mass and insulin secretion, consistent with the present study.
SLC30A8 is also involved in providing zinc for insulin-hexamer formation, and it binds to PDX-1, which is linked to pancreatic β-cell growth. The decrease in zinc transport due to its mutation is associated with T2DM risk [43]. The PDX-1 mutation can suppress its binding to SLC30A8, decreasing β-cell growth to modulate T2DM susceptibility. Glucose-stimulated insulin secretion in the islets from SLC30A8 knockout mice is also suppressed [43]. SLC30A8 is involved in β-cell growth, proinsulin modification, and glucose-stimulated insulin secretion. Carriers of its missense, protein-truncating mutation (Trp325Arg; rs13266634) are reported to exhibit a 65% reduction in T2DM risk [43], which was consistent with the present study (OR = 0.8529, p < 8.22 × 10−12). The SLC30A8 mutation is involved in releasing insulin secretory granules through zinc flux and proinsulin modification. The risk allele of the Trp325Arg mutation reduces zinc transporter activity, but its impact on glucose metabolism has varied across observational studies. The mutation is associated with modest hyperglycemia under a high-fat diet.
In addition, CDKAL1 variants, including rs35612982, are strongly involved in increased T2DM and obesity risk [44]. CDKAL1 acts as a tRNA Lys modifier, and CDKAL1 loss impairs proinsulin translational fidelity in pancreatic β-cells, contributing to glucose-dependent insulin secretion [44]. Its variants also affect insulin response in persons of European ancestry [45]. The CDKN2A/B locus encodes p15 and p16, inhibitors of cyclin-dependent kinase 4 (CDK4), which regulates pancreatic β-cell replication [46]. The increased expression of CDKN2B induces hypoplasia of the exocrine and endocrine glands in rodents, and CDK4 inhibition induces insulin deficiency due to decreased pancreatic β-cell counts [46]. CDKN2A/B_rs10811661 carriers are susceptible to T2DM risk in different ethnic groups, including the Iraqi, Chinese, and Tai populations [46–48]. Consistent with the present study results, Asians carrying CDKAL1_rs35612982, CDKN2A/B_rs10811661, and KCNQ1_rs60808706, which are mainly linked to β-cell mass, have an elevated susceptibility to T2DM, suggesting that T2DM in Asians may be closely related to reduced β-cell mass and function.
T2DM risk is linked to genetics and multiple lifestyle factors. Managing this interaction could modulate the risk of developing T2DM in genetically susceptible individuals. The PRS of the best model for T2DM interacted with energy intake, but it did not interact with the macronutrient composition in the present study. Previous studies have demonstrated that some of the T2DM-associated genetic variants included in the PRS of the present study interact with nutrients and other lifestyle factors [49]. Furthermore, the PRS interacted with WSD, and in a high WSD intake, High-PRS increased T2DM risk compared to Low-PRS in the present study. A systematic review reported that Western dietary patterns interacted with T2DM-related PRS, fat and carbohydrate intake interacted with IRS1_rs2943641, and physical activity interacted with HNF1B, IRS1, PPARγ, ADRA2B, SLC2A2, and ABCC8 variants in people of European ancestry [5,49]. In Asians, PRS related to T2DM interacted with energy and calcium intake [36], and PRS related to insulin secretion interacted with Western-style diets [50]. Asian ancestors were once insulin-sensitive due to high dietary fiber intake and good physical activity [51]. However, significant lifestyle changes, including drastic shifts in diet to WSD and a more sedentary life over the last three decades, are believed to have led to elevated insulin resistance. WSD is characterized by a high-fat and low-fiber diet, which promotes acidic conditions in the body [52]. Acid load, mainly acting through mineral balance, is reported to damage bone [52], and it may also be involved in β-cell dysfunction. Therefore, the results suggest that carriers with a high PRS should be advised to maintain an adequate energy intake with sufficient nutrients but to avoid a Western-style diet.
This study has some strengths and limitations. Its strength was that it was conducted in a large hospital-based cohort in Korea (n = 58,701). It was also well designed, and specialists collected test samples uniformly from the volunteers. However, it had some limitations. First, it was a cross-sectional study, and the results cannot establish cause and effect. Second, daily food intake during the past six months was estimated from the SQFFQ, which included 106 foods and dishes commonly consumed by Koreans. The SQFFQ could not evaluate the exact amounts of each food and dish, but it represented the usual nutrient intake and could be appropriate for estimating it in big cohort studies. Third, regular exercise was defined as more than 30 min of moderate physical activity for three or more days per week, a definition that was not validated. Fourth, genetic variants were determined with a customized K-chip (Axiom Biobank plus Genotyping Array, KNIHv1.1), which contained tagging SNPs that maximized genomic coverage and functional SNPs such as nonsynonymous SNPs, expression quantitative trait loci (eQTL), and previously reported disease-associated SNPs [53]. Its reproducibility and accuracy were 99.77 and 99.73%, respectively [53], and the K-chip is optimal for exploring the genetic variants for T2DM risk in Koreans.
In conclusion, the participants with a High-PRS of the eight-SNP model that included rs7631705_UBE2E2, rs35612982_CDKAL1, rs61160304_PAX4, rs13266634_SLC30A8, rs7034200_GLIS3, rs10811661_CDKN2A/B, rs60808706_KCNQ1, and rs11651052_HNF1B showed a 10-fold elevation in T2DM risk compared to those with a Low-PRS. The selected genetic variants were mainly involved with regulating pancreatic β-cell mass and function, and the PRS interacted with lifestyles. This study clearly demonstrated that Asians with a high T2DM-related genetic predisposition are at much greater risk of developing diabetes, and that risk is exacerbated by interaction with a Westernized lifestyle, including diet, sedentary lifestyles, and smoking. PRS interacted with energy intake, WSD, and smoking, indicating that the participants with high PRS generally failed to compensate for high insulin resistance by elevating insulin secretion, especially when having WSD and smoking. Therefore, this study suggests that people with a high PRS may benefit from consuming adequate nutrition to meet EER while decreasing intakes of WSD and avoiding smoking. These results provide a potential genetic mechanism that places individuals at high risk of T2DM and readily interacts with diet and lifestyle to facilitate progression to T2DM. They can be applied in precision nutrition to prevent and alleviate T2DM.