NGS Sequencing Reveals New UCP1 Gene Variants Potentially Associated with MetS and/or T2DM Risk in the Polish Population—A Preliminary Study

The number of people suffering from metabolic syndrome (MetS) including type 2 diabetes (T2DM), hypertension, and obesity increased over 10 times through the last 30 years and it is a severe public health concern worldwide. Uncoupling protein 1 (UCP1) is a mitochondrial carrier protein found only in brown adipose tissue involved in thermogenesis and energy expenditure. Several studies showed an association between UCP1 variants and the susceptibility to MetS, T2DM, and/or obesity in various populations; all these studies were, however, limited to a few selected polymorphisms. The present study aimed to search within the entire UCP1 gene for new variants potentially associated with MetS and/or T2DM risk. We performed NGS sequencing of the entire UCP1 gene in 59 MetS patients including 29 T2DM patients, and 36 controls using the MiSeq platform. An analysis of allele and genotype distribution revealed nine variations which seem to be interesting in the context of MetS and fifteen in the context of T2DM. Altogether, we identified 12 new variants, among which only rs3811787 was investigated previously by others. Thereby, NGS sequencing revealed new intriguing UCP1 gene variants potentially associated with MetS and/or T2DM risk in the Polish population.


Introduction
The term metabolic syndrome (MetS) is defined as the co-occurrence of related metabolic disorders such as elevated blood glucose levels, hypertension, lipid disorders, elevated body mass index (BMI), abdominal obesity, and insulin resistance, among others. These disorders promote the development of cardiovascular disease (CDV) (stroke, coronary heart disease, and myocardial infarction) and type 2 diabetes (T2DM) [1,2].
The WOBASZ II study, published in 2021, indicated that in Poland, the prevalence of MetS was 32.8% in women and 39% in men [3,4]. Moreover, when compared to the previous study (WOBASZ I), a significant increase in the prevalence of MetS in Polish adults (aged 20-74) was observed (3.3% in women and 8.8% in men) in only one decade. This was due to the increased frequency of elevating fasting glucose or diabetes, abdominal obesity (especially in women), and lipid disorders (mainly in men) [3,4].
Despite advances in diagnostics, treatment, and prevention, the prevalence of CVD reportedly increases year after year worldwide [5], and this disease remains the leading cause of death in developed countries. An estimated 17.9 million people died from CVD in of the UCP1 gene. Therefore, the aim of the presented study was to search, using next generation sequencing (NGS) technology, for new, not previously studied UCP1 gene variants, that are potentially associated with MetS and/or T2DM risk.

Patients
For the purpose of this study, a group of 59 patients was recruited based on the following criteria: (1) of adult age; (2) diagnosed with MetS according to IDF (International Diabetes Federation) guidelines; (3) having central obesity, defined as waist circumference ≥ 80 cm in women or ≥94 cm in men; and (4) having at least two of the following factors: (i) triglycerides level ≥ 150 mg/dL (1.7 mmol/L) or treatment specific for this abnormality; (ii) HDL level < 40 mg/dL (1.03 mmol/L) in men or <50 mg/dL (1.29 mmol/L) in women or treatment specific for this abnormality; (iii) systolic blood pressure ≥ 130 mmHg or diastolic blood pressure ≥ 85 mmHg, or treatment of diagnosed hypertension; (iv) fasting blood glucose ≥ 100 mg/dL (5.6 mmol/L) or previously diagnosed T2DM. Among the selected patients, 29 of them also suffered from T2DM. Patient samples were collected during the project U-GENE "Multi-national network of excellence for research on genetic predisposition to cardio-metabolic disorders due to UCP1 gene polymorphisms" (European Union 7th Framework Program-FP7-PEOPLE-2013-IRSES Grant No. 319010) from 2013 to 2017. The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board Bioethics Committee at Poznan University of Medical Sciences (KB 215/13, annexed 85/16). The patients provided their written informed consent to participate in this study. Patient characteristics are presented in Table 1.

Controls
The control group consisted of 36 participants who were derived from the same geographical area as the patients. Healthy participants were asked to fill in questionnaires where they provided information about age, sex, smoking status, and any present illnesses. Additionally, on the day of blood collection, parameters such as height, weight, and waist circumference were measured, and WHR and BMI were calculated. The control group were selected based on the following criteria: (1) of adult age; (2) BMI ≤ 25; (3) non-smoking; (4) body fat percentage below 35% for women, and below 30% for men; and (5) no diagnosis of hypertension, elevated fasting glucose or diabetes, or thyroid diseases ( Table 1). The participants provided their written informed consent to participate in this study.

DNA Isolation and Genotyping
Genomic DNA was isolated from refrozen blood samples by Invisorb QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) or Spin Blood Mini Kit (Stratec Molecular, Birkenfeld, Germany) according to the manufacturer's instructions. The quality and quantity of the DNA samples were determined using DS-11 spectrophotometer (DeNovix, Wilmington, DE, USA) and Qubit 3.0 Fluorometer (ThermoFisher Scientific, Waltham, MA, USA) using Qubit dsDNA HS Assay (ThermoFisher Scientific, Waltham, MA, USA). Integrity of DNA was determined by capillary electrophoresis on TapeStation 4200 (Agilent Technologies, Santa Clara, CA, USA) using Genomic DNA ScreenTape Assay (Agilent Technologies, Santa Clara, CA, USA).

Library Preparation and Next Generation Sequencing
High quality DNA was selected for NGS sequencing. We performed targeted panel sequencing to identify genetic variants within the UCP1 gene for MetS patients and controls.
To validate the obtained NGS sequencing results, we compared the results of four UCP1 polymorphisms previously genotyped (using TaqMan probes) by us: rs1800592, rs10011540, rs45539933, and rs2270565. The degree of concordance between the results was 100%.
To identify the rsIDs and frequencies of considered UCP1 gene variants, we used USCS and dbSNV databases (available at https://genome.ucsc.edu/index.html and https://www.ncbi.nlm.nih.gov/SNV/, accessed on 24 May 2021). In short, the chromosome position of the variant was pasted to the USCS browser with the open track for Short Genetic Variants from dbSNV release 155 and 153. Next, if the rsID for the searched position was present, the rsID was pasted into the dbSNV browser to retrieve more infor-mation about the specific SNV including variation frequency, localization within the gene, variation type, and variation consequence. Additionally, selected SNVs were searched for in the VarSome database (available at https://varsome.com/, accessed on 24 May 2021) for ACMG Classification and Conservation Score (phyloP100).
The difference between allele and genotype distribution in the patient and control groups was analyzed with χ2 tests. Odds ratios (OR) and 95% confidence intervals (95% CI) were calculated using binary logistics regression model to evaluate the relationship between the studied gene variants and susceptibility to MetS. Differences between groups were considered statistically significant if p < 0.05. Statistical analysis was conducted using Statistica 13.1 (TIBCO, Inc., Palo Alto, CA, USA).
Linkage disequilibrium (LD) analysis was prepared using the online LDpair tool (available at https://ldlink.nci.nih.gov/?tab=ldpair, accessed on 24 May 2021), which allows for the investigation of correlated alleles for a pair of variants in high LD based on data available on dbSNV. LDpair analysis was performed on GRCh38 High Coverage (1000 Genomes Project) in the CEU (Utah residents with Northern and Western European ancestry) population. The correlation of alleles for two genetic variants was calculated by R 2 method for UCP1 gene variants with rsIDs. Gene variants with a frequency below 1% in the European population were excluded from LD analysis due to their rarity. Additionally, we calculated LD for SNVs associated with MetS and/ or MetS with T2DM using SHEsis software [29] on the data from the control group.

UCP1 Gene Variants Identified by NGS Sequencing
Using NGS sequencing, we identified 85 sequence variations of the UCP1 gene in controls and MetS patients in total (Table S1). In detail, thirty-five gene variants were located in the 5 UTR region, forty-three in the intronic regions, six in the exonic regions, and three in 3 UTR region. Based on data from USCS and dbSNV, we were able to identify 59 variants from the detected 85 sequence variations. Between the detected variants, SNVs constituted the largest group. Among them, based on dbSNV, 14 were found to have a frequency of less than 1% in the European population.
Analysis of allele and genotype distribution revealed nine variations which seem to be interesting in the context of MetS risk. For 7 of those variants, we were able to identify the rsIDs. Their locations were as follows: one within 3 UTR region (rs77149926), three within coding region (chr4:140560701, rs183105785, chr4:140563477), and five within 5 UTR region (rs36207410, rs36207408, rs76129861, rs56724370, rs77178927) of the UCP1 gene. For the following SNVs: rs77149926, rs36207410, rs36207408, rs76129861, rs56724370, and rs77178927 minor alleles were significantly less frequent in MetS patients than in the control group, while for those: chr4:140560701, rs183105785, and chr4:140563477 minor alleles were significantly more frequent in MetS patients compared to the control group (Tables 2 and S1).  Table 3). Out of 15 variants, we identified rsIDs for eight positions. Among them, one appeared to be an insertion (rs3138733).
The distribution of genotypes and alleles in MetS patients without T2DM did not differ significantly from that observed in controls (Table S2).
As presented in Table 4, five variants were differently distributed in the whole group of MetS patients as well as MetS with T2DM compared to controls. Furthermore, we observed variants which were distributed differently as compared to controls only in MetS patients with T2DM, especially within the coding and 5 UTR region of the UCP1 gene.
Altogether 12 variants, for which distribution of allele and genotypes in MeS and/or MetS with T2DM and controls were different, were found to have rsIDs. To see which variants were previously studied by others, we searched every variant in dbSNV and VarSome databases and investigated the publications tab. In this analysis, we focused on the variants found in our study: rs77149926, rs183105785, rs3138733, rs3811787, rs36207410, rs6536992, rs6536993, rs6536994, rs36207408, rs76129861, rs56724370, and rs77178927. Our analysis of dbSNV revealed that only one of these variants was studied previously by others (rs3811787) ( Table 4). In the case of the remaining 11 variants, there were no articles studying their correlation with MetS and/or T2DM risk (Table 4).
In summary, five SNVs were associated with the susceptibility to MetS in the whole group of patients as well as MetS with T2DM, ten were associated only with MetS with T2DM, while four SNVs were significantly associated with MetS in the whole group of patients. Among them, only rs3811787 was investigated previously. However, we would like to draw attention to the fact that the results achieved should be treated with caution, since the compared groups were small and the results of these analyses can only be treated as direction for future investigations.  [32]; ** potentially associated with susceptibility to disorder, + yes; -not associated.

LD Analysis of SNVs Idetyfied by NGS Sequencing
Taking into consideration the limitation of the LDpair tool, analysis was performed only for the identified UCP1 gene variants (established rsIDs) with a minor allele frequency (MAF) higher than 1% in the European population, and also, the variant must be present on the LDpair tool. We performed LD analysis separately for variants located within regulatory regions (3 UTR and 5 UTR) and coding regions of the UCP1 gene.
In the case of coding regions based on the results from the LDpair software, we identified two very strong LD blocks with r 2 = 1. The first LD block consisted of four variants: rs726989, rs2043124, rs11932232, and rs6818140, and the second LD block comprised two variants: rs2270565 and rs45539933 (Figure 1).
The LD analysis of regulatory regions showed that there were four very strong LD blocks with r 2 = 1 in the 5 UTR region of the UCP1 gene. The first LD block consisted of three variants: rs10011540, rs3749539, and rs3811789; the second LD block comprised six variants: rs36207410, rs36207408, rs76129861, rs56724370, rs77178927, and rs79430751; the third LD block included five variants: rs1430579, rs1430578, rs1472268, rs1472269, and rs1800592; and the last LD block contained three variants: rs6536992, rs6536993, and rs1434525671. In the 3 UTR region, there were no significant LD blocks (Figure 2). The LD analysis of regulatory regions showed that there were four very strong LD blocks with r 2 = 1 in the 5′UTR region of the UCP1 gene. The first LD block consisted of three variants: rs10011540, rs3749539, and rs3811789; the second LD block comprised six variants: rs36207410, rs36207408, rs76129861, rs56724370, rs77178927, and rs79430751; the third LD block included five variants: rs1430579, rs1430578, rs1472268, rs1472269, and rs1800592; and the last LD block contained three variants: rs6536992, rs6536993, and rs1434525671. In the 3′UTR region, there were no significant LD blocks (Figure 2).
The analysis performed with the use of SHEsis software on the genotyping data for SNVs potentially associated with MetS or MetS with T2DM (Table 4) showed that all SNVs, except one polymorphism (rs3811787) located in the regulatory region of UCP1 gene, were in the strong LD (r 2 = 1). However, in the coding region, only one LD block including chr4:140561340, chr4:140564690, and chr4:140567294 with moderate LD (r 2 ≈ 0.7) was observed ( Figure S1). The analysis performed with the use of SHEsis software on the genotyping data for SNVs potentially associated with MetS or MetS with T2DM (Table 4) showed that all SNVs, except one polymorphism (rs3811787) located in the regulatory region of UCP1 gene, were in the strong LD (r 2 = 1). However, in the coding region, only one LD block including chr4:140561340, chr4:140564690, and chr4:140567294 with moderate LD (r 2 ≈ 0.7) was observed ( Figure S1).

Discussion
For years, the treatment of chronic metabolic diseases and obesity focused on developing a properly balanced diet and increasing energy consumption through increased physical exercise. In recent years, multiple studies documented that MetS, including obesity and T2DM, are multifactorial diseases whose development is not solely determined by environmental factors. Numerous studies showed that multiple biochemical and genetic factors play an important role in MetS pathogenesis [33,34]. Among others, it was demonstrated that all three members of the peroxisome proliferator-activated receptor (PPAR) nuclear receptor subfamily, PPARα, PPARβ/δ, and PPARγ, are critical in regulating insulin sensitivity, adipogenesis, lipid metabolism, and blood pressure. Recently, several studies indicated that SNVs of PPARs, such as Leu(162)Val and Val(227)Ala of PPARα, +294T > C of PPARβ/δ, Pro(12)Ala, and C1431T of PPARγ, are significantly associated with the onset and progression of MetS and T2DM in different populations worldwide. Furthermore, it was demonstrated that the glucose metabolism and lipid metabolism were influenced by gene-gene interaction among PPARs genes [35,36].
On the other hand, it is postulated that abnormal metabolism of BAT and disturbed thermogenesis led to obesity in mice [37,38]. This observation attracted the interest of scientists to the study of the potential role of BAT in the development of MetS. Attention was drawn to genes encoding the UCP1 protein and adrenergic receptors as genes potentially being associated with the predisposition to MetS and related diseases.
The transcriptional regulation of the gene is located in the 5 non-coding region. In the UCP1 gene, this region includes a proximal regulatory region, immediately upstream of the transcription start site that contains a CREB (cAMP response element-binding protein) binding site, which mediates a positive transcriptional response to cAMP and a negative response to AP2 (c-Jun/c-Fos) complexes, and a C/EBP (CCAAT-enhancer-binding protein) binding site capable of responding to C/EBPa and C/EBPb. Recently, the transcription factor zinc finger protein-516 (Zfp516) was shown to bind to the proximal region of the UCP1 gene promoter to play a role in cold-induced UCP1 gene expression. In addition to the proximal region, there is a strong enhancer region that contains a cluster of response elements for nuclear hormone receptors that bind PPARα/retinoid X receptor (RXR), PPARγ/RXR, RAR/RXR, and thyroid receptor (TR)/RXR heterodimers, providing responsiveness to ligands of PPARγ and PPARα (e.g., naturally occurring fatty acid derivatives, specific drugs such as fibrates and thiazolidinediones), retinoic acids, and thyroid hormones. Moreover, in the UCP1 promoter, the activating transcription factor-2 (ATF2)-binding site that provides cAMP-dependent responsiveness to CP1 gene transcription is also present. Transcriptional coregulators, such as the PPAR-g coactivator 1-a (PGC-1a), bind and regulate PPAR/RXR heterodimers and possibly other nuclear receptor dimer complexes. It postulated that the enhancer region is strongly involved in determining BAT-specific transcription of the UCP1 gene [16].
Given the above, in the present study, we sequenced the entire UCP1 gene using NGS technology to search for new and potential disease risk markers that may be worth studying in depth in the future. We found that five UCP1 variants were differently distributed in the overall group of MetS patients, as well as MetS with T2DM compared to controls. Among them, the chr4:140563477 variation is located in exon 3 of the UCP1 gene. This variation leads to the exchange of the allele T to C resulting in a missense mutation, causing substitution from threonine (T) to alanine (A) in the position 123 (T123A). Threonine is a polar amino acid, whereas alanine is a non-polar hydrophobic amino acid. Therefore, this T123A modification may have an influence on proper UCP1 protein structure.
Furthermore, we observed SNVs which were distributed differently compared to controls only in the MetS patient group with T2DM. rs6536992, rs6536993, and rs6536994 are located in the 5 UTR region and situated in potential binding sites for transcription factors such as Prdm5, ERS1, RUNX2, PRDM9, MEF2C, and TEAD4 (based on JASPAR Transcription Factor Binding Site Database in UCSC). Therefore, the presence of those variants within the 5 enhancer or promotor regions may disturb the binding of specific transcription factors.
Another SNV of interest in this study, rs3811787 located in the 5 UTR region, was the only SNV indicated here that was previously described in the literature. The association between rs3811787 and diabetic retinopathy risk in T2DM was investigated by Jin et al. [31]. It was shown that in patients with uncontrolled blood glucose, the rs3811787 T allele was associated with a decreased risk of diabetic retinopathy. In a subsequent study in Korean women, Cha et al. found an association between rs3811787 and rs1800592 and body fat accumulation and obesity [30]. Moreover, Nikaranova's study [32] showed the influence of rs3811787 on the leptin-mediated thermoregulation mechanism. In this study, the prevalence gradient of the rs3811787T allele of UCP1 increased from the south to the north across Eurasia, along the shore of the Arctic Ocean. Thereby, the authors suggest the potential involvement of the UCP1 gene, in particular the rs3811787 T allele, in better activation of the UCP1 gene and higher activity of BAT [32]. Our very preliminary data also confirmed the decreased risk of T2DM individuals possessing the rs3811787 TT genotype.
In summary, our study revealed new and interesting genetic variations of the UCP1 gene worthy of studying in the future in the context of MetS and/or T2DM susceptibility. Most of the variants identified were located in the regulatory regions of the UCP1 gene suggesting their potential role in the regulation of UCP1 expression. However, further studies are needed to elucidate their role in MetS and T2DM susceptibility.
The main limitation of this study was the small sample size of the sequenced MetS and control group, which made it impossible to draw far-reaching conclusions in regard to disease risk. However, the aim of this study was only to search for new or interesting disease markers in the context of MetS and/or T2DM disease. The results obtained serve as preliminary research for wider projects, in which we plan to investigate the association of the new UCP1 gene variants discovered here that may be potentially associated with MetS and/or T2DM susceptibility in much bigger groups of MetS and T2DM patients. Secondly, the library of coding and regulatory regions (5 UTR and 3 UTR) were prepared using two different Illumina kits (TruSeq and AmpliSeq). The first algorithm diagnosed coding regions (TrueSeq) based on specific probes, while the second (AmpliSeq) was based on the sequencing of amplicons. Therefore, theoretically, the results may differ in the frequency of identified variants. However, the library for the fragment of the 5 UTR region was prepared using both TruSeq and AmpliSeq kits, and NGS results were in full compliance. The third limitation is the mismatch of ages between the older MetS and the younger control groups. Fortunately, the potential of the control subjects to develop MetS later in their lives should lead to underestimation of their enhancing or protective effects, which, therefore, should leave our conclusions largely intact.

Conclusions
In conclusion, to date, only a limited number of genetic variants present within the UCP1 gene were studied (rs1800592, rs10011540, rs3811791, rs45539933, and rs2270565). Our preliminary NGS data showed that many other genetic variants were present within the UCP1 gene. Among the identified SNVs, some of them are worth taking into consideration when studying the susceptibility to MetS and/or T2DM. Moreover, we identified a few LD blocks, which enables a more economical approach to further research by selecting the tag SNVs representing each LD block. Therefore, the presented results might be treated as directions for future studies on the association between UCP1 gene SNVs and MetS/T2DM risk on larger groups of cases and controls in different populations.