Identification of Novel Intronic SNPs in Transporter Genes Associated with Metformin Side Effects

Metformin is a widely used and effective medication in type 2 diabetes (T2DM) as well as in polycystic ovary syndrome (PCOS). Single nucleotide polymorphisms (SNPs) contribute to the occurrence of metformin side effects. The aim of the present study was to identify intronic genetic variants modifying the occurrence of metformin side effects and to replicate them in individuals with T2DM and in women with PCOS. We performed Next Generation Sequencing (Illumina NextSeq) of 115 SNPs in a discovery cohort of 120 metformin users and conducted a systematic literature review. Selected SNPs were analysed in two independent cohorts of individuals with either T2DM or PCOS, using a 5′-3′ exonuclease assay. A total of 14 SNPs in the organic cation transporters (OCTs) showed associations with side effects in an unadjusted binary logistic regression model, with eight SNPs remaining significantly associated after appropriate adjustment in the discovery cohort. Five SNPs were confirmed in a combined analysis of both replication cohorts but showed different association patterns in subgroup analyses. In an unweighted polygenic risk score (PRS), the risk for metformin side effects increased with the number of risk alleles. Intronic SNPs in the OCT cluster contribute to the development of metformin side effects in individuals with T2DM and in women with PCOS and are therefore of interest for personalized therapy options.


Introduction
Metformin is the most commonly prescribed drug for the treatment of type 2 diabetes (T2DM) as well as other indications requiring insulin-sensitizing drugs, such as polycystic ovary syndrome (PCOS), presenting with insulin resistance, hyperandrogenemia and female infertility [1]. Recently, metformin has been repurposed and tested for a number of other diseases [2][3][4][5], such as dementia, and also has implications for healthy ageing [6]. Metformin is known to be safe in long-term use. Lactic acidosis has been described in people treated with metformin; however, a Cochrane systematic review did not show a significantly increased risk of lactic acidosis with metformin compared to other glucose-lowering drugs [7]. The most commonly reported side effects are gastrointestinal issues such as nausea, vomiting, flatulence, indigestion, abdominal discomfort, heartburn, bloating and diarrhoea. Up to 25% of metformin users suffer from these symptoms, with 5% not able to tolerate metformin at all [8,9]. Furthermore, these side effects strongly decrease persistence of therapy, since up to 48% of metformin users are non-adherent within the first year of metformin intake [2]. Metformin is mainly absorbed by the upper small intestine (20% in the duodenum, about 60% in the jejunum and ileum) and has an absolute bioavailability of 50-60%, with the highest levels in the liver and jejunal sites [3,10]. It is excreted unchanged in urine [3], and about 30% is found in faeces [9]. The main transporters involved in metformin transport are the organic cation transporters (OCTs) and the multidrug and toxin extrusion protein 1 (MATE1). The human OCTs consist of three closely related members: OCT1 (encoded by SLC22A1), OCT2 (encoded by SLC22A2) and OCT3 (encoded by SLC22A3). OCT1 and OCT2 are 70% identical in protein sequence, whereas OCT3 shares 50% sequence homology with OCT1 and OCT2 [11]. In excretory organs, OCTs frequently team up with the MATE1 protein, encoded by the MATE1 gene (also known as SLC47A1), to mediate transepithelial transport of organic cations. Human MATE1 has only one isoform, 570 amino acids in length [12], and is predicted to have 13 transmembrane domains with an extracellular carboxyl terminus and an intracellular amino terminus [13]. GLUT2, encoded by the SLC2A2 gene, is present in the basolateral membrane of enterocytes and of epithelial cells from the kidney, where it functions in the second step of transepithelial glucose transport [14]. The intra-individual variability in the efficacy and occurrence of side effects of drugs is highly heritable [15]. Coding SNPs in transport proteins are not only associated with reduced uptake or enhanced elimination, but also with the presence of side effects [8,16,17]. While some non-coding SNPs in the OCT cluster have been described to influence metformin efficacy, data about their involvement in side effect occurrence are scarce [18,19].
The aims of the study were (1) to identify intronic SNPs in genes encoding the OCT cluster that are associated with the occurrence of side effects during metformin use, and to replicate these findings in two cohorts of individuals with disturbances of insulin sensitivity on metformin medication, namely individuals with T2DM and women with PCOS; and (2) to investigate published metformin-related SNPs in relation to metformin side effects.

Discovery Study
A discovery cohort of 120 metformin users was recruited between September 2016 and 2018 in the Outpatient Clinic of the Division of Endocrinology and Diabetology. Participants aged between 18 and 80 years with PCOS or T2DM who had received metformin therapy for at least 1 month were included. PCOS was diagnosed according to the Rotterdam criteria [20], and T2DM according to the current guidelines for T2DM [21]. In order to reduce possible confounding factors from critically ill patients, we set the following exclusion criteria: individuals with GFR < 30, anaemia (haemoglobin < 12 g/dL), more than double the normal value of ASAT (men > 70, women > 60 U/L), ALAT (men > 90, women > 70 U/L) or GGT (men > 110, women > 76 U/L), or creatinine levels ≥ 1.5 mg/dL (men) or ≥ 1.4 mg/dL (women) were excluded from the study. Data on medical history, anthropometry and concomitant medications were collected, and metformin intake and side effects were assessed by questionnaires. All participants gave their written informed consent prior to inclusion in the study. All protocol procedures were approved by the local Ethics Committee of the Medical University of Graz (EC-number 26-020 ex 13/14).
Anthropometric data: body weight, precise to 0.1 kg, was determined using an electronic scale (model SECA 764, Hamburg, Germany); height, precise to 0.1 cm, was measured using a fixed stadiometer; BMI was calculated as weight in kilograms divided by height in meters, squared (kg/m²). Hip and waist circumferences were measured in centimeters according to WHO guidelines (WHO 2012).
Definition of metformin side effects: metformin side effects were defined as the report of adverse effects after intake of metformin, regardless of dose, after an initial 1-month run-in period.
Concomitant medications: we identified medications reported to inhibit OCTs, proteins that mediate transmembrane trafficking of their target molecules and are required for metformin absorption in the gut, as described by Dawed et al. [16]. The use of the following drugs was included as a covariate in binary logistic regression analysis: tricyclic antidepressants (TCAs), proton pump inhibitors (PPIs), citalopram, verapamil, diltiazem, doxazosin, spironolactone, clopidogrel, rosiglitazone, quinine, tramadol, codeine, disopyramide, quinidine, repaglinide, propafenone, ketoconazole, morphine, tropisetron, ondansetron, antipsychotic agents and tyrosine kinase inhibitors.
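The BMI formula and the laboratory exclusion criteria described above can be written down as a small check. This is only an illustrative sketch: the function names, the argument order and the `LIMITS` table are hypothetical, the GFR unit is not stated in the text and is passed through as given, and the cut-offs are simply the values quoted in the exclusion criteria.

```python
# Sex-specific cut-offs copied from the exclusion criteria in the text
# (ASAT, ALAT, GGT in U/L; creatinine in mg/dL). The dict layout is an
# assumption made for this sketch.
LIMITS = {
    "male": {"asat": 70, "alat": 90, "ggt": 110, "creatinine": 1.5},
    "female": {"asat": 60, "alat": 70, "ggt": 76, "creatinine": 1.4},
}


def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by squared height in m."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def is_excluded(sex, gfr, haemoglobin, asat, alat, ggt, creatinine):
    """Return True if any exclusion criterion from the text is met."""
    lim = LIMITS[sex]
    return (
        gfr < 30                       # reduced kidney function
        or haemoglobin < 12            # anaemia, g/dL
        or asat > lim["asat"]          # liver enzymes above the quoted limits
        or alat > lim["alat"]
        or ggt > lim["ggt"]
        or creatinine >= lim["creatinine"]
    )
```

A participant record would be passed through `is_excluded` before genotyping; keeping the cut-offs in one table makes it easy to see which limit is sex-specific.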

Replication Cohort 1-Individuals with T2DM
Samples from the Graz Diabetes Registry for Biomarker Research (GIRO), a prospective cohort study at the Outpatient Clinics for Diabetes, Lipids and Metabolic Disease at the Division of Endocrinology and Diabetology, including individuals with type 1 diabetes, T2DM, rare types of diabetes like maturity-onset diabetes of the young, type 3 diabetes, obese people undergoing bariatric surgery and patients with lipid metabolism disorder, were used for replication. Participants with T2DM on metformin therapy were additionally contacted by phone to collect data about metformin intake and potential side effects.

Replication Cohort 2-Women with PCOS
The second replication cohort consisted of 178 women with PCOS from the local PCOS Cohort Registry, who were treated with metformin for at least 3 months. The women were routinely referred to our Outpatient Endocrinology Clinics from 2006 to 2011 for PCOS assessment according to Rotterdam criteria [20]. All protocol procedures were approved by the local ethics committee of the Medical University of Graz (EC-number 18-066 ex 06/07). Information about side effects was collected either with a questionnaire during control visits to check for compliance or by phone call.

DNA Isolation and SNP Genotyping
Chromosomal DNA was isolated either from EDTA blood or from serum. Blood samples of the discovery cohort, as well as replication cohort 2, were collected in tubes containing EDTA as anticoagulant. DNA isolation was performed with the NucleoSpin Blood Kit (Macherey-Nagel, Düren, Germany) according to the manufacturer's instructions. For samples of replication cohort 1, chromosomal DNA was isolated from serum with the ChargeSwitch™ gDNA Serum Kit (Thermo Fisher Scientific Inc., Bothell, WA, USA) according to the manufacturer's instructions. DNA quantity and quality were assessed with the QuantiFluor® dsDNA System (Promega GmbH, Walldorf, Austria). Isolated DNA was diluted threefold and 1 µL was used for genotyping with predesigned TaqMan SNP genotyping assays (Thermo Fisher Scientific Inc., Bothell, WA, USA). Endpoint fluorescence was measured with the Fluoroskan Ascent system (Thermo Labsystems, Fischer Scientific GmbH, Wien, Austria). Fluorescence data were analyzed as scatter plots.

Functional Analysis of Intronic SNPs Expression Quantitative Trait Locus Analyses
Examined intronic SNPs were tested for cis-expression quantitative trait loci (eQTLs) or splice QTLs (sQTLs) in any tissue, but particularly in gastrointestinal tissues and the liver, by use of the Genotype-Tissue Expression (GTEx, RRID: SCR_013042) data release V8 [30][31][32].
Changes in miRNA sequences or their binding sites were investigated by use of miRdSNP v11.03, Center for Computational Research, SUNY at Buffalo [35].

Statistical Analysis
Data are presented as mean ± standard deviation (SD) unless otherwise stated. Nominal variables were analyzed using the χ² and Fisher exact tests. The Shapiro-Wilk test was used to examine for normal distribution. Differences in continuous parameters between genotypes were assessed using analysis of variance (ANOVA) or analysis of covariance, the Mann-Whitney U test, and the Kruskal-Wallis test. Binary logistic regression models were used to determine factors influencing the presence of side effects after metformin intake. Models were calculated both unadjusted and adjusted for age, concomitant medication, sex and weight according to Dujic et al. [8]. Due to the low number of homozygous minor allele carriers, they were combined with heterozygous minor allele carriers and compared to homozygous major allele carriers in the analysis performed. A p-value < 0.05 was considered significant. A p-value between 0.05 and 0.1 was considered a trend. Statistical analysis was performed using SPSS software version 26 (IBM Corp., New York, NY, USA).
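The carrier coding and the unadjusted odds ratio described above can be illustrated with a short sketch. It is not the SPSS procedure used in the study: it relies on the fact that an unadjusted logistic regression with a single binary predictor yields the same odds ratio and Wald confidence interval as the corresponding 2×2 table. The function names, the genotype string format and any counts passed in are hypothetical.

```python
import math


def carries_minor_allele(genotype, minor_allele):
    """Dominant coding: 1 for heterozygous or homozygous minor allele
    carriers, 0 for homozygous major allele carriers.
    The genotype is assumed to be a two-letter string such as "AG"."""
    return int(minor_allele in genotype)


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table.

    a: carriers with side effects      b: carriers without side effects
    c: non-carriers with side effects  d: non-carriers without side effects
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

The adjusted models (age, concomitant medication, sex and weight) cannot be reduced to a 2×2 table and would need a full logistic regression fit.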

Discovery Cohort
We included 120 metformin users in our discovery cohort, which consisted of 113 (94.1%) individuals with T2DM, six (5%) women with PCOS and T2DM and one (0.85%) woman with PCOS. Of these, 47 (39%) persons received metformin monotherapy and 73 (61%) metformin in combination with other antihyperglycaemic agents. A total of 17 participants were taking their prescribed dose of 500 mg of metformin, 17 were taking 850 mg and 86 were taking 1000 mg of metformin at the study visit. Due to the lack of information on the side effects of metformin, one participant had to be excluded. None of the investigated parameters differed between individuals without metformin side effects and individuals with metformin side effects. Further participant characteristics are given in Table 1.

Gender Specific Side Effects
We investigated the influence of gender on side effects in the discovery cohort. Significantly more female than male study participants reported nausea as a side effect (p = 0.030). All other side effects were reported equally in both genders (Table 2).

SNPs Associated with Metformin Side Effects
Of the 115 genotyped SNPs, 14 showed associations with either all side effects, gastrointestinal side effects or other side effects in a preliminary chi-square analysis and were therefore selected for replication (Figure 1). Based on their effect size, we included 13 of these associated SNPs in a binary logistic regression model. Of these, ten SNPs showed associations with side effects in the unadjusted and/or the adjusted binary logistic regression model. Rs3798167 and rs2197296 were associated with an increased risk for all investigated side effect groups in the unadjusted as well as in the adjusted model. The presence of minor alleles of seven SNPs increased the odds ratio to develop side effects (side effects; four for other side effects) and two SNPs decreased the presence of side effects (one for gastrointestinal and other side effects, respectively, Table 3).

Reporting on Metformin Side Effects
Replication cohort 1: out of 142 contacted people with T2DM who were treated with metformin in the GIRO study, 30 participants reported side effects from metformin use, while 91 participants reported experiencing no side effects. Twenty-one participants could not be reached by phone.
Replication cohort 2: in the PCOS cohort, 178 women used metformin to improve their symptoms; 126 women experienced no consciously perceived side effects, whereas 52 women reported side effects after metformin intake. A total of 24 women reported diarrhoea; 23 had nausea, no appetite, vomiting or loss of weight. Six reported flatulence, abdominal pain or meteorism. Four women showed symptoms of hypoglycaemia, two had headache or prickling fatigue and one woman reported gain of weight and hair loss. The occurrence of side effects did not significantly differ between individuals with T2DM and women with PCOS (p = 0.430). Anthropometric data of both replication cohorts are listed in Table 4.
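The comparison of side effect frequencies between the two replication cohorts can be sketched as a Pearson χ² test on the reported counts (30 of 121 reachable participants with T2DM, 52 of 178 women with PCOS). This is a minimal illustration, assuming a test without continuity correction; the text does not state which variant produced p = 0.430 (a Fisher exact test or a corrected χ² would give a somewhat different value), so the sketch is not expected to reproduce that figure exactly. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) and
    p-value for a 2x2 table with one degree of freedom.

    a, b: group 1 with / without side effects
    c, d: group 2 with / without side effects
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts as reported for replication cohorts 1 (T2DM) and 2 (PCOS)
stat, p_value = chi_square_2x2(30, 91, 52, 126)
```

Either way, the conclusion matches the text: the proportion of participants reporting side effects does not differ significantly between the two cohorts.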

SNPs Associated with Metformin-Induced Side Effects
Four SNPs from the discovery cohort and one SNP previously associated with metformin efficacy [28] were replicated in the combined and/or in subgroup analysis. To predict metformin side effects, we used linear regression models including the minor alleles of all 22 selected SNPs: one model combining both replication cohorts and one model for each cohort individually. SNPs with significant odds ratios (OR) are shown in Table 5. In the combined model, the minor alleles of SNPs rs3798167 and rs2197296 significantly increased the OR for the occurrence of metformin side effects, while the minor alleles of rs3777392, rs628031 and rs8192675 decreased the OR for metformin side effects (Table 5). In the model with T2DM, the minor allele of rs3798167 increased the OR for the occurrence of side effects and the minor alleles of rs3777392 and rs628031 decreased it (Table 5). In the model calculated only with women of the PCOS cohort, the minor allele of rs2197296 increased the risk for the occurrence of side effects and SNPs rs3777392 and rs8192675 decreased it (Table 5). With the exception of rs2197296, genotype frequencies of SNPs associated with metformin side effects differed significantly between T2DM individuals and PCOS women (Table 6).

Polygenetic Risk Score
To estimate genetic predisposition for metformin-induced side effects, we evaluated a quantitative unweighted PRS. The PRS was generated by summing the number of risk alleles for each person. The higher the number of risk alleles, the higher the (theoretical) genetic predisposition. Risk alleles comprised the minor alleles of rs3798167 and rs2197296 and the major alleles of rs3777392, rs628031 and rs8192675. Neither in individuals with T2DM nor in women with PCOS were persons with 0, 1 or 2 risk alleles identified. PRS groups were therefore designed as follows: <5 risk alleles (n = 25), 5-8 risk alleles (n = 177) and >8 risk alleles (n = 97). As shown in Figure 2, the OR increases with the number of risk alleles. In all participants, the presence of 5-8 risk alleles was not associated with a significantly increased risk of side effects (OR: 1.63 (95% CI: 0.53-5.03), p = 0.392), whereas the presence of more than 8 risk alleles showed a trend towards a more than threefold increased risk of developing side effects (OR: 3.09 (95% CI: 0.99-9.75)), albeit not reaching statistical significance (p = 0.053) (Figure 2).
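The allele count model can be sketched as follows. The risk allele definitions are those listed above; the input format (a dict of minor allele counts per SNP), the names `RISK_IS_MINOR`, `unweighted_prs` and `prs_group`, and the reading of the lowest group as fewer than five risk alleles are assumptions made for illustration.

```python
# True: the minor allele is the risk allele; False: the major allele is.
RISK_IS_MINOR = {
    "rs3798167": True,
    "rs2197296": True,
    "rs3777392": False,
    "rs628031": False,
    "rs8192675": False,
}


def unweighted_prs(minor_counts):
    """Sum of risk alleles over the five SNPs (range 0-10).

    minor_counts maps each SNP to the number of minor alleles (0, 1 or 2)."""
    score = 0
    for snp, risk_is_minor in RISK_IS_MINOR.items():
        n_minor = minor_counts[snp]
        # Each person carries two alleles per SNP, so the number of major
        # alleles is 2 minus the number of minor alleles.
        score += n_minor if risk_is_minor else 2 - n_minor
    return score


def prs_group(score):
    """Assign the score to one of the three PRS groups."""
    if score < 5:
        return "<5"
    if score <= 8:
        return "5-8"
    return ">8"
```

Because every allele contributes equally, a person who is homozygous for the major allele at all five SNPs already scores 6, which helps explain why very low scores are rare.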

Functional Annotation
Expression Quantitative Trait Locus Analyses and analysis of chromatin structure changes. Annotated SNPs act as eQTLs or sQTLs in different tissues such as whole blood, skin, liver, colon and intestine via different mechanisms. SNPs rs3777392, rs628031 and rs8192675 were replicated in at least one of our cohorts (Table 5). They act as genetic enhancers by altering chromatin structure in the liver and/or the mucosa of the duodenum (Table 7). Rs3798167 and rs2197296, replicated SNPs of SLC22A1 and SLC22A2, act as eQTLs and modify expression of OCT3, OCT1 and GLUT2. Rs628031, rs3798167 and rs8192675 act as eQTLs that alter regulatory regions of genes and as sQTLs that affect mRNA splicing and structure.

Discussion
Based on data from the present discovery cohort of 120 metformin users and the previously described association with metformin efficacy [23][24][25][26][27][28], we investigated 22 mostly intronic polymorphisms in two independent cohorts, one in participants with T2DM and another in women with PCOS (Figure 1). We replicated six intronic SNPs associated with side effects in the combined replication cohort of participants with T2DM and PCOS in a linear regression model (Table 5). In the subgroup analyses of each replication cohort, the association did not appear to be consistent. Heterogeneity of the subgroups in terms of sex, genetic or environmental factors could modulate the effect of the SNPs on metformin-induced side effects. It is known that female patients have a 1.5- to 1.7-fold higher risk of developing adverse side effects compared to male patients [36]. Gender-related differences include perceptual, pharmacokinetic, immunological and hormonal factors, as well as differences in the use of medications by women compared with men, and the reporting of side effects itself. How these differences result in an increased risk of side or adverse effects is not entirely clear [36]. In our discovery cohort, significantly more female study participants reported nausea compared to males. All other side effects were reported equally in both sexes (Table 2). Gender differences in taking metformin or in reporting side effects could not be excluded. Metformin might exert part of its insulin-sensitizing effects through gut microbiota, although currently available data are not consistent [37]. Numerous studies have confirmed that the composition of the gut microbiota is altered in patients with T2DM [38], as well as in women with PCOS [39]. We cannot exclude that disease-related microbiota dysbiosis may contribute to the observed differences in side effects between our two replication cohorts. Rs8192675, a SNP in the SLC2A2 gene, was selected for replication because of its known influence on metformin efficacy. Rathmann et al. reported that the C-allele is associated with an improved glucose response to metformin monotherapy during the first year after diagnosis in T2DM [28]; in our data, it was associated with a reduced OR for the occurrence of metformin side effects (Table 5). It is unclear whether the improved response of hemoglobin A1C (HbA1c) to metformin in the presence of the C variant can be explained by a lower hepatic glucose production or by improved glucose disposal to the liver or peripheral tissues [40].
To generate hypotheses about the mechanistic effects of the replicated noncoding variants on metformin side effects, we performed a functional annotation. Modes of action of intronic SNPs are very diverse. Introns enhance transcript levels by affecting the rate of transcription, nuclear export and transcript stability. Moreover, effects on the efficiency of mRNA translation [41] and on gene expression at the post-transcriptional level are known [42]. SNPs located in introns potentially influence gene expression at all levels mentioned above, as well as via epigenetic changes. We determined possible functional consequences of the selected SNPs by the use of various databases in three main areas: SNPs were tested (1) transcriptionally for eQTLs and sQTLs, (2) epigenetically for changes in chromatin structure and (3) post-transcriptionally for modifications such as changes in binding sites for miRNAs (Table 7), mainly in gastrointestinal tissue and in the liver.
Six SNPs identified in the discovery study (Table 3) were shown to act as eQTLs or sQTLs or to change chromatin states in gastrointestinal tissue, which might explain their association with gastrointestinal side effects (Figure 2). The investigated SNPs in genes for OCT1 and OCT2 were e/sQTLs for genes in the OCT cluster but were predominantly active in the liver, potentially fitting with the lack of associations with gastrointestinal side effects. Four SNPs, e.g., rs3798167, showed, according to our functional annotation, changes in the binding site of transcription factors that potentially modulate various genes important for the pharmacokinetics of metformin. Although we cannot derive a definite hypothesis from our functional annotation, it shows that these replicated SNPs are likely to be functionally involved in the development of metformin side effects.
To estimate the genetic risk for developing metformin side effects, we performed a simple unweighted PRS which assigns each genotype the same influence on the total risk score ("allele count model") [43]. In our cohort, no individual had fewer than three risk alleles, which corresponds to the selection criterion of a MAF > 0.01. As expected, the OR increases with the number of risk alleles. Participants with 5-8 risk alleles did not show a significantly higher risk for the occurrence of side effects (OR: 1.63; 95% CI: 0.53-5.03; p = 0.392), whereas participants with >8 risk alleles showed a threefold increased risk of developing side effects, just missing statistical significance (OR: 3.09; 95% CI: 0.99-9.75; p = 0.053). It should be mentioned that this score has limited predictive power, because (1) it is based solely on genetic data and does not take gene-environment interaction or comorbidities into account and (2) the proportion of genetic variability captured by the set of SNPs is relatively small. This PRS is therefore not suitable as a sole indicator of the occurrence of metformin side effects, but may serve as an additional tool for more accurate risk assessment and to promote future developments that increase the adherence of patients to their medication.
A limitation of this study is the small number of participants in the discovery study, which can lead to false positive results, and the use of two replication cohorts originally not designed to investigate metformin side effects. This does not allow us to specifically focus on gastrointestinal side effects in the analysis of the replication cohorts, although gastrointestinal intolerance represents the most common adverse effect of metformin treatment. Due to the low numbers of metformin-intolerant persons in the replication cohort and the high number of SNPs associated with side effects, no analysis of haplo- or diplotypes was performed. We have to mention that in cases of concomitant medications the possibility of a combined effect of these along with metformin cannot be ruled out. Another limitation is that the study participants were taking a relatively low dose of metformin. Further, the investigated SNPs may not be the causal variants but might be in linkage disequilibrium with potentially important loci. The strengths of our investigation lie in the in-depth clinical and biochemical characterisation of all participants as well as in the application of a state-of-the-art genotyping method that provides very accurate, reproducible and reliable results.

Figure 1. Flow chart of SNP selection. Intronic (non-coding) SNPs are highlighted in bold.

Figure 2. Association of an unweighted polygenetic risk score (PRS) derived from effects associated with SNPs. Bars indicate the 95% confidence interval. * p < 0.05.

Table 1. Description of study participants in the discovery cohort.

Table 2. Presence of metformin-induced side effects in all, male and female study participants.
Frequency data are presented as number (percentage). Significant p-values are highlighted in bold. N, numbers; %, percent; GI, gastrointestinal; n.a., not applicable.

Table 3. Odds ratios of investigated SNPs in an unadjusted and adjusted binary logistic regression model.
* In linkage disequilibrium with rs2048327. ** Associated with an increased risk for all investigated side effect groups in the unadjusted as well as in the adjusted model. OR: odds ratio; p: p-value; CI: confidence interval; GI: gastrointestinal; n.a.: data not available. Adjustment included age, presence of OCT-blocking medications, sex and weight. Significant p-values are highlighted in bold. Due to the low number of other side effects and thus decreased power, the adjusted model was not applied to the group of other side effects.

Table 4. Characterization of individuals in both replication cohorts.
Continuous data are presented as mean ± standard deviation. cm: centimetres; kg: kilogram.

Table 5. Odds ratios of metformin-associated SNPs.

Table 6. Genotype distribution of metformin-associated SNPs in participants with T2DM and in women with PCOS.

Table 7. Functional annotation of the replicated SNPs.