Molecular Biomarkers for Gestational Diabetes Mellitus

Gestational diabetes mellitus (GDM) is a growing public health problem worldwide. The condition is associated with perinatal complications and an increased risk for future metabolic disease in both mothers and their offspring. In recent years, molecular biomarkers have received considerable interest as screening tools for GDM. The purpose of this review is to provide an overview of the current status of single-nucleotide polymorphisms (SNPs), DNA methylation, and microRNAs as biomarkers for GDM. PubMed, Scopus, and Web of Science were searched for articles published between January 1990 and August 2018. The search terms included “gestational diabetes mellitus”, “blood”, “single-nucleotide polymorphism (SNP)”, “DNA methylation”, and “microRNAs”, including corresponding synonyms and associated terms for each word. This review updates current knowledge of the candidacy of these molecular biomarkers for GDM with recommendations for future research avenues.


Introduction
Gestational diabetes mellitus (GDM) is defined as glucose intolerance with onset or first recognition during pregnancy [1]. The prevalence of GDM is increasing worldwide, with approximately 14% of pregnancies affected by GDM [2]. The condition is associated with perinatal complications and an increased risk for future metabolic disease in both mothers and their offspring. The oral glucose tolerance test (OGTT) is considered the gold standard for the diagnosis of GDM [3]. However, the test is cumbersome to conduct, requires fasting and multiple blood draws, and its association with nausea and vomiting leads to decreased patient compliance. Furthermore, the OGTT is conducted between 24 and 28 weeks of gestation [3], presenting a small window of opportunity to implement interventions to improve pregnancy outcomes. Earlier detection of GDM may lead to improved management, possibly preventing pregnancy complications. Thus, the identification of sensitive and specific biomarkers, which may offer potential for risk prediction and intervention strategies, has become a major focus in GDM research. Several studies provided evidence for a genetic predisposition to GDM [4], while gene-environment interactions could explain the population-specific variation in GDM. An ideal biomarker should be cost effective and reproducible, easily accessible through non-invasive methods, stably expressed in biological fluids, sensitive to relevant changes in disease state, provide early detection of disease before clinical symptoms arise, and have the ability to differentiate between disease pathologies [26,27]. Commercial kits for SNPs [28], DNA methylation [29], and miRNAs [30] are already clinically available for a number of other disorders.

Single-Nucleotide Polymorphisms
Single-nucleotide polymorphisms (SNPs) refer to alterations in the DNA sequence at individual nucleotide bases. They are the most common genetic variation, with over 10 million SNPs present in the human genome [31]. In most cases SNPs are silent, not altering the function or expression of genes [32], while others are biologically functional, and can lead to altered protein function and disease. The search for SNPs that influence disease susceptibility and outcome is a field of active research. Several studies provided evidence that SNPs are associated with metabolic conditions including obesity, type 2 diabetes (T2D), and cardiovascular disease [33]. Variants in more than 50 and 80 loci were found to be associated with obesity [34] and T2D [35], respectively, and occur in genes that regulate glucose homeostasis and insulin signaling.

Single-Nucleotide Polymorphisms and Gestational Diabetes Mellitus
Genetic variants are increasingly being implicated in the pathogenesis of GDM [4]. Evidence suggests that genetic alterations in genes responsible for metabolic changes during pregnancy predispose one to GDM. In this review, a total of 76 studies were identified that investigated SNPs during GDM, using the search terms previously stated. However, to increase the likelihood of reporting a true association, only SNPs investigated in two or more populations were reported. Thirty-four SNPs investigated in 49 studies met the inclusion criteria and are summarized in Table 1.
Genetic studies of the transcription factor 7 like 2 (TCF7L2) gene, which is arguably one of the most important T2D susceptibility genes [36], produced varying results in GDM [37][38][39][40][41][42][43][44][45][46]. TCF7L2 encodes a transcription factor involved in Wnt signaling, an important pathway that regulates glucose homeostasis. Twenty studies conducted in diverse populations screened four SNPs (rs7903146, rs4506565, rs7901695, and rs12255372) in the TCF7L2 gene. Four of the eight studies that investigated rs7903146 showed an association between the T allele and GDM [37,38,43,44]. The other studies failed to observe an association between rs7903146 and GDM, possibly due to small sample size and a lack of statistical power [40][41][42]. Both studies investigating rs4506565 reported an association between the T allele and GDM [37,41]. One of the five studies investigating rs7901695 found an association between GDM and the T allele in American Caucasians [46], while one study found that the C allele, rather than the T allele, was associated with GDM in a large Swedish population [43]. The three studies that did not show an association had relatively small sample sizes [40,41,45]. Of the five studies investigating rs12255372, two showed an association between the T allele and GDM; one was conducted in a large Swedish population and the other in a small Mexican population [42,43]. However, these results were not replicated in studies conducted in Russian, Spanish, or Brazilian populations [39,41,47] of moderate size, suggesting that ethnic or other confounding factors underlie these differences. The T allele is associated with decreased insulin production and altered hepatic gluconeogenesis [48] and is therefore a good candidate for further research in larger cohorts, despite the conflicting results obtained in these studies.
The melatonin receptor 1B gene (MTNR1B) encodes one of the receptors for melatonin, a hormone that is involved in regulating circadian rhythms, insulin signaling, and glucose metabolism, amongst others [56]. Two SNPs, rs10830963 and rs1387153, within the MTNR1B gene were investigated. Eight of the nine studies that screened rs10830963 showed that the G allele was associated with an increased risk for GDM in several Caucasian populations [37,39,44,46,57,58], as well as in Chinese and South Korean populations [59,60]. However, Wang et al. found that this SNP was not associated with GDM in a different Chinese population [61]. The three studies that investigated rs1387153 reported an association between the T allele and GDM [37,39,60]. Variants in MTNR1B, particularly the G allele of rs10830963, were previously shown to be associated with increased fasting glucose concentrations and reduced beta-cell function in Caucasians [62].
Glucokinase (GCK) and the glucokinase regulator (GCKR) play critical roles in glucose processing in the liver [63]. Two variants, rs1799884 and rs4607517, within the GCK gene were studied for GDM. For rs1799884, the minor allele, reported as either T [39] or A [64], was associated with an increased risk of GDM. Tarnowski et al. also showed a trend toward a significant association between the T allele and risk of GDM in a Polish population [65]. However, a large study in a Finnish population showed no association between rs1799884 and GDM [44]. No association between rs4607517 and GDM was observed [44,61]. Within the GCKR gene, the C allele of rs780094 was associated with an increased risk of GDM in Malaysian, American Caucasian, and Brazilian populations [46,66,67], but not in studies conducted in Polish or Finnish populations [44,65]. The C allele was increased in women with GDM from the Polish population, but this did not reach significance due to a lack of statistical power.
The association between genetic variants within the fat mass and obesity-associated (FTO) gene and metabolic syndrome is widely reported [68]. FTO encodes an alpha-ketoglutarate-dependent dioxygenase, which plays a role in adipocyte development and function [69]. Three SNPs within the FTO gene were studied for GDM. Of the six studies investigating rs9939609, one study in a Finnish population found an association between the A allele and an increased risk for GDM [44], another study in a small Spanish population found an association between the T allele and GDM [41], while four studies reported no association [38,39,47,70]. Discrepancies between the studies are possibly due to ethnic and genotyping method differences. None of the studies investigating rs8050136 and rs1421085 found an association between these SNPs and GDM [45,47,70].
Insulin receptor substrate 1 (IRS1) is a protein that plays a key role in transmitting signals from the insulin and insulin-like growth factor-1 receptors to intracellular pathways that are associated with insulin response and risk of T2D [71]. Two genetic variants, rs1801278 and rs7578326, within IRS1 were investigated during GDM. For rs1801278, the T allele was associated with an increased risk of GDM [72] in a Saudi Arabian population, but not in a Russian population [39], while, for rs7578326, the G allele was associated with a decreased risk of GDM in an Austro-Hungarian population [58], but not in a Finnish population [44]. As previously stated, these conflicting results may be due to population and genotyping method differences.
Potassium voltage-gated channel subfamily Q member 1 (KCNQ1) plays a role in insulin secretion, and variants of KCNQ1 are associated with decreased insulin secretion and increased susceptibility to T2D [73]. Two variants, rs2237895 and rs2237892, were investigated in different populations in four studies. In both variants, the C allele was associated with an increased risk of GDM [74][75][76]. The solute carrier family 30 member 8 (SLC30A8) gene encodes a zinc transporter protein that plays a role in insulin secretion, and variants of the gene are associated with T2D risk [77]. Rs13266634 was investigated in three studies with varying results. One study showed that the T allele was associated with a decreased risk of GDM in an Austro-Hungarian population, while the C allele was found to be associated with an increased risk of GDM in a large Swedish population [58,78]. A large Finnish population showed no association between rs13266634 and GDM [44].
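Allele-level associations such as those summarized above are commonly expressed as an odds ratio with a confidence interval. The following is a minimal sketch of that calculation from a 2x2 table of allele counts, using the log-based (Woolf) interval; the function name and the example counts are hypothetical and are not taken from any of the cited studies.

```python
import math


def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Return (odds ratio, lower, upper) for a 2x2 table of allele counts.

    The interval is the log-based (Woolf) approximation; z = 1.96 gives
    an approximate 95% confidence interval.
    """
    counts = (case_risk, case_other, ctrl_risk, ctrl_other)
    if min(counts) <= 0:
        raise ValueError("all four allele counts must be positive")
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (two alleles per woman), for illustration only.
or_, ci_low, ci_high = allelic_odds_ratio(
    case_risk=120, case_other=280, ctrl_risk=90, ctrl_other=310
)
print(f"OR = {or_:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An interval that spans 1 would be read as no detectable association at that sample size, which is one way the small studies discussed above can fail to replicate a finding from a larger cohort.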

Limitations of Single-Nucleotide Polymorphisms
There are inherent limitations in genetic association studies, particularly in studies of polygenic and multifactorial diseases such as GDM. As stated above, these limitations include inadequate sample size to detect statistically significant associations, and differences in allele frequencies and disease etiology between ethnicities, which may explain why many genetic associations are not reproducible across populations. Furthermore, GDM diagnosis is not standardized internationally; thus, different diagnostic criteria could have contributed to the discordant results observed between studies. Importantly, genetic variants do not solely contribute to the development of complex diseases, and it is widely believed that diseases arise due to the interaction of genetic predisposition and environmental factors [79]. Thus, to accurately assess risk of GDM, biological and environmental factors, such as maternal age and diet [39], should be considered together with genetic variants.
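The effect of sample size on the ability to detect an association can be illustrated with a rough power calculation. The sketch below uses a normal approximation for a two-sided comparison of allele frequencies between cases and controls; it ignores the negligible opposite tail and treats alleles as independent observations. The function name and the allele frequencies are assumptions chosen for illustration.

```python
from statistics import NormalDist


def power_two_proportions(p_cases, p_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two allele frequencies.

    n_cases and n_controls are numbers of alleles (twice the number of women).
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Unpooled standard error of the difference in allele frequencies.
    se = (
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    ) ** 0.5
    return norm.cdf(abs(p_cases - p_controls) / se - z_crit)


# Hypothetical risk-allele frequencies: 0.30 in cases, 0.25 in controls.
for n_women in (50, 200, 1000):
    n_alleles = 2 * n_women
    power = power_two_proportions(0.30, 0.25, n_alleles, n_alleles)
    print(f"{n_women} women per group: power ~ {power:.2f}")
```

Under these assumed frequencies, power rises steeply with the number of women per group, consistent with the observation that smaller studies often fail to reproduce associations reported in large cohorts.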
Despite the variable results obtained across studies, many of the variants found to be associated with GDM are also associated with T2D, supporting their biological plausibility. Therefore, while the etiology of GDM may differ from T2D, the genetic pathways through which the symptoms manifest are likely to overlap. In this review, only studies that profiled SNPs in DNA extracted from whole blood were included. However, the use of less invasive sources of genetic material, such as buccal swabs, is acknowledged [80]. Furthermore, this review only included SNPs reported in two or more studies, and may have overlooked other important SNPs possibly associated with GDM.

DNA Methylation
DNA methylation, the most widely studied and best characterized epigenetic mechanism, occurs via the addition of a methyl group to the fifth carbon position of a cytosine residue within cytosine-phosphate-guanine (CpG) dinucleotides [97]. The process is catalyzed by the enzyme DNA methyltransferase (DNMT), with S-adenosyl-methionine serving as the methyl donor. Methylation of CpG islands, which are regions with high levels of CpG dinucleotides primarily in the promoter regions of genes, is generally associated with transcriptional repression due to altered protein binding to target sites on DNA [98,99]. DNA methylation is a reversible process [100]. Ten-eleven translocation (TET) methylcytosine dioxygenases are able to cause the oxidation and demethylation of methylated cytosine to 5-hydroxymethylcytosine [100], which is associated with gene activation. Recently, DNA methylation of CpG-poor islands was identified downstream of active promoters, either within (intragenic) or between (intergenic) genes, although the role of methylation in these regions is not fully elucidated [101]. Approximately 55-90% of all CpG dinucleotides within CpG islands are methylated, constituting about 3% of the genome. Global DNA hypomethylation is associated with genomic and chromosomal instability, while DNA methylation within the promoters of genes is generally associated with gene silencing. Both aberrant global and gene-specific DNA methylation were shown to be associated with metabolic conditions such as obesity [102], T2D [103], and cardiovascular disease [104]. Thus, characterization of altered DNA methylation during disease processes could give insight into the pathophysiology of disease, and reveal novel diagnostic, prognostic, and therapeutic targets.

DNA Methylation and Gestational Diabetes Mellitus
DNA methylation during pregnancy plays a key role in modulating the transcriptional potential of the genome, and is known to affect gene expression pathways associated with a range of pathophysiological processes such as GDM [49,105]. Several studies demonstrated that DNA methylation is altered in the placenta and cord blood of women with GDM compared to women with normoglycemic pregnancies [23,106,107,108,109]. Intrauterine exposure to GDM leads to long-lasting effects in the offspring and increases risk of disease in later life, possibly mediated by DNA methylation [110,111]. Importantly, it was demonstrated that physiological and DNA methylation changes that occur during pregnancy are reflected in whole blood [112], thus increasing interest in screening maternal blood for biomarkers of GDM. DNA methylation profiling in pregnancies complicated by GDM is a relatively new research field, with limited studies conducted in maternal whole blood. Studies that investigated DNA methylation in whole blood of women with GDM are summarized in Table 2.
Global DNA methylation provides an estimate of overall genomic methylation and is relatively easy and cost-effective to measure [113]. Currently, the only study that investigated global DNA methylation during GDM was conducted in our laboratory [114]. The study showed that global DNA methylation was not associated with GDM in a South African population, suggesting that the method may be too crude to detect subtle glucose intolerance, and that gene-specific methylation is warranted in this population. Genome-wide DNA methylation profiling in maternal blood during GDM was conducted using methylation bead chip arrays [115][116][117]. Methylation bead chip arrays can interrogate between 27,000 and 850,000 CpG sites across the genome at a single-nucleotide resolution. In one of the earliest studies using bead chip arrays, Enquobahrie et al. reported that DNA methylation changes occurred early during pregnancy in six women with repeat pregnancies, one of which was complicated by GDM [116]. They reported that 17 CpG sites were hypomethylated and 10 CpG sites were hypermethylated between GDM and normal pregnancies within the same women. Novel genes related to these CpG sites were found to be associated with cell cycle, cell morphology, cell assembly, cell organization, and cell compromise. Subsequently, using a newer bead chip array containing more CpG sites, Kang et al. showed that 200 CpGs corresponding to 151 genes were differentially methylated in women with GDM (n = 8) compared to controls (n = 8). Amongst the differentially methylated genes were interleukin-6 (IL-6) and interleukin-10 (IL-10), which are key pro-inflammatory and anti-inflammatory cytokines, respectively [115]. These cytokines function in a wide variety of inflammatory-associated diseases, including obesity and T2D. Moreover, a different study by Kang et al. showed that decreased methylation of IL-10 during GDM was associated with increased serum IL-10 concentrations at the end of pregnancy [118]. 
IL-10 serum concentrations were shown to vary during pregnancy, suggesting that this cytokine plays an important role in the development of GDM. In another study using bead chip arrays, 100 differentially methylated CpG sites corresponding to 66 genes were identified in women with GDM (n = 11) compared to controls (n = 11) [117]. Using more stringent statistical criteria to prioritize methylation sites, a total of five CpG sites within the constitutive photomorphogenic homolog subunit 8 (COPS8), phosphoinositide 3-kinase regulatory subunit 5 (PIK3R5), 3-hydroxyanthranilate 3,4-dioxygenase (HAAO), coiled-coil domain containing 124 (CCDC124), and chromosome 5 open reading frame 34 (C5orf34) genes were identified and validated using pyrosequencing. Since blood for DNA methylation profiling was collected prior to GDM diagnosis, these CpG sites may prove useful as predictive biomarkers for GDM. However, their candidacy as biomarkers requires validation in larger studies.
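Bead chip arrays typically summarize methylation at each CpG site as a beta value (the fraction of signal from the methylated probe) or as an M-value (a log2 ratio of methylated to unmethylated signal). The sketch below shows both under commonly used conventions; the offsets of 100 and 1, the function names, and the example numbers are assumptions for illustration and do not reproduce the analysis pipeline of any study cited here.

```python
import math


def beta_value(methylated, unmethylated, offset=100.0):
    """Fraction of signal from the methylated probe (0 = unmethylated, 1 = methylated).

    The offset stabilizes the ratio when both intensities are low.
    """
    return methylated / (methylated + unmethylated + offset)


def m_value(methylated, unmethylated, offset=1.0):
    """log2 ratio of methylated to unmethylated intensity."""
    return math.log2((methylated + offset) / (unmethylated + offset))


def mean_delta_beta(case_betas, control_betas):
    """Difference in mean beta value between cases and controls at one CpG site."""
    return sum(case_betas) / len(case_betas) - sum(control_betas) / len(control_betas)


# Hypothetical probe intensities and per-sample beta values for a single CpG site.
print(round(beta_value(methylated=4000, unmethylated=1000), 3))
print(round(mean_delta_beta([0.62, 0.58, 0.65], [0.71, 0.69, 0.74]), 3))
```

A negative difference in mean beta value would correspond to hypomethylation in cases relative to controls at that site, and a positive difference to hypermethylation.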

Limitations of DNA Methylation
Although studies show that DNA methylation has potential as a diagnostic and prognostic biomarker, they are not without limitations [119]. Several factors, including small sample size, lack of validation, differences in ethnicity, method of quantification, and timing of methylation analysis during pregnancy, hinder reproducibility of findings across studies. Another limitation of the studies included in this review is the use of whole blood, which consists of a mixture of cell types such as lymphocytes, erythrocytes, and platelets, and may confound methylation analysis [120]. Thus, future studies should consider purification of blood-cell populations to separate specific cell types. Currently, there is no consensus on the best method to use for DNA methylation analysis. While global DNA methylation can easily be measured using crude DNA preparations, it is a measure of overall genomic methylation and does not offer the resolution required to detect subtle DNA methylation differences within genes [121]. In contrast, locus-specific DNA methylation methods such as bead chip arrays and pyrosequencing are expensive, requiring sophisticated equipment and bioinformatics expertise.

MicroRNAs
MiRNAs are short, highly conserved non-coding RNA molecules, approximately 22 nucleotides in length, which are powerful mediators of biological function. They regulate gene expression through post-transcriptional mechanisms by binding to the 3′ untranslated region (UTR) of messenger RNA (mRNA), inducing gene silencing through translational repression or mRNA degradation [122]. This interaction depends on complementarity between the miRNA "seed region", a stretch of seven or eight nucleotides at the 5′ end of the miRNA, and a matching site within the 3′ UTR of the mRNA. MiRNA binding also requires a number of nucleotides flanking the seed region to match, which directs the specificity of miRNA-mRNA interactions [123,124]. Since their initial discovery in Caenorhabditis elegans in 1993 [125], over 2000 miRNAs have been identified in humans, and they are believed to regulate about one-third of the genome [126].
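Seed-based target recognition can be sketched as a simple sequence search. The example below assumes the conventional definition in which the seed spans nucleotides 2 to 8 counted from the 5′ end of the miRNA, and looks for perfectly complementary sites in a 3′ UTR. Both sequences are toy examples, the function names are hypothetical, and real target-prediction tools weigh many additional features.

```python
# Watson-Crick complement for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site (5'->3') complementary to miRNA positions start..end (1-based)."""
    seed = mirna[start - 1:end]
    # Base pairing is antiparallel, so complement and then reverse.
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna, utr):
    """Return 0-based start positions of perfect seed-complementary sites in a 3' UTR."""
    site = seed_site(mirna)
    positions = []
    index = utr.find(site)
    while index != -1:
        positions.append(index)
        index = utr.find(site, index + 1)
    return positions


# Toy sequences for illustration only.
toy_mirna = "UAGCACCAUCUGAAAUCGGUUA"
toy_utr = "AAAUGGUGCUAAACCCUGGUGCUGG"
print(seed_site(toy_mirna), find_seed_matches(toy_mirna, toy_utr))
```

Because a seven-nucleotide site occurs by chance in many transcripts, a search of this kind returns many candidate targets, which is consistent with a single miRNA regulating a large number of genes.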
MiRNAs are master regulators that control many biological processes including cell proliferation, differentiation, apoptosis, and development [127]. Moreover, they regulate genes involved in metabolic processes such as glucose homeostasis, insulin signaling, pancreatic beta-cell function, lipid metabolism, and inflammation [128]. Their dysregulation was reported during many metabolic conditions, including obesity, T2D, and cardiovascular disease [129][130][131]. Although they exert their function intracellularly, several studies identified extracellular circulating miRNAs, which sparked interest in their use as biomarkers of disease [132]. Circulating miRNAs are associated with various complexes such as lipoproteins, exosomes, apoptotic bodies, microvesicles, and ribonucleoproteins such as Argonaute (Ago)1-4 or nucleophosmin 1 (NPM1), which serve to protect these miRNAs from nuclease degradation, and act as carriers to transport them to their target mRNAs. This suggests that miRNAs function in cell-to-cell communication, regulating gene expression in neighboring cells by either acting locally (paracrine or autocrine signaling) or at a distance (endocrine/exocrine) [132,133].

MicroRNAs and Gestational Diabetes Mellitus
MiRNAs are important metabolic and developmental regulators during pregnancy, and were shown to play a role in the development of GDM. In 2013, genome-wide analysis demonstrated that more than 600 miRNAs are expressed in the placenta [134]. Recently, Poirier et al. reviewed placental miRNAs that are dysregulated during pregnancy and GDM [135]. The placenta plays an important role in maternal metabolic adaptation to pregnancy, and differential expression of placental miRNAs is believed to partly underlie these physiological changes. Placental miRNAs are released into maternal circulation [112]; thus, these miRNAs hold potential as biomarkers of placental dysfunction and GDM. Studies reporting circulating miRNA expression during GDM are summarized in Table 3.
In 2011, Zhao et al. were the first to profile the expression of serum miRNAs during GDM [136]. Using Taqman low-density arrays, followed by confirmation with individual qRT-PCR, they identified three miRNAs, miR-132, miR-29a, and miR-222, that were significantly downregulated in Chinese women with GDM (n = 24) compared to controls (n = 24) [136]. The differential expressions of miR-29a and miR-222 were validated in an internal and two external validation cohorts. These miRNAs are thought to play a role in glucose homeostasis, insulin sensitivity, and beta-cell function [136]. A number of studies in other populations replicated these experiments with conflicting results. Recently, Pheiffer et al. reported decreased expression of miR-132, miR-29a, and miR-222 in the serum of South African women with GDM (n = 28) compared to controls (n = 53); however, only the last of these (miR-222) was statistically significant [5]. These findings suggest that the expression of these serum miRNAs is shared across South African and Chinese populations. In contrast to Zhao et al., Tagoma et al. showed that miR-222 expression was increased in plasma of women with GDM (n = 13) compared to controls (n = 9) [137]. Wander et al. observed no differences in the expression of miR-222 or miR-29a in the plasma of American Caucasian women with GDM (n = 36) compared to controls (n = 80) [138]. These discrepancies may be due to differences in biological samples used (serum or plasma), gestational age, or other unknown factors not accounted for. Zhu et al. used high-throughput sequencing and qRT-PCR to investigate miRNAs in pooled plasma samples of Chinese women with (n = 10) or without (n = 10) GDM between 16 and 19 weeks of gestation. Five miRNAs (miR-16, miR-17, miR-19a, miR-19b, and miR-20a) were significantly upregulated in GDM compared to controls [139].
Bioinformatic analysis revealed that the targets of these miRNAs are associated with mitogen-activated protein kinase (MAPK), insulin, transforming growth factor beta (TGF-β), and mammalian target of rapamycin (mTOR) signaling pathways, providing insight into the role of these miRNAs in GDM. Cao et al. investigated miR-16, miR-17, and miR-20a in a larger cohort of Chinese women at 16-19 weeks, 20-24 weeks, and 24-28 weeks of gestation and found sustained increased expression in the plasma of women with GDM (n = 85) compared to controls (n = 72) at all the measured time points. However, they did not observe differences in the expressions of miR-19a and miR-19b [140], as previously reported by Zhu et al. More recently, Pheiffer et al. reported conflicting results. The expression of all five miRNAs was decreased in South African women with GDM; however, only the decreased expression of miR-20a was statistically significant [5].
Functional analyses of these miRNAs provided support for their role in the development of GDM [141][142][143][144]. Many other miRNAs were reported to exhibit altered expression during GDM, although these were identified in single studies only (Table 3).

Limitations of Circulating microRNA Profiling
The studies reviewed above highlight several miRNA candidates as biomarkers for GDM. However, the results are often discordant, possibly due to the different sample types and sizes, gestational age, and the methods of analysis used.
Differences in miRNA expression were reported in serum and plasma, suggesting that factors during the coagulation process could influence expression [145]. Currently, there is no consensus on the best quantification method to use when profiling circulating miRNAs. Different methods of quantification are known to vary in sensitivity and specificity [146], which may impact the accuracy and interpretation of the data. Moreover, data normalization presents a significant challenge for the analysis of circulating miRNA profiling. Although strategies using exogenous miRNAs such as C. elegans miR-39 were proven to be less variable than endogenous reference genes, no ideal normalization strategy exists [147]. Thus, standardized guidelines for miRNA profiling would aid in the biological interpretation of miRNA data.
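The role of normalization in qRT-PCR data can be illustrated with the widely used comparative Ct (2^-ΔΔCt) calculation, here with a spike-in such as C. elegans miR-39 as the reference. This is a minimal sketch that assumes roughly 100% amplification efficiency; the function names and Ct values are hypothetical and are not drawn from any of the studies reviewed.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target miRNA normalized to the reference (for example, a spike-in)."""
    return ct_target - ct_reference


def fold_change(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression in cases versus controls by the 2^-ddCt method.

    Values below 1 indicate lower expression in cases; values above 1, higher.
    """
    ddct = delta_ct(ct_target_case, ct_ref_case) - delta_ct(ct_target_ctrl, ct_ref_ctrl)
    return 2.0 ** (-ddct)


# Hypothetical mean Ct values: target miRNA and spike-in reference in each group.
print(fold_change(ct_target_case=28.0, ct_ref_case=20.0,
                  ct_target_ctrl=27.0, ct_ref_ctrl=20.0))
```

Because the result depends directly on the reference Ct, a reference that itself varies between groups or sample types would shift the apparent fold change, which is why the choice of normalization strategy matters for comparability across studies.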

Current Perspectives and Future Recommendations
Advances in molecular biology resulted in the identification of several molecular biomarkers for disease. Of these, genetic variants, DNA methylation, and miRNAs are widely studied during GDM [148][149][150]. These molecular markers are stably expressed in biological fluids and hold potential as diagnostic or prognostic biomarkers of GDM. As reviewed above, many studies provided evidence to support the use of these markers as biomarkers of GDM. However, despite these favorable results, molecular biomarkers face many challenges that hinder their candidacy as biomarkers and that must be addressed before they can be used clinically. As outlined above, SNPs, DNA methylation, and miRNAs are all impacted by ethnicity and environmental factors. Furthermore, technical challenges during analysis contribute to inaccurate data and lack of reproducibility. Thus, standardization of analytical methods is critical when profiling molecular biomarkers. Moreover, large prospective cohort studies, conducted in populations with different ethnicities and environmental factors, are warranted to identify robust markers that are not influenced by these factors. The ideal biomarker for GDM would most likely be a combination of several molecular biomarkers to overcome the lack of sensitivity and specificity of individual factors. For example, a single miRNA regulates up to 200 different genes [151]; thus, miRNAs found to be associated with GDM are non-specific and may possibly be involved in other conditions as well. To increase the predictive power of molecular biomarkers, future studies should consider using a combination of these markers in risk stratification models for predicting GDM risk.
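One simple way to combine several markers in a risk stratification model is a logistic score, in which each marker contributes a weighted term to the log-odds of GDM. The sketch below is purely illustrative: the marker names, coefficients, and intercept are hypothetical placeholders, not estimates from any study, and a usable model would have to be fitted and validated in a prospective cohort.

```python
import math


def combined_risk(markers, coefficients, intercept):
    """Predicted probability from a logistic model combining several markers."""
    missing = set(coefficients) - set(markers)
    if missing:
        raise KeyError(f"missing marker values: {sorted(missing)}")
    # Linear predictor on the log-odds scale.
    linear = intercept + sum(coefficients[name] * markers[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients (log-odds per unit of each marker), for illustration only.
example_coefficients = {"risk_alleles": 0.3, "cpg_beta": -1.5, "mirna_log2_fc": -0.4}
example_woman = {"risk_alleles": 2, "cpg_beta": 0.6, "mirna_log2_fc": -1.0}
print(round(combined_risk(example_woman, example_coefficients, intercept=-2.0), 3))
```

In such a model, no single marker needs to be highly sensitive or specific on its own; the combination is evaluated as a whole, for example by its discrimination in an independent validation cohort.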

Conclusions
GDM is a growing public health problem worldwide. The short- and long-term consequences of GDM are likely to have an immediate negative impact on health systems and, in addition, present a major reservoir of future disease. Screening and treatment of GDM lead to improved pregnancy outcomes [13]; thus, universal screening is widely advocated as a strategy to prevent adverse consequences. A growing body of evidence supports the use of SNPs, DNA methylation, and miRNAs as biomarkers that could aid in the early detection of GDM, thus facilitating intervention strategies to better manage GDM and improve health outcomes. Despite their potential, these molecular biomarkers face several challenges that need to be addressed before they can become clinically applicable. However, rapid technological advances could overcome these challenges and lead to the development of a quick, cost-effective point-of-care test that could accurately identify women at high risk for GDM during early pregnancy. The establishment of an international body to standardize analytical conditions for molecular biomarkers and large prospective cohort studies in different populations are required.
Funding: This research received no external funding.

Conflicts of Interest: The authors declare no conflicts of interest.