Automated Machine Learning (AutoML)-Derived Preconception Predictive Risk Model to Guide Early Intervention for Gestational Diabetes Mellitus

The increasing prevalence of gestational diabetes mellitus (GDM) is contributing to the rising global burden of type 2 diabetes (T2D) and intergenerational cycle of chronic metabolic disorders. Primary lifestyle interventions to manage GDM, including second trimester dietary and exercise guidance, have met with limited success due to late implementation, poor adherence and generic guidelines. In this study, we aimed to build a preconception-based GDM predictor to enable early intervention. We also assessed the associations of top predictors with GDM and adverse birth outcomes. Our evolutionary algorithm-based automated machine learning (AutoML) model was implemented with data from 222 Asian multi-ethnic women in a preconception cohort study, Singapore Preconception Study of Long-Term Maternal and Child Outcomes (S-PRESTO). A stacked ensemble model with a gradient boosting classifier and linear support vector machine classifier (stochastic gradient descent training) was derived using genetic programming, achieving an excellent AUC of 0.93 based on four features (glycated hemoglobin A1c (HbA1c), mean arterial blood pressure, fasting insulin, triglycerides/HDL ratio). The results of multivariate logistic regression model showed that each 1 mmol/mol increase in preconception HbA1c was positively associated with increased risks of GDM (p = 0.001, odds ratio (95% CI) 1.34 (1.13–1.60)) and preterm birth (p = 0.011, odds ratio 1.63 (1.12–2.38)). Optimal control of preconception HbA1c may aid in preventing GDM and reducing the incidence of preterm birth. Our trained predictor has been deployed as a web application that can be easily employed in GDM intervention programs, prior to conception.


Introduction
The prevalence of gestational diabetes mellitus (GDM) is increasing globally, affecting one in five pregnancies in some populations [1]. GDM is a condition in which a woman without previous diabetes develops glucose intolerance during pregnancy [2]. This condition increases the risk of developing GDM-related complications such as hypertensive disorders of pregnancy, fetal macrosomia, caesarean section, shoulder dystocia and birth injuries [3]. Poorly controlled GDM also increases risks of premature birth, perinatal mortality and neonatal morbidity. GDM has long-term implications as women with a history of GDM have a 10-fold higher risk of developing type 2 diabetes (T2D) as well as higher risk of developing cardiovascular adversities compared to those with a normoglycemic pregnancy [4,5]. Offspring of mothers with GDM are also at an increased risk of having cardiometabolic adversities, resulting in a transgenerational cycle of diabetes and cardiovascular diseases [6].
Healthcare systems across the world use either high risk selective screening or universal screening for GDM in pregnant women. The American Diabetes Association (ADA) endorses the use of either a one-step approach (IADPSG diagnostic criteria, fasting two-hour, three-point 75 g oral glucose tolerance test (OGTT)) or an older two-step approach (non-fasting one-hour 50 g glucose challenge test (GCT), followed by diagnostic fasting three-hour 100 g OGTT on a subset of women exceeding the glucose threshold value of GCT) at 24-28 weeks' gestation [7]. The UK NICE recommends high risk selective screening for women with known GDM risk factors, such as obesity (body mass index (BMI) ≥30 kg/m²), family history of diabetes, history of GDM, previous delivery of a macrosomic baby (≥4.5 kg) and being in an ethnic group with a high prevalence of diabetes (South Asian, Black Caribbean or Middle Eastern) [8]. In the latest UK NICE 2015 guidelines, women with a history of GDM are offered an OGTT at their booking appointment [8]. Women with other risk factors are offered an OGTT at 24-28 weeks' gestation. The International Diabetes Federation (IDF) GDM Model of Care [9] recommends that all pregnant women are screened at first visit by a fasting glucose, HbA1c or random glucose sample to rule out pre-existing diabetes. In those with normal early screening, an OGTT is performed at 24-28 weeks' and 32 weeks' gestation (for high risk women) to assess the risk of GDM.
Pre-existing abnormalities in maternal metabolism are important factors in the pathophysiology of metabolic diseases and fetal programming. GDM intervention typically focuses on counseling, dietary modification and increasing physical activity. The daily self-monitoring of blood glucose is aimed at normalizing blood glucose levels and reducing the complications of GDM. Primary lifestyle interventions to manage GDM, such as diet and exercise in the second trimester, provide limited benefits for the mother and child due to late implementation, poor adherence and generic guidelines [10]. Preconception presents an important opportunity to break the intergenerational cycle of chronic metabolic disorders. The Lancet series on preconception maternal health in 2018 highlighted preconception as a critical period for shaping pregnancy outcomes and subsequent maternal and child health [11][12][13].
In recent years, some machine learning models have been developed for population based GDM risk stratification. However, the current state-of-the-art models are only applicable during pregnancy, which can be too late for effective intervention. Artzi et al. trained a LightGBM gradient boosting classifier with Israel's Electronic Health Records (EHR) data for onset of GDM (area under the receiver operating characteristic curve (AUC) of 0.80 was achieved with nine questionnaire features) [14]. In another study, Wu et al. trained a logistic regression classifier with China's EHR data for early GDM prediction (AUC of 0.77 was achieved with seven clinical features) [15]. To date, there have been no studies applying machine learning for GDM risk assessment in a preconception population. We therefore would like to suggest a paradigm shift in GDM management strategy.
In this study, we developed a machine learning model for early prediction of GDM during preconception among women in Singapore. Taking a longitudinal approach, we also assessed the associations of the strongest predictors with GDM and adverse birth outcomes (preterm birth, low birthweight at term and large for gestational age infant). Our machine learning models were implemented using data from the prospective Singapore Preconception Study of Long-Term Maternal and Child Outcomes (S-PRESTO) cohort study.

Study Design
S-PRESTO (ClinicalTrials.gov NCT03531658) is a prospective, preconception cohort study of multi-ethnic groups (Chinese, Malay, Indian or any combination of these three ethnicities) [16]. Women planning for pregnancies were recruited from the KK Women's and Children's Hospital (KKH) and community between February 2015 and October 2017. There were 1032 unique participants for preconception; 475 conceived singleton pregnancies within a year of enrollment into the study, and 373 remained in the study and had a livebirth. The mother-child dyads have been followed for seven years, with longitudinal phenotypic data collected across multiple health domains.
Participants diagnosed with T2D based on preconception and 3 months postpartum OGTT or HbA1c readings were excluded from model training. GDM analysis was restricted to mothers whose gestation at the time of antenatal OGTT was 24+1 to 28+6 weeks (gestational age is given as weeks+days). Participants of mixed ethnicity or with unclassifiable GDM status due to missing glucose readings were removed from the final analysis set.
Our models were built using data from 222 women at preconception who had complete data on demographics, medical/obstetric history, physical measures, blood-derived markers, lifestyle factors and antenatal OGTT (Figure 1). The prevalence of GDM was 13.1% in our training dataset. Participant characteristics are presented in Table 1.
The physical measures at preconception were included for feature selection modeling. Weight was measured to the nearest 0.1 kg (SECA 803) and height to the nearest 0.1 cm (SECA 213). BMI was derived as weight divided by height squared (kg/m²). Waist circumference was measured to the nearest 0.1 cm (SECA 203). Additionally, mid-upper arm circumference was measured to the nearest 0.1 cm, midway between the acromion process and the olecranon process (SECA 212). Systolic and diastolic blood pressure were measured using the Microlife BP 3AS1-2 blood pressure device. Mean arterial blood pressure was further derived by doubling the diastolic blood pressure and adding it to the systolic blood pressure, with the composite sum divided by 3.
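As a minimal sketch of the two derived physical measures described above, the Python functions below compute BMI and mean arterial blood pressure. The function and parameter names are illustrative assumptions and are not taken from the study code.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2 from weight (kg) and height (cm)."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def mean_arterial_pressure(systolic: float, diastolic: float) -> float:
    """Mean arterial pressure (mmHg): (systolic + 2 * diastolic) / 3."""
    return (systolic + 2.0 * diastolic) / 3.0


# Hypothetical participant, for illustration only.
example_bmi = bmi(weight_kg=60.0, height_cm=160.0)
example_map = mean_arterial_pressure(systolic=120.0, diastolic=80.0)
```

The diastolic reading is weighted twice because this approximation treats diastole as occupying roughly two-thirds of the cardiac cycle.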
Sodium fluoride/potassium oxalate tubes were used to collect blood samples for plasma glucose measurement. Potassium EDTA tubes were used to collect whole blood samples for HbA1c measurement. All samples were kept at 4 °C, immediately sent to the hospital laboratory, centrifuged within 30 min and analyzed within 1 h from the time of earliest blood draw. Fasting plasma glucose, 30 min plasma glucose, 1 h plasma glucose (antenatal OGTT only), 90 min plasma glucose (antenatal OGTT only), 2 h plasma glucose and HbA1c concentrations were measured using the ARCHITECT c8000 Clinical Chemistry Analyzer (Abbott Laboratories), which is a National Glycohemoglobin Standardisation Program (NGSP) certified method for HbA1c testing. The preconception HbA1c marker was included for feature selection modeling.
Longitudinally obtained plasma samples were analyzed for fasting insulin, triglycerides (TGs), high density lipoprotein (HDL) cholesterol and gamma-glutamyl transferase at the National University Hospital (NUH) clinical laboratory (accredited by the College of American Pathologists [20]). Maternal venous blood was collected into silicone coated tubes, and serum was obtained by centrifugation at 1600 × g for 10 min at 4 °C. The serum was stored at −80 °C until sample batch analysis. Insulin was measured using a sandwich immunoassay (Beckman DxI 800 analyzer, manufactured by Beckman Coulter in Fullerton, CA, USA). Using a Beckman AU5800 analyzer, TG and gamma-glutamyl transferase were measured by colorimetric assays and HDL cholesterol by an enzymatic assay. These blood markers were subsequently used for the derivation of metabolic indices and machine learning modeling.
The homeostasis model assessment of insulin resistance (HOMA-IR) index was calculated based on the formula [21]:
$$\text{HOMA-IR} = \frac{\text{fasting insulin } (\mu\text{U/mL}) \times \text{fasting glucose (mmol/L)}}{22.5}$$
In addition, the TG/HDL cholesterol ratio was calculated based on the fasting lipid concentrations to assess insulin resistance [22].
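The two insulin resistance indices can be sketched in Python as follows. This assumes the widely used HOMA-IR formulation (fasting insulin in µU/mL multiplied by fasting glucose in mmol/L, divided by 22.5) and a TG/HDL ratio with both lipids expressed in the same units; the units, function names and example values are assumptions for illustration rather than details confirmed by the study.

```python
def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mmol_l: float) -> float:
    """HOMA-IR index, assuming insulin in uU/mL and glucose in mmol/L."""
    return fasting_insulin_uU_ml * fasting_glucose_mmol_l / 22.5


def tg_hdl_ratio(triglycerides: float, hdl_cholesterol: float) -> float:
    """TG/HDL ratio; both inputs must be in the same units (e.g. mmol/L)."""
    if hdl_cholesterol <= 0:
        raise ValueError("HDL cholesterol must be positive")
    return triglycerides / hdl_cholesterol


# Hypothetical fasting values, for illustration only.
example_homa = homa_ir(fasting_insulin_uU_ml=10.0, fasting_glucose_mmol_l=4.5)
example_ratio = tg_hdl_ratio(triglycerides=1.2, hdl_cholesterol=1.5)
```

Because the TG/HDL ratio depends on the unit system (mmol/L versus mg/dL), any threshold applied to it should be quoted together with its units.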
Age, ethnicity, family history of diabetes mellitus, history of GDM, parity, height, BMI, mid-upper arm circumference, mean arterial blood pressure, HbA1c, fasting insulin, self-reported smoking, self-reported alcohol consumption, TG/HDL ratio, fatty liver index and metabolic syndrome variables were included for feature selection modeling.

Machine Learning Methodology and Statistical Analyses
Our methodological novelty lies in combining coalitional game theory concepts with evolutionary algorithm-based automated machine learning (AutoML). Automating the process of machine learning enables the best possible model to be built for our supervised machine learning problem. The optimal machine learning pipelines were automatically generated using genetic programming (GP), a type of evolutionary algorithm [25,26]. An introduction to GP is provided in the Supplementary Materials. In brief, GP solves machine learning tasks based on random mutation, crossover, fitness functions and generations to arrive at optimal solutions (models and hyperparameters).
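To make the evolutionary vocabulary above concrete (population, fitness function, crossover, mutation, generations), the following is a deliberately simplified sketch. It is a plain genetic algorithm over bit strings, not the tree-based genetic programming used by TPOT, and the target pattern, fitness function and all parameter values are made-up assumptions.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Toy evolutionary search over bit strings with elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # rank by fitness
        best_history.append(fitness(pop[0]))
        parents = pop[: pop_size // 2]            # keep the fitter half
        children = [pop[0][:]]                    # elitism: best survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)       # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:      # random single-bit mutation
                i = rng.randrange(n_genes)
                child[i] = 1 - child[i]
            children.append(child)
        pop = children
    return max(pop, key=fitness), best_history


# Made-up objective: match a hidden bit pattern.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def fitness(individual):
    return sum(g == t for g, t in zip(individual, TARGET))


best, history = evolve(fitness, n_genes=len(TARGET))
```

In AutoML the individuals would be candidate pipelines and the fitness would be a cross-validated score; here the elitism step simply guarantees that the best fitness never decreases from one generation to the next.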
The Shapley additive explanations (SHAP) framework [27] was combined with the evolutionary algorithm-based Tree-Based Pipeline Optimization Tool (TPOT) [28] to discover novel features and select optimal supervised machine learning models. We explored the interaction effects of multiple predictors using the SHAP framework methodology. In game theory, the Shapley value is the average expected marginal contribution of one player across all possible permutations of players (average effects of team member composition and team size). The Shapley value helps to determine a payoff for all the game players when each player might have contributed more or less than the others when working in coalition. In machine learning, game players are the features, and collective payout is the model prediction. The SHAP framework provides local explanations based on exact Shapley values to understand the global model structure. For every possible feature ordering, features are introduced one at a time into a conditional expectation function of the model's output, and changes in expectation are attributed to the introduced feature, averaged over all possible feature orderings in a fair manner. SHAP values represent a change in log odds ratio. Lundberg and Lee have proposed SHAP as the only additive feature attribution method that satisfies two important properties of game theory: additivity (local accuracy) and monotonicity (consistency) [27]. The integrated game theoretical approach with automated machine learning further advances biomedical data science for data-driven precision care.
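The averaging over feature orderings described above can be illustrated with a small exact Shapley computation. This is a toy sketch that enumerates every ordering of three players; it does not use the Shap package, and the payoff function and its numbers are invented purely for illustration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_value(coalition):
    """Made-up payoff: two main effects plus one interaction term."""
    v = 0.0
    if "hba1c" in coalition:
        v += 0.3
    if "map" in coalition:
        v += 0.1
    if "hba1c" in coalition and "insulin" in coalition:
        v += 0.2  # interaction shared between the two features
    return v


players = ["hba1c", "map", "insulin"]
phi = shapley_values(players, toy_value)
```

The attributions sum to the difference between the full-coalition payoff and the empty-coalition payoff, which is the additivity (local accuracy) property, and the interaction term is split evenly between the two features that jointly produce it.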
The AutoML models were built using Anaconda's distribution of the Python v3.7.9 programming language in a JupyterLab computational environment. Community-developed Python packages were used for modular programming: Pandas v0.25.3, Numpy v1.19.2, Matplotlib v3.3.2, Scikit-learn v0.23.2, TPOT v0.11.7 and Shap v0.37.0. We trained the AutoML models on a Linux server with an Intel Xeon Gold 6138 CPU. In the TPOT classifier, the search for optimal machine learning pipelines was run over 100 generations with 100 individuals retained in the genetic programming population of every generation. We used 5-fold stratified cross validation to preserve the same proportion of GDM cases in each fold, and model performances were evaluated using the area under the receiver operating characteristic curve (AUC).
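The evaluation scheme above (stratified folds scored by AUC) can be sketched without any third-party library. This is a simplified stand-in for the Scikit-learn utilities the study used: the fold assignment is a deterministic round-robin without shuffling, the AUC is computed as the probability that a case is scored above a non-case, and the label vector of 29 cases among 222 women is an assumption inferred from the reported 13.1% prevalence.

```python
def stratified_folds(labels, k=5):
    """Assign indices to k folds so each fold keeps the class proportions."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        for j, i in enumerate(members):
            folds[j % k].append(i)   # round-robin within each class
    return folds


def auc(labels, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed label vector: 29 GDM cases among 222 women (about 13.1%).
labels = [1] * 29 + [0] * 193
folds = stratified_folds(labels, k=5)
cases_per_fold = [sum(labels[i] for i in fold) for fold in folds]

# Small hand-checkable AUC example with made-up scores.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With so few cases, stratification matters: each fold ends up with five or six GDM cases, whereas unstratified splits could leave a fold with too few positives for a stable AUC estimate.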
The AutoML feature selection model based on preconception feature variables was trained with GDM as the outcome; top predictors with SHAP value magnitudes greater than zero were included in the GDM prediction models. Sensitivity analyses were performed to explore the prediction effects of fasting glucose, systolic blood pressure and HOMA-IR index in the proposed AutoML model. We also assessed the associations between the strongest predictors and GDM outcome/adverse birth outcomes (preterm birth, low birthweight at term and large for gestational age infants). Preterm birth was defined as livebirth before 37 weeks of pregnancy [29]. Low birthweight at term was defined as birthweight less than 2500 g in term births (37-42 weeks of pregnancy) [29]. The sex-specific birthweight for gestational age percentile was derived with reference to the Growing Up in Singapore Towards Healthy Outcomes (GUSTO) healthy newborn weight percentile [30], which was based on the generic reference for birthweight percentiles created by Mikolajczyk et al. [31]. Large for gestational age infants were defined as those with a birthweight above the 90th percentile. Additional sensitivity analyses were performed by excluding preconception women with prediabetes (IFG and IGT). All association analyses were performed using Stata/MP 17.0 software (StataCorp LP, College Station, TX, USA).
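The three adverse birth outcome definitions given in the methods can be expressed as a small classification helper. This is a sketch under stated assumptions: gestational age is taken in completed weeks, the term window is treated as 37 to 42 weeks inclusive, and the sex-specific birthweight percentile is assumed to be precomputed from the reference chart; the function and argument names are hypothetical.

```python
def classify_birth_outcomes(ga_weeks: int, birthweight_g: float,
                            bw_percentile: float) -> dict:
    """Flag adverse birth outcomes from gestational age, weight and percentile."""
    is_term = 37 <= ga_weeks <= 42
    return {
        "preterm_birth": ga_weeks < 37,
        "low_birthweight_at_term": is_term and birthweight_g < 2500,
        "large_for_gestational_age": bw_percentile > 90,
    }


# Hypothetical newborns, for illustration only.
preterm_case = classify_birth_outcomes(ga_weeks=35, birthweight_g=2300, bw_percentile=40)
term_small = classify_birth_outcomes(ga_weeks=39, birthweight_g=2400, bw_percentile=5)
term_large = classify_birth_outcomes(ga_weeks=40, birthweight_g=4200, bw_percentile=96)
```

Note that a preterm infant weighing under 2500 g is not counted as low birthweight at term, since that outcome is restricted to term births.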
Figure 2 presents the SHAP global importance plot of the AutoML feature selection model. A stacked ensemble model with a random forest classifier and linear support vector machine classifier (stochastic gradient descent training) was the best machine learning pipeline evaluated by TPOT (AUC: 0.89). The top preconception feature variables impacting the model outputs were HbA1c, fatty liver index, mean arterial blood pressure, fasting insulin, TG/HDL ratio, height, age, mid-upper arm circumference, BMI, parity, alcohol consumption, family history of diabetes mellitus and Chinese ethnicity. Pre-pregnancy BMI demonstrated small predictive effects relative to preconception HbA1c. Chinese women also had a higher risk of GDM when compared with Indian and Malay women. The latter observation could be attributed to the high proportion of Chinese ethnic participants in the analysis set (79.3%). A history of GDM was a redundant feature in the AutoML feature selection model possibly due to the low frequency of participants with a history of documented GDM (2.7%). Metabolic syndrome status preconception did not contribute to GDM prediction.

Preconception Predictive Risk Model
The preconception predictive risk model for GDM was sequentially constructed using top predictors with SHAP value magnitudes greater than zero (Table 2). Preconception HbA1c alone was able to predict GDM outcome with high discrimination (AUC: 0.81). A model with nine features obtainable non-invasively (mean arterial blood pressure, height, age, mid-upper arm circumference, BMI, parity, alcohol consumption, family history of diabetes, Chinese ethnicity) was also able to predict GDM outcome with good performance (AUC: 0.81). The optimal machine learning pipeline comprises five features (HbA1c, fatty liver index, mean arterial blood pressure, fasting insulin, TG/HDL ratio). The extra trees classifier was the best machine learning pipeline evaluated by TPOT (AUC: 0.93). In the sensitivity analysis (see Supplementary Table S1), we observed that model performance was still maintained by dropping the fatty liver index as a feature variable. Based on the remaining four features, a stacked ensemble model with a gradient boosting classifier and linear support vector machine classifier (stochastic gradient descent training) was the best machine learning pipeline evaluated by TPOT (AUC: 0.93). The four-feature model comprising HbA1c, mean arterial blood pressure, fasting insulin and TG/HDL ratio is our proposed solution for a preconception-based GDM predictor. The exported AutoML pipeline for the best predictive model is provided in the Supplementary Materials.
Table 2. Construction of preconception predictive risk model. The preconception predictive risk model for GDM was sequentially constructed using top predictors with SHAP value magnitudes greater than zero in the AutoML feature selection model. The optimal machine learning pipeline for each model and area under the receiver operating characteristic curve (AUC) performance metric are reported.
The proposed AutoML model was also robust when replacing HbA1c with fasting glucose (AUC: 0.87), replacing mean arterial blood pressure with systolic blood pressure (AUC: 0.91) and replacing fasting insulin with HOMA-IR index (AUC: 0.91) (Supplementary Table S1). HbA1c had the greatest impact on model performance changes (∆AUC = −0.06), followed by mean arterial blood pressure (∆AUC = −0.02) and fasting insulin (∆AUC = −0.02). Given these observations, maternal insulin resistance around conception can be postulated as an important determinant in the pathophysiology of metabolic diseases and fetal programming.
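The ∆AUC values quoted above follow directly from the reported AUCs, taking the four-feature model (AUC: 0.93) as the baseline. The short Python sketch below reproduces that arithmetic; the dictionary keys are descriptive labels chosen here for illustration.

```python
BASELINE_AUC = 0.93  # proposed four-feature model

# AUC after substituting one feature, as reported in the sensitivity analysis.
substituted_auc = {
    "HbA1c -> fasting glucose": 0.87,
    "mean arterial BP -> systolic BP": 0.91,
    "fasting insulin -> HOMA-IR": 0.91,
}

# Change in AUC relative to the baseline, rounded to two decimals.
delta_auc = {
    name: round(value - BASELINE_AUC, 2)
    for name, value in substituted_auc.items()
}

# Substitution with the largest performance drop.
most_influential = min(delta_auc, key=delta_auc.get)
```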

Table 2 columns: Features; Optimal Machine Learning Pipeline; AUC.
Table 3 presents the associations of the strongest predictors identified from the AutoML feature selection model for GDM. Each 1 mmol/mol increase in preconception HbA1c was positively associated with GDM, independent of maternal ethnicity, age, parity, family history of diabetes mellitus and pre-pregnancy BMI (p = 0.001, OR (95% CI) 1.34 (1.13-1.60)).
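For readers who want to relate the reported odds ratio, confidence interval and p-value to one another, the sketch below back-calculates the log-odds coefficient and a Wald test from the published figures. It assumes a standard Wald interval on the log-odds scale and a two-sided normal test, which is a common default but is not confirmed by the text; the function names are hypothetical. It also shows how a per-unit odds ratio scales to a larger increment under the model's log-linearity assumption.

```python
import math


def wald_from_odds_ratio(or_point, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, standard error, z and two-sided p from an OR and 95% CI."""
    beta = math.log(or_point)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p


def odds_ratio_for_increase(or_per_unit, units):
    """Odds ratio for a multi-unit increase, assuming log-linearity."""
    return or_per_unit ** units


# Reported association of preconception HbA1c (per 1 mmol/mol) with GDM.
beta, se, z, p = wald_from_odds_ratio(1.34, 1.13, 1.60)

# Implied odds ratio for a 5 mmol/mol difference, under the same assumption.
or_per_5 = odds_ratio_for_increase(1.34, 5)
```

The recovered p-value is close to the reported 0.001, which suggests the published estimate, interval and p-value are mutually consistent; small differences are expected because the inputs are rounded to two decimals.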

Associations of Top Predictors and Adverse Birth Outcomes (Preterm Birth, Low Birthweight at Term and Large for Gestational Age Infant)
Similarly, Table 4 presents the associations of top GDM predictors with adverse birth outcomes (preterm birth, low birthweight at term and large for gestational age infant). Each 1 mmol/mol increase in preconception HbA1c was positively associated with preterm birth outcome, independent of maternal ethnicity, age, parity, family history of diabetes mellitus, pre-pregnancy BMI, GDM diagnosis, total gestational weight gain and child sex (p = 0.011, OR: 1.63 (1.12-2.38)). However, preconception HbA1c was not associated with low birthweight at term (OR: 1.13 (0.86-1.49)) or large for gestational age infant (OR: 1.06 (0.92-1.21)). We additionally found that pre-pregnancy BMI was positively associated with large for gestational age infant (p < 0.001, OR: 1.20 (1.10-1.31)).
Table 4. Associations of top predictors and adverse birth outcomes (preterm birth, low birthweight at term and large for gestational age infant). Associations of top predictors identified from AutoML feature selection model and adverse birth outcomes (preterm birth, low birthweight at term and large for gestational age infant). Statistical tests were conducted two-sided with a significance level of 5%. All confidence intervals (CIs) are presented two-sided with a confidence level of 95%. The odds ratios (ORs) with 95% CI are presented. A resultant p-value of less than 0.05 is considered statistically significant.
After excluding women with prediabetes, the associations between preconception HbA1c and a GDM outcome (p = 0.003, OR: 1.32 (1.10-1.59)) and with a preterm birth outcome (p = 0.010, OR: 1.75 (1.14-2.67)) were not materially changed.

Primary Findings
We built an effective preconception-based GDM predictor by integrating game theory concepts with evolutionary algorithm-based AutoML. Our proposed AutoML model was derived using genetic programming and achieved an excellent AUC of 0.93 with four features (HbA1c, mean arterial blood pressure, fasting insulin, TG/HDL ratio). A stacked ensemble model with the gradient boosting classifier and linear support vector machine classifier (stochastic gradient descent training) was the best machine learning pipeline evaluated by TPOT. The preconception predictive risk model can be leveraged as a risk stratification tool during preconception care to identify Asian women at high risk of developing GDM, enabling early intervention. Our non-invasive model trained with nine features (mean arterial blood pressure, height, age, mid-upper arm circumference, BMI, parity, alcohol consumption, family history of diabetes, Chinese ethnicity) provides an alternative for clinical implementation if blood-derived markers are unavailable (AUC: 0.81).
Population-based research on preconception HbA1c and its association with GDM and adverse birth outcomes remains limited. In our study, HbA1c was the top predictive feature discovered from AutoML feature selection modeling. The physiological variation in HbA1c can be attributed to increased red cell turnover during pregnancy with new erythrocytes exposed to a lower time-averaged glucose concentration [32] and decreasing insulin sensitivity with increasing gestation [33].
In the fully adjusted logistic regression model (adjusted for maternal ethnicity, age, parity, family history of diabetes mellitus and pre-pregnancy BMI), preconception HbA1c was associated with increased risks of GDM. Preconception HbA1c alone had a high predictive performance in the AutoML model (AUC: 0.81). Similarly in the sensitivity analyses, the predictive performance of the AutoML model was stronger with preconception HbA1c (AUC: 0.93) than with preconception fasting glucose (AUC: 0.87), implying that early GDM risk stratification may be improved by using preconception HbA1c rather than preconception fasting glucose. Moreover, HbA1c offers greater clinical convenience than fasting glucose as there is no fasting requirement, less biological variation and greater pre-analytical stability [34]. As HbA1c is a measure of how glucose has interacted with erythrocytes over a period of up to three months [35], our findings suggest that women who develop GDM may have impaired glucose homeostasis prior to pregnancy itself.
The clinical usefulness of preconception HbA1c can be extended to adverse pregnancy outcomes. In a Swedish study by Ludvigsson et al. [36], women with periconceptional HbA1c levels within recommended target levels (HbA1c < 6.5%) were at increased risk of preterm delivery. The risk of early preterm birth increased with increasing HbA1c levels in normal pregnancies and among women with type 1 diabetes [36]. Our study provides further evidence that preconception HbA1c is an independent risk factor for preterm birth. In the fully adjusted logistic regression model (adjusted for maternal ethnicity, age, parity, family history of diabetes mellitus, pre-pregnancy BMI, GDM diagnosis, total gestational weight gain and child sex), preconception HbA1c was associated with increased risks of preterm birth. Associations between preconception HbA1c and GDM and preterm birth were not materially changed after excluding women with prediabetes, indicating that preconception HbA1c is a reliable marker in predicting GDM/preterm birth even within the normal HbA1c range.
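Because the associations in this study are reported per 1 mmol/mol of HbA1c while the target level above is quoted in percent, a unit conversion helps when comparing the two. The sketch below uses the standard NGSP/IFCC master equation (NGSP % = 0.09148 × IFCC mmol/mol + 2.152); the function names are illustrative assumptions and the conversion itself is not part of the study's analysis.

```python
def ifcc_to_ngsp(hba1c_mmol_mol: float) -> float:
    """Convert HbA1c from IFCC units (mmol/mol) to NGSP units (%)."""
    return 0.09148 * hba1c_mmol_mol + 2.152


def ngsp_to_ifcc(hba1c_percent: float) -> float:
    """Convert HbA1c from NGSP units (%) to IFCC units (mmol/mol)."""
    return (hba1c_percent - 2.152) / 0.09148


# The 6.5% target level expressed in mmol/mol.
target_mmol_mol = ngsp_to_ifcc(6.5)

# Size of a 1 mmol/mol step on the percent scale.
step_in_percent = ifcc_to_ngsp(41.0) - ifcc_to_ngsp(40.0)
```

On this scale a 1 mmol/mol increase corresponds to roughly 0.09 percentage points, so the per-unit odds ratios describe quite small differences in HbA1c.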
Blood pressure changes between preconception and pregnancy are underexplored in the literature. In our study, the mean arterial blood pressure feature was another critical component of the AutoML model. Although mean arterial blood pressure at preconception was not associated with GDM outcome, the link between preconception blood pressure and physiological changes associated with pregnancy complications warrants further investigation.
The TG/HDL ratio is a surrogate marker for insulin resistance and was one of the top five features in the AutoML feature selection model. GDM is a condition of increased insulin resistance, and this shifts the balance of lipid processing as reflected by the TG/HDL ratio [37,38]. The four features in AutoML modeling for GDM prediction (HbA1c, mean arterial blood pressure, fasting insulin and TG/HDL ratio) discovered through genetic programming are suggestive of transient insulin resistance at preconception and reflect the women's pre-existing metabolic physiology. This physiology has a bearing on the women's ability to mount an appropriate metabolic adaptation to pregnancy in response to signals from the conceptus to ensure a successful pregnancy. Dysfunctional metabolic adaptation can thus lead to gestational diabetes and preterm birth.

Limitations
This study has some limitations due to scarcity of longitudinal data. Our AutoML model was trained on a limited S-PRESTO cohort of 222 preconception women. However, the prospective S-PRESTO data capture complex clinical pathways during pregnancy initiation and are less prone to differential measurement errors. Sub-cohort models for individual ethnic groups could be trained with larger clinical datasets such as electronic health records. No replication cohort was available, and our proposed model should be evaluated in confirmatory studies. The four features in AutoML modeling for GDM prediction need to be evaluated in an early pregnancy cohort for generalizability.

Comparison with Prior Work
The implementation of our GDM risk prediction algorithm during preconception care would enable early engagement of women for preventive intervention, compared to existing pregnancy-based GDM risk prediction algorithms [14,15] developed for antenatal care. In another recent study by Wu et al. [39], an early pregnancy prediction model for GDM was developed based on genetic variants (four genetic susceptible single nucleotide polymorphisms (SNPs)) and six basic clinical features (AUC: 0.73). The latter model requires more advanced laboratory testing for SNPs, which may not be routinely available in all standard clinical laboratories. Xiong et al. [40] developed high performance machine learning models with the linear support vector machine classifier and LightGBM gradient boosting classifier using 10-19 weeks' gestation data (AUC: 0.91-0.98), which may be too late for effective GDM interventions. With four basic clinical features measured at preconception and high prediction performance of AUC: 0.93, our stacked ensemble model with the gradient boosting classifier and linear support vector machine classifier (stochastic gradient descent training) offers a simpler solution for early GDM prediction.

Conclusions
Leveraging AI and evolutionary algorithms, we devised a population-based predictive care solution to assess the risk of developing GDM among Asian women at preconception. Optimal control of preconception HbA1c has the potential to lower the risk of GDM and reduce the incidence of preterm birth. Our trained classifier has been deployed in a web application for GDM prevention programs and intervention with early-stage nutritional and lifestyle changes during preconception care. The GDM predictor can also be combined with a digital health intervention such as a smartphone application.
Author Contributions: M.K. contributed to research study design, machine learning modeling, statistical analyses, interpretation of results and writing of the manuscript. L.T.A., H.P. and M.N. contributed to clinical data curation. K.T. contributed to the acquisition, curation of biochemistry data and critical reading of the manuscript. S.L.L. contributed to collection of phenotypic data in S-PRESTO cohort and critical reading of the manuscript. K.H.T., J.K.Y.C., K.M.G., S.-y.C. and Y.S.C. contributed to S-PRESTO cohort study design, data collection and critical reading of the manuscript. J.G.E. contributed to interpretation of results, writing of the manuscript and S-PRESTO cohort data collection. M.F. contributed to supervision of the study, interpretation of results and writing of the manuscript. N.K. contributed to supervision of the study, interpretation of results, writing of the manuscript and S-PRESTO cohort study data collection. M.F. and N.K. accept full responsibility for the work, had access to the data and controlled the decision to publish. All authors have read and agreed to the published version of the manuscript.