A Machine Learning Model Based on First-Trimester Lipidomic Signatures for Predicting Metabolic Pregnancy Complications
Abstract
1. Introduction
1.1. The Clinical Challenge of Metabolic Pregnancy Complications
1.2. The Search for Early Predictive Biomarkers
1.3. The Lipidome as a Rich Source of Early Markers
1.4. Advancing Prediction with Machine Learning and Integrated Data
1.5. Study Aim and Novelty
2. Results
2.1. Clinical Parameters
2.2. Diagnostic Models
3. Discussion
4. Materials and Methods
4.1. Study Design
- Singleton pregnancy;
- Neonatal birth weight ≥ 2500 g;
- Absence of malignant diseases in the mother;
- No history of organ transplantation in the mother;
- Absence of pregestational type 1 or type 2 diabetes mellitus in the mother;
- Oral glucose tolerance test performed at 24–28 weeks of gestation, and delivery at the center;
- Absence of congenital malformations in the mother and fetus;
- Absence of other major pregnancy complications;
- Provision of informed consent by the mother for participation in the study.
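
The criteria listed above can be read as a single conjunctive eligibility rule. The following is a minimal sketch of such a rule, not the authors' screening procedure: the record type `PregnancyRecord`, all of its field names, and the treatment of the 24–28 week window as inclusive at both ends are assumptions introduced here for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class PregnancyRecord:
    """One pregnancy. Every field name is hypothetical (not from the study)."""
    singleton: bool
    neonatal_birth_weight_g: float
    maternal_malignancy: bool
    maternal_organ_transplant: bool
    pregestational_diabetes: bool      # type 1 or type 2
    ogtt_week: Optional[float]         # gestational week of the OGTT, None if not done
    delivered_at_center: bool
    congenital_malformations: bool     # in the mother or the fetus
    other_major_complications: bool
    informed_consent: bool


def is_eligible(r: PregnancyRecord) -> bool:
    """Return True only if every listed criterion is satisfied."""
    # Assumption: the 24-28 week window is inclusive at both ends.
    ogtt_in_window = r.ogtt_week is not None and 24 <= r.ogtt_week <= 28
    return (
        r.singleton
        and r.neonatal_birth_weight_g >= 2500
        and not r.maternal_malignancy
        and not r.maternal_organ_transplant
        and not r.pregestational_diabetes
        and ogtt_in_window
        and r.delivered_at_center
        and not r.congenital_malformations
        and not r.other_major_complications
        and r.informed_consent
    )


# Illustrative record that satisfies every criterion.
example = PregnancyRecord(
    singleton=True,
    neonatal_birth_weight_g=3400.0,
    maternal_malignancy=False,
    maternal_organ_transplant=False,
    pregestational_diabetes=False,
    ogtt_week=26.0,
    delivered_at_center=True,
    congenital_malformations=False,
    other_major_complications=False,
    informed_consent=True,
)
```

Because the rule is a plain conjunction, changing any single field (for example `replace(example, neonatal_birth_weight_g=2400.0)`) is enough to make a record ineligible.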
4.2. Sample Collection and Preparation
4.3. Lipidomic Mass Spectrometric Analysis
4.4. Statistical Analysis
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AUC | Area Under the Receiver Operating Characteristic Curve |
| BMI | Body Mass Index |
| GDM | Gestational Diabetes Mellitus |
| IVF | In Vitro Fertilization |
| MLP | Multilayer Perceptron |
| MS | Mass Spectrometry |
| m/z | Mass-to-Charge Ratio |
| NPV | Negative Predictive Value |
| OGTT | Oral Glucose Tolerance Test |
| PC | Phosphatidylcholine |
| PG | Phosphatidylglycerol |
| PPV | Positive Predictive Value |
| ROC | Receiver Operating Characteristic |
| SHAP | Shapley Additive exPlanations |
| TG | Triglyceride |
| XGBoost | Extreme Gradient Boosting |
References
1. International Diabetes Federation. Diabetes Atlas, 11th ed.; IDF: Brussels, Belgium, 2025; ISBN 9782930229966.
2. Hocquette, A.; Durox, M.; Wood, R.; Klungsøyr, K.; Szamotulska, K.; Berrut, S.; Rihs, T.; Kyprianou, T.; Sakkeus, L.; Lecomte, A.; et al. International versus national growth charts for identifying small and large-for-gestational age newborns: A population-based study in 15 European countries. Lancet Reg. Health-Eur. 2021, 8, 100167.
3. Billionnet, C.; Mitanchez, D.; Weill, A.; Nizard, J.; Alla, F.; Hartemann, A.; Jacqueminet, S. Gestational diabetes and adverse perinatal outcomes from 716,152 births in France in 2012. Diabetologia 2017, 60, 636–644.
4. Reece, E.A. The fetal and maternal consequences of gestational diabetes mellitus. J. Matern. Neonatal Med. 2010, 23, 199–203.
5. McIntyre, H.D.; Catalano, P.; Zhang, C.; Desoye, G.; Mathiesen, E.R.; Damm, P. Gestational diabetes mellitus. Nat. Rev. Dis. Prim. 2019, 5, 47.
6. Farahvar, S.; Walfisch, A.; Sheiner, E. Gestational diabetes risk factors and long-term consequences for both mother and offspring: A literature review. Expert Rev. Endocrinol. Metab. 2019, 14, 63–74.
7. Damm, P. Future risk of diabetes in mother and child after gestational diabetes mellitus. Int. J. Gynecol. Obstet. 2009, 104, S25–S26.
8. Kramer, C.K.; Campbell, S.; Retnakaran, R. Gestational diabetes and the risk of cardiovascular disease in women. Diabetologia 2019, 62, 905–914.
9. Kelstrup, L.; Damm, P.; Mathiesen, E.R.; Hansen, T.; Vaag, A.A.; Pedersen, O.; Clausen, T.D. Insulin resistance and impaired pancreatic β-cell function in adult offspring of women with diabetes in pregnancy. J. Clin. Endocrinol. Metab. 2013, 98, 3793–3801.
10. Johnsson, I.W.; Haglund, B.; Ahlsson, F.; Gustafsson, J. A high birth weight is associated with increased risk of type 2 diabetes and obesity. Pediatr. Obes. 2015, 10, 77–83.
11. Kuciene, R.; Dulskiene, V.; Medzioniene, J. Associations between high birth weight, being large for gestational age, and high blood pressure among adolescents: A cross-sectional study. Eur. J. Nutr. 2018, 57, 373–381.
12. Kumru, P.; Arisoy, R.; Erdogdu, E.; Demirci, O.; Kavrut, M.; Ardıc, C.; Aslaner, N.; Ozkoral, A.; Ertekin, A. Prediction of gestational diabetes mellitus at first trimester in low-risk pregnancies. Taiwan. J. Obstet. Gynecol. 2016, 55, 815–820.
13. Mavreli, D.; Evangelinakis, N.; Papantoniou, N.; Kolialexi, A. Quantitative comparative proteomics reveals candidate biomarkers for the early prediction of gestational diabetes mellitus: A preliminary study. In Vivo 2020, 34, 517–525.
14. Yang, J.; Cao, Y.; Qian, F.; Grewal, J.; Sacks, D.B.; Chen, Z.; Tsai, M.Y.; Chen, J.; Zhang, C. Early prediction of gestational diabetes mellitus based on systematically selected multi-panel biomarkers and clinical accessibility: A longitudinal study of a multi-racial pregnant cohort. BMC Med. 2025, 23, 430.
15. Lin, J.; Zhao, D.; Liang, Y.; Liang, Z.; Wang, M.; Tang, X.; Zhuang, H.; Wang, H.; Yin, X.; Huang, Y.; et al. Proteomic analysis of plasma total exosomes and placenta-derived exosomes in patients with gestational diabetes mellitus in the first and second trimesters. BMC Pregnancy Childbirth 2024, 24, 713.
16. Borges Manna, L.; Syngelaki, A.; Würtz, P.; Koivu, A.; Sairanen, M.; Pölönen, T.; Nicolaides, K.H. First-trimester nuclear magnetic resonance–based metabolomic profiling increases the prediction of gestational diabetes mellitus. Am. J. Obstet. Gynecol. 2025, 233, 71.e1–71.e14.
17. Wang, Y.; Huang, Y.; Wu, P.; Ye, Y.; Sun, F.; Yang, X.; Lu, Q.; Yuan, J.; Liu, Y.; Zeng, H.; et al. Plasma lipidomics in early pregnancy and risk of gestational diabetes mellitus: A prospective nested case-control study in Chinese women. Am. J. Clin. Nutr. 2021, 114, 1763–1773.
18. Rahman, M.L.; Feng, Y.C.A.; Fiehn, O.; Albert, P.S.; Tsai, M.Y.; Zhu, Y.; Wang, X.; Tekola-Ayele, F.; Liang, L.; Zhang, C. Plasma lipidomics profile in pregnancy and gestational diabetes risk: A prospective study in a multiracial/ethnic cohort. BMJ Open Diabetes Res. Care 2021, 9, e001551.
19. Monari, F.; Menichini, D.; Spano’ Bascio, L.; Grandi, G.; Banchelli, F.; Neri, I.; D’Amico, R.; Facchinetti, F. A first trimester prediction model for large for gestational age infants: A preliminary study. BMC Pregnancy Childbirth 2021, 21, 654.
20. Du, J.; Zhang, X.; Chai, S.; Zhao, X.; Sun, J.; Yuan, N.; Yu, X.; Zhang, Q. Nomogram-based risk prediction of macrosomia: A case-control study. BMC Pregnancy Childbirth 2022, 22, 392.
21. Tranidou, A.; Tsakiridis, I.; Apostolopoulou, A.; Xenidis, T.; Pazaras, N.; Mamopoulos, A.; Athanasiadis, A.; Chourdakis, M.; Dagklis, T. Prediction of Gestational Diabetes Mellitus in the First Trimester of Pregnancy Based on Maternal Variables and Pregnancy Biomarkers. Nutrients 2024, 16, 120.
22. Koos, B.J.; Gornbein, J.A. Early pregnancy metabolites predict gestational diabetes mellitus: Implications for fetal programming. Am. J. Obstet. Gynecol. 2021, 224, 215.e1–215.e7.
23. Zhong, Z.; An, R.; Ma, S.; Zhang, N.; Zhang, X.; Chen, L.; Wu, X.; Lin, H.; Xiang, T.; Tan, H.; et al. Association between the Maternal Gut Microbiome and Macrosomia. Biology 2024, 13, 570.
24. Klebanoff, M.A.; Mednick, B.R.; Schulsinger, C.; Secher, N.J.; Shiono, P.H. Father’s effect on infant birth weight. Am. J. Obstet. Gynecol. 1998, 178, 1022–1026.
25. Tomita, H.; Iwama, N.; Hamada, H.; Kudo, R.; Tagami, K.; Kumagai, N.; Sato, N.; Izumi, S.; Sakurai, K.; Watanabe, Z.; et al. The impact of maternal and paternal birth weights on infant birth weights: The Japan environment and children’s study. J. Dev. Orig. Health Dis. 2023, 14, 699–710.
26. Rojo-López, M.I.; Barranco-Altirriba, M.; Rossell, J.; Antentas, M.; Castelblanco, E.; Yanes, O.; Weber, R.J.M.; Lloyd, G.R.; Winder, C.; Dunn, W.B.; et al. The Lipidomic Profile Is Associated with the Dietary Pattern in Subjects with and without Diabetes Mellitus from a Mediterranean Area. Nutrients 2024, 16, 1805.
27. Rodrigues, W.J.; Nekrakaleya, B.; Ramaiah, C.K.; Poojary, B. Bioassay-guided Isolation and Identification of Antidiabetic Compounds from Naregamia alata. Curr. Bioact. Compd. 2023, 19, e130423215719.
28. Wu, P.; Wang, Y.; Ye, Y.; Yang, X.; Huang, Y.; Ye, Y.; Lai, Y.; Ouyang, J.; Wu, L.; Xu, J.; et al. Liver biomarkers, lipid metabolites, and risk of gestational diabetes mellitus in a prospective study among Chinese pregnant women. BMC Med. 2023, 21, 150.
29. Bagheri, M.; Tiwari, H.K.; Murillo, A.L.; Al-Tobasei, R.; Arnett, D.K.; Kind, T.; Barupal, D.K.; Fan, S.; Fiehn, O.; O’Connell, J.; et al. A lipidome-wide association study of the lipoprotein insulin resistance index. Lipids Health Dis. 2020, 19, 153.
30. Pang, S.J.; Liu, T.T.; Pan, J.C.; Man, Q.Q.; Song, S.; Zhang, J. The Association between the Plasma Phospholipid Profile and Insulin Resistance: A Population-Based Cross-Section Study from the China Adult Chronic Disease and Nutrition Surveillance. Nutrients 2024, 16, 1205.
31. Starodubtseva, N.L.; Tokareva, A.O.; Rodionov, V.V.; Brzhozovskiy, A.G.; Bugrova, A.E.; Chagovets, V.V.; Kometova, V.V.; Kukaev, E.N.; Soares, N.C.; Kovalev, G.I.; et al. Integrating Proteomics and Lipidomics for Evaluating the Risk of Breast Cancer Progression: A Pilot Study. Biomedicines 2023, 11, 1786.
32. Tonoyan, N.M.; Chagovets, V.V.; Starodubtseva, N.L.; Tokareva, A.O.; Chingin, K.; Kozachenko, I.F.; Adamyan, L.V.; Frankevich, V.E. Alterations in lipid profile upon uterine fibroids and its recurrence. Sci. Rep. 2021, 11, 11447.
33. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22.
34. Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
35. Clerc, M.; Kennedy, J. The Particle Swarm—Explosion, Stability, and Convergence in a Multidimensional Complex Space. IEEE Trans. Evol. Comput. 2002, 6, 58–73.
36. Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665.
37. Wishart, D.S.; Guo, A.C.; Oler, E.; Wang, F.; Anjum, A.; Peters, H.; Dizon, R.; Sayeeda, Z.; Tian, S.; Lee, B.L.; et al. HMDB 5.0: The Human Metabolome Database for 2022. Nucleic Acids Res. 2022, 50, D622–D631.
38. R Core Team. R: A Language and Environment for Statistical Computing. Available online: https://www.r-project.org (accessed on 22 October 2025).
39. Kuhn, M. Building predictive models in R using the caret package. J. Stat. Softw. 2008, 28, 1–26.
40. Kalinowski, T.; Falbe, D.; Allaire, J.; Chollet, F.; RStudio; Google; Tang, Y.; Van Der Bijl, W.; Studer, M.; Keydana, S.; et al. R Interface to “Keras”. 2024. Available online: https://keras3.posit.co/index.html (accessed on 22 October 2025).
41. Covert, I.; Lee, S.-I. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, Online, 13–15 April 2021; Banerjee, A., Fukumizu, K., Eds.; PMLR: Cambridge, MA, USA, 2021; Volume 130, pp. 3457–3465.
42. Wright, M.N.; Ziegler, A. Ranger: A fast implementation of random forests for high dimensional data in C++ and R. J. Stat. Softw. 2017, 77, 1–17.
43. Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J.-C.; Mueller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77.
44. Mayer, M. shapviz: SHAP Visualizations 2025. Available online: https://github.com/modeloriented/shapviz (accessed on 22 October 2025).
45. Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Springer: New York, NY, USA, 2008; ISBN 978-0-387-78170-9.

| Data Type | Method | Outcome | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|
| Lipids, Positive Ion Mode | Random Forest | GDM | 0.23 | 0.95 | 0.71 |
| Lipids, Positive Ion Mode | XGBoost | GDM | 0.60 | 0.89 | 0.79 |
| Lipids, Positive Ion Mode | MLP | GDM | 0.03 | 0.99 | 0.66 |
| Lipids, Positive Ion Mode | Random Forest | Macrosomia | 0.44 | 0.95 | 0.76 |
| Lipids, Positive Ion Mode | XGBoost | Macrosomia | 0.76 | 0.80 | 0.78 |
| Lipids, Positive Ion Mode | MLP | Macrosomia | 0.56 | 0.73 | 0.66 |
| Lipids, Negative Ion Mode | Random Forest | GDM | 0.20 | 0.92 | 0.68 |
| Lipids, Negative Ion Mode | XGBoost | GDM | 0.53 | 0.80 | 0.71 |
| Lipids, Negative Ion Mode | MLP | GDM | 0.83 | 0.37 | 0.52 |
| Lipids, Negative Ion Mode | Random Forest | Macrosomia | 0.84 | 0.95 | 0.91 |
| Lipids, Negative Ion Mode | XGBoost | Macrosomia | 0.64 | 0.84 | 0.76 |
| Lipids, Negative Ion Mode | MLP | Macrosomia | 0.56 | 0.73 | 0.66 |
| Clinical Parameters | Random Forest | GDM | 0.51 | 0.92 | 0.76 |
| Clinical Parameters | XGBoost | GDM | 0.47 | 0.77 | 0.65 |
| Clinical Parameters | MLP | GDM | 0.43 | 0.68 | 0.60 |
| Clinical Parameters | Random Forest | Macrosomia | 0.40 | 0.93 | 0.73 |
| Clinical Parameters | XGBoost | Macrosomia | 0.91 | 0.09 | 0.40 |
| Clinical Parameters | MLP | Macrosomia | 0.67 | 0.35 | 0.47 |
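
Several rows in the table above pair a high accuracy with a very low sensitivity, which is the pattern expected when the outcome classes are unbalanced. As a hedged illustration (not part of the authors' analysis), the sketch below uses the identity accuracy = sensitivity × π + specificity × (1 − π), where π is the fraction of positive cases, to back-calculate π from a reported row, and computes balanced accuracy as the mean of sensitivity and specificity. The function names are introduced here, and because the reported values are rounded to two decimals, the results are rough estimates only and say nothing definitive about the cohort.

```python
def implied_positive_fraction(sensitivity: float, specificity: float, accuracy: float) -> float:
    """Solve accuracy = sens * pi + spec * (1 - pi) for pi (fraction of positives)."""
    if sensitivity == specificity:
        # Accuracy then equals both values regardless of pi, so pi is undetermined.
        raise ValueError("positive fraction cannot be recovered when sens == spec")
    return (specificity - accuracy) / (specificity - sensitivity)


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity; insensitive to class balance."""
    return (sensitivity + specificity) / 2


# GDM rows for lipids in positive ion mode, copied from the table above:
# (method, sensitivity, specificity, accuracy)
gdm_positive_mode = [
    ("Random Forest", 0.23, 0.95, 0.71),
    ("XGBoost", 0.60, 0.89, 0.79),
    ("MLP", 0.03, 0.99, 0.66),
]

for method, sens, spec, acc in gdm_positive_mode:
    pi = implied_positive_fraction(sens, spec, acc)
    bal = balanced_accuracy(sens, spec)
    print(f"{method}: implied positive fraction ~ {pi:.2f}, balanced accuracy = {bal:.2f}")
```

Read this way, a model such as the MLP row for GDM reaches its reported accuracy mainly by labelling almost every case as negative, so sensitivity and specificity should be weighed together rather than relying on accuracy alone.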

| Marker | Ion | Absolute Δm/z | Theoretical m/z | Measured m/z | p-Value (FDR) | MR/HR |
|---|---|---|---|---|---|---|
| Lipids, Positive Ion Mode, GDM | | | | | | |
| TG 55:7 (triacylglycerol) | (M+NH4+)+ | 0.009 | 908.7710 | 908.7800 | <0.001 (0.003) | 1.78 |
| 13-Docosenamide (fatty acyl) | (M+H+)+ | 0.002 | 338.3420 | 338.3445 | 0.01 (0.08) | 0.73 |
| PC P-36:2 (phosphatidylcholine) | (M+H+)+ | 0.003 | 770.6060 | 770.6095 | <0.001 (0.01) | 0.80 |
| PG (i-, a- 29:0) (phosphatidylglycerol) | (M+H+−H2O)+ | 0.001 | 663.4600 | 663.4591 | 0.76 (0.88) | 0.89 |
| PC 42:7 (phosphatidylcholine) | (M+H+)+ | 0.004 | 860.6170 | 860.6211 | 0.02 (0.11) | 0.91 |
| Lipids, Negative Ion Mode, GDM | | | | | | |
| 299.0065 | - | - | - | 299.0065 | <0.001 (0.003) | >10 |
| 295.2112 | - | - | - | 295.2112 | 0.001 (0.04) | 0.88 |
| Clinical Parameters, GDM | | | | | | |
| Maternal BMI, kg/m² | | | | | 0.003 (0.12) | 1.10 |
| Maternal birth weight, kg | | | | | 0.55 (1) | 1.00 |
| Macrosomia in history, n (%) | | | | | 0.52 (1) | 1.40 |
| Lipids, Positive Ion Mode, Macrosomia | | | | | | |
| PG (i-, a- 29:0) (phosphatidylglycerol) | (M+H+−H2O)+ | 0.001 | 663.4600 | 663.4591 | <0.001 (<0.001) | 3.44 |
| Lipids, Negative Ion Mode, Macrosomia | | | | | | |
| 4-Hydroxybutyric acid (fatty acyls) | (M+HCO3−)− | 0.008 | 165.0400 | 165.0317 | <0.001 (<0.001) | 3.33 |
| 234.1434 | - | - | - | 234.1434 | <0.001 (<0.001) | 3.22 |
| 174.9463 | - | - | - | 174.9463 | <0.001 (<0.001) | 0.84 |
| 239.1149 | - | - | - | 239.1149 | <0.001 (<0.001) | >10 |
| 951.1787 | - | - | - | 951.1787 | <0.001 (<0.001) | >10 |
| Pantothenol (fatty acyls) | (M+HCO2−)− | 0.002 | 250.1290 | 250.1309 | <0.001 (<0.001) | 2.55 |
| 247.1564 | - | - | - | 247.1564 | <0.001 (<0.001) | 2.38 |
| 374.2242 | - | - | - | 374.2242 | <0.001 (<0.001) | 2.01 |
| 195.1282 | - | - | - | 195.1282 | <0.001 (<0.001) | 1.45 |
| Clinical Parameters, Macrosomia | | | | | | |
| Maternal BMI, kg/m² | | | | | <0.001 (0.003) | 1.13 |
| Maternal birth weight, kg | | | | | <0.001 (0.002) | 1.06 |
| Macrosomia in history, n (%) | | | | | <0.001 (0.001) | 2.80 |
| Paternal birth weight, kg | | | | | 0.004 (0.04) | 1.12 |
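
The Δm/z column in the table above is the absolute difference between the theoretical and measured m/z of each annotated ion. A minimal sketch of that arithmetic follows. The relative error in parts per million is added here because it is a common way to express mass accuracy; it is not a quantity reported in the table, and the function name `mass_error` is an assumption made for illustration.

```python
from typing import Tuple


def mass_error(theoretical_mz: float, measured_mz: float) -> Tuple[float, float]:
    """Return (absolute error in m/z units, relative error in ppm)."""
    delta = abs(measured_mz - theoretical_mz)
    ppm = delta / theoretical_mz * 1e6
    return delta, ppm


# Annotated markers copied from the table above: (name, theoretical m/z, measured m/z)
annotated = [
    ("TG 55:7", 908.7710, 908.7800),
    ("PC 42:7", 860.6170, 860.6211),
    ("4-Hydroxybutyric acid", 165.0400, 165.0317),
]

for name, theo, meas in annotated:
    delta, ppm = mass_error(theo, meas)
    print(f"{name}: |delta m/z| = {delta:.4f}, relative error = {ppm:.1f} ppm")
```

The same absolute error corresponds to a larger relative error at low m/z, which may be worth keeping in mind when comparing annotation confidence between small molecules and intact lipids.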

| Complication | Model | Sensitivity | Specificity | Accuracy | AUC | PPV | NPV | F-Score |
|---|---|---|---|---|---|---|---|---|
| GDM | XGBoost | 0.87 | 0.89 | 0.88 | 0.88 | 0.80 | 0.93 | 0.83 |
| Macrosomia | Random Forest | 0.87 | 0.93 | 0.91 | 0.90 | 0.89 | 0.92 | 0.89 |
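
The F-score in the table above can be cross-checked against the reported PPV and sensitivity, assuming it is the usual F1 score, that is, the harmonic mean of precision (PPV) and recall (sensitivity). The source does not state which F-score variant was used, so this is an assumption, and the function name `f1_from_ppv_sensitivity` is introduced here for illustration only.

```python
def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """F1 = harmonic mean of precision (PPV) and recall (sensitivity)."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Values copied from the table above: (complication, PPV, sensitivity, reported F-score)
reported = [
    ("GDM", 0.80, 0.87, 0.83),
    ("Macrosomia", 0.89, 0.87, 0.89),
]

for complication, ppv, sens, f_reported in reported:
    f_calc = f1_from_ppv_sensitivity(ppv, sens)
    print(f"{complication}: recomputed F1 = {f_calc:.3f}, reported = {f_reported:.2f}")
```

Small discrepancies of about 0.01 between the recomputed and reported values are to be expected, since the inputs are themselves rounded to two decimals.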

| Authors, Year | Reference | Complication | Method | Features | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|
| Yang et al., 2025 | [14] | GDM | Logistic regression | Routine clinical parameters, proteome and metabolome biomarkers | 0.84 | 75% | 75% |
| Wang et al., 2021 | [17] | GDM | Logistic regression | Routine clinical parameters, metabolome biomarkers | 0.80 | ~75% | ~75% |
| Borges Manna et al., 2025 | [16] | GDM | Logistic regression | Metabolome biomarkers | 0.84 | 60% | 90% |
| Tranidou et al., 2024 | [21] | GDM | Logistic regression | Medical history and routine screening pregnancy markers | 0.68 | 20% | 90% |
| Koos et al., 2021 | [22] | GDM | Decision tree | Urinary metabolome | 0.99 | 97.8% | 95.7% |
| Monari et al., 2021 | [19] | Macrosomia | Logistic regression | Routine clinical parameters | 0.705 | 55.2% | 79.0% |
| Du et al., 2022 | [20] | Macrosomia | Logistic regression | Routine clinical parameters | 0.807 | 71.6% | 77.7% |
| Zhong et al., 2024 | [23] | Macrosomia | Random forest | Routine screening parameters, gut microbiota species | 0.91 | 85.71% | 81.82% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tokareva, A.; Frankevich, N.A.; Chagovets, V.; Derenko, A.; Lagutin, V.; Frankevich, V.; Sukhikh, G. A Machine Learning Model Based on First-Trimester Lipidomic Signatures for Predicting Metabolic Pregnancy Complications. Int. J. Mol. Sci. 2025, 26, 11824. https://doi.org/10.3390/ijms262411824
