Review

Smart Pregnancy: AI-Driven Approaches to Personalised Maternal and Foetal Health—A Scoping Review

by
Vera Correia
1,2,*,
Teresa Mascarenhas
2,3 and
Miguel Mascarenhas
2,4,5,6
1
Department of Obstetrics and Gynecology, Unidade Local Saúde Médio Ave, Largo Domingos Moreira, 4780-371 Famalicão, Portugal
2
Faculty of Medicine, University of Porto, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal
3
Department of Obstetrics and Gynecology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal
4
Department of Gastroenterology, São João University Hospital, Alameda Professor Hernâni Monteiro, 4200-427 Porto, Portugal
5
WGO Gastroenterology and Hepatology Training Center, 4200-427 Porto, Portugal
6
CINTESIS@RISE, Department of Community Medicine, Information and Health Decision Sciences (MEDCIDS), Faculty of Medicine, University of Porto, 4200-427 Porto, Portugal
*
Author to whom correspondence should be addressed.
J. Clin. Med. 2025, 14(19), 6974; https://doi.org/10.3390/jcm14196974
Submission received: 28 August 2025 / Revised: 23 September 2025 / Accepted: 23 September 2025 / Published: 1 October 2025
(This article belongs to the Special Issue AI in Maternal Fetal Medicine and Perinatal Management)

Abstract

Background/Objectives: The integration of artificial intelligence (AI) into obstetric care holds significant potential to enhance clinical decision-making and optimize maternal and neonatal outcomes. Traditional prediction methods in maternal-foetal medicine often rely on subjective clinical judgment and limited statistical models, which may not fully capture complex patient data. By integrating computational innovation with mechanistic biology and rigorous clinical validation, AI can finally fulfil the promise of precision obstetrics by transforming pregnancy complications into a preventable, personalised continuum of care. This study aims to map the current landscape of AI applications across the continuous spectrum of maternal–foetal health, identify the types of models used, and compare clinical targets and performance, potential pitfalls, and strategies to translate innovation into clinical impact. Methods: A literature search of peer-reviewed studies that employ AI for prediction, diagnosis, or decision support in Obstetrics was conducted. AI algorithms were categorised by application area: foetal monitoring, prediction of preterm birth, prediction of pregnancy complications, and/or labour and delivery. Results: AI-driven models consistently demonstrate superior performance to traditional approaches. Nevertheless, their widespread clinical adoption is hindered by limited dataset diversity, “black-box” algorithms, and inconsistent reporting standards. Conclusions: AI holds transformative potential to improve maternal and neonatal outcomes through earlier diagnosis, personalised risk assessment, and automated monitoring. To fulfil this promise, the field must prioritize the creation of large, diverse, open-access datasets, mandate transparent, explainable model architectures, and establish robust ethical and regulatory frameworks. By addressing these challenges, AI can become an integral, equitable, and trustworthy component of Obstetric care worldwide.

1. Introduction

Artificial intelligence has become a foundational pillar of modern medicine, harnessing computational paradigms that parallel human reasoning to discern patterns, inform decisions and generate reliable predictions. Initially, symbolic systems encoded domain expertise directly as rules and ontologies; early medical expert systems translated clinical guidelines into sequences of if-then statements that guided diagnosis and treatment planning [1]. As clinical data volumes expanded, a migration toward data-driven methodologies gave rise to machine learning (ML), which enables algorithms to learn predictive relationships from empirical observations [1]. Supervised approaches such as logistic regression (LR) estimate the probability of clinical outcomes from patient variables, while decision trees (DT) partition the feature space into clinically meaningful strata [1]. Ensemble techniques, particularly random forests (RF) and gradient boosting machines exemplified by XGBoost, enhance predictive performance by combining numerous weak learners into robust consensus models [2]. Advances in computational power and algorithmic design ushered in the era of deep learning (DL), a subclass of machine learning characterized by multilayer neural networks that extract hierarchical representations directly from raw inputs [2]. Convolutional neural networks (CNN) analyse spatial hierarchies within medical images, empowering automated segmentation of anatomical structures and detection of subtle anomalies in modalities ranging from ultrasound to magnetic resonance imaging [2,3]. Recurrent neural networks (RNN), including long short-term memory architectures, capture temporal dependencies within sequential data streams such as continuous foetal heart rate tracings or maternal physiological signals [3].
The introduction of transformer models, which employ self-attention mechanisms to contextualize every element of a sequence simultaneously, has revolutionized natural language processing and shows promise for modelling complex multimodal clinical data [4]. Complementary to predictive modelling, generative techniques including variational autoencoders and generative adversarial networks synthesize realistic data samples, thereby mitigating limitations imposed by scarce or imbalanced datasets [5]. As model complexity grows, interpretability emerges as a critical consideration. Tools such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) decompose model predictions into contributions from individual features, thereby enhancing transparency and facilitating clinician trust [5].
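To make the idea of feature-level attribution concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It is a minimal illustration of the principle behind SHAP, not the SHAP library itself; the toy risk function, its coefficients, and the feature names are hypothetical and do not come from any cited study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one instance x relative to a baseline.

    Features outside a coalition are held at their baseline value.
    Cost grows as 2**n, so this is only viable for a handful of features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy risk score with an interaction term (illustration only).
# Features: [cervical_length_mm, prior_preterm_birth (0 or 1)].
def toy_risk(features):
    cervical_length_mm, prior_ptb = features
    short_cervix = 1.0 if cervical_length_mm < 25 else 0.0
    return 0.05 + 0.20 * short_cervix + 0.10 * prior_ptb + 0.15 * short_cervix * prior_ptb


patient = [20.0, 1.0]
baseline = [35.0, 0.0]
attributions = shapley_values(toy_risk, patient, baseline)
print(attributions)
```

The attributions sum to the difference between the patient's predicted risk and the baseline prediction, which is the property that lets a clinician read each value as that feature's share of the alert.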
Maternal-foetal medicine exemplifies a domain in which artificial intelligence’s full spectrum can be deployed to transform surveillance and intervention. Conventional prenatal monitoring relies on intermittent clinic visits, manual ultrasound calliper measurements and retrospective chart reviews, approaches that may fail to identify developing complications until they pose imminent risk. Contemporary machine-learning classifiers trained on extensive electronic health-record cohorts have demonstrated the ability to predict complications such as preterm birth several weeks in advance, with ensemble methods such as random forests and gradient boosting outperforming traditional regression models [6]. Deep-learning pipelines now automate foetal biometry with precision on par with expert sonographers, extracting head circumference and abdominal measurements directly from ultrasound video sequences [7,8]. Wearable sensor platforms integrate real-time signal processing algorithms to continuously monitor maternal vital signs and foetal movements, enabling personalised risk stratification and the delivery of timely clinical alerts [9].
The main goals of our research are threefold: First, to offer a comprehensive review of the state-of-the-art applications of artificial intelligence within prenatal and perinatal medicine; second, to critically examine the barriers to its widespread adoption, including data heterogeneity across populations, challenges in algorithmic interpretability, regulatory landscapes and ethical imperatives; and third, to uncover novel perspectives and highlight unresolved issues that could stimulate further investigation in this burgeoning area aimed at translating computational innovations into equitable and personalised maternal-foetal healthcare.

2. Materials and Methods

A scoping review was conducted in accordance with the Joanna Briggs Institute Methodology for JBI Scoping Reviews and adhering to the PRISMA Extension for Scoping Reviews (PRISMA-ScR) checklist [10,11].

2.1. Search Strategy and Selection Criteria

A dual methodology was applied: a semi-structured review highlighting seminal, high-impact works, coupled with a systematic search to ensure a thorough synthesis of the current landscape. This integrative strategy supports a robust, multifaceted analysis that not only maps the state of the art in Obstetrics research but also illuminates avenues for future research.
A systematic search was conducted in May 2025 across scholarly databases including PubMed and Web of Science to identify relevant studies on AI applications in foetal medicine and Obstetrics. The search terms combined keywords for AI methodologies (“Artificial Intelligence”, “Machine Learning”, “Deep Learning”, “Neural Networks”, “CNN”, “DL”, “AI”, “ML”, “predictive model*”) with obstetric and fetal health descriptors (“Obstetric”, “Fetal medicine”, “Maternal-fetal medicine”, “Prenatal Diagnosis”, “Prenatal Screening”, “Fetal”, “Foetal”, “Maternal”, “Fetal Ultrasound”, “Obstetric Ultrasound”, “perinatal”, “Pregnancy”, “Noninvasive Prenatal Testing”, “Fetal Monitoring”, “Cardiotocography”, “CTG”, “Fetal Heart Rate”, “Fetal Hypoxia”, “Growth Restriction”, “Preterm”, “Preeclampsia”, “Gestational Diabetes”, “Postpartum Haemorrhage”, “Labor”, “Labour”, “Delivery”, “Birth”, “VBAC”, “Cesarean Section”, “Obstetric Surgery”).
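As an illustration of how two keyword groups of this kind are typically combined, the following sketch builds a Boolean query string from abbreviated subsets of the lists above. The exact search string, field tags, and filters used are not reported here, so the OR-within-group and AND-between-groups structure is an assumption, and the helper name `or_group` is hypothetical.

```python
def or_group(terms):
    """Join terms into one parenthesised OR group, quoting each phrase."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


# Abbreviated subsets of the two keyword lists (illustration only).
ai_terms = ["Artificial Intelligence", "Machine Learning", "Deep Learning", "Neural Networks"]
obstetric_terms = ["Obstetric", "Fetal", "Foetal", "Pregnancy", "Cardiotocography", "Preterm"]

# Assumed structure: any AI term AND any obstetric term.
query = or_group(ai_terms) + " AND " + or_group(obstetric_terms)
print(query)
```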

2.2. Screening and Eligibility

Titles and abstracts were reviewed and conflicts resolved by discussion. Articles were excluded if they did not describe AI/ML/DL model development or evaluation in foetal or obstetric contexts, or if they were reviews, commentaries or abstracts. Full texts were assessed for inclusion based on predefined criteria: (i) original peer-reviewed research; (ii) application of AI/ML/DL for prediction, diagnostic support, decision-making or workflow optimization in maternal-foetal medicine/obstetrics; (iii) detailed methodological description and performance metrics; (iv) clinically applicable in foetal monitoring, prediction of preterm birth, prediction of pregnancy complications and/or labour and delivery.

2.3. Data Extraction

For each included study, we extracted: (a) year of publication; (b) type of study; (c) AI approach (algorithm type, training/validation strategy); (d) dataset characteristics (size, source, population); (e) performance measures (accuracy, sensitivity, specificity, AUC); (f) best-performing algorithm, if applicable; (g) limitations and (h) clinical applications. Discrepancies were reconciled by consensus.
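For readers less familiar with the performance measures listed under (e), the sketch below shows how accuracy, sensitivity, specificity and AUC are computed from binary outcomes and model scores. The labels and scores are made-up illustrative values, and the 0.5 decision threshold is an assumption; individual studies choose their own operating points.

```python
def binary_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity and specificity at a fixed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative
    (ties count one half); threshold-free, unlike the metrics above."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 3 affected and 5 unaffected pregnancies.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

metrics = binary_metrics(y_true, scores)
area = auc(y_true, scores)
print(metrics, area)
```

Because sensitivity and specificity depend on the chosen threshold while AUC does not, the same model can report very different sensitivity/specificity pairs across studies, which complicates direct comparison.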

2.4. Synthesis and Reporting

Data were synthesized narratively, grouping studies by clinical domain (e.g., AI for ultrasound imaging analysis, AI in foetal monitoring and risk prediction, AI-based prediction of preterm birth, AI in pregnancy complications, AI in labour and delivery) and AI technique. Emerging trends, methodological strengths and limitations were identified, and research gaps were highlighted to inform future directions in AI-enabled obstetric care.

3. AI Applications in Maternal-Foetal Medicine

The aforementioned search strategy produced 9927 records. After duplicate exclusion, 9492 unique records were screened, of which 344 met the criteria for full-text review. Of these, 128 were included in the review. Details of the study selection process are presented in Figure 1. Study characteristics are summarized in Table 1, Table 2, Table 3, Table 4, Table 5 and Table 6.
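The record counts above imply the number of records dropped at each stage. The short calculation below derives them by subtraction; the derived figures are not stated in the text, and Figure 1 may subdivide them further (for example, by exclusion reason).

```python
# Counts reported in the text.
identified = 9927
after_deduplication = 9492
full_text_reviewed = 344
included = 128

# Derived by subtraction (not stated explicitly in the text).
duplicates_removed = identified - after_deduplication
excluded_at_screening = after_deduplication - full_text_reviewed
excluded_at_full_text = full_text_reviewed - included

print(duplicates_removed, excluded_at_screening, excluded_at_full_text)
```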

3.1. AI in Foetal Monitoring and Risk Prediction

Cardiotocography (CTG), introduced in the late 1960s, represented a pivotal advance in intrapartum foetal surveillance by enabling continuous recording of foetal heart rate (FHR) and uterine contractions (UC) to detect early signs of hypoxia [139,140]. However, more than fifty years on, definitive evidence that CTG reduces neonatal mortality or long-term neurological injury remains lacking, while its practical use is undermined by a high false-positive rate, poor specificity and substantial intra- and interobserver variability [139]. These shortcomings have contributed to rising caesarean delivery rates and missed opportunities for timely intervention, emphasising the need for more objective, reproducible analytic methods [139].
A transformative paradigm shift is now underway, moving from subjective waveform interpretation toward quantifiable, data-driven solutions that integrate advances in signal acquisition, engineering and AI. This transition necessitates international, multidisciplinary collaboration among clinicians, engineers, data scientists, device manufacturers and regulators, all aligned around well-defined, actionable perinatal outcomes.
The first computerized CTG systems of the 1980s automated extraction of classic features, including baseline rate, variability, accelerations and decelerations, to alert clinicians to abnormal tracings [140,141]. While these rule-based algorithms reduced some subjectivity, their reliance on small, retrospective datasets and expert-defined thresholds limited generalizability [140,141]. In the past decade, ML and DL methods have superseded these early efforts, incorporating advanced signal-processing techniques, such as phase-rectified signal averaging, wavelet transforms and time-frequency feature extraction, to uncover latent predictors of foetal compromise [142,143,144]. Hybrid models that combine CTG features with maternal and obstetric risk factors have achieved improved sensitivity and reduced false positives in retrospective cohorts, although prospective, external validation remains sparse [145,146].
Contemporary DL approaches have demonstrated significant gains in CTG interpretation. In one of the largest series to date, Park et al. trained an InceptionTime convolutional neural network on 124,777 intrapartum CTG recordings drawn from a nationwide, multicentre registry, achieving an area under the receiver-operating characteristic curve (AUC) of 0.89 for predicting severe neonatal acidemia (pH < 7.05) with 90% sensitivity at a positive-predictive-value threshold of 30%; external validation yielded an AUC of 0.72, highlighting the imperative for dataset-specific calibration [147]. Similarly, Ben M’Barek et al. developed DeepCTG® 2.0, a CNN framework trained on 27,662 CTG tracings across three tertiary centres, which demonstrated AUCs of 0.74–0.83 for moderate to severe acidemia, outperforming its logistic-regression–based predecessor by approximately 0.05 in AUC [13].
Signal-quality correction has emerged as a critical preprocessing step. Boudet et al. employed a gated recurrent unit (GRU) network to distinguish true foetal from maternal heart rate and artefactual noise, achieving 93% sensitivity and a 96% positive predictive value, thereby strengthening inputs for downstream predictive models [31]. Notably, Frasch et al. demonstrated that DL applied to scanned analog CTG tracings could classify potentially preventable foetal distress with 94% accuracy, illustrating the utility of leveraging legacy datasets for large-scale model training [32]. More recently, Daydulo et al. combined Morse-wavelet time–frequency feature extraction with DL architectures to detect foetal distress, further validating the synergistic benefits of advanced signal processing and neural networks [29].
Alongside the emerging role of generative AI, the interpretative capacity of large language models (LLMs) has also been explored in this field. In a proof-of-concept comparison against junior and senior clinicians interpreting CTG screenshots, Gumilar et al. found that GPT-4o achieved a mean interpretive score of 77.9 (0–100 scale), closely approximating senior obstetricians (80.4) and outperforming less experienced practitioners, suggesting potential for generative AI–assisted decision support [14].
Despite these advances, clinical translation is impeded by several factors: (i) a predominantly retrospective evidence base, necessitating prospective, multicentre trials to establish safety and efficacy; (ii) heterogeneity in CTG acquisition protocols and annotation standards, underscoring the need for harmonized data standards; (iii) limited explainability of “black-box” models, which must be addressed through integrated visualization and interpretability tools; and (iv) nascent regulatory frameworks for utilization of AI/ML technology in Obstetrics.
Looking ahead, integration of AI-driven CTG analysis into electronic health records and central monitoring systems promises real-time decision support, automated risk stratification and standardised reporting. The formation of large, federated datasets governed by international standards will be essential to develop robust, generalizable models. Ultimately, by combining methodological rigor with collaborative, multidisciplinary implementation efforts, AI-enhanced CTG has the potential to transform intrapartum care, reducing perinatal morbidity and avoiding unnecessary interventions.
A summary of studies regarding AI applications in Foetal Monitoring is presented in Table 1.

3.2. AI-Based Prediction of Preterm Birth

Preterm birth (PTB), defined as delivery prior to 37 completed weeks of gestation, represents a critical global health challenge, accounting for approximately 10% of live births and remaining the foremost cause of neonatal morbidity and mortality worldwide [47]. Infants born preterm are predisposed to respiratory distress syndrome, intraventricular haemorrhage, necrotizing enterocolitis and long-term neurodevelopmental impairments, including cerebral palsy and cognitive delay, as well as chronic cardiovascular and metabolic sequelae extending into adulthood [148]. The aetiology of PTB is complex and multifactorial, encompassing spontaneous preterm labour precipitated by uterine overactivity or cervical insufficiency, infection-mediated inflammatory cascades, aberrant activation of the maternal–foetal hypothalamic–pituitary–adrenal axis, decidual haemorrhage and placental dysfunction, as well as medically indicated (iatrogenic) delivery for maternal or foetal compromise [148]. Established prophylactic strategies in women with prior PTB or an asymptomatic short cervix on transvaginal ultrasound, such as vaginal progesterone administration, cervical cerclage, pessary or a combination of these, have demonstrably reduced individual risk but remain suboptimal at the population level, in part because current clinical screening instruments identify only a subset of those destined to deliver prematurely [148]. Early and accurate stratification of PTB risk is therefore essential to optimize antenatal surveillance, deliver targeted therapies and allocate perinatal resources effectively.
Recent advances in AI have enabled the synthesis of high-dimensional data streams to improve PTB prediction beyond conventional risk factors. Kłoska and colleagues applied long short-term memory (LSTM), CNN and RF algorithms to combined uterine electromyography (EMG), tocodynamometry (TOCO) and electronic health record data from 1200 singleton pregnancies [47]. By extracting EMG features such as burst frequency, interburst-interval variability and power spectral density in the 0.34–1.0 Hz band, and TOCO metrics, specifically contraction frequency and amplitude, and further integrating these with maternal age, body mass index, parity and prior PTB history, the LSTM model most effectively captured the temporal progression of uterine “activation,” achieving an area under the receiver-operating characteristic curve (AUC) of 0.87, a sensitivity of 0.83 and a specificity of 0.85, significantly outperforming both CNN (AUC 0.83) and RF (AUC 0.79) [47]. In parallel, Ohtaka et al. trained a CNN on transvaginal cervical ultrasound video sequences (n = 59), quantifying dynamic changes in cervical funnelling and tissue echotexture to predict imminent PTB with an AUC of 0.92, a sensitivity of 0.88 and a specificity of 0.90, thereby demonstrating the feasibility of real-time, image-based risk stratification within routine obstetric practice [48].
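To illustrate what a band-limited spectral feature such as power in the 0.34–1.0 Hz band involves, the sketch below estimates the fraction of signal power falling inside that band using a plain discrete Fourier transform. This is a simplified periodogram on a synthetic signal; the sampling rate, signal, and function name are hypothetical, and the preprocessing and spectral estimator actually used by Kłoska and colleagues are not described here.

```python
import cmath
import math


def band_power_fraction(signal, fs, f_lo, f_hi):
    """Fraction of (mean-removed) signal power between f_lo and f_hi, in Hz.

    Uses a naive O(n^2) DFT over positive-frequency bins below Nyquist.
    """
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    total = 0.0
    in_band = 0.0
    for k in range(1, (n + 1) // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(centred))
        power = abs(coeff) ** 2
        freq = k * fs / n  # centre frequency of bin k
        total += power
        if f_lo <= freq <= f_hi:
            in_band += power
    return in_band / total


# Synthetic example: a 0.5 Hz component (inside the band) plus a weaker
# 1.5 Hz component (outside it), sampled at an assumed 4 Hz for 50 s.
fs = 4.0
t = [i / fs for i in range(200)]
signal = [math.sin(2 * math.pi * 0.5 * ti) + 0.5 * math.sin(2 * math.pi * 1.5 * ti) for ti in t]

fraction = band_power_fraction(signal, fs, 0.34, 1.0)
print(round(fraction, 3))
```

In practice a windowed or averaged estimator (for example Welch's method) would usually replace the raw periodogram, since real EMG recordings are noisy and non-stationary.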
Nonetheless, explainability and clinical interpretability are paramount for AI adoption. Clinicians must understand why an algorithm issues a high-risk alert before acting on it, verifying that predictions reflect known physiological mechanisms rather than spurious correlations. It is therefore worth noting work such as that of Kokkidinis and colleagues, which combined extreme gradient boosting with SHapley Additive exPlanations (SHAP) on cervical length, foetal fibronectin, interleukin-6 and obstetric history (n = 500) to achieve an AUC of 0.89, sensitivity of 0.82 and specificity of 0.85, providing feature-level attributions that facilitate individualized patient counselling and shared decision-making [52]. Similarly, Andrade-Júnior et al. developed a stacked Bayesian extreme learning machine ensemble combining XGBoost, LR and neural class-labeling networks for iatrogenic PTB (n = 800), achieving an AUC of 0.91, sensitivity of 0.83 and specificity of 0.88, while preserving interpretability, illustrating how hybrid architectures can enhance robustness in heterogeneous clinical scenarios [51]. Together, these studies illustrate how explainable AI frameworks can bridge the gap between high-performance prediction and clinician trust, thereby accelerating adoption and ultimately improving perinatal outcomes.
By enabling identification of high-risk women weeks or even months before labour onset, these AI-driven models could materially reduce PTB incidence through timely administration of prophylactic interventions (e.g., progesterone, cerclage, pessary), optimised scheduling of antenatal corticosteroids and transfer to tertiary care centres. Moreover, precise risk stratification may prevent unnecessary interventions in low-risk women, thereby reducing healthcare costs and avoiding iatrogenic complications. However, current studies are constrained by retrospective, single-centre designs, limited sample sizes and the absence of external, prospective validation. Standardisation of signal acquisition protocols, harmonization of imaging techniques and rigorous assessment of cost-effectiveness and workflow integration will be critical in the years to come. Future research should prioritize large-scale, multi-institutional prospective trials, exploration of multimodal fusion including biochemical biomarkers and genomics, and development of clinician-centric interfaces to translate these high-performance algorithms into routine obstetric care and ultimately diminish the global burden of preterm birth.
A summary of studies regarding AI applications in prediction of preterm birth is presented in Table 2.

3.3. AI in Prediction of Pregnancy Complications

3.3.1. AI for Early Prediction of Preeclampsia

Pre-eclampsia is among the oldest recognised complications of pregnancy, historically dubbed “the disease of theories” for the multitude of its proposed aetiologies [149]. Clinically defined by new-onset hypertension after 20 weeks’ gestation accompanied by end-organ involvement, it encompasses a spectrum from mild gestational hypertension with proteinuria to life-threatening syndromes, such as eclampsia, HELLP syndrome and maternal multiorgan dysfunction [150]. Maternal manifestations may emerge at any point antenatally or persist postpartum, and even ostensibly mild disease confers an elevated lifetime risk of cardiovascular and metabolic disorders for both mother and offspring [149,150,151].
Despite over a century of research implicating defective placentation as the central driver of early-onset pre-eclampsia, the precise interplay between placental pathology and maternal cardiovascular adaptation remains incompletely defined [149]. Normal pregnancy demands profound cardiovascular remodelling, namely plasma volume expansion, decreased systemic vascular resistance, and enhanced cardiac output; yet, in pre-eclampsia this adaptive capacity falters, precipitating hypertension, end-organ ischemia, and adverse perinatal outcomes [149]. These intertwined placental and maternal maladaptations continue to confound obstetrical care: screening algorithms rooted in clinical risk factors and standardised biomarkers achieve only modest predictive accuracy, and interventions such as low-dose aspirin yield benefit in only a subset of high-risk patients [150,151]. Pre-eclampsia risk stratification still relies primarily on clinical risk scores and simple multivariable regression models. The most widely implemented frameworks draw on maternal history (age, parity, pre-eclampsia in previous gestation), blood pressure, and basic biochemical markers, often transformed into multiples of the median (MoM), to generate population-level risk estimates. The Fetal Medicine Foundation (FMF) competing-risks model, for example, integrates maternal factors, mean arterial pressure (MAP), uterine artery pulsatility index (UtA-PI), placental growth factor (PlGF) and pregnancy-associated plasma protein-A (PAPP-A) in a Gaussian-based algorithm [152]. In large validation cohorts, this approach achieved AUCs of approximately 0.78 for any pre-eclampsia and 0.88 for preterm pre-eclampsia at a 10% screen-positive rate [152]. Similarly, logistic-regression models that incorporate first-trimester uterine Dopplers and MoM-standardised biochemistry have reported detection rates of 55–60% for all pre-eclampsia and 75–80% for preterm cases at a 10% false-positive rate [152].
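Two quantities in this paragraph lend themselves to a short worked example: the multiple of the median (MoM), and the detection rate at a fixed false-positive rate. The sketch below computes both under simplifying assumptions. All numbers are made up for illustration, the function names are hypothetical, and in real screening programmes the expected median is itself adjusted for gestational age and maternal characteristics.

```python
def multiple_of_median(measured, expected_median):
    """Express a biomarker value relative to the median expected for a
    comparable unaffected pregnancy."""
    return measured / expected_median


def detection_rate_at_fpr(affected_scores, unaffected_scores, fpr=0.10):
    """Share of affected pregnancies flagged when the cut-off is set so that
    at most `fpr` of unaffected pregnancies screen positive (requires fpr < 1)."""
    ranked = sorted(unaffected_scores, reverse=True)
    allowed_false_positives = int(fpr * len(ranked))
    # Cut-off: score of the first unaffected case that must remain screen-negative.
    cutoff = ranked[allowed_false_positives]
    flagged = sum(1 for s in affected_scores if s > cutoff)
    return flagged / len(affected_scores)


# Made-up values: a measured PlGF of 30 against an expected median of 40.
plgf_mom = multiple_of_median(30.0, 40.0)

# Made-up risk scores for 10 unaffected and 4 affected pregnancies.
unaffected = [0.02, 0.03, 0.05, 0.08, 0.10, 0.12, 0.15, 0.20, 0.30, 0.60]
affected = [0.25, 0.35, 0.50, 0.70]
detection_rate = detection_rate_at_fpr(affected, unaffected, fpr=0.10)

print(plgf_mom, detection_rate)
```

Fixing the false-positive rate before reading off the detection rate is what makes figures such as "75–80% at a 10% false-positive rate" comparable across models, whereas a bare sensitivity value is not.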
While these models represent a step forward from univariate risk assessment, they remain limited by their reliance on linear combinations of preselected features, the need for MoM standardization, and suboptimal performance in heterogeneous populations. Thus, the imperative persists for novel strategies that can integrate multidimensional data, specifically clinical, hemodynamic, imaging and molecular, and reveal latent patterns predictive of disease onset and severity. In this context, AI offers a transformative paradigm, capable of transcending linear models to synthesize high-dimensional inputs and generate individualized risk trajectories that may guide truly precision-driven prevention, monitoring, and therapeutic intervention.
Recent AI-driven studies have pursued complementary strategies, such as deep integration of raw clinical and biochemical inputs, as well as large-scale EHR mining, collectively pushing beyond the plateau of aforementioned traditional methods.
Ansbacher-Feldman et al. bypassed MoM transformations by training a two-layer feed-forward neural network on 60,789 singleton pregnancies using raw maternal characteristics (age, BMI, parity, pre-eclampsia in prior gestation, race, type of conception, interpregnancy interval) and unstandardised biomarker concentrations (MAP, UtA-PI, PlGF, PAPP-A) [75]. At a 10% screen-positive rate, their “posterior” model augmented with biomarkers increased detection of all pre-eclampsia cases from 41% to 53%, and of preterm pre-eclampsia from 53% to 75%, with AUC improvements from 0.77 to 0.82 (any PE) and from 0.82 to 0.91 (preterm PE) [81]. This direct ingestion of raw values exemplifies AI’s capacity to learn complex, nonlinear interactions that would be obscured by manual standardization.
In a population of 48,250 pregnancies drawn from a Southeast Melbourne health network, Tiruneh et al. demonstrated that “low-fidelity” EHR variables can yield robust predictions when ensembled in a random forest. Their model attained an AUC of 0.84 for pre-eclampsia versus controls, with similar discrimination for early and late-onset subtypes. Crucially, this approach leverages routinely captured, large-scale data without bespoke biomarker assays or specialized imaging, suggesting a path toward scalable risk stratification in diverse healthcare systems [75].
Zheng et al. combined semi-supervised U-Net segmentation of sagittal T2-weighted placental MRI with large-scale radiomic feature extraction—over 3000 wavelet, texture, and shape metrics—and fused these with logistic regression into a “deep learning radiomics” (DLR) signature. In a multicentre cohort of 420 pregnancies, the DLR model achieved AUCs of 0.84–0.89 for distinguishing pre-eclampsia from normotensive controls and 0.92 for combined PE plus foetal growth restriction. By quantifying subtle microstructural alterations in the placenta, radiomics reveals mechanistic links between villous architecture and disease risk [65].
Wang et al. interrogated GEO microarray datasets and a small validation cohort to identify 11 immune-related differentially expressed genes (DEGs), whose expression profiles were distilled by LASSO and random forest into a diagnostic panel. The resulting model achieved AUCs of 0.79 in test sets and 0.87 in external validation for early pre-eclampsia prediction. This “molecular radiology” of the maternal-foetal interface integrates immune–metabolic networks, offering both predictive and therapeutic insight [66].
Araújo et al. applied a LightGBM boosting model to routine complete blood counts in a Brazilian cohort, achieving an AUROC of 0.90 for severe pre-eclampsia detection, highlighting the diagnostic potential of inexpensive, widely available laboratory indices [77].
Zhou and colleagues used an Inception-ResNet-v2 convolutional network on retinal fundus images obtained before 20 weeks, achieving an AUC of 0.85 for pre-eclampsia prediction, and up to 0.88 when combined with clinical risk factors, underscoring the retinal microvasculature as a surrogate for systemic endothelial health [71].
AI models promise to redefine antenatal care by delivering individualized risk trajectories weeks to months before symptom onset, informing aspirin prophylaxis, intensified surveillance, and resource allocation. Where traditional FMF and logistic-regression models plateau around AUCs of 0.75–0.85, AI approaches routinely exceed these thresholds, often reaching 0.90–0.95 in well-curated cohorts. Yet, their translation remains constrained by retrospective study designs, “black-box” opacity, and the need for prospective, randomised evaluation of clinical impact. Data heterogeneity mandates standardised data curation and federated learning solutions. Finally, robust, explainable AI frameworks are essential to align predictive features with known pathophysiology and secure regulatory approval.
A summary of studies regarding AI applications in pre-eclampsia is presented in Table 3.

3.3.2. AI-Driven Models for Gestational Diabetes Risk Stratification

Gestational diabetes mellitus (GDM) affects up to 15% of pregnancies worldwide and is a major driver of both short- and long-term morbidity for mother and offspring [153]. Current screening paradigms are predicated on a 75 g oral glucose tolerance test (OGTT) performed at 24–28 weeks, identifying hyperglycaemia late in gestation, by which point maladaptive placental and foetal metabolic programming have often already been set in motion [153]. Such a reactive approach misses the opportunity to deploy targeted lifestyle or pharmacological interventions during the first and early second trimesters, when maternal insulin sensitivity and β-cell function evolve dynamically [154]. Major clinical guidelines share key shortcomings [155,156,157]. The Royal College of Obstetricians and Gynaecologists advocates early risk-factor assessment followed by selective OGTT at 24–28 weeks for women with persistent risk markers; the American College of Obstetricians and Gynecologists (ACOG) recommends universal OGTT at the same gestational window, with earlier testing for high-risk individuals; and Diabetes Canada similarly reserves OGTT for mid-pregnancy while endorsing early screening for those with obesity or prior GDM [155,156,157]. All three frameworks rely on binary glycaemic thresholds that fail to capture the continuum of risk and require fasting, multiple phlebotomies, and laboratory infrastructure that may be inaccessible in under-resourced settings [155,156,157,158].
ML and DL approaches can exploit routinely collected first-trimester data, such as electronic health-record variables, biochemical panels, anthropometric measures, ultrasound radiomics and even patient-reported lifestyle metrics, to generate continuous, individualized GDM risk scores long before conventional OGTT. By stratifying risk in early pregnancy, AI-driven tools have the potential to triage women to intensified surveillance or preventive therapy, conserve resources by deferring OGTT for low-risk individuals, and tailor interventions (nutritional counselling, exercise programmes or metformin initiation) to those most likely to benefit from them.
Recent studies illustrate the promise of this paradigm. Broadly, these efforts can be grouped into three methodological categories: (i) structured-data models built on electronic health records (EHRs) and laboratory values; (ii) image-based radiomics and deep-learning pipelines; and (iii) hybrid or dual-task frameworks that combine clinical, biochemical and imaging inputs.
Several large retrospective cohorts harnessed routinely collected demographic and biochemical data to train ensemble and neural-network learners. Hu X et al. leveraged 20 first-trimester EHR variables, including prior GDM history, HbA1c, mean arterial pressure and lipid panels, to achieve an AUC of 0.946 using XGBoost, dramatically outperforming logistic regression (AUC 0.946 vs. 0.752) [96]. Similarly, Zhao et al. applied NearMiss resampling to address class imbalance in a 103,172-pregnancy database, with a multilayer perceptron reaching AUC 0.943 versus 0.777 for multivariate logistic regression [90]. These studies illustrate that complex, nonlinear models can extract subtle interactions among clinical predictors that linear models miss, and that careful data-preprocessing (e.g., resampling) is critical when GDM prevalence is low.
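To make the resampling step concrete, the following is a minimal sketch of the NearMiss-1 undersampling rule, which retains the majority-class records whose mean distance to their k nearest minority-class records is smallest. It is an illustration only and not the implementation used by Zhao et al.; the function name `near_miss_1`, the two toy features and all values are assumptions introduced here.

```python
import math

def near_miss_1(majority, minority, n_keep, k=3):
    """Keep the n_keep majority rows whose mean distance to their
    k nearest minority rows is smallest (NearMiss-1 rule)."""
    def mean_knn_distance(row):
        dists = sorted(math.dist(row, m) for m in minority)
        nearest = dists[:k]
        return sum(nearest) / len(nearest)

    ranked = sorted(majority, key=mean_knn_distance)
    return ranked[:n_keep]

# Hypothetical, already-scaled features (e.g. BMI and HbA1c); not study data.
non_gdm = [(0.0, 0.0), (0.1, 0.2), (0.9, 0.8), (1.0, 1.05), (0.2, 0.1), (0.8, 1.0)]
gdm = [(1.0, 1.0), (0.9, 0.9)]

# Undersample the majority class down to the size of the minority class.
balanced_majority = near_miss_1(non_gdm, gdm, n_keep=len(gdm), k=2)
print(balanced_majority)
```

Because the rule keeps the non-GDM records that lie closest to GDM cases, the classifier is trained on the hardest part of the decision boundary. Discrimination should nevertheless still be evaluated on data with the original, unbalanced prevalence.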
By contrast, Bigdeli et al. demonstrated the challenges of dual-task modelling: in a single-centre Iranian cohort, a random-forest algorithm predicted OGTT positivity with high fidelity (AUC 0.94) but struggled to forecast subsequent insulin requirement (AUC 0.64), highlighting how model performance can vary widely depending on endpoint prevalence and data completeness [89]. Kadambi et al. and Liao et al. further underscore the trade-offs of including specialized behavioural and self-monitored glucose metrics: while super-learner ensembles attained C-statistics up to 0.934 in discovery, simplified logistic models still achieved respectable AUCs (~0.80) with far greater interpretability [98,104].
Zhou et al. and similar radiomics studies pioneered the use of first-trimester ultrasound texture features to predict GDM [92]. By extracting >1300 quantitative placental features and combining them with deep-learning convolutional neural network (CNN) scores, the authors produced a nomogram with AUCs of 0.93 (discovery) and 0.88 (validation) [92]. This paradigm offers an elegant “no-blood-draw” risk stratification, seamlessly integrating into routine nuchal translucency scans. Nevertheless, manual region-of-interest delineation and ultrasound protocol variability present barriers to scalability, and external validation in larger, multi-ethnic cohorts remains pending [92].
Preconception and mobile-health tools, such as those developed by Kumar et al. in the S-PRESTO and GUSTO cohorts, illustrate how web-based interfaces can empower women to input lifestyle, anthropometric and basic laboratory data [101,102]. CatBoost and stacked-ensemble pipelines achieved AUCs of 0.82–0.83, demonstrating that patient-reported inputs combined with minimal biochemistry can approximate the performance of EHR-driven models [101,102]. Such platforms hold promise for low-resource settings, though their reliance on self-report introduces potential biases.
These models can be seamlessly integrated into obstetric electronic medical-record systems to flag high-risk women at booking visits, prompting clinicians to initiate dietary and exercise interventions. Ultrasound-augmented platforms can deliver automated GDM risk estimates during routine first trimester scans without additional patient burden. Web and smartphone-based preconception tools enable women to check personalised GDM risk based on self-reported data and first-trimester labs, empowering proactive lifestyle changes even before conception.
Despite impressive retrospective metrics, the majority of published models are single-centre and lack external validation across diverse ethnic and socioeconomic populations. Small sample sizes, extensive data exclusions, and reliance on high-dimensional biomarker panels limit generalizability to low-resource settings. Heterogeneous variable definitions and missing data in electronic health records threaten model robustness; concomitantly, “black-box” architectures without transparent feature attribution risk undermining clinician trust.
Artificial intelligence offers a transformative avenue to shift gestational diabetes care from a reactive, mid-pregnancy diagnostic model to a proactive, first-trimester precision-medicine approach. By harnessing diverse clinical, biochemical, imaging, and behavioural data, ML and DL models can identify at-risk women early, guide personalised interventions, and optimise resource allocation, ultimately improving maternal and neonatal health outcomes on a global scale. To better achieve the promise of AI in GDM care, large-scale, prospective, multi-centre trials are imperative to validate and calibrate risk models across populations and healthcare systems. Embedding explainability frameworks such as SHAP or LIME can illuminate key predictors and facilitate clinician acceptance. Health-economic analyses should compare the cost-effectiveness of AI-guided early interventions against standard OGTT-driven protocols. Finally, alignment of continuous AI-derived risk scores with established guideline thresholds will be essential to integrate these tools into existing care pathways.
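As an illustration of what attribution frameworks such as SHAP estimate, the sketch below computes exact Shapley values by brute force for a three-feature toy risk function. It does not use the SHAP library, and the risk function, its coefficients, the patient and the reference values are hypothetical assumptions rather than any published GDM model.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's value; the rest the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical risk score with an interaction term (assumption, for illustration).
def toy_risk(features):
    bmi, hba1c, prior_gdm = features
    return 0.01 * bmi + 0.05 * hba1c + 0.2 * prior_gdm + 0.02 * hba1c * prior_gdm

patient = [32.0, 5.8, 1.0]     # BMI, HbA1c (%), prior GDM (yes = 1)
reference = [24.0, 5.0, 0.0]   # hypothetical reference profile
contributions = shapley_values(toy_risk, patient, reference)
print(contributions)
```

The contributions sum to the difference between the patient's score and the reference score, which is the property that lets a clinician read each value as that feature's share of the excess risk. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations.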
A summary of studies regarding AI applications in GDM is presented in Table 4.

3.3.3. AI in Predicting Postpartum Haemorrhage

Postpartum haemorrhage (PPH), conventionally defined as blood loss ≥ 500 mL within 24 h of delivery, remains the leading cause of maternal mortality worldwide, accounting for over 20% of maternal deaths and disproportionately affecting low and middle-income countries, where more than 90% of these deaths occur [159]. In high-resource settings, PPH still contributes to 8–19% of maternal deaths and drives substantial transfusion requirements and morbidity [160]. Early identification of women at elevated risk enables implementation of evidence-based prophylaxis, such as active management of the third stage of labour, timely administration of uterotonics, tranexamic acid, and pre-positioning of blood products [161].
Traditional risk-assessment tools rely on summative clinical scores that assign points for factors including previous PPH, multiple gestation, pre-eclampsia and prolonged labour [162]. Although readily deployable at the bedside, these instruments demonstrate only modest discrimination and often lack sensitivity for high-risk individuals; for example, a widely cited clinical score achieved an AUROC of 0.68 (95% CI 0.63–0.72) in validation cohorts [162]. Nor do they account for complex, nonlinear interactions among maternal, labour, and facility variables [162]. Moreover, estimates of blood loss based on visual inspection remain notoriously inaccurate, further limiting timely recognition and management.
Recent advances in artificial intelligence, notably ML and DL algorithms, offer the capacity to integrate hundreds of peri-partum variables and to model intricate relationships that defy traditional regression. Ahmadzia and colleagues applied gradient-boosting, random forest, support-vector machine and multilayer perceptron models to the Consortium on Safe Labor dataset (n = 228,438), achieving an AUROC of 0.833 and a precision–recall AUC of 0.210 for a composite transfusion-PPH endpoint; the gradient-boosting model’s top predictors included mode of delivery, incremental oxytocin dose, tocolytic use, presence of an anaesthesia nurse and hospital care level [108].
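The gap between an AUROC of 0.833 and a precision–recall AUC of 0.210 is easier to interpret when both metrics are computed side by side on a rare outcome. The sketch below does so on hypothetical toy data with 10% prevalence, using average precision as one common estimator of the precision–recall AUC; none of the values come from the Consortium on Safe Labor dataset.

```python
def auroc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision observed at the rank of each true positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Hypothetical cohort: 20 deliveries, 2 events ranked 3rd and 6th by the model.
scores = [(20 - i) / 20 for i in range(20)]
labels = [1 if i in (2, 5) else 0 for i in range(20)]

roc = auroc(labels, scores)
ap = average_precision(labels, scores)
prevalence = sum(labels) / len(labels)
print(round(roc, 3), round(ap, 3), prevalence)
```

In this toy cohort the AUROC is about 0.83 while average precision is about 0.33. A model with no discrimination would be expected to score roughly 0.5 on AUROC but only about the prevalence (here 0.10) on average precision, so a precision–recall AUC should be read against the event rate rather than against 0.5.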
Wang et al. evaluated five algorithmic approaches in 6144 caesarean deliveries, finding that a random forest model minimized prediction error of actual blood loss using a combination of 27 antepartum and intrapartum laboratory and clinical features, achieving a mean absolute error of 21.7 mL (<5.4% error) and a root-mean-squared error of 33.75 mL [109]. In a low-resource setting, Holcroft et al. developed a random-forest model in a Rwandan case–control cohort (n = 430) that predicted PPH on admission with 80.7% sensitivity and 71.3% specificity, relying solely on nine readily available variables including haemoglobin, maternal age, insurance status and obstetric history [110].
The power of ensemble methods is further exemplified by Westcott et al., who trained gradient-boosted decision trees on 497 EHR features in 30,867 US deliveries to reach an AUROC of 0.979 and an overall accuracy of 98.1%, albeit without external validation [111]. Earlier work by Akazawa et al. (n = 9894) and Venkatesh et al. (n ≈ 152,000) demonstrated reproducible performance of XGBoost and random-forest models with C-statistics around 0.93 in large multicentre cohorts, while highlighting challenges of missing data and dated predictor sets [113,114].
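For readers less familiar with regression error metrics, the following sketch shows how mean absolute error and root-mean-squared error are computed for blood-loss estimates. The measured and estimated volumes are hypothetical values chosen for illustration, not data from Wang et al.

```python
import math

def mae(actual, predicted):
    """Mean absolute error: average size of the miss, in the original units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root-mean-squared error: penalises large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))

# Hypothetical blood-loss volumes in mL (assumption, for illustration only).
measured_ml = [400, 550, 700, 900]
estimated_ml = [420, 530, 700, 960]

mae_ml = mae(measured_ml, estimated_ml)
rmse_ml = rmse(measured_ml, estimated_ml)
print(mae_ml, round(rmse_ml, 2))
```

RMSE is never smaller than MAE, and a wide gap between the two indicates that a few large errors dominate. This is clinically relevant because the largest errors are most likely to occur in the heaviest bleeds.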
These AI-driven models can be seamlessly embedded within electronic medical record systems to provide real-time risk scores at labour admission or preoperative evaluation, triggering tailored prophylaxis and facilitating resource allocation. In low-resource contexts, simplified calculators derived from more complex models can guide interventions when blood-bank capacity is limited. Furthermore, AI-based estimators of actual blood loss may improve quantification, enabling earlier transfusion and reducing maternal morbidity.
Despite impressive retrospective performance, most published models suffer from single-centre derivation, small or selective sample sizes, heterogeneous variable definitions, and limited external validation, raising concerns about overfitting and generalizability. The “black-box” nature of many ML/DL algorithms further hampers clinical trust in the absence of transparent explainability frameworks.
Future research should prioritize prospective, multicentre validation studies across diverse healthcare settings; incorporation of explainable-AI techniques (e.g., SHAP, LIME) to elucidate key predictors; alignment of model thresholds with clinical prophylaxis protocols; and health-economic assessments to determine cost-effectiveness. By addressing these challenges, AI has the potential to transform PPH management from reactive treatment to proactive prevention, thereby reducing the global burden of maternal haemorrhage.
A summary of studies regarding AI applications in PPH is presented in Table 5.

3.4. AI Applications in Labour and Delivery

Accurate prediction of delivery mode is essential for personalised intrapartum management, optimal resource allocation and reduction in unnecessary surgical interventions. Traditional risk stratification relies on a limited set of static clinical variables, which have demonstrated only modest discriminatory power.
Recent advances in AI have enabled the integration of dynamic intrapartum signals, comprehensive electronic health record data and ultrasound biometry to achieve substantially higher predictive accuracy. Ricciardi et al. demonstrated that a Random Forest classifier trained on 17 CTG-derived features (including foetal heart rate variability, spectral power and Poincaré metrics) could predict the need for caesarean delivery with 91.1% accuracy and an AUC of 0.967 [135]. In practice, this model could be integrated into monitoring systems to alert clinicians when CTG patterns portend operative delivery, prompting earlier obstetric review and preparation of surgical teams [135]. Similarly, Fergus et al. showed that an ensemble of Fisher’s linear discriminant analysis, RF and SVM achieved an AUC of 0.96 on 552 intrapartum CTG tracings [137]. Clinically, such an ensemble could serve as a second-opinion tool in labour wards to reduce inter-observer variability in CTG interpretation, support decisions about assisted vaginal delivery versus caesarean section and potentially avert adverse outcomes from delayed interventions.
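Among the CTG-derived features mentioned above, Poincaré metrics are straightforward to compute. The sketch below derives the two standard descriptors, SD1 (short-term variability) and SD2 (long-term variability), from a series of consecutive heart-rate samples. The function name, the use of population standard deviation and the toy series are assumptions; the exact feature definitions used by Ricciardi et al. are not given in the text.

```python
import math
import statistics

def poincare_sd(series):
    """Return (SD1, SD2) of the Poincaré plot of x[n+1] against x[n]."""
    current, following = series[:-1], series[1:]
    # Rotate each point by 45 degrees: one axis across the identity line,
    # one axis along it.
    across = [(b - a) / math.sqrt(2) for a, b in zip(current, following)]
    along = [(a + b) / math.sqrt(2) for a, b in zip(current, following)]
    return statistics.pstdev(across), statistics.pstdev(along)

# Hypothetical foetal heart-rate samples in beats per minute.
alternating = [140, 150, 140, 150, 140]   # pure beat-to-beat variability
steady_rise = [140, 142, 144, 146, 148]   # pure slow trend

print(poincare_sd(alternating))
print(poincare_sd(steady_rise))
```

The alternating series yields a large SD1 and zero SD2, whereas the steadily rising series yields zero SD1 and a non-zero SD2. This is why the pair is often used to separate short-term from long-term variability.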
Islam et al. applied a Henry Gas Solubility Optimization–enhanced RF to >20,000 births from demographic and health surveys, attaining nearly 98.3% accuracy in caesarean section prediction [129]. Although derived from survey data, this model could inform public health planning by identifying geographic or demographic groups at elevated risk of caesarean delivery, thereby guiding targeted training, resource deployment and policy interventions in low-resource settings [129].
Meyer et al. developed an XGBoost model on 73,667 deliveries at a tertiary centre, incorporating cervical dilation, ultrasound-adjusted foetal biometry and labour progression metrics. Deployed as a web-based calculator, it enables clinicians to provide individualized caesarean counselling on admission, sharing quantified risk estimates with patients to support shared decision-making and consent processes [126].
Zhang et al. trained a Random Forest on 2552 caesarean cases using detailed electronic medical record variables such as maternal demographics and ultrasound parameters, achieving an AUC of 0.979. This high-precision model could be embedded in electronic health systems to flag patients most likely to require caesarean delivery, streamlining anaesthetic planning, operating-room scheduling and blood-bank readiness [124].
Similarly, Lodi et al. applied a Probability Forest to predict caesarean delivery in 410 class III obese nulliparas (AUC 0.70). Although performance was lower, this targeted tool offers the first risk calculator for a notoriously challenging subgroup, allowing obstetricians to provide tailored counselling on delivery mode, anticipate operative difficulties and mobilize additional support for high-risk patients [123].
Accurate prediction of labour induction success is critical to minimize unnecessary interventions, avoid prolonged hospital stays and reduce the risk of emergency caesarean delivery. Tingting Hu and colleagues retrospectively developed and validated a suite of machine-learning models, including logistic regression, naïve Bayes, support vector machine and AdaBoost, on 907 term pregnancies undergoing oxytocin induction (495 primiparous women; 312 multiparous women) [127]. Their logistic regression model achieved an AUC of 0.84 for primiparous and 0.89 for multiparous women, with external validation success rates of 94.2% and 96.6%, respectively, demonstrating robust discrimination based on clinical and sonographic variables such as Bishop score, foetal weight and amniotic fluid index [127]. In a multinational secondary analysis of two phase-III randomised trials (n = 1107), D’Souza et al. applied an unspecified machine-learning algorithm to predict successful induction in women with low Bishop scores (<4), yielding an AUC of 0.73 and identifying parity, gestational age and maternal BMI as leading predictors [125]. More recently, Liu et al. incorporated transvaginal ultrasound-derived cervical maturity features into XGBoost, CatBoost and random forest models in 101 women, with XGBoost achieving a mean absolute error of 13.49 h and RMSE of 16.98 h for prediction of induction-to-delivery interval, significantly outperforming the traditional Bishop score (MAE 19.45 h; RMSE 24.55 h) [122].
In addition to predicting primary caesarean delivery, AI has been applied to the challenge of vaginal birth after caesarean (VBAC). VBAC offers significant benefits over repeat caesarean delivery, including reduced surgical morbidity and faster maternal recovery, yet carries risks such as higher rates of uterine rupture and need for emergency intervention when compared to an elective caesarean section. Historically, VBAC candidacy has been guided by tools such as the Grobman calculator, developed by the Eunice Kennedy Shriver National Institute of Child Health and Human Development Maternal–Fetal Medicine Units (MFMU) Network, which estimates the probability of successful trial of labour after caesarean (TOLAC) using six readily available admission variables: maternal age, body mass index, race/ethnicity, history of prior vaginal delivery, indication for the previous caesarean and timing of that surgery [163]. Although externally validated across diverse populations, the Grobman model exhibits modest discrimination (AUC-PR 0.325 ± 0.067), limiting its precision for individualized counselling [163].
AI–driven approaches have sought to enhance VBAC prediction by incorporating high-dimensional clinical and real-time labour data. Macones et al. compared a back-propagation neural network with multivariate logistic regression in a case–control cohort of 400 women (100 failed TOLAC, 300 successful VBAC) and found that logistic regression (sensitivity 77%, specificity 65%, accuracy 69%) outperformed the neural network (sensitivity 59–63%, specificity 42–44%) in discriminating VBAC success [164]. More recently, Meyer et al. developed and externally validated Random Forest, XGBoost and generalized linear models on 989 consecutive TOLAC deliveries at a tertiary academic centre. The Random Forest achieved the highest area under the precision–recall curve (AUC-PR 0.351 ± 0.028), compared with XGBoost (0.350 ± 0.028) and GLM (0.336 ± 0.024), surpassing the traditional MFMU-Calculator prediction model (0.325 ± 0.067) [134]. Key predictors included prior vaginal birth, maternal height and arrest of descent. Clinically, these models could be integrated into decision-support tools to provide point-of-care VBAC success probabilities, informing counselling, consent and labour management in women considering TOLAC.
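Sensitivity, specificity and accuracy are linked through the case mix, and the relation can be checked with simple arithmetic. The sketch below recombines the logistic-regression sensitivity and specificity quoted above with the 100:300 case–control split. Which outcome Macones et al. treated as the positive class is not stated in the text, so both assignments are shown and each is an assumption.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    true_positives = sensitivity * n_positive
    true_negatives = specificity * n_negative
    return (true_positives + true_negatives) / (n_positive + n_negative)

# Assumption A: failed TOLAC (n = 100) is the positive class.
acc_failed_positive = accuracy_from_rates(0.77, 0.65, n_positive=100, n_negative=300)

# Assumption B: successful VBAC (n = 300) is the positive class.
acc_success_positive = accuracy_from_rates(0.77, 0.65, n_positive=300, n_negative=100)

print(round(acc_failed_positive, 2), round(acc_success_positive, 2))
```

Under assumption A the implied accuracy is 0.68, and under assumption B it is 0.74. Either way, the calculation shows that accuracy depends on the ratio of outcomes in the sample, so figures from a case–control design with a fixed 1:3 ratio do not transfer directly to populations with a different VBAC success rate.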
Accurate identification of women at elevated risk for mediolateral episiotomy during the second stage of labour remains elusive: indiscriminate use of the procedure can increase maternal morbidity, while under-use may predispose to severe perineal trauma [116]. Tingting Hu and colleagues prospectively evaluated multiple AI algorithms, including support vector machine, random forest, LightGBM and XGBoost, on 1191 vaginal deliveries (incorporating 300 episiotomies) and demonstrated that the SVM model achieved the highest discriminative performance (AUC 0.882; recall 0.981; precision 0.790) [116]. Key predictors encompassed maternal characteristics (age, body mass index, parity), perineal metrics (length, elasticity, thickness, oedema), labour dynamics (duration of each stage, uterine contraction patterns) and intrapartum complications (shoulder dystocia, instrumental assistance) [116]. Embedding such a model into electronic monitoring systems or mobile decision-support tools could provide clinicians with real-time risk estimates at the bedside, enabling targeted episiotomy only for those most likely to benefit, thereby minimizing unnecessary perineal injury and optimising maternal outcomes.
Despite these promising results, most studies remain retrospective and single-centre, with potential selection bias, limited external validation and variability in data acquisition protocols. Intrapartum signal-based models require standardised cardiotocography sampling and preprocessing, while electronic health record–driven algorithms depend on data completeness and consistent coding practices. Future research must prioritize prospective, multicentre trials to confirm generalizability, incorporate continuous ultrasound and wearable sensor streams for richer temporal modelling, and adopt explainable AI frameworks that provide transparent feature attributions. Seamless integration into clinical workflows, including real-time inference engines and user-centric interfaces, will be essential to translate these high-performance models into routine obstetric practice and ultimately improve maternal and neonatal outcomes.
A summary of studies regarding AI applications in labour and delivery is presented in Table 6.

4. Challenges and Limitations: Ethical and Regulatory Frameworks for AI in Obstetrics

The rapid adoption of AI in the continuum of Obstetric care promises substantial improvements in outcomes but also raises complex legal and ethical challenges. To ensure that AI deployments respect patient autonomy, privacy, and equity, and that they meet rigorous safety standards, stakeholders must navigate a multifaceted regulatory landscape while embedding robust ethical safeguards into every stage of development and implementation.

4.1. Current AI Legal Frameworks in Healthcare

International organizations and national governments are rapidly shaping the legal landscape to govern AI applications in healthcare. The World Health Organization (WHO) has issued comprehensive digital-health guidelines that articulate ethical principles for AI in medicine, emphasising respect for patient autonomy, equity of access, privacy protection, and ongoing evaluation of safety and effectiveness [165]. In 2024, the Organization for Economic Co-operation and Development (OECD) released its updated intergovernmental standard on trustworthy AI, calling for AI systems that are human-centered, fair, transparent, robust, and environmentally sustainable [166]. These high-level frameworks provide foundational guidance but lack binding legal force, leaving national regulators to translate principles into enforceable rules.
In the United States of America, the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule requires covered entities and business associates to implement administrative, physical, and technical safeguards, such as de-identification, encryption, and access controls, to protect electronic protected health information (ePHI) used in AI model development and deployment [167]. Concurrently, the Food and Drug Administration (FDA) treats clinical AI/ML algorithms as Software as a Medical Device (SaMD). Its 2019 Discussion Paper and 2021 AI/ML-Based SaMD Action Plan establish a Total Product Life-Cycle approach, mandating pre-specified Change Control Plans for continuous-learning models, adherence to Good Machine Learning Practices (GMLPs), and post-market surveillance to monitor for performance drift and emerging risks [168,169].
The EU’s General Data Protection Regulation (GDPR) extends stringent protections to any processing of “personal data,” including health information [170]. AI projects must comply with GDPR’s principles of lawfulness, transparency, data minimization, and purpose limitation, obtain explicit consent or invoke a research exemption for “special category” data, and conduct Data Protection Impact Assessments for high-risk processing [168]. Building on GDPR, the upcoming AI Act (effective August 2026) introduces a risk-based classification: “high-risk” AI must satisfy rigorous requirements for data quality, human oversight, transparency, and conformity assessment under the Medical Device Regulation (MDR 2017/745) [171].

4.2. Clinical Investigation and AI-Specific Requirements

The translation of AI algorithms from prototype to clinical utility in Obstetrics necessitates rigorous investigation akin to that required for medical devices and pharmaceuticals. Regulatory bodies in major jurisdictions have delineated pathways to ensure that AI systems demonstrate safety, effectiveness, and generalizability before—and after—deployment.
In the United States, AI tools intended to inform clinical decision-making fall under the FDA’s definition of Software as a Medical Device (SaMD) [172]. When algorithm outputs will directly influence patient management, sponsors must secure an Investigational Device Exemption (IDE) before initiating clinical studies [172]. IDE applications must include a detailed protocol describing: the intended use and clinical role of the AI system; the characteristics of the training and validation datasets (including demographic composition and handling of missing data); performance–metric thresholds that would trigger study modification or termination; and data-safety monitoring procedures to detect adverse events or model failures in real time [172].
Under the European Medical Device Regulation (MDR 2017/745), AI/ML-based SaMD classified as “high-risk” must undergo clinical performance studies in accordance with Annex XIV. Such studies require ethics committee approval and, for devices in risk class IIa or higher, a formal Clinical Investigation Plan submitted to a Notified Body. The investigation plan must specify objectives, study design, sample-size justification, inclusion of appropriate comparator arms (for example, standard-of-care risk scores), and statistical analysis methods to evaluate primary endpoints such as sensitivity, specificity, and calibration in target populations [173].
To promote reproducibility and facilitate regulatory review, professional guidelines have been extended to AI interventions. The SPIRIT-AI extension provides 15 additional items for trial protocols, covering aspects such as algorithm versioning, data provenance, and human-in-the-loop mechanisms [174]. Similarly, CONSORT-AI defines 14 extension items for trial reporting, including clear description of integration into clinical workflows, criteria for expert override, and post-hoc explainability analyses to contextualize performance in subgroups (for example, across different ethnicities or gestational ages) [175].
Given that many AI algorithms incorporate adaptive or continuous-learning components, regulators require ongoing post-market monitoring to detect performance drift when real-world patient characteristics or care processes diverge from the development environment.
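One simple form that such post-market monitoring could take is a rolling comparison of the mean predicted risk with the observed event rate. The sketch below is a hypothetical illustration of that idea; the class name, window size and tolerance are assumptions and are not drawn from any regulatory guidance.

```python
from collections import deque

class CalibrationDriftMonitor:
    """Rolling check of observed event rate against mean predicted risk."""

    def __init__(self, window=100, tolerance=0.05):
        self.records = deque(maxlen=window)   # keeps only the latest cases
        self.tolerance = tolerance

    def update(self, predicted_risk, outcome):
        self.records.append((predicted_risk, outcome))

    def gap(self):
        n = len(self.records)
        mean_predicted = sum(p for p, _ in self.records) / n
        observed_rate = sum(y for _, y in self.records) / n
        return observed_rate - mean_predicted

    def drifted(self):
        # Only raise an alert once the window is full.
        window_full = len(self.records) == self.records.maxlen
        return window_full and abs(self.gap()) > self.tolerance

monitor = CalibrationDriftMonitor(window=10, tolerance=0.05)

# Development-like period: predicted 10% risk, 1 event in 10 deliveries.
for i in range(10):
    monitor.update(0.10, 1 if i == 0 else 0)
stable_period_alert = monitor.drifted()

# Later period: same predictions, but 3 events in 10 deliveries.
for i in range(10):
    monitor.update(0.10, 1 if i < 3 else 0)
shifted_period_alert = monitor.drifted()

print(stable_period_alert, shifted_period_alert)
```

In practice the window and tolerance would need to be pre-specified and justified clinically, and discrimination metrics would typically be tracked alongside calibration.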

5. Conclusions and Future Directions

This scoping review provides a comprehensive synthesis of current AI applications across the maternal–foetal continuum. By capturing the methodological diversity of AI approaches while emphasising clinical relevance, it highlights both the transformative potential of AI for early diagnosis, personalised risk stratification, and automated monitoring and the emerging importance of explainable AI, multimodal data integration, and regulatory oversight.
However, several limitations warrant consideration. As a scoping review, no formal assessment of study quality or risk of bias was undertaken, limiting the ability to weigh the relative strength of evidence. The restriction to English-language publications may have introduced language bias, and the inclusion of studies spanning more than two decades introduces heterogeneity in data sources, model architectures, and reporting standards. Moreover, the predominance of retrospective, single-centre studies with limited external validation constrains generalizability.
To fully realize AI’s promise in obstetrics, several challenges and gaps must be addressed. Firstly, improving AI training datasets is crucial. Many current models have been developed on relatively narrow datasets, which can introduce algorithmic bias and limit their generalizability. There is an urgent need for larger and more diverse maternal health data sources to ensure that AI tools perform well across different populations and care settings. Encouraging open access to maternal-foetal datasets and establishing data-sharing collaborations will be instrumental in this effort, as it allows researchers to validate and refine AI models on a wide range of real-world scenarios. By closing the data gap and enhancing data quality, we can increase the fairness, accuracy, and clinical reliability of AI predictions in obstetrics.
Secondly, strengthening the collaboration between AI researchers and healthcare professionals is essential. Interdisciplinary teamwork can ensure that the next generation of AI tools is both clinically relevant and user-friendly for those on the front lines of care. Obstetricians, midwives, and maternal-foetal medicine specialists should be actively involved in the design, testing, and implementation of AI systems, providing valuable insights into clinical workflows and decision-making nuances. Likewise, computer scientists and engineers can tailor their algorithms to address the real-world needs and constraints identified by healthcare providers. In tandem, medical education will need to evolve to build AI literacy among clinicians. Training programs and continuing education could incorporate basic data science and AI concepts so that obstetrical care providers feel comfortable interpreting AI outputs and maintaining appropriate oversight of AI-assisted decisions. Fostering this two-way exchange, where technology experts and clinicians learn from each other, will help ensure trust in AI integration.
Thirdly, robust regulatory frameworks and ethical guidelines are needed to support safe AI deployment in Obstetrics. Given the high stakes in maternal-foetal medicine, it is imperative that AI algorithms undergo rigorous evaluation and approval processes before being widely adopted. Regulatory bodies and professional organizations should establish clear standards for validating obstetric AI tools, including requirements for transparency and thorough testing. Issues such as data privacy, informed consent for AI use, and liability in the event of AI mistakes also need defined policies. Importantly, ongoing oversight is required even after deployment: models may drift in performance over time or behave unpredictably in new settings, so continuous monitoring and periodic re-certification of AI systems should be part of the governance framework.
In summary, despite its nascent status in clinical practice, artificial intelligence offers a transformative opportunity to improve obstetric outcomes and drive more personalised, equitable care. With continued interdisciplinary research, responsible deployment, and robust oversight, AI can be harnessed to help make pregnancy and childbirth safer, ushering in a new era of “intelligent” obstetric care that complements and strengthens the work of healthcare professionals.

Author Contributions

Conceptualization, M.M.; methodology, M.M., V.C.; formal analysis, M.M., V.C.; investigation, M.M., V.C.; writing—original draft preparation, V.C.; writing—review and editing, M.M., V.C., T.M.; supervision, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Ethical review and approval were waived for this study due to its nature.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
2D: Two-Dimensional
AIF: Amniotic Fluid Index
ANN: Artificial Neural Network
APTT: Activated Partial Thromboplastin Time
ASA: American Society of Anesthesiologists Physical Status
ASAT: Aspartate Aminotransferase
ACS: Antenatal Corticosteroids
AdaBoost: Adaptive Boosting
Acc: Accuracy
AISI: Aggregate Index of Systemic Inflammation
AI: Artificial Intelligence
AN: Artificial Neural Network
AUROC: Area Under the Receiver-Operating-Characteristic Curve
AUC: Area Under the Curve
AUC-PR: Area Under the Precision–Recall Curve
AUCfinal: Overall Survival AUC
AUCtd: Time-Dependent AUC
Bagging: Bootstrap Aggregating
BMI: Body Mass Index
BP: Blood Pressure
CBC: Complete Blood Count
CART: Classification and Regression Tree
CHAID: Chi-Square Automatic Interaction Detection
CI: Confidence Interval
CKD: Chronic Kidney Disease
CS: Caesarean Section
CSL: Consortium on Safe Labor
Cox PH: Cox Proportional Hazards Model
CNN: Convolutional Neural Network
Cr: Creatinine
CTG: Cardiotocography
DFA: Detrended Fluctuation Analysis
DBP: Diastolic Blood Pressure
DEGs: Differentially Expressed Genes
DL: Deep Learning
DLR: Deep Learning Radiomics
DLCNN: Deep Learning Convolutional Neural Network
DM: Diabetes Mellitus
DNN: Deep Neural Network
DTC: Decision Tree Classifier
DT: Decision Tree
DR: Detection Rate
DR10: Detection Rate at 10% FPR
DRF: Distributed Random Forest
EHR: Electronic Health Record
EFM: Electronic Fetal Monitoring
EMG: Electromyography
EMR: Electronic Medical Record
EFW: Estimated Fetal Weight
F1: F1-Score
FBS: Fasting Blood Sugar
FGR: Fetal Growth Restriction
FHR: Fetal Heart Rate
FPG: Fasting Plasma Glucose
FT4: Free Thyroxine
FT-IR: Fourier-Transform Infrared Spectroscopy
FPR: False-Positive Rate
GB: Gradient Boosting
GBM: Gradient Boosting Machine
GBDT: Gradient-Boosted Decision Trees
GBT: Gradient-Boosted Trees
GAM: Generalized Additive Model
GCT: Glucose Challenge Test
GDM: Gestational Diabetes Mellitus
GA: Gestational Age
GNB: Gaussian Naïve Bayes
GRU: Gated Recurrent Unit
GWG: Gestational Weight Gain
HDP: Hypertensive Disorders of Pregnancy
HGSORF: Henry Gas Solubility Optimization–Based Random Forest
HELLP: Hemolysis, Elevated Liver Enzymes, Low Platelets Syndrome
HGB: Hemoglobin
Hb: Hemoglobin
HbA1c: Glycated Hemoglobin
HTN: Hypertension
ICU: Intensive Care Unit
IBS: Integrated Brier Score
IUGR: Intrauterine Growth Restriction
IVF: In Vitro Fertilization
KNN: k-Nearest Neighbors
k-NN: k-Nearest Neighbors
LDA: Linear Discriminant Analysis
LASSO: Least Absolute Shrinkage and Selection Operator
LFHF: Low and High-Frequency Spectral Power
LR: Logistic Regression
LGB: LightGBM
LERS: Learning from Examples Using Rough Sets
LSTM: Long Short-Term Memory
MARS: Multivariate Adaptive Regression Splines
MAE: Mean Absolute Error
MAP: Mean Arterial Pressure
MAPE: Mean Absolute Percentage Error
MCH: Mean Corpuscular Hemoglobin
MCC: Matthews Correlation Coefficient
ML: Machine Learning
MLP: Multilayer Perceptron
MoM: Multiples of the Median
MOMI: Multi-center Observational Maternal Initiative
MRI: Magnetic Resonance Imaging
NB: Naïve Bayes
NFH: Neonatal Fetal Hypoxia
NMF: Non-negative Matrix Factorization
NICHD: National Institute of Child Health and Human Development
NPV: Negative Predictive Value
OASI: Obstetric Anal Sphincter Injury
OGTT: Oral Glucose Tolerance Test
PAPP-A: Pregnancy-Associated Plasma Protein-A
PCA: Principal Component Analysis
PF: Probability Forest
PG: Plasma Glucose
PlGF: Placental Growth Factor
PP-13: Placental Protein-13
PPROM: Preterm Premature Rupture of Membranes
PROM: Premature Rupture of Membranes
PPV: Positive Predictive Value
PR-AUC: Precision-Recall Area Under the Curve
Pred: Predictors
PT: Prothrombin Time
PTB: Preterm Birth
RF: Random Forest
ReLU: Rectified Linear Unit
RBF: Radial Basis Function
ROC: Receiver-Operating-Characteristic
ROC-AUC: Receiver-Operating-Characteristic Area Under the Curve
RMS: Root-Mean-Square
RNN: Recurrent Neural Network
SBELM: Stacked Bayesian Extreme Learning Machine
SBP: Systolic Blood Pressure
Sens: Sensitivity
SHAP: SHapley Additive Explanations
SFM: State Flow Machine
SGB: Stochastic Gradient Boosting
SMOTE: Synthetic Minority Oversampling Technique
SOM: Self-Organizing Map
spec: Specificity
ST: Study Type
STV: Short-Term Variability
TBIL: Total Bilirubin
TG: Triglycerides
TOCO: Tocodynamometry
TGLCN: Trend-Guided Long Convolution Network
T2WI: T2-Weighted Imaging
TT: Thrombin Time
UC: Uterine Contraction
UtA-PI: Uterine Artery Pulsatility Index
U-Net: U-Shaped Convolutional Network
US: Ultrasound
VD: Vaginal Delivery
Vit D3: Vitamin D3
VIP: Variable Importance in Projection
WBC: White Blood Cell Count
WC: Waist Circumference
XGB: Extreme Gradient Boosting
XGBoost: eXtreme Gradient Boosting

References

  1. Russell, S.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson: Hoboken, NJ, USA, 2020. [Google Scholar]
  2. Kufel, J.; Bargieł-Łączek, K.; Kocot, S.; Koźlik, M.; Bartnikowska, W.; Janik, M.; Czogalik, Ł.; Dudek, P.; Magiera, M.; Lis, A.; et al. What is machine learning, artificial neural networks and deep learning?—Examples of practical applications in medicine. Diagnostics 2023, 13, 2582. [Google Scholar] [CrossRef] [PubMed]
  3. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  4. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; Curran Associates: Red Hook, NY, USA, 2017; pp. 5998–6008. [Google Scholar]
  5. Silver, D.; Huang, A.; Maddison, C.J.; Guez, A.; Sifre, L.; van den Driessche, G.; Schrittwieser, J.; Antonoglou, I.; Panneershelvam, V.; Lanctot, M.; et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016, 529, 484–489. [Google Scholar] [CrossRef] [PubMed]
  6. Bertini, A.; Salas, R.; Chabert, S.; Sobrevia, L.; Pardo, F. Using machine learning to predict complications in pregnancy: A systematic review. Front. Bioeng. Biotechnol. 2021, 9, 780389. [Google Scholar] [CrossRef]
  7. Juszczak, K.; Summers, S.; Elson, D.; Peters, T.M. Automated measurement of fetal head circumference from ultrasound images using deep learning. Ultrasound Med. Biol. 2020, 46, 1947–1957. [Google Scholar]
  8. van den Heuvel, T.L.A.; de Bruijn, D.; de Korte, C.L.; van Ginneken, B. Automated measurement of fetal head circumference using 2D ultrasound images. PLoS ONE 2018, 13, e0200412. [Google Scholar] [CrossRef]
  9. Hassan, A.N.; Guend, H.; Syed, S. Wearable sensor-based phenotyping of maternal and fetal health: Opportunities and challenges. npj Digit. Med. 2022, 5, 50. [Google Scholar]
  10. Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA extension for scoping reviews (PRISMA-ScR): Checklist and explanation. Ann. Intern. Med. 2018, 169, 467–473. [Google Scholar] [CrossRef]
  11. Peters, M.D.J.; Godfrey, C.M.; McInerney, P.; Soares, C.B.; Khalil, H.; Parker, D. Methodology for JBI scoping reviews. In Joanna Briggs Institute Reviewers’ Manual; Aromataris, E., Munn, Z., Eds.; The Joanna Briggs Institute: Adelaide, Australia, 2015; pp. 1–24. [Google Scholar]
  12. McCoy, J.A.; Levine, L.D.; Wan, G.; Chivers, C.; Teel, J.; La Cava, W.G. Intrapartum electronic fetal heart rate monitoring to predict acidemia at birth with the use of deep learning. Am. J. Obstet. Gynecol. 2025, 232, 116.e1–116.e9. [Google Scholar] [CrossRef]
  13. Ben M’Barek, I.; Jauvion, G.; Merrer, J.; Koskas, M.; Sibony, O.; Ceccaldi, P.F.; Le Pennec, E.; Stirnemann, J. DeepCTG® 2.0: Development and validation of a deep learning model to detect neonatal acidemia from cardiotocography during labor. Comput. Biol. Med. 2025, 184, 109448. [Google Scholar] [CrossRef]
  14. Gumilar, K.E.; Wardhana, M.P.; Akbar, M.I.A.; Putra, A.S.; Banjarnahor, D.P.P.; Mulyana, R.S.; Fatati, I.; Yu, Z.Y.; Hsu, Y.C.; Dachlan, E.G.; et al. Artificial intelligence–large language models (AI-LLMs) for reliable and accurate cardiotocography (CTG) interpretation in obstetric practice. Comput. Struct. Biotechnol. J. 2025, 23, 3034–3045. [Google Scholar] [CrossRef]
  15. Roozbeh, N.; Montazeri, F.; Vahidi Farashah, M.; Mehrnoush, V.; Darsareh, F. Proposing a machine learning-based model for predicting nonreassuring fetal heart. Sci. Rep. 2025, 15, 7812. [Google Scholar] [CrossRef]
  16. Zhao, Z.; Zhu, J.; Jiao, P.; Wang, J.; Zhang, X.; Lu, X.; Zhang, Y. Hybrid-FHR: A multi-modal AI approach for automated fetal acidosis diagnosis. BMC Med. Inform. Decis. Mak. 2024, 24, 19. [Google Scholar] [CrossRef]
  17. Tarvonen, M.; Manninen, M.; Lamminaho, P.; Jehkonen, P.; Tuppurainen, V.; Andersson, S. Computer vision for identification of increased fetal heart variability in cardiotocogram. Gynecol. Obstet. Investig. 2024, 89, 460–470. [Google Scholar] [CrossRef] [PubMed]
  18. Mushtaq, G.; Veningston, K. AI-driven interpretable deep learning based fetal health classification. SLAst 2024, 4, 100206. [Google Scholar] [CrossRef] [PubMed]
  19. Melaet, R.; de Vries, I.R.; Kok, R.D.; Oei, S.G.; Huijben, I.A.M.; van Sloun, R.J.G.; van Laar, J.O.E.H.; Vullings, R. Artificial intelligence-based cardiotocogram assessment during labor. Eur. J. Obstet. Gynecol. Reprod. Biol. 2024, 295, 75–85. [Google Scholar] [CrossRef] [PubMed]
  20. Wahbah, M.; Zitouni, M.S.; Al Sakaji, R.; Funamoto, K.; Widatalla, N.; Krishnan, A.; Kimura, Y.; Khandoker, A.H. A deep learning framework for noninvasive fetal ECG signal extraction. Front. Physiol. 2024, 15, 1329313. [Google Scholar] [CrossRef]
  21. Mendis, L.; Palaniswami, M.; Keenan, E.; Brownfoot, F. Rapid detection of fetal compromise using input length-invariant deep learning on fetal heart rate signals. Sci. Rep. 2024, 14, 63108. [Google Scholar] [CrossRef]
  22. Li, J.; Li, J.; Guo, C.; Chen, Q.; Liu, G.; Li, L.; Luo, X.; Wei, H. Multicentric intelligent cardiotocography signal interpretation using deep semi-supervised domain adaptation via minimax entropy and domain invariance. Comput. Methods Programs Biomed. 2024, 249, 108145. [Google Scholar] [CrossRef]
  23. Das, S.; Obaidullah, S.M.; Mahmud, M.; Kaiser, M.S.; Roy, K.; Saha, C.K.; Goswami, K. A machine learning pipeline to classify foetal heart rate deceleration with optimal feature set. Sci. Rep. 2023, 13, 2495. [Google Scholar] [CrossRef]
  24. Liang, H.; Lu, Y. A CNN-RNN unified framework for intrapartum cardiotocograph classification. Comput. Methods Programs Biomed. 2022, 223, 107300. [Google Scholar] [CrossRef]
  25. Zhou, Z.; Zhao, Z.; Zhang, X.; Zhang, X.; Jiao, P.; Ye, X. Identifying fetal status with fetal heart rate: Deep learning approach based on long convolution. Comput. Biol. Med. 2023, 161, 106970. [Google Scholar] [CrossRef]
  26. Lee, K.S.; Choi, E.S.; Nam, Y.J.; Liu, N.W.; Yang, Y.S.; Kim, H.Y.; Ahn, K.H.; Hong, S.C. Real-time classification of fetal status based on deep learning and cardiotocography data. J. Med. Syst. 2023, 47, 60. [Google Scholar] [CrossRef]
  27. Ben M’Barek, I.; Jauvion, G.; Vitrou, J.; Holmström, E.; Koskas, M.; Ceccaldi, P.F. DeepCTG® 1.0: An interpretable model to detect fetal hypoxia from cardiotocography data during labor and delivery. Front. Pediatr. 2023, 11, 1190441. [Google Scholar] [CrossRef]
  28. Cao, Z.; Wang, G.; Xu, L.; Li, C.; Hao, Y.; Chen, Q.; Li, X.; Liu, G.; Wei, H. Intelligent antepartum fetal monitoring via deep learning and fusion of cardiotocographic signals and clinical data. Healthcare Inform. Res. 2023, 29, 215–226. [Google Scholar] [CrossRef] [PubMed]
  29. Daydulo, Y.D.; Thamineni, B.L.; Dasari, H.K.; Aboye, G.T. Morse wavelet CTG analysis. BMC Med. Inform. Decis. Mak. 2022, 22, 2068. [Google Scholar] [CrossRef]
  30. Spairani, E.; Daniele, B.; Signorini, M.G.; Magenes, G. A deep learning mixed-data type approach for the classification of FHR signals. Front. Bioeng. Biotechnol. 2022, 10, 887549. [Google Scholar] [CrossRef] [PubMed]
  31. Boudet, S.; Houzé de l’Aulnoit, A.; Peyrodie, L.; Demailly, R.; de L’aulnoit, D.H. Use of deep learning to detect maternal heart rate and false signals on fetal heart rate recordings. Biosensors 2022, 12, 691. [Google Scholar] [CrossRef]
  32. Frasch, M.G.; Strong, S.B.; Nilosek, D.; Leaverton, J.; Schifrin, B.S. Detection of preventable fetal distress from scanned cardiotocogram tracings using deep learning. Front. Pediatr. 2021, 9, 736834. [Google Scholar] [CrossRef]
  33. Fotiadou, E.; van Sloun, R.J.G.; van Laar, J.O.E.H.; Vullings, R. A dilated inception CNN-LSTM network for fetal heart rate estimation. Physiol. Meas. 2021, 42, 045007. [Google Scholar] [CrossRef]
  34. Liu, L.C.; Tsai, Y.H.; Chou, Y.C.; Jheng, Y.C.; Lin, C.K.; Lyu, N.Y.; Chien, Y.; Yang, Y.P.; Chang, K.J.; Chang, K.H.; et al. Concordance analysis of intrapartum cardiotocography between physicians and artificial intelligence-based technique using modified one-dimensional fully convolutional networks. J. Chin. Med. Assoc. 2021, 84, 1022–1028. [Google Scholar] [CrossRef] [PubMed]
  35. Signorini, M.G.; Pini, N.; Malovini, A.; Bellazzi, R.; Magenes, G. Integrating machine learning techniques and physiology based heart rate features for antepartum fetal monitoring. Comput. Methods Programs Biomed. 2020, 185, 105015. [Google Scholar] [CrossRef] [PubMed]
  36. Hoodbhoy, Z.; Noman, M.; Shafique, A.; Nasim, A.; Chowdhury, D.; Hasan, B. Use of machine learning algorithms for prediction of fetal risk using cardiotocographic data. Int. J. Appl. Basic Med. Res. 2019, 9, 226–230. [Google Scholar] [CrossRef] [PubMed]
  37. Zhao, Z.; Zhang, Y.; Comert, Z.; Deng, Y. Computer-Aided Diagnosis System of Fetal Hypoxia Incorporating Recurrence Plot with Convolutional Neural Network. Front. Physiol. 2019, 10, 255. [Google Scholar] [CrossRef]
  38. Cömert, Z.; Şengür, A.; Budak, Ü.; Kocamaz, A.F. Prediction of intrapartum fetal hypoxia considering feature selection algorithms and machine learning models. Health Inf. Sci. Syst. 2019, 7, 17. [Google Scholar] [CrossRef]
  39. Zhao, Z.; Deng, Y.; Zhang, Y.; Zhang, Y.; Zhang, X.; Shao, L. DeepFHR: Intelligent prediction of fetal acidemia using fetal heart rate signals based on convolutional neural network. BMC Med. Inform. Decis. Mak. 2019, 19, 286. [Google Scholar] [CrossRef]
  40. Tang, H.; Wang, T.; Li, M.; Yang, X. The design and implementation of cardiotocography signals classification algorithm based on neural network. Comput. Math. Methods Med. 2018, 2018, 8568617. [Google Scholar] [CrossRef]
  41. Leonarduzzi, R.; Spilka, J.; Frecon, J.; Wendt, H.; Pustelnik, N.; Jaffard, S.; Abry, P.; Doret, M. P-leader multifractal analysis and sparse SVM for intrapartum fetal acidosis detection. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2015, 2015, 1971–1974. [Google Scholar] [CrossRef]
  42. Maeda, K.; Noguchi, Y.; Matsumoto, F.; Nagasawa, T. Quantitative fetal heart rate evaluation without pattern classification: FHR score and artificial neural network analysis. J. Matern.-Fetal Neonatal Med. 2010, 23, 1517–1522. [Google Scholar] [CrossRef]
  43. Salamalekis, E.; Thomopoulos, P.; Giannaris, D.; Salloum, I.; Vasios, G.; Prentza, A.; Koutsouris, D. Computerised intrapartum diagnosis of fetal hypoxia based on fetal heart rate monitoring and fetal pulse oximetry recordings utilising wavelet analysis and neural networks. BJOG 2002, 109, 553–562. [Google Scholar] [CrossRef]
  44. Liszka-Hackzell, J.J. Categorization of fetal heart rate patterns using neural networks. Comput. Methods Programs Biomed. 2001, 67, 209–217. [Google Scholar] [CrossRef]
  45. Kol, S.; Thaler, I.; Paz, N.; Shmueli, O. Interpretation of nonstress tests by an artificial neural network. Am. J. Obstet. Gynecol. 1995, 173, 801–805. [Google Scholar] [CrossRef] [PubMed]
  46. Keith, R.D.; Westgate, J.; Ifeachor, E.C.; Greene, K.R. Suitability of artificial neural networks for feature extraction from cardiotocogram during labour. J. Perinat. Med. 1995, 23, 531–540. [Google Scholar] [CrossRef] [PubMed]
  47. Kloska, A.; Harmoza, A.; Kloska, S.M.; Marciniak, T.; Sadowska-Krawczenko, I. Predicting preterm birth using machine learning methods. Sci. Rep. 2025, 15, 89905. [Google Scholar] [CrossRef] [PubMed]
  48. Ohtaka, A.; Akazawa, M.; Hashimoto, K. Transvaginal ultrasound for preterm birth prediction. J. Med. Ultrason. 2023, 50, 394. [Google Scholar] [CrossRef]
  49. Bitar, G.; Liu, W.; Tunguhan, J.; Kumar, K.V.; Hoffman, M.K. A machine learning algorithm using clinical and demographic data for all-cause preterm birth prediction. Am. J. Perinatol. 2024, 41 (Suppl. S1), e3115–e3123. [Google Scholar] [CrossRef]
  50. Ushida, T.; Kotani, T.; Baba, J.; Imai, K.; Moriyama, Y.; Nakano-Kobayashi, T.; Iitani, Y.; Nakamura, N.; Hayakawa, M.; Kajiyama, H.; et al. Antenatal prediction models for outcomes of extremely and very preterm infants based on machine learning. Arch. Gynecol. Obstet. 2023, 306, 1287–1296. [Google Scholar] [CrossRef]
  51. Andrade Júnior, V.L.; França, M.S.; Santos, R.A.F.; Hatanaka, A.R.; Cruz, J.J.; Hamamoto, T.E.K.; Traina, E.; Sarmento, S.G.P.; Elito Júnior, J.; Pares, D.B.d.S.; et al. A new model based on artificial intelligence to screening preterm birth. J. Matern.-Fetal Neonatal Med. 2023, 36, 2241100. [Google Scholar] [CrossRef]
  52. Kokkinidis, I.; Logaras, E.; Rigas, E.S.; Tsakiridis, I.; Dagklis, T.; Billis, A.; Bamidis, P.D. Towards an explainable AI-based tool to predict preterm birth. Stud. Health Technol. Inform. 2023, 305, 207. [Google Scholar] [CrossRef]
  53. Khan, W.; Zaki, N.; Ghenimi, N.; Ahmad, A.; Bian, J.; Masud, M.M.; Ali, N.; Govender, R.; Ahmed, L.A. Predicting preterm birth using explainable machine learning in a prospective cohort of nulliparous and multiparous pregnant women. PLoS ONE 2023, 18, e0293925. [Google Scholar] [CrossRef]
  54. Zhang, Y.; Du, S.; Hu, T.; Xu, S.; Lu, H.; Xu, C.; Li, J.; Zhu, X. Establishment of a model for predicting preterm birth based on the machine learning algorithm. BMC Pregnancy Childbirth 2023, 23, 6058. [Google Scholar] [CrossRef]
  55. Sun, Q.; Zou, X.; Yan, Y.; Zhang, H.; Wang, S.; Gao, Y.; Liu, H.; Liu, S.; Lu, J.; Yang, Y.; et al. Machine learning-based prediction model of preterm birth using electronic health record. Comput. Math. Methods Med. 2022, 2022, 9635526. [Google Scholar] [CrossRef]
  56. Wong, K.; Tessema, G.A.; Chai, K.; Pereira, G. Development of prognostic model for preterm birth using machine learning in a population-based cohort of Western Australia births between 1980 and 2015. Sci. Rep. 2022, 12, 21936. [Google Scholar] [CrossRef] [PubMed]
  57. Zhou, Y.; Liu, Y.; Zhang, Y.; Zhang, Y.; Wu, W.; Fan, J. Identifying non-linear association between maternal free thyroxine and risk of preterm delivery by a machine learning model. Front. Endocrinol. 2022, 13, 817595. [Google Scholar] [CrossRef] [PubMed]
  58. Park, S.; Moon, J.; Kang, N.; Kim, Y.H.; You, Y.A.; Kwon, E.; Ansari, A.; Hur, Y.M.; Park, T.; Kim, Y.J. Predicting preterm birth through vaginal microbiota, cervical length, and WBC using a machine learning model. Front. Microbiol. 2022, 13, 912853. [Google Scholar] [CrossRef] [PubMed]
  59. Rawashdeh, H.; Awawdeh, S.; Shannag, F.; Henawi, E.; Faris, H.; Obeid, N.; Hyett, J. Intelligent system based on data mining techniques for prediction of preterm birth for women with cervical cerclage. Comput. Biol. Chem. 2020, 89, 107233. [Google Scholar] [CrossRef]
  60. Gao, C.; Osmundson, S.; Velez Edwards, D.R.; Jackson, G.P.; Malin, B.A.; Chen, Y. Deep learning predicts extreme preterm birth from electronic health records. J. Biomed. Inform. 2019, 100, 103334. [Google Scholar] [CrossRef]
  61. Elaveyini, U.; Devi, S.P.; Rao, K.S. Neural networks prediction of preterm delivery with first trimester bleeding. Arch. Gynecol. Obstet. 2011, 282, 203–209. [Google Scholar] [CrossRef]
  62. Catley, C.; Frize, M.; Walker, C.R.; Petriu, D.C. Predicting high-risk preterm birth using artificial neural networks. IEEE Trans. Inf. Technol. Biomed. 2006, 10, 540–549. [Google Scholar] [CrossRef]
  63. Goodwin, L.K.; Iannacchione, M.A.; Hammond, W.E.; Crockett, P.; Maher, J.E.; Schlitz, K.; Manning, C. Data mining methods find demographic predictors of preterm birth. Am. J. Obstet. Gynecol. 2001, 185, 1097–1100. [Google Scholar] [CrossRef]
  64. Woolery, L.K.; Grzymala-Busse, J. Machine learning for an expert system to predict preterm birth risk. J. Am. Med. Inform. Assoc. 1994, 1, 439–446. [Google Scholar] [CrossRef]
  65. Zheng, W.; Jiang, Y.; Jiang, Z.; Li, J.; Bian, W.; Hou, H.; Yan, G.; Shen, W.; Zou, Y.; Luo, Q.; et al. Association between deep learning radiomics based on placental MRI and preeclampsia with fetal growth restriction: A multicenter study. Eur. J. Radiol. 2025, 184, 111985. [Google Scholar] [CrossRef]
  66. Wang, Z.; Cheng, L.; Li, G.; Cheng, H. Development of immune-derived molecular markers for preeclampsia based on multiple machine learning algorithms. Sci. Rep. 2025, 15, 86442. [Google Scholar] [CrossRef]
  67. Liu, X.; Zhang, D.; Qiu, H. NMF typing and machine learning algorithm-based exploration of preeclampsia-related mechanisms on ferroptosis signature genes. Cell. Mol. Biol. Lett. 2025, 29, 72. [Google Scholar] [CrossRef]
  68. Lv, B.; Wang, G.; Pan, Y.; Yuan, G.; Wei, L. Construction and evaluation of machine learning-based predictive models for early-onset preeclampsia. Pregnancy Hypertens. 2025, 30, 101198. [Google Scholar] [CrossRef] [PubMed]
  69. da Silva, S.M.S.D.; Nogueira, M.S.; Rizzato, J.M.B.; de Lima Silva, S.; Cortelli, S.C.; Borges, R.; da Silva Martinho, H.; Silva, R.A.; de Carvalho, L.F.D.C.E.S. Machine learning combined with infrared spectroscopy for detection of hypertension pregnancy: Towards newborn and pregnant blood analysis. BMC Pregnancy Childbirth 2024, 24, 6941. [Google Scholar] [CrossRef] [PubMed]
  70. Eberhard, B.W.; Gray, K.J.; Bates, D.W.; Kovacheva, V.P. Deep survival analysis for interpretable time-varying prediction of preeclampsia risk. J. Biomed. Inform. 2024, 150, 104688. [Google Scholar] [CrossRef] [PubMed]
  71. Zhou, T.; Gu, S.; Shao, F.; Li, P.; Wu, Y.; Xiong, J.; Wang, B.; Zhou, C.; Gao, P.; Hua, X. Prediction of preeclampsia from retinal fundus images via deep learning in singleton pregnancies: A prospective cohort study. J. Hypertens. 2023, 41, 1602–1611. [Google Scholar] [CrossRef]
  72. Vasilache, I.A.; Scripcariu, I.S.; Doroftei, B.; Bernad, R.L.; Cărăuleanu, A.; Socolov, D.; Melinte-Popescu, A.S.; Vicoveanu, P.; Harabor, V.; Mihalceanu, E.; et al. Prediction of intrauterine growth restriction and preeclampsia using machine learning-based algorithms: A prospective study. Diagnostics 2024, 14, 453. [Google Scholar] [CrossRef]
  73. Bülez, A.; Hansu, K.; Çağan, E.S.; Şahin, A.R.; Dokumacı, H.Ö. Artificial intelligence in early diagnosis of preeclampsia. Niger. J. Clin. Pract. 2024, 27, 383–388. [Google Scholar] [CrossRef]
  74. Kaya, Y.; Bütün, Z.; Çelik, Ö.; Salik, E.A.; Tahta, T. Risk assessment for preeclampsia in the preconception period based on maternal clinical history via machine learning methods. J. Clin. Med. 2024, 14, 155. [Google Scholar] [CrossRef]
  75. Tiruneh, S.A.; Rolnik, D.L.; Teede, H.J.; Enticott, J. Prediction of pre-eclampsia with machine learning approaches: Leveraging important information from routinely collected data. Int. J. Med. Inform. 2024, 192, 105645. [Google Scholar] [CrossRef]
  76. Huang, P.; Song, Y.; Yang, Y.; Bai, F.; Li, N.; Liu, D.; Li, C.; Li, X.; Gou, W.; Zong, L. Identification and verification of diagnostic biomarkers based on mitochondria-related genes related to immune microenvironment for preeclampsia using machine learning algorithms. Front. Immunol. 2024, 14, 1304165. [Google Scholar] [CrossRef]
  77. Araújo, D.C.; de Macedo, A.A.; Veloso, A.A.; Alpoim, P.N.; Gomes, K.B.; Carvalho, M.D.G.; Dusse, L.M.S. Complete blood count as a biomarker for preeclampsia with severe features diagnosis: A machine learning approach. BMC Pregnancy Childbirth 2024, 24, 628. [Google Scholar] [CrossRef]
  78. Li, T.; Xu, M.; Wang, Y.; Wang, Y.; Tang, H.; Duan, H.; Zhao, G.; Zheng, M.; Hu, Y. Prediction model of preeclampsia using machine learning-based methods: A population-based cohort study in China. Front. Endocrinol. 2024, 15, 1345573. [Google Scholar] [CrossRef]
  79. Gil, M.M.; Cuenca-Gómez, D.; Rolle, V.; Pertegal, M.; Díaz, C.; Revello, R.; Adiego, B.; Mendoza, M.; Molina, F.S.; Santacruz, B.; et al. Validation of machine-learning model for first-trimester prediction of pre-eclampsia using cohort from PREVAL study. Ultrasound Obstet. Gynecol. 2024, 63, 769–777. [Google Scholar] [CrossRef]
  80. Edvinsson, C.; Björnsson, O.; Erlandsson, L.; Hansson, S.R. Predicting intensive care need in women with preeclampsia using machine learning: A pilot study. Hypertens. Pregnancy 2024, 43, 166–174. [Google Scholar] [CrossRef] [PubMed]
  81. Ansbacher-Feldman, Z.; Syngelaki, A.; Meiri, H.; Cirkin, R.; Nicolaides, K.H.; Louzoun, Y. Machine-learning-based prediction of pre-eclampsia using first-trimester maternal characteristics and biomarkers. Ultrasound Obstet. Gynecol. 2022, 60, 739–745. [Google Scholar] [CrossRef] [PubMed]
  82. Villalaín, C.; Herraiz, I.; Domínguez-Del Olmo, P.; Angulo, P.; Ayala, J.L.; Galindo, A. Prediction of delivery within 7 days after diagnosis of early-onset preeclampsia using machine-learning models. Front. Cardiovasc. Med. 2022, 9, 910701. [Google Scholar] [CrossRef] [PubMed]
  83. Liu, M.; Yang, X.; Chen, G.; Ding, Y.; Shi, M.; Sun, L.; Huang, Z.; Liu, J.; Liu, T.; Yan, R.; et al. Development of a prediction model on preeclampsia using machine learning-based method: A retrospective cohort study in China. Front. Physiol. 2022, 13, 896969. [Google Scholar] [CrossRef]
  84. Bennett, R.; Mulla, Z.D.; Parikh, P.; Hauspurg, A.; Razzaghi, T. An imbalance-aware deep neural network for early prediction of preeclampsia. PLoS ONE 2022, 17, e0266042. [Google Scholar] [CrossRef]
  85. Hoffman, M.K.; Ma, N.; Roberts, A. A machine learning algorithm for predicting maternal readmission for hypertensive disorders of pregnancy. Am. J. Obstet. Gynecol. MFM 2021, 3, 100250. [Google Scholar] [CrossRef]
  86. Wang, G.; Zhang, Y.; Li, S.; Zhang, J.; Jiang, D.; Li, X.; Li, Y.; Du, J. A machine learning-based prediction model for cardiovascular risk in women with preeclampsia. Front. Cardiovasc. Med. 2021, 8, 736491. [Google Scholar] [CrossRef]
  87. Sufriyana, H.; Husnayain, A.; Chen, Y.-L.; Kuo, C.-Y.; Singh, O.; Yeh, T.-Y.; Wu, Y.-W.; Su, E.C.-Y. Comparison of Multivariable Logistic Regression and Other Machine Learning Algorithms for Prognostic Prediction Studies in Pregnancy Care: Systematic Review and Meta-Analysis. JMIR Med. Inform. 2020, 8, e16503. [Google Scholar] [CrossRef]
  88. Jhee, J.H.; Lee, S.; Park, Y.; Lee, S.E.; Kim, Y.A.; Kang, S.-W.; Kwon, J.-Y.; Park, J.T. Prediction model development of late-onset preeclampsia using machine learning-based methods. PLoS ONE 2019, 14, e0221202. [Google Scholar] [CrossRef] [PubMed]
  89. Bigdeli, S.K.; Ghazisaedi, M.; Ayyoubzadeh, S.M.; Hantoushzadeh, S.; Ahmadi, M. Predicting Gestational Diabetes Mellitus in the First Trimester Using Machine Learning Algorithms: A Cross-Sectional Study at a Hospital Fertility Health Center in Iran. BMC Med. Inform. Decis. Mak. 2025, 25, 3. [Google Scholar] [CrossRef] [PubMed]
  90. Zhao, M.; Su, X.; Huang, L. Early gestational diabetes mellitus risk predictor using neural network with NearMiss. J. Matern.-Fetal Neonatal Med. 2025, 38, 2470317. [Google Scholar] [CrossRef] [PubMed]
  91. Zaky, H.; Fthenou, E.; Srour, L.; Farrell, T.; Bashir, M.; El Hajj, N.; Alam, T. Machine learning based model for the early detection of gestational diabetes mellitus. BMC Med. Inform. Decis. Mak. 2025, 25, 29. [Google Scholar] [CrossRef]
  92. Zhou, H.; Chen, W.; Cheng, C.; Zhang, Y.; Chen, J.; Lin, J.; He, K.; Guo, X. Predictive Value of Ultrasonic Artificial Intelligence in Placental Characteristics of Early Pregnancy for Gestational Diabetes Mellitus. Front. Endocrinol. 2024, 15, 1344666. [Google Scholar] [CrossRef]
  93. Chen, M.; Xu, W.; Guo, Y.; Yan, J. Predicting recurrent gestational diabetes mellitus using artificial intelligence models: A retrospective cohort study. Arch. Gynecol. Obstet. 2024, 310, 1621–1630. [Google Scholar] [CrossRef]
  94. Kaya, Y.; Bütün, Z.; Çelik, Ö.; Akça Salik, E.; Tahta, T.; Altun Yavuz, A. The early prediction of gestational diabetes mellitus by machine learning models. BMC Pregnancy Childbirth 2024, 24, 574. [Google Scholar] [CrossRef]
  95. Cubillos, G.; Monckeberg, M.; Plaza, A.; Morgan, M.; Estévez, P.A.; Choolani, M.; Kemp, M.W.; Illanes, S.E.; Pérez, C.A. Development of machine learning models to predict gestational diabetes risk in the first half of pregnancy. BMC Pregnancy Childbirth 2023, 23, 469. [Google Scholar] [CrossRef] [PubMed]
  96. Hu, X.; Hu, X.; Yu, Y.; Wang, J. Prediction model for gestational diabetes mellitus using the XGBoost machine learning algorithm. Front. Endocrinol. 2023, 14, 1105062. [Google Scholar] [CrossRef] [PubMed]
  97. Houri, O.; Gil, Y.; Krispin, E.; Amitai-Komem, D.; Chen, R.; Hochberg, A.; Wiznitzer, A.; Hadar, E. Predicting adverse perinatal outcomes among gestational diabetes complicated pregnancies using neural network algorithm. J. Matern.-Fetal Neonatal Med. 2023, 36, 2286928. [Google Scholar] [CrossRef] [PubMed]
  98. Kadambi, A.; Fulcher, I.; Venkatesh, K.; Schor, J.S.; Clapp, M.A.; Wen, T. Predicting the Risk of Gestational Diabetes Using Clinical Data with Machine Learning: A Predictive Model Study. Am. J. Obstet. Gynecol. MFM 2023, 5, 100965. [Google Scholar] [CrossRef]
  99. Watanabe, M.; Eguchi, A.; Sakurai, K.; Yamamoto, M.; Mori, C.; Japan Environment Children’s Study (JECS) Group. Prediction of gestational diabetes mellitus using machine learning from birth cohort data of the Japan Environment and Children’s Study. Sci. Rep. 2023, 13, 21410. [Google Scholar] [CrossRef]
  100. Zhou, M.; Ji, J.; Xie, N.; Chen, D. Prediction of birth weight in pregnancy with gestational diabetes mellitus using an artificial neural network. J. Zhejiang Univ. Sci. B 2022, 23, 459–470. [Google Scholar] [CrossRef]
  101. Kumar, M.; Ang, L.T.; Png, H.; Ng, M.; Tan, K.; Loy, S.L.; Tan, K.H.; Chan, J.K.Y.; Godfrey, K.M.; Chan, S.-Y.; et al. Automated Machine Learning (AutoML)-Derived Preconception Predictive Risk Model to Guide Early Intervention for Gestational Diabetes Mellitus. Int. J. Environ. Res. Public Health 2022, 19, 6792. [Google Scholar] [CrossRef]
  102. Kumar, M.; Chen, L.; Tan, K.; Ang, L.T.; Ho, C.; Wong, G.; Soh, S.-E.; Tan, K.H.; Chan, J.K.Y.; Godfrey, K.M.; et al. Population-Centric Risk Prediction Modeling for Gestational Diabetes Mellitus: A Machine Learning Approach. Diabetes Res. Clin. Pract. 2022, 185, 109237. [Google Scholar] [CrossRef]
  103. Yang, J.; Clifton, D.; Hirst, J.E.; Kavvoura, F.K.; Farah, G.; Mackillop, L.; Lu, H. Machine learning-based risk stratification for gestational diabetes management. Sensors 2022, 22, 4805. [Google Scholar] [CrossRef]
  104. Liao, L.D.; Ferrara, A.; Greenberg, M.B.; Boggess, K.; Njoroge, J.; Zhang, Z.; Bradshaw, P.T.; Hubbard, A.E.; Zhu, Y. Development and Validation of Prediction Models for Gestational Diabetes Treatment Modality Using Supervised Machine Learning: A Population-Based Cohort Study. BMC Med. 2022, 20, 307. [Google Scholar] [CrossRef]
  105. Du, Y.; Rafferty, A.R.; McAuliffe, F.M.; Wei, L.; Mooney, C. An explainable machine learning-based clinical decision support system for prediction of gestational diabetes mellitus. Sci. Rep. 2022, 12, 11272. [Google Scholar] [CrossRef]
  106. Araya, J.; Rodriguez, A.; Lagos-SanMartin, K.; Mennickent, D.; Gutiérrez-Vega, S.; Ortega-Contreras, B.; Valderrama-Gutiérrez, B.; Gonzalez, M.; Farías-Jofré, M.; Guzmán-Gutiérrez, E. Maternal thyroid profile in first and second trimester of pregnancy is correlated with gestational diabetes mellitus through machine learning. Placenta 2021, 112, 19–26. [Google Scholar] [CrossRef] [PubMed]
  107. Liu, H.; Li, J.; Leng, J.; Wang, H.; Liu, J.; Li, W.; Liu, H.; Wang, S.; Ma, J.; Chan, J.C.; et al. Machine learning risk score for prediction of gestational diabetes in early pregnancy in Tianjin, China. Diabetes Metab. Res. Rev. 2023, 39, e3397. [Google Scholar] [CrossRef] [PubMed]
  108. Ahmadzia, H.K.; Dzienny, A.C.; Bopf, M.; Phillips, J.M.; Federspiel, J.J.; Amdur, R.; Rice, M.M.; Rodriguez, L. Machine Learning Models for Prediction of Maternal Haemorrhage and Transfusion: Model Development Study. JMIR Bioinform. Biotechnol. 2024, 5, e52059. [Google Scholar] [CrossRef] [PubMed]
  109. Wang, M.; Yi, G.; Zhang, Y.; Li, M.; Zhang, J. Quantitative Prediction of Postpartum Haemorrhage in Cesarean Section on Machine Learning. BMC Med. Inform. Decis. Mak. 2024, 24, 166. [Google Scholar] [CrossRef]
  110. Holcroft, S.; Karangwa, I.; Little, F.; Behoor, J.; Bazirete, O. Predictive Modelling of Postpartum Haemorrhage Using Early Risk Factors: A Comparative Analysis of Statistical and Machine Learning Models. Int. J. Environ. Res. Public Health 2024, 21, 600. [Google Scholar] [CrossRef]
  111. Westcott, J.M.; Hughes, F.; Liu, W.; Grivainis, M.; Hoskins, I.; Fenyö, D. Prediction of Maternal Haemorrhage Using Machine Learning: Retrospective Cohort Study. J. Med. Internet Res. 2022, 24, e34108. [Google Scholar] [CrossRef]
  112. Liu, J.; Wang, C.; Yan, R.; Lu, Y.; Bai, J.; Wang, H.; Li, R. Machine learning-based prediction of postpartum haemorrhage after vaginal delivery: Combining bleeding high-risk factors and uterine contraction curve. Arch. Gynecol. Obstet. 2022, 306, 1115–1124. [Google Scholar] [CrossRef]
  113. Akazawa, M.; Hashimoto, K.; Katsuhiko, N.; Kaname, Y. Machine Learning Approach for the Prediction of Postpartum Haemorrhage in Vaginal Birth. Sci. Rep. 2021, 11, 22620. [Google Scholar] [CrossRef]
  114. Venkatesh, K.K.; Strauss, R.A.; Grotegut, C.A.; Heine, R.P.; Chescheir, N.C.; Stringer, J.S.A.; Stamilio, D.M.; Menard, K.M.; Jelovsek, J.E. Machine learning and statistical models to predict postpartum haemorrhage. Obstet. Gynecol. 2020, 135, 935–944. [Google Scholar] [CrossRef]
  115. Borycka, K.; Młyńczak, M.; Rosoł, M.; Korzeniewski, K.; Iwanowski, P.; Heřman, H.; Janku, P.; Uchman-Musielak, M.; Dosedla, E.; Diaz, E.G.; et al. Detection of obstetric anal sphincter injuries using machine learning-assisted impedance spectroscopy: A prospective, comparative, multicentre clinical study. Sci. Rep. 2025, 15, 392. [Google Scholar] [CrossRef] [PubMed]
  116. Hu, T.; Zhao, L.; Zhao, X.L.; He, L.; Zhong, X.; Yin, Z.; Chen, J.; Han, Y.; Li, K. Accurate prediction of mediolateral episiotomy risk during labor: Development and verification of an artificial intelligence model. BMC Pregnancy Childbirth 2025, 25, 370. [Google Scholar] [CrossRef]
  117. Boie, S.; Glavind, J.; Uldbjerg, N.; Steer, P.J.; Bor, P.; CONDISOX Trial Group. Continued versus discontinued oxytocin stimulation in the active phase of labour (CONDISOX): Individual management based on artificial intelligence—A secondary analysis. BMC Pregnancy Childbirth 2024, 24, 6461. [Google Scholar] [CrossRef] [PubMed]
  118. Wong, M.S.; Wells, M.; Zamanzadeh, D.; Akre, S.; Pevnick, J.M.; Bui, A.A.T.; Gregory, K.D. Applying automated machine learning to predict mode of delivery using ongoing intrapartum data in laboring patients. Am. J. Perinatol. 2023, 40, 577–585. [Google Scholar] [CrossRef] [PubMed]
  119. Kuanar, A.; Akbar, A.; Sujata, P.; Kar, D. Deep neural network modelling for prediction of the mode of delivery. Eur. J. Obstet. Gynecol. Reprod. Biol. 2024, 293, 84–90. [Google Scholar] [CrossRef]
  120. Xu, J.; Liu, Z.; Lu, Y.; Zheng, Z.; Zhang, X. A machine learning model to predict spontaneous vaginal delivery failure for term nulliparous women: An observational study. Int. J. Gynecol. Obstet. 2023, 162, 292–300. [Google Scholar] [CrossRef]
  121. Chen, G.; Bai, J.; Ou, Z.; Lu, Y.; Wang, H. PSFHS: Intrapartum ultrasound image dataset for AI-based segmentation of pubic symphysis and fetal head. Sci. Data 2024, 11, 3266.
  122. Liu, Y.S.; Lu, S.; Wang, H.B.; Hou, Z.; Zhang, C.Y.; Chong, Y.W.; Wang, S.; Tang, W.Z.; Qu, X.L.; Yan, Z. An evaluation of cervical maturity for Chinese women with labor induction by machine learning and ultrasound images. BMC Pregnancy Childbirth 2023, 23, 737.
  123. Lodi, M.; Poterie, A.; Exarchakis, G.; Brien, C.; Lafaye de Micheaux, P.; Deruelle, P.; Gallix, B. Prediction of cesarean delivery in class III obese nulliparous women: An externally validated model using machine learning. J. Gynecol. Obstet. Hum. Reprod. 2023, 52, 102624.
  124. Zhang, R.; Sheng, W.; Liu, F.; Zhang, J.; Bai, W. Establishment and validation of a machine learning-based prediction model for termination of pregnancy via cesarean section. Int. J. Gen. Med. 2023, 16, 5567–5578.
  125. D’Souza, R.; Doyle, O.; Miller, H.; Pillai, N.; Angehrn, Z.; Li, P.; Ispas-Jouron, S. Prediction of successful labor induction in persons with a low Bishop score using machine learning: Secondary analysis of two randomised controlled trials. Birth 2023, 50, e358–e366.
  126. Meyer, R.; Weisz, B.; Eilenberg, R.; Avgil-Tsadok, M.; Uziel, M.; Sivan, E.; Mazaki-Tovi, S.; Tsur, A. Utilizing machine learning to predict unplanned cesarean delivery. Int. J. Gynaecol. Obstet. 2022, 161, 255–263.
  127. Hu, T.; Du, S.; Li, X.; Yang, F.; Zhang, S.; Yi, J.; Xiao, B.; Li, T.; He, L. Establishment of a model for predicting the outcome of induced labor in full-term pregnancy based on machine learning algorithm. Sci. Rep. 2022, 12, 19179.
  128. Ghi, T.; Conversano, F.; Ramirez Zegarra, R.; Pisani, P.; Dall’Asta, A.; Lanzone, A.; Lau, W.; Vimercati, A.; Iliescu, D.G.; Mappa, I.; et al. Novel artificial intelligence approach for automatic differentiation of fetal occiput anterior and non-occiput anterior positions during labor. Ultrasound Obstet. Gynecol. 2022, 62, 271–280.
  129. Islam, M.S.; Awal, M.A.; Laboni, J.N.; Pinki, F.T.; Karmokar, S.; Mumenin, K.M.; Al-Ahmadi, A.; Rahman, M.A.; Hossain, M.S.; Mirjalili, S.; et al. HGSORF: Henry Gas Solubility Optimization-based Random Forest for C-Section prediction and XAI-based cause analysis. Comput. Biol. Med. 2022, 147, 105671.
  130. Chill, H.H.; Guedalia, J.; Lipschuetz, M.; Shimonovitz, T.; Unger, R.; Shveiky, D.; Karavani, G. Prediction model for obstetric anal sphincter injury using machine learning. Int. Urogynecol. J. 2022, 33, 1893–1901.
  131. Ullah, Z.; Saleem, F.; Jamjoom, M.; Fakieh, B. Reliable prediction models based on enriched data for identifying the mode of childbirth by using machine learning methods: Development study. JMIR Med. Inform. 2021, 9, e28856.
  132. Guedalia, J.; Sompolinsky, Y.; Novoselsky Persky, M.; Cohen, S.M.; Kabiri, D.; Yagel, S.; Unger, R.; Lipschuetz, M. Prediction of severe adverse neonatal outcomes at the second stage of labour using machine learning: A retrospective cohort study. BJOG 2021, 130, 1927–1937.
  133. Tarimo, C.S.; Bhuyan, S.S.; Li, Q.; Mahande, M.J.J.; Wu, J.; Fu, X. Validating machine learning models for the prediction of labour induction intervention using routine data: A registry-based retrospective cohort study at a tertiary hospital in northern Tanzania. BMJ Open 2021, 11, e051925.
  134. Meyer, R.; Hendin, N.; Zamir, M.; Mor, N.; Levin, G.; Sivan, E.; Aran, D.; Tsur, A. Implementation of machine learning models for the prediction of vaginal birth after cesarean delivery. J. Matern.-Fetal Neonatal Med. 2020, 35, 3677–3683.
  135. Ricciardi, C.; Improta, G.; Amato, F.; Cesarelli, G.; Romano, M. Classifying the type of delivery from cardiotocographic signals: A machine learning approach. Comput. Methods Programs Biomed. 2020, 196, 105712.
  136. Beksac, M.S.; Tanacan, A.; Bacak, H.O.; Leblebicioglu, K. Computerized prediction system for the route of delivery (vaginal birth versus cesarean section). J. Perinat. Med. 2018, 46, 55–61.
  137. Fergus, P.; Selvaraj, M.; Chalmers, C. Machine learning ensemble modelling to classify caesarean section and vaginal delivery types using cardiotocography traces. Comput. Biol. Med. 2018, 93, 7–16.
  138. Devoe, L.D.; Samuel, S.; Prescott, P.; Work, B.A. Predicting the duration of the first stage of spontaneous labor using a neural network. J. Matern.-Fetal Investig. 1996, 6, 205–209.
  139. Nielsen, P.V.; Stigsby, B.; Nickelsen, C.; Nim, J. Intra- and inter-observer variability in the assessment of intrapartum cardiotocograms. Acta Obstet. Gynecol. Scand. 1987, 66, 421–424.
  140. Arduini, D.; Rizzo, G.; Romanini, C. Computerized analysis of fetal heart rate. J. Perinat. Med. 1994, 22 (Suppl. S1), 22–27.
  141. Stout, M.J.; Cahill, A.G. Electronic fetal monitoring: Past, present, and future. Clin. Perinatol. 2011, 38, 127–142.
  142. Georgieva, A. OxSys: Integrating clinical risk factors into computerized CTG analysis. Ultrasound Obstet. Gynecol. 2017, 50, 371–378.
  143. Baumert, M. Phase-rectified signal averaging for CTG feature extraction. IEEE Trans. Biomed. Eng. 2013, 60, 1432–1440.
  144. Liu, S. Wavelet-based feature extraction of CTG signals for fetal compromise detection. Comput. Biol. Med. 2015, 65, 232–239.
  145. Long, Y. Nonclassic CTG features and composite metrics: Systematic review. J. Matern.-Fetal Neonatal Med. 2019, 32, 543–551.
  146. Steer, P.J. Early indicators of fetal compromise: Late preterm and small for gestational age associations. BJOG 2012, 119, e116–e124.
  147. Park, C.E.; Choi, B.; Park, R.W.; Kwak, D.W.; Ko, H.S.; Seong, W.J.; Cha, H.-H.; Kim, H.M.; Lee, J.; Seol, H.-J.; et al. Automated interpretation of cardiotocography using deep learning in a nationwide multicenter study. Sci. Rep. 2025, 15, 19617.
  148. Blencowe, H.; Cousens, S.; Oestergaard, M.Z.; Chou, D.; Moller, A.B.; Narwal, R.; Adler, A.; Vera Garcia, C.; Rohde, S.; Say, L.; et al. National, regional, and worldwide estimates of preterm birth rates in the year 2010 with time trends since 1990 for selected countries: A systematic analysis and implications. Lancet 2012, 379, 2162–2172.
  149. Yagel, S.; Cohen, S.M.; Admati, I.; Skarbianskis, N.; Solt, I.; Zeisel, A.; Beharier, O.; Goldman-Wohl, D. Expert review: Preeclampsia type I and type II. Am. J. Obstet. Gynecol. MFM 2023, 5, 101203.
  150. Roberts, J.M.; Cooper, D.W. Pathogenesis and genetics of pre-eclampsia. Lancet 2001, 357, 53–56.
  151. Redman, C.W.G.; Sargent, I.L. Placental stress and pre-eclampsia: A revised view. Placenta 2009, 30, S38–S42.
  152. O’Gorman, N.; Wright, D.; Poon, L.C.; Rolnik, D.L.; Syngelaki, A.; Wright, A.; Akolekar, R.; Cicero, S.; Janga, D.; Jani, J.; et al. Accuracy of competing-risks model in screening for pre-eclampsia by maternal factors and biomarkers at 11–13 weeks’ gestation. Ultrasound Obstet. Gynecol. 2017, 49, 751–755.
  153. Ferrara, A. Increasing prevalence of gestational diabetes mellitus: A public health perspective. Diabetes Care 2007, 30 (Suppl. S2), S141–S146.
  154. HAPO Study Cooperative Research Group; Metzger, B.E.; Lowe, L.P.; Dyer, A.R.; Trimble, E.R.; Chaovarindr, U.; Coustan, D.R.; Hadden, D.R.; McCance, D.R.; Hod, M.; et al. Hyperglycemia and adverse pregnancy outcomes. N. Engl. J. Med. 2008, 358, 1991–2002.
  155. Royal College of Obstetricians and Gynaecologists. Green-Top Guideline No. 63: Management of Women with Diabetes in Pregnancy; RCOG Press: London, UK, 2015.
  156. American College of Obstetricians and Gynecologists. Practice bulletin No. 190: Gestational diabetes mellitus. Obstet. Gynecol. 2018, 131, e49–e64.
  157. Diabetes Canada Clinical Practice Guidelines Expert Committee. Diabetes and pregnancy. Can. J. Diabetes 2018, 42 (Suppl. S1), S255–S282.
  158. Clausen, T.D.; Mathiesen, E.R.; Hansen, T.; Pedersen, O.; Jensen, D.M.; Lauenborg, J.; Damm, P. High prevalence of type 2 diabetes and pre-diabetes in adult offspring of women with gestational diabetes mellitus or type 1 diabetes: The role of intrauterine hyperglycemia. Diabetes Care 2008, 31, 340–346.
  159. Bláha, J.; Bartošová, T. Epidemiology and Definition of PPH Worldwide. Best Pract. Res. Clin. Anaesthesiol. 2022, 36, 325–339.
  160. Hancock, A.; Weeks, A.D.; Lavender, D.T. Is Accurate and Reliable Blood Loss Estimation the “Crucial Step” in Early Detection of Postpartum Haemorrhage: An Integrative Review of the Literature. BMC Pregnancy Childbirth 2015, 15, 230.
  161. American College of Obstetricians and Gynecologists. Quantitative Blood Loss in Obstetric Haemorrhage: ACOG Committee Opinion No. 794. Obstet. Gynecol. 2019, 134, e150–e156.
  162. Le Bihan, L.; Nowak, E.; Anouilh, F.; Tremouilhac, C.; Merviel, P.; Tromeur, C.; Robin, S.; Drugmanne, G.; Le Roux, L.; Couturaud, F.; et al. Development and Validation of a Predictive Tool for Postpartum Haemorrhage after Vaginal Delivery: A Prospective Cohort Study. Biology 2023, 12, 54.
  163. Grobman, W.A.; Lai, Y.; Landon, M.B.; Spong, C.Y.; Leveno, K.J.; Rouse, D.J.; Varner, M.W.; Moawad, A.H.; Caritis, S.N.; Harper, M.; et al. Development of a nomogram for prediction of vaginal birth after cesarean delivery. Obstet. Gynecol. 2007, 109, 806–812.
  164. Macones, G.A.; Hausman, N.; Edelstein, R.; Stamilio, D.M.; Marder, S.J. Predicting outcomes of trials of labor in women attempting vaginal birth after cesarean delivery: A comparison of multivariate methods with neural networks. Am. J. Obstet. Gynecol. 2001, 184, 409–416.
  165. World Health Organization. Ethics and Governance of Artificial Intelligence for Health; WHO: Geneva, Switzerland, 2021; Available online: https://www.who.int/publications/i/item/9789240029200 (accessed on 15 July 2025).
  166. Organisation for Economic Co-Operation and Development (OECD). OECD Framework for the Classification of AI Systems: A Tool for Effective Policy Making; OECD: Paris, France, 2024; Available online: https://oecd.ai (accessed on 29 September 2025).
  167. U.S. Department of Health and Human Services. Standards for Privacy of Individually Identifiable Health Information (“HIPAA Privacy Rule”); 45 CFR Parts 160 and 164; HHS: Washington, DC, USA, 2013.
  168. U.S. Food and Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning-Based Software as a Medical Device (SaMD)—Discussion Paper and Request for Feedback; FDA: Silver Spring, MD, USA, 2019. Available online: https://www.fda.gov/media/122535/download (accessed on 29 September 2025).
  169. U.S. Food and Drug Administration. Artificial Intelligence and Machine Learning (AI/ML) Software as a Medical Device (SaMD) Action Plan; FDA: Silver Spring, MD, USA, 2021. Available online: https://www.fda.gov/media/145022/download (accessed on 29 September 2025).
  170. European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 (General Data Protection Regulation); OJ L 119; European Union: Brussels, Belgium, 2016; pp. 1–88.
  171. European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence (AI Act); OJ L; European Union: Brussels, Belgium, 2024.
  172. U.S. Food and Drug Administration. Investigational Device Exemptions (IDE) for Early Feasibility Medical Device Clinical Studies, Including Certain First in Human (FIH) Studies; Guidance for Industry and FDA Staff; FDA: Silver Spring, MD, USA, 2013.
  173. European Union. Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on Medical Devices; OJ L 117; European Union: Brussels, Belgium, 2017; pp. 1–175.
  174. Rivera, S.C.; Liu, X.; Chan, A.W.; Denniston, A.K.; Calvert, M.J.; SPIRIT-AI and CONSORT-AI Working Group. Guidelines for clinical trial protocols for interventions involving artificial intelligence: The SPIRIT-AI extension. Nat. Med. 2020, 26, 1351–1363.
  175. Liu, X.; Cruz Rivera, S.; Moher, D.; Calvert, M.J.; Denniston, A.K.; CONSORT-AI and SPIRIT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: The CONSORT-AI extension. BMJ 2020, 370, m3164.
Figure 1. PRISMA flow diagram for identification of included studies.
Table 1. AI applications in cardiotocography.

| Reference | Country | Data Source | Best-Performing AI Model | Performance Metrics | Clinical Utility |
| --- | --- | --- | --- | --- | --- |
| McCoy et al. (2025) [12] | USA | Internal: 124,777 CTGs; External: 552 CTGs | IncTime architecture | AUC: 0.85 (pH < 7.05); 0.89 (pH < 7.05 + base excess < −10); Sens: 90%, Spec: 48% (for PPV 30%) | DL system for intrapartum detection of foetal acidemia |
| M’Barek et al. (2025) [13] | France | 27,662 CTGs | CNN with pretraining and combined FHR + UC inputs | AUC (Severe acidemia): 0.74–0.83. AUC (Moderate + Severe): 0.70–0.83. Improved vs. DeepCTG 1.0 by ~0.05 AUC | Improved detection of neonatal acidemia compared to traditional and earlier DL models |
| Gumilar et al. (2025) [14] | Indonesia | 7 CTGs | GPT-4o | Mean performance scores (0–100 scale): SHDs: 80.43, GPT-4o: 77.86, Gemini: 57.14, Copilot: 47.29; CG4o surpassed others. | Promising tool for aiding less experienced clinicians in CTG interpretation. |
| Roozbeh et al. (2025) [15] | Iran | 7166 deliveries | RF | RF: AUC 0.77, Acc 0.77, Prec 0.72 | Supports early identification of NFH risk using routine clinical data. |
| Zhao et al. (2024) [16] | China | 552 CTGs | SE-TCN with CMFF (MHA) | Acc: 96.8%; Sens: 96.0%; Spec: 97.5%; Prec: 97.5%; F1-Score: 96.7% | Automates foetal acidosis diagnosis. |
| Tarvonen et al. (2024) [17] | Finland | 4988 CTGs | SALKA | Cohen’s Kappa: 0.981; Sens: 0.981; PPV: 0.822; False-negative rate: 0.01 | Enables automated, real-time HRV detection comparable to experts, especially in neonatal acidemia cases. |
| Mushtaq et al. (2024) [18] | India | 2126 CTGs | DNN | Acc 0.99; Sens 0.93; Spec 0.93; AUC 0.96, Precision 0.93 | High-performance, interpretable tool for CTG classification. |
| Melaet et al. (2024) [19] | Netherlands | 678 CTGs (train n = 548; validation n = 87) | Patient-specific FHR predictor NN | AUC 0.96 for distinguishing normal vs. pathological segments | Enables earlier prediction of foetal compromise. |
| Wahbah et al. (2024) [20] | Japan and USA | 70 pregnant women | BiLSTM-based DL framework with signal enhancement techniques | Subject-dependent accuracy: 94.2%, F1-score: 0.97; Subject-independent accuracy: 88.8%, F1-score: 0.96 | Enables accurate noninvasive foetal ECG extraction and foetal heart rate estimation. |
| Mendis et al. (2024) [21] | Australia | 552 CTGs | FHR-LINet | 25% reduction in the time taken to detect foetal compromise compared to the state-of-the-art multimodal CNN | Enables earlier prediction of foetal compromise |
| Li J et al. (2024) [22] | China | Source domain: 16,355 CTGs; Target domain: 3351 CTGs | DSSDA-MMEDI (GoogLeNet with MME, DI, and DGMI integration) | Acc: 80.14%, Sens: 74.52%, Spec: 83.22%, F1-score: 72.67%, Kappa: 57.08%, MCC: 57.13%, AUC: 0.8502 | Enables earlier prediction of foetal compromise. |
| Das et al. (2023) [23] | Bangladesh | 125 CTGs | MLP with fuzzy annotations | Acc: 97.94%, ROC: 0.999 | Accurately identifies Early, Late, and Variable decelerations. |
| Liang et al. (2023) [24] | China | 552 CTGs, enhanced to 4738 samples | 1D-CNN + BiGRU hybrid | Acc: 95.15%, Sens: 96.20%, Spec: 94.09%, F1-score: 95.20%, AUC: 99.29% | Real-time CTG classification and hypoxia risk detection |
| Zhou Z et al. (2023) [25] | China | 552 CTGs | TGLCN | Acc: 89.80% | Improves classification accuracy and interpretability of CTG. |
| Lee KS et al. (2023) [26] | South Korea | 5249 CTGs, 141,001 5-min samples | 2D ResNet CNN | Sens: 98.0%, Spec: 99.5%, F1-score: 98.7% | Enables real-time foetal health classification via mobile app and server; supports remote antepartum monitoring |
| M’Barek I et al. (2023) [27] | France | 1527 CTGs | LR | AUC: CTU-UHB (0.743), Beaujon (0.739), SPaM (0.768–0.873); improved specificity (12% FPR vs. 25% obstetricians) | Enables earlier prediction of foetal compromise. |
| Cao Z et al. (2023) [28] | China | 16,355 CTGs | CNN for CTG + LGBM for classification using multimodal features | Acc: 90.77%, AUC: 0.9201, Normal-F1: 0.9376, Abnormal-F1: 0.8223, Prec: 82.83%, Spec: 93.15% | Supports early and intelligent antepartum screening. |
| Daydulo et al. (2022) [29] | Ethiopia | 552 CTGs | ResNet-50 with Morse wavelet transform | 1st stage labour: Acc 98.7%, Sens 97.0%, Spec 100%; 2nd stage labour: Acc 96.1%, Sens 94.1%, Spec 97.7% | Reliable, automated FHR analysis for both early and late labour stages. |
| Spairani et al. (2022) [30] | Italy | 14,000 ambulatory non-stress CTGs | Hybrid NN | Acc: 80.1%, AUC 0.81; Sens 69%; Spec 92% | Enables earlier prediction of foetal compromise. |
| Boudet et al. (2022) [31] | France | 635 CTGs | FSDop model (GRU with data augmentation and time delay correction) | Sens: 93.1%, PPV: 95.6%, Acc: 99.68%, AUC: 0.9992. | Improves detection of false maternal heart rate signals in CTG; enhances preprocessing for foetal monitoring and DL-based foetal distress detection. |
| Frasch et al. (2021) [32] | USA | 36 CTGs | SSD (Single Shot MultiBox Detector) | Acc: 93.6%; Prec: 87%; Rec: 82.5% | Enables early detection of foetal compromise. |
| Fotiadou et al. (2021) [33] | Netherlands | Private dataset: 28 CTGs; Public: 68 CTGs | Ensemble of CNN-LSTM with HR reliability classifier | Private test set: MAE = 2.0 bpm, MSE = 49.4 bpm², PPA = 97.3%, Coverage = 87.9%; PhysioNet: MAE = 1.1 bpm, MSE = 6.9 bpm², PPA = 99.6%, Coverage = 82%. | Improves monitoring robustness by identifying unreliable segments |
| Liu LC et al. (2021) [34] | Taiwan | 3239 22-min CTGs (2605 for training/validation; 634 for testing) | Modified FCN | AUC 0.892; κ 0.525; sensitivity 0.528; FPR 0.632 | Enables early detection of foetal compromise. |
| Signorini MG et al. (2020) [35] | Italy | 120 CTGs | RF | AUC 0.974; Sens 0.891; Spec 0.870; PPV 0.891; NPV 0.899 | Provides an interpretable, early antenatal IUGR screening tool from routine CTG. |
| Hoodbhoy et al. (2019) [36] | Pakistan | 2126 CTGs | XGBoost | XGBoost: Prec >92% for pathological class; high precision (>96%) for suspect and pathological in training data | Useful for identifying high-risk pregnancies in low-resource settings. |
| Zhao Z et al. (2019) [37] | China & Turkey | 552 CTGs | 8-layer CNN using RP images | Acc: 98.69%, Sens: 99.29%, Spec: 98.10%, AUC: 98.70% | DL model for automated foetal hypoxia prediction in clinical settings |
| Cömert et al. (2019) [38] | Turkey | 552 CTGs | SVM with a reduced feature set of 12 relevant features | Sens: 77.40%, Spec: 93.86% | Potential of combining feature selection algorithms with ML models to improve the prediction of foetal hypoxia. |
| Zhao Z et al. (2019) [39] | China | 552 CTGs | 2D CNN with 5×5 kernel, 15 filters, image resolution 64×64 | Acc: 98.34%, Sens: 98.22%, Spec: 94.87%, Quality Index: 96.53%, AUC: 97.82% | Enables early detection of foetal compromise. |
| Tang H et al. (2018) [40] | China | 24,360 twenty-minute FHR time-series samples | MKNet (CNN) | MKNet: Acc 94.7%; AUC 0.95; MKRNN: Acc 90.3%; AUC 0.91 | Real-time automated FHR interpretation on portable devices. |
| Leonarduzzi R et al. (2015) [41] | France | 3049 CTGs | Sparse SVM with p = 0.25 | AUC: 0.71; Sens: 0.70; Spec: 0.70 | Improves foetal acidosis detection during labour via advanced signal complexity analysis. |
| Maeda K et al. (2010) [42] | France | 29 CTGs | ANN | Acc: 86% (internal test on 29 cases); Sens, Spec, PPV, NPV: all 100% for neural index | Provides a fully numeric, objective FHR analysis framework. |
| Salamalekis E et al. (2002) [43] | Greece | 61 CTGs | Self-Organising Map neural network | Sens 83.3%; Spec 97.9% for identifying acidemic fetuses (umbilical pH < 7.20) | Enables early detection of foetal compromise. |
| Liszka-Hackzell JJ. et al. (2001) [44] | Sweden | 34 CTGs for training; 38 CTGs for testing | Hybrid SOM-BP model using CTG-derived feature vectors | High accuracy | Early demonstration of AI use in CTG pattern recognition. |
| Kol S et al. (1995) [45] | Israel | Nonstress test records | ANN | Sens: 88.9%; FPR: 4.3% | Evaluates ANN for nonstress tests. |
| Keith RD et al. (1994) [46] | United Kingdom | 50,000 five-minute CTG segments | Back-propagation NN on deceleration subtask | NN5 agreement with experts: ~75% vs. System 8000: ~47%; convergence in ~24 h for deceleration magnitude classification | Automated feature extraction to support an expert-system for real-time labour decision support |
Table 2. AI in prediction of preterm delivery.

| Reference | Country | Data Source | Best-Performing AI Model | Performance Metrics | Predictors | Clinical Utility |
| --- | --- | --- | --- | --- | --- | --- |
| Kloska A et al. (2025) [47] | Poland | 28 preterm, 22 term deliveries | Boosted Linear SVM | Acc 82%, Precision 83%, Recall 86%, F1-score 84% | CBC (WBC, PLT, Hb, HCT), CRP, BMI, parity, gestational diabetes, education level, etc. | Early detection of PTB using low-cost and routinely collected clinical data |
| Ohtaka A et al. (2024) [48] | Japan | 30 preterm, 29 term deliveries | Xception CNN | Acc 0.718, AUC 0.704. VGG16: acc 0.654, Recall 0.808 | Segmented transvaginal ultrasound images of the cervix at admission | Image-based prediction of PTB in high-risk pregnancies |
| Bitar G et al. (2024) [49] | USA | 12,440 deliveries | XGBoost | Derivation cohort AUC: 0.70; Validation cohort AUC: 0.63 | Multiple gestation, number of emergency department visits in the year prior to the index pregnancy, initial body mass index, gravidity, prior preterm delivery | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Ushida T et al. (2023) [50] | China | 31,157 infants <32 weeks GA and ≤1500 g | GBDT | AUROC: In-hospital death: 0.855; Short-term adverse outcomes: 0.750; Medium-term adverse outcomes: 0.701 | 12 antenatal variables: maternal age, gestational age, parity, delivery mode, diabetes, HDP, chorioamnionitis, PROM, ACS, foetal sex, birth weight, chorionicity | Improved predictive accuracy for mortality and neurological outcomes in extremely preterm infants using only antenatal variables. |
| Andrade Júnior VL et al. (2023) [51] | Brazil | 524 singleton pregnancies (18–24 weeks) | SBELM (NN stacking) | At 10% FPR: AUC 0.808, Sens 47.3%, Spec 92.8%, PPV 32.7%, NPV 96.0%. | Cervical funneling, cervical length, index (CL/internal angle), previous PTB < 37 w, previous curettage, weight ≤ 58 kg, non-smoker status, absence of antibiotics use | Viable clinical tool for sPTB < 35 w screening during 2nd-trimester |
| Kokkinidis I et al. (2023) [52] | Greece | 375 pregnant women (128 PTB) | Voting ensemble (XGBoost, RF, MLP) | AUC: 0.84, acc: 81%; F1-score: 0.70 | 32 features: demographics, social history, obstetric history, and clinical screening variables | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Khan W et al. (2023) [53] | United Arab Emirates | 3509 deliveries | XGBoost | AUC 0.735 (parous), 0.723 (nulliparous) | 35 selected features including: prior PTB, caesarean history, pre-eclampsia, BMI at delivery, maternal age, placenta previa, amniotic infection, physical activity, smoking | Personalised PTB risk interpretation for parous and nulliparous women. |
| Zhang Y et al. (2023) [54] | China | 5411 deliveries | AdaBoost | Acc 0.954, Recall 0.985, Precision 0.963, F1-score 0.969, AUC 0.93. | 21 EHR-derived features including parity, placenta previa, PPROM, diabetes, multiple gestation, etc. | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Sun Q et al. (2022) [55] | China | 9550 deliveries (4775 PTB, 4775 controls) | RF | Acc 0.816, AUC 0.891 (95% CI: 0.871–0.901), Sens 0.751, Spec 0.882 | Age, magnesium, fundal height, MPV, waist size, total cholesterol, triglycerides, WBC count, and several others from blood/urine/physical exams | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Wong K et al. (2022) [56] | Australia | ≈ 953,000 births, 8.6% PTB | MLP | At 5% FPR (90% spec): MLP AUC 86.43%, F1 50.44%, Sens 52.69%, PPV ≈ 48% | Maternal socio-demographics, chronic conditions, pregnancy complications, past obstetric history, family history | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Zhou Y et al. (2022) [57] | China | 65,565 deliveries | GAM | U-shaped FT4–PTB association (p < 0.001); low FT4: HR 1.34 (95% CI 1.13–1.59); high FT4: HR 1.41 (95% CI 1.13–1.76) | First-trimester maternal FT4 | Enables early risk stratification of PTB based on non-linear FT4 associations to inform surveillance and intervention planning. |
| Park S et al. (2022) [58] | South Korea | 94 deliveries (38 PTB, 56 term deliveries) | SVM with bacterial risk scores and white blood cell (WBC) data | Sens: 71% (bacterial risk score only), 77% (with WBC data). Spec: 59% (bacterial risk score only), 67% (with WBC data) | Bacterial risk scores from cervicovaginal fluid, focusing on the ratios of Lactobacillus iners and Ureaplasma parvum | Potential for non-invasive prediction of PTB using cervicovaginal fluid bacterial profiles. |
| Rawashdeh H et al. (2020) [59] | Australia | 274 cervical cerclage cases | RF (both classification and regression) | Classification task (delivery before 26 weeks): acc 95%, Sens 100%, G-mean 0.96, AUC 0.98. | Maternal age, parity, previous PTB/miscarriages, cervical length, cervical status, progesterone use, symptoms, multiple gestation, uterine anomalies, indication/type of cerclage | (1) Pre-cerclage counseling tool for PTB risk before 26 weeks, (2) Timeline prediction for delivery to optimise neonatal ICU preparedness and care planning. |
| Gao C et al. (2019) [60] | USA | 25,689 deliveries | Ensemble of LSTM-WORD2VEC models (trained on 30 balanced datasets) | AUC 0.827, Sens 0.965, Spec 0.698, PPV 0.033. | Temporal EHR medical concepts (diagnoses, procedures, meds, labs) before 20 weeks GA | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Elaveyini U et al. (2011) [61] | India | 50 women with first trimester bleeding | ANN with 7 input neurons | acc: 70% | Maternal age, gestational age at bleeding, duration, amount, episodes of bleeding, presence of hematoma, placental location | PTB risk stratification in pregnancies with first trimester bleeding. |
| Catley C et al. (2006) [62] | Canada | ~48,000 deliveries, verified on 19,710 deliveries | ANN with two hidden layers and weight elimination technique | Sens: 54.8%, Spec: 85.1–92.9%, AUC up to 0.73 | 8 obstetrical variables: maternal age, parity, previous term births, previous PTBs, multiple gestation, fetus’s gender, intention to breastfeed, smoking after 20 weeks | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Goodwin LK et al. (2001) [63] | USA | 19,970 deliveries | Custom classifier (statistical + case-based + CART hybrid) | Custom classifier on demographic only (7 variables): AUC 0.72; All variables: AUC 0.75 | Maternal age and binary coding for county of residence, education, marital status, payer source, race, and religion demographic characteristics | Early detection of preterm birth using low-cost and routinely collected clinical data |
| Woolery LK et al. (1994) [64] | USA | 18,890 deliveries | LERS (Rough Set Theory-based Rule Induction) | Acc (expert system using LERS rules): Database 1: 88.8%, Database 2: 59.2%, Database 3: 53.4% | 214 variables including demographics, high-risk factors; medical and intervention history, ICD-9 codes | Early detection of preterm birth using low-cost and routinely collected clinical data |
Table 3. AI in pre-eclampsia.
Table 3. AI in pre-eclampsia.
ReferenceCountryData SourceBest-Performing AI ModelPerformance MetricsPredictorsClinical Utility
Zheng W. et al. (2025)
[65]
ChinaSagittal T2-weighted placental MRI from 420 pregnancies (140 PE, 280 normotensive)LR on fused radiomic + DL featuresDice (segmentation): 0.917; AUC (PE vs. normotensive): train 0.839, test 0.858, internal val 0.888, external val 0.843Radiomic wavelet, shape, texture features; five deep-learning componentsAutomated placental MRI analysis to identify PE and stratify FGR risk.
Wang Z et al. (2025)
[66]
ChinaGEO microarrays cohortsRFAUC 0.792 (test)11 IRDEGs (ADIPOR2, CD72, DDX17, FGF11, LCN6, NEDD4, NR1D1, NR2C1, RXRG, TMSB4X, VEGFA)Blood-based 11-gene panel for early PE prediction and insight into immune dysregulation mechanisms
Liu X et al. (2025) [67] | China | GEO microarray cohorts | XGBoost | AUC 0.792 | CRKL; STK31; HTRA4; EPHB3; PAPPA2 | Potential diagnostic biomarkers and targets for PE.
Lv B et al. (2025) [68] | China | EHR data from 1040 women (PE incidence 6.8%) | XGBoost | Training AUC 0.963, F1 0.554; Test AUC 0.936, F1 0.488 | Pre-pregnancy BMI; pregnancies count; MAP; smoking; AFP MoM; conception method | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring.
da Silva SMS et al. (2025) [69] | Brazil | 30 pregnant women (15 PE, 15 controls) and 30 matched newborn samples | PLS-DA | Newborn vs. pregnant: 99.7% acc using 10 wavenumbers; PE vs. control (newborn): ≤63% acc even with 100 features; maternal PE vs. control: <55% accuracy | Wavenumbers corresponding to carotenoids, DNA/RNA (PO2), collagen/proteins, lipids/fatty acids | Demonstrates feasibility of screening for hypertensive pregnancy via plasma Fourier-transform infrared (FT-IR) spectroscopy.
Eberhard BW et al. (2024) [70] | USA | EHR data from 66,425 deliveries | Modified DeepHit deep survival NN | Time-dependent concordance index (Ctd): 0.839; Time-dependent AUC: 0.824; Overall survival AUC: 0.778 | Age; race/ethnicity; chronic hypertension; parity; SBP/DBP; heart rate; platelets; creatinine; engineered temporal features up to 20 weeks’ gestation | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Zhou T. et al. (2024) [71] | China | Retinal fundus photographs obtained before 20 weeks’ gestation in 1138 singleton pregnancies | Inception-ResNet-v2 CNN | AUC 0.883, Sens 0.722, Spec 0.934 | Retinal vascular features encoded in fundus score (reflecting microvascular changes), plus maternal age, BMI, parity, chronic hypertension, prepregnancy BMI category | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Vasilache I-A et al. (2024) [72] | Romania | EHR data from 210 singleton pregnancies | RF | PE acc 96.3%; IUGR 95.9%; early IUGR 96.2%; late IUGR 95.2%; PE + IUGR association 95.1% (sens/spec ≥ 90%) | Maternal age; BMI; nulliparity; conception type; smoking; history of PE/IUGR/preterm birth/autoimmune/CKD/DM/HTN; MAP; β-HCG, PAPP-A, PlGF, PP-13 (all MoM) | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Bülez A et al. (2024) [73] | Turkey | EHR data from 10,307 women (1158 PE, 9194 controls) | LightGBM | Sens 73.7%; Spec 92.7%; Acc 90.6%; AUC 0.832 | Hemoglobin; age; AST; ALT; blood group; plus sociodemographics, vitals | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Kaya Y. et al. (2024) [74] | Turkey | EHR data from 100 women admitted in 1st trimester | XGBoost | Acc 70% (nulliparous), 72.7% (parous); AUC-ROC 0.64/0.767; Sens 80%/60%; Spec 60%/83.3% | Maternal age; BMI; smoking; history of DM, GDM, HTN, SLE-APS; gravida; parity; MAP; previous PE | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Tiruneh et al. (2024) [75] | Australia | EHR data from 48,250 women | RF | AUC 0.84, acc 0.79 | Maternal age; ethnicity; BMI; parity; prior PE history; nulliparity; history of GDM; pre-existing hypertension; diabetes; family history of hypertension/diabetes/PE; renal disease; smoking; PCOS | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Huang P et al. (2024) [76] | China | GEO datasets (80 PE, 77 controls); validation cohort (12 PE, 12 controls) | LR | AUC (3-gene model): 0.871; individual genes AUCs > 0.70 | CPOX, DEGS1, SH3BP5 gene expression | Provides a 3-gene blood-based diagnostic signature enabling early, noninvasive PE detection.
Araújo DC et al. (2024) [77] | Brazil | EHR data from 132 women (65 severe PE, 67 controls) | LightGBM | AUROC 0.90 ± 0.10; Sens 0.95; Spec 0.79; Acc 0.87; Precision 0.82 | Neutrophils, mean corpuscular hemoglobin (MCH), aggregate index of systemic inflammation (AISI) | Supports third-trimester sPE diagnosis using routine CBC.
Li T et al. (2024) [78] | Not reported | EHR data from 4644 pregnancies (49 preterm PE, 161 term PE cases) | Voting Classifier | All PE: AUC 0.831; DR10 0.513; Preterm PE: AUC 0.884; DR10 0.625 | Maternal age, height, pre-pregnancy weight, parity, conception method, history of PE/HTN/CKD/DM; MAP; UtA-PI; PAPP-A; PlGF | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Gil MM et al. (2024) [79] | Spain | EHR data from 10,110 1st trimester pregnancies | NN | Early PE DR 84.4%, AUC 0.920; Preterm PE DR 77.8%, AUC 0.913; All PE DR 55.7% (49.0–62.2), AUC 0.846 | Maternal factors, MAP, UtA-PI, PlGF | Enables non-MoM–based first-trimester screening for PE.
Edvinsson C. et al. (2024) [80] | Sweden | EHR data from 81 women (41 severe PE, 40 controls) | XGBoost | Test acc 0.82, AUC 0.85; Cross-val acc 0.88, AUC 0.91 | AST, uric acid, BMI | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Ansbacher-Feldman Z. et al. (2022) [81] | UK | EHR data from 60,789 1st trimester pregnancies | NN | PE: AUC 0.82; Preterm PE: AUC 0.91 | Maternal age, BMI, parity, prior PE, interpregnancy interval, race/ethnicity, IVF status; MAP; UtA-PI; PlGF; PAPP-A | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Villalaín C. et al. (2022) [82] | Spain | EHR data from 215 singleton early-onset PE cases | SVM | AUC 0.79; sens 77.3%; spec 80.1%; PPV 81.5%; NPV 76.2% | Age; BMI; prior PE; gestational age; SBP/DBP; platelets; creatinine; AST/ALT; sFlt-1, PlGF; Doppler indices; foetal biometry | Provides individualized risk of imminent delivery and severe complications in PE.
Liu M et al. (2022) [83] | China | EHR data from 11,152 pregnancies | RF | AUROC 0.86 (95% CI 0.80–0.92), acc 0.74, precision 0.82, recall 0.42, F1 0.56; Brier score 0.17, calibration slope 0.92, intercept 0.20 | Age; BMI; weight; height; GA; parity; chronic HTN; prior DM; prior PE; MAP; free β-hCG; PAPP-A; uterine artery PI | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Bennett R et al. (2022) [84] | USA | Texas units (360,943 deliveries, 3.98% PE), Oklahoma units (84,632 deliveries, 5.58% PE), and MOMI cohort (31,431 deliveries, 8.73% PE) | Cost-sensitive DNN with focal loss & weighted cross-entropy | AUC Texas: 0.66; AUC Oklahoma: 0.64; AUC External (MOMI): 0.77 | Demographics, comorbidities, prenatal labs, BMI, BP spikes, and temporal features (varies by dataset) | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Hoffman MK et al. (2021) [85] | USA | EHR data from 20,032 pregnancies | NN | At a 10% FPR: detects 53% of all PE cases (vs. 41% without biomarkers) and 75% of preterm PE (vs. 53% without) | Age, BMI, parity, prior PE, MAP, UtA-PI, PlGF, PAPP-A | Enables non-MoM–based first-trimester screening for overall and preterm PE.
Wang G. et al. (2021) [86] | China | EHR data from 907 women with PE | RF | AUC 0.711 (95% CI 0.697–0.726); acc 0.817; sens 0.815; spec 0.984; PPV 0.777; NPV 0.807 | SBP, BUN, neutrophil count, glucose, D-Dimer (top five of 20 clinical and lab features) | Identifies high-risk women for targeted CVD prevention and monitoring after PE.
Sufriyana H et al. (2020) [87] | Indonesia | EHR data from 3318 PE/eclampsia cases and 19,883 controls | RF | AUROC (external validation): geographical split 0.88; temporal split 0.86 (95% CI 0.85–0.86) | 17 features from demographics and medical history over 24 months (e.g., age, parity, comorbidities, prior hospitalizations) | Early PE risk stratification using routine antenatal data to guide aspirin prophylaxis and monitoring
Jhee JH et al. (2019) [88] | South Korea | EHR data from 11,006 pregnancies | SGB | AUC 0.924; acc 0.973 | SBP; DBP; BUN; creatinine; platelet count; WBC; calcium; UPCR; demographics, medical history, labs pattern-cluster features | Enables early prediction of late-onset PE.
Table 4. AI in gestational diabetes.
Reference | Country | Data Source | Best-Performing AI Model | Performance Metrics | Predictors | Clinical Utility
Bigdeli SK et al. (2025) [89] | Iran | EHR data from 16,730 pregnancies | RF | Insulin model: AUC 0.64; acc 0.62; precision 0.60; recall 0.63; GTT model: AUC 0.94; acc 0.89; precision 0.86; recall 0.92 | Demographics; medical history; clinical findings; first-trimester FBS, Hb, Hct, Cr, PLT, vit D3, NT sonographic markers | First-trimester GDM risk stratification
Zhao M. et al. (2025) [90] | China | EHR data from 103,172 pregnancies (15,138 GDM; 88,034 controls) | MLP with NearMiss | AUC 0.943; acc 0.884 | BMI; age; age of menarche; higher education; folic acid supplementation; family history of DM; HGB; WBC; PLT; Scr; HBsAg; ALT; ALB; TBIL | First-trimester GDM risk stratification
Zaky H et al. (2025) [91] | Qatar | EHR data from 138 pregnancies (63 GDM, 75 controls) | Stacking ensemble | Acc 88.8%; recall 92.1%; precision 87.3%; F1-score 89.6% | History of high glucose/diabetes, HbA1c%, glucose, insulin, NT-proBNP, lipids, electrolytes, blood counts, liver/renal markers, hormones, family history, vitamins | First-trimester GDM risk stratification
Zhou H et al. (2025) [92] | China | 2D ultrasound images at 11–13 weeks: discovery (n = 305; 139 GDM, 166 controls) and independent validation (n = 110; 53 GDM, 57 controls) | Nomogram (radiomics + DLCNN + clinical) | Discovery AUC 0.93, Validation AUC 0.88 | Radiomics features; age; pre-pregnancy BMI; DLCNN score | First-trimester GDM risk stratification
Chen M et al. (2024) [93] | China | EHR data from 588 women with two consecutive singleton deliveries and index-pregnancy GDM | LGB | AUROC 0.942 | First-trimester FPG, 1–2 h OGTT glucose, triglycerides, cholesterol, HbA1c, macrosomia, preterm birth, age > 35 y, abdominal circumference, gestational weight gain | First-trimester GDM risk stratification
Kaya et al. (2024) [94] | Turkey | EHR data from 97 pregnancies | XGB Classifier | Acc 66.7%, AUC 0.55; sens 80%, spec 50% | Age; BMI; gravida; parity; previous birth weight; smoking; first-visit plasma glucose; family history of DM | First-trimester GDM risk stratification
Cubillos G. et al. (2023) [95] | Chile | EHR data from 1611 pregnancies | MLP with optimised hyperparameters | Sens 0.82; Spec 0.72–0.74; Acc 0.73–0.75; AUCROC 0.81 | First-trimester fasting glycemia, age, BMI, weight, gravidity | First-trimester GDM risk stratification
Hu X et al. (2023) [96] | China | EHR data from 735 pregnancies (training set) and 190 pregnancies (testing set) | XGBoost | AUC 0.946; acc 0.875 | 20 first-trimester variables (e.g., previous GDM, age, HbA1c, MAP, lipids, liver enzymes) | First-trimester GDM risk stratification
Houri O et al. (2023) [97] | Israel | EHR data from 452 GDM pregnancies | NN | Acc: 82% at GDM diagnosis; 91% at delivery | Age; parity; gravidity; pre-pregnancy BMI; GCT; OGTT values; maternal weight (pre-preg, at diagnosis, at delivery); treatment type; glycemic control | First-trimester GDM risk stratification
Kadambi et al. (2023) [98] | USA | Monitoring Mothers-to-be (nuMoM2b) EHR data | LR | AUC 0.74 | Maternal race; BMI at first visit; prepregnancy BMI; family history of GDM; hypertension; valvular heart disease; structural heart disease; coronary artery disease; cardiac arrhythmia; polycystic ovary syndrome | First-trimester GDM risk stratification
Watanabe M. et al. (2023) [99] | Japan | EHR data from 82,698 GDM pregnancies | GBDT | AUC 0.67 for recurrent GDM; AUC 0.74 for new-onset GDM | 775 variables covering pre-pregnancy lifestyle, anthropometrics, smoking, diet, SF-8 QOL, K6 distress, lab values, etc. | First-trimester GDM risk stratification
Zhou M et al. (2022) [100] | China | 492 GDM pregnancies with 2D ultrasound scans within 3 days before delivery | ANN | MAE 153.5 g; MAPE 4.7%; ANN vs. Hadlock: MAE 148.5 g vs. 192.2 g (p < 0.001) | Foetal biometry from ultrasound; maternal anthropometrics | Enhances foetal weight estimation accuracy in GDM
Kumar M. et al. (2022) [101] | Singapore | S-PRESTO cohort (n = 222) | Gradient boosting classifier + linear SVM | AUC 0.93 | HbA1c, mean BP, fasting insulin, triglycerides/HDL ratio | Preconception risk stratification for GDM; deployable via web app
Kumar M. et al. (2022) [102] | Singapore | GUSTO mother-offspring cohort (n = 909) | CatBoost | AUC 0.82 | Mean arterial BP at booking; maternal age; previous history of GDM; ethnicity (Chinese/Indian vs. Malay) | First-trimester GDM risk stratification; deployable via web app
Yang J. et al. (2022) [103] | UK | OUH GDm-Health system: 1148 GDM pregnancies; external validation: 709 cases | XGBoost regression | Internal (OUH): MSE 0.021, R² 0.482, MAE 0.112; External (RBH): MSE 0.020, R² 0.519, MAE 0.108 | Pre-/post-breakfast, post-lunch, post-dinner glucose readings; engineered “High-Readings” and “Gradients”; maternal age; gestational day; medication status | Predicts short-term hyperglycemia risk to guide timely clinical monitoring and intervention.
Liao LD et al. (2022) [104] | USA | EHR data from 30,474 GDM pregnancies: discovery (n = 27,240) and validation (n = 3234) | Super learner (LASSO, CART, RF, XGBoost) | AUC 0.934 (discovery)/0.815 (validation) | Demographics, clinical history, OGTT/glucose challenge values, SMBG metrics, labs across four timepoints | Early triage for pharmacologic treatment of GDM
Du Y. et al. (2022) [105] | Ireland | EHR data from 484 overweight/obese women | SVM | AUC-ROC 0.792; AUC-PR 0.485; balanced ACC 0.751 | Family history DM; weight; WBC; fasting glucose; insulin | First-trimester GDM risk stratification in overweight/obese women.
Araya J. et al. (2021) [106] | Chile | EHR data from 39 pregnancies (33 NGT, 6 GDM) | Principal component analysis | Spontaneous clustering of GDM vs. NGT | FT4, TT3, TT4, TSH (1st & 2nd trimester); OGTT; diastolic blood pressure; prior GDM | Suggests thyroid hormone profiling may augment early GDM diagnosis beyond OGTT.
Liu H et al. (2020) [107] | China | EHR data from 19,331 pregnancies | XGBoost | AUC 0.742 | Fasting plasma glucose; pre-pregnancy BMI; alanine aminotransferase; maternal age; waist circumference; weight gain; family history of diabetes | First-trimester GDM risk stratification
Table 5. AI in postpartum haemorrhage.
Reference | Country | Data Source | Best-Performing AI Model | Performance Metrics | Predictors | Clinical Utility
Ahmadzia HK et al. (2024) [108] | USA | 228,438 deliveries | Gradient Boosting | ROC-AUC 0.833; PR-AUC 0.210 | 50 antepartum and intrapartum characteristics and hospital characteristics; top features: mode of delivery; oxytocin incremental dose for labour; intrapartum tocolytic use; presence of anaesthesia nurse; hospital type | Identification of high-risk PPH parturients to guide proactive interventions
Wang M et al. (2024) [109] | China | 6144 caesarean deliveries | RF | MAE 21.7 mL (<5.4% error); RMSE 33.75 mL (<9.3% error) on test set | 27 indicators: haemoglobin; WBC; platelets; PT; INR; APTT; TT; fibrinogen; Na; K; Cl; Ca; bilirubin; urea; creatinine; weight; height; infant weight; age; number of pregnancies; gestational week; blood pressures; complications; anaesthesia method; ASA class; emergency status; pregnancy days | Identification of high-risk PPH parturients during caesarean delivery to guide proactive interventions
Holcroft S. et al. (2024) [110] | Rwanda | 430 deliveries (108 PPH cases, 322 controls) | RF | Sens 80.7%, spec 71.3%, misclassification rate 12.19% | Haemoglobin level at labour; maternal age; no medical insurance; multiple foetuses; pre-labour bleeding; intrauterine foetal death; BMI; multiparity; history of PPH | Identifies women at high risk of PPH upon admission for targeted interventions
Westcott JM et al. (2022) [111] | USA | 30,867 deliveries | GBDT | AUROC: 0.979, Acc: 98.1%, Sens: 76.3% | 497 variables including demographics, obstetric/medical/surgical/family history, vital signs, lab results, labour medication exposures, and delivery outcomes | Identification of high-risk PPH parturients to guide proactive interventions
Liu J et al. (2022) [112] | China | 10,520 vaginal deliveries | LGB + LR | AUC 0.803, Brier 0.061, F-measure 0.845, Sens 0.694, Spec 0.800 | 49 clinical variables (16 known high-risk factors + TOCO features such as contraction frequency, Mean_Area intensity; haematocrit; shock index; WBC; gestational hypertension; neonatal weight; second stage labour time; amniotic fluid volume; BMI; etc.) | Identification of high-risk PPH parturients after vaginal delivery
Akazawa M et al. (2021) [113] | Japan | 9894 vaginal deliveries (188 PPH cases) | LR | AUC 0.708; Acc 0.686; FPR 0.312; FNR 0.398 | 11 clinical variables: age; parity; maternal height; weight before pregnancy; weight on admission; gestational age; birthweight; baby sex; foetal position; oxytocin use; delivery mode | Identification of high-risk PPH parturients during vaginal delivery to guide proactive interventions
Venkatesh KK et al. (2020) [114] | USA | 152,279 deliveries | XGBoost | AUC ≈ 0.93 | 55 maternal risk factors available at labour admission (from literature and expert consensus), e.g., maternal demographics (age, race), obstetric history/diagnoses (placenta previa, foetal macrosomia, pre-eclampsia), comorbidities (chronic hypertension, diabetes), and initial vital signs | Identification of high-risk PPH parturients to guide proactive interventions
Table 6. AI in labour and delivery outcomes.
Reference | Country | Data Source | Best-Performing AI Model | Performance Metrics | Predictors | Clinical Utility
Borycka K et al. (2025) [115] | Czech Republic, Slovakia, Poland, Spain | Impedance spectroscopy and 3-D EAUS data from 152 deliveries | Ensemble tree-based ML model with 10-fold cross-validation | Overall acc 0.86; sens 0.67–0.95; spec 0.80–0.98 | Impedance-derived spectral features; age; BMI; parity; head circumference; mode of delivery; time since delivery | Non-invasive, bedside detection of OASI to guide early intervention and repair decisions
Hu T et al. (2025) [116] | China | EHR data from 1191 vaginal deliveries (300 episiotomies) | SVM | SVM: Acc 0.793; Recall 0.981; Precision 0.790; F1 0.875; AUC 0.882 | Age; gestational age; parity; history of stillbirth; BMI; pregnancy complications; perineal length, elasticity, thickness, edema and skin tear; UC; duration of labour; shoulder dystocia; assisted breech; instrumental delivery; EFW; late deceleration; severe variable deceleration; amniotic fluid contamination; abnormal foetal position; working years of midwife; professional title; maternal cooperation | Decision support by predicting the risk of mediolateral episiotomy.
Boie S et al. (2024) [117] | Denmark and Netherlands | EHR data from 1198 deliveries | XGBoost | AUROC: 0.75, AUPRC: 0.39 | Maternal age, BMI, parity, cervical dilation, foetal station, oxytocin dosage, etc. | Individual risk assessment for caesarean delivery after active labour onset.
Wong MS et al. (2024) [118] | USA | EHR data from 37,932 deliveries | Ensemble model chosen via AutoML | AUC: 0.82 | Intrapartum clinical data (e.g., cervical dilation, FHR, uterine activity) | Supports dynamic prediction of mode of delivery during labour using real-time data.
Kuanara et al. (2024) [119] | India | EHR data from 101 deliveries | DNN | Train: AUC 0.99; KS 0.98; error rate caesarean 0.02, vaginal 0.00; Test: error rate caesarean 0.20, vaginal 0.10 | Mother’s weight, height, age, GA, Hb, FHF amniotic fluid index, cervix length, child birth weight, pregnancy count | Clinical decision support in selecting mode of delivery.
Xu J et al. (2024) [120] | China | EHR data from 100 deliveries in training set, 50 in validation set | GNB | Training AUC: 0.82, Validation AUC: 0.79, Acc: 80.9%, Sens: 72.7%, Spec: 75.0%, Precision: 84.2%, F1 Score: 0.78 | Angle of progression, cervical length, subpubic arch angle, estimated foetal weight | May assist in early prediction of spontaneous vaginal delivery failure in term nulliparous women.
Chen G et al. (2024) [121] | China and New Zealand | Retrospective image collection from 1124 parturients | UNet variants | Dice coefficients (segmentation accuracy): 89.04–90.02% | Segmentation targets include pubic symphysis and foetal head from transperineal ultrasound images; used to compute angle of progression | Enables development of AI tools for objective, automated assessment of foetal head descent and prediction of delivery mode.
Liu YS et al. (2023) [122] | China | EHR data from 101 deliveries | XGBoost | MAE 13.49 h, RMSE 16.98 h | Age, BMI, gestational age, cervical length, foetal weight, BPD, Bishop score components, etc. | Potential to improve prediction of labour induction outcomes over traditional Bishop score
Lodi et al. (2023) [123] | France | EHR data from 410 class III obese nulliparous women with attempted vaginal delivery | Probability Forest | AUC: 0.70, Acc: 0.66, Sens: 0.44, Spec: 0.87 | Initial maternal weight, labour induction | Supports personalised counseling on delivery mode in late pregnancy in class III obese nulliparous women.
Zhang R et al. (2023) [124] | China | EHR data from 2552 deliveries (training n = 2025; validation n = 527) | RF | Accuracy 0.8956; MCC 0.7530; AUC-ROC 0.9791; AUC-PRC 0.9579 | Age; maternal height; weight at delivery; weight gain; parity; assisted reproduction; abnormal blood glucose; hypertensive disorders; scarred uterus; PROM; placenta previa; abnormal foetal position; thrombocytopenia; floating foetal head; labour analgesia | Predicts likelihood of caesarean section to support clinicians in individualized delivery planning.
D’Souza et al. (2023) [125] | Canada, UK, USA, Switzerland | EHR data from 1107 participants with singleton pregnancies and Bishop score < 4, undergoing induction of labour with dinoprostone vaginal insert | ML model not specified | AUROC: 0.73 | Parity, gestational age (37–41 weeks), maternal BMI, maternal age, maternal comorbidities, Bishop score | Prediction of successful labour induction in women with a low Bishop score.
Meyer R et al. (2022) [126] | Israel | EHR data from 73,667 deliveries (train: 48,084; validation: 12,016; test: 13,567) | XGBoost | XGBoost AUC: Training: 0.874, Validation: 0.839, Test: 0.840 | 13 features (e.g., maternal age, BMI, cervical dilation, effacement, labour onset, ultrasound-adjusted foetal biometry, parity) | Web calculator (BirthAI.org) to predict unplanned caesarean delivery for individualized counseling.
Hu T et al. (2022) [127] | China | EHR data from 907 participants (primipara n = 495; multipara n = 312) | LR | Primipara: AUC 0.84; acc 90.3%; recall 0.986; precision 0.908; F1 0.943. Multipara: AUC 0.89; acc 97.1%; recall 0.993; precision 0.977; F1 0.982. | Age; height; weight; BMI; gestational age; previous caesareans; number of abortions; Bishop score; foetal weight; amniotic fluid index; amniotic fluid contamination; foetal head circumference; foetal abdominal circumference; biparietal diameter; femur length; uterine height; abdominal circumference; membrane status; labour analgesia | Allows clinicians to estimate probability of successful oxytocin-induced labour at admission.
Ghi T et al. (2022) [128] | Europe, Asia, Africa | EHR data from 1219 term pregnancies in second stage of labour | Pattern-recognition feed-forward NN | Overall acc 90.4%; foetal occiput anterior (OA) acc 91.1%; non-OA acc 89.3%; F1-score 88.7%; PR-AUC 85.4%; Cohen’s κ = 0.81 | Transabdominal and transperineal ultrasound (TPU) image parameters for detection of foetal position | Rapid, automatic classification of foetal OA vs. non-OA on TPU to aid in labour wards.
Islam MS et al. (2022) [129] | Bangladesh and Saudi Arabia | Pakistan Demographic and Health Survey (PDHS) 2012–13 and 2017–18 datasets | HGSORF (Random Forest optimized with HGSO) | Acc: 98.33%, Sens: 98.33%, Spec: 98.33%, Precision: 98.34%, AUC: ~99% | 24 features including: maternal age, BMI, ANC visits, previous C-section, household size, domestic violence, husband’s education and occupation, etc. | High-potential decision support system (DSS) for predicting CS likelihood; includes XAI tools (SHAP and LIME) to improve interpretability
Chill HH et al. (2021) [130] | Israel | EHR data from 98,463 deliveries (323 OASI cases) | CatBoost gradient boosting | AUC 0.756 | Parity; number of previous births; maternal weight; GA; birth weight; head circumference; induction method; duration of second stage | Stratification of women by OASI risk.
Ullah Z et al. (2021) [131] | Saudi Arabia | EHR data from 80 deliveries | k-NN on enriched data | Acc: 84.38% | Age, delivery number, delivery time (premature, timely, latecomer), blood pressure status, FHR | Demonstrates potential of ML models to predict mode of delivery.
Guedalia J et al. (2021) [132] | Israel | EHR data from 73,868 term deliveries in second stage of labour | Gradient Boosting | AUC: 0.761; Sens: 72.1%, OR: 5.3 for high-risk vs. low-risk group | Antepartum features and intrapartum data gathered during the first stage of labour | Enables early identification of high-risk deliveries for severe adverse neonatal outcomes.
Tarimo CS et al. (2021) [133] | Tanzania | EHR data from 21,578 deliveries | Boosting | Boosting model: AUC: 0.75, Acc: 0.74, Sens: 0.85, Spec: 0.59, PPV: 0.75, NPV: 0.73 | Maternal age, parity, gestational age, BMI, birth weight, PROM, multiple gestation, maternal education, marital status, occupation, alcohol use | Provides insight into early identification of candidates for labour induction using routine data.
Meyer R et al. (2020) [134] | Israel | EHR data from 989 consecutive singleton TOLAC deliveries | RF | AUC-PR 0.351 ± 0.028 | Prior vaginal delivery, maternal height, prior arrest of descent, maternal weight, gestational age, etc. | Enhanced prediction of TOLAC success.
Ricciardi C et al. (2020) [135] | Italy | EHR and CTG recordings from 370 deliveries | RF | Acc 91.1%, Sens 90.0%, Spec 92.2%, Precision 92.1%, AUCROC 96.7% | 17 features: gestational age, FHR metrics, UCs, accelerations/decelerations, spectral power (LF/HF), entropy, Poincaré plot axes, etc. | Helps predict the type of delivery from intrapartum CTG signals
Beksac MS et al. (2018) [136] | Turkey | EHR data from 800 deliveries (600 vaginal births and 200 caesarean sections) | ANN with back-propagation | Sens: 60.9%, Spec: 97.5%, PPV: 81.8%, NPV: 93.1%, Test Efficiency: 91.8% | Maternal age, gravida, parity, gestational age, labour induction type, presentation, risk factors | Provides a supportive decision tool to predict delivery mode.
Fergus P et al. (2018) [137] | UK | CTG recordings from 506 vaginal and 46 caesarean deliveries | Ensemble model combining FLDA, RF, and SVM | Sens 87%, Spec 90%, AUC 96%, MSE 9% | 13 FHR features, including STV, SampEn, DFA, RMS, FD, SD1, SD2, SDRatio, RBL, accelerations, decelerations, etc. | Provides a supportive decision tool to predict delivery mode based on FHR alone.
Macones GA et al. (2001) [134] | USA | EHR data from 400 women with prior caesarean delivery (100 failed TOLAC, 300 successful VBAC) | Multivariate LR | Sens 77%, Spec 65%, Acc 69% | Substance abuse, prior successful VBAC, cervical dilation at admission, need for labour augmentation | Enhanced prediction of TOLAC success.
Devoe LD et al. (1996) [138] | USA | EHR data from 200 term pregnancies with spontaneous labour (159 for training, 41 for testing) | Feedforward NN | Correlation with actual duration: r = 0.88 (NN), compared with r = 0.35 for the partogram | UC, EFW, foetal position, station, gestational age, maternal parity, age, height, weight, membrane status, cervical dilatation | Provides more accurate prediction of first-stage labour duration.
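Several rows in Tables 3, 4 and 6 report precision, recall and F1 together. Because F1 is the harmonic mean of precision and recall, such triplets can be cross-checked for internal consistency. The following is a minimal Python sketch of that check, not part of the reviewed studies: the function name `f1_score`, the dictionary `reported` and the rounding tolerance `tol` are illustrative assumptions, and the numbers are transcribed from the table rows above.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1), transcribed from the tables above.
reported = {
    "Hu T et al. (2025), SVM": (0.790, 0.981, 0.875),
    "Zaky H et al. (2025), stacking ensemble": (0.873, 0.921, 0.896),
    "Liu M et al. (2022), RF": (0.82, 0.42, 0.56),
}


def check_rows(rows: dict, tol: float = 0.005) -> dict:
    """Recompute F1 for each row and flag whether it matches the reported value.

    `tol` is an assumed tolerance that allows for rounding in the published figures.
    """
    results = {}
    for name, (precision, recall, f1_reported) in rows.items():
        f1 = f1_score(precision, recall)
        results[name] = (round(f1, 3), abs(f1 - f1_reported) <= tol)
    return results


results = check_rows(reported)
for name, (f1, consistent) in results.items():
    print(f"{name}: recomputed F1 = {f1}, consistent with report = {consistent}")
```

A mismatch beyond the tolerance would not by itself indicate an error in a study, since some papers average metrics across folds or classes; it only marks rows that merit a closer look at how the metrics were computed.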
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
