Comprehensive Approach with Machine Learning Techniques to Investigate Early-Onset Preeclampsia and Its Long-Term Cardiovascular Implications
Abstract
Featured Application
1. Introduction
2. Related Work
3. Materials and Methods
3.1. Study Overview and Dataset Composition
3.1.1. Timestamp 1: Pre-Pregnancy Baseline
3.1.2. Timestamp 2: Diagnosis-to-Delivery Phase
3.1.3. Timestamp 3: Follow-Up Visit
3.2. Data Strategy
3.3. Machine-Learning Models
3.3.1. Supervised Learning
3.3.2. Unsupervised Learning
4. Results
4.1. Early-Onset Preeclampsia Risk Reevaluation Model
4.2. Post-Pregnancy Hypertension Prediction Model
4.3. Mid-Term Follow-Up
5. Discussion
5.1. Principal Findings
5.2. Limitations
5.3. Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
PE | Preeclampsia |
CV | Cardiovascular |
eoPE | Early-onset preeclampsia |
FGR | Fetal Growth Restriction |
AI | Artificial intelligence |
ML | Machine learning |
ABPM | Ambulatory blood pressure monitoring |
IMT | Intima media thickness |
GA | Genetic algorithm |
NPV | Negative predictive value |
ROC-AUC | Area under the receiver operating characteristic curve |
SVMs | Support Vector Machines |
LR | Logistic Regression |
Appendix A
Dataset Phase | N Patients | N Variables | Numerical | Categorical | Missing Values (%) |
---|---|---|---|---|---|
Baseline (T1) | 80 | 30 | 15 | 15 | 4.49 |
Diagnosis to delivery (T2) | 45 | 28 | 18 | 10 | 5.09 |
Follow-up visit (T3): Blood test | 43 | 47 | 47 | 0 | 0.92 |
Follow-up visit (T3): Atherosclerosis | 44 | 5 | 5 | 0 | 1.89 |
Follow-up visit (T3): Ambulatory blood pressure | 37 | 15 | 15 | 0 | 0 |
Characteristics | Developed eoPE: No (N = 40) | Developed eoPE: Yes (N = 40) | p-Value |
---|---|---|---|
Maternal ethnicity (Caucasian; African American; South American) | 31 (77.5); 2 (5.0); 7 (17.5) | 26 (65.0); 1 (2.5); 13 (32.5) | NS |
Maternal age (years) | 32.40 ± 5.64 | 32.25 ± 5.55 | NS |
Pre-gestational weight (kg) | 70.85 ± 12.36 | 74.68 ± 19.48 | NS |
Maternal height (cm) | 161.38 ± 7.19 | 160.15 ± 6.55 | NS |
Maternal body mass index (kg/m²) | 26.80 ± 5.04 | 28.55 ± 6.91 | NS |
Conception method (spontaneous; in vitro fertilization; IVF-ovodonation) | 40 (100.0); 0 (0); 0 (0) | 38 (95.0); 1 (2.5); 1 (2.5) | NS |
Number of previous gestations reaching 22 weeks (0; 1; >1) | 20 (50.0); 11 (27.5); 9 (22.5) | 20 (50.0); 14 (35.0); 6 (15.0) | NS |
Number of previous vaginal deliveries (0; 1; >1) | 26 (65.0); 6 (15.0); 8 (20.0) | 25 (62.5); 11 (27.5); 4 (10.0) | NS |
Number of previous cesarean sections (0; 1; >1) | 34 (85.0); 5 (12.5); 1 (2.5) | 33 (82.5); 6 (15.0); 1 (2.5) | NS |
Number of previous abortions (0; 1; >1) | 25 (62.5); 10 (25.0); 5 (12.5) | 27 (67.5); 8 (20.0); 5 (12.5) | NS |
Preeclampsia in previous gestations (early-onset; late-onset) | 0 (0); 0 (0) | 2 (5.0); 4 (10.0) | <0.05 |
Maternal chronic hypertension | 0 (0) | 4 (10.0) | NS |
Baseline maternal renal disease | 0 (0) | 1 (2.5) | NS |
Pre-gestational diabetes mellitus | 0 (0) | 3 (7.5) | NS |
Maternal thrombophilia | 0 (0) | 2 (5.0) | NS |
Maternal age > 40 years | 3 (7.5) | 4 (10.0) | NS |
Nulliparous or last child born more than 10 years ago | 20 (50.0) | 20 (50.0) | NS |
Maternal body mass index > 35 kg/m² | 2 (5.0) | 6 (15.0) | NS |
Mother/sister with history of preeclampsia | 0 (0) | 2 (5.0) | NS |
Low-dose aspirin intake, 100 mg/day (starting at or before 16 weeks; starting after 16 weeks) | 0 (0); 0 (0) | 2 (5.0); 7 (17.5) | <0.05 |
Low-dose heparin prophylaxis (starting at or before 16 weeks; starting after 16 weeks) | 0 (0); 0 (0) | 2 (5.0); 3 (7.5) | NS |
Smoking during pregnancy | 2 (5.0) | 2 (5.0) | NS |
Crown-rump length, 1st trimester ultrasound (mm) | 64.32 ± 10.37 | 56.26 ± 5.83 | <0.005 |
Gestational age based on the crown-rump length (weeks) | 12.65 ± 0.79 | 12.00 ± 0.46 | <0.005 |
Gestational age based on the 1st trimester ultrasound (weeks) | 12.55 ± 1.06 | 12.67 ± 1.68 | NS |
Gestational age based on the 2nd trimester ultrasound (weeks) | 20.74 ± 0.77 | 20.44 ± 0.88 | NS |
Pulsatility index of the right uterine artery | 0.92 ± 0.26 | 1.93 ± 0.56 | <0.005 |
Pulsatility index of the left uterine artery | 0.92 ± 0.24 | 1.78 ± 0.57 | <0.005 |
Mean pulsatility index of the uterine arteries | 0.92 ± 0.23 | 1.86 ± 0.44 | <0.005 |
Feature | Feature Meaning | Category | Scale | Missing Values (%) |
---|---|---|---|---|
Ethnicity | Maternal ethnicity | PC | 0: Caucasian, 1: African American, 2: South American, 3: North African, 4: Indian, 5: Other | 0 |
Weight | Pre-gestational weight | N | kg | 0 |
Size | Maternal height | N | cm | 0 |
BMI | Maternal body mass index | N | kg/m² | 0 |
Conception | Conception method | PC | 0: Spontaneous, 1: Spousal artificial insemination, 2: Donor artificial insemination, 3: In vitro fertilization, 4: IVF-Ovodonation | 0 |
Parity | Previous gestations that reached at least 22 weeks | N | Number | 0 |
Vaginal_d | Previous vaginal deliveries | N | Number | 0 |
Cesarea | Previous cesarean sections | N | Number | 0 |
Miscarriages | Previous abortions | N | Number | 0 |
Previous_PE | Preeclampsia in previous gestations | PC | 0: No, 1: Yes (early-onset preeclampsia), 2: Yes (late-onset preeclampsia) | 0 |
Chronic_ht | Maternal chronic hypertension | DC | 0: No, 1: Yes | 0 |
Nephropathy | Baseline maternal renal disease | DC | 0: No, 1: Yes | 0 |
Pregest_dm | Pre-gestational diabetes mellitus | DC | 0: No, 1: Yes | 0 |
Thrombophilia | Maternal thrombophilia | DC | 0: No, 1: Yes | 1.25 |
SLE | Systemic lupus erythematosus | DC | 0: No, 1: Yes | 0 |
Fh_PE | Family history of preeclampsia | DC | 0: No, 1: Yes (Mother or sister) | 10.0 |
Nulliparous | Previous children | DC | 0: No, 1: Yes | 0 |
BMI_more35 | Maternal body mass index > 35 kg/m² | DC | 0: No, 1: Yes | 0 |
Smoker | Maternal smoking habits | PC | 0: No, 1: Yes, 2: Former smoker | 0 |
Maternal age | Age at onset of pregnancy | N | Years | 0 |
MaternalAge_more40 | Age at onset of pregnancy >40 years | DC | 0: No, 1: Yes | 0 |
ASA | Aspirin intake | PC | 0: No, 1: Yes (before week 16), 2: Yes (week 16-week 20) | 1.25 |
Heparin | Low molecular weight heparin administration | PC | 0: No, 1: Yes (before week 16), 2: Yes (week 16-week 20) | 0 |
1Us-gestAge | Gestational age based on the 1st trimester ultrasound | N | Weeks | 7.50 |
CR-length_Us1 | Crown-rump length (1st trimester ultrasound) | N | mm | 20.00 |
CR-gestAge | Gestational age based on the crown-rump length | N | Weeks | 20.00 |
2Us-gestAge | Gestational age based on the 2nd trimester ultrasound | N | Weeks | 12.50 |
PIlUtA_1921Us | Pulsatility index of the left uterine artery (19 + 0–21 + 6 weeks ultrasound) | N | Number | 23.75 |
PIrUtA_1921Us | Pulsatility index of the right uterine artery (19 + 0–21 + 6 weeks ultrasound) | N | Number | 23.75 |
mPIUtA_1921Us | Mean pulsatility index of the uterine arteries (19 + 0–21 + 6 weeks ultrasound) | N | Number | 23.75 |
Feature | Feature Meaning | Category | Scale | Missing Values (%) |
---|---|---|---|---|
GA_diagnosis | Gestational age at which the diagnosis of preeclampsia was established | N | Weeks | 6.67 |
SBP_diagnosis | Systolic blood pressure at diagnosis | N | mmHg | 2.22 |
DBP_diagnosis | Diastolic blood pressure at diagnosis | N | mmHg | 2.22 |
MBP_diagnosis | Mean blood pressure at diagnosis | N | mmHg | 2.22 |
PIlUtA_diagUs | Pulsatility index of the left uterine artery in the diagnostic ultrasound | N | Number | 11.11 |
PIrUtA_diagUs | Pulsatility index of the right uterine artery in the diagnostic ultrasound | N | Number | 11.11 |
mPIUtA_diagUs | Mean pulsatility index of the uterine arteries in the diagnostic ultrasound | N | Number | 11.11 |
tFlow_diagUs | End-diastolic flow in the umbilical artery on diagnostic ultrasound | PC | 0: Anterograde, 1: Absent, 2: Retrograde | 11.11 |
MCAPI_diagUs | Pulsatility index in the middle cerebral artery on diagnostic ultrasound | N | Number | 13.33 |
UAPI_diagUs | Pulsatility index in the umbilical artery on diagnostic ultrasound | N | Number | 8.89 |
CPratio_diagUs | Cerebro-placental ratio in diagnostic ultrasound | N | Number | 13.33 |
a_wave_diagUs | ‘a’ wave in ductus venosus on diagnostic ultrasound | PC | 0: Anterograde, 1: Absent, 2: Retrograde | 15.56 |
sFlt1_diagUs | Soluble fms-like tyrosine kinase-1 on diagnostic ultrasound | N | pg/mL | 24.44 |
1RF_medicalHist | 1 risk factor in the medical history (for PREP-L and PREP-S) | DC | 0: No, 1: Yes | 0 |
2orMoreRF_medicalHist | 2 or more risk factors in the medical history (for PREP-L and PREP-S) | DC | 0: No, 1: Yes | 0 |
SBP_max | Maximum systolic blood pressure | N | mmHg | 0 |
DBP_max | Maximum diastolic blood pressure | N | mmHg | 2.22 |
MBP_max | Maximum mean blood pressure (systemic or pulmonary) | N | mmHg | 2.22 |
Num_oralAntihypert | Number of oral antihypertensive drugs administered, among alpha-methyldopa, labetalol, hydralazine, and nifedipine | N | Number | 0 |
Num_intravAntihypert | Number of intravenous antihypertensive drugs administered, among alpha-methyldopa, labetalol, and hydralazine | N | Number | 0 |
Corticost_adm | Prenatal corticosteroid administration | PC | 0: No, 1: Yes (at least 1 complete cycle), 2: Yes (incomplete) | 0 |
Mg_sulfate_adm | Magnesium sulfate administration | PC | 0: No, 1: Yes (antepartum), 2: Yes (intrapartum), 3: Yes (antepartum and intrapartum), 4: Yes (postpartum), 5: Yes (antepartum, intrapartum and postpartum) | 0 |
Neuro_symptoms | Patients’ symptoms (headache or visual disturbance) | DC | 0: No, 1: Yes | 0 |
thirdSpace_symptoms | Patients’ symptoms (edema of the face and hands, sudden weight gain or chest pain) | DC | 0: No, 1: Yes | 0 |
Hepatic_symptoms | Patients’ symptoms (nausea/vomiting or epigastric pain) | DC | 0: No, 1: Yes | 0 |
Num_severityCrit | Number of severity criteria, including severe hypertension, thrombocytopenia, hepatic impairment, renal impairment, pulmonary edema and neurological events | N | Number | 0 |
Num_maternalComp | Number of maternal complications, including refractory hypertension, ischemic heart disease, intubation, oliguria, dialysis, HELLP syndrome, hepatic hematoma, coagulopathy, stroke and abruptio | N | Number | 0 |
FGR_finalStage | Fetal growth restriction stage (final classification) | PC | 0: No, 1: Stage I, 2: Stage II, 3: Stage III, 4: Stage IV, 5: Small for gestational age (SGA) | 8.89 |
Feature | Feature Meaning | Category | Scale | Missing Values (%) |
---|---|---|---|---|
Blood and Urine Test Features | ||||
Hemoglobin | Protein in red blood cells that carries oxygen | N | g/dL | 2.33 |
Hematocrit | Proportion of blood volume occupied by red cells | N | % | 2.33 |
MCV | Mean corpuscular volume | N | fL (femtoliters) | 2.33 |
Neutrophils | A type of white blood cell | N | 10⁹/L | 2.33 |
Leukocytes | White blood cells that fight infection | N | 10⁹/L | 2.33 |
Platelets | Cells that help blood clot | N | 10⁹/L | 2.33 |
Prothrombin_act | Test to measure blood clotting ability | N | % | 0 |
Prothrombin_t | Time taken for blood to clot | N | Seconds | 0 |
INR | Standardized measure of blood clotting | N | Ratio | 0 |
aPTT | Time to form a blood clot in a partial thromboplastin test (Activated Partial Thromboplastin Time) | N | Seconds | 0 |
Triglycerides | Type of fat found in blood | N | mg/dL | 0 |
Cholesterol | Total cholesterol level in blood | N | mg/dL | 0 |
LDL | Low-density lipoprotein (‘bad cholesterol’) | N | mg/dL | 0 |
HDL | High-density lipoprotein (‘good cholesterol’) | N | mg/dL | 0 |
Glucose | Blood sugar level | N | mg/dL | 0 |
Creatinine | Waste product indicating kidney function | N | mg/dL | 0 |
Sodium | Essential electrolyte in blood | N | mmol/L | 0 |
Potassium | Essential electrolyte in blood | N | mmol/L | 0 |
Calcium | Essential mineral for bones and teeth | N | mg/dL | 0 |
Phosphorus | Important mineral for bones and energy production | N | mg/dL | 0 |
ALT/GPT | Liver enzyme indicating liver health | N | U/L | 0 |
AST/GOT | Liver enzyme indicating liver health | N | U/L | 0 |
TSH | Hormone stimulating the thyroid | N | U/mL | 0 |
Bilirubin | Yellow pigment formed by breakdown of red blood cells | N | mg/dL | 0 |
LDH | Enzyme that helps produce energy | N | U/L | 0 |
Total_proteins | Total protein in blood | N | g/dL | 0 |
Glycated_Hb | Long term indicator of blood sugar levels | N | % | 0 |
C_reactive_prot | Inflammation marker | N | mg/dL | 0 |
Uric_acid | Waste product indicating metabolism | N | mg/dL | 0 |
Iron | Essential mineral for blood production | N | µg/dL | 0 |
Ferritin | Protein that stores iron | N | ng/dL | 0 |
Transferrin | Protein that transports iron | N | mg/dL | 0 |
Transferrin_sat | Percentage of transferrin that is saturated with iron | N | % | 0 |
Albumin | Main protein in blood | N | g/dL | 0 |
Urine_alb | Protein in urine indicating kidney damage | N | mg/dL | 0 |
Urine_creat | Waste product in urine indicating kidney function | N | mg/dL | 0 |
Urine_alb/creat | Ratio indicating kidney function | N | mg/g | 0 |
Anithyroid_perox | Antibodies against thyroid peroxidase | N | IU/mol | 0 |
IgA | Immunoglobulin A (a type of antibody) | N | mg/dL | 0 |
IgG | Immunoglobulin G (a type of antibody) | N | mg/dL | 0 |
IgM | Immunoglobulin M (a type of antibody) | N | mg/dL | 0 |
C3_complement | Protein of the immune system | N | mg/dL | 0 |
C4_complement | Protein of the immune system | N | mg/dL | 0 |
Active_MMP9 | Active form of the enzyme matrix metalloproteinase-9 | N | ng/dL | 4.65 |
Total_MMP9 | Total matrix metalloproteinase-9 | N | ng/dL | 4.65 |
TIMP1 | Tissue inhibitor of metalloproteinase-1 | N | ng/dL | 4.65 |
MMP9/TIMP1 | Ratio of MMP9 to TIMP1 | N | ratio | 4.65 |
Atherosclerosis study features | ||||
IMT_lfa | Intima-media thickness of left femoral artery | N | mm | 6.81 |
IMT_rfa | Intima-media thickness of right femoral artery | N | mm | 4.55 |
IMT_lca | Intima-media thickness of left carotid artery | N | mm | 0 |
IMT_rca | Intima-media thickness of right carotid artery | N | mm | 0 |
Num_plaques | Number of territories with atherosclerotic plaques | N | Number | 0 |
Ambulatory blood pressure monitoring features | ||||
SBP_24h | Systolic blood pressure 24 h | N | mmHg | 0 |
DBP_24h | Diastolic blood pressure 24 h | N | mmHg | 0 |
MBP_24h | Mean blood pressure 24 h | N | mmHg | 0 |
Sreadings_24h | Systolic readings over limit in 24 h | N | % | 0 |
Dreadings_24h | Diastolic readings over limit in 24 h | N | % | 0 |
SBP_diurnal | Diurnal systolic blood pressure | N | mmHg | 0 |
DBP_diurnal | Diurnal diastolic blood pressure | N | mmHg | 0 |
MBP_diurnal | Diurnal mean blood pressure | N | mmHg | 0 |
Sreadings_diurnal | Diurnal systolic readings over limit | N | % | 0 |
Dreadings_diurnal | Diurnal diastolic readings over limit | N | % | 0 |
SBP_nocturnal | Nocturnal systolic blood pressure | N | mmHg | 0 |
DBP_nocturnal | Nocturnal diastolic blood pressure | N | mmHg | 0 |
MBP_nocturnal | Nocturnal mean blood pressure | N | mmHg | 0 |
Sreadings_nocturnal | Nocturnal systolic readings over limit | N | % | 0 |
Dreadings_nocturnal | Nocturnal diastolic readings over limit | N | % | 0 |
Metric | Formula |
---|---|
Accuracy | (TP + TN)/(TP + TN + FP + FN) |
Precision (PPV) | TP/(TP + FP) |
Recall (Sensitivity) | TP/(TP + FN) |
Specificity | TN/(TN + FP) |
Negative Predictive Value (NPV) | TN/(TN + FN) |
F1-score | 2 × (Precision × Recall)/(Precision + Recall) |
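The formulas in this table map directly onto confusion-matrix counts. The following is a minimal Python sketch of that mapping; the function name `classification_metrics` and the example counts are illustrative assumptions and are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the tabulated metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,            # positive predictive value
        "recall": recall,                  # sensitivity
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts for a balanced 80-patient cohort (NOT study results).
metrics = classification_metrics(tp=38, tn=39, fp=1, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Returning NaN for an empty denominator is a design choice of this sketch; it keeps the function usable on folds where one class happens to be absent.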
References
- Gathiram, P.; Moodley, J.J. Pre-eclampsia: Its pathogenesis and pathophysiology. Cardiovasc. J. Afr. 2016, 27, 71–78.
- Lisonkova, S.; Sabr, Y.; Mayer, C.; Young, C.; Skoll, A.; Joseph, K.S. Maternal morbidity associated with early-onset and late-onset preeclampsia. Obstet. Gynecol. 2014, 124, 771–781.
- Brown, M.A.; Lindheimer, M.D.; De Swiet, M.; Van Assche, A.; Moutquin, J.M. The classification and diagnosis of the hypertensive disorders of pregnancy: Statement from the International Society for the Study of Hypertension in Pregnancy (ISSHP). Hypertens. Pregnancy 2001, 20, ix–xiv.
- Bellamy, L.; Casas, J.P.; Hingorani, A.D.; Williams, D.J. Pre-eclampsia and risk of cardiovascular disease and cancer in later life: Systematic review and meta-analysis. BMJ 2007, 335, 974–977.
- McDonald, S.D.; Malinowski, A.; Zhou, Q.; Yusuf, S.; Devereaux, P.J. Cardiovascular sequelae of preeclampsia/eclampsia: A systematic review and meta-analyses. Am. Heart J. 2008, 156, 918–930.
- Verghese, D.; Muller, L.; Velamakanni, S. Addressing Cardiovascular Risk Across the Arc of a Woman’s Life: Sex-Specific Prevention and Treatment. Curr. Cardiol. Rep. 2023, 25, 1053–1064.
- Masini, G.; Foo, L.F.; Tay, J.; Wilkinson, I.B.; Valensise, H.; Gyselaers, W.; Lees, C.C. Preeclampsia has two phenotypes which require different treatment strategies. Am. J. Obstet. Gynecol. 2022, 226, S1006–S1018.
- McNestry, C.; Killeen, S.L.; Crowley, R.K.; McAuliffe, F.M. Pregnancy complications and later life women’s health. Acta Obstet. Gynecol. Scand. 2023, 102, 523–531.
- Domínguez del Olmo, P.; Herraiz, I.; Villalaín, C.; De la Parte, B.; Rodríguez-Sánchez, E.; Ruiz-Hurtado, G.; Fernández-Friera, L.; Morales, E.; Ayala, J.L.; Solís, J.; et al. Cardiovascular disease in women with early-onset preeclampsia: A matched case-control study. J. Matern.-Fetal Neonatal Med. 2025, 38, 2459302.
- Tan, M.Y.; Wright, D.; Syngelaki, A.; Akolekar, R.; Cicero, S.; Janga, D.; Singh, M.; Greco, E.; Wright, A.; Maclagan, K.; et al. Comparison of diagnostic accuracy of early screening for pre-eclampsia by NICE guidelines and a method combining maternal factors and biomarkers: Results of SPREE. Ultrasound Obstet. Gynecol. 2018, 51, 743–750.
- Yoffe, L.; Gilam, A.; Yaron, O.; Polsky, A.; Farberov, L.; Syngelaki, A.; Nicolaides, K.; Hod, M.; Shomron, N. Early Detection of Preeclampsia Using Circulating Small non-coding RNA. Sci. Rep. 2018, 8, 3401.
- Chaemsaithong, P.; Sahota, D.S.; Poon, L.C. First trimester preeclampsia screening and prediction. Am. J. Obstet. Gynecol. 2022, 226, S1071–S1097.e2.
- Marić, I.; Tsur, A.; Aghaeepour, N.; Montanari, A.; Stevenson, D.K.; Shaw, G.M.; Winn, V.D. Early prediction of preeclampsia via machine learning. Am. J. Obstet. Gynecol. MFM 2020, 2, 100100.
- Butler, L.; Gunturkun, F.; Chinthala, L.; Karabayir, I.; Tootooni, M.S.; Bakir-Batu, B.; Celik, T.; Akbilgic, O.; Davis, R.L. AI-based preeclampsia detection and prediction with electrocardiogram data. Front. Cardiovasc. Med. 2024, 11, 1360238.
- Villa, P.M.; Marttinen, P.; Gillberg, J.; Lokki, A.I.; Majander, K.; Orden, M.R.; Taipale, P.; Pesonen, A.; Räikkönen, K.; Hämäläinen, E.; et al. Cluster analysis to estimate the risk of preeclampsia in the high-risk Prediction and Prevention of Preeclampsia and Intrauterine Growth Restriction (PREDO) study. PLoS ONE 2017, 12, e0174399.
- Wang, G.; Zhang, Y.; Li, S.; Zhang, J.; Jiang, D.; Li, X.; Li, Y.; Du, J. A Machine Learning-Based Prediction Model for Cardiovascular Risk in Women with Preeclampsia. Front. Cardiovasc. Med. 2021, 8, 736491.
- Villalain, C.; Gómez-Arriaga, P.; Simón, E.; Galindo, A.; Herraiz, I. Longitudinal changes in angiogenesis biomarkers within 72 h of diagnosis and time-to-delivery in early-onset preeclampsia. Pregnancy Hypertens. 2022, 28, 139–145.
- Gómez-Arriaga, P.I.; Herraiz, I.; López-Jiménez, E.A.; Escribano, D.; Denk, B.; Galindo, A. Uterine artery Doppler and sFlt-1/PlGF ratio: Prognostic value in early-onset pre-eclampsia. Ultrasound Obstet. Gynecol. 2014, 43, 525–532.
- Report of the National High Blood Pressure Education Program Working Group on High Blood Pressure in Pregnancy. PubMed. Available online: https://pubmed.ncbi.nlm.nih.gov/10920346/ (accessed on 8 July 2024).
- Verlohren, S.; Herraiz, I.; Lapaire, O.; Schlembach, D.; Zeisler, H.; Calda, P.; Sabria, J.; Markfeld-Erol, F.; Galindo, A.; Schoofs, K.; et al. New gestational phase-specific cutoff values for the use of the soluble fms-like tyrosine kinase-1/placental growth factor ratio as a diagnostic test for preeclampsia. Hypertension 2014, 63, 346–352.
- Harris, P.A.; Taylor, R.; Thielke, R.; Payne, J.; Gonzalez, N.; Conde, J.G. Research electronic data capture (REDCap): A metadata-driven methodology and workflow process for providing translational research informatics support. J. Biomed. Inform. 2009, 42, 377–381.
- Stekhoven, D.J.; Bühlmann, P. MissForest: Non-parametric missing value imputation for mixed-type data. Bioinformatics 2012, 28, 112–118.
- Scikit-Learn Developers. 6.4. Imputation of Missing Values. Available online: https://scikit-learn.org/stable/modules/impute.html (accessed on 13 November 2024).
- Scikit-Learn Developers. StandardScaler. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html (accessed on 13 November 2024).
- Fernando García. PyWinEA Library. Available online: https://github.com/FernandoGaGu/pywinEA (accessed on 2 December 2024).
- Scikit-Learn Developers. Cross-Validation: Evaluating Estimator Performance. Available online: https://scikit-learn.org/stable/modules/cross_validation.html (accessed on 3 December 2024).
- Brownlee, J. Repeated K-Fold Cross-Validation for Model Evaluation in Python. In Machine Learning Mastery. 2020. Available online: https://machinelearningmastery.com/repeated-k-fold-cross-validation-with-python/ (accessed on 3 December 2024).
- Scikit-Learn Developers. GridSearchCV. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html (accessed on 6 February 2025).
- Lundberg, S. An introduction to Explainable AI with Shapley Values. Available online: https://shap.readthedocs.io/en/latest/example_notebooks/overviews/An%20introduction%20to%20explainable%20AI%20with%20Shapley%20values.html (accessed on 19 March 2025).
- Scikit-Learn Developers. KMeans. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html (accessed on 10 December 2024).
- Wright, D.; Wright, A.; Nicolaides, K.H. The competing risk approach for prediction of preeclampsia. Am. J. Obstet. Gynecol. 2020, 223, 12–23.e7.
- Staff, A.C.; Fjeldstad, H.E.; Fosheim, I.K.; Moe, K.; Turowski, G.; Johnsen, G.M.; Alnaes-Katjavivi, P.; Sugulle, M. Failure of physiological transformation and spiral artery atherosis: Their roles in preeclampsia. Am. J. Obstet. Gynecol. 2022, 226, S895–S906.
- González, C.V.; García, I.H.; Fernández-Friera, L.; Ruiz-Hurtado, G.; Morales, E.; Solís, J.; Galindo, A. Salud cardiovascular y renal en la mujer: La preeclampsia como marcador de riesgo. Nefrología 2023, 43, 269–280.
- Amiri, M.; Tehrani, F.R.; Rahmati, M.; Behboudi-Gandevani, S.; Azizi, F. Changes over-time in blood pressure of women with preeclampsia compared to those with normotensive pregnancies: A 15 year population-based cohort study. Pregnancy Hypertens. 2019, 17, 94–99.
Reference | Method/Approach | Merits (Key Findings) | Limitations/Research Gap |
---|---|---|---|
Marić et al. [13] | Elastic Net and Gradient Boosting algorithms using 67 early prenatal variables. | Achieved a high predictive accuracy with an AUC of 0.89 and a true-positive rate of 72.3%. | Faced issues with substantial missing data. Limited its analysis to variables collected before 16 weeks of gestation. |
Butler et al. [14] | Modified ResNet Convolutional Neural Network analyzing raw ECG signals. | Demonstrated high performance in predicting PE risk up to 90 days before diagnosis, with AUCs reaching 0.92. | The study focuses solely on ECG data, potentially missing other relevant clinical factors. |
Villa et al. [15] | Unsupervised cluster analysis on women with known clinical risk factors for PE. | Identified 25 distinct patient clusters with different risk factor combinations, enabling more granular risk assessment. | The approach identifies risk groups but does not provide individualized predictive scores for future risk. |
Wang et al. [16] | Five different ML algorithms (including Random Forest) for postpartum CV risk prediction. | The Random Forest model showed the best performance for general CV risk prediction (AUC of 0.711). | The model struggled to identify positive cases (low sensitivity) and did not focus specifically on post-pregnancy hypertension. |
Method/Technique | Dataset(s) Used | Role in the Study | Justification/Description |
---|---|---|---|
Genetic Algorithm | Baseline and diagnostic | Feature selection | Evolutionary algorithm selected due to its flexibility, modular structure, and availability of reliable Python libraries. Most important parameters: Elitism (to retain the best-performing individuals), Annihilation (to eliminate the least fit), Fill with elite (to repopulate with top individuals) and Mutation rate (to introduce diversity and avoid local optima). |
Cross-validation | All supervised models | Model optimization (in the GA fitness function to identify the combination of variables that maximized the F1 score) and validation (in performance evaluation to prevent overfitting) | Ensures robust performance estimation by averaging across multiple folds and repetitions; reduces overfitting and variance |
Support Vector Machine | Baseline and diagnostic | Final classifier for eoPE risk reevaluation model | Defines an optimal hyperplane that maximizes class separation, making it suitable for high-dimensional, small-sample problems; best F1-score on baseline dataset |
Logistic Regression | Baseline and diagnostic | Final classifier for post-pregnancy HT model | Models the log-odds of the outcome using a logistic function; best F1-score on diagnostic dataset |
K-Nearest Neighbors | Baseline and diagnostic | Evaluated but not selected | Classifies based on labels of nearest neighbors; lower F1-score on both datasets |
Decision Tree | Baseline and diagnostic | Evaluated but not selected | Rule-based partitioning; lower F1-score on both datasets |
GridSearchCV | Final models | Hyperparameter tuning of classifiers | Systematic search over predefined parameter grids applied to optimize model-specific parameters, enhancing predictive performance and model generalizability. |
SHAP | Final models | Model interpretability | Explains individual predictions by estimating feature contributions; enhances transparency and clinical insight |
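To make the genetic-algorithm parameters named in this table more concrete, the sketch below implements a toy GA over binary feature masks in plain Python. It is not the PyWinEA implementation used in the study. The function `evolve`, the expression of elitism, annihilation, fill-with-elite and mutation rate as fractions, and the `toy_fitness` stand-in are all assumptions made for illustration; the study reports using a cross-validated F1 score as the fitness function.

```python
import random


def evolve(fitness, n_features, pop_size=30, generations=40, elitism=0.1,
           annihilation=0.2, fill_with_elite=0.5, mutation_rate=0.05, seed=0):
    """Toy GA over binary feature masks (hypothetical parameterisation)."""
    rng = random.Random(seed)

    def random_mask():
        return tuple(rng.random() < 0.5 for _ in range(n_features))

    population = [random_mask() for _ in range(pop_size)]
    n_elite = max(1, int(elitism * pop_size))
    n_kill = int(annihilation * pop_size)
    history = []  # best fitness seen at each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[:n_elite]
        history.append(fitness(ranked[0]))
        # Annihilation: drop the least fit individuals ...
        survivors = ranked[:pop_size - n_kill]
        # ... then refill, partly with elite copies, partly with random masks.
        n_from_elite = int(fill_with_elite * n_kill)
        refill = [rng.choice(elite) for _ in range(n_from_elite)]
        refill += [random_mask() for _ in range(n_kill - n_from_elite)]
        parents = survivors + refill
        # Offspring: uniform crossover followed by bit-flip mutation.
        children = []
        while len(children) < pop_size - n_elite:
            a, b = rng.sample(parents, 2)
            child = tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        # Elitism: the best individuals pass unchanged to the next generation.
        population = elite + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


INFORMATIVE = {0, 3, 5}  # hypothetical "useful" feature indices


def toy_fitness(mask):
    # Stand-in for a cross-validated F1 score: reward informative features,
    # penalise mask size so that smaller subsets are preferred.
    selected = {i for i, keep in enumerate(mask) if keep}
    return len(selected & INFORMATIVE) - 0.1 * len(selected)


best_mask, history = evolve(toy_fitness, n_features=10)
print("selected feature indices:", [i for i, keep in enumerate(best_mask) if keep])
```

Because the elite is copied unchanged into each new generation, the best fitness recorded in `history` can never decrease. In a real pipeline `toy_fitness` would be replaced by a function that trains and cross-validates a classifier on the masked columns.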
Selected features: mPIUtA_1921Us, PIrUtA_1921Us, Chronic_ht, MaternalAge_more40, Nephropathy, Heparin
Metric | Train | Validation |
---|---|---|
Accuracy | 0.973 (0.969–0.978) | 0.970 (0.956–0.984) |
Precision | 0.997 (0.994–1.000) | 0.991 (0.979–1.000) |
Recall | 0.950 (0.941–0.957) | 0.950 (0.922–0.978) |
Specificity | 0.998 (0.994–1.000) | 0.990 (0.976–1.000) |
NPV | 0.951 (0.944–0.959) | 0.956 (0.932–0.981) |
F1 Score | 0.972 (0.968–0.977) | 0.968 (0.953–0.984) |
Selected features: 2orMoreRF_medicalHist, CPratio_diagUs, tFlow_diagUs, a_wave_diagUs, SBP_max, MBP_max
Metric | Train | Validation |
---|---|---|
Accuracy | 0.844 (0.823–0.866) | 0.837 (0.814–0.860) |
Precision | 0.821 (0.789–0.852) | 0.818 (0.767–0.870) |
Recall | 0.754 (0.714–0.795) | 0.741 (0.672–0.809) |
Specificity | 0.898 (0.876–0.921) | 0.891 (0.855–0.928) |
NPV | 0.858 (0.839–0.878) | 0.855 (0.828–0.882) |
F1 Score | 0.785 (0.754–0.816) | 0.771 (0.733–0.810) |
Variable | 24 h (Blue) | 24 h (Red) | Diurnal (Blue) | Diurnal (Red) | Nocturnal (Blue) | Nocturnal (Red) |
---|---|---|---|---|---|---|
SBP | 110.39 ± 7.61 | 129.83 ± 6.82 | 113.39 ± 8.27 | 132.33 ± 8.55 | 102.16 ± 8.39 | 121.67 ± 3.88 |
DBP | 71.13 ± 5.53 | 82.50 ± 9.54 | 74.0 ± 6.19 | 84.50 ± 10.19 | 63.00 ± 6.66 | 76.17 ± 6.43 |
MBP | 84.26 ± 5.77 | 98.17 ± 8.21 | 87.13 ± 6.40 | 100.33 ± 9.03 | 76.10 ± 6.80 | 91.33 ± 5.35 |
Sreadings | 5.97 ± 7.54 | 45.50 ± 21.90 | 4.10 ± 7.01 | 39.73 ± 21.68 | 5.41 ± 8.16 | 47.0 ± 14.57 |
Dreadings | 19.13 ± 15.60 | 48.78 ± 24.62 | 13.64 ± 14.52 | 42.60 ± 31.14 | 22.62 ± 24.81 | 63.28 ± 17.65 |
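The Blue and Red columns summarise two groups of patients obtained by clustering. Assuming a k-means style procedure (the reference list cites scikit-learn's KMeans), the grouping step can be sketched as plain Lloyd's algorithm in Python. The readings below are synthetic values invented for illustration, not patient data. The helper `kmeans` and its fixed initial centroids are likewise assumptions, and preprocessing such as scaling is not reproduced.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm: assign, recompute means, repeat until stable."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if not members:
                # Keep the centroid of an empty cluster unchanged.
                new_centroids.append(centroids[k])
                continue
            new_centroids.append(
                tuple(sum(axis) / len(members) for axis in zip(*members)))
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Synthetic (24 h SBP, 24 h DBP) pairs in mmHg, invented for illustration only.
readings = [(108, 70), (112, 72), (105, 68), (115, 74),
            (128, 80), (131, 85), (134, 88), (127, 79)]
labels, centers = kmeans(readings, centroids=[readings[0], readings[-1]])
print(labels)
print(centers)
```

Fixed initial centroids keep the sketch deterministic. Library implementations typically use randomised initialisation with several restarts, which matters on real data where clusters are less well separated.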
Principal Findings | Clinical Applicability |
---|---|
Baseline phase: SVM-based eoPE risk reevaluation model achieved an F1-score of 97.2% (96.8–97.7%) in training and 96.8% (95.3–98.4%) in validation. Additionally, the model demonstrated high sensitivity: 95.0% (94.1–95.7%) in training and 95.0% (92.2–97.8%) in validation. | Refined risk stratification during the second trimester could be crucial for assigning risk to patients who missed first-trimester screening and for reclassifying the risk of those already taking (or not taking) aspirin, to determine whether they would benefit from more intensive monitoring. Additionally, it could replace current Bayes-based models without the need to use biomarkers such as PlGF, which are expensive and not universally available. |
Diagnostic phase: LR-based post-pregnancy hypertension prediction model achieved an F1-score of 78.5% (75.4–81.6%) in training and 77.1% (73.3–81.0%) in validation | It could lead to better resource allocation to identify high-risk individuals who require intensive blood pressure monitoring and treatment after delivery, as well as promote lifestyle interventions to prevent cardiovascular events. |
Mid-term follow-up: clustering-based analysis on blood test, CV test and ABPM variables | High blood pressure levels at the time of PE diagnosis should be taken into account in mid-term follow-up, as they are associated with persistent chronic HT and poor postpartum blood pressure control. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Domínguez-del Olmo, P.; Herraiz, I.; Villalaín, C.; Galindo, A.; Moreno-Espino, M.; Ayala, J.L. Comprehensive Approach with Machine Learning Techniques to Investigate Early-Onset Preeclampsia and Its Long-Term Cardiovascular Implications. Appl. Sci. 2025, 15, 8887. https://doi.org/10.3390/app15168887