Abstract
Objectives: We aimed to validate the PRE-ECV Score, a pre-specified clinical tool for estimating the probability of successful external cephalic version (ECV) at term, using logistic recalibration and bootstrap optimism correction. Methods: The PRE-ECV Score was defined a priori based on a literature review and expert consensus, incorporating eight variables supported by level A or B scientific evidence (total score range 0–13). Validation of this pre-specified score was performed in a retrospective, single-center cohort of 100 consecutive ECV procedures between November 2023 and October 2025. Study conduct and reporting adhered to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) statement. Results: The revised PRE-ECV Score demonstrated moderate discrimination (AUC = 0.76; 95% CI, 0.66–0.85). Calibration was moderate, with a bootstrap-corrected intercept of −3.02 (95% CI, −5.42 to −1.51) and a calibration slope of 0.70 (95% CI, 0.44–1.18). Observed ECV success rates increased across score strata, from 34.6% (0–4 points) to 79.4% (5–8 points) and 100% (≥9 points). Conclusions: The PRE-ECV Score demonstrated moderate discrimination and moderate calibration in this single-center validation. Given the small sample and model optimism, these results should be interpreted strictly as proof-of-concept.
1. Introduction
Breech presentation at term complicates approximately 3–4% of singleton pregnancies and is strongly associated with an increased risk of cesarean delivery [1]. Cesarean section rates have risen dramatically worldwide, with recent estimates approaching 50% in several regions, including Poland [2,3]. Reducing unnecessary cesarean deliveries has become a global public health priority [1]. External cephalic version (ECV) is a safe and effective intervention recommended by international guidelines to reduce the incidence of breech presentation and thereby lower cesarean section rates. According to the Royal College of Obstetricians and Gynaecologists Green-top Guideline No. 20a, ECV should be routinely offered at term, a recommendation also supported by the Cochrane review [4,5].
Success rates of ECV vary widely, from 35% to 86%, and are influenced by maternal, fetal, and procedural factors [1,2,6]. Multiparity, non-engagement of the breech, a palpable fetal head, posterior placental location, adequate amniotic fluid volume, and the use of tocolysis have all been shown to increase the likelihood of success [7,8,9,10]. While numerous predictors of ECV success have been described, the multivariable tools and clinical scores proposed to combine them have generally shown modest discrimination and inconsistent calibration [1,11,12,13,14,15].
Despite accumulating evidence, the implementation of ECV remains inconsistent, particularly in countries with high cesarean section rates such as Poland. Recent national surveys confirm low awareness and uptake among obstetricians and midwives, despite evidence of safety and effectiveness [3,16]. There is, therefore, a need for a simple, reliable, and clinically applicable scoring system to support decision-making and counseling.
To address this gap, we designed the PRE-ECV Score, a pragmatic tool based on eight clinical and ultrasonographic variables supported by level A and B evidence. The present study aimed to validate this pre-specified score in a consecutive cohort of women undergoing ECV at term. We assessed its predictive performance through logistic recalibration and applied bootstrap resampling to obtain optimism-corrected estimates, in line with TRIPOD recommendations.
2. Methods
2.1. Study Design and Setting
The PRE-ECV Score was defined based on a literature review and expert consensus. In this study, we conducted a single-center, retrospective validation of this pre-specified score in a cohort of women undergoing ECV procedures at the Clinical Department of Gynecology, Obstetrics, and Gynecological Oncology, Medical University of Silesia in Katowice (Markiefki 87, 40-211 Katowice, Poland) between November 2023 and October 2025. Performance was evaluated through logistic recalibration (intercept and slope) and bootstrap resampling to correct for optimism, in line with TRIPOD recommendations [17]. Records from 100 consecutive ECV procedures in singleton term pregnancies (gestational age ≥ 37 weeks) with breech or transverse presentation were analyzed. The primary outcome, successful ECV, was defined as immediate conversion of the fetus to cephalic presentation confirmed by ultrasound at the end of the procedure. All women undergoing ECV during the study period were included; no cases were excluded.
2.2. Predictor Variables and Scoring System
The PRE-ECV Score was developed a priori based on a literature review [1,2,3,4,5,6,7,8,9,10] and expert consensus among all of the authors. The total score ranges from 0 to 13 points and incorporates eight clinical and ultrasonographic variables supported by level A (2 points) or B (1 point) scientific evidence. All predictors were assessed immediately prior to the ECV procedure by the attending obstetrician. ECV success was determined by ultrasound at the end of the procedure. Outcome assessors were not blinded to predictor information. The PRE-ECV Score is presented in Table 1.
Table 1.
PRE-ECV Score.
2.3. Statistical Analysis
Candidate predictors were combined into an additive point score ranging from 0 to 13. The scoring system was fully prespecified before outcome evaluation, with no data-driven variable selection or coefficient optimization. To map the pre-specified total score to a predicted probability of successful ECV, we fitted a logistic regression model with the total score as the sole predictor, thereby keeping the model simple and limiting overfitting.
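Written out, a single-predictor logistic model of this kind takes the standard form below. The symbols S (total PRE-ECV Score), α (intercept), and β (slope) are notation introduced here for illustration and do not appear in the original text.

```latex
\operatorname{logit}\bigl(P(\text{success} \mid S)\bigr) = \alpha + \beta S
\quad\Longleftrightarrow\quad
P(\text{success} \mid S) = \frac{1}{1 + e^{-(\alpha + \beta S)}}
```

Because S is the only predictor, the model has just two free parameters, which is what keeps the mapping from score to probability simple.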
Model discrimination was quantified using the area under the receiver-operating characteristic curve (AUC). Calibration was assessed visually using a calibration plot with a locally weighted regression (LOESS-smoothed) curve overlaid on the ideal 45° reference line.
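As an illustration of what the AUC measures, the following minimal sketch computes it in its pairwise (Mann–Whitney) form: the proportion of success/failure pairs in which the successful case has the higher score, with ties counted as one half. The function name and the toy data are hypothetical and are not the study data or the authors' code.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Over all (success, failure) pairs, counts how often the success has the
    higher score; tied pairs contribute one half.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not the study cohort): total scores and ECV outcomes.
toy_scores = [3, 4, 4, 5, 6, 6, 7, 9]
toy_outcomes = [0, 0, 1, 0, 1, 1, 1, 1]
print(round(auc(toy_scores, toy_outcomes), 3))
```

Since a logistic model with one predictor and a positive slope is a monotone transform of the score, the AUC of the predicted probabilities equals the AUC of the raw score.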
Given the modest sample size (N = 100) and anticipated optimism, we performed bootstrap optimism correction for model performance metrics using 2000 bootstrap resamples. In each bootstrap sample, the model was refitted, and performance was evaluated both within the bootstrap sample and in the original dataset to estimate optimism. Optimism-corrected AUC, calibration intercept, and calibration slope are reported alongside apparent values.
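The resampling procedure described above can be sketched as follows for the calibration slope. This is a minimal illustration of the general bootstrap optimism-correction scheme under stated assumptions, not the authors' code: the data are simulated, the helper names and the simulation coefficients (−2.0 and 0.5) are hypothetical, and only 200 resamples are drawn to keep the example fast.

```python
import math
import random


def fit_logistic(x, y, iters=50):
    """Fit logit P(y=1) = a + b*x by Newton-Raphson; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, a + b * xi))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:  # singular information matrix (e.g. constant predictor)
            break
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-10:
            break
    return a, b


def optimism_corrected_slope(scores, outcomes, n_boot=2000, seed=0):
    """Bootstrap optimism-corrected calibration slope.

    Within each bootstrap sample the refitted model has calibration slope 1
    by construction, so optimism = 1 - (slope of that model on the original
    data) and the corrected slope is the mean of those test slopes.
    """
    rng = random.Random(seed)
    n = len(scores)
    test_slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [scores[i] for i in idx]
        yb = [outcomes[i] for i in idx]
        if len(set(xb)) < 2 or len(set(yb)) < 2:
            continue  # degenerate resample, skip it
        a_b, b_b = fit_logistic(xb, yb)
        # Linear predictor of the bootstrap model evaluated on original data.
        lp = [a_b + b_b * xi for xi in scores]
        _, slope_test = fit_logistic(lp, outcomes)
        test_slopes.append(slope_test)
    return sum(test_slopes) / len(test_slopes)


# Simulated cohort of 100 (hypothetical; coefficients chosen for illustration).
sim = random.Random(42)
scores = [sim.randint(2, 10) for _ in range(100)]
outcomes = [
    1 if sim.random() < 1.0 / (1.0 + math.exp(-(-2.0 + 0.5 * s))) else 0
    for s in scores
]
corrected = optimism_corrected_slope(scores, outcomes, n_boot=200)
print(round(corrected, 2))
```

The same loop can be extended to the calibration intercept or the AUC by recording the corresponding statistic in the bootstrap sample and in the original data and averaging the difference.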
For clinical interpretability, we calculated observed ECV success rates (with 95% Wilson confidence intervals) across three prespecified strata (0–4, 5–8, and ≥9 points). There were no missing data for predictors or outcome. This was a consecutive sample cohort, and no a priori sample size calculation was performed. All analyses were conducted using Python version 3.11 (Python Software Foundation, Wilmington, DE, USA).
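For readers unfamiliar with the Wilson interval, a minimal sketch of the calculation is given below. The function name is hypothetical; the example applies it to the overall success rate reported in this study (69 of 100 attempts), because the per-stratum counts are not given in the text.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Overall success rate reported in this study: 69 of 100 attempts.
low, high = wilson_interval(69, 100)
print(f"{low:.3f} to {high:.3f}")
```

Unlike the simple Wald interval, the Wilson interval stays within 0 to 1 and remains informative when the observed proportion is 0% or 100%, which matters for small strata.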
3. Results
3.1. Study Population
Among 100 ECV attempts, 69 were successful, yielding an overall success rate of 69.0%. The revised PRE-ECV Score was significantly higher in the successful group (median = 6; IQR 5–7) than in the unsuccessful group (median = 4; IQR 4–6; p < 0.001, Mann–Whitney U test). Baseline characteristics of the study population stratified by ECV outcome are summarized in Table 2.
Table 2.
Characteristics of the study population, stratified according to procedural outcome. Values are presented as median (IQR) or N (%). Percentages are calculated per column.
3.2. Discrimination, Calibration and Decision Curve Analysis
The revised PRE-ECV Score demonstrated moderate discrimination, with an apparent AUC of 0.76 (95% CI 0.66–0.85; DeLong method). Bootstrap optimism-corrected calibration yielded an intercept of −3.02 (95% CI −5.42 to −1.51) and a calibration slope of 0.70 (95% CI 0.44–1.18), indicating moderate agreement between predicted and observed probabilities. Decision curve analysis showed that the model provided greater net clinical benefit than “treat all” or “treat none” strategies across a clinically relevant range of threshold probabilities (Figure 1).
Figure 1.
Model performance of the revised PRE-ECV Score. (A) Receiver-operating characteristic (ROC) curve demonstrating discrimination of the PRE-ECV Score for predicting successful external cephalic version (ECV). The apparent area under the ROC curve (AUC) was 0.76 with a 95% confidence interval of 0.66–0.85 (DeLong method). (B) Calibration plot showing agreement between predicted and observed probabilities of ECV success. The LOESS-smoothed calibration curve (solid black line) demonstrates moderate calibration relative to the ideal 45° reference line (gray dashed), consistent with a bootstrap-corrected calibration intercept of −3.02 and calibration slope of 0.70. Black points represent mean predicted and observed probabilities within deciles of predicted risk. (C) Decision curve analysis (DCA) evaluating the net clinical benefit of using the PRE-ECV Score across a range of threshold probabilities. The PRE-ECV model (solid black line) provides greater net benefit than “treat all” (gray dashed line) and “treat none” (gray dotted line) strategies for threshold probabilities between approximately 0.25 and 0.70, indicating meaningful potential for guiding clinical decision-making.
3.3. Validation Sample Only
The Youden index identified a cutoff of ≥9 points in this cohort. The predicted probability of ECV success for a score of 9 was 0.97 in the apparent logistic model and 0.96 after bootstrap correction. These estimates reflect the prevalence and score distribution of this single-center sample and may be optimistic; therefore, threshold determination should be re-evaluated in external validation.
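The two calculations mentioned above can be sketched as follows. The cutoff search maximises J = sensitivity + specificity − 1 over the observed score values, and the probability function is the standard logistic mapping. The function names and toy data are hypothetical. The final lines assume, purely for illustration, that the reported bootstrap-corrected intercept (−3.02) and slope (0.70) act directly on the total score; the text does not state that the reported probabilities were derived this way.

```python
import math


def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1,
    where 'score >= cutoff' is classified as predicted success."""
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one success and one failure")
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= c and y == 1)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < c and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


def predicted_probability(score, intercept, slope):
    """Logistic mapping from a total score to a predicted probability."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


# Hypothetical toy data (not the study cohort).
toy_scores = [3, 4, 4, 5, 6, 6, 7, 9]
toy_outcomes = [0, 0, 1, 0, 1, 1, 1, 1]
print(youden_cutoff(toy_scores, toy_outcomes))

# Illustrative assumption: reported coefficients applied to a score of 9.
p9 = predicted_probability(9, intercept=-3.02, slope=0.70)
print(round(p9, 2))
```

Under that assumption the value for a score of 9 is close to the bootstrap-corrected 0.96 reported above. Because the Youden cutoff is chosen on the same data used to evaluate it, it inherits the optimism of the sample.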
3.4. Clinical Utility
Decision-curve analysis (DCA) demonstrated that the PRE-ECV Score provided greater net benefit than either a “treat-all” or “treat-none” strategy across clinically relevant threshold probabilities ranging from 0.30 to 0.80. At the predefined cutoff of ≥9 points, the score achieved a sensitivity of 64% (95% CI, 52–75), a specificity of 80.6%, a positive predictive value of 88% (95% CI, 75–95), and a negative predictive value of 50% (95% CI, 38–62). These findings indicate balanced discriminatory performance and potential clinical utility, although threshold-based decision-making should be re-evaluated in future external validation studies.
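For reference, the net benefit plotted in a decision curve is conventionally computed as TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The sketch below implements that standard formula alongside the “treat-all” reference strategy; the function names and the four-patient toy data are hypothetical and are not the study data.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of acting when predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit of the 'treat-all' strategy at the same threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data: predicted probabilities and observed outcomes.
toy_probs = [0.2, 0.4, 0.6, 0.8]
toy_outcomes = [0, 0, 1, 1]
for pt in (0.3, 0.5, 0.7):
    print(pt, net_benefit(toy_probs, toy_outcomes, pt),
          net_benefit_treat_all(toy_outcomes, pt))
```

The “treat-none” strategy has a net benefit of zero at every threshold, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.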
4. Discussion
The PRE-ECV Score demonstrated moderate but clinically meaningful discrimination in predicting external cephalic version success, with an AUC of 0.76 (95% CI 0.66–0.85). Decision-curve analysis suggested a net clinical benefit of using the score, supporting its potential role as a decision aid. At the predefined cutoff of ≥9 points, the model achieved balanced diagnostic performance, with a clear separation between strata. These findings, although encouraging, should be interpreted as exploratory and hypothesis-generating, pending external validation.
Most previously proposed ECV prediction tools report mixed discrimination and variable calibration. In the widely cited Dahl et al. (2021) model (BMI, parity, placental location, presentation), the AUC was 0.667 (95% CI 0.634–0.701) with good calibration; the cohort’s overall ECV success rate was 40.6% [1]. External validation by Kishkovich et al. (2023) in an independent US cohort yielded an AUC of 0.70 (95% CI 0.65–0.75) with a 44.4% success rate. Importantly, practice patterns differed substantially (neuraxial anesthesia 83.5% in the derivation setting vs. 10.4% in the validation cohort), underscoring spectrum/practice-mix effects on model transportability [11]. Earlier tools performed similarly or worse. De Hundt et al. (2012) externally validated a prior model and found an AUC of 0.66 (95% CI 0.60–0.72) with systematic underestimation of success by 4–14%, indicating miscalibration in a new setting [12]. The clinical score by Burgos et al. (2012) reported a predictive capacity of 70.1% (95% CI 66.9–73.4) [14]. The GNK-PIMS model proposed by Tasnim et al. has shown the weakest performance among commonly cited tools in comparative work [15]. Recent head-to-head testing by Yerrabelli et al. (2025) [13], evaluating six models in a single US institution, found the Dahl et al. (2021) model [1] to be the best performer (AUC 0.779; bootstrapped 95% CI 0.71–0.84) and well calibrated, whereas the Tasnim et al. (2012) model [15] performed worst (AUC 0.626) and the others clustered around AUC 0.68–0.71.
The strengths of this study include the use of prespecified predictors derived from prior literature and expert consensus; a comprehensive evaluation of model performance encompassing discrimination, multiple calibration indices, and decision curve analysis; and the development of a transparent, point-based scoring system that has the potential to be simple to apply in clinical practice if validated externally.
Limitations relate to its retrospective, single-center design and modest sample size (N = 100), which raise concerns about spectrum effects and residual confounding. The AUC observed in our cohort likely reflects residual optimism due to the sample size and relatively high number of prespecified predictors. Although bootstrap correction partially accounts for optimism, it remains a form of “testing on ourselves” and cannot guarantee reproducibility in independent data. Accordingly, PRE-ECV should be regarded solely as a proof-of-concept model. While the high apparent AUC reflects potential, it also highlights the risk of overfitting in a small, single-center dataset. Only external, multicenter validation will determine whether the model has reproducible value, and until then, no clinical implementation should be attempted.
Moreover, we emphasize that a successful external cephalic version (ECV) does not automatically result in cephalic presentation at delivery or vaginal birth. However, immediate ECV success is the most direct and procedure-specific outcome and can be clearly attributed to the predictors included in the score. Evaluating downstream outcomes, such as mode of delivery, would introduce post-procedural factors and practice-related variability; therefore, this was deliberately avoided in this proof-of-concept validation.
Further research should prioritize multicenter validation and potential recalibration. In addition, cost-effectiveness analyses, such as those conducted by Tan et al., may help clarify whether targeted use of adjuncts in moderate-probability groups is economically and clinically advantageous [18]. Finally, compared with published cohorts, namely Dahl (2021) [1] (success 40.6%), Kishkovich (2023) [11] (44.4%), and Yerrabelli (2025) [13] (52.2%), our center’s success rate (69%) was higher, which may inflate the apparent AUC and net benefit. Differences in operator experience, patient selection, and anesthesia/tocolysis protocols likely contribute and should be considered when applying PRE-ECV outside our setting. External validation in centers with diverse practice patterns will be necessary to confirm calibration and net benefit.
5. Conclusions
The PRE-ECV Score represents a proof-of-concept model that demonstrated apparently strong performance within a small, single-center validation cohort. Its relatively high apparent discrimination likely reflects sample-specific characteristics and residual overfitting rather than true generalizable accuracy. These findings should therefore be interpreted as preliminary evidence of potential utility rather than as a clinically deployable tool. External, multicenter validation is essential before considering any clinical implementation.
Author Contributions
M.M.-D. Conceptualization; Methodology; Data curation; Formal analysis; Investigation; Writing—original draft; Visualization. J.S. Conceptualization; Methodology; Supervision; Formal analysis; Project administration; Writing—original draft; Writing—review & editing. K.N. Validation; Writing—review & editing. K.S. Validation; Writing—review & editing. E.W. Validation; Writing—review & editing. R.S. Supervision; Writing—review & editing. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The study was approved by the Review Board of the Chair and Department of Gynecology, Obstetrics, and Gynecological Oncology at the Medical University of Silesia, Katowice (Project identification code: 23/2025, date of approval: 15 October 2025). The study was conducted in accordance with ethical principles governing medical research, including the Declaration of Helsinki. All personal data were securely protected and remained confidential within the participating research center.
Informed Consent Statement
The requirement to obtain consent for data collection and analysis was waived due to the study’s retrospective nature, which involved anonymized data in a non-interventional setting.
Data Availability Statement
The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Dahl, C.M.; Zhang, Y.B.; Ong, J.X.B.; Yeh, C.; Son, M.M.; Miller, E.S.; Roy, A.; Grobman, W.A.M. A Multivariable Predictive Model for Success of External Cephalic Version. Obstet. Gynecol. 2021, 138, 426–433.
- Manasar-Dyrbus, M.; Seifert, B.; Drosdzol-Cop, A.; Stojko, R.; Staniczek, J. Transforming clinical practice in just one year: Lessons from external cephalic version success. Ginekol. Pol. 2025, 96, 482–489.
- Manasar-Dyrbus, M.; Drosdzol-Cop, A.; Stojko, S.; Stojko, R.; Staniczek, J. Strategies to reduce cesarean deliveries: Surveying Polish obstetricians on external cephalic version practices. Ginekol. Pol. 2025, 96, 271–281.
- Hofmeyr, G.J.; Kulier, I.R. External cephalic version for breech presentation at term. Cochrane Database Syst. Rev. 2015, 2015, CD000083.
- Impey, L.W.M.; Murphy, D.J.; Griffiths, M.; Bray, E.; Penna, L.K. External Cephalic Version and Reducing the Incidence of Term Breech Presentation: Green-top Guideline No. 20a. BJOG Int. J. Obstet. Gynaecol. 2017, 124, e178–e192.
- Kwiatek, M.; Geca, T.; Stupak, A.; Kwasniewski, W.; Mlak, R.; Kwasniewska, I.A. External cephalic version—Single-center experience. Ginekol. Pol. 2024, 95, 779–784.
- Kok, M.; Van Der Steeg, J.W.; Mol, B.W.J.; Opmeer, B.; Van Der Post, I.J.A. Which factors play a role in clinical decision-making in external cephalic version? Acta Obstet. Gynecol. Scand. 2008, 87, 31–35.
- Kok, M.; Cnossen, J.; Gravendeel, L.; Van Der Post, J.A.; Mol, I.B.W. Ultrasound factors to predict the outcome of external cephalic version: A meta-analysis. Ultrasound Obstet. Gynecol. 2009, 33, 76–84.
- Goetzinger, K.R.; Harper, L.M.; Tuuli, M.G.; Macones, G.A.; Colditz, G.A. Effect of Regional Anesthesia on the Success Rate of External Cephalic Version: A Systematic Review and Meta-Analysis. Obstet. Gynecol. 2011, 118, 1137–1144.
- Kok, M.; Cnossen, J.; Gravendeel, L.; Van Der Post, J.; Opmeer, B.; Mol, I.B.W. Clinical factors to predict the outcome of external cephalic version: A metaanalysis. Am. J. Obstet. Gynecol. 2008, 199, 630.e1–630.e7.
- Kishkovich, T.P.; Naert, M.N.; Warsame, F.B.; Taboada, M.P.; James, K.E.; Barth, W.H.J.; Clapp, M.A. External Validation of a Prediction Model for External Cephalic Version Success. Obstet. Gynecol. 2023, 141, 964–966.
- De Hundt, M.; Vlemmix, F.; Kok, M.; Van Der Steeg, J.W.; Bais, J.M.; Mol, B.W.; Van Der Post, J.A. External Validation of a Prediction Model for Successful External Cephalic Version. Am. J. Perinatol. 2012, 29, 231–236.
- Yerrabelli, R.S.; Palsgaard, P.K.; Shankarappa, P.; Jennings, I.V. The Optimal Prediction Model for Successful External Cephalic Version. Am. J. Perinatol. 2025, 42, 751–757.
- Burgos, J.; Cobos, P.; Rodriguez, L.; Pijoán, J.I.; Fernández-Llebrez, L.; Martínez-Astorquiza, T.; Melchor, J.C. Clinical score for the outcome of external cephalic version: A two-phase prospective study. Aust. N. Z. J. Obstet. Gynaecol. 2012, 52, 59–61.
- Tasnim, N.; Mahmud, G.; Javaid, I.K. GNK-PIMS Score: A Predictive Model for Success of External Cephalic Version. J. South Asian Fed. Obstet. Gynaecol. 2012, 4, 99–102.
- Manasar-Dyrbuś, M.; Janik, A.; Jendyk, C.; Drosdzol-Cop, A.; Brzęk, A.; Stojko, R.; Staniczek, J. Strategies to reduce cesarean deliveries: Surveying Polish midwives and midwifery students on external cephalic version practices. BMC Nurs. 2025, 24, 582.
- Collins, G.S.; Reitsma, J.B.; Altman, D.G.; Moons, I.K. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD Statement. BMC Med. 2015, 350, g7594.
- Tan, J.M.; Macario, A.; Carvalho, B.; Druzin, M.L.; El-Sayed, I.Y.Y. Cost-effectiveness of external cephalic version for term breech presentation. BMC Pregnancy Childbirth 2010, 10, 3.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.