Temporal Trends and Machine Learning-Based Risk Prediction of Female Infertility: A Cross-Cohort Analysis Using NHANES Data (2015–2023)
Abstract
1. Introduction
2. Methods
2.1. Data Source and Study Population
2.2. Definition of Infertility
2.3. Variable Selection and Harmonization
2.4. Statistical and Machine Learning Analysis
2.5. Ethical Considerations
3. Results
3.1. Temporal Trends in Female Infertility Based on NHANES Data (2015–2023)
3.2. Descriptive Statistics of the Study Population Based on Common Variables Across NHANES Cohorts (2015–2023)
3.3. Relative Importance of Common Predictors Across NHANES Cohorts Based on Logistic Regression Coefficients
3.4. Infertility Rate by Risk Factor Across Cohorts
3.5. Multivariate Analysis of Infertility Predictors
3.6. Model Performance Comparison
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
NHANES | National Health and Nutrition Examination Survey |
PCOS | Polycystic Ovary Syndrome |
PID | Pelvic Inflammatory Disease |
ML | Machine Learning |
LR | Logistic Regression |
RF | Random Forest |
XGBoost | Extreme Gradient Boosting |
NB | Naïve Bayes |
SVM | Support Vector Machine |
AUC | Area Under the Receiver Operating Characteristic Curve |
ROC | Receiver Operating Characteristic |
OR | Odds Ratios |
CIs | Confidence Intervals |
CDC | Centers for Disease Control and Prevention |
Variable | 2015–2016 | 2017–2018 | 2021–2023 | Total (N = 6560) | p-Value |
---|---|---|---|---|---|
Sample Size (n) | 2534 | 2483 | 1543 | 6560 | |
Age at Menarche (years), mean ± SD | 12.7 ± 1.8 | 12.7 ± 1.8 | 12.3 ± 1.7 | 12.6 ± 1.8 | |
Total Deliveries, mean ± SD | 2.4 ± 1.9 | 2.4 ± 4.0 | 1.5 ± 1.4 | 2.2 ± 2.9 | |
Menstrual Irregularity, n (%) | | | | | |
Yes | 1168 (46.09%) | 1064 (42.85%) | 1017 (65.91%) | 3249 (49.53%) | |
No | 1366 (53.91%) | 1419 (57.15%) | 526 (34.09%) | 3311 (50.47%) | |
Hysterectomy, n (%) | | | | | |
Yes | 557 (21.98%) | 557 (22.43%) | 181 (11.73%) | 1295 (19.74%) | |
No | 1977 (78.02%) | 1926 (77.57%) | 1362 (88.27%) | 5265 (80.26%) | |
Pelvic Infection (PID), n (%) | | | | | |
Yes | 83 (3.28%) | 114 (4.59%) | 76 (4.93%) | 273 (4.16%) | |
No | 2451 (96.72%) | 2369 (95.41%) | 1467 (95.07%) | 6287 (95.84%) | |
Ever Pregnant, n (%) | | | | | |
Yes | 2134 (84.21%) | 2115 (85.18%) | 1114 (72.20%) | 5363 (81.75%) | |
No | 400 (15.79%) | 368 (14.82%) | 429 (27.80%) | 1197 (18.25%) | |
Both Ovaries Removed, n (%) | | | | | |
Yes | 287 (11.33%) | 287 (11.56%) | 66 (4.28%) | 640 (9.76%) | |
No | 2247 (88.67%) | 2196 (88.44%) | 1477 (95.72%) | 5920 (90.24%) | |
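The p-Value column of this table is empty in the available text. As an illustration only, the sketch below shows how a between-cohort comparison for one categorical variable could be computed with an unweighted Pearson chi-square test, using the hysterectomy counts from the table. This is an assumption about one possible approach, not the authors' documented procedure, and it ignores the NHANES survey weights that a design-based analysis would require. The function name `chi_square_2xk` is hypothetical.

```python
import math

# Hysterectomy counts (yes, no) per cohort, copied from the table above.
counts = {
    "2015-2016": (557, 1977),
    "2017-2018": (557, 1926),
    "2021-2023": (181, 1362),
}

def chi_square_2xk(table):
    """Pearson chi-square statistic for a 2 x k table of (yes, no) counts."""
    col_totals = [yes + no for yes, no in table]
    total_yes = sum(yes for yes, _ in table)
    grand_total = sum(col_totals)
    total_no = grand_total - total_yes
    stat = 0.0
    for (yes, no), n in zip(table, col_totals):
        exp_yes = n * total_yes / grand_total   # expected count under homogeneity
        exp_no = n * total_no / grand_total
        stat += (yes - exp_yes) ** 2 / exp_yes + (no - exp_no) ** 2 / exp_no
    return stat

stat = chi_square_2xk(list(counts.values()))
dof = len(counts) - 1  # (2 - 1) * (3 - 1) = 2 degrees of freedom
# With exactly 2 degrees of freedom, the chi-square tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)

for cohort, (yes, no) in counts.items():
    print(f"{cohort}: {100 * yes / (yes + no):.2f}% with hysterectomy")
print(f"chi-square = {stat:.1f}, df = {dof}, p = {p_value:.3g}")
```

The closed-form tail probability holds only for two degrees of freedom; a table with a different number of cohorts or categories would need a general chi-square survival function.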
Variable | Adjusted OR | 95% CI | p-Value |
---|---|---|---|
Age at menarche | 1.00 | 1.0–1.0 | 0.5365 |
Menstrual irregularity | 0.55 | 0.40–0.77 | 0.0005 * |
Hysterectomy | 1.36 | 0.88–2.09 | 0.1683 |
Total deliveries | 0.00 | 0.0–∞ | 0.9919 |
Pelvic infection | 1.05 | 0.87–1.28 | 0.6002 |
Both ovaries removed | 1.02 | 0.82–1.28 | 0.8303 |
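For readers who want to relate the adjusted odds ratios to the underlying logistic regression coefficients, the following is a minimal sketch of the standard Wald relationships: OR = exp(β) and 95% CI = exp(β ± 1.96·SE). It works backwards from the rounded values reported for menstrual irregularity (OR 0.55, CI 0.40–0.77), so the recovered coefficient, standard error, and p-value are approximations only. The function names are hypothetical, and the source does not state that the authors used Wald intervals.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-odds coefficient, its standard error, the Wald z
    statistic, and a two-sided p-value from a reported OR and 95% CI."""
    beta = math.log(odds_ratio)
    # The CI is symmetric on the log scale, with total width 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return beta, se, z, p

def or_with_ci(beta, se, z_crit=1.96):
    """Forward direction: odds ratio and 95% CI from a coefficient and SE."""
    return (
        math.exp(beta),
        math.exp(beta - z_crit * se),
        math.exp(beta + z_crit * se),
    )

# Menstrual irregularity row from the table above (rounded published values).
beta, se, z, p = wald_from_or(0.55, 0.40, 0.77)
print(f"beta = {beta:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
print("round trip:", [round(v, 2) for v in or_with_ci(beta, se)])
```

Because the inputs are rounded to two decimals, the round trip and the p-value agree with the table only approximately. The same relationships also explain the total deliveries row: an odds ratio of 0.00 with an unbounded interval is what an extremely large negative coefficient with a very large standard error produces.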
Model | Accuracy | Precision | Recall | F1-Score | Specificity |
---|---|---|---|---|---|
Logistic Regression | 0.949 | 0.784 | 0.992 | 0.876 | 0.939 |
Random Forest | 0.948 | 0.790 | 0.975 | 0.873 | 0.942 |
XGBoost | 0.950 | 0.786 | 1.000 | 0.880 | 0.939 |
Naive Bayes | 0.948 | 0.784 | 0.983 | 0.873 | 0.940 |
SVM | 0.950 | 0.786 | 1.000 | 0.880 | 0.939 |
Ensemble (Stacking Classifier; base layer: XGBoost + Random Forest + Logistic Regression + SVM; meta-layer: Logistic Regression) | 0.950 | 0.786 | 1.000 | 0.880 | 0.939 |
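The five metrics in this table all derive from the four cells of a binary confusion matrix. The sketch below shows those definitions, then checks that the reported F1-scores follow from the reported precision and recall. The confusion-matrix counts are hypothetical and chosen only for illustration; the test-set counts behind the table are not given in the available text.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # also called sensitivity
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }

def f1_from(precision, recall):
    """F1-score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, not taken from the study.
m = classification_metrics(tp=90, fp=25, tn=380, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")

# Consistency check against two rows of the table above.
print(round(f1_from(0.784, 0.992), 3))  # Logistic Regression row -> 0.876
print(round(f1_from(0.786, 1.000), 3))  # XGBoost row -> 0.88
```

High recall combined with precision near 0.79 means most errors are false positives, which is why specificity stays below accuracy in every row.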
Model | Parameter | Value |
---|---|---|
Logistic Regression | C | 1 |
Logistic Regression | penalty | l2 |
Logistic Regression | solver | lbfgs |
Random Forest | max_depth | 5 |
Random Forest | min_samples_split | 2 |
Random Forest | n_estimators | 200 |
XGBoost | colsample_bytree | 1 |
XGBoost | learning_rate | 0.01 |
XGBoost | max_depth | 3 |
XGBoost | n_estimators | 200 |
XGBoost | subsample | 0.8 |
Naive Bayes | var_smoothing | 0 |
SVM | C | 0.1 |
SVM | gamma | scale |
SVM | kernel | linear |
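As a rough illustration of how the listed settings could be assembled into the stacking ensemble described in the performance table (base layer of gradient boosting, Random Forest, Logistic Regression, and SVM, with a Logistic Regression meta-layer), here is a minimal scikit-learn sketch. Several points are assumptions rather than the authors' code: `max_depth` and `min_samples_split` are read as Random Forest settings; scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so the example has no extra dependency (it has no direct `colsample_bytree` equivalent and uses all features by default); the data are synthetic; and `build_stack`, `cv=5`, `max_iter=1000`, and the random seeds are illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def build_stack(random_state=0):
    """Stacking ensemble using hyperparameters from the table above."""
    base_layer = [
        # Stand-in for XGBoost (assumption): same depth, rate, and subsample.
        ("gb", GradientBoostingClassifier(
            n_estimators=200, learning_rate=0.01, max_depth=3,
            subsample=0.8, random_state=random_state)),
        ("rf", RandomForestClassifier(
            n_estimators=200, max_depth=5, min_samples_split=2,
            random_state=random_state)),
        # The L2 penalty is scikit-learn's default for LogisticRegression.
        ("lr", LogisticRegression(C=1.0, solver="lbfgs", max_iter=1000)),
        ("svm", SVC(C=0.1, kernel="linear", gamma="scale")),
    ]
    meta_layer = LogisticRegression(max_iter=1000)
    # Base-model predictions are produced out-of-fold (cv=5) to train the meta-layer.
    return StackingClassifier(estimators=base_layer, final_estimator=meta_layer, cv=5)

# Synthetic data with six features, standing in for the harmonized predictors.
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = build_stack().fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about performance on NHANES. Swapping in `xgboost.XGBClassifier` for the gradient boosting stand-in would bring the sketch closer to the listed configuration.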
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Begum, I.A.; Ghimire, D.; Hosen, A.S.M.S. Temporal Trends and Machine Learning-Based Risk Prediction of Female Infertility: A Cross-Cohort Analysis Using NHANES Data (2015–2023). Diagnostics 2025, 15, 2250. https://doi.org/10.3390/diagnostics15172250