Abstract
Background/Objectives: Chronic endometritis (CE) is a well-known risk factor for recurrent implantation failure. However, the traditional approach to CE diagnosis has several drawbacks. At the same time, there is considerable evidence that some clinical, instrumental, and/or laboratory parameters are associated with CE. The aim of this study is to build a CE prediction model using machine learning tools based on low-invasive pathological features. Methods: Data from 108 women (44 with and 64 without CE) from a multicenter prospective cross-sectional study were included in this study. Basic characteristics, reproductive history, laboratory and ultrasound indicators, and immunohistochemistry results were collected. Feature selection was performed using forward stepwise selection with logistic regression as the evaluation criterion. For each feature configuration, a gradient-boosting model was trained on decision trees with a binary logistic loss function. The models were evaluated and compared on test data using standard metrics. Results: We built five comparable predictive models for CE. The models yielded the following AUCs (95% CI): Model 1 (seven indicators): 0.704 (0.5170, 0.8907); Model 2 (seven indicators): 0.673 (0.4716, 0.8745); Model 3 (nine indicators): 0.677 (0.4916, 0.8622); Model 4 (five indicators): 0.758 (0.5913, 0.9241); and Model 5 (five indicators): 0.769 (0.5913, 0.9241). Models 2 and 5 had the best recall and precision values, but the differences were not significant. SHAP values indicated that serum adiponectin level (Model 2) and SHBG (Model 5) had the greatest association with CE risk. Conclusions: Models 2 and 5 show the most promising potential for clinical application, as they demonstrate superior recall and precision metrics and require assessment of only 5–7 risk markers (with only a few being non-routine) for their implementation.
1. Introduction
Chronic endometritis (CE) is strongly associated with recurrent implantation failure (RIF) [1]. Recent studies have consistently shown a significant prevalence of CE in infertile women, particularly among those diagnosed with RIF, recurrent spontaneous abortion, and unexplained infertility, with observed frequencies of 23.4%, 37.6%, and 19.46%, respectively [2,3]. On the other hand, Yilmaz et al. (2025) found that CE was uncommon in patients with infertility and implantation failure. Consequently, routine diagnosis of CE in patients with these pathologies is not necessary, at least in the context of assisted reproductive technologies (ART) [4].
Currently, the most specific method for evaluating CE is immunohistochemical examination of the endometrium for transmembrane heparan sulfate proteoglycan syndecan-1 (CD138). However, this method has several limitations. First, the techniques, conditions, and interpretations of immunohistochemistry for CD138+ in human endometrium have not yet been standardized. Second, the sampling methods used in endometrial biopsy are not fully effective and safe. Consequently, this procedure may fail to sample the required endometrial tissue, while potentially leading to the development of complications, such as endometrial thinning and intrauterine adhesions. Third, the diagnostic approach requires specific clinical indications. This often poses difficulties because of the asymptomatic or oligosymptomatic nature of CE [5]. In addition, patients have to endure a painful procedure.
These challenges highlight the need to develop less invasive and technically simpler methods for diagnosing CE. However, recent attempts to identify specific clinical or serum laboratory indicators of CE have been unsuccessful. Nevertheless, there is substantial evidence demonstrating associations between certain clinical or laboratory indicators and CE.
Chen et al. (2016) [6] conducted a study involving 93 patients who had undergone laparoscopic and hysteroscopic examination for infertility. The study identified episodes of prolonged menstrual bleeding, abortions, and fallopian tube obstruction as independent risk factors for CE. It is noteworthy that these patients exhibited no clinical signs of CE. Specifically, none of the patients reported pelvic pain, indicating the latent nature of the disease [6]. Hosseini et al. (2024) found an elevated risk of CE in women with endometrial polyps and uterine fibroids [7]. Kabodmehri et al. (2022) showed a higher incidence of CE in patients with submucosal myoma than in patients with intramural and subserosal fibroids or in the control group [8]. In another study, researchers found a relationship between CE and endometrial polyps (EPs): women with EPs showed a higher prevalence of CE than women without EPs [9]. These findings suggest that ultrasound examination of pelvic organs and gynecological history data can also be used as part of a comprehensive approach to CE diagnosis. In our previous studies, we identified a significant association between elevated serum interleukin (IL) 1 levels, the IL-1/tumor necrosis factor α (TNFα) ratio, and adiponectin with CE in women with normal body mass [10,11]. Furthermore, low leptin levels were associated with CE in women who were overweight and obese [11].
Many of the described clinical and laboratory parameters have been shown to lack sufficient specificity for CE diagnosis. This may be related, among other factors, to the comorbidity associated with inflammation, including low-grade inflammation [12].
Therefore, we put forward a hypothesis: To enhance the diagnostic specificity and accuracy of CE evaluation based on low-invasive parameters, it is necessary to employ approaches that take into account the combinations of these parameters in a particular patient’s case. This novel diagnostic approach to CE has the potential to enhance early detection of this condition before complications and reproductive losses develop. Furthermore, the identification of CE in women with infertility will enable therapeutic adjustments aimed at restoring endometrial function. Previous studies have addressed CE prediction using deep learning models, primarily convolutional neural networks (CNNs), for analyzing hysteroscopic images to link findings with histopathologic CE diagnosis [13,14].
The aim of this study is to build a CE prediction model using machine learning (ML) tools based on low-invasive pathological features.
In this study, we present the results of developing extreme gradient-boosting machine learning models for predicting CE based on data from medical history, instrumental examinations, and laboratory tests of women from a non-selective population obtained during a previously conducted epidemiological study [15]. The results of the analysis suggest that a combination of certain clinical and laboratory parameters could be useful for identifying CE, thereby facilitating appropriate referral of patients for endometrial biopsy to confirm the diagnosis.
2. Materials and Methods
2.1. Patients
To conduct this analysis, we used data from a multicenter prospective cross-sectional study of the prevalence of polycystic ovarian syndrome (PCOS) in an unselected multiethnic population of premenopausal women (ES-PEP study). This study included premenopausal women who were undergoing a mandatory annual employment-related health assessment, which ensured the non-selectivity of the sample. The database from this study contains information from a comprehensive examination, which comprised the following: questionnaire survey, general physical examination, gynecological examination, ultrasound examination, laboratory tests (biochemical, hormonal, and immunological), etc. [15]. We included 137 women in the analysis according to the following inclusion and exclusion criteria:
Inclusion criteria: age 18–45 years, and regular menstrual cycle.
Exclusion criteria: biochemical hyperandrogenism (HA) [16], hyperprolactinemia (prolactin ≥ 726 IU/L), hypothyroidism (thyroid-stimulating hormone [TSH] ≥ 4 mmol/mL), non-classic congenital adrenal hyperplasia (17-hydroxyprogesterone [17-OH] ≥ 6.9 nmol/L), premature ovarian failure (follicle-stimulating hormone [FSH] ≥ 20 mIU/L), amenorrhea, endometrial ablation, bilateral oophorectomy, sexually transmitted infections, and acute conditions.
Next, we excluded the data of 29 women with a body mass index (BMI) ≥ 30 because of the low quality of preliminary models, which could be associated with low-grade inflammation in the setting of obesity. The final dataset included 108 women: 44 with CE and 64 without CE.
2.2. Clinical, Instrumental, and Laboratory Parameters
We included the following clinical, ultrasound, and laboratory parameters in the analysis: age, visceral adipose tissue (VAT, %), menstrual cycle duration, number of days of heavy menstrual bleeding, Pictorial Blood Assessment Chart (PBAC), spontaneous abortion, extrauterine pregnancy, missed abortion, Cesarean section, endometrial thickness, uterine fibroids, endometrial polyp, serum level of C-reactive protein (CRP), anti-Mullerian hormone (AMH), total testosterone (T), sex hormone-binding globulin (SHBG), prolactin, TSH, 17-OH, FSH, luteinizing hormone (LH), estradiol (E2), leptin, adiponectin (ADIPOQ), IL-1, IL-4, IL-6, IL-8, IL-10, TNFα, and interferon (IFN) γ. Details of all measurements were previously described [10,15,17]. We also used two calculated parameters: the leptin/ADIPOQ ratio and the IL-1/TNFα ratio.
Evaluation of CE was based on CD138 expression in the endometrial stroma, as described earlier [17]. Briefly, on days 8–10 of the menstrual cycle, endometrial pipelle biopsies were performed. The obtained endometrial samples underwent immunohistochemical examination of CD138 expression using standard antibody kits (Dako, Glostrup, Denmark). The results were evaluated based on the presence or absence of a positive cytoplasmic reaction for CD138 in individual plasma cells within the endometrial stroma. According to the ES-PEP study protocol, an endometrial biopsy was performed on all participants who consented to the procedure, without selecting participants based on clinical symptoms of CE.
2.3. Machine Learning and Statistical Analysis Methods
2.3.1. Data Pre-Processing
For this study, we use a dataset hosted in the IEEE DataPort repository: https://dx.doi.org/10.21227/7vd8-8f90 (accessed on 19 October 2025). This dataset contains raw, structured, and fully anonymized data in CSV format and includes all clinical, instrumental, and laboratory parameters used for ML and statistical analysis. The dataset is publicly available, ensuring the reproducibility of results and facilitating further research.
The data were checked for missing values and anomalies. Missing values were imputed using Multiple Imputation by Chained Equations (MICE) based on random forests. To address class imbalance in the training dataset, the Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTENC) was applied. Numerical features were standardized using StandardScaler. The data were split into training (70%) and test (30%) sets while maintaining class proportionality.
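The split and standardization steps can be illustrated with a minimal standard-library sketch. It is not the authors' pipeline (which used scikit-learn, Miceforest, and Imbalanced-learn, as listed in Section 2.3.4). It assumes that the stratified split is performed first and that the scaling parameters are estimated on the training portion only; the helper names `stratified_split`, `fit_scaler`, and `transform` and the simulated feature values are hypothetical.

```python
import random
import statistics

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with class proportions kept roughly equal."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

def fit_scaler(values):
    """Mean and population standard deviation of the training values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against a constant feature
    return mean, std

def transform(values, mean, std):
    """Standardize values with parameters estimated elsewhere."""
    return [(v - mean) / std for v in values]

# Class sizes reported in the paper: 44 women with CE, 64 without.
labels = [1] * 44 + [0] * 64
train_idx, test_idx = stratified_split(labels)

# Hypothetical feature values, generated only to exercise the code.
rng = random.Random(1)
feature = [rng.gauss(10.0, 2.0) for _ in labels]

# Scaling parameters come from the training rows only (an assumption here).
mean, std = fit_scaler([feature[i] for i in train_idx])
train_scaled = transform([feature[i] for i in train_idx], mean, std)
test_scaled = transform([feature[i] for i in test_idx], mean, std)

print(len(train_idx), len(test_idx))  # 76 32
```

Estimating the scaler on the training rows and reusing it for the test rows keeps information from the test set out of the preprocessing step; oversampling would likewise be restricted to the training rows.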
Before imputation and standardization, the dataset was divided into two groups (with and without CE) and compared using the Mann–Whitney U test (for continuous variables) or the Chi-square test (for categorical variables) on the previously described parameters.
2.3.2. Selection of Features and Model
Feature selection was performed using forward stepwise selection with logistic regression, with Receiver Operating Characteristic–Area Under Curve (ROC-AUC) as the evaluation criterion.
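The greedy logic of forward stepwise selection can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `forward_select` is a hypothetical helper, and `toy_score` with its `TOY_GAIN` numbers stands in for the ROC-AUC of a logistic regression fitted on each candidate feature set.

```python
def forward_select(candidates, score_fn, max_features=None):
    """Greedy forward selection: repeatedly add the feature that most
    improves score_fn, and stop when no candidate improves it."""
    selected, best_score = [], float("-inf")
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        scored = [(score_fn(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored)
        if top_score <= best_score:
            break  # no remaining candidate improves the criterion
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score

# Hypothetical per-feature contributions; in practice the score would be
# a ROC-AUC estimated for a logistic regression on the feature subset.
TOY_GAIN = {"CRP": 0.08, "SHBG": 0.06, "FSH": 0.03, "age": -0.01}

def toy_score(features):
    return 0.5 + sum(TOY_GAIN[f] for f in features)

selected, score = forward_select(list(TOY_GAIN), toy_score)
print(selected)  # ['CRP', 'SHBG', 'FSH']
```

In this toy run, "age" is never added because it would lower the score, which is the stopping behavior that keeps the selected sets small.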
For each feature configuration, a gradient-boosting model was trained using decision trees with a binary logistic loss function, and the ROC-AUC metric was used as the evaluation criterion.
Given that we utilized the maximum possible exclusion criteria for sample formation (signs potentially associated with inflammation), we did not perform further confounding variable control.
2.3.3. Model Evaluation and Interpretation
The models were evaluated on test data using the following performance metrics: accuracy (the overall correctness of predictions), recall (the ability to correctly identify positive cases), precision (the proportion of true positive predictions among all positive predictions), specificity (the ability to correctly identify negative cases), F1-score (the harmonic mean of precision and recall), ROC-AUC (area under the ROC curve), and PR-AUC (area under the precision–recall curve). The following visualizations were generated: confusion matrices to show classification results, ROC curves, and PR curves (precision vs. recall).
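The threshold-based metrics above all follow from the four cells of a confusion matrix. The sketch below shows the arithmetic; the function name `classification_metrics` and the counts are hypothetical and are not taken from the paper's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)            # sensitivity
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts for a test set of 32 women (13 with CE, 19 without).
metrics = classification_metrics(tp=9, fp=5, tn=14, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```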
Model comparison was performed using the DeLong test [18] for ROC-AUC metrics and bootstrap resampling (1000 iterations) to calculate confidence intervals (95%) for PR-AUC and recall metrics. Differences were considered statistically significant at p < 0.05.
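A percentile bootstrap of the kind described above can be sketched with the standard library alone. This is a minimal illustration, assuming simple resampling of test-set rows with replacement and percentile limits; the authors' exact procedure may differ, and the labels and predictions below are hypothetical.

```python
import random

def recall(y_true, y_pred):
    """Proportion of actual positives that were predicted positive."""
    positives = sum(y_true)
    if positives == 0:
        return float("nan")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives

def bootstrap_ci(y_true, y_pred, metric, n_iter=1000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for a metric."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        value = metric([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if value == value:  # drop NaN from resamples without positives
            stats.append(value)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper

# Hypothetical test-set labels and predictions (13 positives, 19 negatives).
y_true = [1] * 13 + [0] * 19
y_pred = [1] * 9 + [0] * 4 + [1] * 5 + [0] * 14

point = recall(y_true, y_pred)
lower, upper = bootstrap_ci(y_true, y_pred, recall)
print(f"recall = {point:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

With a test set of this size the interval is wide, which is consistent with the broad confidence intervals reported for the models.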
Feature importance and its impact on predictions were assessed using SHAP (Shapley additive explanations) values, including global interpretations through bar plots for feature importance visualization and Beeswarm plots for SHAP value distribution analysis. This approach provides a comprehensive understanding of how each feature contributes to the model’s predictions on both individual and aggregate levels.
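The quantity that SHAP approximates can be shown with a brute-force Shapley computation on a toy value function. This sketch is not the tree-based algorithm implemented in the SHAP library; `shapley_values` and `toy_value` are hypothetical, and the numbers in `toy_value` are invented for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating all feature subsets."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of f when added to subset s
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi

def toy_value(subset):
    """Hypothetical model output when only the features in subset are known."""
    value = 0.0
    if "SHBG" in subset:
        value += 0.10
    if "CRP" in subset:
        value -= 0.05
    if "SHBG" in subset and "FSH" in subset:
        value += 0.04  # interaction term, shared equally by SHBG and FSH
    return value

phi = shapley_values(toy_value, ["SHBG", "FSH", "CRP"])
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the output with all features and the output with none, which is the additivity property that makes bar and beeswarm plots of SHAP values interpretable.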
To identify the most clinically applicable models, we relied on the following criteria:
Acceptable discriminatory power: AUC above 0.7 is generally considered acceptable, and an AUC of 0.6–0.7 may still provide useful discriminative capacity in many medical contexts [19,20].
Economic efficiency and minimal invasiveness: The feature sets in these models consist of variables that are low-cost and easy to collect, which is the cornerstone of our proposed diagnostic procedure.
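To make the AUC thresholds in the first criterion concrete, the ROC-AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by pairwise comparison; the function name `roc_auc` and the scores are hypothetical and unrelated to the paper's models.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of positive-negative pairs ranked correctly
    (ties count as half a correct pair)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]

print(roc_auc(y_true, scores))  # 0.75
```

Here 9 of the 12 positive-negative pairs are ordered correctly, so the AUC is 0.75, which would fall in the range described above as acceptable.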
2.3.4. Software
- Programming language: Python 3.12.2
- Runtime environment: Jupyter Notebook 6.5.4
- Libraries used:
  - Pandas 2.3.2: structured data analysis and manipulation (DataFrame operations).
  - NumPy 2.2.0: multi-dimensional array computations and mathematical operations.
  - Missingno 0.5.2: visualizing missing data patterns in datasets.
  - Matplotlib 3.10.5: creating basic static visualizations.
  - Seaborn 0.13.2: advanced statistical plotting.
  - SciPy 1.16.1: statistical analysis and mathematical functions.
  - Miceforest 6.0.3: multiple imputation by chained equations (MICE) to handle missing values.
  - Imbalanced-learn 0.14.0: handling class imbalance using the SMOTENC algorithm.
  - Scikit-learn 1.7.1: data preprocessing, model building, and validation.
  - SHAP 0.48.0: model interpretability and explanation of predictions.
  - Mlxtend 0.23.4: machine learning tools, including feature selection.
  - XGBoost 3.0.4: gradient-boosting machine learning models.
  - MLstatkit 0.1.9: DeLong's test to compare the ROC-AUC of two models.
The source code developed for data preprocessing, feature selection, model training, model evaluation, and interpretation is publicly available in the associated GitHub repository: https://doi.org/10.5281/zenodo.17347044 (accessed on 19 October 2025).
3. Results
3.1. Characteristics of Patients
We found that women with CE were older than women without CE, but this difference was not significant (p = 0.05). Women with CE had significantly lower levels of serum T, leptin/ADIPOQ ratio, TNFα, and IL-1/TNFα ratio, and had significantly higher levels of serum IL-1 (Table 1).
Table 1.
Clinical and laboratory characteristics of premenopausal women with or without CE.
3.2. Prediction Models
Using logistic regression, we built a series of prediction models. Models built on gynecological history data were of low quality because of the small number of cases (Appendix A). Consequently, we removed the gynecological history data from the dataset.
We built five distinct predictive models for CE whose metrics are presented in Table 2.
Table 2.
The characteristics of CE prediction models.
The SHAP summary plots provided a comprehensive visualization of feature effects on the predictive model performance, as illustrated in Figure 1. The features are ranked according to their average absolute SHAP values, from highest to lowest significance.
Figure 1.
The impact of low-invasive indicators on each CE prediction model. (a) Average impact on model output magnitude and (b) SHAP value.
In Model 1, low levels of TNFα, CRP, and T, as well as elevated ADIPOQ, IL-1, IFNγ, and the number of heavy menstrual bleeding days, were associated with a higher risk of CE. Because we had previously found that the IL-1/TNFα ratio contributed more significantly to CE than IL-1 alone, we replaced IL-1 and TNFα with the IL-1/TNFα ratio. In the resulting Model 2, elevated levels of ADIPOQ, 17-OH, E2, and VAT, as well as low T, IL-8, and CRP, were associated with a higher risk of CE. We found that the IL-1/TNFα ratio did not have enough impact on the prediction of CE. Next, we substituted the individual measurements of leptin and ADIPOQ with their ratio, based on compelling evidence that this ratio serves as a promising biomarker not only for metabolic disorders and low-grade inflammation but also for assessing endometrial receptivity [21]. In Model 3, elevated 17-OH, IL-1, E2, IL-6, PBAC, and heavy menstrual bleeding days, as well as low TNFα, T, and CRP, were associated with a higher risk of CE. To investigate the contribution of both ratio-based parameters, we built Model 4. In this model, high SHBG and IFNγ, as well as low FSH, CRP, and leptin/ADIPOQ ratio, were associated with a higher risk of CE. Because our aim was to predict CE using not only laboratory features but also basic and ultrasound low-invasive features, we used manual parameter tuning to build Model 5. This last model included five parameters, of which elevated SHBG and low FSH, CRP, leptin/ADIPOQ ratio, and endometrial thickness were associated with a higher risk of CE. All CE prediction models demonstrated moderate quality according to the ROC-AUC value (Table 2).
3.3. Model Evaluation and Interpretation
Next, we compared all prediction models to find which one had the best quality and power of prediction. According to the DeLong test, all models had comparable ROC-AUC (with Bonferroni correction α = 0.0056) (Table 3, Figure 2a).
Table 3.
Comparison of CE prediction models by ROC-AUC.
Figure 2.
Model comparison by metrics. (a) Model comparison by accuracy, recall, precision, specificity, F1, ROC AUC, and PR AUC; (b) confusion matrices for models; Models 2 and 4 have the best proportion of correctly identified positive cases, but the differences are not significant (0—control; 1—CE); (c) PR-AUC of models; there are no significant differences between the models; (d) precision–recall curves of the models; there are no significant differences between the models.
To evaluate the predictive performance of the models, we compared the recall and precision metrics across all models to identify the one demonstrating the highest quality and predictive power (Table 4, Figure 2a). Recall represents the proportion of truly positive cases (CE patients) correctly identified among all actual positive cases in the population. Precision, on the other hand, represents the proportion of correctly identified positive cases (true CE patients) among all cases classified as positive by the model. This precision analysis enables the identification of true cases while minimizing the misclassification of healthy subjects.
Table 4.
Comparison of CE prediction models by PR-AUC.
Bootstrap resampling analysis did not reveal statistically significant differences between the models (Table 4, Figure 2c,d). However, Models 2 and 4 were identified as the most promising for clinical application due to their superior recall and precision values (Figure 2a), as well as their superior ability to accurately predict true CE cases compared to the other models (Figure 2b).
4. Discussion
To the best of our knowledge, this study represents the first attempt to develop ML prediction models for CE using low-invasive patient data. We developed five comparable prediction models, each incorporating different parameters derived from gynecological history, ultrasound examination, and laboratory data. Although no significant differences were found between the models, Models 2 and 5 were identified as the most promising for clinical application due to their superior performance. These models demonstrate better recall and precision metrics and require assessment of only 5–7 risk features (with most being routine measurements) for implementation.
As we have previously described, the establishment of CE is challenging due to the lack of standardization in the diagnostic approach and the heterogeneity of clinical signs. Immunohistochemical examination for CD138+ cells requires clinical evidence, and the invasiveness and painfulness of the biopsy procedure are equally important factors since they affect quality of life and consent for examination of patients [6]. Building upon our previous research findings and incorporating published data regarding clinical associations with CE, we hypothesized that a comprehensive evaluation of potential risk factors for CE could lead to the development of a diagnostic algorithm that would not require invasive procedures. Multi-parameter modeling based on ML offers a significant advancement over traditional biomarker screening methodologies. While conventional approaches rely on univariate analyses employing statistical tests, they often fall short in capturing the complexity of biological systems [22]. According to our results, we identified two predictive models (Model 2 and Model 5) as the most promising for clinical application. Model 2 included seven features: ADIPOQ, 17-OH, T, IL-8, E2, CRP, and VAT. Model 5 included five features: SHBG, FSH, CRP, leptin/ADIPOQ ratio, and endometrial thickness. All of these parameters can be divided into four relative groups: hormones (17-OH, T, SHBG, E2, and FSH), adipokines (ADIPOQ, leptin/ADIPOQ ratio), proinflammatory markers (IL-8, CRP), and clinical features (VAT, endometrial thickness). Interestingly, both models include features that reflected reproductive function and inflammatory and metabolic state. This can mean that CE is a condition caused not only by infection, but also by hormonal and metabolic dysregulation.
The role of 17-OH in inflammation remains unclear. This hormone is primarily elevated in response to stress or due to 21-hydroxylase deficiency. Notably, elevated levels of 17-OH have been observed in patients with ankylosing spondylitis [23], suggesting a potential involvement of this hormone in autoimmune inflammatory responses. Conversely, treatment with 17-OH caproate has been shown to reduce the rate of recurrent preterm delivery in pregnant women [24]. This effect may be attributed to its regulatory influence on immune cells within the endometrium. According to our results, elevated 17-OH is associated with the risk of CE. This finding necessitates further research.
T is an important hormone for the endometrium. Androgen receptors are predominantly expressed in endometrial stromal cells, while T participates in the regulation of endometrial decidualization and prevention of oxidative stress. It is known that HA is associated with impaired endometrial receptivity, which leads to pregnancy loss [25]. In our study, patients with HA were excluded from the analysis. Moreover, low T levels were identified as a risk factor for CE. Additionally, elevated SHBG levels were also found to be a risk factor for CE, indicating the involvement of androgens in the development of endometrial inflammation. It is noteworthy that some studies have demonstrated an association of SHBG with metabolic disorders and low-grade inflammation in patients with PCOS [26]. Given the contradictory nature of the existing results, further research is required to assess the contribution of both elevated and reduced androgen levels to the development of CE.
E2 is a key regulator of endometrial function. Low E2 is associated with RIF in the first trimester [27]. However, there is limited evidence from studies showing that women with recurrent pregnancy loss may exhibit elevated basal levels of E2 [28]. Khan et al. (2015) showed in vitro promotion of pelvic inflammation (elevated levels of IL-6 and TNFα) by E2 and LPS in eutopic/ectopic endometrial stromal cells of women with endometriosis [29]. Elevated E2 can be a risk factor for endometrial inflammation. This is reflected in our results, which indicate that elevated E2 levels contribute to the CE.
FSH is a gonadotropin that is synthesized and secreted from the anterior pituitary gland. This hormone stimulates the growth and maturation of oocytes and promotes estradiol production in granulosa cells [30]. According to some authors, FSH could act on the endometrial aromatase, inducing the production of local estrogen that may confer endometrial receptivity during the period of implantation or influence estrogen-dependent epithelial production of factors related to embryo implantation [31]. There is no convincing evidence of the effect of FSH level on endometrial inflammation. Hagag et al. (2024) found significantly higher FSH in women with CE than in women without CE. But most of the women with CE were perimenopausal [32]. In our study, a low serum FSH level was associated with a higher risk of CE in premenopausal women. Further research is required to determine the contribution of FSH to the development of endometrial inflammation.
Adiponectin is an adipokine that participates in metabolism regulation and is essential for the normal functioning of the reproductive system [33]. The expression of adiponectin receptors is reduced in the endometrium of women with infertility [34]. Additionally, adiponectin plays an important anti-inflammatory role and regulates energy metabolism in the endometrium. On the other hand, a proinflammatory effect of adiponectin has been identified in autoimmune diseases [35]. Moreover, elevated levels of adiponectin have been associated with an increased risk of cardiovascular diseases [36,37]. This paradox can be explained by the impairment of liver function in such patients, since adiponectin is metabolized in the liver [38]. Thus, elevated adiponectin levels may reflect the presence of metabolic disorders associated with liver damage.
Leptin, being an adipokine and one of the key regulators of energy metabolism, is capable of stimulating the proliferation and apoptosis of endometrial epithelial cells, influencing endometrial receptivity, the uterine immune system, and endometrial decidualization [39]. In our previous study, we found a negative association between serum leptin levels and the presence of CE in women of reproductive age [17]. Upon further detailed analysis, we established that women with CE who were overweight or obese exhibited significantly lower serum leptin concentrations compared to patients without CE but who were overweight/obese [11]. These findings contradict the existing data that leptin is primarily a proinflammatory factor [40]. However, the prediction models obtained through our research also indicate that low leptin/ADIPOQ ratios are associated with CE in women with normal weight or who are overweight. In Model 2, an increase in VAT was also associated with CE, which contradicts the results obtained for adiponectin and leptin. To resolve these contradictions, further research is required, including a search for other factors that could have influenced the results.
CRP is the main marker of systemic low-grade inflammation in patients with metabolic and cardiovascular disorders [41]. On the other hand, Arefi et al. (2010) reported higher pregnancy rates in women with elevated CRP levels on the transfer day of in vitro fertilization compared with women who had lower CRP levels [42]. Our predictive models showed an inverse association between CRP level and the risk of CE. Interestingly, CRP concentration is inversely correlated with E2 during the menstrual cycle [43], and we also found a higher risk of CE in the setting of elevated E2. In this context, the observed pattern could be explained by a compensatory increase in estradiol levels in response to endometrial inflammation. However, validation of this hypothesis requires further research.
IL-8 is a proinflammatory cytokine that participates in the regulation of decidualization and vascularization of the endometrium. Studies investigating the level of IL-8 in the endometrium or serum in the setting of reproductive disorders have yielded contradictory results [44]. In our previous study, we found elevated serum IL-1 and IL-1/TNFα ratio, but not IL-8, in women with CE [10]. In the current study, low IL-8 was associated with a higher risk of CE.
Thin endometrium is one of the ultrasound findings of endometritis. Endometrial stromal thickening was found in 50–67% of women with CE and recurrent pregnancy loss or RIF [45]. This finding is consistent with our data that low endometrial thickness is a risk factor for CE.
Our study has several limitations. First, the small sample size may have impacted the overall quality of the models, particularly affecting the contribution of rare features such as miscarriage rates and missed abortions. Second, it should be noted that this study has a retrospective design, originating from an epidemiology study focused on PCOS prevalence rather than CE. Third, we did not take into account the presence or absence of autoimmune conditions in women due to the lack of this information in the database. Consequently, CE diagnosis was not established for all participants. The developed prediction models are applicable exclusively to patients without comorbidities (such as HA, obesity, hypothyroidism, etc.).
5. Conclusions
This is the first study aimed at developing a new approach to diagnosing CE using machine learning tools. We utilized medical history, clinical, ultrasound, and laboratory parameters to construct predictive models of CE. Five models with various combinations of direct and indirect features were obtained. Although all five models demonstrated comparable performance across all metrics, we concluded that Models 2 and 5 show the most promise for further development of a low-invasive approach to diagnosing CE in premenopausal women. These models demonstrate better recall and precision metrics, although the differences were not statistically significant. Moreover, implementing these models requires assessment of only 5–7 risk features, most of which are routine measurements.
It is worth noting that both the most promising models (Model 2 and Model 5) included features associated not only with inflammation but also with metabolic and endocrine disorders. This finding may suggest that the pathogenesis of CE involves not only infection and endometrial injury factors but also the patient’s hormonal-metabolic status. However, further research is required to validate this hypothesis and exclude the influence of confounding factors that were not accounted for in our study.
Author Contributions
Conceptualization, K.D.I. and A.V.A.; methodology, K.D.I., A.V.A. and T.G.B.; software, A.V.A. and T.G.B.; validation, K.D.I., A.V.A. and T.G.B.; formal analysis, T.G.B.; investigation, I.G.N., M.R.A., I.N.D., L.F.S., E.M.S. and L.M.L.; resources, A.V.A.; data curation, A.V.A.; writing—original draft preparation, K.D.I. and A.V.A.; writing—review and editing, A.V.A., L.V.S. and L.V.S.; visualization, T.G.B.; supervision, K.D.I.; project administration, K.D.I.; funding acquisition, K.D.I. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Russian Science Foundation, grant number 25-25-20034, and the Irkutsk Region, grant number 30-2025-004302. The APC was funded by the authors.
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Ethics Committee of the Scientific Center for Family and Human Reproduction Problems (protocol code 2.1, approval date 24 February 2016).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The dataset used in this research is hosted in the IEEE DataPort repository: https://dx.doi.org/10.21227/7vd8-8f90 (accessed on 19 October 2025). The source code developed for data preprocessing, feature selection, model training, model evaluation, and interpretation is publicly available in the associated GitHub repository, archived on Zenodo: https://doi.org/10.5281/zenodo.17347044 (accessed on 19 October 2025).
Conflicts of Interest
The authors declare no conflicts of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| CE | Chronic endometritis |
| RIF | Recurrent implantation failure |
| EP | Endometrial polyp |
| IL | Interleukin |
| TNF-α | Tumor necrosis factor alpha |
| ML | Machine learning |
| PCOS | Polycystic ovarian syndrome |
| HA | Hyperandrogenism |
| TSH | Thyroid-stimulating hormone |
| 17-OH | 17-Hydroxyprogesterone |
| FSH | Follicle-stimulating hormone |
| BMI | Body mass index |
| PBAC | Pictorial Blood Assessment Chart |
| CRP | C-reactive protein |
| AMH | Anti-Müllerian hormone |
| T | Total testosterone |
| SHBG | Sex hormone-binding globulin |
| LH | Luteinizing hormone |
| E2 | Estradiol |
| ADIPOQ | Adiponectin |
| IFN | Interferon |
| MICE | Multiple Imputation by Chained Equations |
| SMOTENC | Synthetic Minority Over-sampling Technique for Nominal and Continuous |
| ROC-AUC | Area under the receiver operating characteristic curve |
| PR-AUC | Area under the precision–recall curve |
| SHAP | Shapley additive explanations |
Appendix A
Appendix A.1. Metrics of CE Prediction Models That Included Gynecological History
| Model | Accuracy | Recall (95% CI) | Precision | Specificity | F1 | ROC AUC | PR AUC |
| Model 1 | 0.545 | 0.696 (0.428, 0.923) | 0.450 | 0.450 | 0.545 | 0.615 | 0.453 |
| Model 2 | 0.515 | 0.618 (0.333, 0.889) | 0.421 | 0.450 | 0.500 | 0.588 | 0.483 |
| Model 2.5 | 0.545 | 0.538 (0.263, 0.818) | 0.438 | 0.550 | 0.483 | 0.562 | 0.400 |
| Model 3 | 0.545 | 0.538 (0.263, 0.818) | 0.438 | 0.550 | 0.483 | 0.562 | 0.400 |
| Model 4 | 0.576 | 0.614 (0.333, 0.875) | 0.471 | 0.550 | 0.533 | 0.600 | 0.439 |
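As a reading aid, the sketch below shows how accuracy, recall, precision, specificity, and F1 follow from a single confusion matrix. The counts are an assumption: they were back-calculated from the rounded values in the table and are consistent with a test set of 33 women (13 with CE, 20 without), but the appendix does not report them. The names `Confusion`, `metrics`, and `ASSUMED_COUNTS` are hypothetical and are not taken from the authors' code. Model 2.5 is omitted because its row is identical to that of Model 3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (CE = positive class)."""
    tp: int
    fp: int
    tn: int
    fn: int


def metrics(c: Confusion) -> dict:
    """Compute the threshold-based metrics reported in the table."""
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)
    return {
        "accuracy": (c.tp + c.tn) / total,
        "recall": recall,
        "precision": precision,
        "specificity": c.tn / (c.tn + c.fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# ASSUMPTION: counts inferred from the rounded table values,
# not reported in the source.
ASSUMED_COUNTS = {
    "Model 1": Confusion(tp=9, fp=11, tn=9, fn=4),
    "Model 2": Confusion(tp=8, fp=11, tn=9, fn=5),
    "Model 3": Confusion(tp=7, fp=9, tn=11, fn=6),
    "Model 4": Confusion(tp=8, fp=9, tn=11, fn=5),
}

for name, counts in ASSUMED_COUNTS.items():
    row = {key: round(value, 3) for key, value in metrics(counts).items()}
    print(name, row)
```

Under these assumed counts, accuracy, precision, specificity, and F1 match the table to three decimals. The recall point estimates (9/13, 8/13, 7/13, 8/13) differ from the reported values by at most 0.004 for Models 1, 2, and 4; the source does not state how the reported recall and its confidence interval were estimated.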
Appendix A.2. Characteristics of the Models That Included Gynecological History
| Model 1 | Model 2 | Model 2.5 | Model 3 | Model 4 |
| FSH; IL-1; IL-8; SHBG; Endometrial thickness; Missed abortion | SHBG; E2; IFN; FSH; CRP; Missed abortion | SHBG; FSH; Leptin/ADIPOQ; IFNγ; Missed abortion | SHBG; FSH; Leptin/ADIPOQ; IFNγ; Missed abortion | Leptin/ADIPOQ; SHBG; FSH; Endometrial thickness; Missed abortion |
References
- Wang, C.; Lu, Y.; Ou, M.; Qian, L.; Zhang, Y.; Yang, Y.; Luo, L.; Wang, Q. Risk Factors for Recurrent Implantation Failure as Defined by the European Society for Human Reproduction and Embryology. Hum. Reprod. 2025, 40, 1138–1147.
- Zargar, M.; Ghafourian, M.; Nikbakht, R.; Mir Hosseini, V.; Moradi Choghakabodi, P. Evaluating Chronic Endometritis in Women with Recurrent Implantation Failure and Recurrent Pregnancy Loss by Hysteroscopy and Immunohistochemistry. J. Minim. Invasive Gynecol. 2020, 27, 116–121.
- Ticconi, C.; Inversetti, A.; Marraffa, S.; Campagnolo, L.; Arthur, J.; Zambella, E.; Di Simone, N. Chronic Endometritis and Recurrent Reproductive Failure: A Systematic Review and Meta-Analysis. Front. Immunol. 2024, 15, 1427454.
- Yilmaz, B.D.; Schwartz, K.M.; Chan, M.; Cedars, M.I.; Cakmak, H.; Huang, D. Chronic Endometritis and Its Association with Implantation History, BCL6, and ERA in Infertility Patients. J. Assist. Reprod. Genet. 2025, 42, 3303–3310.
- Yasuo, T.; Kitaya, K. Challenges in Clinical Diagnosis and Management of Chronic Endometritis. Diagnostics 2022, 12, 2711.
- Chen, Y.-Q.; Fang, R.-L.; Luo, Y.-N.; Luo, C.-Q. Analysis of the Diagnostic Value of CD138 for Chronic Endometritis, the Risk Factors for the Pathogenesis of Chronic Endometritis and the Effect of Chronic Endometritis on Pregnancy: A Cohort Study. BMC Womens Health 2016, 16, 60.
- Hosseini, S.; Abbasi, H.; Salehpour, S.; Saharkhiz, N.; Nemati, M. Prevalence of Chronic Endometritis in Infertile Women Undergoing Hysteroscopy and Its Association with Intrauterine Abnormalities: A Cross-Sectional Study. JBRA Assist. Reprod. 2024, 28, 430–434.
- Kabodmehri, R.; Etezadi, A.; Sharami, S.H.; Ghanaei, M.M.; Hosseinzadeh, F.; Heirati, S.F.D.; Pourhabibi, Z. The Association between Chronic Endometritis and Uterine Fibroids. J. Family Med. Prim. Care 2022, 11, 653–659.
- Vitagliano, A.; Cialdella, M.; Cicinelli, R.; Santarsiero, C.M.; Greco, P.; Buzzaccarini, G.; Noventa, M.; Cicinelli, E. Association between Endometrial Polyps and Chronic Endometritis: Is It Time for a Paradigm Shift in the Pathophysiology of Endometrial Polyps in Pre-Menopausal Women? Results of a Systematic Review and Meta-Analysis. Diagnostics 2021, 11, 2182.
- Ievleva, K.; Danusevich, I.; Atalyan, A.; Egorova, I.; Babaeva, N.; Rashidova, M.; Akhmedzyanova, M.R.; Sholokhov, L.; Nadeliaeva, I.; Lazareva, L.; et al. Diagnostic significance of interleukin levels in blood serum in premenopausal women with chronic endometritis and normal weight or overweight. Acta Biomed. Sci. 2024, 9, 38–48.
- Ievleva, K.; Danusevich, I.; Atalyan, A.; Sharifulin, E.; Lazareva, L.; Nadeliaeva, I.; Rashidova, M.; Akhmedzyanova, M.R.; Belenkaya, L.; Sholokhov, L.; et al. Adipokine levels and their association with chronic endometritis in reproductive-aged women. Vopr. ginekol. akus. perinatol. Gynecol. Obstet. Perinatol. 2023, 22, 60–68.
- Bays, H.E.; Bindlish, S.; Clayton, T.L. Obesity, Diabetes Mellitus, and Cardiometabolic Risk: An Obesity Medicine Association (OMA) Clinical Practice Statement (CPS) 2023. Obes. Pillars 2023, 5, 100056.
- Mihara, M.; Yasuo, T.; Kitaya, K. Precision Medicine for Chronic Endometritis: Computer-Aided Diagnosis Using Deep Learning Model. Diagnostics 2023, 13, 936.
- Kitaya, K.; Yasuo, T.; Yamaguchi, T. Bridging the Diagnostic Gap between Histopathologic and Hysteroscopic Chronic Endometritis with Deep Learning Models. Medicina 2024, 60, 972.
- Suturina, L.; Lizneva, D.; Lazareva, L.; Danusevich, I.; Nadeliaeva, I.; Belenkaya, L.; Atalyan, A.; Belskikh, A.; Bairova, T.; Sholokhov, L.; et al. Ethnicity and the Prevalence of Polycystic Ovary Syndrome: The Eastern Siberia PCOS Epidemiology and Phenotype Study. J. Clin. Endocrinol. Metab. 2024, 110, e32–e43.
- Suturina, L.; Lizneva, D.; Atalyan, A.; Lazareva, L.; Belskikh, A.; Bairova, T.; Sholokhov, L.; Rashidova, M.; Danusevich, I.; Nadeliaeva, I.; et al. Establishing Normative Values to Determine the Prevalence of Biochemical Hyperandrogenism in Premenopausal Women of Different Ethnicities from Eastern Siberia. Diagnostics 2022, 13, 33.
- Sharifulin, E.; Igumnov, I.; Krusko, O.; Atalyan, A.; Suturina, L. Chronic Endometritis in Women of Reproductive Age with Polycystic Ovary Syndrome. Acta Biomed. Sci. 2020, 5, 27–36.
- DeLong, E.R.; DeLong, D.M.; Clarke-Pearson, D.L. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics 1988, 44, 837–845.
- Çorbacıoğlu, Ş.K.; Aksel, G. Receiver Operating Characteristic Curve Analysis in Diagnostic Accuracy Studies: A Guide to Interpreting the Area under the Curve Value. Turk. J. Emerg. Med. 2023, 23, 195–198.
- Carrington, A.M.; Manuel, D.G.; Fieguth, P.W.; Ramsay, T.; Osmani, V.; Wernly, B.; Bennett, C.; Hawken, S.; Magwood, O.; Sheikh, Y.; et al. Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 329–341.
- Dos Santos, E.; Pecquery, R.; de Mazancourt, P.; Dieudonné, M.-N. Adiponectin and Reproduction. Vitam. Horm. 2012, 90, 187–209.
- Widmann, G.; Luger, A.K.; Sonnweber, T.; Schwabl, C.; Cima, K.; Gerstner, A.K.; Pizzini, A.; Sahanic, S.; Boehm, A.; Coen, M.; et al. Machine Learning Based Multi-Parameter Modeling for Prediction of Post-Inflammatory Lung Changes. Diagnostics 2025, 15, 783.
- Giltay, E.J.; van Schaardenburg, D.; Gooren, L.J.; Popp-Snijders, C.; Dijkmans, B.A. Androgens and Ankylosing Spondylitis: A Role in the Pathogenesis? Ann. N. Y. Acad. Sci. 1999, 876, 340–364.
- Meis, P.J.; Klebanoff, M.; Thom, E.; Dombrowski, M.P.; Sibai, B.; Moawad, A.H.; Spong, C.Y.; Hauth, J.C.; Miodovnik, M.; Varner, M.W.; et al. Prevention of Recurrent Preterm Delivery by 17 Alpha-Hydroxyprogesterone Caproate. N. Engl. J. Med. 2003, 348, 2379–2385.
- Yamagata, K.; Mizuno, Y.; Mizuno, Y.; Tamaru, S.; Kajihara, T. Androgens Modulate Endometrial Function. Med. Mol. Morphol. 2025, 58, 93–99.
- Di Stasi, V.; Maseroli, E.; Rastrelli, G.; Scavello, I.; Cipriani, S.; Todisco, T.; Marchiani, S.; Sorbi, F.; Fambrini, M.; Petraglia, F.; et al. SHBG as a Marker of NAFLD and Metabolic Impairments in Women Referred for Oligomenorrhea and/or Hirsutism and in Women With Sexual Dysfunction. Front. Endocrinol. 2021, 12, 641446.
- Günther, V.; Allahqoli, L.; Deenadayal-Mettler, A.; Maass, N.; Mettler, L.; Gitas, G.; Andresen, K.; Schubert, M.; Ackermann, J.; von Otte, S.; et al. Molecular Determinants of Uterine Receptivity: Comparison of Successful Implantation, Recurrent Miscarriage, and Recurrent Implantation Failure. Int. J. Mol. Sci. 2023, 24, 17616.
- Gürbüz, B.; Yalti, S.; Ozden, S.; Ficicioglu, C. High Basal Estradiol Level and FSH/LH Ratio in Unexplained Recurrent Pregnancy Loss. Arch. Gynecol. Obstet. 2004, 270, 37–39.
- Khan, K.; Kitajima, M.; Inoue, T.; Fujishita, A.; Nakashima, M.; Masuzaki, H. 17β-Estradiol and Lipopolysaccharide Additively Promote Pelvic Inflammation and Growth of Endometriosis. Reprod. Sci. 2015, 22, 585–594.
- Holesh, J.E.; Bass, A.N.; Lord, M. Physiology, Ovulation. In StatPearls; StatPearls Publishing: Treasure Island, FL, USA, 2025.
- Brosens, J.; Verhoeven, H.; Campo, R.; Gianaroli, L.; Gordts, S.; Hazekamp, J.; Hägglund, L.; Mardesic, T.; Varila, E.; Zech, J.; et al. High Endometrial Aromatase P450 mRNA Expression Is Associated with Poor IVF Outcome. Hum. Reprod. 2004, 19, 352–356.
- Hagag, H.M.; Ismail, K.A.; Almutairi, M.M.; Alnefaie, B.I.; Alajmani, S.H.; Altalhi, A.M.; Alkhamash, A.H.; Althobaiti, N.S.; Alhumaidi, M.A.; Bawahab, A.A.; et al. Clinicopathological Aspects of Dilation and Curettage (D&C) Biopsies Taken from Patients Living at High Altitude in Taif, KSA, with a Special Emphasis on Chronic Endometritis. Life 2024, 14, 1021.
- Khoramipour, K.; Chamari, K.; Hekmatikar, A.A.; Ziyaiyan, A.; Taherkhani, S.; Elguindy, N.M.; Bragazzi, N.L. Adiponectin: Structure, Physiological Functions, Role in Diseases, and Effects of Nutrition. Nutrients 2021, 13, 1180.
- Sarankhuu, B.-E.; Jeon, H.J.; Jeong, D.-U.; Park, S.-R.; Kim, T.-H.; Lee, S.K.; Han, A.R.; Yu, S.-L.; Kang, J. Adiponectin Receptor 1 Regulates Endometrial Receptivity via the Adenosine Monophosphate–Activated Protein Kinase/E–Cadherin Pathway. Mol. Med. Rep. 2024, 30, 184.
- Brezovec, N.; Perdan-Pirkmajer, K.; Čučnik, S.; Sodin-Šemrl, S.; Varga, J.; Lakota, K. Adiponectin Deregulation in Systemic Autoimmune Rheumatic Diseases. Int. J. Mol. Sci. 2021, 22, 4095.
- Lee, C.H.; Lui, D.T.W.; Cheung, C.Y.Y.; Fong, C.H.Y.; Yuen, M.M.A.; Chow, W.S.; Woo, Y.C.; Xu, A.; Lam, K.S.L. Higher Circulating Adiponectin Concentrations Predict Incident Cancer in Type 2 Diabetes: The Adiponectin Paradox. J. Clin. Endocrinol. Metab. 2020, 105, e1387–e1396.
- Baker, J.F.; Newman, A.B.; Kanaya, A.; Leonard, M.B.; Zemel, B.; Miljkovic, I.; Long, J.; Weber, D.; Harris, T.B. The Adiponectin Paradox in the Elderly: Associations With Body Composition, Physical Functioning, and Mortality. J. Gerontol. A Biol. Sci. Med. Sci. 2019, 74, 247–253.
- Zhao, S.; Kusminski, C.M.; Scherer, P.E. Adiponectin, Leptin and Cardiovascular Disorders. Circ. Res. 2021, 128, 136–149.
- Ievleva, K.D.; Danusevich, I.N.; Suturina, L.V. The role of leptin in endometrium disorders: Literature review. Probl. Endokrinol. 2024, 70, 106–114.
- Liu, C.; Li, X. Role of Leptin and Adiponectin in Immune Response and Inflammation. Int. Immunopharmacol. 2025, 161, 115082.
- Tate, A.R.; Rao, G.H.R. Inflammation: Is It a Healer, Confounder, or a Promoter of Cardiometabolic Risks? Biomolecules 2024, 14, 948.
- Arefi, S.; Babashamsi, M.; Shariat Panahi, P.; Asgharpour Saruiy, L.; Zeraati, H. C-reactive protein level and pregnancy rate in patients undergoing IVF/ICSI. Int. J. Reprod. Biomed. 2010, 8, 197–202.
- Gaskins, A.J.; Wilchesky, M.; Mumford, S.L.; Whitcomb, B.W.; Browne, R.W.; Wactawski-Wende, J.; Perkins, N.J.; Schisterman, E.F. Endogenous Reproductive Hormones and C-Reactive Protein across the Menstrual Cycle: The BioCycle Study. Am. J. Epidemiol. 2012, 175, 423–431.
- Vilotić, A.; Nacka-Aleksić, M.; Pirković, A.; Bojić-Trbojević, Ž.; Dekanski, D.; Jovanović Krivokuća, M. IL-6 and IL-8: An Overview of Their Roles in Healthy and Pathological Pregnancies. Int. J. Mol. Sci. 2022, 23, 14574.
- Singh, N.; Sethi, A. Endometritis: Diagnosis, Treatment and Its Impact on Fertility. A Scoping Review. JBRA Assist. Reprod. 2022, 26, 538–546.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).