Article

A Preliminary Machine Learning Assessment of Oxidation-Reduction Potential and Classical Sperm Parameters as Predictors of Sperm DNA Fragmentation Index

by Emmanouil D. Oikonomou 1, Efthalia Moustakli 1,2, Athanasios Zikopoulos 3, Stefanos Dafopoulos 4, Ermioni Prapa 5, Antonis-Marios Gkountis 5, Athanasios Zachariou 6, Agni Pantou 7, Nikolaos Giannakeas 1, Konstantinos Pantos 7, Alexandros T. Tzallas 1,* and Konstantinos Dafopoulos 8,*
1 Human Computer Interaction Laboratory, Department of Informatics and Telecommunications, University of Ioannina, Kostakioi, 47150 Arta, Greece
2 Department of Nursing, School of Health Sciences, University of Ioannina, 4th kilometer National Highway Str. Ioannina-Athens, 45500 Ioannina, Greece
3 Torbay and South Devon NHS Foundation Trust, Lowes Brg, Torquay TQ2 7AA, UK
4 Department of Health Sciences, European University Cyprus, 2404 Nicosia, Cyprus
5 Centre for Human Reproduction, Genesis Athens Thessaly, 41335 Larissa, Greece
6 Department of Urology, School of Medicine, University of Ioannina, 45110 Ioannina, Greece
7 Centre for Human Reproduction, Genesis Athens Clinic, 14-16, Papanikoli, 15232 Athens, Greece
8 Department of Obstetrics and Gynaecology, Faculty of Medicine, School of Health Sciences, University of Thessaly, 41110 Larissa, Greece
* Authors to whom correspondence should be addressed.
Submission received: 5 November 2025 / Revised: 4 December 2025 / Accepted: 4 January 2026 / Published: 8 January 2026

Abstract

Background/Objectives: Traditional semen analysis techniques frequently result in incorrect male infertility diagnoses, despite advancements in assisted reproductive technology (ART). Reduced fertilization potential, decreased embryo development, and lower pregnancy success rates are associated with an elevated DNA Fragmentation Index (DFI), which has been proposed as a diagnostic indicator of sperm DNA integrity. Improving reproductive outcomes requires incorporating DFI into predictive models due to its diagnostic importance. Methods: In this study, semen samples were stratified into low and high DFI groups across two datasets: the “Reference” dataset (162 samples) containing sperm motility (A, B, and C), total sperm count, and morphology percentage, and the “ORP” dataset (37 samples) with the same features plus oxidation-reduction potential (ORP). We trained and evaluated four machine learning (ML) models, namely Logistic Regression, Support Vector Machines (SVM), Bernoulli Naive Bayes (BNB), and Random Forest (RF), using three feature subsets and three preprocessing techniques (Robust Scaling, Min-Max Scaling, and Standard Scaling). Results: Feature subset selection had a significant impact on model performance, with the full feature set (X_all) yielding the best results, and the combination of Robust and MinMax scaling forming the most effective preprocessing pipeline. Conclusions: ORP proved to be a critical feature, enhancing model generalization and prediction performance. These findings suggest that data enrichment, particularly with ORP, could enable the development of ML frameworks that improve prognostic precision and patient outcomes in ART.


1. Introduction

Male infertility accounts for nearly half of all infertility cases worldwide, leading more men to seek assisted reproductive technologies (ART). Spermatozoa quality is critical for successful fertilization in ART procedures such as in vitro fertilization (IVF) [1]. An accurate evaluation of semen quality is therefore necessary for improving IVF outcomes, and traditional metrics such as sperm concentration, motility, and morphology remain the most common in clinical practice. However, these metrics fail to provide a proper evaluation of spermatozoa functionality [2], while new research highlights that sperm health and reproductive success are significantly influenced by oxidative stress (OS) and the associated imbalance in oxidation-reduction potential (ORP).
Seminal plasma and spermatozoa are abundant in antioxidants that protect sperm cells from OS, particularly in the post-testicular stage. Glutathione peroxidase, superoxide dismutase, and catalase are high molecular weight enzymatic antioxidants found in seminal plasma [3]. Men deficient in these enzymes often exhibit impaired fertility due to sperm DNA damage [4]. Most of the antioxidant potential of semen, however, comes from nonenzymatic molecules such as ascorbic acid, α-tocopherol, pyruvate, glutathione, L-carnitine, taurine, and hypotaurine, along with beta-carotene, albumin, and ubiquinol [5]. Numerous studies have shown that seminal antioxidant capacity is reduced in infertile men with higher ROS concentrations compared to those with normal ROS levels. However, it remains unclear how reduced sperm antioxidant capacity contributes to sperm dysfunction. Whether high ROS concentrations in the semen of infertile men result from increased ROS production, reduced ROS scavenging capacity, or both, is still under debate [6]. Any disruption of the balance between ROS formation and antioxidant capacity in the seminal plasma leads to OS when highly reactive ROS overwhelm the antioxidant defense systems [7]. Fragmentation of sperm DNA in nuclei and mitochondria, lipid peroxidation, and apoptosis (including mitophagy and lipophagy) may all result from high ROS concentrations [8]. Despite its recognized importance in male infertility (MI), OS is not yet routinely assessed in clinical practice [9]. The measurement of ORP offers a promising, integrative approach to evaluating OS by quantifying the overall balance between oxidants and antioxidants within a biological sample. However, the clinical application of ORP is challenging, as its interpretation requires robust analytical tools to ensure accuracy and reliability [10].
Sperm DNA fragmentation has been increasingly recognized as an indicator of sperm quality, as elevated DNA fragmentation index (DFI) values have been associated with lower fertilization and pregnancy success rates, reduced embryo quality, and higher miscarriage risk. DFI assessment provides complementary information on sperm DNA integrity that traditional semen analysis cannot capture [11,12]. DFI was selected as the model’s target variable because it represents a quantifiable measure of sperm DNA integrity, providing complementary insight beyond standard semen parameters. DFI is particularly relevant for cases of unexplained or recurrent infertility, where classical semen characteristics may appear normal. By combining ORP, a measure of oxidative balance, with DFI, a marker of resultant DNA damage, we aimed to explore whether biochemical stress indicators enhance genomic-integrity prediction through machine learning (ML). Although many studies support the usefulness of DFI testing, particularly in cases of recurrent pregnancy loss or unexplained infertility, its role as a routine diagnostic test remains under debate. Current evidence suggests that DFI testing may offer valuable insights when combined with standard semen parameters rather than as a standalone diagnostic tool [13]. This study aimed to examine whether incorporating ORP alongside conventional semen parameters enhances ML-based prediction of DFI.

2. Related Work

Many studies have attempted to apply ML methodologies to MI diagnostics [14,15]. One study examining the role of semen parameters in IVF, intracytoplasmic sperm injection (ICSI), and intrauterine insemination treatment (IUI) demonstrated the robustness of RF algorithms in predicting successful clinical pregnancies [16]. In addition, researchers used the DFI alongside traditional semen parameters to classify individuals into four clusters using the k-means clustering method [17]. A neural network was utilized in a study on IUI, achieving an accuracy of 71.2% [18]. Golnaz Shemshaki’s team [19] used ML methods to evaluate the impact of biochemical markers in seminal fluid, based on the physiological function of accessory glands and ROS levels. Using a large dataset of 4239 patients [20], researchers developed ML models to describe changes in semen characteristics. Finally, two studies investigated the relationship between lifestyle and male fertility using ML models. In the first study, an Extreme Gradient Boosting (XGBoost) algorithm was developed, which revealed that smoking was a significant negative factor for semen volume, sperm concentration, and sperm motility, while age was the most important factor influencing DFI [21]. In the second study, researchers [22] used a pre-trained model to assess the relationship between sperm parameters and various anthropometric, metabolic, and nutritional factors.
Although the traditional semen parameters are significant for training new ML models, the current study aims to show the significance of ORP as a new feature in addition to the classical semen parameters (motility A, B, and C, total number of spermatozoa, and morphology), as well as to find the best among four ML approaches. Correctly identifying the features of the dataset in an ML problem is essential [23]: it saves considerable computational time, gives a better understanding of the data and of the model's results, and reduces the number of samples required, a problem known as the curse of dimensionality [24].
In this study, MI is viewed as a binary classification problem, where class 1 represents patients with lower DFI values and class 0 represents those with higher DFI values [25]. Since our method predicts clinically significant DNA fragmentation status instead of using indirect indicators, it is directly clinically interpretable due to the explicit use of DFI as the model target. In this context, ML techniques [26] are important in the binary classification analysis, where different models are used [27]. Following this reasoning, in this paper we used 4 of the most traditional ML classifiers [28] to detect the differences between the classes based on six features:
  • Logistic Regression (LR) is an ML methodology that uses a sigmoid (logistic) function (Equation (1)) to predict the probability of an event occurring. Essentially, it models the relationship between one or more independent variables ($X_1, X_2, \ldots, X_n$) and the dependent variable $y$.
    $$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{1}$$
  • Support Vector Machine (SVM) is a technique that constructs a hyperplane in a high-dimensional space [29]. The best hyperplane is the one with the largest separation, called the margin, from the nearest data points. These data points are called support vectors, and the aim is to maximize their distance from the hyperplane based on a mathematical function, the kernel. The most popular kernel functions are the linear, polynomial, radial basis function (RBF), and sigmoid kernels (Equation (2)).
    $$\begin{aligned}
    \text{Linear kernel:} \quad & K(x, x') = x^{\top} x' \\
    \text{Polynomial kernel:} \quad & K(x, x') = \left(x^{\top} x' + c\right)^{d} \\
    \text{Gaussian (RBF) kernel:} \quad & K(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right) \\
    \text{Sigmoid kernel:} \quad & K(x, x') = \tanh\left(\alpha\, x^{\top} x' + c\right)
    \end{aligned} \tag{2}$$
  • The naive Bayes classifier is based on Bayes’ theorem (Equation (3)), with the ‘naive’ assumption that features are independent given the value of the class variable. There are multiple variants depending on the data distribution, i.e., Gaussian Naive Bayes (Gaussian (normal) distribution), Multinomial Naive Bayes (multinomial distribution), Bernoulli Naive Bayes (Bernoulli distribution), and Complement Naive Bayes, which is a Multinomial Naive Bayes classifier designed specifically for imbalanced datasets [30].
    $$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \tag{3}$$
  • Random forests (RF) combine multiple decision trees to reach a single outcome [31]. Decision trees are non-parametric supervised learning methods structured like trees, consisting of a root node, branches, internal nodes, and leaf nodes. A tree is built top-down by repeatedly splitting the data at the optimal split point of each node. The mathematical criteria for splitting can be expressed as the entropy and the Gini index (Equation (4)), where $p_1$ and $p_2$ are the class proportions in a node.
    $$\begin{aligned}
    \text{Entropy:} \quad & \mathrm{Entropy} = -\left(p_1 \log_2(p_1) + p_2 \log_2(p_2)\right) \\
    \text{Gini index:} \quad & \mathrm{Gini}(E) = 1 - \left(p_1^{2} + p_2^{2}\right)
    \end{aligned} \tag{4}$$
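To make Equations (1) to (4) concrete, the following minimal sketch evaluates them numerically using only the Python standard library. The function names and the example inputs are illustrative assumptions and are not taken from the study's code.

```python
import math


def sigmoid(x):
    """Logistic function of Equation (1)."""
    return 1.0 / (1.0 + math.exp(-x))


def rbf_kernel(x, x_prime, sigma=1.0):
    """Gaussian (RBF) kernel of Equation (2) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def posterior(likelihood, prior, evidence):
    """Bayes' theorem of Equation (3): P(A|B) = P(B|A) P(A) / P(B)."""
    return likelihood * prior / evidence


def binary_entropy(p1):
    """Two-class entropy of Equation (4); a term with p = 0 contributes 0."""
    total = 0.0
    for p in (p1, 1.0 - p1):
        if p > 0.0:
            total -= p * math.log2(p)
    return total


def binary_gini(p1):
    """Two-class Gini index of Equation (4)."""
    p2 = 1.0 - p1
    return 1.0 - (p1 ** 2 + p2 ** 2)


# Illustrative values only: a perfectly mixed node versus a pure node.
print(sigmoid(0.0))                          # midpoint of the logistic curve
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))    # identical points
print(binary_entropy(0.5), binary_gini(0.5))
print(binary_entropy(1.0), binary_gini(1.0))
```

A perfectly mixed two-class node maximizes both impurity measures, while a pure node gives zero for both, which is why either criterion can drive the choice of split point.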

3. Materials and Methods

3.1. Study Population and Sample Collection

Semen samples were collected from 162 participants undergoing infertility evaluation at the Centre for Human Reproduction, GENESIS Athens-Thessaly. The cohort included both normospermic men and patients presenting with male-factor infertility; no direct pregnancy outcome data were recorded. All participants provided written informed consent prior to inclusion in the study. Each participant maintained a period of sexual abstinence lasting between two and five days before sample collection. Semen samples were obtained by masturbation and were allowed to liquefy for fifteen minutes at room temperature before analysis.

3.2. Conventional Semen Analysis

Semen analysis was performed according to the World Health Organization’s (WHO) 2021 guidelines. The parameters evaluated included sperm concentration, motility categories A, B, and C, as well as the percentage of morphologically normal spermatozoa. Based on the WHO motility classification, category A refers to rapid progressive motility, category B to slow or moderate progressive motility, and category C to non-progressive motility. These conventional semen parameters formed the foundation of the “Reference” dataset that was later used for ML analysis. For contextual interpretation, we provide the WHO (2021) reference lower limits, which are defined as the 5th percentile values derived from fertile men and are used as comparative benchmarks: total motility 42%, progressive motility 30%, normal morphology 4%, and total sperm count 39 million/ejaculate.
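As an illustration of how the reference lower limits above can be used as benchmarks, the sketch below flags which parameters of a sample fall below them. It assumes that progressive motility is the sum of categories A and B and that total motility is the sum of A, B, and C, following the category definitions above; the function and key names are hypothetical, and the example values are invented.

```python
# WHO (2021) reference lower limits quoted in the text above.
WHO_2021_LOWER_LIMITS = {
    "total_motility_pct": 42.0,
    "progressive_motility_pct": 30.0,
    "normal_morphology_pct": 4.0,
    "total_sperm_count_million": 39.0,
}


def below_reference(motility_a, motility_b, motility_c,
                    morphology_pct, total_count_million):
    """Return the sorted names of parameters below the reference limits.

    Assumption: progressive motility = A + B, total motility = A + B + C.
    """
    observed = {
        "progressive_motility_pct": motility_a + motility_b,
        "total_motility_pct": motility_a + motility_b + motility_c,
        "normal_morphology_pct": morphology_pct,
        "total_sperm_count_million": total_count_million,
    }
    return sorted(
        name for name, value in observed.items()
        if value < WHO_2021_LOWER_LIMITS[name]
    )


# Invented example sample, for illustration only.
print(below_reference(10.0, 15.0, 10.0, 5.0, 50.0))
```

These limits serve only as comparative context; they are not the classification target of the models, which is the DFI class.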

3.3. ORP Measurement

After liquefaction, aliquots of semen were used for oxidation–reduction potential (ORP) measurement using the MiOXSYS® analyzer (Aytu Bioscience, Inc., Englewood, CO, USA, SN: 03-100226). The device measures the redox balance in semen via a galvanostatic electrochemical sensor. A disposable sensor containing 30 µL of liquefied semen was placed into the analyzer according to the manufacturer’s instructions. Static ORP (sORP) values were normalized to sperm concentration and expressed as mV/$10^{6}$ sperm/mL. Quality control inspections were routinely performed to ensure instrument reliability. To minimize variability due to semen viscosity, samples exhibiting high viscosity were allowed extended liquefaction time or were excluded from ORP measurement, in accordance with the manufacturer’s recommendations.
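The normalization described above amounts to dividing the raw sORP reading by the sperm concentration. A minimal sketch follows; the function name and the example numbers are assumptions made for illustration and are not values from the study.

```python
def normalized_sorp(sorp_mv, concentration_million_per_ml):
    """Normalize a static ORP reading (mV) to sperm concentration.

    The result is expressed in mV per 10^6 sperm/mL.
    """
    if concentration_million_per_ml <= 0:
        raise ValueError("sperm concentration must be positive")
    return sorp_mv / concentration_million_per_ml


# Invented example: a 30 mV reading at 20 x 10^6 sperm/mL.
print(normalized_sorp(30.0, 20.0))
```

Because the concentration is in the denominator, samples with very low sperm counts yield large normalized values, which is one reason outlier-robust scaling of this feature can be useful.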

3.4. DNA Fragmentation Assessment

DNA fragmentation was assessed using the Cariad Sperm DNA Fragmentation Kit (Zhuhai Cariad Medical Technology Co., Ltd., Zhuhai, Guangdong, China) following the manufacturer’s protocol. The assay quantifies the proportion of spermatozoa exhibiting fragmented DNA, expressed as the DFI. Each semen sample was analyzed immediately after liquefaction under standardized laboratory conditions to ensure consistency. Based on reference data provided by the manufacturer and supported by previous studies, a DFI value of 28% was used as the cutoff threshold [32,33]. Samples with DFI < 28% were categorized as class 1, indicating lower levels of DNA fragmentation and better sperm DNA integrity, whereas those with DFI ≥ 28% were categorized as class 0, indicating increased DNA fragmentation and poorer DNA integrity. Internal validation with duplicate measurements confirmed intra-assay variability below 5%.
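The labeling rule above can be written as a one-line threshold function. This is a minimal sketch; the names `DFI_CUTOFF_PCT` and `dfi_class` are illustrative and do not come from the study.

```python
DFI_CUTOFF_PCT = 28.0  # cutoff threshold stated in the text


def dfi_class(dfi_pct):
    """Return 1 for DFI below the cutoff (lower fragmentation), else 0."""
    return 1 if dfi_pct < DFI_CUTOFF_PCT else 0


# Invented DFI values, for illustration only.
print([dfi_class(value) for value in (12.0, 27.9, 28.0, 41.5)])
```

Note that a sample exactly at the 28% cutoff falls into class 0, since class 1 requires a strictly lower value.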

3.5. Dataset Development

Following sample collection and laboratory assessment, two datasets were constructed to investigate the predictive value of oxidation–reduction potential and conventional semen parameters. The first dataset, referred to as the ORP dataset, consisted of 37 semen samples that included measurements for motility (A, B, and C), total sperm count, morphology, and ORP. Within this dataset, 12 samples were assigned to class 0 (DFI ≥ 28%) and 25 samples to class 1 (DFI < 28%). The second dataset, referred to as the Reference dataset, included 162 samples containing all the aforementioned semen parameters except ORP. In this dataset, 50 samples were assigned to class 0 and 112 samples to class 1.
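Both datasets are imbalanced toward class 1, so it can help to keep in mind the accuracy that a trivial classifier would reach by always predicting the majority class. The sketch below computes this baseline from the class counts given above; the baseline itself is an added point of comparison, not a quantity reported in the study.

```python
# Class counts stated in the text (class 0: DFI >= 28%, class 1: DFI < 28%).
CLASS_COUNTS = {
    "ORP": {0: 12, 1: 25},
    "Reference": {0: 50, 1: 112},
}


def majority_baseline(counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


for name, counts in CLASS_COUNTS.items():
    total = sum(counts.values())
    print(name, total, round(majority_baseline(counts), 3))
```

Accuracy values close to these baselines would indicate little learned signal, which is one reason to read accuracy together with precision, recall, AUC, and F1.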

3.6. Machine Learning Workflow

Our experimental protocol was carefully designed to evaluate the performance of ML models [34] in classifying semen samples based on the features from the spermogram and ORP. Our aim was to find the best features, as well as their optimal combination, for correct diagnosis. For clarity, we divided the protocol into two phases corresponding to the two datasets, with a systematic assessment of the features and preprocessing methodologies (Figure 1).
In the first phase, the Reference dataset was analyzed using three different subsets of features: (1) the motility features only, (2) motility and morphology, and (3) all features, which included the already mentioned features along with the total number of spermatozoa. The second phase focused on the ORP dataset, where we assessed the role of ORP in classification. To do this, we created two scenarios: the first included the ORP and used the feature subset of the Reference dataset, while the second excluded the ORP. In both phases we used four ML algorithms: LR, SVM, BNB, and RF. Each algorithm was trained and tested using 5-fold cross-validation under three approaches: (1) no scaling, (2) Robust and MinMax scalers, and (3) Robust and Standard scalers. The Robust scaler (Equation (5)) was applied to features with outliers, while the MinMax (Equation (6)) and the Standard scalers (Equation (7)) were applied to all features to ensure consistent comparisons across the subsets [35]:
  • Robust scaler: Scales data by removing the median and dividing by the interquartile range (IQR). It is particularly useful for datasets with outliers, as it focuses on the central 50% of the data, making it robust to extreme values.
    $$x' = \frac{x - \mathrm{Median}(x)}{\mathrm{IQR}(x)} \tag{5}$$
  • MinMax scaler: Scales the data to a fixed range of [0, 1] by subtracting the minimum value and dividing by the range (max − min). It is sensitive to outliers, as extreme values can skew the scaling, making it less effective when outliers are present.
    $$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{6}$$
  • Standard scaler: Scales the data by removing the mean and dividing by the standard deviation, resulting in a distribution with a mean of 0 and a standard deviation of 1. It is commonly used when the data is assumed to be normally distributed or when algorithms require standardized data.
    $$x' = \frac{x - \mu}{\sigma} \tag{7}$$
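The two scaling pipelines described above could be assembled as follows. This is a sketch under stated assumptions: it uses scikit-learn (the study does not name its software), it runs on synthetic data, and the choice of which column counts as "a feature with outliers" is made up for the example.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, RobustScaler, StandardScaler


def make_scaling_pipeline(outlier_columns, final_scaler):
    """Robust-scale the outlier-prone columns, then scale all columns.

    Note: ColumnTransformer places the robust-scaled columns first and the
    passthrough columns after them, so the column order changes.
    """
    robust_step = ColumnTransformer(
        [("robust", RobustScaler(), outlier_columns)],
        remainder="passthrough",
    )
    return Pipeline([("robust", robust_step), ("final", final_scaler)])


# Synthetic data: 40 samples, 3 features, one artificial outlier in column 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
X[0, 2] = 50.0

robust_minmax = make_scaling_pipeline([2], MinMaxScaler())      # Eq. (5) then (6)
robust_standard = make_scaling_pipeline([2], StandardScaler())  # Eq. (5) then (7)

X_mm = robust_minmax.fit_transform(X)
X_std = robust_standard.fit_transform(X)

print(X_mm.min(axis=0), X_mm.max(axis=0))    # each column spans [0, 1]
print(X_std.mean(axis=0), X_std.std(axis=0)) # each column has mean ~0, std ~1
```

When such a pipeline is combined with a classifier inside cross-validation, fitting the scalers on the training folds only avoids leaking information from the test fold.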

4. Results

As mentioned in Materials and Methods, we analyzed each dataset (Reference and ORP) using four classification algorithms, evaluating each algorithm's performance on five metrics: (a) accuracy, (b) precision, (c) recall, (d) AUC, and (e) F1 score. The evaluations were conducted across three preprocessing pipelines: no scaling, Robust-MinMax scaling, and Robust-Standard scaling. Table 1 summarizes the means and standard deviations of the metrics across the train and test folds of all pipelines per dataset. Additionally, Figure 2 and Figure 3 summarize the results of the train and test folds from the cross-validation process. Each reported value represents the average performance across the folds of the dataset.
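A hedged sketch of how such an evaluation could be run is shown below. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the unavailable study data, and picks default hyperparameters, stratified shuffled folds, and a fixed random seed; none of these choices are confirmed by the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC

# Synthetic stand-in with roughly the Reference dataset's size and imbalance.
X, y = make_classification(
    n_samples=162, n_features=5, n_informative=3, n_redundant=0,
    weights=[0.31, 0.69], random_state=0,
)

MODELS = {
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "BNB": BernoulliNB(),
    "RF": RandomForestClassifier(random_state=0),
}
SCORING = ["accuracy", "precision", "recall", "roc_auc", "f1"]

# 5-fold cross-validation as in the text; stratification is an assumption.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

summary = {}
for name, model in MODELS.items():
    scores = cross_validate(
        model, X, y, cv=cv, scoring=SCORING, return_train_score=True
    )
    # Keep the mean and standard deviation of each train/test metric.
    summary[name] = {
        key: (float(np.mean(values)), float(np.std(values)))
        for key, values in scores.items()
        if key.startswith(("train_", "test_"))
    }

for name, metrics in summary.items():
    print(name, {k: round(m, 3) for k, (m, _) in metrics.items()})
```

Reporting train and test scores side by side, as the text describes, makes overfitting visible as a large gap between the two.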

4.1. Phase 1: Reference Dataset

In the Reference dataset, the BNB algorithm seems to be the most robust model among the four, as shown in Table 2. The second best is RF, followed by SVM and LR. The optimal scaling approach was the Robust-MinMax pipeline, which yielded the best results for LR and BNB, while the SVM model achieved its best performance with Robust-Standard scaling. LR and RF achieved their best metrics with the X_all feature subset, which includes the total number of spermatozoa along with the motility and morphology features. By contrast, the BNB and SVM models performed best with the X_mot (motilities only) and X_mot_morph (motilities and morphology) subsets, respectively. These differences highlight the critical role of subset selection in model performance.
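The three feature subsets can be expressed as nested column lists. In the sketch below the column names are hypothetical, and appending ORP to any subset reflects one reading of the ORP-dataset scenario rather than a confirmed detail of the study.

```python
# Hypothetical column names; the study's actual variable names are not given.
MOTILITY = ["motility_a", "motility_b", "motility_c"]

FEATURE_SUBSETS = {
    "X_mot": MOTILITY,
    "X_mot_morph": MOTILITY + ["morphology_pct"],
    "X_all": MOTILITY + ["morphology_pct", "total_sperm_count"],
}


def columns_for(subset, include_orp=False):
    """Return the column names of a subset, optionally with ORP appended."""
    columns = list(FEATURE_SUBSETS[subset])
    if include_orp:
        columns.append("orp")
    return columns


def select_features(sample, subset, include_orp=False):
    """Extract a feature vector from a sample stored as a dict."""
    return [sample[name] for name in columns_for(subset, include_orp)]


# Invented sample values, for illustration only.
sample = {
    "motility_a": 20.0, "motility_b": 25.0, "motility_c": 10.0,
    "morphology_pct": 5.0, "total_sperm_count": 60.0, "orp": 1.5,
}
print(select_features(sample, "X_mot"))
print(columns_for("X_all", include_orp=True))
```

With ORP included, X_all contains six features, which matches the six features mentioned for the classifiers in Section 2.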

4.2. Phase 2: ORP Dataset

In the ORP dataset, the inclusion of ORP consistently improved the best metrics of all algorithms except recall, as can be seen in Table 3. However, ORP had a less favorable effect on SVM and LR across all runs. A single run corresponds to executing a k-fold cross-validation for each dataset and each ML pipeline. Notably, the SVM performance deteriorated, with only 1 (SVM’s best) of the 9 runs performing better with ORP, while the remaining 8 runs performed better without the ORP feature. LR showed a similar trend but with more moderate effects, with 4 of the 9 runs performing better with ORP and 5 without it. The best-performing subset of features across all runs was X_all (the all-features subset), which includes the motility, morphology, sperm count, and ORP features. Nevertheless, the LR model achieved its best performance with subset X_mot_morph on the NoORP data, and BNB with X_all on the ORP data. This suggests that adding ORP strengthens predictions of sperm DNA integrity by improving the model’s capacity to categorize samples into low and high DFI groups when combined with all other traditional semen parameters.
Interestingly, in this case too, the best algorithm is BNB with the Robust-MinMax pipeline (as in the Reference dataset), given the possibly misleading effectiveness of SVM, whose best performance metrics were achieved without scaling in both data subsets. Scaling is a key prerequisite for SVM, since features with very large values could otherwise dominate the others when computing distances between the hyperplane and the support vectors.
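One way to read the "9 runs" per algorithm mentioned above is as the grid of three feature subsets by three preprocessing pipelines. The sketch below enumerates that grid; this interpretation and the helper name are assumptions, since the text does not spell out how the runs are counted.

```python
from itertools import product

FEATURE_SUBSETS = ("X_mot", "X_mot_morph", "X_all")
PIPELINES = ("no_scaling", "robust_minmax", "robust_standard")
ALGORITHMS = ("LR", "SVM", "BNB", "RF")

# Assumed reading: one run = one (feature subset, pipeline) pair per algorithm.
runs_per_algorithm = list(product(FEATURE_SUBSETS, PIPELINES))
runs_per_dataset = list(product(ALGORITHMS, FEATURE_SUBSETS, PIPELINES))


def share_better_with_orp(wins_with_orp, total_runs=len(runs_per_algorithm)):
    """Fraction of an algorithm's runs that improved when ORP was included."""
    return wins_with_orp / total_runs


print(len(runs_per_algorithm), len(runs_per_dataset))
print("SVM:", round(share_better_with_orp(1), 3))  # 1 of 9 runs, as reported
print("LR:", round(share_better_with_orp(4), 3))   # 4 of 9 runs, as reported
```

Under this reading, each dataset involves 36 cross-validation runs in total across the four algorithms.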

5. Discussion

To the best of our knowledge, this is the first scientific work to use classical semen parameters in combination with ORP values to qualitatively predict DFI, offering a novel perspective on the role of OS in sperm DNA integrity. The results of the study reveal several interesting aspects of applying ML methods to semen analysis. Specifically, the X_all subsets generally yielded the best results across the different runs. The BNB algorithm emerged as the most robust model across the two phases, particularly when used with the Robust-MinMax scaling pipeline. However, in the ORP and Reference datasets the best subset was X_mot, while in the NoORP dataset it was X_all, highlighting how strongly the training of an ML algorithm depends on the dataset and the corresponding features. The RF algorithm was found to be the most stable, with X_all as its best subset, showing the high importance of the total count of spermatozoa. The SVM model, on the other hand, performed more inconsistently across subsets and scaling strategies, with unusual results when the ORP feature was included. This unusual behavior may be due to overfitting in the no-scaling pipeline, while in the other 8 runs, it had less effective generalization when the ORP feature was omitted (NoORP dataset). The number of samples and the inclusion of the ORP affect the importance values for each metric, highlighting the need for careful evaluation of the features and the dataset size. Additionally, due to the limited sample size, we employed simpler ML algorithms to avoid overfitting. However, as larger datasets become available, the exploration of more complex and robust ML architectures, along with external validation on independent test sets, will be critical to confirm the predictive power and clinical utility of ORP as a biomarker for sperm quality. Finally, the measurement of ORP may be influenced by the inherent heterogeneity of semen samples. Specifically, variations between the donor’s total ejaculate volume (in mL) and the smaller analyzed subsample (in µL) can affect the consistency of ORP readings, although the measurements are scientifically validated [36]. Further research in this area will help us understand the value of features in MI, allowing us to create more successful algorithms.
The findings of this study suggest a potential importance of ORP in semen analysis, as results between the ORP, NoORP, and Reference datasets show similar trends despite differences in sample size. Compared with traditional semen analysis, the proposed machine learning approach offers a more thorough assessment of OS, DNA integrity, and overall sperm function. By identifying nonlinear patterns that traditional diagnostic techniques miss, ML can improve prediction accuracy. It is important to note that these patterns should be interpreted with caution, as the higher metrics of the BNB model on the ORP dataset compared to the NoORP dataset may reflect sample size effects rather than the inclusion of ORP. Therefore, larger multicenter cohorts are necessary to validate these preliminary findings. However, the consistency across 90 different runs (27 × 3 + 9 × 1) highlights the potential of ORP as a significant characteristic, which could play an important role in the performance of ML methodologies on DFI prediction. This is crucial, since the implementation of robust ML methodologies could enhance traditional methods of ART, bringing a data-driven perspective to this critical research area. Achieving this requires the development of larger and more diverse datasets, with careful selection of the features, such as those derived from state-of-the-art medical examinations. Additionally, a deeper understanding of feature contributions is essential for interpreting the impact of variables such as ORP. Analyzing how each feature influences model predictions can provide more direct evidence of their relevance, beyond overall performance metrics [37]. In the present study, the small test set limited our ability to perform a robust feature-level analysis.
The demonstration of ORP as a fundamental feature in measuring OS is critical, since increased OS impairs sperm motility, as well as membrane and DNA integrity—essential factors for successful fertilization and early embryo development. Our approach bridges the gap between sperm diagnostics and the downstream processes of fertilization and embryo viability, providing a novel preliminary viewpoint closely aligned with embryological objectives. A high DFI indicates DNA damage, which correlates with decreased embryo quality, lower fertilization rates, and poor pregnancy outcomes. Improved DFI prediction can increase the likelihood of successful ART outcomes by guiding clinical treatments, informing patient counseling, and supporting more effective embryo selection strategies. Parameters reflecting sperm function such as capacitation or acrosome integrity were not available for all samples and were therefore excluded. These measures could provide unique insights into sperm activity and can substantially improve the biological interpretability and predictive performance of future models. Therefore, to increase model robustness and clarify the connection between ORP, oxidative stress, and sperm DNA integrity, future research should include these functional and biochemical markers with larger sample sizes.
Finally, biochemical markers such as alanine transaminase (ALT), superoxide dismutase (SOD), or catalase activity, can further characterize oxidative metabolism. Future studies integrating enzymatic assays with ORP could clarify whether specific enzymatic profiles modulate the oxidative balance captured by ORP. In addition, given that OS is modulated by lifestyle, future iterations of this ML framework should incorporate variables such as age, smoking status, alcohol intake, diet quality, physical activity, and BMI. Including these parameters may allow the model to capture environmental and behavioral contributions to oxidative imbalance, thus enhancing clinical interpretability.

6. Conclusions

By proving that ML models can accurately identify semen samples based on DNA fragmentation, this study emphasizes the importance of DFI as a diagnostic predictor of sperm quality and reproductive potential. Our results show a relationship between OS and sperm DNA integrity and that adding ORP as a variable improves the prediction accuracy of these models. Clinically speaking, sophisticated diagnostic instruments that combine ORP with traditional semen parameters can help embryologists find spermatozoa with the best oxidative balance and DNA integrity, enabling better sperm selection and ART choices. These ML frameworks have the potential to improve ART outcomes by optimizing culture conditions and enhancing embryo quality through the integration of DFI prediction into routine assessments. Overall, this study shows that a promising approach for more accurate sperm diagnostics is the combination of ORP measurements with DFI-based classification. By expanding our knowledge of the connection between sperm DNA fragmentation, OS, and reproductive success, these findings aid in the continuous development of ART and eventually assist patients while also improving reproductive research.
In practical diagnostics, ML algorithms combining classical semen parameters with ORP could be embedded into automated semen analysis platforms. Following external validation, such models could identify samples with elevated oxidative imbalance or high DFI probability, thereby guiding clinicians toward additional testing, antioxidant therapy, or tailored ART approaches. Ultimately, this integration could support rapid, data-driven decision-making in male infertility diagnostics.

Author Contributions

Conceptualization, N.G., A.T.T. and K.D.; methodology, E.D.O. and E.M.; software, E.D.O., N.G. and A.T.T.; validation, E.D.O., E.M., A.Z. (Athanasios Zikopoulos), N.G. and A.T.T.; formal analysis, E.D.O. and E.M.; investigation, E.D.O. and E.M.; resources, A.P., K.P. and K.D.; data curation, E.D.O. and E.M.; writing—original draft preparation, E.D.O. and E.M.; writing—review and editing, A.Z. (Athanasios Zikopoulos), S.D., E.P., A.-M.G., A.Z. (Athanasios Zachariou), A.P., N.G., K.P., A.T.T. and K.D.; visualization, E.D.O. and E.M.; supervision, A.T.T. and K.D.; project administration, K.P. and K.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics and Deontology Committee of the Medically Assisted Reproduction Unit “GENESIS ATHENS THESSALY” (protocol code 366 during its meeting on 13 January 2025).

Informed Consent Statement

Informed consent was obtained from all individual participants included in the study.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Acknowledgments

We are grateful to the participants for their involvement in the study. In addition, we extend our gratitude to the GENESIS Athens-Thessaly research team for their support in data collection and project coordination. Their scientific expertise played a significant role in the execution of this study. Additionally, during the preparation of this manuscript, the authors used GPT-5 mini (GPT, OpenAI) for the purposes of improving the clarity, grammar, and overall language structure of the text. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ART       Assisted Reproductive Technology
ML        Machine Learning
ORP       Oxidation-Reduction Potential
DFI       DNA Fragmentation Index
IVF       In Vitro Fertilization
OS        Oxidative Stress
ROS       Reactive Oxygen Species
MI        Male Infertility
IUI       Intrauterine Insemination Treatment
ICSI      Intracytoplasmic Sperm Injection
LR        Logistic Regression
SVM       Support Vector Machine
NB        Naive Bayes
RF        Random Forest
XGBoost   Extreme Gradient Boosting
MiOXSYS   Male Infertility Oxidative System

References

  1. Mazzilli, R.; Rucci, C.; Vaiarelli, A.; Cimadomo, D.; Ubaldi, F.M.; Foresta, C.; Ferlin, A. Male factor infertility and assisted reproductive technologies: Indications, minimum access criteria and outcomes. J. Endocrinol. Investig. 2023, 46, 1079–1085.
  2. Tanga, B.M.; Qamar, A.Y.; Raza, S.; Bang, S.; Fang, X.; Yoon, K.; Cho, J. Semen evaluation: Methodological advancements in sperm quality-specific fertility assessment. Anim. Biosci. 2021, 34, 1253–1270.
  3. Li, T.K. The Glutathione and Thiol Content of Mammalian Spermatozoa and Seminal Plasma. Biol. Reprod. 1975, 12, 641–646.
  4. Antinozzi, C.; Di Luigi, L.; Sireno, L.; Caporossi, D.; Dimauro, I.; Sgrò, P. Protective Role of Physical Activity and Antioxidant Systems During Spermatogenesis. Biomolecules 2025, 15, 478.
  5. Henkel, R.; Sandhu, I.S.; Agarwal, A. The excessive use of antioxidant therapy: A possible cause of male infertility? Andrologia 2018, 51, e13162.
  6. Bouhadana, D.; Godin Pagé, M.H.; Montjean, D.; Bélanger, M.C.; Benkhalifa, M.; Miron, P.; Petrella, F. The Role of Antioxidants in Male Fertility: A Comprehensive Review of Mechanisms and Clinical Applications. Antioxidants 2025, 14, 1013.
  7. Moustakli, E.; Zikopoulos, A.; Sakaloglou, P.; Bouba, I.; Sofikitis, N.; Georgiou, I. Functional association between telomeres, oxidation and mitochondria. Front. Reprod. Health 2023, 5, 1107215.
  8. Takeshima, T.; Usui, K.; Mori, K.; Asai, T.; Yasuda, K.; Kuroda, S.; Yumura, Y. Oxidative stress and male infertility. Reprod. Med. Biol. 2020, 20, 41–52.
  9. Walke, G.; Gaurkar, S.S.; Prasad, R.; Lohakare, T.; Wanjari, M. The Impact of Oxidative Stress on Male Reproductive Function: Exploring the Role of Antioxidant Supplementation. Cureus 2023, 15, e42583.
  10. Yang, H.; Li, G.; Jin, H.; Guo, Y.; Sun, Y. The effect of sperm DNA fragmentation index on assisted reproductive technology outcomes and its relationship with semen parameters and lifestyle. Transl. Androl. Urol. 2019, 8, 356–365.
  11. Li, F.; Duan, X.; Li, M.; Ma, X. Sperm DNA fragmentation index affect pregnancy outcomes and offspring safety in assisted reproductive technology. Sci. Rep. 2024, 14, 356.
  12. Solanki, M.; Joseph, T.; Muthukumar, K.; Samuel, P.; Aleyamma, T.K.; Kamath, M.S. Impact of sperm DNA fragmentation in couples with unexplained recurrent pregnancy loss: A cross-sectional study. J. Obstet. Gynaecol. Res. 2024, 50, 1687–1696.
  13. Agarwal, A.; Henkel, R.; Sharma, R.; Tadros, N.N.; Sabanegh, E. Determination of seminal oxidation-reduction potential (ORP) as an easy and cost-effective clinical marker of male infertility. Andrologia 2017, 50, e12914.
  14. Panner Selvam, M.K.; Moharana, A.K.; Baskaran, S.; Finelli, R.; Hudnall, M.C.; Sikka, S.C. Current Updates on Involvement of Artificial Intelligence and Machine Learning in Semen Analysis. Medicina 2024, 60, 279.
  15. Chu, K.Y.; Nassau, D.E.; Arora, H.; Lokeshwar, S.D.; Madhusoodanan, V.; Ramasamy, R. Artificial Intelligence in Reproductive Urology. Curr. Urol. Rep. 2019, 20, 52.
  16. Mehrjerd, A.; Dehghani, T.; Jajroudi, M.; Eslami, S.; Rezaei, H.; Ghaebi, N.K. Ensemble machine learning models for sperm quality evaluation concerning success rate of clinical pregnancy in assisted reproductive techniques. Sci. Rep. 2024, 14, 24283.
  17. Peng, T.; Liao, C.; Ye, X.; Chen, Z.; Li, X.; Lan, Y.; Fu, X.; An, G. Machine learning-based clustering to identify the combined effect of the DNA fragmentation index and conventional semen parameters on in vitro fertilization outcomes. Reprod. Biol. Endocrinol. 2023, 21, 26.
  18. Sene, A.A.; Zandieh, Z.; Soflaei, M.; Torshizi, H.M.; Sheibani, K. Using artificial intelligence to predict the intrauterine insemination success rate among infertile couples. Middle East Fertil. Soc. J. 2021, 26, 46.
  19. Shemshaki, G.; Murthy, A.S.N.; Malini, S.S. Assessment and Establishment of Correlation between Reactive Oxidation Species, Citric Acid, and Fructose Level in Infertile Male Individuals: A Machine-Learning Approach. J. Hum. Reprod. Sci. 2021, 14, 129–136.
  20. Santi, D.; Spaggiari, G.; Casonati, A.; Casarini, L.; Grassi, R.; Vecchi, B.; Roli, L.; De Santis, M.C.; Orlando, G.; Gravotta, E.; et al. Multilevel approach to male fertility by machine learning highlights a hidden link between haematological and spermatogenetic cells. Andrology 2020, 8, 1021–1029.
  21. Zhou, M.; Yao, T.; Li, J.; Hui, H.; Fan, W.; Guan, Y.; Zhang, A.; Xu, B. Preliminary prediction of semen quality based on modifiable lifestyle factors by using the XGBoost algorithm. Front. Med. 2022, 9, 811890.
  22. Bachelot, G.; Lamaziere, A.; Czernichow, S.; Faure, C.; Racine, C.; Levy, R.; Dupont, C. Machine learning approach to assess the association between anthropometric, metabolic, and nutritional status and semen parameters. Asian J. Androl. 2024, 26, 349–355.
  23. Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R.P.; Tang, J.; Liu, H. Feature Selection: A Data Perspective. ACM Comput. Surv. 2017, 50, 1–45.
  24. Kuo, F.; Sloan, I. Lifting the Curse of Dimensionality. Not. AMS 2005, 52, 1320–1328.
  25. Stavros, S.; Potiris, A.; Molopodi, E.; Mavrogianni, D.; Zikopoulos, A.; Louis, K.; Karampitsakos, T.; Nazou, E.; Sioutis, D.; Christodoulaki, C.; et al. Sperm DNA Fragmentation: Unraveling Its Imperative Impact on Male Infertility Based on Recent Evidence. Int. J. Mol. Sci. 2024, 25, 10167.
  26. Sakkas, K.; Dimitriou, E.G.; Ntagka, N.E.; Giannakeas, N.; Kalafatakis, K.; Tzallas, A.T.; Glavas, E. Personalized Visualization of the Gestures of Parkinson’s Disease Patients with Virtual Reality. Future Internet 2024, 16, 305.
  27. Oikonomou, E.D.; Karvelis, P.; Giannakeas, N.; Vrachatis, A.; Glavas, E.; Tzallas, A.T. How natural language processing derived techniques are used on biological data: A systematic review. Netw. Model. Anal. Health Informatics Bioinform. 2024, 13, 23.
  28. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160.
  29. Patle, A.; Chouhan, D.S. SVM kernel functions for classification. In Proceedings of the 2013 International Conference on Advances in Technology and Engineering (ICATE), Mumbai, India, 23–25 January 2013; pp. 1–9.
  30. Yang, F.J. An Implementation of Naive Bayes Classifier. In Proceedings of the 2018 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 12–14 December 2018; pp. 301–306.
  31. Cutler, A.; Cutler, D.R.; Stevens, J.R. Random Forests. In Ensemble Machine Learning; Springer: New York, NY, USA, 2012; pp. 157–175.
  32. Sun, T.C.; Zhang, Y.; Li, H.T.; Liu, X.M.; Yi, D.X.; Tian, L.; Liu, Y.X. Sperm DNA fragmentation index, as measured by sperm chromatin dispersion, might not predict assisted reproductive outcome. Taiwan. J. Obstet. Gynecol. 2018, 57, 493–498.
  33. Caliskan, Z.; Kucukgergin, C.; Aktan, G.; Kadioglu, A.; Ozdemirler, G. Evaluation of sperm DNA fragmentation in male infertility. Andrologia 2022, 54, e14587.
  34. Zandbagleh, A.; Miltiadous, A.; Sanei, S.; Azami, H. Beta-to-Theta Entropy Ratio of EEG in Aging, Frontotemporal Dementia, and Alzheimer’s Dementia. Am. J. Geriatr. Psychiatry 2024, 32, 1361–1382.
  35. de Amorim, L.B.; Cavalcanti, G.D.; Cruz, R.M. The choice of scaling technique matters for classification performance. Appl. Soft Comput. 2023, 133, 109924.
  36. Agarwal, A.; Parekh, N.; Panner Selvam, M.K.; Henkel, R.; Shah, R.; Homa, S.T.; Ramasamy, R.; Ko, E.; Tremellen, K.; Esteves, S.; et al. Male Oxidative Stress Infertility (MOSI): Proposed Terminology and Clinical Practice Guidelines for Management of Idiopathic Male Infertility. World J. Men’s Health 2019, 37, 296.
  37. Ewald, F.K.; Bothmann, L.; Wright, M.N.; Bischl, B.; Casalicchio, G.; König, G. A Guide to Feature Importance Methods for Scientific Inference. arXiv 2024, arXiv:2404.12862.
Figure 1. Flowchart of the methodology. For the Reference dataset, three scaling options (Robust-MinMax, Robust-Standard, and no scaling) are applied to three feature sets: X_mot (motility parameters only), X_mot_morph (X_mot plus the morphology feature), and X_all (X_mot_morph plus the total sperm count). Each configuration is then processed by four algorithms to classify samples into two categories: DFI ≥ 28 and DFI < 28. For the ORP dataset, an additional step splits the data into two subsets, one with the ORP feature and one without it. The * symbol indicates that each feature set exists in two versions: X_mot* refers to X_mot_ORP and X_mot_NoORP, and likewise for X_mot_morph* and X_all*. This multi-step approach allows the classification results to be compared under different conditions.
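The configuration grid described in the Figure 1 caption can be enumerated programmatically. The following Python sketch is illustrative only and is not the study's code: the column names (`motility_a`, `morphology_pct`, `total_sperm_count`, `orp`) and the helper `build_grid` are hypothetical, and the sketch only lists the combinations of feature set, scaling option, and algorithm without training anything.

```python
from itertools import product

# Hypothetical column names; the paper does not give the actual identifiers.
MOTILITY = ["motility_a", "motility_b", "motility_c"]
FEATURE_SETS = {
    "X_mot": MOTILITY,
    "X_mot_morph": MOTILITY + ["morphology_pct"],
    "X_all": MOTILITY + ["morphology_pct", "total_sperm_count"],
}
SCALERS = ["Robust-MinMax", "Robust-Standard", "No scaling"]
MODELS = ["LR", "SVM", "BNB", "RF"]


def build_grid(with_orp_variants=False):
    """Enumerate every (feature set, scaler, model) combination.

    For the ORP dataset, each feature set is duplicated into one variant
    with the ORP column appended and one variant without it.
    """
    grid = []
    for (name, cols), scaler, model in product(FEATURE_SETS.items(), SCALERS, MODELS):
        if with_orp_variants:
            grid.append({"features": name + "_ORP", "columns": cols + ["orp"],
                         "scaler": scaler, "model": model})
            grid.append({"features": name + "_NoORP", "columns": list(cols),
                         "scaler": scaler, "model": model})
        else:
            grid.append({"features": name, "columns": list(cols),
                         "scaler": scaler, "model": model})
    return grid


reference_grid = build_grid()
orp_grid = build_grid(with_orp_variants=True)
print(len(reference_grid), len(orp_grid))  # 36 72
```

Under these assumptions the Reference dataset yields 3 × 3 × 4 = 36 configurations, and the ORP dataset yields 72 because every feature set appears once with and once without the ORP feature.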
Figure 2. Performance metrics (Accuracy, Precision, AUC, Recall, and F1 score) on the training split for the three data configurations: ORP, NoORP, and Reference. The subplots show the distribution of each metric across the 5 cross-validation folds.
Figure 3. Performance metrics (Accuracy, Precision, AUC, Recall, and F1 score) on the test split for the three data configurations: ORP, NoORP, and Reference. The subplots show the distribution of each metric across the 5 cross-validation folds.
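Figures 2 and 3 report metrics per fold of a 5-fold cross-validation. As a rough illustration of how such train and test splits arise, the Python sketch below partitions sample indices into five folds. It assumes plain shuffled k-fold; whether the study shuffled or stratified its folds is not stated here, and the function name `k_fold_indices` and the seed are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 37 is the size of the ORP dataset reported in the abstract.
for train_idx, test_idx in k_fold_indices(37, k=5):
    print(len(train_idx), len(test_idx))
```

With 37 samples, each test fold holds only 7 or 8 samples, which is consistent with the larger spread of the test metrics compared with the training metrics.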
Table 1. The mean and standard deviation (SD) of evaluation metrics across test and train sets for three datasets. Values in fresh turquoise represent the highest mean, and values in akabeni represent the lowest standard deviation.
Data       Statistic  Test_Accuracy  Train_Accuracy  Test_Precision  Train_Precision  Test_Recall
ORP        Mean       0.84           0.86            0.87            0.89             0.92
ORP        SD         0.11           0.02            0.13            0.02             0.11
NoORP      Mean       0.84           0.84            0.87            0.86             0.92
NoORP      SD         0.11           0.02            0.13            0.03             0.11
Reference  Mean       0.77           0.77            0.77            0.77             0.94
Reference  SD         0.08           0.02            0.05            0.01             0.07

Data       Statistic  Train_Recall   Test_F1         Train_F1        Test_ROC_AUC     Train_ROC_AUC
ORP        Mean       0.92           0.89            0.90            0.77             0.84
ORP        SD         0.03           0.08            0.02            0.21             0.03
NoORP      Mean       0.92           0.89            0.89            0.76             0.80
NoORP      SD         0.03           0.08            0.01            0.19             0.03
Reference  Mean       0.94           0.85            0.85            0.66             0.66
Reference  SD         0.02           0.06            0.01            0.10             0.02
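Each Mean and SD entry in Table 1 summarizes one metric over the cross-validation folds. The Python sketch below shows that aggregation on made-up per-fold scores, which are not values from the study. It assumes the sample standard deviation (n - 1 denominator); the paper does not state which form was used, and the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Made-up per-fold test scores for illustration only; not values from the study.
fold_scores = {
    "accuracy": [0.80, 0.90, 0.70, 0.80, 0.80],
    "recall": [1.00, 0.90, 0.90, 1.00, 0.90],
}


def summarize(scores):
    """Return (mean, SD) per metric, rounded to two decimals.

    stdev() is the sample standard deviation (n - 1 denominator); using
    statistics.pstdev() instead would give the population form.
    """
    return {name: (round(mean(vals), 2), round(stdev(vals), 2))
            for name, vals in scores.items()}


for name, (m, sd) in summarize(fold_scores).items():
    print(f"{name}: mean={m:.2f}, SD={sd:.2f}")
```

With only five folds, the choice between sample and population SD changes the reported value noticeably, so it is worth stating explicitly when reproducing a table of this kind.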
Table 2. Best pipelines for the Reference dataset based on evaluation metrics of four ML algorithms: (a) Logistic Regression (LR), (b) Support Vector Machines (SVM), (c) Bernoulli Naive Bayes (BNB), and (d) Random Forest (RF). Fresh turquoise indicates the best metric value averaged across 5 folds.
Model  Dataset Subset  Scaler           Test_Accuracy  Train_Accuracy  Test_Precision  Train_Precision  Test_Recall
LR     X_all           Robust-MinMax    0.70           0.70            0.87            0.85             0.69
SVM    X_mot_morph     Robust-Standard  0.72           0.75            0.77            0.78             0.85
BNB    X_mot           Robust-MinMax    0.76           0.76            0.77            0.77             0.94
RF     X_all           No scaling       0.73           0.77            0.78            0.81             0.84

Model  Dataset Subset  Scaler           Train_Recall   Test_F1         Train_F1        Test_AUC         Train_AUC
LR     X_all           Robust-MinMax    0.70           0.75            0.77            0.75             0.78
SVM    X_mot_morph     Robust-Standard  0.87           0.81            0.83            0.77             0.81
BNB    X_mot           Robust-MinMax    0.94           0.85            0.85            0.66             0.66
RF     X_all           No scaling       0.87           0.81            0.84            0.78             0.82
Table 3. Best pipelines for the ORP dataset based on evaluation metrics of four ML algorithms: (a) LR, (b) SVM, (c) BNB, and (d) RF. Dataset subsets include X_all, X_mot_morph, and X_mot. Fresh turquoise marks the best metric value across 5 folds with the ORP feature included, and akabeni marks the best metric value across 5 folds with the ORP feature excluded.
Model  ORP    Dataset Subset  Scaler         Test_Accuracy  Train_Accuracy  Test_Precision  Train_Precision  Test_Recall
LR     ORP    X_all           Robust-MinMax  0.78           0.82            0.85            0.86             0.88
LR     NoORP  X_mot_morph     Robust-MinMax  0.78           0.79            0.84            0.84             0.88
SVM    ORP    X_all           No scaling     0.84           0.87            0.89            0.94             0.88
SVM    NoORP  X_all           No scaling     0.84           0.87            0.89            0.94             0.88
BNB    ORP    X_mot           Robust-MinMax  0.84           0.86            0.87            0.89             0.92
BNB    NoORP  X_all           Robust-MinMax  0.84           0.84            0.84            0.86             0.92
RF     ORP    X_all           No scaling     0.76           0.83            0.82            0.87             0.88
RF     NoORP  X_all           No scaling     0.79           0.82            0.84            0.86             0.88

Model  ORP    Dataset Subset  Scaler         Train_Recall   Test_F1         Train_F1        Test_AUC         Train_AUC
LR     ORP    X_all           Robust-MinMax  0.87           0.85            0.87            0.85             0.86
LR     NoORP  X_mot_morph     Robust-MinMax  0.85           0.85            0.84            0.85             0.87
SVM    ORP    X_all           No scaling     0.86           0.88            0.90            0.86             0.89
SVM    NoORP  X_all           No scaling     0.87           0.88            0.90            0.86             0.89
BNB    ORP    X_mot           Robust-MinMax  0.92           0.89            0.90            0.77             0.84
BNB    NoORP  X_all           Robust-MinMax  0.92           0.89            0.89            0.76             0.80
RF     ORP    X_all           No scaling     0.88           0.84            0.88            0.85             0.91
RF     NoORP  X_all           No scaling     0.88           0.85            0.87            0.88             0.90
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Oikonomou, E.D.; Moustakli, E.; Zikopoulos, A.; Dafopoulos, S.; Prapa, E.; Gkountis, A.-M.; Zachariou, A.; Pantou, A.; Giannakeas, N.; Pantos, K.; et al. A Preliminary Machine Learning Assessment of Oxidation-Reduction Potential and Classical Sperm Parameters as Predictors of Sperm DNA Fragmentation Index. DNA 2026, 6, 3. https://doi.org/10.3390/dna6010003

