1. Introduction
The swift progress in the development of medical technologies and the widespread use of electronic health records have created enormous amounts of medical data. This high-dimensional data holds the potential to transform healthcare by facilitating more precise diagnoses, customized treatments, and predictive analytics. Nonetheless, this data’s vast volume and complexity pose substantial challenges, particularly in extracting meaningful insights. A major challenge is the existence of irrelevant or redundant features that can hinder the effectiveness of machine learning models, causing overfitting and diminished generalization ability.
In recent decades, metaheuristic and evolutionary algorithms have proven to be highly effective in solving a variety of optimization problems [
1,
2,
3,
4,
5]. The Dragonfly Algorithm (DA), a contemporary metaheuristic inspired by the behavior of dragonflies [
6], stands out as a recently successful algorithm capable of surpassing other well-established optimizers in the literature. It has been employed in diverse real-world applications, including economic emission dispatch in power systems [
7,
8], simulation building [
9], wireless node localization in computer networks [
10], and machine learning [
11,
12]. The DA has demonstrated excellent performance across numerous continuous, discrete, single-objective, and multi-objective optimization problems, outperforming several state-of-the-art metaheuristic and evolutionary algorithms such as Particle Swarm Optimization (PSO) and Differential Evolution (DE). Recent studies have also highlighted the growing role of intelligent and language-aware AI systems in healthcare and social media analytics, demonstrating effective applications of machine learning and natural language processing in Arabic and multilingual contexts [
13,
14,
15].
Recently, the authors in [
6] presented a binary version of the DA, known as BDA, which utilizes a transfer function (TF) to transform a continuous search space into a discrete one. An initial evaluation of BDA’s effectiveness was performed on various feature selection challenges, with the findings indicating the method’s satisfactory performance [
16].
The notable advantages of the Evolutionary Population Dynamics (EPD) operator prompted us to incorporate it into the recently introduced Dragonfly Algorithm (DA) and to evaluate its efficacy on feature selection (FS) problems. In this research, we adapted the DA so that each solution in the lower half of the population is repositioned around one of the top three solutions or a randomly generated one. This strategy allows solutions with lower fitness to influence the population’s structure. Comprehensive results and extensive comparisons indicate that the EPD significantly boosts the DA’s performance, enhancing the proposed method’s ability to surpass other optimizers and achieve superior solutions with better convergence characteristics. This study thus presents an EPD-enhanced DA-based optimizer aimed at improving the basic DA’s performance on FS tasks. Our main contributions in this research include:
The notable advantages of the EPD operator encouraged us to utilize it with the recently introduced DA and examine its efficiency in FS problems in the medical domain.
In the suggested method, a solution from the worst 50% of the population is repositioned by choosing one of the top three solutions and a randomly generated solution.
The suggested method has been evaluated on seven medical datasets, each with unique configurations and attributes, to illustrate its effectiveness, solution quality, and efficiency in feature selection tasks.
The EPD operator is combined with the modified DA (mDA) for the first time to address feature selection problems.
The paper is structured as follows:
Section 2 covers related work.
Section 3 reviews the basics of DA, binary DA, and the EPD operator.
Section 4 outlines the proposed methodology.
Section 5 presents the results.
Section 6 provides a clinical discussion of the findings. Lastly,
Section 7 provides the conclusions and suggests avenues for future research.
2. Related Works
In our literature review, we employed a systematic research approach to locate and evaluate pertinent studies. We conducted searches in key scientific databases such as Elsevier, IEEE, Springer, and MDPI, utilizing specific keywords like metaheuristics, evolutionary computation, nature-inspired approaches, hybrid approaches, local search, Evolutionary Population Dynamics (EPD), and the Dragonfly Algorithm (DA). Our selection criteria were aimed at studies that proposed, examined, or implemented hybrid metaheuristic algorithms, with a special focus on methods integrating EPD and DA or similar evolutionary strategies. This approach guaranteed a thorough and focused review of existing literature, enabling us to evaluate the theoretical and practical contributions of current methods and to distinctly define the innovation of the proposed mDA algorithm.
Numerous studies have attempted to apply DA or enhance its effectiveness in addressing practical challenges such as photovoltaic systems [
17], prolonging RFID network lifespan [
18], 0-1 knapsack problems [
19], and the economic emission dispatch problem [
7]. In 2017, the researchers in [
20] introduced a memory-based hybrid DA combined with PSO principles for global optimization problems. Moreover, the authors in [
21] developed a modified DA with elite opposition learning for global optimization.
DA has also been applied in the healthcare domain for feature selection. The research in [
22] aimed to categorize breast cancer tumors as either benign or malignant. Implementing the Dragonfly algorithm allows a curated selection of features to be identified, thereby augmenting the precision of classification models. The algorithm enhances the feature selection methodology by systematically identifying the most salient features and concurrently discarding redundancies. This methodological framework enhances diagnostic accuracy in the medical field, particularly in differentiating among various classifications of breast cancer tumors. DA was employed in [
23] for feature selection in machine learning; the study applied the algorithm to a chronic kidney disease classification dataset and demonstrated considerable improvements in classification accuracy. Although the primary emphasis was on improving classification results, the efficacy of the Dragonfly Algorithm in feature selection could benefit medical applications such as disease diagnosis or prognosis.
Subsequent investigations could delve into the algorithm’s potential in analyzing medical data to advance predictive models within healthcare environments. In [24], DA was utilized for medical image registration and evaluated against alternative bio-inspired algorithms, including particle swarm optimization and artificial bee colony methods. The simulations showed that the DA yielded higher-quality registration results, though with a longer convergence time. This trade-off between registration quality and computation time is of principal importance when selecting an algorithm for medical applications such as the monitoring of tumor progression. Consequently, DA has shown significant potential in medical image registration, providing high-quality outcomes despite the increased computational time. For the segmentation of thermographic images in the early diagnosis of breast diseases, ref. [25] applied the DA: mimicking the swarming behaviors of dragonflies, the algorithm balances exploration and exploitation phases to compute optimal thresholds for image segmentation, with the aim of giving clinicians a reliable method for analyzing thermography images and assisting in the early detection of breast cancer.
Building upon these earlier medical applications of the DA algorithm, recent literature has demonstrated a clear transition toward hybrid evolutionary–deep learning paradigms for FS and transformer-based architectures with intrinsic interpretability mechanisms. A comprehensive literature survey [
26] distills the current status of evolutionary feature selection, focusing on integration approaches for attention, adaptive population modeling, and multi-objective optimization. These approaches have a significant impact on structuring the proposed mDA algorithm.
Similarly, parallel literature on deep attention networks and vision transformer models [
27,
28] has addressed their integration with healthcare data, including imaging and electronic health records. These models utilize attention pooling, hierarchical fusion, and representation mechanisms that inherently produce salient features that meet the criteria for feature selection. The interpretability naturally incorporated by these transformer models makes it easier to validate them for clinical use.
Methodologically, recent breakthroughs have been made in domain-aware transformer frameworks [
29] that incorporate priors informed by physics and biology into their optimization approaches. These models provide enhanced semantic coherence and can be conceptualized alongside traditional wrapper approaches to FS that rely on domain information to inform and guide the process of identifying salient features. This continues to strengthen the role of jointly evolving search approaches with knowledge-informed architectures for robustness and interpretability.
From a visualization and explainability perspective, recent work continues to proliferate in post-hoc attribution methods such as SHAP and LIME [
30], which can be integrated with FS algorithms to support the clinical plausibility and interpretability of the resulting models. Empirical analyses [
31] have verified that clinician-centered explanations can outperform standard SHAP explanations in both clinician trust and diagnostic accuracy, underscoring the need for interpretability and user-centered design in medical AI. Complementary analyses have concurrently argued for the use of attention maps, gradient attribution, and perturbation-based validations to ensure the relevance of selected features in clinical decision-support systems.
Taken together, these recent developments represent a clear paradigm shift toward interpretable, domain-grounded, and hybridized feature selection frameworks. They have established a coherent research stream integrating evolutionary search, deep representational learning, and explainable AI, a trajectory that is directly reflected in the conception and design of our proposed mDA model.
4. Methodology
Feature selection is presented as a binary optimization challenge, limiting solutions to binary outcomes. Therefore, the binary version of the DA can be utilized to tackle this challenge. In this research, a vector consisting of zeroes and ones represents a solution to an FS problem, with a zero indicating that the associated feature is not selected and a one indicating that the feature is selected. The length of the solution vector corresponds to the number of features present in the original dataset. This study introduces eight wrapper feature selection techniques that leverage the BDA. Each technique uses a transfer function to convert a continuous value into a binary form. The KNN classifier [39] is employed to evaluate the selected feature subsets. The fitness function takes into account both classification accuracy and the number of features selected, in line with the understanding that feature selection is a multi-objective task. The objective function is shown in Equation (10):
$$\mathrm{Fitness} = \alpha \cdot \mathrm{ErrorRate} + \beta \cdot \frac{R}{N} \qquad (10)$$
where $\mathrm{ErrorRate}$ denotes the classification error rate, $R$ signifies the count of selected features, and $N$ represents the total number of features in the initial dataset. The parameters $\alpha$ and $\beta$ correspond to the significance of classification accuracy and subset length, respectively. $\alpha$ ranges within the interval [0,1], and $\beta$ is defined as ($1 - \alpha$), adapted from [40].
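The objective described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `fitness`, the argument names, and the example values are assumptions, while the default weight of 0.99 follows the experimental settings reported later in the paper.

```python
from typing import Sequence


def fitness(mask: Sequence[int], error_rate: float, alpha: float = 0.99) -> float:
    """Weighted sum of classification error and relative subset size (lower is better)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha            # weight of the subset-length term
    n_total = len(mask)           # total number of features in the dataset
    n_selected = sum(mask)        # number of features flagged with a one
    return alpha * error_rate + beta * n_selected / n_total


# Hypothetical example: 3 of 6 features selected, 10% classification error.
example_mask = [1, 0, 1, 0, 0, 1]
example_value = fitness(example_mask, error_rate=0.10)
```

Because the error term carries almost all of the weight, the subset-length term mainly acts as a tie-breaker between subsets with similar error rates.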
4.1. Applying the EPD Strategy to BDA
As previously discussed, the EPD approach discards the least efficient solutions from the population and replaces them by creating new solutions in the vicinity of the more effective ones. This EPD strategy serves as a simple yet efficient operator for methods based on populations [
36], and hence it is incorporated into the traditional DA as it is also a stochastic population-based optimizer. EPD improves the exploitation capability of BDA by removing the poorest solutions from the group and creating new nearby solutions around the superior ones.
To incorporate the EPD technique within the BDA algorithm, the swarm of dragonflies is sorted by fitness and split into two groups. The less fit half is removed and reinitialized using four different strategies derived from the better half of the population.
In this study, we integrated the EPD scheme with the binary DA. The hybridization model utilizes a random selection operator: the leader of each ’poor’ solution is chosen at random from among the top three dragonflies in the population and one randomly generated dragonfly. Additionally, this method incorporates a basic mutation operator.
In this method, the top three individuals are chosen, and a fourth solution is created randomly. Each solution in the worst half is repositioned around one of these four solutions based on a generated random number. The procedure is simple: a random number r in [0, 1] is generated in each iteration, and one of four choices is applied to reposition the suboptimal solution: if r lies in [0, 0.25), the best solution is used; if r lies in [0.25, 0.5), the second-best solution is used; if r lies in [0.5, 0.75), the third-best solution is used; and if r lies in [0.75, 1], a random solution is used.
The selected solution is used as the starting point for repositioning the poor solution. Repositioning the poor solutions around the best solutions aims to improve the median fitness of the swarm in each step. However, this process may cause premature convergence of the algorithm. As a remedy, a randomly generated solution is used in the fourth rule to promote exploration and prevent trapping in local optima.
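The repositioning rule described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the names `epd_step` and `mutate`, the bit-flip mutation, its rate of 0.1, and the toy objective (`sum`, where fewer ones are better) are all hypothetical choices made only so the example runs.

```python
import random
from typing import Callable, List

Solution = List[int]


def mutate(leader: Solution, rate: float, rng: random.Random) -> Solution:
    """Copy the leader and flip each bit with probability `rate` (assumed operator)."""
    return [1 - bit if rng.random() < rate else bit for bit in leader]


def epd_step(population: List[Solution],
             fitness_fn: Callable[[Solution], float],
             rng: random.Random,
             mutation_rate: float = 0.1) -> List[Solution]:
    """Reposition the worst half of the population around one of four leaders."""
    ranked = sorted(population, key=fitness_fn)   # lower fitness is better
    n, dim = len(ranked), len(ranked[0])
    best, second, third = ranked[0], ranked[1], ranked[2]
    for i in range(n // 2, n):                    # worst half only
        r = rng.random()
        if r < 0.25:
            leader = best
        elif r < 0.5:
            leader = second
        elif r < 0.75:
            leader = third
        else:                                     # fourth rule: random solution
            leader = [rng.randint(0, 1) for _ in range(dim)]
        ranked[i] = mutate(leader, mutation_rate, rng)
    return ranked


# Toy usage: 10 random binary solutions of length 8.
rng = random.Random(0)
pop = [[rng.randint(0, 1) for _ in range(8)] for _ in range(10)]
new_pop = epd_step(pop, fitness_fn=sum, rng=rng)
```

The better half is left untouched, so the best solution found so far is never lost in this step; only the worst half is redrawn near the leaders.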
The overall pseudo code of the mDA algorithm is described in Algorithm 2.
Algorithm 2 Pseudocode of the mDA
Initialize the population
Initialize the step vectors (ΔX)
while (end condition is not satisfied) do
    Evaluate each dragonfly
    Update F and E
    Update the main coefficients
    Calculate S, A, C, F, and E (using Equations (1)–(5))
    Update step vectors (ΔX) using Equations (8) and (9)
    for i = 1 to n do
        Update the position of the i-th dragonfly using the EPD approach
    end for
end while
Return the best solution using Equation (10)
It should be noted that the computational complexity of the proposed modified Dragonfly Algorithm (mDA) is not substantially different from that of the original Dragonfly Algorithm (DA). The complexity of the DA is governed by t, the number of iterations, d, the number of variables, and n, the number of solutions. The introduction of binary operators does not alter this complexity, as they are incorporated into the original DA’s position updating method. However, reinitializing 50% of the solutions introduces an additional cost, which is added to the overall computational complexity of the proposed mDA. It is important to note that, because half of the solutions need to be re-evaluated for their objective value, the mDA requires more function evaluations than the DA.
4.2. Experiments
We conducted our experiments using seven publicly accessible medical datasets: Breastcancer, BreastEW, Colon, HeartEW, Leukemia, Lymphography, and PenglungEW. The comparative algorithms included the traditional DA, Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), Salp Swarm Algorithm (SSA), Grasshopper Optimisation Algorithm (GOA), and Harris Hawks Optimization (HHO). The evaluation metrics comprised accuracy, mean number of selected features, and mean best fitness.
4.3. Dataset Description
We assess the performance of the proposed mDA model compared to the binary DA version and other methods using seven renowned medical datasets obtained from the UCI benchmark repository [
41,
42].
Table 1 summarizes the datasets employed in our experiments. It enumerates seven unique datasets, each selected for its significance to particular medical and biological issues. The table describes the number of features, cases, and categories for each dataset, which are essential measures for assessing the data’s complexity and breadth. Detailed information of each dataset is given below:
Breastcancer: This dataset contains 9 features with 699 instances, divided into 2 classes, making it suitable for binary classification tasks related to breast cancer detection.
BreastEW: Comprising 30 features and 569 instances, this dataset also targets breast cancer but with a different feature set, reflecting varied experimental conditions or data collection methodologies.
Colon: A relatively small dataset with only 62 instances but a high dimensionality of 2000 features, primarily used for studying gene expression profiles in colon cancer.
HeartEW: Contains 270 instances and 13 features used in analyzing heart disease with two outcome classes.
Leukemia: The most feature-rich dataset in the collection, with 7129 features across 72 instances, indicative of high-throughput genetic profiling in leukemia studies.
Lymphography: Consists of 148 instances and 18 features, used in diagnosing lymph node tumors with binary class outcomes.
PenglungEW: Distinct from the others, this dataset has 73 instances and 325 features but expands the classification challenge to 7 classes, possibly indicating different stages or types of lung diseases.
Table 1.
List of the Datasets Used in the Experiments.
| No. | Dataset | No. of Features | No. of Instances | No. of Classes |
|---|---|---|---|---|
| 1. | Breastcancer | 9 | 699 | 2 |
| 2. | BreastEW | 30 | 569 | 2 |
| 3. | Colon | 2000 | 62 | 2 |
| 4. | HeartEW | 13 | 270 | 2 |
| 5. | Leukemia | 7129 | 72 | 2 |
| 6. | Lymphography | 18 | 148 | 2 |
| 7. | PenglungEW | 325 | 73 | 7 |
4.4. Evaluation and Experimental Settings
The proposed approach is implemented in Matlab R2019a on an Intel(R) Core i7 processor running at 2.00 GHz with 16 GB of RAM. The proposed approach and the compared alternatives are implemented on the same platform and in the same programming language to provide fair comparisons.
In this work, the results obtained by the algorithms were validated using the cross-validation method, which randomly splits each dataset into 80% training and 20% testing parts [
43,
44]. Furthermore, each technique was run 20 times to ensure robustness and reliability of the results.
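The validation protocol described above (a random 80%/20% split, repeated over 20 runs) can be sketched as below. This is an assumed illustration only: the original experiments were run in Matlab, and the function names, the seed, and the rounding of the training size are hypothetical choices.

```python
import random
from typing import List, Tuple


def holdout_split(n_samples: int, rng: random.Random,
                  train_fraction: float = 0.8) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and cut them into training and testing parts."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(round(train_fraction * n_samples))
    return indices[:cut], indices[cut:]


def repeated_splits(n_samples: int, runs: int = 20,
                    seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Draw one independent random split per run."""
    rng = random.Random(seed)
    return [holdout_split(n_samples, rng) for _ in range(runs)]


# Breastcancer has 699 instances, so each run trains on 559 and tests on 140.
splits = repeated_splits(n_samples=699)
```

Reported metrics would then be averaged over the 20 runs, which reduces the influence of any single favourable or unfavourable split.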
The classification accuracy in Equation (11) is used to evaluate the proposed approach; it is an important measure for classification problems, and higher accuracy means a better solution. On the other hand, FS algorithms aim to reduce the dimensionality of the dataset by selecting the minimum number of features that preserves the classification accuracy. Both the accuracy and the number of selected features are included as objectives of the fitness function; the objective is therefore to minimize the number of features and increase the accuracy of the classification. Accordingly, the accuracy is converted to a minimization objective by taking the error rate (1 − accuracy) instead of the accuracy, as shown in Equation (12):
$$\mathrm{Fitness} = \alpha \cdot \mathrm{ErrorRate} + \beta \cdot \frac{R}{N} \qquad (12)$$
where $\alpha$ and $\beta$ are parameters between 0 and 1 that represent the importance weight of each objective ($\beta = 1 - \alpha$), ErrorRate indicates the classification error rate, R represents the number of selected features, and N denotes the total number of features. Based on the literature [43,44,45,46,47], $\alpha$ is set to 0.99 and $\beta$ to 0.01.
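To make the effect of the 0.99/0.01 weighting concrete, the following worked example evaluates the fitness for two hypothetical subsets. The numbers (an error rate of 0.05, a dataset with N = 30 features, and subsets of R = 10 and R = 20 features) are illustrative assumptions, not results from the paper.

```latex
\begin{align*}
\text{Subset A } (R = 10):\quad
  \mathrm{Fitness} &= 0.99 \times 0.05 + 0.01 \times \tfrac{10}{30}
                    = 0.0495 + 0.0033 \approx 0.0528 \\
\text{Subset B } (R = 20):\quad
  \mathrm{Fitness} &= 0.99 \times 0.05 + 0.01 \times \tfrac{20}{30}
                    = 0.0495 + 0.0067 \approx 0.0562
\end{align*}
```

With equal error rates the smaller subset wins. Since the feature term can contribute at most 0.01, a difference in error rate of roughly one percentage point already outweighs any difference in subset size.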
A sensitivity analysis was used to choose the experimental settings. The population size was tested with the values 10, 20, 30, 50, and 100 search agents, while the maximum number of iterations was tested with three values: 50, 100, and 150. As clearly shown in
Table 2, the population size with a value equal to 30 performed better. In addition, the maximum number of iterations is set to 100, based on the sensitivity results and according to previous studies [
45,
46,
47]. KNN is the classifier most often used with the datasets available in the UCI repository and, following [
43,
44], the value of k is set to 5.
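The wrapper evaluation with a KNN classifier (k = 5) can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration rather than the authors' Matlab code: the Euclidean distance, the majority vote, the worst-case error for an empty subset, the function name, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_error_rate(train_x: List[List[float]], train_y: List[int],
                   test_x: List[List[float]], test_y: List[int],
                   mask: Sequence[int], k: int = 5) -> float:
    """Error rate of a k-nearest-neighbour vote using only the selected features."""
    cols = [j for j, bit in enumerate(mask) if bit == 1]
    if not cols:
        return 1.0  # assumption: an empty subset receives the worst possible error
    train_p = [[row[j] for j in cols] for row in train_x]
    errors = 0
    for row, label in zip(test_x, test_y):
        query = [row[j] for j in cols]
        nearest = sorted(range(len(train_p)),
                         key=lambda i: math.dist(query, train_p[i]))[:k]
        predicted = Counter(train_y[i] for i in nearest).most_common(1)[0][0]
        errors += predicted != label
    return errors / len(test_y)


# Toy data: feature 0 separates the classes, feature 1 is uninformative.
train_x = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [0.9, 9.1]]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [[0.05, 9.0], [1.05, 1.0]]
test_y = [0, 1]

error_informative = knn_error_rate(train_x, train_y, test_x, test_y, mask=[1, 0])
error_noise = knn_error_rate(train_x, train_y, test_x, test_y, mask=[0, 1])
```

On this toy data the mask that keeps only the informative feature yields a lower error than the mask that keeps only the uninformative one, which is the signal the wrapper passes back to the optimizer.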
The sensitivity analysis reports the classification accuracy of the proposed algorithm (mDA) across a variety of medical datasets with differing numbers of iterations and population sizes. The results reveal that the Breastcancer dataset consistently exhibits high classification accuracy, which marginally improves as the number of iterations increases, indicating the algorithm’s stability across different population sizes. In contrast, the Colon dataset shows notable variability in accuracy, especially at larger population sizes, which might suggest a sensitivity to the dimensionality of the feature space or a propensity for overfitting. The Leukemia and HeartEW datasets display a positive trend in accuracy with increasing iterations, suggesting that more iterations help the model generalize better. However, the PenglungEW dataset, which has multiple classes, tends to have lower overall accuracy, pointing to the inherent challenges of multi-class classification. This analysis provides insight into the robustness and efficiency of the mDA algorithm, highlighting its potential advantages and limitations when applied to different types of medical datasets.
Table 3 enumerates the parameters used in the experiments, providing a clear framework for the setup and execution of the classification models tested. The population size was set to 30 based on the sensitivity analysis, with a maximum of 100 iterations per run to allow the algorithms sufficient opportunity to converge toward optimal solutions. The K-nearest neighbors (KNN) classifier used a value of K equal to 5. The two fitness-function weights, α and β, were set to 0.99 and 0.01, respectively, giving classification accuracy a much larger weight than the size of the selected feature subset. This parameter selection aims to achieve both precise and generalizable outcomes.
5. Results
Table 4 provides a comparative analysis of classification accuracy between the proposed algorithm (mDA) and other well-known optimization algorithms across various medical datasets. The table includes the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), Salp Swarm Algorithm (SSA), Grasshopper Optimisation Algorithm (GOA), Harris Hawks Optimization (HHO), and the standard DA.
The mDA algorithm generally shows superior or competitive performance compared to other algorithms. For instance, in the Breastcancer dataset, mDA achieves a classification accuracy of 0.9678, slightly outperforming the nearest competitor, GOA, which scored 0.9673. Similarly, in the BreastEW dataset, mDA’s accuracy of 0.9382 is among the highest, surpassed only by GA’s 0.9581. However, in the Colon and PenglungEW datasets, while mDA does not lead, it still maintains a respectable performance, indicating its robustness across different types of data challenges. This comparative analysis highlights the efficacy of mDA in achieving high classification accuracy and showcases its potential as a reliable tool in medical data analysis.
Table 5 presents a comparative analysis of the average number of selected features across various algorithms when applied to different medical datasets. The table highlights the efficiency of each algorithm in feature selection, which is critical for reducing model complexity while maintaining or enhancing prediction accuracy. For instance, in the Breastcancer dataset, mDA and HHO select the fewest features, demonstrating their capability to achieve efficient feature reduction. In contrast, the Colon dataset shows significant variation in the number of features selected, with mDA choosing 923.20 features on average, which is lower than GWO and SSA but higher than DA and HHO.
In more complex datasets like Leukemia, which initially has thousands of features, the variation in selected features is stark, with mDA selecting 3332.30 features, substantially fewer than GWO and SSA but more than GOA and DA. This indicates that while mDA generally selects more features than some algorithms, it may strike a balance between feature reduction and classification performance.
Overall, this table effectively demonstrates how different optimization algorithms manage the trade-off between reducing the number of features and retaining sufficient information for accurate classification, providing insights into their applicability in various scenarios.
Table 6 summarizes the feature selection ratio and dimensionality reduction efficiency obtained by the proposed mDA across all benchmark datasets. The results show that mDA consistently achieves significant feature-space reduction, eliminating on average nearly half of the original features (≈49%) while maintaining strong classification performance. In particular, for medium-dimensional datasets such as BreastEW and PenglungEW, mDA attains up to 60% reduction efficiency, highlighting its robustness in identifying highly relevant features. Although the reduction rate is moderate in high-dimensional datasets such as Colon and Leukemia (≈53%), this behavior reflects the algorithm’s tendency to preserve discriminative information essential for accurate classification. Overall, the results confirm that mDA provides an effective balance between compactness and accuracy, outperforming several baseline algorithms.
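The reduction figures quoted for the high-dimensional datasets can be checked from the average subset sizes reported for Table 5. The sketch below assumes that reduction efficiency is defined as one minus the ratio of selected to original features; the function name is hypothetical.

```python
def reduction_ratio(selected: float, total: int) -> float:
    """Fraction of the original features that were eliminated."""
    return 1.0 - selected / total


# Average subset sizes reported for mDA and the original feature counts.
colon_reduction = reduction_ratio(923.20, 2000)       # roughly 0.538
leukemia_reduction = reduction_ratio(3332.30, 7129)   # roughly 0.533
```

Both values are close to the approximately 53% reduction stated for Colon and Leukemia.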
Table 7 offers a comparative analysis of average best fitness values achieved by various algorithms when applied to different datasets in the research. This table shows the effectiveness of each algorithm based on their fitness values, a measure reflecting how well each algorithm has optimized a predefined objective function. For example, the Breastcancer dataset shows that mDA achieves an impressive fitness value of 0.0233, indicating its superior performance in optimizing the classification task compared to all other algorithms. Similarly, in the Leukemia dataset, mDA reports the lowest fitness value of 0.0064, suggesting a significant optimization capability over the alternatives.
However, in the Colon dataset, GA displays lower fitness values, indicating potential areas where mDA could be less effective or require adjustments to enhance its optimization performance. The trend across different datasets highlights the varying effectiveness of these algorithms in specific scenarios, offering insights into their potential applications and limitations.
Overall, this comparative analysis provides a clear perspective on the optimization capabilities of each algorithm, with mDA generally demonstrating robust performance across most datasets, underscoring its utility in solving complex classification problems effectively.
Furthermore, we used the F-test to statistically validate the difference in the performance obtained. The results of the F-test are displayed under each table. In addition, the comparative performance is summarized in each table using the Wins–Ties–Losses (WTL) criterion. More wins indicate more consistently superior overall performance.
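The Wins–Ties–Losses summary can be computed per pair of algorithms as in the sketch below. The function name, the tolerance used to declare a tie, and the example values are assumptions for illustration; the paper does not state how ties were detected.

```python
from typing import Sequence, Tuple


def wins_ties_losses(ours: Sequence[float], theirs: Sequence[float],
                     higher_is_better: bool = True,
                     tol: float = 1e-12) -> Tuple[int, int, int]:
    """Count datasets on which `ours` beats, equals, or trails `theirs`."""
    wins = ties = losses = 0
    for a, b in zip(ours, theirs):
        diff = (a - b) if higher_is_better else (b - a)
        if abs(diff) <= tol:
            ties += 1
        elif diff > 0:
            wins += 1
        else:
            losses += 1
    return wins, ties, losses


# Hypothetical accuracies on three datasets.
wtl = wins_ties_losses([0.90, 0.80, 0.70], [0.85, 0.80, 0.75])
```

For metrics that are minimized, such as fitness or the number of selected features, `higher_is_better=False` reverses the comparison.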
Figure 1 depicts the convergence curves for the standard (DA) and the modified DA (mDA) over a series of iterations applied to the given dataset. The x-axis represents the number of iterations, ranging from 0 to 100, while the y-axis measures the average best-so-far fitness values, which indicate the optimization performance of each algorithm at each iteration.
From the graph, it is evident that mDA (shown in red) starts with higher fitness values than the traditional DA (shown in black) but improves consistently, with a steeper and more continuous decline in fitness values as the iterations progress. This suggests that mDA converges more effectively towards the optimum solution than DA, which in most cases shows a more gradual decline in fitness values. By iteration 100, mDA achieves significantly lower fitness values than DA, indicating better optimization performance for all datasets.
The convergence curve for mDA is relatively smoother and steeper, which underscores its enhanced capability to quickly and efficiently refine solutions, likely due to improved algorithmic modifications that enhance its search and optimization processes within the feature space of the datasets. These graphs effectively illustrate the comparative advantage of mDA in achieving lower fitness values faster, which is indicative of better overall performance in optimizing the classification task for this particular dataset.
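The quantity plotted on the y-axis, the average best-so-far fitness, can be derived from per-iteration fitness logs as sketched below. The function names and the toy logs are assumptions; the sketch only shows how a running minimum is averaged across independent runs.

```python
from typing import List, Sequence


def best_so_far(fitness_per_iteration: Sequence[float]) -> List[float]:
    """Running minimum of the fitness observed up to each iteration."""
    curve, best = [], float("inf")
    for value in fitness_per_iteration:
        best = min(best, value)
        curve.append(best)
    return curve


def average_curve(runs: Sequence[Sequence[float]]) -> List[float]:
    """Average the best-so-far curves of several runs, iteration by iteration."""
    curves = [best_so_far(run) for run in runs]
    return [sum(column) / len(column) for column in zip(*curves)]


# Two hypothetical runs of two iterations each.
avg = average_curve([[0.4, 0.2], [0.6, 0.8]])
```

Because each per-run curve is a running minimum, the averaged curve never increases, which is why such convergence plots decline monotonically.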
To statistically assess the difference in performance between the algorithms compared, we employed a range of non-parametric tests that can be applied to multiple algorithms and datasets. The first step was the Friedman test, which evaluates the null hypothesis that all algorithms achieve equivalent performance across datasets. This test is appropriate here because it is rank-based and does not rely on normality assumptions. As shown in Table 8, the Friedman test, with 7 degrees of freedom, resulted in an extremely small p-value. This strongly rejects the null hypothesis and confirms that there are significant performance differences between the algorithms.
Then we computed the average rank of each algorithm across all datasets, presented in
Table 9. The proposed mDA algorithm achieved the lowest average rank (2.304), indicating the most consistent superior performance relative to the other baseline methods.
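The average ranks and the Friedman statistic discussed above follow from a simple ranking procedure, sketched here in pure Python. This is an illustrative sketch, not the authors' analysis script: the function names are hypothetical, no tie correction is applied to the statistic, and the toy score table is invented. With k algorithms the statistic is compared against a chi-square distribution with k − 1 degrees of freedom, which gives 7 for the eight algorithms compared here.

```python
from typing import List, Sequence, Tuple


def rank_row(scores: Sequence[float], higher_is_better: bool = True) -> List[float]:
    """Rank one dataset's scores (1 = best); tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda j: scores[j],
                   reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def friedman(table: Sequence[Sequence[float]]) -> Tuple[List[float], float]:
    """Return average ranks per algorithm and the Friedman chi-square statistic."""
    n, k = len(table), len(table[0])      # n rows (datasets), k algorithms
    ranks = [rank_row(row) for row in table]
    avg = [sum(r[j] for r in ranks) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * sum(a * a for a in avg) - 3 * n * (k + 1)
    return avg, chi2


# Toy table: 3 datasets (rows) x 3 algorithms (columns), accuracy scores.
avg_ranks, chi2 = friedman([[0.90, 0.80, 0.70],
                            [0.95, 0.85, 0.75],
                            [0.80, 0.70, 0.60]])
```

A lower average rank means the algorithm was more often among the best performers, which is how the value reported for mDA should be read.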
To identify which performance differences were statistically significant at the pairwise level, a Nemenyi post-hoc test was performed. The results in
Table 10 show that mDA is significantly better than all other algorithms at the adopted significance level, with all p-values being extremely small. This confirms that the superiority of mDA is not due to random fluctuations but reflects genuine performance advantages across benchmarks.
The Critical Difference (CD) diagram in
Figure 2 provides a visual representation of these statistical findings. Algorithms whose rank differences exceed the CD threshold (0.97) are considered significantly different. As depicted, mDA is clearly separated from all other methods and lies well outside the CD interval of competing approaches, reinforcing the conclusion that its performance advantage is statistically significant.
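The CD threshold used in such diagrams is commonly computed with the Nemenyi formula CD = q_α · sqrt(k(k + 1) / (6N)), where k is the number of algorithms and N the number of paired measurements. The sketch below assumes this standard formula; the critical value 3.031 is the tabulated Studentized-range-based value for k = 8 at the 0.05 level, and the value of N used in the example is hypothetical, since the paper does not state which N produced its threshold.

```python
import math


def critical_difference(q_alpha: float, k: int, n: int) -> float:
    """Nemenyi critical difference for k algorithms compared over n measurements."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))


Q_ALPHA_K8 = 3.031   # tabulated critical value for k = 8, alpha = 0.05

# Hypothetical example: eight algorithms ranked over n = 7 measurements.
cd_small_n = critical_difference(Q_ALPHA_K8, k=8, n=7)
cd_large_n = critical_difference(Q_ALPHA_K8, k=8, n=28)
```

Two algorithms are declared significantly different when their average ranks differ by more than this value; quadrupling the number of measurements halves the threshold.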
Taken together, the Friedman test, the Nemenyi post-hoc test, and the CD diagram provide strong evidence that the proposed mDA algorithm performs better than all of the baselines and that its advantages are reliable.
Moreover, a Filter Feature Selection (FFS-) technique was performed for further analysis.
Table 11 compares the proposed approach and various classification methods combined with filter feature selection. The results illustrate the proposed approach’s superior performance on most datasets. For instance, the mDA algorithm achieved the best results on five out of seven datasets: Breastcancer, BreastEW, HeartEW, Lymphography, and PenglungEW, with accuracies of 0.9678, 0.9382, 0.8227, 0.7627, and 0.6035, respectively. For the remaining datasets, Colon and Leukemia, the highest accuracies of 0.9838 and 0.8611 were achieved using J48 and AdaBoost, respectively. This discrepancy may arise because the Colon and Leukemia datasets possess specific characteristics that do not align well with the strengths of the mDA algorithm. These characteristics could include different distributions of features, higher levels of noise, or imbalanced class distributions, which the mDA algorithm might not handle as effectively as other algorithms like J48 or AdaBoost.
6. Clinical Discussion
Beyond the quantitative improvements achieved by the proposed mDA algorithm across the seven medical benchmarks, we further examined the clinical relevance of the most frequently selected features to ensure that the selected subsets align with established biomedical understanding.
For the Breastcancer and BreastEW datasets, the selected attributes consistently highlighted morphological and cytological descriptors such as cell uniformity, epithelial cell size, bare nuclei, and chromatin texture. These indicators correspond to the histopathologic criteria that pathologists use to distinguish benign from malignant lesions and to estimate tumor aggressiveness, which suggests that the algorithm identifies diagnostically salient features.
In the HeartEW dataset, the retained variables primarily encompassed cardiorespiratory and electrocardiographic markers, including chest pain type, ST-segment slope, and exercise-induced responses, which are well-established predictors of ischemia and cardiovascular risk in coronary artery disease (CAD). The resulting subset mirrors the features commonly employed in clinical scoring systems for non-invasive cardiac diagnosis.
For the ColonEW and LeukemiaEW gene-expression benchmarks, mDA identified compact and biologically interpretable gene signatures associated with pathways in cell-cycle control, apoptosis regulation, and hematopoietic proliferation. These subsets capture known genomic signals implicated in oncogenic progression, reflecting the model’s capacity to balance sparsity with biological coherence.
In the Lymphography dataset, the selected variables predominantly involved patterns of lymph node enlargement, capsularity, and structural irregularity, which correspond to the clinical staging criteria used in lymphoma assessment. Such features directly relate to tumor dissemination and disease extent, indicating that mDA emphasizes anatomically meaningful discriminants.
Finally, for the multi-class PenglungEW dataset, the retained subset covered heterogeneous cytological and morphological indicators that capture inter-subtype variation across lung cancer classes. The inclusion of a slightly larger feature set here reflects the need to preserve separability across diverse histopathological subtypes.
Together, these findings demonstrate that the mDA optimizer systematically gravitates toward clinically coherent and biologically plausible feature sets rather than arbitrary statistical artifacts. This not only enhances interpretability but also reinforces the practical translational value of the proposed framework in medical diagnostic contexts.
7. Conclusions and Future Works
This research introduces an enhanced hybrid DA optimizer incorporating EPD, aimed at boosting the performance of the standard DA in handling FS tasks. The mDA methodology was extensively applied to seven medical datasets. Detailed comparisons of the mDA’s overall classification accuracy, selected features, fitness, and convergence characteristics were made against several well-known metaheuristic-based techniques. The exhaustive comparative results and analysis demonstrated the superior effectiveness of the proposed algorithm for various FS tasks within the medical field.
The experiments confirm the efficacy of the proposed mDA algorithm for feature selection on medical data. Its ability to attain higher accuracy with fewer features highlights its potential as a valuable tool for medical data analysis, providing a balance between interpretability and predictive performance.
Future research could explore applying the EPD approach to other population-based optimization algorithms. In future work, we also plan to compare the proposed mDA with additional FS methods from the same domain. Furthermore, the proposed binary Dragonfly Algorithm (DA) and EPD-based techniques could be used to tackle other complex data-mining challenges, especially those involving high-dimensional, unstructured, or heterogeneous datasets. For example, future research could investigate areas such as LASSO and ridge regression, image-based feature selection, and multimodal data analysis to demonstrate the scalability and practical applicability of the proposed methods. Future work should not only focus on technical advancements but also address the significant socio-technical and ethical open research questions (ORQs) highlighted by this study and related literature. These encompass transparency, fairness, responsible usage, computational sustainability, and the impact of human–AI interaction in the deployment of EPD- and DA-based algorithms.