Novel Hybrid Feature Selection Using Binary Portia Spider Optimization Algorithm and Fast mRMR
Abstract
1. Introduction
1.1. Motivation and Objectives
- Use fast mRMR in the PSOA initialization step to seed the population according to feature significance, giving subsequent optimization searches a strong starting point and enhancing the effectiveness of the algorithm.
- Apply the PSOA optimization methodology to select the most relevant features in a high-dimensional setting.
- Evaluate the performance of the proposed model using six distinct cancer microarray datasets.
1.2. Paper Structure
2. Methods
2.1. Portia Spider Optimization Algorithm
- Portia spiders use the vibrations released by ensnared insects.
- A silk dragline is drawn by the Portia spider to fix the position of the prey.
- Mechanism of stalking and striking (exploration phase): This phase describes the position-change mechanism of the Portia spider according to the chemical signal received from a neighboring spider. The spider changes its gait and takes an instinctive jump so that the prey cannot conceal itself; through this movement, the Portia spider confirms the position of the prey. The methodology behind this approach is described by Equations (4) and (5) and their accompanying constraint.
- Mechanism of invading and imitating (exploitation phase): This phase presents the natural technique the Portia spider adopts on the webs of other spiders for easy hunting. In this stage, the Portia spider generates its own vibration and tries to sneak onto the web of the prey. Rather than generating its own signal, it often takes advantage of vibrations produced by other insects or by a gentle breeze. This intriguing skill is captured by Equation (6). Most Portia spiders use this trick to capture prey; if it fails, a silk dragline is drawn just above the web so that the spider can reach the updated position and attack the prey with a single jump (a new position change). The optimal solution, i.e., the new updated position, can be determined mathematically by applying Equation (7) to Equation (6).
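Since Equations (4)–(7) are not reproduced in this excerpt, the two phases can be illustrated with a schematic optimization loop. Everything below — the step sizes, the neighbour choice, and the `portia_spider_optimize` function itself — is a hedged sketch of the general idea, not the authors' exact update rules:

```python
import numpy as np

def portia_spider_optimize(fitness, dim, n=20, t_max=100, lb=-1.0, ub=1.0,
                           p_explore=0.5, seed=0):
    """Schematic Portia-spider search loop (maximisation).

    Exploration ("stalking and striking"): an instinctive jump toward the
    best spider found so far.  Exploitation ("invading and imitating"): a
    small mimicry step around a randomly chosen neighbour's position.
    """
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lb, ub, (n, dim))
    fit = np.array([fitness(x) for x in pop])
    best_i = int(fit.argmax())
    best, best_f = pop[best_i].copy(), float(fit[best_i])
    for _ in range(t_max):
        for i in range(n):
            if rng.random() < p_explore:
                # exploration: jump toward the current best position
                cand = pop[i] + rng.random(dim) * (best - pop[i]) \
                       + 0.1 * rng.normal(size=dim)
            else:
                # exploitation: imitate a neighbour with a small perturbation
                j = int(rng.integers(n))
                cand = pop[j] + 0.05 * rng.normal(size=dim)
            cand = np.clip(cand, lb, ub)          # boundary handling
            f = float(fitness(cand))
            if f > fit[i]:                        # greedy replacement
                pop[i], fit[i] = cand, f
                if f > best_f:
                    best, best_f = cand.copy(), f
    return best, best_f
```

The exploration branch mirrors the instinctive jump toward the located prey; the exploitation branch mirrors imitating a position on a neighbouring spider's web.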
Algorithm 1 Portia spider algorithm
Require: Population size (N); number of iterations ()
2.2. Fast mRMR
Algorithm 2 Fast mRMR feature selection
Require:
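As a reading aid, the greedy selection that fast mRMR accelerates can be sketched as follows; the discrete mutual-information estimator and the cached redundancy sums are simplifying assumptions for illustration, not the implementation of Ramírez-Gallego et al.:

```python
import numpy as np

def mutual_info(x, y):
    """MI (in nats) between two discrete variables via the joint histogram."""
    xs, xi = np.unique(x, return_inverse=True)
    ys, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (xi, yi), 1.0)
    p = joint / joint.sum()
    px, py = p.sum(axis=1, keepdims=True), p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def fast_mrmr(X, y, m):
    """Greedy mRMR: repeatedly pick the feature maximising relevance minus
    mean redundancy.  Caching per-feature redundancy sums (red_sum) is the
    kind of accumulation trick that fast mRMR builds on."""
    n_feat = X.shape[1]
    relevance = np.array([mutual_info(X[:, j], y) for j in range(n_feat)])
    selected = [int(relevance.argmax())]
    red_sum = np.zeros(n_feat)
    while len(selected) < m:
        last = selected[-1]
        for j in range(n_feat):          # update the cache with the newest pick
            red_sum[j] += mutual_info(X[:, j], X[:, last])
        score = relevance - red_sum / len(selected)
        score[selected] = -np.inf        # never re-select a feature
        selected.append(int(score.argmax()))
    return selected
```

Because redundancy against already-selected features accumulates incrementally, each round costs one MI pass over the candidate features rather than recomputing all pairwise terms.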
2.3. Binary Portia Spider Optimizer Algorithm
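The binarization step that turns the continuous PSOA of Section 2.1 into BPSOA is commonly done with an S-shaped transfer function; the `binarize` helper below is an assumed sketch of that idea, not necessarily the paper's exact rule:

```python
import numpy as np

def binarize(v, rng):
    """Map a real-valued position update to a 0/1 feature mask using the
    sigmoid transfer function common in binary metaheuristics: strongly
    positive components are very likely to switch a feature on."""
    prob = 1.0 / (1.0 + np.exp(-v))      # transfer probability in (0, 1)
    return (rng.random(v.shape) < prob).astype(int)
```

Each spider then selects exactly the features whose bit is 1; a common safeguard (not shown) forces at least one bit on.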
2.4. Weighted Support Vector Machine
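The weighted SVM raises the misclassification cost of chosen samples (e.g., the minority class) through per-sample weights s_i in the hinge loss. Below is a minimal numpy sketch of a linear WSVM trained by subgradient descent; the solver, learning rate, and `weighted_linear_svm` name are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def weighted_linear_svm(X, y, weights, C=1.0, lr=0.01, epochs=200):
    """Linear SVM fit by subgradient descent on the weighted hinge loss
    0.5*||w||^2 + C * sum_i s_i * max(0, 1 - y_i*(w.x_i + b)),
    with labels y in {-1, +1} and per-sample costs s_i."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                        # margin violators
        grad_w = w - C * ((weights * y)[active] @ X[active])
        grad_b = -C * np.sum((weights * y)[active])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```

Setting weights inversely proportional to class frequency is a standard way to counter the class imbalance summarized in Section 3.3.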
3. FmRMR-PSOA: The Proposed Model
3.1. Implementation Steps of fastmRMR-BPSOA
- Dataset D with features and labels Y [training data (80%), test data (20%)].
- Number of top features M to consider after fast mRMR.
- PSOA parameters: population size N, maximum iterations T, and attraction–repulsion parameters.
- Optimal subset of features .
1. Apply fast mRMR
   (a) Compute the mutual information (MI) of every feature with respect to the target variable Y.
   (b) Apply fast mRMR to rank the features by relevance to Y and by minimal redundancy among features.
   (c) Select the top M features: .
2. Initialize PSOA
   (a) Representation: each spider is a binary vector , where indicates inclusion of feature , and indicates exclusion.
   (b) Randomly initialize a population of N spiders . Each vector has size M, corresponding to the features in .
3. Define the Fitness Function
   (a) For each spider :
       i. Extract the feature subset corresponding to .
       ii. Train a machine learning model using on the training data.
       iii. Evaluate using a validation metric (e.g., accuracy, F1 score).
       iv. Fitness score: .
4. Run PSOA
   (a) Iteration: for to T:
       i. For each spider :
          A. Update its position (binary vector ) using the PSOA attraction–repulsion rules and the fitness scores of the other spiders.
          B. Apply boundary constraints to ensure binary values .
       ii. Evaluate the fitness of the updated positions of all spiders.
   (b) Stopping criteria: stop when the algorithm converges (no significant improvement in fitness) or reaches T iterations.
5. Return the Optimal Subset
   (a) Identify the spider with the highest fitness.
   (b) Return , the feature subset represented by .
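Step 3 can be made concrete with a small sketch. The hypothetical `fitness` helper below scores a spider's binary mask by validation accuracy, with a nearest-centroid classifier standing in for the paper's learners (SVM, RF, etc.):

```python
import numpy as np

def fitness(mask, X_tr, y_tr, X_val, y_val):
    """Step-3 fitness of a spider: validation accuracy of a classifier
    trained only on the features the binary mask switches on."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 0.0                          # empty subsets score worst
    A, B = X_tr[:, idx], X_val[:, idx]
    classes = np.unique(y_tr)
    # one centroid per class over the selected features
    centroids = np.stack([A[y_tr == c].mean(axis=0) for c in classes])
    d = ((B[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    pred = classes[d.argmin(axis=1)]
    return float((pred == y_val).mean())
```

In the full pipeline, PSOA maximizes this score over binary masks of length M, the fast mRMR pre-selection size.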
3.2. Outline of Classifiers
3.2.1. Support Vector Machine (SVM)
3.2.2. Decision Tree
3.2.3. XGBoost
3.2.4. AdaBoost
3.2.5. Random Forest
3.3. Summary of Biomedical Datasets
4. Experimental Analysis
4.1. Phase-I Evaluation
- Of all the methods on the ALL-AML dataset, fast mRMR-RF achieved the highest accuracy at 93.50%, surpassing the next best model, fast mRMR-AdaBoost, by 0.61%. It reached an F1-Score of 94.78% and an MCC of 86.60%, and outperformed the worst algorithm, fast mRMR-SVM, by 4.61%. It also achieved an FNR of 1.67%, indicating an impressive ability to detect true positives, with an FPR of 13.75%.
- Fast mRMR-AdaBoost had the best results on the colon cancer dataset, with an accuracy of 93.94% and an F1-Score of 95.24%, an improvement of 0.51% over the next best algorithm, fast mRMR-WSVM. Compared to fast mRMR-SVM (88.38% accuracy), AdaBoost improves by 5.56%. Moreover, with an MCR of 6.06%, AdaBoost balanced a sensitivity of 97.56% with a specificity of 88.00%, pointing to an impressive reduction in misclassifications.
- On the central nervous system dataset, fast mRMR-RF performed best, with 93.43% accuracy and an F1-Score of 94.65%. It surpasses the second-best model, fast mRMR-AdaBoost, by 1.01% in accuracy and improves on fast mRMR-SVM by 6.06%. Its MCC of 86.29%, combined with the lowest FPR of 11.39%, shows that Random Forest delivers strong classification with few false alarms while keeping the MCR low at 6.57%.
- For the ovarian cancer dataset the picture differs: fast mRMR-AdaBoost performs best, with an overall accuracy of 92.93% and an F1-Score of 94.89%. It improves by 0.51% on the second-best model, fast mRMR-XGBoost, and by 3.03% on fast mRMR-SVM. With 87.10% specificity and an FNR of 4.41%, AdaBoost delivers stable, predictive performance on this dataset.
- On the GSE4115 dataset, fast mRMR-AdaBoost outshines the rest with the highest accuracy of 94.47%, outperforming the second-best, fast mRMR-SVM, by 0.05% and fast mRMR-XGBoost by 1.04%. The AdaBoost model also posts a low FNR of 1.36% with a specificity of 82.69%, which solidifies it as the best methodology for this dataset.
- The fast mRMR-WSVM and fast mRMR-AdaBoost models share the highest accuracy of 94.95% on the GSE10245 dataset, beating the third-ranked fast mRMR-RF by 0.51%. Furthermore, fast mRMR-AdaBoost establishes more reliable classification with the highest F1-Score of 96.58% and an MCC of 87.46%. Compared to the lowest-ranked model, fast mRMR-SVM, AdaBoost shows a notable accuracy gain of 4.04%, and its FNR of 0.70% makes it the best choice for this dataset.
- Table 2 shows the performance analysis of the model without using the BPSOA technique as the feature selection algorithm.
- Figure 3 shows the ROC analysis of the model without using BPSOA for different datasets.
4.2. Phase-II Evaluation
- For the ALL-AML dataset, the leading algorithm is fast mRMR-BPSOA-XGBoost, with an accuracy of 99.38%, an F1-Score of 99.68%, and an MCC of 98.43%. This is significantly better than the best Phase-I model, fast mRMR-RF (93.50% accuracy), an improvement of 5.88%.
- Similarly, for the colon cancer dataset, the top performer is fast mRMR-BPSOA-RF, with 98.99% accuracy, an F1-Score of 99.36%, and an MCC of 96.92%. The previous best, fast mRMR-AdaBoost (93.94% accuracy), is outperformed by a remarkable 5.05%. The RF model also has an extremely low MCR of 1.01% and an even lower FNR of 0.64%, showing its reliability and effectiveness in true-positive detection.
- The methodology with the best results on the central nervous system dataset is fast mRMR-BPSOA-AdaBoost, with an accuracy of 99.79%, an F1-Score of 99.70%, and an MCC of 98.18%. This is a noticeable improvement of 6.36% over the previously best fast mRMR-RF (93.43% accuracy).
- Fast mRMR-BPSOA-WSVM, with an accuracy of 98.99%, an F1-Score of 99.40%, and an MCC of 96.18%, is the best performer for the ovarian cancer dataset. It improves by a significant 6.06% on the previous best, fast mRMR-AdaBoost (92.93% accuracy). Furthermore, WSVM has a low FNR of 0.60%, which allows highly accurate positive predictions.
- For the GSE4115 dataset, the best-performing model is fast mRMR-BPSOA-RF, with an accuracy of 99.50%, an F1-Score of 99.72%, and an MCC of 97.32%. Compared with the previous best, fast mRMR-AdaBoost (94.47% accuracy), RF improves by 5.03%. With an FNR of 0.56% and a perfect 100% specificity, it is the most reliable method.
- The previous best model, fast mRMR-AdaBoost, had an accuracy of 94.95% on the GSE10245 dataset; fast mRMR-BPSOA-RF reaches 97.98% accuracy, an F1-Score of 98.84%, and an MCC of 90.94%, a 3.03% increase. With an MCR of only 2.02% and a low FNR of 1.72%, RF proves to be the most effective methodology for this dataset.
- Table 3 shows the performance analysis of the proposed model using the fast mRMR and BPSOA techniques as the feature selection algorithm.
- Figure 4 shows the ROC analysis of the proposed model using the fast mRMR and BPSOA techniques for selecting features for different datasets.
4.3. Critical Analysis
- The newly designed model increases accuracy to 99.38%, better than any previously reported result. It is 2.15% better than [5] (97.23%) and also improves on [7] (98.61%) and [8] (94.17%) by 0.78% and 5.21%, respectively. It likewise reports accuracy improvements of 7.98%, 2.28%, and 4.50% over [21,22,23], respectively. All these improvements show the efficiency of the fast mRMR-BPSOA model in utilizing an advanced feature selection mechanism to classify the ALL-AML dataset.
- The proposed model increases accuracy to 98.99%, a clear improvement over multiple existing methods: 5.66% over the 93.33% of [5], 5.87% over the 93.12% of [6], 13.51% over the 85.48% of [7], and 3.16% over the 95.83% of [8]. It also gains 7.06% over [11] and 0.73% over [20], and reports improvements of 25.26% and 3.76% over [21,23], respectively. These results justify the claims for the proposed technique against the different models used on the colon cancer dataset.
- The proposed model for the CNS dataset achieves an accuracy of 99.79%, again outperforming the previous best accuracy of 83.33% achieved by [7]. This phenomenal 16.46% improvement shows that the model performs well on complex, high-dimensional datasets. It also achieves a 20.02% improvement over [21] (83.14% accuracy).
- For ovarian cancer, the proposed model reaches 98.99% accuracy, outperforming most existing benchmarks, though it falls 1.01% short of [6,7], which scored 100%. A slight increment of 0.11% puts it above the 98.88% of [11], and it outperforms the 96.98% of [20] by 2.01% and the 94.10% of [21] by 5.17%. The developed model therefore performs well but does not outperform all benchmark strategies on the ovarian cancer dataset.
- The model's accuracy on the GSE4115 dataset is 99.50%, an 8.76% improvement over the 91.49% of [21]. On the GSE10245 dataset, the accuracy is 97.98%, improvements of 11.04% and 8.75% over [21,23], respectively. These results illustrate the robustness of the proposed model across different datasets.
- Table 4 shows the comparative study of the proposed model with existing models.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Rajeswari, M.; Chandrasekar, A.; Nasiya, P.M. Disease Prognosis by Machine Learning over Big Data from Healthcare Communities. Int. J. Recent Technol. Eng. 2019, 8, 680–683.
- Sahu, B. A Combo Feature Selection Method (Filter + Wrapper) for Microarray Gene Classification. Int. J. Pure Appl. Math. 2018, 118, 389–401.
- Xue, B.; Zhang, M.; Browne, W.N.; Yao, X. A Survey on Evolutionary Computation Approaches to Feature Selection. IEEE Trans. Evol. Comput. 2016, 20, 606–626.
- Mahapatra, M.; Majhi, S.K.; Dhal, S.K. MRMR-SSA: A Hybrid Approach for Optimal Feature Selection. Evol. Intell. 2022, 15, 2017–2036.
- Yu, K.; Li, W.; Xie, W.; Wang, L. A Hybrid Feature-Selection Method Based on MRMR and Binary Differential Evolution for Gene Selection. Processes 2024, 12, 313.
- Alomari, O.A.; Khader, A.T.; Al-Betar, M.A.; Abualigah, L.M. MRMR BA: A Hybrid Gene Selection Algorithm for Cancer Classification. J. Theor. Appl. Inf. Technol. 2017, 95, 2610–2618.
- Alomari, O.A.; Khader, A.T.; Al-Betar, M.A.; Awadallah, M.A. A Novel Gene Selection Method Using Modified MRMR and Hybrid Bat-Inspired Algorithm with β-Hill Climbing. Appl. Intell. 2018, 48, 4429–4447.
- Alshamlan, H.; Badr, G.; Alohali, Y. MRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling. BioMed Res. Int. 2015, 2015, 1–15.
- Wang, S.; Kong, W.; Aorigele; Deng, J.; Gao, S.; Zeng, W. Hybrid Feature Selection Algorithm MRMR-ICA for Cancer Classification from Microarray Gene Expression Data. Comb. Chem. High Throughput Screen. 2018, 21, 420–430.
- Sahu, B.; Panigrahi, A.; Sukla, S.; Biswal, B.B. MRMR-BAT-HS: A Clinical Decision Support System for Cancer Diagnosis. Leukemia 2020, 7129, 48.
- Qin, X.; Zhang, S.; Dong, X.; Shi, H.; Yuan, L. Improved Aquila Optimizer with MRMR for Feature Selection of High-Dimensional Gene Expression Data. Clust. Comput. 2024, 27, 13005–13027.
- Vincentina, N.; Nagarajan, R. Feature Selection in Microarray Using Proposed Hybrid Minimum Redundancy-Maximum Relevance (MRMR) and Modified Genetic Algorithm (MGA). Int. J. Exp. Res. Rev. 2024, 39, 82–91.
- Yaqoob, A. Combining the MRMR Technique with the Northern Goshawk Algorithm (NGHA) to Choose Genes for Cancer Classification. Int. J. Inf. Technol. 2024.
- Tasci, E.; Jagasia, S.; Zhuge, Y.; Camphausen, K.; Krauze, A.V. GradWise: A Novel Application of a Rank-Based Weighted Hybrid Filter and Embedded Feature Selection Method for Glioma Grading with Clinical and Molecular Characteristics. Cancers 2023, 15, 4628.
- Varzaneh, Z.A.; Hossein, S.; Mood, S.E.; Javidi, M.M. A New Hybrid Feature Selection Based on Improved Equilibrium Optimization. Chemom. Intell. Lab. Syst. 2022, 228, 104618.
- Yaqoob, A.; Verma, N.K.; Aziz, R.M. Improving Breast Cancer Classification with MRMR + SS0 + WSVM: A Hybrid Approach. Multimed. Tools Appl. 2024.
- Mahmoud, A.; Takaoka, E. An Enhanced Machine Learning Approach with Stacking Ensemble Learner for Accurate Liver Cancer Diagnosis Using Feature Selection and Gene Expression Data. Healthc. Anal. 2025, 7, 100373.
- Paramita, D.P.; Mohapatra, P. A Modified Metaheuristic Algorithm Integrated ELM Model for Cancer Classification. Sci. Iran. 2022, 29, 613–631.
- Razmjouei, P.; Moharamkhani, E.; Hasanvand, M.; Daneshfar, M.; Shokouhifar, M. Metaheuristic-Driven Two-Stage Ensemble Deep Learning for Lung/Colon Cancer Classification. Comput. Mater. Contin. 2024, 80, 3855–3880.
- Mahto, R.; Ahmed, S.U.; Rahman, R.u.; Aziz, R.M.; Roy, P.; Mallik, S.; Li, A.; Shah, M.A. A Novel and Innovative Cancer Classification Framework through a Consecutive Utilization of Hybrid Feature Selection. BMC Bioinform. 2023, 24, 479.
- Panda, M. Elephant Search Optimization Combined with Deep Neural Network for Microarray Data Analysis. J. King Saud Univ.—Comput. Inf. Sci. 2017, 32, 940–948.
- Sahu, B.; Dash, S. Hybrid Multifilter Ensemble Based Feature Selection Model from Microarray Cancer Datasets Using GWO with Deep Learning. In Proceedings of the 2023 3rd International Conference on Intelligent Technologies (CONIT), Hubli, India, 23–25 June 2023; pp. 1–6.
- Das, A.; Neelima, N.; Deepa, K.; Özer, T. Gene Selection Based Cancer Classification with Adaptive Optimization Using Deep Learning Architecture. IEEE Access 2024, 12, 62234–62255.
- Son, H.; Nguyen, T. Portia Spider Algorithm: An Evolutionary Computation Approach for Engineering Application. Artif. Intell. Rev. 2024, 57, 24.
- Peng, H.; Ding, C. Minimum Redundancy and Maximum Relevance Feature Selection and Recent Advances in Cancer Classification. Feature Sel. Data Min. 2005, 52, 1–107.
- Panigrahi, A.; Pati, A.; Sahu, B.; Das, M.N.; Kumar, S.; Sahoo, G.; Kant, S. En-MinWhale: An Ensemble Approach Based on MRMR and Whale Optimization for Cancer Diagnosis. IEEE Access 2023, 11, 113526–113542.
- Ramírez-Gallego, S.; Lastra, I.; Martínez-Rego, D.; Bolón-Canedo, V.; Benítez, J.M.; Herrera, F.; Alonso-Betanzos, A. Fast-MRMR: Fast Minimum Redundancy Maximum Relevance Algorithm for High-Dimensional Big Data. Int. J. Intell. Syst. 2016, 32, 134–152.
- Huang, C.; Zhou, J.; Chen, J.; Yang, J.; Clawson, K.; Peng, Y. A Feature Weighted Support Vector Machine and Artificial Neural Network Algorithm for Academic Course Performance Prediction. Neural Comput. Appl. 2023, 35, 11517–11529.
- Elnaggar, A.H.; El-Hameed, A.S.A.; Yakout, M.A.; Areed, N.F.F. Machine Learning for Breast Cancer Detection with Dual-Port Textile UWB MIMO Bra-Tenna System. Information 2024, 15, 467.
- Mohi, M.; Mamun, A.A.; Chakrabarti, A.; Mostafiz, R.; Dey, S.K. An Ensemble Machine Learning-Based Approach to Predict Cervical Cancer Using Hybrid Feature Selection. Neurosci. Inform. 2024, 4, 100169.
Dataset | Instances/Attributes | Classes Distribution | Imbalance Ratio |
---|---|---|---|
A1 | 70/7129 | AML/ALL: 25/47 | 1.88 |
A2 | 62/2000 | Tumor/Normal: 40/22 | 1.82 |
A3 | 60/7129 | Class0/Class1: 39/21 | 1.85 |
A4 | 235/15,154 | Cancer/Normal: 162/91 | 1.78 |
A5 | 192/22,216 | Cancer/Normal: 97/95 | 1.02 |
A6 | 58/54,676 | AC/SCC: 40/18 | 2.22 |
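The imbalance ratios in the last column are simply the majority-class count divided by the minority-class count; for several rows of the table:

```python
# Imbalance ratio = majority-class count / minority-class count,
# reproducing (to two decimals) rows of the dataset summary table.
counts = {"A1": (47, 25), "A2": (40, 22), "A4": (162, 91),
          "A5": (97, 95), "A6": (40, 18)}
ratios = {name: round(max(c) / min(c), 2) for name, c in counts.items()}
```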
Dataset | Model | Accuracy (%) | MCR (%) | Precision (%) | Sensitivity (%) | F1-Score (%) | Specificity (%) | FNR (%) | FPR (%) | MCC (%)
---|---|---|---|---|---|---|---|---|---|---
ALL-AML | Fast mRMR-SVM | 88.89 | 11.11 | 85.27 | 97.35 | 90.91 | 77.65 | 2.65 | 22.35 | 77.90
| Fast mRMR-WSVM | 90.91 | 9.09 | 86.29 | 99.07 | 92.24 | 81.11 | 0.93 | 18.89 | 82.53
| Fast mRMR-XGBoost | 91.41 | 8.59 | 91.74 | 94.07 | 92.89 | 87.50 | 5.93 | 12.50 | 82.10
| Fast mRMR-AdaBoost | 92.89 | 7.11 | 91.27 | 97.46 | 94.26 | 86.08 | 2.54 | 13.92 | 85.27
| Fast mRMR-RF | 93.50 | 6.50 | 91.47 | 98.33 | 94.78 | 86.25 | 1.67 | 13.75 | 86.60
COLON | Fast mRMR-SVM | 88.38 | 11.62 | 86.61 | 94.83 | 90.53 | 79.27 | 5.17 | 20.73 | 76.10
| Fast mRMR-WSVM | 93.43 | 6.57 | 90.55 | 99.14 | 94.65 | 85.37 | 0.86 | 14.63 | 86.79
| Fast mRMR-XGBoost | 92.93 | 7.07 | 93.39 | 94.96 | 94.17 | 89.87 | 5.04 | 10.13 | 85.21
| Fast mRMR-AdaBoost | 93.94 | 6.06 | 93.02 | 97.56 | 95.24 | 88.00 | 2.44 | 12.00 | 87.10
| Fast mRMR-RF | 91.92 | 8.08 | 91.13 | 95.76 | 93.39 | 86.25 | 4.24 | 13.75 | 83.18
CNS | Fast mRMR-SVM | 87.37 | 12.63 | 86.40 | 93.10 | 89.63 | 79.27 | 6.90 | 20.73 | 73.89
| Fast mRMR-WSVM | 88.38 | 11.62 | 87.80 | 93.10 | 90.38 | 81.71 | 6.90 | 18.29 | 75.97
| Fast mRMR-XGBoost | 91.41 | 8.59 | 88.52 | 97.30 | 92.70 | 83.91 | 2.70 | 16.09 | 82.87
| Fast mRMR-AdaBoost | 92.42 | 7.58 | 91.27 | 96.64 | 93.88 | 86.08 | 3.36 | 13.92 | 84.20
| Fast mRMR-RF | 93.43 | 6.57 | 92.74 | 96.64 | 94.65 | 88.61 | 3.36 | 11.39 | 86.29
Ovarian | Fast mRMR-SVM | 89.90 | 10.10 | 90.07 | 95.49 | 92.70 | 78.46 | 4.51 | 21.54 | 76.70
| Fast mRMR-WSVM | 91.92 | 8.08 | 92.48 | 95.35 | 93.89 | 85.51 | 4.65 | 14.49 | 82.04
| Fast mRMR-XGBoost | 92.42 | 7.58 | 89.60 | 98.25 | 93.72 | 84.52 | 1.75 | 15.48 | 84.79
| Fast mRMR-AdaBoost | 92.93 | 7.07 | 94.20 | 95.59 | 94.89 | 87.10 | 4.41 | 12.90 | 83.44
| Fast mRMR-RF | 90.82 | 9.18 | 90.00 | 94.74 | 92.31 | 85.37 | 5.26 | 14.63 | 81.10
GSE4115 | Fast mRMR-SVM | 94.42 | 5.58 | 93.06 | 99.26 | 96.06 | 83.87 | 0.74 | 16.13 | 87.06
| Fast mRMR-WSVM | 93.94 | 6.06 | 92.03 | 99.22 | 95.49 | 84.29 | 0.78 | 15.71 | 86.87
| Fast mRMR-XGBoost | 93.43 | 6.57 | 93.88 | 97.18 | 95.50 | 83.93 | 2.82 | 16.07 | 83.54
| Fast mRMR-AdaBoost | 94.47 | 5.53 | 94.16 | 98.64 | 96.35 | 82.69 | 1.36 | 17.31 | 85.42
| Fast mRMR-RF | 93.43 | 6.57 | 91.95 | 99.28 | 95.47 | 80.00 | 0.72 | 20.00 | 84.42
GSE10245 | Fast mRMR-SVM | 90.91 | 9.09 | 92.95 | 95.39 | 94.16 | 76.09 | 4.61 | 23.91 | 73.84
| Fast mRMR-WSVM | 94.95 | 5.05 | 96.88 | 96.88 | 96.88 | 86.84 | 3.13 | 13.16 | 83.72
| Fast mRMR-XGBoost | 93.94 | 6.06 | 95.63 | 96.84 | 96.23 | 82.50 | 3.16 | 17.50 | 80.89
| Fast mRMR-AdaBoost | 94.95 | 5.05 | 94.00 | 99.30 | 96.58 | 83.93 | 0.70 | 16.07 | 87.46
Dataset | Model | Accuracy (%) | MCR (%) | Precision (%) | Sensitivity (%) | F1-Score (%) | Specificity (%) | FNR (%) | FPR (%) | MCC (%)
---|---|---|---|---|---|---|---|---|---|---
ALL-AML | Fast mRMR-BPSOA-SVM | 95.48 | 4.52 | 95.48 | 98.67 | 97.05 | 85.71 | 1.33 | 14.29 | 87.60
| Fast mRMR-BPSOA-WSVM | 97.47 | 2.53 | 97.44 | 99.35 | 98.38 | 91.11 | 0.65 | 8.89 | 92.73
| Fast mRMR-BPSOA-XGBoost | 99.38 | 0.62 | 99.37 | 100.00 | 99.68 | 97.50 | 0.00 | 2.50 | 98.43
| Fast mRMR-BPSOA-AdaBoost | 98.48 | 1.52 | 98.74 | 99.37 | 99.05 | 95.00 | 0.63 | 5.00 | 95.27
| Fast mRMR-BPSOA-RF | 98.99 | 1.01 | 99.38 | 99.38 | 99.38 | 97.30 | 0.62 | 2.70 | 96.68
COLON | Fast mRMR-BPSOA-SVM | 97.98 | 2.02 | 99.39 | 98.20 | 98.80 | 96.77 | 1.80 | 3.23 | 92.61
| Fast mRMR-BPSOA-WSVM | 96.97 | 3.03 | 98.11 | 98.11 | 98.11 | 92.31 | 1.89 | 7.69 | 90.42
| Fast mRMR-BPSOA-XGBoost | 97.98 | 2.02 | 99.38 | 98.17 | 98.77 | 97.06 | 1.83 | 2.94 | 93.12
| Fast mRMR-BPSOA-AdaBoost | 98.48 | 1.52 | 99.39 | 98.79 | 99.09 | 96.97 | 1.21 | 3.03 | 94.63
| Fast mRMR-BPSOA-RF | 98.99 | 1.01 | 99.36 | 99.36 | 99.36 | 97.56 | 0.64 | 2.44 | 96.92
CNS | Fast mRMR-BPSOA-SVM | 95.96 | 4.04 | 97.62 | 97.62 | 97.62 | 86.67 | 2.38 | 13.33 | 84.29
| Fast mRMR-BPSOA-WSVM | 97.98 | 2.02 | 100.00 | 97.63 | 98.80 | 100.00 | 2.37 | 0.00 | 92.63
| Fast mRMR-BPSOA-XGBoost | 97.34 | 2.66 | 99.35 | 97.45 | 98.39 | 96.77 | 2.55 | 3.23 | 90.85
| Fast mRMR-BPSOA-AdaBoost | 99.79 | 0.21 | 99.40 | 100.00 | 99.70 | 96.97 | 0.00 | 3.03 | 98.18
| Fast mRMR-BPSOA-RF | 97.98 | 2.02 | 99.39 | 98.19 | 98.79 | 96.88 | 1.81 | 3.13 | 92.79
Ovarian | Fast mRMR-BPSOA-SVM | 97.47 | 2.53 | 98.74 | 98.13 | 98.43 | 94.74 | 1.88 | 5.26 | 91.95
| Fast mRMR-BPSOA-WSVM | 98.99 | 1.01 | 99.40 | 99.40 | 99.40 | 96.77 | 0.60 | 3.23 | 96.18
| Fast mRMR-BPSOA-XGBoost | 97.98 | 2.02 | 98.85 | 98.85 | 98.85 | 91.67 | 1.15 | 8.33 | 90.52
| Fast mRMR-BPSOA-AdaBoost | 98.48 | 1.52 | 99.43 | 98.87 | 99.15 | 95.24 | 1.13 | 4.76 | 92.21
| Fast mRMR-BPSOA-RF | 96.95 | 3.05 | 98.16 | 98.16 | 98.16 | 91.18 | 1.84 | 8.82 | 89.34
GSE4115 | Fast mRMR-BPSOA-SVM | 97.47 | 2.53 | 98.20 | 98.80 | 98.50 | 90.63 | 1.20 | 9.38 | 90.58
| Fast mRMR-BPSOA-WSVM | 99.49 | 0.51 | 100.00 | 99.39 | 99.70 | 100.00 | 0.61 | 0.00 | 98.22
| Fast mRMR-BPSOA-XGBoost | 98.99 | 1.01 | 99.39 | 99.39 | 99.39 | 96.97 | 0.61 | 3.03 | 96.36
| Fast mRMR-BPSOA-AdaBoost | 97.47 | 2.53 | 99.37 | 97.53 | 98.44 | 97.22 | 2.47 | 2.78 | 91.89
| Fast mRMR-BPSOA-RF | 99.50 | 0.50 | 100.00 | 99.44 | 99.72 | 100.00 | 0.56 | 0.00 | 97.32
GSE10245 | Fast mRMR-BPSOA-SVM | 94.44 | 5.56 | 97.59 | 95.86 | 96.72 | 86.21 | 4.14 | 13.79 | 78.83
| Fast mRMR-BPSOA-WSVM | 96.97 | 3.03 | 98.29 | 98.29 | 98.29 | 86.96 | 1.71 | 13.04 | 85.24
| Fast mRMR-BPSOA-XGBoost | 96.97 | 3.03 | 98.16 | 98.16 | 98.16 | 91.43 | 1.84 | 8.57 | 89.59
| Fast mRMR-BPSOA-AdaBoost | 96.46 | 3.54 | 99.40 | 96.51 | 97.94 | 96.15 | 3.49 | 3.85 | 86.13
| Fast mRMR-BPSOA-RF | 97.98 | 2.02 | 99.42 | 98.28 | 98.84 | 95.83 | 1.72 | 4.17 | 90.94
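All columns of Tables 2 and 3 derive from a single binary confusion matrix; the sketch below gives the standard definitions on the tables' percentage scale (note that FNR = 100 − sensitivity and FPR = 100 − specificity, as the table rows reflect):

```python
import math

def metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics, in percent."""
    n = tp + fn + tn + fp
    acc = 100 * (tp + tn) / n
    sens = 100 * tp / (tp + fn)          # true positive rate
    spec = 100 * tn / (tn + fp)          # true negative rate
    prec = 100 * tp / (tp + fp)
    f1 = 2 * prec * sens / (prec + sens)
    mcc = 100 * (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"Accuracy": acc, "MCR": 100 - acc, "Precision": prec,
            "Sensitivity": sens, "F1": f1, "Specificity": spec,
            "FNR": 100 - sens, "FPR": 100 - spec, "MCC": mcc}
```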
Methodology | ALL-AML | Colon | CNS | Ovarian | GSE4115 | GSE10245 |
---|---|---|---|---|---|---|
[5] | 97.23 | 93.33 | – | – | – | – |
[6] | – | 93.12 | – | 100 | – | – |
[7] | 98.61 | 85.48 | 83.33 | 100 | – | – |
[8] | 94.17 | 95.83 | – | – | – | – |
[11] | – | 91.93 | – | 98.88 | – | – |
[20] | – | 98.27 | – | 96.98 | – | – |
[21] | 92.11 | 79.03 | 83.14 | 94.10 | 91.49 | 88.24 |
[22] with RNN | 97.11 | – | – | – | – | – |
[22] with LSTM | 97.17 | – | – | – | – | – |
[23] with CNN | 94.50 | 93.60 | – | – | – | 87.70 |
[23] with LSTM | 94.10 | 94.30 | – | – | – | 88.20 |
[23] with DCNN | 95.10 | 95.40 | – | – | – | 90.10 |
[Proposed] | 99.38 | 98.99 | 99.79 | 98.99 | 99.50 | 97.98
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sahu, B.; Panigrahi, A.; Pati, A.; Das, M.N.; Jain, P.; Sahoo, G.; Liu, H. Novel Hybrid Feature Selection Using Binary Portia Spider Optimization Algorithm and Fast mRMR. Bioengineering 2025, 12, 291. https://doi.org/10.3390/bioengineering12030291