Enhancing Machine Learning Model Prediction with Feature Selection for Botnet Intrusion Detection
Abstract
1. Introduction
2. State-of-the-Art Analysis
3. Research Methodology
3.1. Datasets Used
3.2. Data Preprocessing
3.3. Feature Selection (FS)
3.4. Data Balancing
4. Results and Discussion
4.1. Machine Learning (ML) Results
4.2. Feature Importance Analysis
4.3. Evaluation of Our Results Against State-of-the-Art Results
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zaman, S.; Tauqeer, H.; Ahmad, W.; Shah, S.M.A.; Ilyas, M. Implementation of Intrusion Detection System in the Internet of Things: A Survey. In Proceedings of the 2020 IEEE 23rd International Multitopic Conference (INMIC), Bahawalpur, Pakistan, 5–7 November 2020; pp. 1–6.
- Sharma, P. Critical Review of Various Intrusion Detection Techniques for Internet of Things. In Proceedings of the 2nd International Conference on Data, Engineering and Applications (IDEA), Bhopal, India, 28–29 February 2020; pp. 1–6.
- Alaiz-Moreton, H.; Aveleira-Mata, J.; Ondicol-Garcia, J.; Muñoz-Castañeda, A.L.; García, I.; Benavides, C. Multiclass Classification Procedure for Detecting Attacks on MQTT-IoT Protocol. Complexity 2019, 2019, 6516253.
- Rahim, R.; Ahanger, A.S.; Khan, S.M.; Ma, F. Analysis of IDS using Feature Selection Approach on NSL-KDD Dataset. In Proceedings of the SCRS Conference on Intelligent Systems, Bangalore, India, 5–6 September 2022; Volume 26.
- Soe, Y.N.; Santosa, P.I.; Hartanto, R. DDoS Attack Detection Based on Simple ANN with SMOTE for IoT Environment. In Proceedings of the 2019 Fourth International Conference on Informatics and Computing (ICIC), Semarang, Indonesia, 16–17 October 2019; pp. 1–5.
- Alissa, K.; Alyas, T.; Zafar, K.; Abbas, Q.; Tabassum, N.; Sakib, S. Botnet Attack Detection in IoT Using Machine Learning. Comput. Intell. Neurosci. 2022, 2022, 4515642.
- Raghuvanshi, A.; Singh, U.K.; Sajja, G.S.; Pallathadka, H.; Asenso, E.; Kamal, M.; Singh, A.; Phasinam, K. Intrusion Detection Using Machine Learning for Risk Mitigation in IoT-Enabled Smart Irrigation in Smart Farming. J. Food Qual. 2022, 2022, 3955514.
- Al-Ambusaidi, M.; Yinjun, Z.; Muhammad, Y.; Yahya, A. ML-IDS: An Efficient ML-Enabled Intrusion Detection System for Securing IoT Networks and Applications. Soft Comput. 2024, 28, 1765–1784.
- Maniriho, P.; Niyigaba, E.; Bizimana, Z.; Twiringiyimana, V.; Mahoro, L.J.; Ahmad, T. Anomaly-Based Intrusion Detection Approach for IoT Networks Using Machine Learning. In Proceedings of the 2020 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), Surabaya, Indonesia, 17–18 November 2020.
- Swarna Sugi, S.S.; Ratna, S.R. Investigation of Machine Learning Techniques in Intrusion Detection System for IoT Network. In Proceedings of the 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 3–5 December 2020; pp. 1164–1167.
- Ali, M.L.; Thakur, K.; Schmeelk, S.; Debello, J.; Dragos, D. Deep Learning vs. Machine Learning for Intrusion Detection in Computer Networks: A Comparative Study. Appl. Sci. 2025, 15, 1903.
- Idouglid, L.; Tkatek, S.; Elfayq, K.; Guezzaz, A. A Novel Anomaly Detection Model for the Industrial Internet of Things Using Machine Learning Techniques. Radioelectron. Comput. Syst. 2024, 2024, 143–151.
- Kerrakchou, I.; El Hassan, A.A.; Chadli, S.; Emharraf, M.; Saber, M. Selection of Efficient Machine Learning Algorithm on Bot-IoT Dataset for Intrusion Detection in Internet of Things Networks. Indones. J. Electr. Eng. Comput. Sci. 2023, 31, 1784–1793.
- Gaber, T.; El-Ghamry, A.; Hassanien, A.E. Injection Attack Detection Using Machine Learning for Smart IoT Applications. Phys. Commun. 2022, 52, 101685.
- Sarwar, A.; Hasan, S.; Khan, W.U.; Ahmed, S.; Marwat, S.N.K. Design of an Advance Intrusion Detection System for IoT Networks. In Proceedings of the 2022 2nd International Conference on Artificial Intelligence (ICAI), Islamabad, Pakistan, 30–31 March 2022; pp. 46–51.
- Venkatesan, S. Design an Intrusion Detection System Based on Feature Selection Using ML Algorithms. Math. Stat. Eng. Appl. 2023, 72, 702–710.
- Li, J.; Othman, M.S.; Chen, H.; Yusuf, L.M. Optimizing IoT Intrusion Detection System: Feature Selection versus Feature Extraction in Machine Learning. J. Big Data 2024, 11, 36.
- Altulaihan, E.; Almaiah, M.A.; Aljughaiman, A. Anomaly Detection IDS for Detecting DoS Attacks in IoT Networks Based on Machine Learning Algorithms. Sensors 2024, 24, 713.
- Baich, M.; Hamim, T.; Sael, N.; Chemlal, Y. Machine Learning for IoT Based Networks Intrusion Detection: A Comparative Study. Procedia Comput. Sci. 2022, 215, 742–751.
- Yang, Z.; Liu, X.; Li, T.; Wu, D.; Wang, J.; Zhao, Y.; Han, H. A Systematic Literature Review of Methods and Datasets for Anomaly-Based Network Intrusion Detection. Comput. Secur. 2022, 116, 102675.
- BoT-IoT Dataset. Available online: https://ieee-dataport.org/documents/bot-iot-dataset (accessed on 11 April 2025).
 




| Algorithms | Accuracy | Precision | Recall | F1 | MCC | FAR | Time (s) | 
|---|---|---|---|---|---|---|---|
| RF | 99.99% | 99.99% | 99.99% | 99.99% | 98.01% | 0.019 | 552 | 
| NB | 99.97% | 99.99% | 99.97% | 99.98% | 47.34% | 0.2475 | 7.5 | 
| DT | 99.99% | 99.99% | 99.99% | 99.99% | 97.53% | 0.019 | 145 | 
| KNN | 99.99% | 99.99% | 99.99% | 99.99% | 93.00% | 0.0099 | 6888 | 
| LR | 99.51% | 99.99% | 99.51% | 99.75% | 15.60% | 0.059 | 210 | 
| XGB | 99.99% | 100% | 99.99% | 99.99% | 99.50% | 0.019 | 43.68 | 
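
In the table above, several models pair near-perfect accuracy with a much lower MCC (for example, NB reaches 99.97% accuracy but only 47.34% MCC). One plausible reading is heavy class imbalance: when almost all records are attacks, a handful of misclassified normal records barely moves accuracy but sharply lowers MCC and raises FAR. The sketch below computes the reported metrics from confusion-matrix counts. It assumes that the attack class is the positive class and that FAR is defined as FP / (FP + TN); neither assumption is confirmed by the table, and the counts are hypothetical, chosen only to illustrate the effect.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumption: the attack class is "positive" and FAR = FP / (FP + TN).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient: uses all four cells of the matrix,
    # so it stays informative when one class dominates.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    # False alarm rate: share of normal records flagged as attacks.
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
        "far": far,
    }


# Hypothetical counts: 100,000 attack records and only 20 normal records.
metrics = binary_metrics(tp=99_970, fp=10, tn=10, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these made-up counts, accuracy stays above 99.9% even though half of the normal records are misclassified, which is why MCC and FAR are worth reading alongside accuracy in the tables.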

| Techniques | Algorithms | Accuracy | Precision | Recall | F1 | MCC | FAR | Time (s) |
|---|---|---|---|---|---|---|---|---|
| Fisher score | RF | 99.99% | 100% | 99.99% | 99.99% | 98.54% | 0.25 | 787 |
| Fisher score | NB | 99.96% | 99.99% | 99.97% | 99.98% | 45.67% | 0.24 | 5.76 |
| Fisher score | DT | 99.99% | 99.99% | 99.99% | 99.99% | 96.13% | 0.019 | 88 |
| Fisher score | KNN | 99.99% | 99.99% | 99.99% | 99.99% | 85.57% | 0.059 | 1881 |
| Fisher score | LR | 98.12% | 99.99% | 98.12% | 99.05% | 76% | 0.099 | 155 |
| Fisher score | XGB | 99.99% | 99.99% | 100% | 99.99% | 97.49% | 0.049 | 23 |
| ANOVA | RF | 99.99% | 100% | 99.99% | 99.99% | 97.15% | 0.029 | 287 |
| ANOVA | NB | 98.38% | 100% | 98.38% | 99.18% | 9.11% | 0.2475 | 1.5 |
| ANOVA | DT | 99.99% | 99.99% | 99.99% | 99.99% | 96.13% | 0.019 | 42 |
| ANOVA | KNN | 99.99% | 99.99% | 99.99% | 99.99% | 82.91% | 0.0099 | 191 |
| ANOVA | LR | 99.38% | 100% | 99.38% | 99.05% | 14.47% | 0.099 | 26 |
| ANOVA | XGB | 99.99% | 100% | 99.99% | 99.99% | 97.15% | 0.049 | 14 |
| Ridge | RF | 99.96% | 99.99% | 99.96% | 99.98% | 50.40% | 0.049 | 199 |
| Ridge | NB | 98.03% | 100% | 98.03% | 99.00% | 8.25% | 0.0001 | 1.8 |
| Ridge | DT | 99.96% | 99.99% | 99.96% | 99.98% | 49.46% | 0.059 | 17 |
| Ridge | KNN | 99.96% | 99.99% | 99.96% | 99.98% | 49.68% | 0.029 | 710 |
| Ridge | LR | 99.98% | 99.98% | 100% | 99.99% | 15.85% | 0.99 | 5.20 |
| Ridge | XGB | 99.95% | 99.99% | 99.95% | 99.97% | 46.31% | 0.039 | 35 |
| Lasso | RF | 99.99% | 100% | 99.99% | 99.99% | 98.54% | 0.277 | 300 |
| Lasso | NB | 99.95% | 99.99% | 99.95% | 99.97% | 38.61% | 0.25 | 1.5 |
| Lasso | DT | 99.99% | 100% | 99.99% | 99.99% | 97.15% | 0.001 | 43 |
| Lasso | KNN | 99.99% | 99.99% | 99.99% | 99.99% | 78.61% | 0.0198 | 268 |
| Lasso | LR | 99.13% | 99.99% | 99.13% | 99.56% | 11.97% | 0.039 | 651 |
| Lasso | XGB | 99.99% | 100% | 99.99% | 99.99% | 98.07% | 0.001 | 13 |
| Recursive Feature Elimination | RF | 99.98% | 99.99% | 99.99% | 99.99% | 96.13% | 0.019 | 297 |
| Recursive Feature Elimination | NB | 99.98% | 99.98% | 99.99% | 99.99% | 15.35% | 0.910 | 2 |
| Recursive Feature Elimination | DT | 99.99% | 99.99% | 99.99% | 99.99% | 97.00% | 0.039 | 76 |
| Recursive Feature Elimination | KNN | 99.99% | 99.99% | 99.99% | 99.99% | 90.00% | 0.039 | 4761 |
| Recursive Feature Elimination | LR | 99.40% | 99.99% | 99.40% | 99.70% | 14.82% | 0.009 | 77 |
| Recursive Feature Elimination | XGB | 99.99% | 99.99% | 99.99% | 99.99% | 97.06% | 0.019 | 15 |
| Forward feature selection | RF | 99.99% | 100% | 99.99% | 99.99% | 99.02% | 0.01 | 322 |
| Forward feature selection | NB | 99.95% | 99.99% | 99.95% | 99.97% | 38.16% | 0.257 | 2 |
| Forward feature selection | DT | 99.99% | 99.99% | 99.99% | 99.99% | 98.01% | 0.019 | 44 |
| Forward feature selection | KNN | 99.99% | 99.99% | 99.99% | 99.99% | 89.92% | 0.019 | 3406 |
| Forward feature selection | LR | 98.45% | 99.99% | 98.45% | 99.22% | 8.68% | 0.069 | 83 |
| Forward feature selection | XGB | 99.99% | 99.99% | 99.99% | 99.99% | 97.53% | 0.019 | 13 |
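
The first technique in the table above, the Fisher score, is a filter method that ranks each feature by how well it separates the classes. The sketch below uses one common formulation (between-class scatter divided by within-class scatter, computed per feature); the exact variant and implementation behind the reported results are not shown here, so treat this as an illustration only. The function name `fisher_scores` and the toy data are hypothetical.

```python
from statistics import fmean, pvariance


def fisher_scores(rows, labels):
    """Return one Fisher score per feature (column) of `rows`.

    Score_j = sum_c n_c * (mean_cj - mean_j)^2 / sum_c n_c * var_cj,
    where c ranges over classes. Higher means better class separation.
    """
    n_features = len(rows[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        overall_mean = fmean(column)
        between = 0.0  # scatter of class means around the overall mean
        within = 0.0   # scatter of samples around their own class mean
        for c in classes:
            values = [row[j] for row, y in zip(rows, labels) if y == c]
            n_c = len(values)
            between += n_c * (fmean(values) - overall_mean) ** 2
            within += n_c * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: constant feature scores 0,
            # a perfectly separating one scores infinity.
            scores.append(0.0 if between == 0 else float("inf"))
    return scores


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
rows = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.2, 3.5], [2.9, 4.0]]
labels = [0, 0, 0, 1, 1, 1]

scores = fisher_scores(rows, labels)
# Rank feature indices from most to least discriminative.
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranked)
```

Keeping the top-ranked features and retraining on that subset is the usual way such a ranking is turned into a feature selection step; the number of features to keep is a tuning choice.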

| Ref | FS | ML | Evaluation Metrics | Accuracy | Time |
|---|---|---|---|---|---|
| [10] | Gain Ratio | KNN | Geometric mean, Kappa statistics, detection time, accuracy | 92.29% | - |
| [12] | PCA | RF | Accuracy, detection rate, false alarm rate | 98.9% | - |
| [13] | Not mentioned | DT | Accuracy, precision, recall, F-measure | 99.99% | Not mentioned |
| Our approach | Lasso | XGBoost | Accuracy, precision, recall, F-score, MCC, false alarm rate, time (s) | 99.99% | 13 s |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Baich, M.; Sael, N. Enhancing Machine Learning Model Prediction with Feature Selection for Botnet Intrusion Detection. Eng. Proc. 2025, 112, 55. https://doi.org/10.3390/engproc2025112055
