Automated Hyperparameter Optimization for Cyberattack Detection Based on Machine Learning in IoT Systems
Abstract
1. Introduction
2. Methodology
2.1. Materials
2.2. Data Preprocessing, Standardization, and Division
Given the large number of records in the CICIoT2023 dataset (approximately 46 million), a stratified 1% subsample was drawn from every class, yielding 466,846 records. Stratification keeps the proportion of the attack and benign classes, so the original distribution and the nature of the class-imbalance problem are preserved. A global seed of 42 was used both for the subsampling and for the stratified cross-validation, and all compared methods share exactly the same folds so that paired comparisons can be made. The evaluation protocol is based on k-fold cross-validation (k = 10); in each fold, preprocessing is fitted exclusively on that fold's training data to avoid data leakage. For the CIC-DDoS2019 dataset, which contains 431,371 records, 100% of the records were used under the same criteria.
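This subsampling and fold-sharing protocol can be expressed compactly with scikit-learn. The sketch below is illustrative only: the consolidated CSV file name and the 'Class' target column are assumptions, while the 1% stratified fraction, the global seed of 42, and the shared 10-fold stratified splits follow the description above.

```python
# Minimal sketch of the stratified 1% subsample and the shared folds.
import pandas as pd
from sklearn.model_selection import StratifiedKFold, train_test_split

SEED = 42  # global seed used for subsampling and cross-validation

df = pd.read_csv("ciciot2023_merged.csv")  # hypothetical consolidated file

# Stratified subsample: train_size=0.01 keeps 1% of every class.
subsample, _ = train_test_split(
    df, train_size=0.01, stratify=df["Class"], random_state=SEED
)

# All compared methods reuse exactly these folds, enabling paired comparisons.
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=SEED)
folds = list(skf.split(subsample.drop(columns=["Class"]), subsample["Class"]))
```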
Descriptors with a constant value across all records were identified and removed to streamline the analysis. In CICIoT2023, six such descriptors were found: ‘ece_flag_number’, ‘cwr_flag_number’, ‘Telnet’, ‘SMTP’, ‘IRC’, and ‘DHCP’. In CIC-DDoS2019, twelve constant descriptors were found: ‘Bwd PSH Flags’, ‘Fwd URG Flags’, ‘Bwd URG Flags’, ‘FIN Flag Count’, ‘PSH Flag Count’, ‘ECE Flag Count’, ‘Fwd Avg Bytes/Bulk’, ‘Fwd Avg Packets/Bulk’, ‘Fwd Avg Bulk Rate’, ‘Bwd Avg Bytes/Bulk’, ‘Bwd Avg Packets/Bulk’, and ‘Bwd Avg Bulk Rate’. In addition, the ‘Unnamed: 0’ descriptor, which only holds the sequential numbering of the records, and the ‘Label’ descriptor, which lists the specific attack type of each record, were removed; the latter was dropped because the ‘Class’ descriptor already groups the attack types more appropriately. In total, 6 descriptors were removed from CICIoT2023 and 14 from CIC-DDoS2019 at this stage.
Duplicate records and records containing empty or infinite values (positive or negative) were also identified and removed. No such records were found in CICIoT2023, whereas 7698 duplicate records were identified and removed from CIC-DDoS2019 to ensure the quality of the analyzed data.
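The cleaning steps described above (constant descriptors, auxiliary columns, and duplicate/empty/infinite records) can be chained as in the following sketch; it assumes a pandas DataFrame with the column names quoted in the text and is not the exact preprocessing code used in the study.

```python
# Illustrative cleaning sketch (pandas); column names follow the text above.
import numpy as np
import pandas as pd

def clean(df: pd.DataFrame) -> pd.DataFrame:
    # Drop descriptors whose value is constant across all records.
    constant_cols = [c for c in df.columns if df[c].nunique(dropna=False) <= 1]
    df = df.drop(columns=constant_cols)

    # CIC-DDoS2019 only: drop the sequential index and the fine-grained label,
    # keeping 'Class' as the target; errors="ignore" skips absent columns.
    df = df.drop(columns=["Unnamed: 0", "Label"], errors="ignore")

    # Remove duplicates and records with empty or infinite values.
    df = df.drop_duplicates()
    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    return df
```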
Standardizing the data with the z-score helps the algorithms train more stably and generalize better. For each descriptor, the mean and standard deviation are computed on the training split only, and the same parameters are then applied to the validation and test splits.
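A minimal sketch of this leakage-free standardization, using scikit-learn's StandardScaler on toy arrays, is shown below; the statistics are estimated on the training data only and reused for every held-out split.

```python
# Sketch: z-score parameters come from the training split only.
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # toy training data
X_test = np.array([[2.5, 25.0]])                              # toy held-out data

scaler = StandardScaler().fit(X_train)   # mean and std estimated on training data
X_train_std = scaler.transform(X_train)
X_test_std = scaler.transform(X_test)    # same parameters applied, no refitting
```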
The preprocessed dataset is divided into training, validation, and test sets in a 60%/20%/20% ratio using a fixed random seed (42). The division is stratified, so the proportion of each class (the class priors) remains constant across the three subsets.
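One way to obtain this stratified 60/20/20 split is two successive calls to train_test_split, as sketched below with synthetic data standing in for the preprocessed feature matrix and labels.

```python
# Sketch of the stratified 60/20/20 split with the fixed seed (42).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))            # stand-in feature matrix
y = rng.integers(0, 2, size=1000)         # stand-in class labels

# 60% training, 40% temporary hold-out, stratified on the labels.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, train_size=0.60, stratify=y, random_state=42
)
# Split the hold-out in half: 20% validation and 20% test overall.
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42
)
```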
2.3. Feature Selection
| Algorithm 1: Feature Selection Process | |
|---|---|
| 1. | Step 1: Load dataset and optimization algorithms |
| 2. | Import CICIoT2023 and CIC-DDoS2019. |
| 3. | Step 2: Define the fitness function |
| 4. | Use a DecisionTree (DT) as the base model. |
| 5. | Evaluate performance on the validation set using the geometric mean between the F1 Score and the MCC. |
| 6. | Step 3: Run optimization algorithms |
| 7. | For each optimization algorithm: |
| 8. | Evaluate the candidate feature subsets using DT. |
| 9. | Record the best binary mask. |
| 10. | End |
| 11. | Step 4: Aggregate the feature selection results |
| 12. | Step 5: Output |
| 13. | Return the final list of features selected by at least three optimization algorithms. |
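The fitness function of Algorithm 1 (Step 2) and the consensus rule of Steps 4–5 admit a compact reading, sketched below. The exact optimizers and their internals are not reproduced here; the geometric mean of the weighted F1 score and the MCC, the DecisionTree base model, and the "selected by at least three algorithms" rule follow the algorithm, while the NumPy-array interface and the helper names are assumptions.

```python
# Illustrative fitness and consensus functions for the wrapper feature selection.
import numpy as np
from sklearn.metrics import f1_score, matthews_corrcoef
from sklearn.tree import DecisionTreeClassifier

def fitness(mask, X_train, y_train, X_val, y_val, seed=42):
    """Score a binary feature mask with a DecisionTree on the validation split."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return 0.0  # empty subsets are worthless
    model = DecisionTreeClassifier(random_state=seed)
    model.fit(X_train[:, cols], y_train)
    y_pred = model.predict(X_val[:, cols])
    f1 = f1_score(y_val, y_pred, average="weighted")
    mcc = matthews_corrcoef(y_val, y_pred)
    return float(np.sqrt(max(f1, 0.0) * max(mcc, 0.0)))  # geometric mean

def consensus(best_masks, min_votes=3):
    """Keep features selected by at least `min_votes` optimization algorithms."""
    votes = np.asarray(best_masks, dtype=int).sum(axis=0)
    return np.flatnonzero(votes >= min_votes)
```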
2.4. Proposed HPO Method
- (i) Scalar objective function (weighted F1 + normalized MCC) (see the sketch after this list);
- (ii) Surrogate-based acquisition;
- (iii) Candidate set and ε-greedy selection with elite-guided mutation;
- (iv) Stratified K-fold evaluation in parallel;
- (v) Computational complexity.
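Two of these components can be illustrated concretely. The sketch below assumes one plausible reading of the scalar objective, namely the geometric mean of the weighted F1 score and the MCC rescaled from [−1, 1] to [0, 1] (consistent with the "F1w ⊗ MCC" notation used in the results tables), and a standard ε-greedy rule that exploits the surrogate's acquisition scores with probability 1 − ε and explores a random candidate otherwise; the function names are illustrative.

```python
# Sketch of the scalar objective and the epsilon-greedy candidate selection.
import numpy as np
from sklearn.metrics import f1_score, matthews_corrcoef

def scalar_objective(y_true, y_pred):
    """Combine weighted F1 with MCC normalized from [-1, 1] to [0, 1]."""
    f1w = f1_score(y_true, y_pred, average="weighted")
    mcc01 = (matthews_corrcoef(y_true, y_pred) + 1.0) / 2.0
    return float(np.sqrt(f1w * mcc01))

def epsilon_greedy_pick(candidates, acquisition_scores, epsilon, rng):
    """With probability epsilon explore a random candidate, else exploit the best."""
    if rng.random() < epsilon:
        return candidates[rng.integers(len(candidates))]
    return candidates[int(np.argmax(acquisition_scores))]
```

In this reading, ε would decay from the epsilon_start to the epsilon_end values listed in the calibration table over the optimization iterations.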
Method Calibration and Parameter Fixing
2.5. Evaluation Metrics for ML Models
3. Results
3.1. Selection of Descriptors
3.2. Results Without Hyperparameter Tuning
3.3. Results with Hyperparameter Tuning
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- García-García, R.E. Applications of Artificial Intelligence in Hospital Quality Management: A Review of Digital Strategies in Healthcare Settings. Rev. Cient. Sist. Inform. 2025, 5, e928. [Google Scholar] [CrossRef]
- Mahoto, N.A.; Shaikh, A.; Sulaiman, A.; Reshan, M.S.A.; Rajab, A.; Rajab, K. A Machine Learning Based Data Modeling for Medical Diagnosis. Biomed. Signal Process. Control 2023, 81, 104481. [Google Scholar] [CrossRef]
- Bolia, C.; Joshi, S. Optimized Deep Neural Network for High-Precision Psoriasis Classification from Dermoscopic Images. Rev. Cient. Sist. Inform. 2025, 5, e966. [Google Scholar] [CrossRef]
- Rodriguez, M.R.R.; Calpa, C.A.D.; Paz, H.A.M. Comparison of kernel functions in the prediction of cardiovascular disease in Artificial Neural Networks (ANN) and Support Vector Machines (SVM). EthAIca 2025, 4, 172. [Google Scholar] [CrossRef]
- Del-Águila-Castro, M. Sistemas Inteligentes y su Aplicación en la Evaluación del Desempeño Académico Universitario: Una Revisión de la Literatura en el Contexto Sudamericano. Rev. Cient. Sist. Inform. 2024, 4, e671. [Google Scholar] [CrossRef]
- Garikapati, D.; Shetiya, S.S. Autonomous Vehicles: Evolution of Artificial Intelligence and the Current Industry Landscape. Big Data Cogn. Comput. 2024, 8, 42. [Google Scholar] [CrossRef]
- Shafique, R.; Rustam, F.; Murtala, S.; Jurcut, A.D.; Choi, G.S. Advancing Autonomous Vehicle Safety: Machine Learning to Predict Sensor-Related Accident Severity. IEEE Access 2024, 12, 25933–25948. [Google Scholar] [CrossRef]
- El Hajj, M.; Hammoud, J. Unveiling the Influence of Artificial Intelligence and Machine Learning on Financial Markets. J. Risk Financ. Manag. 2023, 16, 434. [Google Scholar] [CrossRef]
- Melo, R.A.V.; Peña, J.C.C.; Rosero, J.A.R. Enriching the tourist experience at the Santuario de las Lajas through image recognition using WhatsApp. EthAIca 2025, 4, 180. [Google Scholar] [CrossRef]
- Raiaan, M.A.K.; Sakib, S.; Fahad, N.M.; Al Mamun, A.; Rahman, A.; Shatabda, S.; Mukta, S.H. A Systematic Review of Hyperparameter Optimization Techniques in Convolutional Neural Networks. Decis. Anal. J. 2024, 11, 100470. [Google Scholar] [CrossRef]
- Bischl, B.; Binder, M.; Lang, M.; Pielok, T.; Richter, J.; Coors, S.; Thomas, J.; Ullmann, T.; Becker, M.; Boulesteix, A.; et al. Hyperparameter Optimization: Foundations, Algorithms, Best Practices, and Open Challenges. WIREs Data Min. Knowl. Discov. 2023, 13, e1484. [Google Scholar] [CrossRef]
- Franceschi, L.; Donini, M.; Perrone, V.; Klein, A.; Archambeau, C.; Seeger, M.; Pontil, M.; Frasconi, P. Hyperparameter Optimization in Machine Learning. arXiv 2025. [Google Scholar] [CrossRef]
- Shekhar, S.; Bansode, A.; Salim, A. A Comparative Study of Hyper-Parameter Optimization Tools. In Proceedings of the 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Brisbane, Australia, 8–10 December 2021. [Google Scholar] [CrossRef]
- scikit-learn. RandomForestClassifier. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html (accessed on 24 August 2025).
- Akinremi, B. Best Tools for Model Tuning and Hyperparameter Optimization. Neptune.ai. Available online: https://neptune.ai/blog/best-tools-for-model-tuning-and-hyperparameter-optimization (accessed on 24 August 2025).
- Yang, L.; Shami, A. On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice. Neurocomputing 2020, 415, 295–316. [Google Scholar] [CrossRef]
- Elgeldawi, E.; Sayed, A.; Galal, A.R.; Zaki, A.M. Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis. Informatics 2021, 8, 79. [Google Scholar] [CrossRef]
- Injadat, M.; Moubayed, A.; Nassif, A.B.; Shami, A. Systematic Ensemble Model Selection Approach for Educational Data Mining. Knowl.-Based Syst. 2020, 200, 105992. [Google Scholar] [CrossRef]
- Claesen, M.; Simm, J.; Popovic, D.; Moreau, Y.; De Moor, B. Easy Hyperparameter Search Using Optunity. arXiv 2014. [Google Scholar] [CrossRef]
- Lorenzo, P.R.; Nalepa, J.; Kawulok, M.; Ramos, L.S.; Pastor, J.R. Particle Swarm Optimization for Hyper-Parameter Selection in Deep Neural Networks. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’17), Berlin, Germany, 15–19 July 2017; ACM: New York, NY, USA, 2017; pp. 481–488. [Google Scholar] [CrossRef]
- Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305. [Google Scholar]
- Witt, C. Worst-Case and Average-Case Approximations by Simple Randomized Search Heuristics. In Proceedings of the 22nd Annual Conference on Theoretical Aspects of Computer Science (STACS’05), Stuttgart, Germany, 24–26 February 2005; Springer: Berlin/Heidelberg, Germany, 2005; pp. 44–56. [Google Scholar] [CrossRef]
- Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. arXiv 2012. [Google Scholar] [CrossRef]
- Hyperopt Documentation. Available online: https://hyperopt.github.io/hyperopt/ (accessed on 17 September 2025).
- Bergstra, J.; Yamins, D.; Cox, D.D. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. Proc. Mach. Learn. Res. 2013, 28, 115–123. [Google Scholar]
- Kumari, P.; Jain, A.K. A Comprehensive Study of DDoS Attacks over IoT Network and Their Countermeasures. Comput. Secur. 2023, 127, 103096. [Google Scholar] [CrossRef]
- Miri Kelaniki, S.; Komninos, N. A Study on IoT Device Authentication Using Artificial Intelligence. Sensors 2025, 25, 5809. [Google Scholar] [CrossRef]
- Wahab, S.A.; Sultana, S.; Tariq, N.; Mujahid, M.; Khan, J.A.; Mylonas, A. A Multi-Class Intrusion Detection System for DDoS Attacks in IoT Networks Using Deep Learning and Transformers. Sensors 2025, 25, 4845. [Google Scholar] [CrossRef]
- Saito, T.; Rehmsmeier, M. The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets. PLoS ONE 2015, 10, e0118432. [Google Scholar] [CrossRef]
- Chicco, D.; Jurman, G. The Advantages of the Matthews Correlation Coefficient (MCC) over F1 Score and Accuracy in Binary Classification Evaluation. BMC Genom. 2020, 21, 6. [Google Scholar] [CrossRef]
- Davis, J.; Goadrich, M. The Relationship between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning (ICML ’06), Pittsburgh, PA, USA, 25–29 June 2006; ACM: New York, NY, USA, 2006; pp. 233–240. [Google Scholar] [CrossRef]
- Khanday, S.A.; Fatima, H.; Rakesh, N. Implementation of Intrusion Detection Model for DDoS Attacks in Lightweight IoT Networks. Expert Syst. Appl. 2023, 215, 119330. [Google Scholar] [CrossRef]
- Najar, A.A.; Manohar, N.S. A Robust DDoS Intrusion Detection System Using Convolutional Neural Network. Comput. Electr. Eng. 2024, 117, 109277. [Google Scholar] [CrossRef]
- Mahadik, S.S.; Pawar, P.M.; Muthalagu, R. Edge-HetIoT Defense against DDoS Attack Using Learning Techniques. Comput. Secur. 2023, 132, 103347. [Google Scholar] [CrossRef]
- Ullah, S.; Mahmood, Z.; Ali, N.; Ahmad, T.; Buriro, A. Machine Learning-Based Dynamic Attribute Selection Technique for DDoS Attack Classification in IoT Networks. Computers 2023, 12, 115. [Google Scholar] [CrossRef]
- Lv, H.; Du, Y.; Zhou, X.; Ni, W.; Ma, X. A Data Enhancement Algorithm for DDoS Attacks Using IoT. Sensors 2023, 23, 17496. [Google Scholar] [CrossRef]
- Neto, E.C.P.; Dadkhah, S.; Ferreira, R.; Zohourian, A.; Lu, R.; Ghorbani, A.A. CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment. Sensors 2023, 23, 5941. [Google Scholar] [CrossRef]
- Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing Realistic Distributed Denial of Service (DDoS) Attack Dataset and Taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India, 1–3 October 2019; pp. 1–8. [Google Scholar] [CrossRef]
- Akouhar, M.; Ouhssini, M.; El Fatini, M.; Abarda, A.; Agherrabi, E. Dynamic Oversampling-Driven Kolmogorov–Arnold Networks for Credit Card Fraud Detection: An Ensemble Approach to Robust Financial Security. Egypt. Inform. J. 2025, 31, 100712. [Google Scholar] [CrossRef]
- Hutter, F.; Hoos, H.H.; Leyton-Brown, K. Sequential Model-Based Optimization for General Algorithm Configuration. In Learning and Intelligent Optimization; Coello, C.A.C., Ed.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 507–523. [Google Scholar] [CrossRef]
- scikit-learn: Machine Learning in Python—Scikit-Learn 1.7.2 Documentation. Available online: https://scikit-learn.org/stable/ (accessed on 18 August 2025).

| Parameter | C1 (Stable Exploration) | C2 (High Exploration) | C3 (High Exploitation) | C4 (Stable Exploitation) | C5 (Intensive Optimization) |
|---|---|---|---|---|---|
| init_points | 12 | 20 | 12 | 12 | 24 |
| n_iter | 20 | 24 | 24 | 18 | 150 |
| top_k | 8 | 10 | 6 | 8 | 12 |
| can_per_iter | 128 | 160 | 96 | 128 | 256 |
| epsilon_start | 0.25 | 0.35 | 0.10 | 0.25 | 0.35 |
| epsilon_end | 0.08 | 0.15 | 0.05 | 0.08 | 0.05 |
| ucb_kappa | 1.5 | 2.0 | 1.0 | 1.5 | 1.5 |
| early_stop | 8 | 10 | 6 | 6 | 25 |
| n_splits_cv | 3 | 3 | 3 | 5 | 10 |
| best_reward | 0.86310 | 0.86575 | 0.86310 | 0.86649 | 0.86833 |
| Dataset | Total Number of Descriptors | Retained Descriptors (Consensus) | Dominant Descriptors |
|---|---|---|---|
| CICIoT2023 | 47 | 19 | [‘flow_duration’, ‘Header_Length’, ‘Protocol Type’, ‘Rate’, ‘fin_flag_number’, ‘urg_count’, ‘rst_count’, ‘HTTP’, ‘DNS’, ‘SSH’, ‘TCP’, ‘ARP’, ‘Min’, ‘IAT’, ‘Covariance’, ‘Weight’, ‘Drate’, ‘ack_count’, ‘syn_count’] |
| CIC-DDoS2019 | 80 | 18 | [‘Protocol’, ‘Total Backward Packets’, ‘Fwd Packets Length Total’, ‘Bwd Packets Length Total’, ‘Fwd Packet Length Min’, ‘Bwd Packet Length Max’, ‘Bwd Packet Length Mean’, ‘Bwd IAT Total’, ‘Fwd PSH Flags’, ‘Packet Length Max’, ‘ACK Flag Count’, ‘Init Fwd Win Bytes’, ‘Init Bwd Win Bytes’, ‘Active Max’, ‘Idle Max’, ‘Total Fwd Packets’, ‘SYN Flag Count’, ‘Idle Min’] |
| Model | Accuracy (CICIoT2023) | F1-Score (CICIoT2023) | MCC (CICIoT2023) | Inference Time (CICIoT2023) | Accuracy (CIC-DDoS2019) | F1-Score (CIC-DDoS2019) | MCC (CIC-DDoS2019) | Inference Time (CIC-DDoS2019) |
|---|---|---|---|---|---|---|---|---|
| RF | 0.994998 | 0.994740 | 0.988533 | 0.760837 | 0.946728 | 0.946909 | 0.928399 | 0.520209 |
| ADA | 0.963704 | 0.961538 | 0.961538 | 0.836089 | 0.922146 | 0.921282 | 0.893120 | 0.395241 |
| DT | 0.993788 | 0.993787 | 0.985759 | 0.013083 | 0.946303 | 0.946506 | 0.927728 | 0.010000 |
| XGB | 0.993810 | 0.993566 | 0.985809 | 0.052084 | 0.945725 | 0.945887 | 0.926866 | 0.049176 |
| MLP | 0.989215 | 0.989106 | 0.975290 | 0.060001 | 0.942385 | 0.942610 | 0.922150 | 0.051996 |
| Model | Space of Values |
|---|---|
| RF | “n_estimators”: {20, 150}; “max_depth”: {3, 60}; “max_features”: {“sqrt”, “log2”, None}; “min_samples_split”: {2, 30}; “min_samples_leaf”: {1, 10}; “min_impurity_decrease”: [1 × 10⁻⁹, 1 × 10⁻³] (log); “ccp_alpha”: [1 × 10⁻⁶, 1 × 10⁻¹] (log); “bootstrap”: {True, False}; “class_weight”: {None, “balanced”}. |
| ADA | “n_estimators”: {20, 150}; “learning_rate”: [1 × 10⁻³, 1.0] (log) |
| DT | “max_depth”: {3, 60}; “max_leaf_nodes”: {2, 50}; “criterion”: {“gini”, “entropy”, “log_loss”}; “min_samples_split”: {2, 30}; “min_samples_leaf”: {1, 10}; “ccp_alpha”: [1 × 10⁻⁶, 1 × 10⁻¹] (log); “splitter”: {“best”, “random”} |
| XGB | “learning_rate”: [1 × 10⁻³, 0.2] (log); “n_estimators”: {20, 150}; “max_depth”: {3, 8}; “min_child_weight”: [0.5, 10.0] (log); “subsample”: [0.6, 0.9]; “colsample_bytree”: [0.6, 0.9]; “reg_lambda”: [0.3, 10.0] (log); “reg_alpha”: [1 × 10⁻⁵, 1.0] (log); “gamma”: [1 × 10⁻⁸, 1.0] (log) |
| MLP | “solver”: {“adam”, “sgd”, “lbfgs”}; “learning_rate_init”: [1 × 10⁻⁴, 3 × 10⁻²] (log); “beta_1”: [0.70, 0.99] (only Adam); “epsilon”: [1 × 10⁻⁹, 1 × 10⁻⁶] (only Adam); “momentum”: [0.0, 0.95] (only SGD); “power_t”: [0.30, 0.90] (only SGD); “hidden_layer_sizes”: {10–30, 10–30, 10–30}; “activation”: {“relu”, “tanh”, “logistic”}; “alpha”: [1 × 10⁻⁶, 1 × 10⁻²] (log); “max_iter”: {10, 50}; “batch_size”: {8, 32}; “early_stopping”: {True, False} |
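A configuration drawn from any of these spaces is simply a keyword dictionary handed to the corresponding scikit-learn (or XGBoost) constructor. The sketch below shows this for the RF space; the specific values are one arbitrary draw, not the reported optimum.

```python
# One sampled configuration from the RF search space, passed to the constructor.
from sklearn.ensemble import RandomForestClassifier

rf_params = {
    "n_estimators": 105,
    "max_depth": 37,
    "max_features": None,
    "min_samples_split": 6,
    "min_samples_leaf": 7,
    "min_impurity_decrease": 1e-7,  # sampled on a log scale
    "ccp_alpha": 1e-4,              # sampled on a log scale
    "bootstrap": True,
    "class_weight": None,
}
model = RandomForestClassifier(random_state=42, n_jobs=-1, **rf_params)
```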
| Model | HPO Method | Best Params | Best Score (F1w ⊗ MCC) | Training Time (s) |
|---|---|---|---|---|
| RF | HyperOpt | {‘n_estimators’: 96, ‘max_depth’: 37, ‘max_features’: None, ‘min_samples_split’: 23, ‘min_samples_leaf’: 5, ‘min_impurity_decrease’: 1.2117950042020245 × 10⁻⁵, ‘bootstrap’: True, ‘class_weight’: None} | 0.99532 | 8086.56 |
| | Our method | {‘n_estimators’: 105, ‘max_depth’: None, ‘max_features’: None, ‘min_samples_split’: 6, ‘min_samples_leaf’: 7, ‘min_impurity_decrease’: 2.4218390008497485 × 10⁻⁷, ‘bootstrap’: True, ‘class_weight’: None} | 0.995013 | 1853.59 |
| ADA | HyperOpt | {‘n_estimators’: 105, ‘learning_rate’: 0.646227} | 0.980214 | 3748.65 |
| | Our method | {‘n_estimators’: 116, ‘learning_rate’: 0.845034} | 0.983072 | 2478.84 |
| DT | HyperOpt | {‘max_depth’: 78, ‘max_leaf_nodes’: 22, ‘min_samples_split’: 24, ‘min_samples_leaf’: 6, ‘ccp_alpha’: 0.01039, ‘criterion’: ‘entropy’, ‘splitter’: ‘best’} | 0.984092 | 154.86 |
| | Our method | {‘max_depth’: None, ‘max_leaf_nodes’: 50, ‘min_samples_split’: 11, ‘min_samples_leaf’: 1, ‘ccp_alpha’: 4.377998017830722 × 10⁻⁶, ‘criterion’: ‘entropy’, ‘splitter’: ‘best’} | 0.994370 | 83.99 |
| XGB | HyperOpt | {‘n_estimators’: 137, ‘max_depth’: 8, ‘min_child_weight’: 0.573939, ‘gamma’: 0.170329, ‘reg_alpha’: 1.395383012428995 × 10⁻⁵, ‘reg_lambda’: 2.244989, ‘subsample’: 0.683878, ‘colsample_bytree’: 0.792417, ‘learning_rate’: 0.046230} | 0.993881 | 387.61 |
| | Our method | {‘n_estimators’: 109, ‘max_depth’: 8, ‘min_child_weight’: 0.551940, ‘gamma’: 0.025024, ‘reg_alpha’: 0.1469389, ‘reg_lambda’: 10.0, ‘subsample’: 0.668079, ‘colsample_bytree’: 0.711893, ‘learning_rate’: 0.200} | 0.993237 | 1499.20 |
| MLP | HyperOpt | {‘hidden_layer_sizes’: (30, 11, 25), ‘activation’: ‘tanh’, ‘alpha’: 7.361794916529305 × 10⁻⁶, ‘batch_size’: 10, ‘max_iter’: 30, ‘learning_rate_init’: 0.000562, ‘beta_1’: 0.949618, ‘early_stopping’: True, ‘epsilon’: 5.040415616377387 × 10⁻⁹, ‘solver’: ‘adam’} | 0.989440 | 5033.69 |
| | Our method | {‘hidden_layer_sizes’: (19, 12, 17), ‘activation’: ‘tanh’, ‘alpha’: 0.000117, ‘batch_size’: 15, ‘max_iter’: 44, ‘learning_rate_init’: 0.000492, ‘momentum’: 0.065116, ‘beta_1’: 0.961121, ‘power_t’: 0.838329, ‘early_stopping’: True, ‘epsilon’: 7.930569433855144 × 10⁻⁸, ‘solver’: ‘adam’} | 0.988206 | 3068.92 |
| Model | HPO Method | Best Params | Best Score (F1w ⊗ MCC) | Training Time (s) |
|---|---|---|---|---|
| RF | HyperOpt | {‘n_estimators’: 88, ‘max_depth’: None, ‘max_features’: None, ‘min_samples_split’: 12, ‘min_samples_leaf’: 5, ‘min_impurity_decrease’: 9.427909291245356 × 10⁻⁹, ‘bootstrap’: True, ‘class_weight’: None} | 0.999319 | 3060.93 |
| | Our method | {‘n_estimators’: 81, ‘max_depth’: None, ‘max_features’: ‘sqrt’, ‘min_samples_split’: 5, ‘min_samples_leaf’: 1, ‘min_impurity_decrease’: 4.5520967411891875 × 10⁻⁹, ‘bootstrap’: False, ‘class_weight’: None} | 0.999352 | 966 |
| ADA | HyperOpt | {‘n_estimators’: 139, ‘learning_rate’: 0.966997} | 0.996135 | 2847.36 |
| | Our method | {‘n_estimators’: 106, ‘learning_rate’: 0.728665} | 0.995254 | 1140.78 |
| DT | HyperOpt | {‘max_depth’: 79, ‘max_leaf_nodes’: 36, ‘min_samples_split’: 17, ‘min_samples_leaf’: 8, ‘ccp_alpha’: 0.014595, ‘criterion’: ‘entropy’, ‘splitter’: ‘best’} | 0.994834 | 61.5483 |
| | Our method | {‘max_depth’: 113, ‘max_leaf_nodes’: 46, ‘min_samples_split’: 22, ‘min_samples_leaf’: 8, ‘ccp_alpha’: 6.078083099681951 × 10⁻⁵, ‘criterion’: ‘gini’, ‘splitter’: ‘best’} | 0.998540 | 41.82 |
| XGB | HyperOpt | {‘n_estimators’: 147, ‘max_depth’: 8, ‘min_child_weight’: 0.908644, ‘reg_lambda’: 0.322439, ‘reg_alpha’: 0.112405, ‘gamma’: 0.000749, ‘subsample’: 0.88280, ‘colsample_bytree’: 0.659436, ‘learning_rate’: 0.178872} | 0.999467 | 46.53 |
| | Our method | {‘n_estimators’: 24, ‘max_depth’: 8, ‘min_child_weight’: 0.712459, ‘reg_lambda’: 0.380084, ‘reg_alpha’: 1.536042335195694 × 10⁻⁵, ‘gamma’: 0.020828, ‘subsample’: 0.707942, ‘colsample_bytree’: 0.812375, ‘learning_rate’: 0.2000} | 0.999111 | 181.21 |
| MLP | HyperOpt | {‘hidden_layer_sizes’: (16, 18, 26), ‘max_iter’: 37, ‘batch_size’: 12, ‘alpha’: 1.4382949513732408 × 10⁻⁵, ‘solver’: ‘adam’, ‘learning_rate_init’: 0.00429, ‘beta_1’: 0.757296, ‘epsilon’: 5.809266354888556 × 10⁻⁹, ‘activation’: ‘logistic’, ‘early_stopping’: True} | 0.997620 | 3027.20 |
| | Our method | {‘hidden_layer_sizes’: (15, 11, 11), ‘max_iter’: 41, ‘batch_size’: 9, ‘alpha’: 6.17411488067891 × 10⁻⁶, ‘solver’: ‘adam’, ‘learning_rate’: ‘constant’, ‘learning_rate_init’: 0.00082, ‘beta_1’: 0.701612, ‘epsilon’: 6.855748994731211 × 10⁻⁷, ‘momentum’: 0.819165, ‘power_t’: 0.388781, ‘activation’: ‘tanh’, ‘early_stopping’: True} | 0.996975 | 3133.23 |
| Model | HPO Method | Accuracy (CICIoT2023) | F1-Score (CICIoT2023) | MCC (CICIoT2023) | Inference Time (CICIoT2023) | Accuracy (CIC-DDoS2019) | F1-Score (CIC-DDoS2019) | MCC (CIC-DDoS2019) | Inference Time (CIC-DDoS2019) |
|---|---|---|---|---|---|---|---|---|---|
| RF | HyperOpt | 0.995405 | 0.995311 | 0.989466 | 0.571676 | 0.999272 | 0.999272 | 0.997919 | 0.593505 |
| | Our method | 0.995470 | 0.995379 | 0.989612 | 0.934244 | 0.999499 | 0.999499 | 0.998566 | 0.476788 |
| ADA | HyperOpt | 0.981622 | 0.978839 | 0.957784 | 1.462640 | 0.995678 | 0.995681 | 0.987664 | 1.700661 |
| | Our method | 0.985809 | 0.984441 | 0.967489 | 1.533813 | 0.995928 | 0.995933 | 0.988386 | 1.074964 |
| DT | HyperOpt | 0.984299 | 0.980654 | 0.964263 | 0.476859 | 0.994615 | 0.994617 | 0.98461 | 0.111519 |
| | Our method | 0.995009 | 0.994903 | 0.988564 | 0.343606 | 0.998496 | 0.998496 | 0.995703 | 0.107510 |
| XGB | HyperOpt | 0.994045 | 0.993695 | 0.986349 | 0.274208 | 0.999331 | 0.999331 | 0.998088 | 0.115514 |
| | Our method | 0.993949 | 0.993642 | 0.986128 | 0.268540 | 0.999152 | 0.999152 | 0.997577 | 0.099511 |
| MLP | HyperOpt | 0.989858 | 0.989420 | 0.976754 | 0.403589 | 0.997899 | 0.997900 | 0.994002 | 0.508512 |
| | Our method | 0.989708 | 0.989149 | 0.976398 | 0.482332 | 0.997767 | 0.997769 | 0.993628 | 0.149718 |
| Ref | Method | Dataset | Descriptors | Accuracy | F1-Score | MCC | HPO |
|---|---|---|---|---|---|---|---|
| [32] | LSTM | BOTIoT | 20 | 0.9800 | 0.9900 | - | Not reported |
| | | TON-IoT | | 0.9900 | 0.9900 | - | |
| [33] | CNN inception model | CIC-DDoS2019 | 43 | 0.9682 | 0.9650 | - | The HPO method has not been reported. A summary table of hyperparameters is presented. |
| [34] | Hybrid (CNN + LSTM) | CIC-DDoS2019 | 26 | - | 100 *, 97+ | - | RandomSearch |
| [35] | DecisionTree | MBB-IoT | 30 | 0.9998 | - | - | Not reported |
| [36] | DecisionTree + KG Smote | MBB-IoT | 77 | - | By class only (no overall F1 is reported) | - | Not reported |
| Our study | RandomForest | CICIoT2023 | 19 | 0.9955 | 0.9954 | 0.9896 | Proposed method |
| | | CIC-DDoS2019 | 18 | 0.9995 | 0.9995 | 0.9986 | |