Multi-Class Intrusion Detection in Internet of Vehicles: Optimizing Machine Learning Models on Imbalanced Data
Abstract
1. Introduction
- Research question (RQ): How can machine learning models be optimized to handle imbalanced, multi-class IoV traffic data and perform robustly across different attack scenarios?
- CICIoV2024 benchmark analysis: Random Forest, AdaBoost, Logistic Regression, and a Deep Neural Network (DNN) were thoroughly evaluated on the CICIoV2024 dataset without any data balancing strategy, so that the analysis and results reflect real-world, imbalanced traffic conditions.
- Additional benchmark models: Two ensemble models, Extra Trees and XGBoost, were added to the benchmark, allowing a more extensive comparison.
- Advanced optimization: Optuna automates the search for optimal hyperparameters, and early stopping mechanisms reduce the risk of overfitting, so that the models are computationally efficient as well as accurate.
- Benchmark improvement: By adding further algorithms and tuning with Optuna, we improved the results of every model in the existing benchmark and extended it with two more models.
- Multi-class performance analysis: This study highlights each model’s strengths and weaknesses in detecting minority classes by analyzing performance across several attack scenarios: DoS, gas spoofing, speed spoofing, steering wheel spoofing, and RPM spoofing.
- Broader implications beyond security: Although intrusion detection systems (IDS) provide the foundational framework, this research also advances our knowledge of machine learning techniques for real-time data processing and for building resource-efficient models for Internet of Vehicles systems.
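The early stopping mentioned in the contributions above can be illustrated with a patience-based rule. This is a minimal, generic sketch and not the authors' implementation: the `EarlyStopper` class, the `patience` and `min_delta` parameters, and the validation-loss values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EarlyStopper:
    """Stop training once validation loss has not improved for `patience` epochs."""
    patience: int = 3          # hypothetical default, not taken from the paper
    min_delta: float = 0.0     # smallest decrease that counts as an improvement
    best_loss: float = float("inf")
    stale_epochs: int = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def epochs_run(val_losses, patience=3):
    """Return the 1-based epoch at which training would halt."""
    stopper = EarlyStopper(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)
```

With a rule like this, the number of epochs actually trained depends on the validation curve rather than on a fixed budget, which is one way a tuned `epochs` value can act as an upper bound only.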
2. Related Work
3. Methodology
3.1. Experimental Setup
3.2. Dataset
3.3. Data Preprocessing
3.4. Optuna Optimization
3.5. Training Methods
3.5.1. Train–Test Split Approach
3.5.2. 10-Fold Cross-Validation Approach
4. Experimental Evaluation
4.1. Train–Test Split Approach
4.2. 10-Fold Cross-Validation Approach
5. Discussion
6. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Taslimasa, H.; Dadkhah, S.; Neto, E.C.P.; Xiong, P.; Ray, S.; Ghorbani, A.A. Security issues in Internet of Vehicles (IoV): A comprehensive survey. Internet Things 2023, 22, 100809.
- Gong, W.; Yang, S.; Guang, H.; Ma, B.; Zheng, B.; Shi, Y.; Li, B.; Cao, Y. Multi-order feature interaction-aware intrusion detection scheme for ensuring cyber security of intelligent connected vehicles. Eng. Appl. Artif. Intell. 2024, 135, 108815.
- Mehedi, S.T.; Anwar, A.; Rahman, Z.; Ahmed, K. Deep transfer learning based intrusion detection system for electric vehicular networks. Sensors 2021, 21, 4736.
- Wang, S.; Zheng, B.; Liu, Z.; Fan, Z.; Liu, Y.; Dai, Y. A Lightweight Intrusion Detection System for Vehicular Networks Based on an Improved ViT Model. IEEE Access 2024, 12, 118842–118856.
- Moulahi, T.; Zidi, S.; Alabdulatif, A.; Atiquzzaman, M. Comparative Performance Evaluation of Intrusion Detection Based on Machine Learning in In-Vehicle Controller Area Network Bus. IEEE Access 2021, 9, 99595–99605.
- Nagarajan, J.; Mansourian, P.; Shahid, M.A.; Jaekel, A.; Saini, I.; Zhang, N.; Kneppers, M. Machine Learning based intrusion detection systems for connected autonomous vehicles: A survey. Peer-to-Peer Netw. Appl. 2023, 16, 2153–2185.
- Aloraini, F.; Javed, A.; Rana, O. Adversarial Attacks on Intrusion Detection Systems in In-Vehicle Networks of Connected and Autonomous Vehicles. Sensors 2024, 24, 3848.
- Neto, E.C.P.; Taslimasa, H.; Dadkhah, S.; Iqbal, S.; Xiong, P.; Rahman, T.; Ghorbani, A.A. CICIoV2024: Advancing realistic IDS approaches against DoS and spoofing attack in IoV CAN bus. Internet Things 2024, 26, 101209.
- Cheng, P.; Xu, K.; Li, S.; Han, M. TCAN-IDS: Intrusion Detection System for Internet of Vehicle Using Temporal Convolutional Attention Network. Symmetry 2022, 14, 310.
- El-Gayar, M.M.; Alrslani, F.A.; El-Sappagh, S. Smart Collaborative Intrusion Detection System for Securing Vehicular Networks Using Ensemble Machine Learning Model. Information 2024, 15, 583.
- Yang, L.; Moubayed, A.; Shami, A. MTH-IDS: A Multitiered Hybrid Intrusion Detection System for Internet of Vehicles. IEEE Internet Things J. 2022, 9, 616–632.
- Wang, S.; Wang, Y.; Zheng, B.; Cheng, J.; Su, Y.; Dai, Y. Intrusion Detection System for Vehicular Networks Based on MobileNetV3. IEEE Access 2024, 12, 106285–106302.
- Almehdhar, M.; Albaseer, A.; Khan, M.A.; Abdallah, M.; Menouar, H.; Al-Kuwari, S.; Al-Fuqaha, A. Deep Learning in the Fast Lane: A Survey on Advanced Intrusion Detection Systems for Intelligent Vehicle Networks. IEEE Open J. Veh. Technol. 2024, 5, 869–906.
- Korba, A.A.; Sebaa, S.; Mabrouki, M.; Ghamri-Doudane, Y.; Benatchba, K. A Life-long Learning Intrusion Detection System for 6G-Enabled IoV. In Proceedings of the 20th International Wireless Communications and Mobile Computing Conference, IWCMC 2024, Ayia Napa, Cyprus, 27–31 May 2024; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2024; pp. 1773–1778.
- Qin, J.; Xun, Y.; Liu, J. CVMIDS: Cloud-Vehicle Collaborative Intrusion Detection System for Internet of Vehicles. IEEE Internet Things J. 2024, 11, 321–332.
- Yang, L.; Shami, A. A Transfer Learning and Optimized CNN Based Intrusion Detection System for Internet of Vehicles. In Proceedings of the IEEE International Conference on Communications, Seoul, Republic of Korea, 16–20 May 2022; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2022; Volume 2022, pp. 2774–2779.
- Gul, M.F.; Bakir, H. Improving Attack Detection in IoV Systems using GA-based Hyperparameter Optimization. In Proceedings of the 8th International Artificial Intelligence and Data Processing Symposium, IDAP 2024, Malatya, Türkiye, 21–22 September 2024; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2024.
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019, Anchorage, AK, USA, 4–8 August 2019; Teredesai, A., Kumar, V., Li, Y., Rosales, R., Terzi, E., Karypis, G., Eds.; ACM: New York, NY, USA, 2019; pp. 2623–2631.
- Jin, F.; Chen, M.; Zhang, W.; Yuan, Y.; Wang, S. Intrusion detection on internet of vehicles via combining log-ratio oversampling, outlier detection and metric learning. Inf. Sci. 2021, 579, 814–831.
- Singh, A.P.; Chaurasia, B.K.; Tripathi, A. Stacking Enabled Ensemble Learning Based Intrusion Detection Scheme (SELIDS) for IoV. SN Comput. Sci. 2024, 5, 1000.
- Du, L.; Gu, Z.; Wang, Y.; Gao, C. Open World Intrusion Detection: An Open Set Recognition Method for CAN Bus in Intelligent Connected Vehicles. IEEE Netw. 2024, 38, 76–82.
- Ullah, S.; Khan, M.A.; Ahmad, J.; Jamal, S.S.; Huma, Z.E.; Hassan, M.T.; Pitropakis, N.; Arshad; Buchanan, W.J. HDL-IDS: A Hybrid Deep Learning Architecture for Intrusion Detection in the Internet of Vehicles. Sensors 2022, 22, 1340.
- Wang, Y.; Qin, G.; Zou, M.; Liang, Y.; Wang, G.; Wang, K.; Feng, Y.; Zhang, Z. A lightweight intrusion detection system for internet of vehicles based on transfer learning and MobileNetV2 with hyper-parameter optimization. Multimed. Tools Appl. 2024, 83, 22347–22369.
- Fu, M.; Wang, P.; Liu, M.; Zhang, Z.; Zhou, X. IoV-BERT-IDS: Hybrid Network Intrusion Detection System in IoV Using Large Language Models. IEEE Trans. Veh. Technol. 2024, 74, 1909–1921.
- Wang, W.; Sun, D. The improved AdaBoost algorithms for imbalanced data classification. Inf. Sci. 2021, 563, 358–374.
- Rahman, H.A.A.; Wah, Y.B.; He, H.; Bulgiba, A. Comparisons of ADABOOST, KNN, SVM and Logistic Regression in Classification of Imbalanced Dataset. In Proceedings of the Soft Computing in Data Science, Putrajaya, Malaysia, 2–3 September 2015; pp. 54–64.
Hardware | Software |
---|---|
Processor: Intel® Core™ i7-8750H CPU @ 2.20 GHz, 2.21 GHz | Windows 11, 64-bit |
RAM: 16.00 GB, SSD: 1 TB + 256 GB | Jupyter Lab 4.2.5 |
Graphics card: Nvidia® GeForce™ GTX 1050 Ti, 4 GB VRAM | Python 3.12.7 |
 | Optuna 4.0.0 |
Feature | Description |
---|---|
ID | Represents the message priority and specifies the type of data being transmitted—not unique |
DATA_0 | Byte 0 of the transmitted data |
DATA_1 | Byte 1 of the transmitted data |
DATA_2 | Byte 2 of the transmitted data |
DATA_3 | Byte 3 of the transmitted data |
DATA_4 | Byte 4 of the transmitted data |
DATA_5 | Byte 5 of the transmitted data |
DATA_6 | Byte 6 of the transmitted data |
DATA_7 | Byte 7 of the transmitted data |
label | Indicates whether the traffic is benign or an attack |
category | Specifies the traffic classification as benign, DoS, or spoofing |
specific_class | Specifies the precise class of the traffic: benign, DoS, spoofing-GAS, spoofing-RPM, spoofing-SPEED, or spoofing-STEERING_WHEEL |
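
The column layout in the table above can be turned into model inputs along these lines. This is a sketch under assumptions: it presumes a decimal CSV export with these exact column names, and the sample rows and label strings in `SAMPLE_CSV` are invented for illustration and are not records from the dataset.

```python
import csv
import io

# Nine numeric inputs: the CAN identifier plus the eight payload bytes.
FEATURE_COLUMNS = ["ID"] + [f"DATA_{i}" for i in range(8)]
# Multi-class target; "label" and "category" are coarser views of the same frame.
TARGET_COLUMN = "specific_class"

# Hypothetical rows for illustration only.
SAMPLE_CSV = """\
ID,DATA_0,DATA_1,DATA_2,DATA_3,DATA_4,DATA_5,DATA_6,DATA_7,label,category,specific_class
65,96,0,0,0,0,0,0,0,BENIGN,BENIGN,BENIGN
291,0,0,0,0,0,0,0,0,ATTACK,DoS,DoS
513,255,12,0,0,0,0,0,0,ATTACK,SPOOFING,GAS
"""


def load_frames(text):
    """Parse CSV text into a list of feature vectors and a list of class labels."""
    features, targets = [], []
    for row in csv.DictReader(io.StringIO(text)):
        vector = [int(row[col]) for col in FEATURE_COLUMNS]
        # Payload entries are single bytes, so anything outside 0..255 is malformed.
        if any(not 0 <= byte <= 255 for byte in vector[1:]):
            raise ValueError(f"payload byte out of range in row: {row}")
        features.append(vector)
        targets.append(row[TARGET_COLUMN])
    return features, targets
```

Dropping `label` and `category` from the inputs matters here: both are derived from the class being predicted, so keeping them as features would leak the target.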
Model | Hyperparameter | Split | 10-Fold |
---|---|---|---|
RF | n_estimators | 112 | 57 |
RF | max_depth | 25 | 8 |
RF | min_samples_split | 2 | 6 |
AdaBoost | n_estimators | 169 | 91 |
AdaBoost | learning_rate | 0.22031826639728547 | 0.868649991657305 |
XGBoost | n_estimators | 346 | 338 |
XGBoost | max_depth | 4 | 6 |
XGBoost | learning_rate | 0.08257351184613305 | 0.12486607179583395 |
XGBoost | subsample | 0.820421224719867 | 0.8398820455022838 |
XGBoost | colsample_bytree | 0.6442582375605579 | 0.8885000829639664 |
LR | C | 0.43873275517235 | 3.2376253329526534 |
LR | penalty | l1 | l1 |
LR | solver | saga | saga |
ET | n_estimators | 94 | 50 |
ET | max_depth | 16 | 34 |
DNN | layers | 4 | 4 |
DNN | units | 128 | 96 |
DNN | dropout_rate | 0.2774088047541644 | 0.24552040101739658 |
DNN | learning_rate | 0.0009501230980605523 | 0.00010682901273334459 |
DNN | batch_size | 112 | 128 |
DNN | epochs | 45 | 34 |
Models | Train/Test Split: Optuna (s) | Train/Test Split: Training (s) | 10-Fold CV: Optuna (s) | 10-Fold CV: Training (s) |
---|---|---|---|---|
Random Forest | 1830 | 120 | 1971 | 238 |
AdaBoost | 5620 | 191 | 5935 | 496 |
XGBoost | 1820 | 106 | 3512 | 1532 |
Logistic Regression | 1745 | 1116 | 3997 | 3647 |
Extra Trees | 1819 | 47 | 3271 | 221 |
Deep Neural Networks | 11,010 | 1022 | 40,387 | 5104 |
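
To put the timings above in perspective, the seconds can be converted to hours and split into search versus final training. The sketch below only re-expresses the table's numbers; the `summarize` helper and its output keys are hypothetical names.

```python
# (split Optuna, split training, 10-fold Optuna, 10-fold training), all in seconds,
# copied from the timing table.
TIMES_S = {
    "Random Forest": (1830, 120, 1971, 238),
    "AdaBoost": (5620, 191, 5935, 496),
    "XGBoost": (1820, 106, 3512, 1532),
    "Logistic Regression": (1745, 1116, 3997, 3647),
    "Extra Trees": (1819, 47, 3271, 221),
    "Deep Neural Networks": (11010, 1022, 40387, 5104),
}


def summarize(times):
    """Return total hours per approach and the share of 10-fold time spent searching."""
    summary = {}
    for model, (split_opt, split_train, cv_opt, cv_train) in times.items():
        summary[model] = {
            "split_total_h": (split_opt + split_train) / 3600,
            "cv_total_h": (cv_opt + cv_train) / 3600,
            "cv_search_share": cv_opt / (cv_opt + cv_train),
        }
    return summary
```

Under this arithmetic the DNN's 10-fold run totals about 12.6 hours, most of it spent in the hyperparameter search rather than in final training.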
Models | Neto et al. [8] Acc. | Neto et al. [8] Rec. | Neto et al. [8] Prec. | Neto et al. [8] F1 | Gul and Bakir [17] Acc. | Gul and Bakir [17] Rec. | Gul and Bakir [17] Prec. | Gul and Bakir [17] F1 | Our Study Acc. | Our Study Rec. | Our Study Prec. | Our Study F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
LR | 0.89 | 0.50 | 0.48 | 0.49 | 0.90 | 0.60 | 0.72 | 0.62 | 0.97 | 0.86 | 0.98 | 0.88 |
AdaBoost | 0.92 | 0.66 | 0.48 | 0.51 | 0.97 | 0.73 | 0.75 | 0.72 | 0.98 | 0.82 | 0.80 | 0.81 |
RF | 0.96 | 0.76 | 0.76 | 0.76 | 1.00 | 0.97 | 0.99 | 0.97 | 1.00 | 1.00 | 1.00 | 1.00 |
DNN | 0.96 | 0.76 | 0.83 | 0.78 | 0.99 | 0.97 | 0.97 | 0.97 | 1.00 | 1.00 | 1.00 | 1.00 |
XGBoost | – | – | – | – | – | – | – | – | 1.00 | 1.00 | 1.00 | 1.00 |
ET | – | – | – | – | – | – | – | – | 1.00 | 1.00 | 1.00 | 1.00 |
Model | Precision | Recall | Accuracy | F1-Score |
---|---|---|---|---|
Random Forest | 1.00 | 1.00 | 1.00 | 1.00 |
AdaBoost | 0.30 | 0.33 | 0.92 | 0.32 |
XGBoost | 1.00 | 1.00 | 1.00 | 1.00 |
Logistic Regression | 0.98 | 0.86 | 0.97 | 0.88 |
Extra Trees | 1.00 | 1.00 | 1.00 | 1.00 |
Deep Neural Networks | 1.00 | 1.00 | 1.00 | 1.00 |
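
A row such as AdaBoost's, with high accuracy but low precision, recall, and F1, is possible on imbalanced data when the class-averaged scores weight every class equally. The sketch below assumes macro averaging, which the table does not state, and uses an invented 100-sample label distribution; it illustrates the arithmetic and is not a reconstruction of the study's predictions.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus unweighted (macro) mean of per-class precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        p = tp[label] / predicted if predicted else 0.0
        r = tp[label] / actual if actual else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical imbalanced sample: a classifier that always answers "benign".
y_true = ["benign"] * 92 + ["DoS"] * 5 + ["spoofing"] * 3
y_pred = ["benign"] * 100
scores = macro_metrics(y_true, y_pred)
```

In this invented case accuracy is 0.92 while macro recall is one third, because two of the three classes are never detected. Whether the same mechanism explains the AdaBoost row is not established by the table itself.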
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Palma, Á.; Antunes, M.; Bernardino, J.; Alves, A. Multi-Class Intrusion Detection in Internet of Vehicles: Optimizing Machine Learning Models on Imbalanced Data. Future Internet 2025, 17, 162. https://doi.org/10.3390/fi17040162