Research on Intrusion Detection Method Based on Transformer and CNN-BiLSTM in Internet of Things
Abstract
1. Introduction
- To overcome the limitations of single models in fully capturing the diverse characteristics of IoT traffic, this study proposes an intrusion detection model based on CNN-BiLSTM-Transformer. The model uses CNN to extract local features, BiLSTM to model temporal dependencies, and Transformer to integrate global relationships.
- To address the issue of data imbalance, Borderline-SMOTE is employed to oversample minority classes. In addition, Isolation Forest and Local Outlier Factor (LOF) are applied to reduce noise in the dataset.
- To address the many redundant features in the raw datasets, this research applies three feature-selection strategies: XGBoost, Chi-square (Chi2), and Mutual Information. These select the critical features so that the model focuses on the most discriminative ones.
2. Intrusion Detection Method
2.1. Overall Framework of the Intrusion Detection Model
2.2. Introduction to the Dataset
2.3. Data Preprocessing
2.4. Feature Selection
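
The contribution list names three feature-scoring strategies (XGBoost, Chi2, Mutual Information), but the text here does not say how their rankings are combined. The following is a minimal sketch of one possible combination, not the authors' procedure. Everything specific in it is an assumption: the synthetic data from `make_classification`, the cut-off `k = 10`, the two-of-three voting rule, and the use of scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost so that the example needs only scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in data: 600 samples, 20 features, binary labels.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           n_redundant=5, random_state=0)
X = MinMaxScaler().fit_transform(X)  # chi2 requires non-negative inputs


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    return set(np.argsort(scores)[::-1][:k].tolist())


k = 10  # assumed cut-off per scorer

chi2_scores, _ = chi2(X, y)
mi_scores = mutual_info_classif(X, y, random_state=0)
# Stand-in for XGBoost feature importances (assumption, see lead-in).
gb_scores = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y).feature_importances_

# Assumed merge rule: keep a feature if at least two of the three scorers
# place it in their top k.
votes = np.zeros(X.shape[1], dtype=int)
for scores in (chi2_scores, mi_scores, gb_scores):
    for idx in top_k(scores, k):
        votes[idx] += 1

selected = np.flatnonzero(votes >= 2)
X_selected = X[:, selected]
```

A voting rule of this kind keeps features that more than one criterion agrees on; taking the union or the intersection of the three top-k sets would be equally plausible alternatives.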
2.5. Class Imbalance Handling
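
The contribution list states that Borderline-SMOTE oversamples the minority classes and that Isolation Forest and Local Outlier Factor (LOF) reduce noise. The sketch below illustrates both steps on toy data under stated assumptions: the 2-D synthetic points, the contamination rate, the order of the two steps, and the rule "keep a sample only if both detectors accept it" are all hypothetical. The oversampler is a simplified hand-written version of the Borderline-SMOTE1 idea, written out so the example depends only on NumPy and scikit-learn; in practice a maintained implementation such as imbalanced-learn's `BorderlineSMOTE` would be the usual choice.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor, NearestNeighbors

rng = np.random.default_rng(0)
# Hypothetical toy data: 400 majority (label 0) and 60 minority (label 1) points.
X = np.vstack([rng.normal(0.0, 1.0, (400, 2)), rng.normal(2.0, 1.0, (60, 2))])
y = np.array([0] * 400 + [1] * 60)

# Noise reduction: keep a sample only if both detectors label it an inlier (+1).
iso_ok = IsolationForest(contamination=0.03, random_state=0).fit_predict(X) == 1
lof_ok = LocalOutlierFactor(n_neighbors=20, contamination=0.03).fit_predict(X) == 1
keep = iso_ok & lof_ok
X_clean, y_clean = X[keep], y[keep]


def borderline_smote(X, y, minority=1, n_new=100, m=10, k=5, seed=0):
    """Simplified Borderline-SMOTE1: interpolate only from 'danger' minority points."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    # m nearest neighbours of each minority point in the full set
    # (column 0 is the point itself, so it is dropped).
    neigh = NearestNeighbors(n_neighbors=m + 1).fit(X).kneighbors(
        X_min, return_distance=False)[:, 1:]
    n_major = (y[neigh] != minority).sum(axis=1)
    # Danger points: at least half, but not all, neighbours belong to other classes.
    danger = X_min[(n_major >= m / 2) & (n_major < m)]
    if len(danger) == 0:
        return X, y  # nothing on the border, so nothing to synthesize
    # k nearest minority neighbours of each danger point (again dropping itself).
    neigh_min = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(
        danger, return_distance=False)[:, 1:]
    base = rng.integers(0, len(danger), n_new)          # which danger point to start from
    pick = neigh_min[base, rng.integers(0, k, n_new)]   # which minority neighbour to move toward
    gap = rng.random((n_new, 1))                        # interpolation factor in [0, 1)
    synthetic = danger[base] + gap * (X_min[pick] - danger[base])
    return np.vstack([X, synthetic]), np.concatenate([y, np.full(n_new, minority)])


X_bal, y_bal = borderline_smote(X_clean, y_clean)
```

Restricting synthesis to border points concentrates the new samples where the classes are hardest to separate, which is the motivation for Borderline-SMOTE over plain SMOTE.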
2.6. CNN-BiLSTM-Transformer Model
- 1D convolutional layer (Conv1d) for feature extraction.
- Activation function for non-linear transformation.
- Max pooling (MaxPool1d) to reduce matrix dimensions.
- Batch normalization (BatchNorm1d) to accelerate training and stabilize learning.
- Flatten operation to convert features into a 1D vector [32].
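
The components listed above, together with the BiLSTM and Transformer stages named in the contribution list, can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' architecture: the class names `CNNBlock` and `HybridIDS`, the channel count, kernel size, hidden size, number of attention heads, and the single encoder layer are all hypothetical. One further assumption is that the Flatten step is deferred until after the Transformer, so that the BiLSTM and Transformer still receive a sequence.

```python
import torch
from torch import nn


class CNNBlock(nn.Module):
    """Conv1d -> activation -> MaxPool1d -> BatchNorm1d, in the order listed above."""

    def __init__(self, in_channels=1, out_channels=32, kernel_size=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, out_channels, kernel_size, padding=kernel_size // 2),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2),   # halves the sequence length
            nn.BatchNorm1d(out_channels),
        )

    def forward(self, x):                  # x: (batch, in_channels, n_features)
        return self.net(x)                 # (batch, out_channels, n_features // 2)


class HybridIDS(nn.Module):
    """Hypothetical CNN -> BiLSTM -> Transformer encoder -> linear head."""

    def __init__(self, n_features, channels=32, hidden=64, heads=4):
        super().__init__()
        self.cnn = CNNBlock(1, channels)
        self.bilstm = nn.LSTM(channels, hidden, batch_first=True, bidirectional=True)
        layer = nn.TransformerEncoderLayer(d_model=2 * hidden, nhead=heads,
                                           dim_feedforward=128, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=1)
        self.flatten = nn.Flatten()
        self.head = nn.Linear(2 * hidden * (n_features // 2), 1)

    def forward(self, x):                  # x: (batch, n_features)
        z = self.cnn(x.unsqueeze(1))       # local features: (batch, channels, L)
        z = z.transpose(1, 2)              # (batch, L, channels) for the LSTM
        z, _ = self.bilstm(z)              # temporal context: (batch, L, 2 * hidden)
        z = self.transformer(z)            # global self-attention over the L positions
        return self.head(self.flatten(z)).squeeze(-1)  # one raw logit per sample


model = HybridIDS(n_features=20)
logits = model(torch.randn(8, 20))         # 8 samples with 20 selected features each
probs = torch.sigmoid(logits)              # attack probability per sample
```

Returning raw logits rather than probabilities lets the loss function apply the sigmoid in a numerically stable way during training.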
3. Experimental Results and Analysis
3.1. Experimental Environment and Parameter Settings
3.2. Experimental Results Analysis
3.3. Comparative Experiments
3.4. Ablation Experiments
3.5. Analysis of Experimental Results on the BoT-IoT Dataset
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Udurume, M.; Shakhov, V.; Koo, I. Comparative Analysis of Deep Convolutional Neural Network—Bidirectional Long Short-Term Memory and Machine Learning Methods in Intrusion Detection Systems. Appl. Sci. 2024, 14, 6967.
- Hnamte, V.; Hussain, J. DCNNBiLSTM: An efficient hybrid deep learning-based intrusion detection system. Telemat. Inform. Rep. 2023, 10, 100053.
- Wang, B.; Sang, Y.; Zhang, Y.; Li, S.; Ge, R.; Ding, Y. A Longitudinal Measurement and Analysis of Pink, a Hybrid P2P IoT Botnet. In Proceedings of the International Conference on Collaborative Computing: Networking, Applications and Worksharing, Hangzhou, China, 15–16 October 2022; Springer Nature: Cham, Switzerland, 2022; pp. 419–436.
- Yang, B. The “Eighth-Second Difficulty” of Black Myth: Wukong. 21st Century Business Herald, 30 August 2024; p. 8.
- Guang, Y. Behind the Massive Malicious Attack on DeepSeek. China Informatization Weekly, 17 February 2025; p. 14.
- Gao, J. Network intrusion detection method combining CNN and BiLSTM in cloud computing environment. Comput. Intell. Neurosci. 2022, 2022, 7272479.
- Omarov, B.; Sailaukyzy, Z.; Bigaliyeva, A.; Kereyev, A.; Naizabayeva, L.; Dautbayeva, A. One Dimensional Conv-BiLSTM Network with Attention Mechanism for IoT Intrusion Detection. Comput. Mater. Contin. 2023, 77, 3765.
- Wang, J.; Si, C.; Wang, Z.; Fu, Q. A New Industrial Intrusion Detection Method Based on CNN-BiLSTM. Comput. Mater. Contin. 2024, 79, 4297.
- AlSaleh, I.; Al-Samawi, A.; Nissirat, L. Novel Machine Learning Approach for DDoS Cloud Detection: Bayesian-Based CNN and Data Fusion Enhancements. Sensors 2024, 24, 1418.
- Wei, W.; Chen, Y.; Lin, Q.; Ji, J.; Wong, K.C.; Li, J. Multi-objective evolving long–short term memory networks with attention for network intrusion detection. Appl. Soft Comput. 2023, 139, 110216.
- Xi, C.; Wang, H.; Wang, X. A novel multi-scale network intrusion detection model with transformer. Sci. Rep. 2024, 14, 23239.
- Xiang, R.; Li, S.; Pan, J. A Novel IoT Intrusion Detection Model Using 2dCNN-BiLSTM. Radioengineering 2024, 33, 236–245.
- Bamber, S.S.; Katkuri, A.V.R.; Sharma, S.; Angurala, M. A hybrid CNN-LSTM approach for intelligent cyber intrusion detection system. Comput. Secur. 2025, 148, 104146.
- Altunay, H.C.; Albayrak, Z. A hybrid CNN+LSTM-based intrusion detection system for industrial IoT networks. Eng. Sci. Technol. Int. J. 2023, 38, 101322.
- Akuthota, U.C.; Bhargava, L. Transformer Based Intrusion Detection for IoT Networks. IEEE Internet Things J. 2025, 12, 6062–6067.
- Wang, S.; Xu, W.; Liu, Y. Res-TranBiLSTM: An intelligent approach for intrusion detection in the Internet of Things. Comput. Netw. 2023, 235, 109982.
- Xu, H.; Sun, L.; Fan, G.; Li, W.; Kuang, G. A hierarchical intrusion detection model combining multiple deep learning models with attention mechanism. IEEE Access 2023, 11, 66212–66226.
- Manan, I.; Rehman, F.; Sharif, H.; Ali, C.N.; Ali, R.R.; Liaqat, A. Cyber Security Intrusion Detection Using Deep Learning Approaches, Datasets, Bot-IoT Dataset. In Proceedings of the 2023 4th International Conference on Advancements in Computational Sciences (ICACS), Lahore, Pakistan, 20–22 February 2023; IEEE: New York, NY, USA, 2023; pp. 1–5.
- Ren, K.; Yuan, S.; Zhang, C.; Shi, Y.; Huang, Z. CANET: A hierarchical cnn-attention model for network intrusion detection. Comput. Commun. 2023, 205, 170–181.
- Lan, M.; Luo, J.; Chai, S.; Chai, R.; Zhang, C.; Zhang, B. A novel industrial intrusion detection method based on threshold-optimized CNN-BiLSTM-Attention using ROC curve. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 7384–7389.
- Zhang, B.; Zhang, Y.; Jiang, X. Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm. Sci. Rep. 2022, 12, 9244.
- Gad, A.R.; Nashat, A.A.; Barkat, T.M. Intrusion detection system using machine learning for vehicular ad hoc networks based on ToN-IoT dataset. IEEE Access 2021, 9, 142206–142217.
- Ling, Z.; Hao, Z.J. Intrusion detection using normalized mutual information feature selection and parallel quantum genetic algorithm. Int. J. Semant. Web Inf. Syst. 2022, 18, 1–24.
- Chen, X.; Gong, Z.; Huang, D.; Jiang, N.; Zhang, Y. Overcoming Class Imbalance in Network Intrusion Detection: A Gaussian Mixture Model and ADASYN Augmented Deep Learning Framework. In Proceedings of the 2024 4th International Conference on Internet of Things and Machine Learning, Nanchang, China, 9–11 August 2024; pp. 48–53.
- Hu, C.; Deng, R.; Hu, X.; He, M.; Zhao, H.; Jiang, X. An automatic methodology for lithology identification in a tight sandstone reservoir using a bidirectional long short-term memory network combined with Borderline-SMOTE. Acta Geophys. 2024, 1–17.
- Ma, H.; Ghojogh, B.; Samad, M.N.; Zheng, D.; Crowley, M. Isolation Mondrian forest for batch and online anomaly detection. In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Toronto, ON, Canada, 11–14 October 2020; pp. 3051–3058.
- Jeong, W.; Tsingas, C.; Almubarak, M.S. Local outlier factor as part of a workflow for detecting and attenuating blending noise in simultaneously acquired data. Geophys. Prospect. 2020, 68, 1523–1539.
- Oseni, A.; Moustafa, N.; Creech, G.; Sohrabi, N.; Strelzoff, A.; Tari, Z.; Linkov, I. An explainable deep learning framework for resilient intrusion detection in IoT-enabled transportation networks. IEEE Trans. Intell. Transp. Syst. 2022, 24, 1000–1014.
- Alhassan, A.M. Self-Adaptive Lightweight Attention Module-Based BiLSTM Model for Effective Intrusion Detection. Arab. J. Sci. Eng. 2024, 1–26.
- Long, Z.; Yan, H.; Shen, G.; Zhang, X.; He, H.; Cheng, L. A Transformer-based network intrusion detection approach for cloud security. J. Cloud Comput. 2024, 13, 5.
- Yao, R.; Wang, N.; Liu, Z.; Chen, P.; Sheng, X. Intrusion detection system in the advanced metering infrastructure: A cross-layer feature-fusion CNN-LSTM-based approach. Sensors 2021, 21, 626.
- Priyadarshini, I.; Mohanty, P.; Alkhayyat, A.; Sharma, R.; Kumar, S. SDN and application layer DDoS attacks detection in IoT devices by attention-based Bi-LSTM-CNN. Trans. Emerg. Telecommun. Technol. 2023, 34, e4758.
- Dai, W.; Li, X.; Ji, W.; He, S. Network Intrusion Detection Method Based on CNN, BiLSTM, and Attention Mechanism. IEEE Access 2024, 12, 53099–53111.
- Moussavou Boussougou, M.K.; Park, D.J. Attention-based 1D CNN-BiLSTM hybrid model enhanced with fasttext word embedding for Korean voice phishing detection. Mathematics 2023, 11, 3217.
- Peng, H.; Wu, C.; Xiao, Y. A BiLSTM-Based IoT Intrusion Detection System with Mutual Information and Focal Loss. In Proceedings of the 2024 6th International Conference on Frontier Technologies of Information and Computer (ICFTIC), Qingdao, China, 13–15 December 2024; IEEE: New York, NY, USA, 2024; pp. 1–6.
- Zhao, J.; Liu, Y.; Zhang, Q.; Zheng, X. CNN-AttBiLSTM mechanism: A DDoS attack detection method based on attention mechanism and CNN-BiLSTM. IEEE Access 2023, 11, 136308–136317.
- Sangeetha, J.; Kumaran, U. Using BiLSTM Structure with Cascaded Attention Fusion Model for Sentiment Analysis. J. Sci. Ind. Res. 2023, 82, 444–449.
- Li, X.; Wang, H.; Xiu, P.; Zhou, X.; Meng, F. Resource usage prediction based on BiLSTM-GRU combination model. In Proceedings of the 2022 IEEE International Conference on Joint Cloud Computing (JCC), Fremont, CA, USA, 15–18 August 2022; pp. 9–16.
- Yin, X.; Fang, W.; Liu, Z.; Liu, D. A novel multi-scale CNN and Bi-LSTM arbitration dense network model for low-rate DDoS attack detection. Sci. Rep. 2024, 14, 5111.
- Yin, X.; Fang, W.; Liu, Z.; Liu, D. An efficient network intrusion detection method based on information theory and genetic algorithm. In Proceedings of the PCCC 2005: 24th IEEE International Performance, Computing, and Communications Conference, Phoenix, AZ, USA, 7–9 April 2005; pp. 11–17.
- Zhou, Q.; Wang, Z. A network intrusion detection method for information systems using federated learning and improved transformer. Int. J. Semant. Web Inf. Syst. 2024, 20, 1–20.
- Yao, W.; Shi, H.; Zhao, H. Scalable anomaly-based intrusion detection for secure Internet of Things using generative adversarial networks in fog environment. J. Netw. Comput. Appl. 2023, 214, 103622.
- Yao, H.; Gao, P.; Zhang, P.; Wang, J.; Jiang, C.; Lu, L. Hybrid intrusion detection system for edge-based IIoT relying on machine-learning-aided detection. IEEE Netw. 2019, 33, 75–81.
- Udas, P.B.; Karim, M.E.; Roy, K.S. SPIDER: A shallow PCA based network intrusion detection system with enhanced recurrent neural networks. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 10246–10272.
ID | Data Category | Quantity | Percentage |
---|---|---|---|
1 | BENIGN | 2,273,097 | 80.3000% |
2 | DoS Hulk | 231,073 | 8.1629% |
3 | PortScan | 158,930 | 5.6144% |
4 | DDoS | 128,027 | 4.5227% |
5 | DoS GoldenEye | 10,293 | 0.3636% |
6 | FTP-Patator | 7938 | 0.2804% |
7 | SSH-Patator | 5897 | 0.2093% |
8 | DoS slowloris | 5796 | 0.2047% |
9 | DoS Slowhttptest | 5499 | 0.1942% |
10 | Bot | 1966 | 0.0694% |
11 | Web Attack-Brute Force | 1507 | 0.0532% |
12 | Web Attack-XSS | 652 | 0.0230% |
13 | Infiltration | 36 | 0.0012% |
14 | Web Attack-Sql Injection | 21 | 0.0007% |
15 | Heartbleed | 11 | 0.0004% |
Configuration Item/Parameter | Value | Selection Justification |
---|---|---|
CPU Version | Intel i5-10300H (Intel Corporation, Santa Clara, CA, USA) | Meets computational requirements |
GPU Version | NVIDIA GTX 1650 Ti (NVIDIA Corporation, Santa Clara, CA, USA) | 4GB VRAM sufficient for model training |
Operating System | Windows 10 (Microsoft Corporation, Redmond, WA, USA) | Stable platform with framework compatibility |
Python Version | Python 3.10.14 (Python Software Foundation, Wilmington, DE, USA) | PyTorch-recommended version |
Deep Learning Framework | PyTorch 1.12.1 (Meta Platforms, Inc., Menlo Park, CA, USA) | Base framework for this study |
Training Epochs | 50 | Early stopping (validation loss, patience = 5) |
Learning Rate | 0.001 | Optimal via grid search (0.1, 0.01, 0.001, 0.0001) |
Batch Size | 64 | Memory-efficiency balance (tested 32/64/128) |
Optimizer | Adam | Default parameters (β1 = 0.9, β2 = 0.999) |
Loss Function | Binary Cross-Entropy Loss | Standard for binary classification tasks |
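
The settings in the table above (Adam with lr = 0.001 and β1 = 0.9, β2 = 0.999, batch size 64, at most 50 epochs, early stopping on validation loss with patience 5, binary cross-entropy) can be wired together roughly as shown below. This is a hedged sketch rather than the authors' training script: the synthetic tensors, the small placeholder network, and the train/validation split are hypothetical, and `BCEWithLogitsLoss` is used as the logit-based form of binary cross-entropy on the assumption that the model outputs raw logits.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in data: 512 samples, 20 features, binary labels.
X = torch.randn(512, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).float()
train_loader = DataLoader(TensorDataset(X[:384], y[:384]), batch_size=64, shuffle=True)
X_val, y_val = X[384:], y[384:]

# Placeholder network; the real model would be the CNN-BiLSTM-Transformer.
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999))
criterion = nn.BCEWithLogitsLoss()  # binary cross-entropy applied to raw logits

max_epochs, patience = 50, 5
best_val, bad_epochs, epochs_run = float("inf"), 0, 0

for epoch in range(max_epochs):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(xb).squeeze(-1), yb)
        loss.backward()
        optimizer.step()

    # Validation loss drives early stopping.
    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(X_val).squeeze(-1), y_val).item()
    epochs_run += 1

    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0   # improvement: reset the counter
    else:
        bad_epochs += 1
        if bad_epochs >= patience:           # 5 epochs without improvement
            break
```

In a full script the model weights from the best validation epoch would normally be saved and restored after the loop; that step is omitted here for brevity.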
Reference | Method | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|---|
[6] | CNN + BiLSTM + C5.0 | 95.50% | 95.32% | 95.13% | 95.22% |
[7] | 1D Conv-BiLSTM + Attention | 98.07% | 98.81% | 98.42% | 98.61% |
[8] | CNN + BiLSTM + SMOTE | 97.70% | 97.80% | 97.70% | 97.75% |
[9] | BaysCNN + PCA | 99.66% | 97.69% | 97.69% | 97.69% |
[10] | EvoBMF + BiLSTM + MHA | 90.31% | 93.28% | 88.24% | 90.69% |
[11] | Multi-scale Transformer | 99.25% | 99.07% | 99.02% | 99.04% |
[16] | TranBiLSTM + ResNet | 99.15% | 99.15% | 99.14% | 99.14% |
[42] | BiGAN | 82.30% | 76.50% | 76.30% | 76.40% |
[43] | Deep autoencoder | 98.23% | 98.17% | 98.29% | 98.23% |
[44] | Hybrid RNN | 82.91% | 86.78% | 82.91% | 84.79% |
The Proposed Model | CNN + BiLSTM + Transformer | 99.80% | 99.69% | 99.94% | 99.81% |
Model Variant | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|
The Proposed Model | 99.80% | 99.69% | 99.94% | 99.81% |
Without Oversampling | 96.76% | 96.56% | 94.14% | 96.87% |
Without Feature Selection | 97.46% | 95.29% | 94.98% | 96.13% |
Without the Transformer Module | 95.92% | 94.17% | 93.20% | 95.41% |
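
For reference, the four reported metrics are conventionally defined from the confusion-matrix counts (true positives TP, true negatives TN, false positives FP, false negatives FN). The definitions below are the standard ones and are assumed, not confirmed by the text here, to be those used in the tables. The last line checks the F1-score of the proposed model from its precision and recall in the ablation table.

$$
\begin{aligned}
\mathrm{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, &
\mathrm{Precision} &= \frac{TP}{TP + FP}, \\
\mathrm{Recall} &= \frac{TP}{TP + FN}, &
F_1 &= \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \\
F_1^{\text{proposed}} &= \frac{2 \times 0.9969 \times 0.9994}{0.9969 + 0.9994} \approx 0.9981 .
\end{aligned}
$$

The computed value of about 99.81% agrees with the F1-score listed for the proposed model.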
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, C.; Li, J.; Wang, N.; Zhang, D. Research on Intrusion Detection Method Based on Transformer and CNN-BiLSTM in Internet of Things. Sensors 2025, 25, 2725. https://doi.org/10.3390/s25092725