Detecting IoT Attacks Using an Ensemble Machine Learning Model
Abstract
1. Introduction
2. Background and Related Work
2.1. IoT-Specific Attacks Overview
- An authentication attack is an attack against privileged access. A remote-to-user (R2U) attack (such as HTTPtunnel and FTP_write) occurs when an intruder sends malformed packets to a computer or server to which they do not have access. A user-to-root (U2R) attack (such as Rootkit) occurs when a malicious intruder gains access to a network resource by posing as a normal user and then accesses it with full permissions;
- In a probe attack, an intruder runs a scan of a network device to determine potential vulnerabilities in the design of its topology or port settings and then exploits those in the future to gain illegal access to confidential information. There are several types of probe attacks, such as IPsweep, Nmap, and Portsweep.
2.2. ML-Specific Related Work on Security and Privacy
2.3. Voting and Stacking Techniques
2.4. Ensemble Machine Learning-Based Attack Detection
2.5. IoT System with Cloud and Fog
3. Proposed Approach
- Data collection at the cloud layer. This step involves collecting data from the thing layer and passing it to the cloud layer. To accomplish this, data from the thing layer is first transported to the fog layer, which then transports it to the cloud layer. While doing so, the fog layer can also filter the data to decide which of it is transported to the cloud. IoT attacks can be predicted using the following attributes: (1) login details, (2) the fields of network data packets, such as fragment details, protocol type, and source and destination address, (3) service type, (4) flags, and (5) duration. We provide detailed information about the data used in our simulation in the next section.
- Selecting a best model on the cloud. The objective of this step is to combine various basic machine learning classifiers (such as naïve Bayes, KNN, and decision trees) with ensemble techniques (such as stacking, bagging, and voting) to obtain optimal results (accuracy, precision, execution time). As this is a time-consuming step, we recommend running it in the cloud. In addition, we use only basic machine learning classifiers, as they require a short execution time.
Figure 3 illustrates this step using four layers: (1) the data layer, (2) the base layer, (3) the meta layer, and (4) the method selection layer. In the data layer, the data collected in the previous step is pre-processed and fed into the base layer. The base layer applies different combinations of base classifiers, such as naïve Bayes, decision trees, and KNN. The results of these combinations are then fed into the meta layer, where ensemble methods, such as stacking, bagging, and voting, aggregate the outcomes. Each ensemble method is evaluated in terms of accuracy, precision, recall, ROC, and execution time. The model whose combination of base classifiers and ensemble method yields the best results is then selected.
Algorithm 1 describes the proposed approach in detail. The input parameters of the algorithm are: (1) the base classifiers, (2) the ensemble methods, and (3) the training dataset (D). The first two lines of the algorithm initialize the output and the result (i.e., the variables OUTPUT and Result in Algorithm 1). The third line initializes the execution time to the maximum value. In the fourth line, all the combinations of the base classifiers (obtained using the function findAllCombinations) are stored in the variable C. The proposed approach aims to determine the best combination and the best ensemble method. Therefore, line 5 iterates over the combinations, and line 7 iterates over each base classifier in the current combination. Each base classifier is applied to the training dataset (D), and the outcome is stored in o (line 8).
Line 10 iterates over the ensemble methods, and line 11 applies each ensemble method to the outcome (o). In line 12, the ensemble result is calculated in terms of accuracy, precision, recall, etc. In line 13, the execution time of the combination of base classifiers and the ensemble method is calculated. The new result (r) and execution time (time) are then compared to the previous best result (Result) and time (ExcecutionTime). If this is the best result so far, the corresponding combination and ensemble method are stored in the output (OUTPUT); see line 14. The result itself is stored in line 15. Finally, the best output is returned in line 21.
- Running the best model on the fog layer. This step involves executing the model selected in the previous step over the fog layer with the real-time data collected from the thing layer. The model consists of a combination of base classifiers and an ensemble method.
Algorithm 1: Find a best model.
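The search loop described for Algorithm 1 can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`find_all_combinations`, `majority_vote`, `accuracy`, `find_best_model`) are hypothetical, the base classifiers are fixed toy rules standing in for trained models, only hard voting is shown as an ensemble method, and the comparison rule (higher accuracy wins, ties go to the lower execution time) is an assumption, since the text does not state how result and time are weighed against each other.

```python
import itertools
import time
from collections import Counter


def find_all_combinations(names, size):
    """Return every unordered group of `size` base-classifier names."""
    return list(itertools.combinations(names, size))


def majority_vote(outcomes):
    """Hard voting: for each sample, keep the label most classifiers chose."""
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*outcomes)]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def find_best_model(base_classifiers, ensemble_methods, X, y, size=3):
    """Search all (combination, ensemble method) pairs and keep the best one."""
    output, best_result, best_time = None, float("-inf"), float("inf")
    for combo in find_all_combinations(list(base_classifiers), size):
        start = time.perf_counter()
        # Apply each base classifier of the combination to the dataset.
        outcomes = [base_classifiers[name](X) for name in combo]
        base_time = time.perf_counter() - start
        for ens_name, ensemble in ensemble_methods.items():
            start = time.perf_counter()
            result = accuracy(y, ensemble(outcomes))
            elapsed = base_time + (time.perf_counter() - start)
            # Assumed rule: higher accuracy wins; ties go to the faster pair.
            if (result, -elapsed) > (best_result, -best_time):
                output, best_result, best_time = (combo, ens_name), result, elapsed
    return output, best_result, best_time


# Toy data: (failed_logins, duration, flag_errors) per connection; 1 = attack.
X = [(0, 1, 0), (1, 2, 1), (5, 50, 2), (4, 30, 1), (0, 20, 0), (3, 40, 3)]
y = [0, 0, 1, 1, 0, 1]

# Hypothetical fixed rules standing in for trained base classifiers.
base_classifiers = {
    "login_rule": lambda rows: [int(r[0] > 2) for r in rows],
    "duration_rule": lambda rows: [int(r[1] > 10) for r in rows],
    "flag_rule": lambda rows: [int(r[2] > 0) for r in rows],
    "always_attack": lambda rows: [1 for _ in rows],
}
ensemble_methods = {"voting": majority_vote}

output, result, elapsed = find_best_model(base_classifiers, ensemble_methods, X, y)
print(output, result)
```

Other ensemble methods, such as stacking or bagging, would be passed in as further entries of `ensemble_methods`, and additional measures such as precision and recall could replace or complement `accuracy` in the comparison.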
4. Simulation Environment
4.1. Server Configuration
4.2. Dataset Description
4.3. Data Separation for the Cloud and Fog Layers
4.4. Simulated Base Classifiers and Ensemble Methods
5. Results and Analysis
5.1. Cloud Layer Result Analysis
5.1.1. Execution Time
5.1.2. Performance Measures
5.2. Fog Layer Result Analysis
5.2.1. Performance Measures
5.2.2. Errors Associated
5.2.3. Execution Time and CPU Usage
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Abdulghani, H.A.; Nijdam, N.A.; Collen, A.; Konstantas, D. A Study on Security and Privacy Guidelines, Countermeasures, Threats: IoT Data at Rest Perspective. Symmetry 2019, 11, 774.
- Wang, A.; Liang, R.; Liu, X.; Zhang, Y.; Chen, K.; Li, J. An Inside Look at IoT Malware. In Industrial IoT Technologies and Applications; Chen, F., Luo, Y., Eds.; Industrial IoT 2017; Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering; Springer: Cham, Switzerland, 2017.
- Razdan, S.; Sharma, S. Internet of Medical Things (IoMT): Overview, Emerging Technologies, and Case Studies. IETE Tech. Rev. 2021, 1–14.
- Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37.
- Chaabouni, N.; Mosbah, M.; Zemmari, A.; Sauvignac, C.; Faruki, P. Network Intrusion Detection for IoT Security Based on Learning Techniques. IEEE Commun. Surv. Tutor. 2019, 21, 2671–2701.
- Xiao, L.; Wan, X.; Lu, X.; Zhang, Y.; Wu, D. IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security? IEEE Signal Process. Mag. 2018, 35, 41–49.
- Giacinto, G.; Roli, F.; Bruzzone, L. Combination of neural and statistical algorithms for supervised classification of remote-sensing images. Pattern Recognit. Lett. 2000, 21, 385–397.
- Bansal, A.; Mahapatra, S. A Comparative Analysis of Machine Learning Techniques for Botnet Detection. In Proceedings of the 10th International Conference on Security of Information and Networks SIN ’17, New York, NY, USA, 13–15 October 2017; Association for Computing Machinery: New York, NY, USA, 2017; pp. 91–98.
- Jaber, A.N.; Rehman, S.U. FCM–SVM based intrusion detection system for cloud computing environment. Clust. Comput. 2020, 23, 3221–3231.
- Zhang, Y.; Ren, Y.; Wang, J.; Fang, L. Network forensic computing based on ANN-PCA. In Proceedings of the 2007 International Conference on Computational Intelligence and Security Workshops (CISW 2007), Harbin, China, 15–19 December 2007; pp. 942–945.
- Hemavathi, D.; Srimathi, H. Effective feature selection technique in an integrated environment using enhanced principal component analysis. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 3679–3688.
- Salo, F.; Nassif, A.B.; Essex, A. Dimensionality reduction with IG-PCA and ensemble classifier for network intrusion detection. Comput. Netw. 2019, 148, 164–175.
- Hosseini, S.; Zade, B.M.H. New hybrid method for attack detection using combination of evolutionary algorithms, SVM, and ANN. Comput. Netw. 2020, 173, 107168.
- Amor, N.B.; Benferhat, S.; Elouedi, Z. Naive bayes vs. decision trees in intrusion detection systems. In Proceedings of the 2004 ACM Symposium on Applied Computing, Nicosia, Cyprus, 14–17 March 2004; pp. 420–424.
- Ingre, B.; Yadav, A. Performance analysis of NSL-KDD dataset using ANN. In Proceedings of the 2015 International Conference on Signal Processing and Communication Engineering Systems, Guntur, India, 2–3 January 2015; pp. 92–96.
- Zhang, C.; Ruan, F.; Yin, L.; Chen, X.; Zhai, L.; Liu, F. A Deep Learning Approach for Network Intrusion Detection Based on NSL-KDD Dataset. In Proceedings of the 2019 IEEE 13th International Conference on Anti-counterfeiting, Security, and Identification (ASID), Xiamen, China, 25–27 October 2019; pp. 41–45.
- Wang, H.; Sayadi, H.; Sasan, A.; Rafatirad, S.; Mohsenin, T.; Homayoun, H. Comprehensive Evaluation of Machine Learning Countermeasures for Detecting Microarchitectural Side-Channel Attacks; GLSVLSI'20; Association for Computing Machinery: New York, NY, USA, 2020; pp. 181–186.
- Ahmad, R.; Alsmadi, I. Machine learning approaches to IoT security: A systematic literature review. Int. Things (IoT) 2021, 14, 100365.
- Ambedkar, C.; Babu, V.K. Detection of probe attacks using machine learning techniques. Int. J. Res. Stud. Comput. Sci. Eng. (IJRSCSE) 2015, 2, 25–29.
- Sabhnani, M.; Serpen, G. Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set. Intell. Data Anal. 2004, 8, 403–415.
- Abdelkefi, A.; Jiang, Y.; Sharma, S. SENATUS: An Approach to Joint Traffic Anomaly Detection and Root Cause Analysis. In Proceedings of the 2018 2nd Cyber Security in Networking Conference (CSNet), Paris, France, 24–26 October 2018; pp. 1–8.
- Khare, N.; Devan, P.; Chowdhary, C.L.; Bhattacharya, S.; Singh, G.; Singh, S.; Yoon, B. Smo-dnn: Spider monkey optimization and deep neural network hybrid classifier model for intrusion detection. Electronics 2020, 9, 692.
- Manimurugan, S.; Majdi, A.Q.; Mohmmed, M.; Narmatha, C.; Varatharajan, R. Intrusion detection in networks using crow search optimization algorithm with adaptive neuro-fuzzy inference system. Microprocess. Microsyst. 2020, 79, 103261.
- Kasliwal, B.; Bhatia, S.; Saini, S.; Thaseen, I.S.; Kumar, C.A. A hybrid anomaly detection model using G-LDA. In Proceedings of the 2014 IEEE International Advance Computing Conference (IACC), Gurgaon, India, 21–22 February 2014; pp. 288–293.
- Ieracitano, C.; Adeel, A.; Morabito, F.C.; Hussain, A. A novel statistical analysis and autoencoder driven intelligent intrusion detection approach. Neurocomputing 2020, 387, 51–62.
- Chan, Y.H. Biostatistics 305. Multinomial logistic regression. Singap. Med. J. 2005, 46, 259.
- Liu, J.; Kantarci, B.; Adams, C. Machine learning-driven intrusion detection for contiki-NG-based IoT networks exposed to NSL-KDD dataset. In Proceedings of the 2nd ACM Workshop on Wireless Security and Machine Learning, Linz, Austria, 13 July 2020; pp. 25–30.
- Su, T.; Sun, H.; Zhu, J.; Wang, S.; Li, Y. BAT: Deep learning methods on network intrusion detection using NSL-KDD dataset. IEEE Access 2020, 8, 29575–29585.
- Abu Al-Haija, Q.; Al-Badawi, A. Attack-Aware IoT Network Traffic Routing Leveraging Ensemble Learning. Sensors 2022, 22, 241.
- Yong, B.; Wei, W.; Li, K.C.; Shen, J.; Zhou, Q.; Wozniak, M.; Połap, D.; Damaševičius, R. Ensemble machine learning approaches for webshell detection in Internet of things environments. In Transactions on Emerging Telecommunications Technologies; Wiley: Hoboken, NJ, USA, 2020; p. e4085.
- Rashid, M.M.; Kamruzzaman, J.; Hassan, M.M.; Imam, T.; Gordon, S. Cyberattacks Detection in IoT-Based Smart City Applications Using Machine Learning Techniques. Int. J. Environ. Res. Public Health 2020, 17, 9347.
- Tsogbaatar, E.; Bhuyan, M.H.; Taenaka, Y.; Fall, D.; Gonchigsumlaa, K.; Elmroth, E.; Kadobayashi, Y. SDN-Enabled IoT Anomaly Detection Using Ensemble Learning. In Artificial Intelligence Applications and Innovations; Maglogiannis, I., Iliadis, L., Pimenidis, E., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 268–280.
- Sharma, S. Towards Artificial Intelligence Assisted Software Defined Networking for Internet of Vehicles. In Intelligent Technologies for Internet of Vehicles; Magaia, N., Mastorakis, G., Mavromoustakis, C., Pallis, E., Markakis, E.K., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 191–222.
- Latif, S.A.; Wen, F.B.X.; Iwendi, C.; Li, F.; Wang, L.; Mohsin, S.M.; Han, Z.; Band, S.S. AI-empowered, blockchain and SDN integrated security architecture for IoT network of cyber physical systems. Comput. Commun. 2022, 181, 274–283.
- Rambabu, K.; Venkatram, N. Ensemble classification using traffic flow metrics to predict distributed denial of service scope in the Internet of Things (IoT) networks. Comput. Electr. Eng. 2021, 96, 107444.
- Kumar, P.; Gupta, G.P.; Tripathi, R. An ensemble learning and fog-cloud architecture-driven cyber-attack detection framework for IoMT networks. Comput. Commun. 2021, 166, 110–124.
- Khare, S.; Totaro, M. Ensemble Learning for Detecting Attacks and Anomalies in IoT Smart Home. In Proceedings of the 2020 3rd International Conference on Data Intelligence and Security (ICDIS), South Padre Island, TX, USA, 24–26 June 2020; pp. 56–63.
- Hung, Y.H. Improved Ensemble-Learning Algorithm for Predictive Maintenance in the Manufacturing Process. Appl. Sci. 2021, 11, 6832.
- Wang, J.; Pan, J.; Esposito, F.; Calyam, P.; Yang, Z.; Mohapatra, P. Edge cloud offloading algorithms: Issues, methods, and perspectives. ACM Comput. Surv. (CSUR) 2019, 52, 1–23.
- Zhang, P.; Zhou, M.; Fortino, G. Security and trust issues in Fog computing: A survey. Future Gener. Comput. Syst. 2018, 88, 16–27.
- Hu, P.; Dhelim, S.; Ning, H.; Qiu, T. Survey on fog computing: Architecture, key technologies, applications and open issues. J. Netw. Comput. Appl. 2017, 98, 27–42.
- Tariq, N.; Asim, M.; Al-Obeidat, F.; Zubair Farooqi, M.; Baker, T.; Hammoudeh, M.; Ghafir, I. The Security of Big Data in Fog-Enabled IoT Applications Including Blockchain: A Survey. Sensors 2019, 19, 1788.
- Alzoubi, Y.I.; Osmanaj, V.H.; Jaradat, A.; Al-Ahmad, A. Fog computing security and privacy for the Internet of Thing applications: State-of-the-art. Secur. Priv. 2021, 4, e145.
- Alrawais, A.; Alhothaily, A.; Hu, C.; Xing, X.; Cheng, X. An attribute-based encryption scheme to secure fog communications. IEEE Access 2017, 5, 9131–9138.
- Hu, P.; Ning, H.; Qiu, T.; Song, H.; Wang, Y.; Yao, X. Security and privacy preservation scheme of face identification and resolution framework using fog computing in internet of things. IEEE Int. Things J. 2017, 4, 1143–1155.
- Li, Z.; Zhou, X.; Liu, Y.; Xu, H.; Miao, L. A non-cooperative differential game-based security model in fog computing. China Commun. 2017, 14, 180–189.
- Osanaiye, O.; Chen, S.; Yan, Z.; Lu, R.; Choo, K.K.R.; Dlodlo, M. From cloud to fog computing: A review and a conceptual live VM migration framework. IEEE Access 2017, 5, 8284–8300.
- ATLANTIC-eVISION: Cross-Atlantic Experimental Validation of Intelligent SDN-controlled IoT Networks 2021–2022. Available online: https://ngiatlantic.eu/funded-experiments/atlantic-evision-cross-atlantic-experimental-validation-intelligent-sdn (accessed on 20 March 2022).
- Berman, M.; Demeester, P.; Lee, J.W.; Nagaraja, K.; Zink, M.; Colle, D.; Krishnappa, D.K.; Raychaudhuri, D.; Schulzrinne, H.; Seskar, I.; et al. Future Internets Escape the Simulator. Commun. ACM 2015, 58, 78–89.
- Suñé, M.; Bergesio, L.; Woesner, H.; Rothe, T.; Köpsel, A.; Colle, D.; Puype, B.; Simeonidou, D.; Nejabati, R.; Channegowda, M.; et al. Design and implementation of the OFELIA FP7 facility: The European OpenFlow testbed. Comput. Netw. 2014, 61, 132–150.
Reference | ML/DL Algorithm Used | Features Used (🗸) or Not (×) | Analysis Performed (🗸) or Not Performed (×) |
---|---|---|---|
[19,20] | Decision Tree + Rule Induction | E(🗸), S(×), N(×), D(×) | A(🗸), ROC(×), FScore(×), MCC(×), DR(×) |
[7,8] | Deep Neural Network (DNN) | E(×), S(×), N(🗸), D(×) | A(×), ROC(×), FScore(🗸), MCC(×), DR(×) |
[22,23] | Optimization + DNN | E(🗸), S(×), N(🗸), D(🗸) | A(🗸), ROC(×), FScore(🗸), MCC(×), DR(×) |
[9,13] | SVM-ANN + hybrid optimization | E(×), S(×), N(🗸), D(×) | A(×), ROC(×), FScore(🗸), MCC(🗸), DR(🗸) |
[21] | PCA + Random Decision | E(×), S(×), N(🗸), D(🗸) | A(🗸), ROC(🗸), FScore(×), MCC(×), DR(🗸) |
[10,11] | Dimensionality Reduction + DNN | E(×), S(×), N(🗸), D(🗸) | A(🗸), ROC(×), FScore(🗸), MCC(×), DR(🗸) |
[24] | GA-based Latent Dirichlet Allocation | E(🗸), S(×), N(×), D(×) | A(🗸), ROC(×), FScore(🗸), MCC(×), DR(🗸) |
[25] | Autoencoder-based LSTM classifier | E(🗸), S(🗸), N(🗸), D(🗸) | A(×), ROC(×), FScore(🗸), MCC(×), DR(×) |
[26] | Multinomial Logistic Regression | E(×), S(×), N(×), D(×) | A(×), ROC(🗸), FScore(×), MCC(×), DR(×) |
[27] | Ensemble Learning with XGBoost | E(🗸), S(×), N(×), D(×) | A(🗸), ROC(×), FScore(×), MCC(×), DR(×) |
Model | Base Classifier Combinations | ||
---|---|---|---|
1 | DT | RF | KNN |
2 | RF | KNN | LR |
3 | KNN | LR | NB |
4 | LR | NB | DT |
5 | NB | DT | RF |
6 | DT | KNN | LR |
7 | RF | LR | NB |
8 | KNN | NB | DT |
9 | LR | DT | RF |
10 | NB | RF | KNN |
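The ten models listed above appear to be exactly the three-classifier subsets of the five base classifiers (DT, RF, KNN, LR, NB), which is what an exhaustive combination search would produce. The short Python check below is an illustrative sketch rather than the authors' code; the names `BASE_CLASSIFIERS` and `TABLE_ROWS` are hypothetical, and the rows are copied from the table.

```python
from itertools import combinations

BASE_CLASSIFIERS = ["DT", "RF", "KNN", "LR", "NB"]

# Rows copied from the table of base classifier combinations (models 1 to 10).
TABLE_ROWS = [
    ("DT", "RF", "KNN"), ("RF", "KNN", "LR"), ("KNN", "LR", "NB"),
    ("LR", "NB", "DT"), ("NB", "DT", "RF"), ("DT", "KNN", "LR"),
    ("RF", "LR", "NB"), ("KNN", "NB", "DT"), ("LR", "DT", "RF"),
    ("NB", "RF", "KNN"),
]

# Compare as sets so that the order of classifiers within a row is ignored.
generated = {frozenset(c) for c in combinations(BASE_CLASSIFIERS, 3)}
listed = {frozenset(row) for row in TABLE_ROWS}

print(len(generated))       # number of three-classifier subsets
print(generated == listed)  # whether the table covers exactly those subsets
```

Choosing 3 of 5 classifiers gives C(5,3) = 10 subsets, which matches the number of models in the table.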
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tomer, V.; Sharma, S. Detecting IoT Attacks Using an Ensemble Machine Learning Model. Future Internet 2022, 14, 102. https://doi.org/10.3390/fi14040102