Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method
Abstract
:1. Introduction
- Improvement of attack detection performance of IDS with big IoT-based IDS datasets based on ensemble tree models. In particular, in the first step, model selection has performed to determine the best model for tuning, along with adjusted hyperparameters, on preprocessing IoT-based IDS datasets. Better model results and their hyperparameters were used to build the best IDS model. The best IDS model selection algorithm is presented in Algorithm 1; the Pycaret library [10] was used to support the implementation of this algorithm. In the next step, the selected model classification has built with the training and evaluation process by tuning the hyperparameter values. These models are ensemble tree classifications, including DT (decision tree) and RF (random forest). They have been evaluated using the receiver operating characteristic (ROC) curve and validation curve on three IoT-based IDS datasets, including IoTID20, NF-BoT-IoT-v2, and NF-ToN-IoT-v2. The training and evaluation process of the best selected models with turned hyperparameters is presented in Algorithm 2.
- Explanation about attack detection results achieved from a decision of the proposed ensemble IDS methods based on the SHAP (SHapley additive exPlanations) method. The SHAP method [11,12] has been used for the global and local explanation of ensemble trees for binary classifiers and multiclass classifiers to increase the trustability of the prediction results. The SHAP value is calculated based on Algorithm 3. In particular, a heatmap for global explanation and a decision plot for a local explanation was used in the SHAP method. The proposed model explanation with SHAP is presented in Algorithm 4.
2. Related Work
2.1. AI-Based IDS Approach
2.2. AI-Based IDS Related IoT Approach
2.2.1. IoT-IDS with BoT-IoT, ToN-IoT Datasets and Their Variant Datasets
2.2.2. IoT-IDS with IoTID20 Dataset
2.3. AI-Based IDS for the IoT and Explanation Approach
3. Material and Method
3.1. Ensemble Trees Classification
3.2. SHAP Explanation
3.2.1. Shap Global Explanation
3.2.2. SHAP Local Explanation
3.3. The Proposed Method
Algorithm 1: The Best Model Selection and Turned Model Hyperparameters |
input: Original Datasets (), Input Feature Columns (), Output Feature Column () output: Best Models Selection (), Turned Models Hyperparameters ()
|
Algorithm 2: Training and Evaluation Process of the Best Selected Models with Turned Hyperparameters |
input: Best Model Selected (), Turned Model Hyperparameters (), Processed Datasets (), Input Feature Columns (), Output Feature Column () output: Trained Model (), ROC Curve (), Validation Curve ()
|
Algorithm 3: SHAP value estimation for single feature value |
input Pre-trained model (f), example/instance (x), feature index (i), data matrix (X), number of iterations (K) output: SHAP value for the value of the feature(S)
|
Algorithm 4: Models Explanation with SHAP |
input: Trained Model (), Testing Data () output: Global Explanation with Heatmap (), Local Explanation with Decision plot (),
|
4. Experiment
4.1. Related IoT IDS Datasets
4.1.1. Iotid20
4.1.2. Netflow V2 Datasets
4.2. Experimental Setup and Evaluation Metrics
- Accuracy. Accuracy measures how many observations, both positive and negative, were correctly classified.
- AUC and ROC. The AUC and ROC curves can be used to measure the performance of the classification models at various threshold settings. The probability curve is presented by ROC, whereas the degree of separability area under the ROC curve is represented by the AUC. These curves show the extent to which the classification model can distinguish between each output class. The higher the AUC, the better the model is for predicting each class correctly. Figure 5 shows the AUC and ROC curves.The curve plots two parameters: true positive rate (TPR ) and false positive rate (FPR), as follows:
- Recall. Recall indicates how many of the actual positive cases the model was able to accurately predict.
- Precision. Precision indicates how many of the correctly predicted cases actually turned out to be positive.
- F1. This combines precision and recall into one metric by calculating the harmonic average between precision and recall.
4.3. Experimental Results
4.3.1. First Experimental Results: Classification Model Performance Evaluation
4.3.2. Second Experimental Results: Explanation of Model Classification Decision
- Global Explanation with Heatmap
- Local Explanation with Decision Plot
5. Discussion and Comparison
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
IDS | Intrusion Detection System |
DNN | Deep Neural Network |
ML | Machine Learning |
DL | Deep Learning |
DT | Decision Tree |
RF | Random Forest |
SHAP | SHapley Additive exPlanation |
XAI | eXplanation Artificial Intelligent |
DoS | Denial of Service |
IoTs | Internet of Things |
SDN | Software Defined Networking |
ROC | Receiver Operating Computing |
SVM | Support Vector Machine |
WSN | Wireless Sensor Network |
RNN | Recurrent Neural Network |
LSTM | Long Short Term Memory |
GRU | Gated Recurrent Unit |
GPU | Graphic Processing Unit |
TPU | Tensor Processing Unit |
TCNN | Temporal Convolution Neural Network |
H2ID | Hierarchical Hybrid for Intrusion Detection |
IG | Information Gain |
GR | Gain Ratio |
BO-GP | Bayesian Optimization Gaussian Process |
DFF | Deep FeedForward |
CNN | Convolution Neural Network |
LR | Logistic Regression |
NB | Naive Bayes |
SLFN | Single Hidden FeedForward Neural Network |
PSO | Particle Swam Optimization |
DBN | Deep Bayesian Network |
References
- Sultana, N.; Chilamkurti, N.; Peng, W.; Alhadad, R. Survey on SDN based network intrusion detection system using machine learning approaches. Peer Netw. Appl. 2019, 12, 493–501. [Google Scholar] [CrossRef]
- Lee, W.; Stolfo, S.J.; Mok, K.W. A data mining framework for building intrusion detection models. In Proceedings of the 1999 IEEE Symposium on Security and Privacy, Oakland, CA, USA, 14 May 1999; IEEE: Piscataway, NJ, USA, 1999; pp. 120–132. [Google Scholar]
- Buczak, A.L.; Guven, E. A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun. Surveys Tuts. 2016, 18, 1153–1176. [Google Scholar] [CrossRef]
- Lee, I.; Lee, K. The internet of things (iot): Applications, investments, and challenges for enterprises. Bus. Horizons 2015, 58, 431–440. [Google Scholar] [CrossRef]
- Zhang, Z.K.; Cho, M.C.Y.; Wang, C.W.; Hsu, C.W.; Chen, C.K.; Shieh, S. Iot security: Ongoing challenges and research opportunities. In Proceedings of the 2014 IEEE 7th International Conference on Service-Oriented Computing and Applications, Matsue, Japan, 17–19 November 2014; pp. 230–234. [Google Scholar]
- Alaba, F.A.; Othman, M.; Hashem, I.A.T.; Alotaibi, F. Internet of things security: A survey. J. Netw. Comput. Appl. 2017, 88, 10–28. [Google Scholar] [CrossRef]
- Sarica, A.K.; Angin, P. A Novel SDN Dataset for Intrusion Detection in IoT Networks. In Proceedings of the 16th International Conference on Network and Service Management (CNSM), Izmir, Turkey, 2–6 November 2020; pp. 1–5. [Google Scholar]
- Spadaccino, P.; Cuomo, F. Intrusion Detection Systems for IoT: Opportunities and Challenges offered by Edge Computing. arXiv 1–20. Available online: Https://arxiv.org/pdf/2012.01174.pdf (accessed on 21 December 2021).
- Amarasinghe, K.; Kenney, K.; Manic, M. Toward explainable deep neural network based anomaly detection. In Proceedings of the 11th International Conference on Human System Interaction (HSI), Gdansk, Poland, 4–6 July 2018; pp. 311–317. [Google Scholar]
- Pycaret Open Source. Available online: Https://github.com/pycaret/pycaret (accessed on 21 December 2021).
- Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 4765–4774. [Google Scholar]
- Lundberg, S.M.; Erion, G.G.; Lee, S.I. Consistent individualized feature attribution for tree ensembles. arXiv 2018, arXiv:1802.03888. [Google Scholar]
- Heba, F.E.; Darwish, A.; Hassanien Aboul, E.; Abraham, A. Principle components analysis and support vector machine based intrusion detection system. In Proceedings of the 10th International Conference on Intelligent Systems Design and Applications, Cairo, Egypt, 29 November–1 December 2010; pp. 363–367. [Google Scholar]
- Jia, N.; Liu, D. Application of svm based on information entropy in intrusion detection. In International Conference on Intelligent and Interactive Systems and Applications; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
- Wang, H.; Gu, J.; Wang, S. An effective intrusion detection framework based on svm with feature augmentation. Knowl.-Based Syst. 2017, 136, 130–139. [Google Scholar] [CrossRef]
- Kruegel, C.; Mutz, D.; Robertson, W.; Valeur, F. Bayesian event classification for intrusion detection. In Proceedings of the 19th Annual Computer Security Applications, Washington, DC, USA, 8–12 December 2003; pp. 14–23. [Google Scholar]
- Jemili, F.; Zaghdoud, M.; Mohamed, B. A framework for an adaptive intrusion detection system using Bayesian network. In IEEE Intelligence and Security Informatics; IEEE: Piscataway, NJ, USA, 2007; pp. 66–70. [Google Scholar]
- Heckerman, D. A tutorial on learning with bayesian networks. In Innovations in Bayesian Networks; Springer: Berlin/Heidelberg, Germany, 2008; pp. 33–82. [Google Scholar]
- Kruegel, C.; Toth, T. Using Decision Trees to Improve Signature-Based Intrusion Detection. In Recent Advances in Intrusion Detection; Vigna, G., Kruegel, C., Jonsson, E., Eds.; Springer: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
- Kumar, M.; Hanumanthappa, M.; Kumar, T.S. Intrusion Detection System using decision tree algorithm. In Proceedings of the IEEE 14th International Conference on Communication Technology, Chengdu, China, 9–11 November 2012; pp. 629–634. [Google Scholar]
- Peng, K.; Leung, V.; Zheng, L.; Wang, S.; Huang, C.; Lin, T. Intrusion Detection System Based on Decision Tree over Big Data in Fog Environment. Wirel. Commun. Mob. Comput. 2018, 2018, 4680867. [Google Scholar] [CrossRef] [Green Version]
- Chew, Y.J.; Ooi, S.Y.; Wong, K.S.; Pang, Y.H. Decision Tree with Sensitive Pruning in Network-based Intrusion Detection System. In Computational Science and Technology. Lecture Notes in Electrical Engineering; Alfred, R., Lim, Y., Haviluddin, H., On, C., Eds.; Springer: Singapore, 2003; Volume 603. [Google Scholar]
- Tesfahun, A.; Bhaskari, D.L. Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction. In Proceedings of the 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies, Pune, India, 15–16 November 2013; pp. 127–132. [Google Scholar]
- Farnaaz, N.; Jabbar, M.A. Random Forest Modeling for Network Intrusion Detection System. Procedia Comput. Sci. 2016, 89, 213–217. [Google Scholar] [CrossRef] [Green Version]
- Aung, Y.Y.; Min, M.M. An analysis of random forest algorithm based network intrusion detection system. In Proceedings of the 2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Kanazawa, Japan, 26–28 June 2017; pp. 127–132. [Google Scholar]
- Primartha, R.; Tama, B.A. Anomaly detection using random forest: A performance revisited. In Proceedings of the 2017 International Conference on Data and Software Engineering (ICoDSE), Palembang, Indonesia, 1–2 November 2017; pp. 1–6. [Google Scholar]
- Zhang, H.; Dai, S.; Li, Y.; Zhang, W. Real-time Distributed-Random-Forest-Based Network Intrusion Detection System Using Apache Spark. In Proceedings of the 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC), Orlando, FL, USA, 17–19 November 2018; pp. 1–7. [Google Scholar]
- Iman, A.N.; Ahmad, T. Improving Intrusion Detection System by Estimating Parameters of Random Forest in Boruta. In Proceedings of the 2020 International Conference on Smart Technology and Applications (ICoSTA), Surabaya, Indonesia, 20 February 2020; pp. 1–6. [Google Scholar]
- Waskle, S.; Parashar, L.; Singh, U. Intrusion Detection System Using PCA with Random Forest Approach. In Proceedings of the 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2–4 July 2020; pp. 803–808. [Google Scholar]
- Park, T.; Cho, D.; Kim, H. An Effective Classification for DoS Attacks in Wireless Sensor Networks. In Proceedings of the 2018 Tenth International Conference on Ubiquitous and Future Networks (ICUFN), Prague, Czech Republic, 3–6 July 2018; pp. 689–692. [Google Scholar]
- Vigneswaran, R.K.; Vinayakumar, R.; Soman, K.P.; Poornachandran, P. Evaluating Shallow and Deep Neural Networks for Network Intrusion Detection Systems in Cyber Security. In Proceedings of the 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Bengaluru, India, 10–12 July 2018; pp. 1–6. [Google Scholar]
- Ieracitano, C.; Adeel, A.; Gogate, M.; Dashtipour, K.; Morabito, F.C.; Larijani, H.; Raza, A.; Hussain, A. Statistical Analysis Driven Optimized Deep Learning System for Intrusion Detection. In Proceedings of the 9th International Conference on Brain Inspired Cognitive Systems (BICS 2018), Xi’an, China, 7–8 July 2018. [Google Scholar]
- Le, T.T.H.; Kim, J.; Kim, H. Analyzing Effective of Activation Functions on Recurrent Neural Networks for Intrusion Detection. J. Multimed. Inf. Syst. 2016, 3, 91–96. [Google Scholar]
- Kim, J.; Kim, J.; Thu, H.L.T.; Kim, H. Long Short Term Memory Recurrent Neural Network Classifier for Intrusion Detection. In Proceedings of the 2016 International Conference on Platform Technology and Service (PlatCon), Jeju, Korea, 15–17 February 2016; pp. 1–5. [Google Scholar]
- Kim, J.; Kim, H. An Effective Intrusion Detection Classifier Using Long Short-Term Memory with Gradient Descent Optimization. In Proceedings of the 2017 International Conference on Platform Technology and Service (PlatCon), Busan, Korea, 13–15 February 2017; pp. 1–6. [Google Scholar]
- Kang, H.; Kim, H. The Impact of PCA-Scale Improving GRU Performance for Intrusion Detection. In Proceedings of the 2019 International Conference on Platform Technology and Service (PlatCon), Jeju, Korea, 28–30 January 2019; pp. 1–6. [Google Scholar]
- Le, T.-T.-H.; Kim, Y.; Kim, H. Network Intrusion Detection Based on Novel Feature Selection Model and Various Recurrent Neural Networks. Appl. Sci. 2019, 9, 1392. [Google Scholar] [CrossRef] [Green Version]
- Mirsky, Y.; Doitshman, T.; Elovici, Y.; Shabtai, A. Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection. arXiv 2018, arXiv:1802.09089. [Google Scholar]
- Roopak, M.; Tian, G.Y.; Chambers, J. An Intrusion Detection System Against DDoS Attacks in IoT Networks. In Proceedings of the 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, Nevada, USA, 6–8 January 2020; pp. 562–567. [Google Scholar]
- Shiravi, A.; Shiravi, H.; Tavallaee, M.; Ghorbani, A.A. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 2012, 31, 357–374. [Google Scholar] [CrossRef]
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6. [Google Scholar]
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), Funchal, Portugal, 22–24 January 2018; pp. 108–116. [Google Scholar]
- Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Turnbull, B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Future Gener. Comput. Syst. 2019, 100, 779–796. [Google Scholar] [CrossRef] [Green Version]
- Moustafa, N. New Generations of Internet of Things Datasets for Cybersecurity Applications based Machine Learning: TON_IoT_Datasets. In Proceedings of the eResearch Australasia Conference, Brisbane, Australia, 21–25 October 2019. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Moustafa, N.; Portmann, M. Netflow datasets for machine learning-based network intrusion detection systems. arXiv 2020, arXiv:2011.09144. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Moustafa, N.; Portmann, M. Towards a standard feature set of nids datasets. arXiv 2021, arXiv:2101.11315. [Google Scholar]
- Ullah, I.; Mahmoud, Q.H. A Scheme for Generating a Dataset for Anomalous Activity Detection in IoT Networks. In Advances in Artificial Intelligence; Goutte, C., Zhu, X., Eds.; Lecture Notes in Computer Science; Springer: Cham, Switzerland, 2020. [Google Scholar]
- Oreški, D.; Andročec, D. Genetic algorithm and artificial neural network for network forensic analytics. In Proceedings of the 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO), Opatija, Croatia, 28 September–2 October 2020; pp. 1200–1205. [Google Scholar]
- Derhab, A.; Aldweesh, A.; Emam, A.Z.; Khan, F.A. Intrusion Detection System for Internet of Things Based on Temporal Convolution Neural Network and Efficient Feature Engineering. Wireless Commun. Mobile Comput. 2020, 2020, 6689134. [Google Scholar] [CrossRef]
- Bovenzi, G.; Aceto, G.; Ciuonzo, D.; Persico, V.; Pescapé, A. A Hierarchical Hybrid Intrusion Detection Approach in IoT Scenarios. In Proceedings of the GLOBECOM 2020–2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–7. [Google Scholar]
- Nimbalkar, P.; Kshirsagar, D. Feature selection for intrusion detection system in Internet-of-Things (IoT). ICT Express 2021, 7, 77–181. [Google Scholar] [CrossRef]
- Anthi, E.; Williams, L.; Słowińska, M.; Theodorakopoulos, G.; Burnap, P. A supervised intrusion detection system for smart home iot devices. IEEE Internet Things J. 2019, 6, 9042–9053. [Google Scholar] [CrossRef]
- Injadat, M.; Moubayed, A.; Shami, A. Detecting Botnet Attacks in IoT Environments: An Optimized Machine Learning Approach. In Proceedings of the 2020 32nd International Conference on Microelectronics (ICM), Aqaba, Jordan, 14–17 December 2020; pp. 1–4. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Moustafa, N.; Gallagher, M.; Portmann, M. Feature Extraction for Machine Learning-based Intrusion Detection in IoT Networks. arXiv 2021, arXiv:2108.12722v1. [Google Scholar]
- Lo, W.W.; Layeghy, S.; Sarhan, M.; Gallagher, M.; Portmann, M. E-GraphSAGE: A Graph Neural Network based Intrusion Detection System. arXiv 2021, arXiv:2103.16329. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Portmann, M. Feature Analysis for ML-based IIoT Intrusion Detection. arXiv 2021, arXiv:2108.12732. [Google Scholar]
- Qaddoura, R.; Al-Zoubi, A.M.; Almomani, I.; Faris, H. A Multi-Stage Classification Approach for IoT Intrusion Detection Based on Clustering with Oversampling. Appl. Sci. 2021, 11, 3022. [Google Scholar] [CrossRef]
- Alkahtani, H.; Aldhyani, T.H. Intrusion Detection System to Advance Internet of Things Infrastructure-Based Deep Learning Algorithms. Complexity 2021, 2021, 5579851. [Google Scholar] [CrossRef]
- Islam, N.; Farhin, F.; Sultana, I.; Kaiser, M.S.; Rahman, M.S.; Mahmud, M.; Hosen, A.S.; Cho, G.H. Towards Machine Learning Based Intrusion Detection in IoT Networks. Comput. Mater. Contin. 2021, 69, 1801–1821. [Google Scholar] [CrossRef]
- Song, Y.; Hyun, S.; Cheong, Y.-G. Analysis of Autoencoders for Network Intrusion Detection. Sensors 2021, 21, 4294. [Google Scholar] [CrossRef] [PubMed]
- Hussein, A.Y.; Falcarin, P.; Sadiq, A.T. Enhancement performance of random forest algorithm via one hot encoding for IoT IDS. Period. Eng. Nat. Sci. 2021, 9, 579–591. [Google Scholar]
- Nascita, A.; Montieri, A.; Aceto, G.; Ciuonzo, D.; Persico, V.; Pescapé, A. XAI Meets Mobile Traffic Classification: Understanding and Improving Multimodal Deep Learning Architectures. IEEE Trans. Netw. Service Manag. 2021, 18, 4225–4246. [Google Scholar] [CrossRef]
- Marino, D.L.; Wickramasinghe, C.S.; Manic, M. An Adversarial Approach for Explainable AI in Intrusion Detection Systems. arXiv 2018, arXiv:1811.11705v1. [Google Scholar]
- Mane, S.; Rao, D. Explaining Network Intrusion Detection System Using Explainable AI Framework. arXiv 2021, arXiv:2103.07110. [Google Scholar]
- Wang, M.; Zheng, K.; Yang, Y.; Wang, X. An Explainable Machine Learning Framework for Intrusion Detection Systems. IEEE Access 2020, 8, 3127–73141. [Google Scholar] [CrossRef]
- Mahbooba, B.; Timilsina, M.; Sahal, R.; Serrano, M. Explainable Artificial Intelligence (XAI) to Enhance Trust Management in Intrusion Detection Systems Using Decision Tree Model. Complexity 2021, 2021, 6634811. [Google Scholar] [CrossRef]
- Szczepański, M.; Choraś, M.; Pawlicki, M.; Kozik, R. Achieving Explainability of Intrusion Detection System by Hybrid Oracle-Explainer Approach. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; pp. 1–8. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Portmann, M. An Explainable Machine Learning-based Network Intrusion Detection System for Enabling Generalisability in Securing IoT Networks. arXiv 2021, arXiv:2104.07183v1. [Google Scholar]
- Sarhan, M.; Layeghy, S.; Portmann, M. Evaluating Standard Feature Sets Towards Increased Generalisability and Explainability of ML-based Network Intrusion Detection. arXiv 2021, arXiv:2104.07183v2. [Google Scholar]
- Štrumbelj, E.; Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowl. Inf. Syst. 2014, 41, 647–665. [Google Scholar] [CrossRef]
Label Name | Value | Number of Samples |
---|---|---|
Label | Normal | 40.073 |
Anomaly | 585.710 | |
Category | Normal | 40.073 |
DoS | 59.391 | |
Mirai | 415.677 | |
MITM ARP Sppofing | 35.377 | |
Scan | 75.265 |
Label Name | Value | Number of Samples |
---|---|---|
Label | Normal | 13.859 |
Anomaly | 586.241 | |
Attack | Benign | 13.859 |
Reconnaissance | 470.655 | |
DDoS | 56.844 | |
DoS | 56.833 | |
Theft | 1.909 |
Label Name | Value | Number of Samples |
---|---|---|
Label | Normal | 270.279 |
Anomaly | 1.379.274 | |
Attack | Ransomware | 142 |
Benign | 270.279 | |
XSS | 99.944 | |
Scanning | 21.467 | |
Password | 156.299 | |
DoS | 17.717 | |
DDoS | 326.345 | |
Injection | 468.539 | |
MITM | 1.295 |
Dataset | Best Model | Output Class | Output Label | AUC | Validation |
---|---|---|---|---|---|
IoTID20 | DT | Binary class | Anomaly | 1.00 | 1.00 |
Normal | 1.00 | ||||
MITM ARP Spoofing | 1.00 | ||||
Scan | 1.000 | ||||
IoTID20 | DT | Multiclass | DoS | 1.000 | 1.00 |
Mirai | 1.000 | ||||
Normal | 1.000 | ||||
NF-BoT-IoT-v2 | DT | Binary class | Anomaly | 0.97 | 0.97 |
Normal | 0.97 | ||||
NF-BoT-IoT-v2 | RF | Multiclass | Ransomware | 1.00 | 0.99 |
DoS | 1.00 | ||||
DDoS | 1.00 | ||||
Reconnaissance | 1.00 | ||||
Benign | 1.00 | ||||
Theft | 1.00 | ||||
NF-ToN-IoT-v2 | RF | Binary class | Anomaly | 1.00 | 1.00 |
Normal | 1.00 | ||||
Ransomware | 1.00 | ||||
Benign | 1.00 | ||||
XSS | 1.00 | ||||
Scanning | 1.00 | ||||
NF-ToN-IoT-v2 | RF | Multiclass | Password | 1.00 | 0.99 |
DoS | 1.00 | ||||
DDoS | 0.86 | ||||
Injection | 1.00 | ||||
MITM | 1.00 |
Dataset | Model | Accuracy | F1 | AUC |
---|---|---|---|---|
IoTID20 | Ensemble [47] | 87% | 87% | - |
SLFN [57] | 98.42% | 98% | - | |
CNN-LSTM [58] | 98% | 98.40% | - | |
DT [59] | 100% | - | - | |
AutoEncoders [60] | 94% | - | - | |
RF [61] | 97.85% | - | - | |
Proposed Method | 100% | 100% | 100% | |
NF-BoT-IoT-v2 | Extra trees [45] | 93.82% | 97% | 96.28% |
Extra trees [46] | 99.99% | 100% | - | |
E-GraphSAGE [55] | 93.57% | 97% - | ||
RF [68,69] | 100% | 100% | 99.88% | |
DFF [68,69] | 99.54% | 100% | 99.96% | |
Proposed Method | 100% | 100% | 100% | |
NF-ToN-IoT-v2 | Extra trees [45] | 99.66% | 100% | 99.65% |
Extra trees [46] | 98.05% | 98% | - | |
E-GraphSAGE [55] | 99.69% | 100% | - | |
DFF [56] with CHI | 85.61% | 91% | 85.19% | |
RF [56] with COR | 99.38 % | 100% | 99.46% | |
RF [68,69] | 99.66% | 100% | 99.61% | |
DFF [68,69] | 94.74% | 96% | 98.43% | |
Proposed Method | 100 % | 100% | 93% |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Le, T.-T.-H.; Kim, H.; Kang, H.; Kim, H. Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method. Sensors 2022, 22, 1154. https://doi.org/10.3390/s22031154
Le T-T-H, Kim H, Kang H, Kim H. Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method. Sensors. 2022; 22(3):1154. https://doi.org/10.3390/s22031154
Chicago/Turabian StyleLe, Thi-Thu-Huong, Haeyoung Kim, Hyoeun Kang, and Howon Kim. 2022. "Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method" Sensors 22, no. 3: 1154. https://doi.org/10.3390/s22031154
APA StyleLe, T. -T. -H., Kim, H., Kang, H., & Kim, H. (2022). Classification and Explanation for Intrusion Detection System Based on Ensemble Trees and SHAP Method. Sensors, 22(3), 1154. https://doi.org/10.3390/s22031154