Performance Investigation of Principal Component Analysis for Intrusion Detection System Using Different Support Vector Machine Kernels
Abstract
1. Introduction
- The study provides an IDS model that uses PCA to reduce the number of selected features and improve IDS performance on the KDD Cup’99 and UNSW-NB15 datasets.
- The study evaluates the reduced dataset with SVM using the linear, polynomial, Gaussian radial basis, and sigmoid kernel functions. The results show that the Gaussian radial basis function outperformed the other kernels.
2. Background and Related Works
2.1. Intrusion Detection Systems
2.2. Support Vector Machine (SVM)
2.3. Principal Component Analysis
- Determine the normalized d-dimensional dataset’s covariance matrix.
- Determine the covariance matrix’s eigenvectors and eigenvalues.
- Sort the eigenvalues from highest to lowest.
- Choose the k eigenvectors that correspond to the k largest eigenvalues, where k is the number of dimensions of the new feature subspace.
- Construct the projection matrix from the k eigenvectors that were chosen.
- Create a new k-dimensional feature space by transforming the original data.
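The steps listed above can be sketched with NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `pca_covariance`, the use of z-score standardization as the normalization step, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pca_covariance(X, k):
    """Project the rows of X (n_samples x d) onto the top-k principal components."""
    # Normalize each feature (assumed here: zero mean, unit variance).
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0            # avoid dividing by zero for constant columns
    Z = (X - mean) / std

    # Covariance matrix of the normalized d-dimensional data (d x d).
    cov = np.cov(Z, rowvar=False)

    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Sort from highest to lowest and keep the k leading eigenvectors.
    order = np.argsort(eigvals)[::-1][:k]
    W = eigvecs[:, order]          # projection matrix, d x k

    # Transform the data into the new k-dimensional feature space.
    return Z @ W, eigvals[order]

# Synthetic data for illustration only (not KDD Cup'99 or UNSW-NB15).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 3] = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)   # a correlated feature
Y, top = pca_covariance(X, k=2)
print(Y.shape)
```

The retained eigenvalues equal the variances of the projected columns, which gives a quick sanity check on the projection.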
Step | Pseudocode for Computing PCA |
---|---|
1 | Procedure PCA |
2 | Compute the dot product matrix: $X^{T}X$ |
3 | Eigenanalysis: $X^{T}X = V \Lambda V^{T}$ |
4 | Compute the eigenvectors: $U = X V \Lambda^{-1/2}$ |
5 | Keep a specific number of first components: $U_d = [u_1, \ldots, u_d]$ |
6 | Compute the d features: $Y = U_d^{T} X$ |
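The dot product formulation in the pseudocode can be sketched as below. Several points are assumptions rather than statements from the source: X is taken to be a features-by-samples matrix, each feature is centered first, and the eigenvectors are scaled by the inverse square root of the eigenvalues so that the components have unit length. The name `pca_gram` is hypothetical.

```python
import numpy as np

def pca_gram(X, d):
    """X has shape (n_features, n_samples); returns Y of shape (d, n_samples)."""
    # Center each feature (assumed preprocessing step).
    Xc = X - X.mean(axis=1, keepdims=True)

    # Dot product matrix X^T X, of size n_samples x n_samples.
    G = Xc.T @ Xc

    # Eigenanalysis: G = V diag(lam) V^T (eigh returns ascending eigenvalues).
    lam, V = np.linalg.eigh(G)

    # Keep the d largest eigenvalues; they must be strictly positive,
    # so d must not exceed the rank of the centered data.
    order = np.argsort(lam)[::-1][:d]
    lam, V = lam[order], V[:, order]

    # Eigenvectors in feature space: U_d = X V Lambda^{-1/2}.
    U_d = Xc @ V / np.sqrt(lam)

    # Compute the d features: Y = U_d^T X.
    return U_d.T @ Xc, U_d

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(4, 50))
Y_demo, U_demo = pca_gram(X_demo, d=2)
print(Y_demo.shape)
```

Working with the dot product matrix is mainly useful when there are fewer samples than features; otherwise the covariance route is usually cheaper.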
2.4. Related Works
3. Proposed Investigation Model
- A. Clean Data: The UNSW-NB15 and KDD Cup’99 datasets are cleaned by filling in missing values, smoothing noisy data, resolving inconsistencies, and removing outliers.
- B. Data Transformation: The UNSW-NB15 and KDD Cup’99 datasets are normalized so that the values and structure of the data fit the model requirements.
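The source does not say which normalization technique is used. As one common possibility, the sketch below applies min-max scaling to map every feature into [0, 1]; the function name `min_max_scale` and the toy values are assumptions for illustration.

```python
import numpy as np

def min_max_scale(X):
    """Scale each column of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0      # constant columns are mapped to 0
    return (X - lo) / span

# Toy feature matrix: two features on very different scales.
raw = [[0.0, 10.0], [5.0, 20.0], [10.0, 40.0]]
scaled = min_max_scale(raw)
print(scaled)
```

Scaling matters for both PCA and SVM, since features with large numeric ranges would otherwise dominate the variance and the kernel distances.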
4. Experimental Results and Analysis
4.1. Performance Evaluation Metrics
4.2. Experimental Design, Analysis, and Findings
4.3. Experimental Results and Discussion
- (a) The RBF kernel function performs best compared with the linear, polynomial, and sigmoid kernels.
- (b) The sigmoid kernel function performs worst.
5. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Almaiah, M.A.; Al-Zahrani, A.; Almomani, O.; Alhwaitat, A.K. Classification of cyber security threats on mobile devices and applications. In Artificial Intelligence and Blockchain for Future Cybersecurity Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 107–123.
- Almaiah, A.; Almomani, O. An investigation of digital forensics for shamoon attack behaviour in fog computing and threat intelligence for incident response. J. Theor. Appl. Inf. Technol. 2020, 15, 98.
- Zhang, M.; Sun, K. Computer Network Security Protection Strategy Based on Big Data. In Innovative Computing; Springer: Berlin/Heidelberg, Germany, 2022; pp. 1343–1350.
- Al-Momani, O.M.D. Dynamic Redundancy Forward Error Correction Mechanism for the Enhancement of Internet-Based Video Streaming. Ph.D. Thesis, Universiti Utara Malaysia, Kedah, Malaysia, 2010.
- Gupta, M.; Almomani, O.; Khasawneh, A.M.; Darabkh, K.A. Smart remote sensing network for early warning of disaster risks. In Nanotechnology-Based Smart Remote Sensing Networks for Disaster Prevention; Elsevier: Amsterdam, The Netherlands, 2022; pp. 303–324.
- Almomani, O.; Almaiah, M.A.; Alsaaidah, A.; Smadi, S.; Mohammad, A.H.; Althunibat, A. Machine Learning Classifiers for Network Intrusion Detection System: Comparative Study. In Proceedings of the 2021 International Conference on Information Technology (ICIT), Amman, Jordan, 14–15 July 2021; pp. 440–445.
- Almomani, O. A Feature Selection Model for Network Intrusion Detection System Based on PSO, GWO, FFA and GA Algorithms. Symmetry 2020, 12, 1046.
- Almomani, O. A Hybrid Model Using Bio-Inspired Metaheuristic Algorithms for Network Intrusion Detection System. Comput. Mater. Contin 2021, 68, 409–429.
- Mohammad, A.H.; Alwada’n, T.; Almomani, O.; Smadi, S.; ElOmari, N. Bio-inspired Hybrid Feature Selection Model for Intrusion Detection. Comput. Mater. Contin 2022, 73, 133–150.
- Ahmad, Z.; Khan, A.S.; Shiang, C.W.; Abdullah, J.; Ahmad, F. Network intrusion detection system: A systematic study of machine learning and deep learning approaches. Trans. Emerg. Telecommun. Technol. 2021, 32, e4150.
- Sajja, G.S.; Mustafa, M.; Ponnusamy, R.; Abdufattokhov, S. Machine Learning Algorithms in Intrusion Detection and Classification. Ann. Rom. Soc. Cell Biol. 2021, 25, 12211–12219.
- Madi, M.; Jarghon, F.; Fazea, Y.; Almomani, O.; Saaidah, A. Comparative analysis of classification techniques for network fault management. Turk. J. Electr. Eng. Comput. Sci. 2020, 28, 1442–1457.
- Al Hwaitat, A.K.; Almaiah, M.A.; Almomani, O.; Al-Sayed, M.A.R.M.; Asaifi, R.M.; Adhim, K.K.; Althunibat, A.; Alsaaidah, A. Improved Security Particle Swarm Optimization (PSO) Algorithm to Detect Radio Jamming Attacks in Mobile Networks. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 614–625.
- Mohammadi, M.; Rashid, T.A.; Karim, S.H.T.; Aldalwie, A.H.M.; Tho, Q.T.; Bidaki, M.; Rahmani, A.M.; Hosseinzadeh, M. A comprehensive survey and taxonomy of the SVM-based intrusion detection systems. J. Netw. Comput. Appl. 2021, 178, 102983.
- Karamizadeh, S.; Abdullah, S.M.; Manaf, A.A.; Zamani, M.; Hooman, A. An overview of principal component analysis. J. Signal Inf. Process. 2020, 4, 173–175.
- Kherif, F.; Latypova, A. Principal component analysis. In Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 209–225.
- Wang, W.; Du, X.; Wang, N. Building a cloud IDS using an efficient feature selection method and SVM. IEEE Access 2018, 7, 1345–1354.
- Masadeh, R.; AlSaaidah, B.; Masadeh, E.; Al-Hadidi, M.R.; Almomani, O. Elastic Hop Count Trickle Timer Algorithm in Internet of Things. Sustainability 2022, 14, 12417.
- Almaiah, M.A.; Hajjej, F.; Ali, A.; Pasha, M.F.; Almomani, O. A Novel Hybrid Trustworthy Decentralized Authentication and Data Preservation Model for Digital Healthcare IoT Based CPS. Sensors 2022, 22, 1448.
- Saaidah, A.; Almomani, O.; Al-Qaisi, L.; Madi, M.K. An efficient design of RPL objective function for routing in internet of things using fuzzy logic. Int. J. Adv. Comput. Sci. Appl. 2019, 10, 184–190.
- Albalas, F.; Al-Soud, M.; Almomani, O.; Almomani, A. Security-aware CoAP application layer protocol for the internet of things using elliptic-curve cryptography. Int. Arab J. Inf. Technol. 2018, 15, 25–37.
- Smadi, S.; Alauthman, M.; Almomani, O.; Saaidah, A.; Alzobi, F. Application Layer Denial of Services Attack Detection Based on StackNet. Int. J. 2020, 3929, 2278–3091.
- Huraj, L.; Horak, T.; Strelec, P.; Tanuska, P. Mitigation against DDoS Attacks on an IoT-Based Production Line Using Machine Learning. Appl. Sci. 2021, 11, 1847.
- Šimon, M.; Huraj, L.; Horák, T. DDoS reflection attack based on IoT: A case study. In Computer Science Online Conference; Springer: Berlin/Heidelberg, Germany, 2018; pp. 44–52.
- Horak, T.; Strelec, P.; Huraj, L.; Tanuska, P.; Vaclavova, A.; Kebisek, M. The vulnerability of the production line using industrial IoT systems under ddos attack. Electronics 2021, 10, 381.
- Adil, M.; Almaiah, M.A.; Alsayed, A.O.; Almomani, O. An Anonymous Channel Categorization Scheme of Edge Nodes to Detect Jamming Attacks in Wireless Sensor Networks. Sensors 2020, 20, 2311.
- Kaur, T.; Malhotra, V.; Singh, D. Comparison of network security tools-firewall intrusion detection system and Honeypot. Int. J. Enhanc. Res. Sci. Technol. Eng. 2014, 3, 201–202.
- Lundin, E.; Jonsson, E. Survey of Intrusion Detection Research; Chalmers University of Technology: Goteborg, Sweden, 2002.
- Bridges, R.A.; Glass-Vanderlan, T.R.; Iannacone, M.D.; Vincent, M.S.; Chen, Q. A survey of intrusion detection systems leveraging host data. ACM Comput. Surv. 2019, 52, 1–35.
- Pisner, D.A.; Schnyer, D.M. Support vector machine. In Machine Learning; Elsevier: Amsterdam, The Netherlands, 2020; pp. 101–121.
- Thaseen, I.S.; Kumar, C.A. Intrusion detection model using fusion of PCA and optimized SVM. In Proceedings of the 2014 International Conference on Contemporary Computing and Informatics (IC3I), Mysuru, India, 27–29 November 2014; pp. 879–884.
- Nskh, P.; Varma, M.N.; Naik, R.R. Principle component analysis based intrusion detection system using support vector machine. In Proceedings of the 2016 IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bengaluru, India, 20–21 May 2016; pp. 1344–1350.
- Raja, M.C.; Rabbani, M.M.A. Combined analysis of support vector machine and principle component analysis for IDS. In Proceedings of the 2016 International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 21–22 October 2016; pp. 1–5.
- Ikram, S.T.; Cherukuri, A.K. Improving accuracy of intrusion detection model using PCA and optimized SVM. J. Comput. Inf. Technol. 2016, 24, 133–148.
- Mishra, A.; Cheng, A.M.K.; Zhang, Y. Intrusion detection using principal component analysis and support vector machines. In Proceedings of the 2020 IEEE 16th International Conference on Control & Automation (ICCA), Hokkaido, Japan, 6–9 July 2020; pp. 907–912.
- Bhattacharya, S.; Maddikunta, P.K.R.; Kaluri, R.; Singh, S.; Gadekallu, T.R.; Alazab, M.; Tariq, U. A novel PCA-firefly based XGBoost classification model for intrusion detection in networks using GPU. Electronics 2020, 9, 219.
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia, 10–12 November 2015; pp. 1–6.
- Liu, W.; Wang, J. A brief survey on nature-inspired metaheuristics for feature selection in classification in this decade. In Proceedings of the 2019 IEEE 16th International Conference on Networking, Sensing and Control (ICNSC), Banff, AB, Canada, 9–11 May 2019; pp. 424–429.
- Smadi, S.; Aslam, N.; Zhang, L. Detection of online phishing email using dynamic evolving neural network based on reinforcement learning. Decis. Support Syst. 2018, 107, 88–102.
Abbreviations | Definition |
---|---|
IDS | Intrusion Detection System |
ML | Machine Learning |
SVM | Support Vector Machine |
RBF | Gaussian Radial Basis Function |
PCA | Principal Component Analysis |
IoT | Internet of Things |
DoS | Denial of Service |
DDoS | Distributed Denial of Service |
TP | True Positive |
TN | True Negative |
FP | False Positive |
FN | False Negative |
Kernel | Mathematical Functions |
---|---|
Linear | $K(y_s, y_t) = y_s^{T} y_t$ |
Polynomial | |
RBF | |
Sigmoid | |
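Only the linear kernel formula survives in the table above. The sketch below writes out the four kernels in the forms commonly used by SVM libraries; these are assumed standard definitions, not formulas recovered from the source, and the parameter names `gamma`, `coef0`, and `degree` are borrowed from common library conventions.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(ys, yt):
    # K(ys, yt) = ys . yt
    return dot(ys, yt)

def polynomial_kernel(ys, yt, gamma=1.0, coef0=0.0, degree=3):
    # K(ys, yt) = (gamma * ys . yt + coef0) ** degree
    return (gamma * dot(ys, yt) + coef0) ** degree

def rbf_kernel(ys, yt, gamma=1.0):
    # K(ys, yt) = exp(-gamma * ||ys - yt||^2)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ys, yt))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(ys, yt, gamma=1.0, coef0=0.0):
    # K(ys, yt) = tanh(gamma * ys . yt + coef0)
    return math.tanh(gamma * dot(ys, yt) + coef0)

ys, yt = [1.0, 2.0], [3.0, 4.0]
for name, fn in [("linear", linear_kernel), ("polynomial", polynomial_kernel),
                 ("rbf", rbf_kernel), ("sigmoid", sigmoid_kernel)]:
    print(name, fn(ys, yt))
```

The RBF kernel depends only on the distance between the two points, so it equals 1 for identical inputs and decays toward 0 as they move apart.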
Article | Dataset | SVM Kernels | Accuracy | C for SVM | Gamma for SVM | Reduction Technique | Evaluation Metrics |
---|---|---|---|---|---|---|---|
[31] | KDD Cup’99 | RBF | 0.990 | Automatic | Automatic | PCA | TP, TN, FP, and FN |
[32] | KDD Cup’99 | RBF, polynomial, linear | 0.998, 0.997, 0.992 | 25 and 35 | 3 and 3.5 | PCA | Accuracy |
[33] | KDD Cup’99 | Sigmoid, RBF | 0.9605, 0.811 | N/A | 0.1 | PCA | Response time, network efficiency, system error rate, and sensitivity |
[35] | UNSW-NB15 | RBF | 99.97 | N/A | N/A | PCA | Accuracy, TPR, and FPR |
Our model | KDD Cup’99 | Linear, polynomial, RBF, sigmoid | 0.958, 0.982, 0.991, 0.850 | 1.0 | Scale | PCA | TP, FN, FP, TN, accuracy, precision, sensitivity, and F-measure |
Our model | UNSW-NB15 | Linear, polynomial, RBF, sigmoid | 0.917, 0.915, 0.939, 0.732 | 1.0 | Scale | PCA | TP, FN, FP, TN, accuracy, precision, sensitivity, and F-measure |
Parameters | Values |
---|---|
Kernel | Linear, poly, RBF, and Sigmoid |
C | 1.0 |
Gamma | Scale |
Shrinking | True |
Cache_size | 200 MB |
Max_iter | −1 |
Random_state | 0 |
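The parameter names in the table above match the arguments of scikit-learn's `SVC` class, so the configuration can be sketched as follows. The use of scikit-learn, the synthetic data from `make_classification`, the `MinMaxScaler` step, and `n_components=10` for PCA are assumptions made for this example; the source does not state them.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in data; NOT KDD Cup'99 or UNSW-NB15.
X, y = make_classification(n_samples=600, n_features=20,
                           n_informative=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    model = make_pipeline(
        MinMaxScaler(),              # assumed normalization step
        PCA(n_components=10),        # assumed number of components
        # SVM parameters taken from the table above.
        SVC(kernel=kernel, C=1.0, gamma="scale", shrinking=True,
            cache_size=200, max_iter=-1, random_state=0),
    )
    model.fit(X_train, y_train)
    scores[kernel] = model.score(X_test, y_test)   # test-set accuracy

for kernel, acc in scores.items():
    print(f"{kernel:8s} accuracy = {acc:.3f}")
```

Accuracies on this synthetic data say nothing about the reported results; the sketch only shows how the four kernels can be compared under identical settings.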
 | Predicted: Normal | Predicted: Attack |
---|---|---|
Actual: Normal | TP | FN |
Actual: Attack | FP | TN |
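The evaluation metrics named in this article follow from the four confusion matrix cells. The sketch below uses the standard definitions of accuracy, precision, sensitivity, and F-measure; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the experiments.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return standard metrics from confusion matrix counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # also called recall or TPR
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }

# Hypothetical counts for illustration only.
m = confusion_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```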
Metric | KDD Cup’99 SVM-Linear | KDD Cup’99 SVM-Poly | KDD Cup’99 SVM-RBF | KDD Cup’99 SVM-Sigmoid | UNSW-NB15 SVM-Linear | UNSW-NB15 SVM-Poly | UNSW-NB15 SVM-RBF | UNSW-NB15 SVM-Sigmoid |
---|---|---|---|---|---|---|---|---|
TP | 93.90% | 97.16% | 98.97% | 85.25% | 91.71% | 90.18% | 93.23% | 75.93% |
FN | 6.10% | 2.84% | 1.03% | 14.75% | 8.29% | 9.82% | 6.77% | 24.07% |
FP | 2.54% | 0.74% | 0.77% | 12.90% | 8.13% | 6.88% | 5.19% | 29.98% |
TN | 97.46% | 99.26% | 99.23% | 87.10% | 91.87% | 93.12% | 94.81% | 70.02% |
Accuracy | 95.81% | 98.29% | 99.11% | 86.25% | 91.78% | 91.50% | 93.94% | 73.28% |
Precision | 96.94% | 99.12% | 99.10% | 85.00% | 93.28% | 94.16% | 95.67% | 75.70% |
Sensitivity | 93.90% | 97.16% | 98.97% | 85.25% | 91.71% | 90.18% | 93.23% | 75.93% |
F-measure | 95.39% | 98.13% | 99.03% | 85.13% | 92.49% | 92.13% | 94.44% | 75.82% |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Almaiah, M.A.; Almomani, O.; Alsaaidah, A.; Al-Otaibi, S.; Bani-Hani, N.; Hwaitat, A.K.A.; Al-Zahrani, A.; Lutfi, A.; Awad, A.B.; Aldhyani, T.H.H. Performance Investigation of Principal Component Analysis for Intrusion Detection System Using Different Support Vector Machine Kernels. Electronics 2022, 11, 3571. https://doi.org/10.3390/electronics11213571