Electronics | Article | Open Access

13 November 2022

Cybersecurity in Smart Cities: Detection of Opposing Decisions on Anomalies in the Computer Network Behavior

1. Center for Applied Mathematics and Electronics, 11000 Belgrade, Serbia
2. Amity International Business School, Amity University, Noida 201303, India
3. Mathematical Institute of SASA, 11000 Belgrade, Serbia
4. School of Computing, Mathematics, and Engineering, Charles Sturt University, Bathurst, NSW 2795, Australia
This article belongs to the Special Issue Data-Driven Intelligent Technologies for Smart Cities

Abstract

The increased use of urban technologies in smart cities brings new challenges and issues. Cyber security has become increasingly important because many critical components of information and communication systems depend on it, including various applications and civic infrastructures that use data-driven technologies and computer networks. Intrusion detection systems monitor computer networks for malicious activity. Signature-based intrusion detection systems compare network traffic patterns to a set of known attack signatures and cannot identify unknown attacks. Anomaly-based intrusion detection systems monitor network traffic to detect changes in network behavior and can identify unknown attacks. The biggest obstacle to anomaly detection is building a statistical model of normality, which is difficult because a large amount of data is required to estimate the model. Supervised machine learning-based binary classifiers are excellent tools for classifying data as normal or abnormal. Feature selection and feature scaling are performed to eliminate redundant and irrelevant data. Of the 24 features of the Kyoto 2006+ dataset, nine numerical features are considered essential for model training. Min-Max normalization in the ranges [0,1] and [−1,1], Z-score standardization, and a new hyperbolic tangent (TH) normalization are used for scaling. The hyperbolic tangent normalization is based on the Levenberg-Marquardt damping strategy and linearization of the hyperbolic tangent function, which has a narrow slope gradient around zero. Due to their proven classification ability, feedforward neural network, decision tree, support vector machine, k-nearest neighbor, and weighted k-nearest neighbor models are used in this study. Overall accuracy decreased by less than 0.1%, while processing time was reduced more than two-fold. The results show a clear benefit of TH scaling regarding processing time. Regardless of how accurate the classifiers are, their decisions can sometimes differ. Our study describes a conflicting-decision detector based on an XOR operation performed on the outputs of two classifiers: the fastest model, a feedforward neural network, and the more accurate but slower weighted k-nearest neighbor model. The results show that up to 6% of the decisions differ.

1. Introduction

The rapid development of smart cities reveals computer network connectivity and interoperability issues and highlights the problems that can arise in large-scale heterogeneous data processing. These issues hinder efforts to improve urban intelligence and environmental sustainability, even as data-driven smart city systems offer significant potential for key technologies and engineering practices. Understanding data management is important for unlocking the potential of smart cities [1,2]. Leveraging real-time data improves the operational efficiency, connectivity, decision-making, and overall performance of Internet of Things (IoT)-based computer networks and communications platforms for data collection, device management, and cloud solutions. The authors of [3] provide a realistic view of how organizations can evolve to the next level of maturity and how the forces driving this transition can adopt and benefit from IoT. Cloud services enable extensive connectivity between various IoT devices and sensors, resulting in billions of connected devices and massive amounts of data. Three principles must be followed: data must be sent dynamically over multiple channels, transmission must be secure, and the solution must be scalable. With the development of smart city technology, network security threats have become an important obstacle to maximizing the benefits of data-driven technologies, and intrusion detection has become an important prerequisite for protecting sensitive data [4,5].
The intrusion detection system (IDS) concept is inspired by the human immune system (HIS): humoral immunity protects the body from external pathogens (analogous to detecting malicious attacks), while cell-mediated immunity reacts to deviations in the body's own cells (a negative selection process related to detecting abnormalities) [6].
The primary purpose of an intrusion detection system is to monitor network traffic to detect patterns (signatures) of malicious attacks or deviations from standard network functionality. A signature-based IDS compares incoming network traffic patterns against a known set of attack signatures. It is the simplest and most effective method against various common attacks. However, the performance of a signature-based IDS is limited to known attacks, i.e., the detector cannot identify unknown attacks. On the other hand, an anomaly-based IDS monitors the state of the network traffic to detect changes in network behavior and generates alerts when abnormalities are detected [7,8]. The main benefit of anomaly detection is identifying previously unknown suspicious behavior as well as known malicious activity. The biggest challenge is determining what is considered normal computer network behavior. Developing statistical models of normal network behavior is difficult because model estimation requires a large amount of data, which takes time and storage [9].
Binary classifiers based on supervised machine learning (ML) are good candidates for detecting “normality”, although they require large amounts of data [10]. The authors of [11] provide publication citation statistics for various ML techniques collected from 2005 to 2020. The results show that the most cited articles are related to Support Vector Machines (SVMs), followed by publications on neural networks, Decision Trees (DTs), and nearest-neighbor models.
In this study, we present five standard binary classifiers: the k-Nearest Neighbor (k-NN), weighted k-NN (wk-NN), DT, SVM, and Feedforward Neural Network (FNN). The k-NN is the best-known distance-based algorithm; it assigns a new instance to the class to which most of its k nearest neighbors belong [12,13]. A k-NN model with k = 10 and a Euclidean-distance similarity measure is used because of its robustness to noisy data, flexibility, and easy implementation [14]. The wk-NN model extends the k-NN model to improve accuracy by giving neighbors that are closer to the new instance a larger weight in the decision than more distant neighbors [15]. The weights are calculated as the inverse square of the Euclidean distance [16]. A medium Gaussian SVM provides high prediction speed in binary classification [15]; the model uses a hyperplane to classify instances in n-dimensional space [17]. DT models predict the class label of input data based on decisions made from the root to the leaf nodes [18,19]. Due to its high prediction speed and low memory cost, a medium DT (Iterative Dichotomiser 3 algorithm) with 20 splits is used [16]. An FNN with one hidden layer (nine input, nine hidden, and one output neuron) is used due to its fast processing speed and generalization ability [14,15]. It is one of the simplest and quickest models and relies on backpropagation: data are transferred from the input to the output, and the error of the cost function is then propagated backward to adjust the weights, producing results based on the predicted probabilities and classification thresholds [20].
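The study itself uses the MATLAB Classification Learner, so the following is only an illustrative sketch of the five classifiers with the hyperparameters described above, written in Python with scikit-learn. The parameter choices (the inverse-squared-distance weighting callable, the leaf-node cap approximating 20 splits, and the RBF kernel standing in for the medium Gaussian SVM) are our approximations of the stated settings, not the authors' exact configuration.

```python
# Illustrative scikit-learn approximations of the five classifiers described above;
# not the MATLAB Classification Learner configuration used in the paper.
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

models = {
    # k-NN: k = 10 with Euclidean distance and unweighted votes
    "k-NN": KNeighborsClassifier(n_neighbors=10, metric="euclidean"),
    # wk-NN: neighbors weighted by the inverse square of the Euclidean distance
    # (scikit-learn's built-in "distance" option weights by 1/d, so a callable is used)
    "wk-NN": KNeighborsClassifier(
        n_neighbors=10,
        metric="euclidean",
        weights=lambda d: 1.0 / (d ** 2 + 1e-12),
    ),
    # Medium decision tree: roughly 20 splits (scikit-learn grows CART trees, not ID3)
    "DT": DecisionTreeClassifier(max_leaf_nodes=21, random_state=0),
    # Medium Gaussian SVM: RBF (Gaussian) kernel
    "SVM": SVC(kernel="rbf"),
    # FNN: one hidden layer with nine neurons (nine inputs, one output)
    "FNN": MLPClassifier(hidden_layer_sizes=(9,), max_iter=500, random_state=0),
}
```

All five objects expose the same fit/predict interface, so they can be trained and timed in a single loop.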
The Kyoto 2006+ dataset was used as a benchmark for the experiments because it was created for anomaly detection and contains records of more than ten years of actual network traffic collected from honeypots on five computer networks within and outside Kyoto University [21]. Of its 24 features, 14 statistical features were extracted from the KDD Cup '99 dataset, and 10 additional features were added by the authors for further network analysis and evaluation of other network-based intrusion detection systems [22]. The Kyoto 2006+ dataset provides labeled instances that do not describe exact attack-specific details but give a separation between normal and abnormal network traffic [23].
Irrelevant features of the Kyoto 2006+ dataset are removed using feature selection, and the remaining features are scaled. Feature selection removes all categorical features, connection duration features, statistical features, and features intended for further analysis. After the nine relevant numerical features are identified, Min-Max normalization in the ranges [0,1] and [−1,1], Z-score standardization, and a novel tangent-hyperbolic (TH) normalization are used for scaling.
The idea of TH normalization is to scale the features to a fixed range [−0.7616, +0.7616], i.e., [tanh(−1), tanh(+1)], and then use a Levenberg-Marquardt (LM) damping strategy to speed up training and improve model performance [24,25]. The results show that TH scaling provides clear benefits in reducing processing time, which improves the efficiency of IoT and cloud computing operations [26].
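As a minimal sketch, assuming the TH normalization reduces to Min-Max scaling into [−1, 1] followed by the hyperbolic tangent (which lands every feature in [tanh(−1), tanh(+1)] = [−0.7616, +0.7616]); the LM damping strategy concerns the training procedure and is not reproduced here, and the function names are ours.

```python
import numpy as np

def minmax(x, lo=0.0, hi=1.0):
    """Min-Max normalization of a 1-D feature vector into [lo, hi]."""
    xmin, xmax = x.min(), x.max()
    return lo + (x - xmin) * (hi - lo) / (xmax - xmin)

def zscore(x):
    """Z-score standardization: zero mean, unit variance."""
    return (x - x.mean()) / x.std()

def tanh_norm(x):
    """Assumed form of the TH normalization: Min-Max into [-1, 1] followed by tanh,
    so every feature falls in the symmetric range [-0.7616, +0.7616]."""
    return np.tanh(minmax(x, -1.0, 1.0))

# One feature column scaled four ways (illustrative values only)
x = np.array([0.0, 3.0, 10.0, 57.0, 250.0])
for scale in (lambda v: minmax(v, 0, 1), lambda v: minmax(v, -1, 1), zscore, tanh_norm):
    print(scale(x))
```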
The performance of the above models is compared in terms of processing time, accuracy, F1-score, false alarms, and true positive rate. Each classifier needs to make very accurate decisions about anomalies. However, regardless of the overall accuracy of the classifiers, when working in parallel, one may consider network traffic normal while the other detects anomalies, and vice versa. This paper proposes an XOR-based detector to detect unusual conflicting decisions in computer networks.
In our previous work [14,15], we introduced the concept of XOR detection. Since the result of the XOR operation is 1 only if the two input bits differ, the total number of different decisions can be calculated as the sum of all XOR results. The XOR-based detector is designed to compare the outcomes of two binary classifiers: the FNN (an eager learner) and the wk-NN (a lazy learner). The results show a small percentage of conflicting decisions that is not affected by record size, model accuracy, or processing time. The detector can be used to warn not only of anomalies but also of potentially harmful activities in smart city computer networks concerning privacy, data breaches, and more.
The remainder of this paper is organized as follows: Section 2 presents a systematic review of relevant references on anomaly-based intrusion detection. The Kyoto 2006+ dataset is compared with 13 other datasets most commonly used for IDS experiments. Feature selection and feature scaling are briefly introduced and the TH normalization is described in detail. Section 3 introduces the concept of an XOR-based conflicting decisions detector. Section 4 presents the experimental results. Section 5 concludes this paper.

3. Proposed Work

In machine learning, classification refers to predicting a class label for a given instance of input data. A supervised ML model learns from the training set and its true labels and then makes predictions on the test set. Binary classification is used when a binary label is assigned to an unknown data instance. Figure 1 depicts a diagram of the classification process for the five binary classifiers presented in this work.
Figure 1. Binary classification process.
First, all irrelevant features are removed from the Kyoto 2006+ dataset. Feature scaling is then performed. In the training phase, the classifier is trained using a known set of data instances. The classifier is then tested on an unknown data set. Each classifier is expected to be highly accurate in its decision-making.
However, regardless of the accuracy of the classifiers, when working in parallel, one may detect anomalies while the other considers the network data normal, and vice versa. We propose an XOR-based detector of conflicting decisions designed to compare the outcomes of two binary classifiers. The basic idea of this detector is to apply an XOR bitwise operation to the classification results. Figure 2 shows the conceptual design of a detector that makes a decision based on the outputs of the FNN (eager learner) and wk-NN (lazy learner) [15].
Figure 2. Detector of the conflicting decisions.
In conflicting-decision detection, whether the outcomes of the classifiers are both true or both false is considered insignificant; what matters is whether the results differ. If the classifiers make different decisions, the XOR logical operation performed on their outputs results in one; otherwise, the result is zero. The number of different decisions can then be determined as in Equation (8):
$$\mathrm{sum}_{xor} = \sum_{i=1}^{n} \mathrm{xor}\left(\mathrm{out}_{1i}, \mathrm{out}_{2i}\right),$$
where $\mathrm{sum}_{xor}$ represents the cumulative sum over $n$ decisions, $\mathrm{out}_{1i}$ and $\mathrm{out}_{2i}$ are the outcomes of the classifiers for $i = 1, \ldots, n$, and $\mathrm{xor}(\mathrm{out}_{1i}, \mathrm{out}_{2i})$ is logically true (1) if the decisions differ; otherwise, the result is false (0). If highly sensitive data must be protected, the detection of opposing decisions can help raise additional alarms, related not only to anomalies in the computer network but also to potentially harmful non-standard challenges in 5G networks concerning privacy, centered around location tracking, semantic information attacks, leakage from access points, etc. In [64,65], the authors discuss security in terms of alert ranking and the accuracy of criteria selection.
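Equation (8) translates directly into a few lines of code. The sketch below assumes the two classifiers' decisions are available as binary (0/1) vectors of equal length; the names are ours.

```python
import numpy as np

def conflicting_decisions(out1, out2):
    """XOR-based count of conflicting decisions, as in Equation (8).

    out1, out2: binary (0/1) decision vectors of the two classifiers
                over the same n instances.
    Returns the number of instances on which the decisions differ."""
    out1 = np.asarray(out1, dtype=bool)
    out2 = np.asarray(out2, dtype=bool)
    return int(np.sum(out1 ^ out2))  # XOR is 1 only where the two decisions differ

# Example: the classifiers disagree on two of six instances
fnn_out  = [1, 0, 1, 1, 0, 0]   # e.g., FNN decisions
wknn_out = [1, 1, 1, 0, 0, 0]   # e.g., wk-NN decisions
print(conflicting_decisions(fnn_out, wknn_out))  # -> 2
```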

4. Results and Discussion

The MATLAB Classification Learner is used to compare the effects of feature selection on binary classification. First, instances containing Not-a-Number (NaN) values, which MATLAB cannot process, are removed. In the first part of the experiments, accuracy and processing time were compared for 17 and nine features. Table 5 displays the accuracy (ACC) and processing time (tp) for four daily records with different numbers of instances in the data sets.
Table 5. ACC and tp for four daily records.
For all of the tested models, the processing time was significantly shorter when nine features were used for model evaluation than when 17 features were used. As expected, the processing time grows as the number of instances increases. At the same time, the accuracy of the SVM model decreased by ~0.2%, followed by the k-NN and wk-NN models at ~0.8%, and the DT model at ~2%. For these reasons, nine features are assumed to be sufficient for the experiments on the effects of feature scaling on accuracy and processing time and on XOR-based opposing-decision detection. In this part of the experiments, the feedforward neural network was not tested because it cannot handle non-numerical features.
The second part of the experiment examines the effect of feature scaling. A daily record of the Kyoto 2006+ dataset with approximately 60,000 instances was used as a benchmark. The experiments were performed as follows. First, instances containing NaN values are removed, and a subset of 57,300 instances is used as a benchmark for the experiments. Second, all irrelevant features are removed. The Label feature is retained for the decision about anomalies: when Label = 1, the network traffic is considered normal; otherwise, anomalies are detected (Label = −1). Third, four feature scaling methods are used: TH, Min-Max in the ranges [0,1] and [−1,1], and Z-score standardization. The Z-score standardization was used first for training.
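A brief sketch of these three preprocessing steps is given below, assuming the daily record is loaded into a pandas DataFrame; the column names and the per-feature scaling functions (from the earlier sketch) are placeholders, not the exact Kyoto 2006+ feature names.

```python
import numpy as np
import pandas as pd

def preprocess(df, numeric_cols, scale):
    """Illustrative preprocessing of one daily record.

    df           : pandas DataFrame with the selected numerical features and a 'Label' column
    numeric_cols : list of the nine numerical feature names (placeholders here)
    scale        : scaling function applied per feature, e.g. tanh_norm, zscore,
                   or a Min-Max variant from the sketch above
    """
    df = df.dropna(subset=numeric_cols + ["Label"])         # 1. remove instances with NaN values
    y = (df["Label"] != 1).astype(int).to_numpy()            # 2. Label = 1 -> normal (0), otherwise anomaly (1)
    X = np.column_stack([scale(df[c].to_numpy(float)) for c in numeric_cols])  # 3. scale each feature
    return X, y
```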
With the Z-score standardization, the features are not scaled to the same fixed range, and the model is trained using normally distributed instances. The results show that the decisions of all models are very accurate except for the DT classifier. In addition, the processing time of the two nearest-neighbor models is significantly longer than that of all other models. Min-Max scaling in the range [0,1] is then used to solve the problem of different scales. Compared to the previous results, it can be seen that this feature scaling has no positive effect on any classifier.
Furthermore, Min-Max scaling in the range [−1,1] is used to avoid problems caused by very large or very small derivatives, but it does not affect the classification results. Although the results showed very accurate models, the problem of long processing times remains. Finally, the TH scaling is used, and all features are scaled into the same symmetric range of ±0.7616. Compared to the other scaling methods, the results show a significant reduction in processing time, which is cut by more than half for the nearest-neighbor models. The results are presented in terms of accuracy, processing time, FP, TPR, and F1-score and support the assumption that using the TH scaling to accelerate training is reasonable. All models except the SVM have high accuracy and F1-scores. The results are given in Table 6.
Table 6. ACC, tp, F1-score, FP, and TPR.
Overall, the results show that the TH scaling has a significant positive effect on processing time at the expense of a slight decrease in accuracy and F1-score. Scaling the features within ±0.7616 ensures that each feature is equally important in the decision and that no feature dominates the others. The results also show that the damping strategy speeds up model training and that feature scaling does not affect the number of false positive results. The percentage of FPs relative to the total number of instances varies from 0.065%, when Z-score standardization is used for scaling and the FNN is the classifier, to 0.11%, when Min-Max normalization in the range [−1,1] is used for scaling and the SVM is the classifier. Low false positive counts and high TPR for all models indicate the applicability of the proposed feature selection method. The wk-NN model showed the highest classification accuracy in all cases except when the FNN was trained with data scaled in the range [−1,1]; in that case, the accuracy difference between the wk-NN model and the feedforward neural network is 0.05%. In addition, the wk-NN model has the best F1-score except for the FNN trained on data scaled in the range [0,1].
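For reference, the reported quantities can be computed from the confusion matrix as sketched below, assuming anomalies are treated as the positive class (the paper does not spell out this convention); the FP percentage is taken over all instances, as in the text.

```python
import numpy as np

def evaluation_metrics(y_true, y_pred):
    """Accuracy, FP count and percentage, TPR, and F1-score from binary labels
    (1 = anomaly, treated here as the positive class)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    acc = (tp + tn) / y_true.size
    tpr = tp / (tp + fn)                       # true positive rate (recall)
    precision = tp / (tp + fp)
    f1 = 2 * precision * tpr / (precision + tpr)
    fp_pct = 100.0 * fp / y_true.size          # FPs as a percentage of all instances
    return acc, int(fp), fp_pct, tpr, f1
```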
To demonstrate the functionality of the XOR-based model, we ran experiments on three daily records containing 57,270, 57,280, and 58,300 instances. First, we divided each daily record so that two-thirds of the instances are used to evaluate the models and one-third is used to calculate the sum of the detected conflicting decisions. Given the findings so far, the classifiers were expected to assess network behavior identically. To examine this expectation, the decisions of the fastest model (FNN) and the most accurate model (wk-NN) are compared (see Table 7).
Table 7. Detection of different decisions.
The results show that the number of conflicting decisions between the weighted k-NN model and the feedforward neural network is independent of the number of instances. Uncertainty in the results can be caused by data errors, residual errors in the models, unidentified malicious attacks, etc. It should be noted that the XOR-based detector of conflicting decisions cannot identify the cause of a specific conflicting decision; it only provides additional warnings to the authorities in such cases. The decision criteria can be chosen in different ways depending on the sensitivity of the data, the technology used, legal practice, etc. [66].

5. Conclusions

As technology advances, the number of cyber-attacks has increased exponentially. As a result, detecting and predicting cyber-attacks is essential for any system that processes sensitive data. In principle, detecting network behavior anomalies is a simple process of determining what is “normal” and what is an “anomaly”. With the rapid growth of computer networks and increasingly fast data processing, classifiers need to improve their predictive performance. The studies presented in this paper can serve as a reference for researchers who want to use new methods for feature selection and scaling, or who want to choose appropriate algorithms based on their application scenarios and available resources.
Many authors use datasets that are simulations of network traffic. In such cases, duplicate and redundant records can increase the processing burden of model estimation and reduce the model's overall accuracy. The Kyoto 2006+ dataset is a publicly available 10-year data set of real network traffic designed for anomaly detection. The issue of the data set size is solved by feature selection and scaling. The nine numerical features are scaled using TH, Min-Max [0,1], and Min-Max [−1,1] normalization and Z-score standardization. Five ML-based binary classifiers, namely FNN, k-NN, wk-NN, DT, and SVM, are used to determine whether the network is working properly. The classifiers' performance is compared using accuracy, processing time, number of false alarms, TPR, and F1-score.
This paper proposes an XOR-based model to detect conflicting decisions about abnormal computer network behavior. The outputs of the fastest model (FNN) and the most accurate model (wk-NN) are compared, and it has been demonstrated that their decisions sometimes differ. After the classifiers' results are XORed, the sum of the non-zero bits gives the number of opposing conclusions. The results show that dataset size, model accuracy, and processing time do not affect the number of conflicting decisions.

Author Contributions

Conceptualization, D.P. and M.S.; methodology, D.P.; software, D.P.; validation, D.P., L.G., M.S. and M.A.R.; formal analysis, D.P.; investigation, D.P. and M.S.; resources, D.P. and M.S.; data curation, D.P.; writing—original draft preparation, D.P. and L.G.; writing—review and editing, L.G. and M.S.; visualization, D.P.; supervision, L.G., M.S. and M.A.R.; project administration, D.P., L.G., M.S. and M.A.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

http://www.takakura.com/Kyoto_data/ (accessed on 11 May 2020).

Conflicts of Interest

The authors declare that they have no conflict of interest.

References

  1. Fang, Y.; Shan, Z.; Wang, W. Modeling and key technologies of a data driven smart cities. IEEE Access 2021, 9, 91244–91258. [Google Scholar] [CrossRef]
  2. Rahman, A.; Al-Saggaf, Y.; Zia, T. A data mining framework to predict cyber attack for cyber security. In Proceedings of the 15th IEEE Conference on Industrial Electronic and Applications, Kristiansand, Norway, 9–13 November 2020; pp. 207–212. [Google Scholar]
  3. Ramakrishnan, R.; Gaur, L. Internet of Things: Approach and Applicability in Manufacturing; Chapman and Hall: London, UK; CRC: Boca Raton, FL, USA, 2019. [Google Scholar] [CrossRef]
  4. Kaularachchi, Y. Implementing data driven smart city applications for future cities. Smart Cities 2022, 5, 455–474. [Google Scholar] [CrossRef]
  5. Mohamed, N.; Al-Jaroodi, J.; Jawhar, I. Opportunities and challenges of data-driven cybersecurity for smart cities. In Proceedings of the 2020 IEEE Systems Security Symposium, Crystal City, VA, USA, 1 July–1 August 2020; pp. 1–7. [Google Scholar] [CrossRef]
  6. Aliyu, F.; Sheltami, T.; Deriche, M.; Nasser, N. Human immune-based intrusion detection and prevention system for fog computing. J. Netw. Syst. Manag. 2020, 30, 11. [Google Scholar] [CrossRef]
  7. Sen, J.; Methab, S. Machine Learning Applications in Misuse and Anomaly Detection. 2009. Available online: https://arxiv.org/ftp/arxiv/papers/2009/2009.06709.pdf (accessed on 18 July 2022).
  8. Bialas, A.; Michalak, M.; Flisiuk, B. Anomaly detection in network traffic security assurance. In Engineering in Dependability of Computer Systems and Networks; Zamojski, W., Mayurkiewicy, J., Sugier, J., Walkowiak, T., Kacprzyk, J., Eds.; Springer: Cham, Switzerland, 2020; p. 987. [Google Scholar] [CrossRef]
  9. Almomani, O. A feature selection model for network intrusion detection system based on PSO, GWO, FFA and GA algorithms. Symmetry 2020, 12, 1046. [Google Scholar] [CrossRef]
  10. Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. Network anomaly detection: Methods systems and tools. IEEE Commun. Surv. Tutor. 2014, 16, 303–336. [Google Scholar] [CrossRef]
  11. Kumar, S.; Gupta, S.; Arora, S. Research trends in network-based intrusion detection systems: A review. IEEE Access 2021, 9, 157761–157779. [Google Scholar] [CrossRef]
  12. Bohara, B.; Bhuyan, J.; Wu, F.; Ding, J. A survey on the use of data clustering for intrusion detection system in cybersecurity. Int. J. Netw. Secur. Its Appl. 2020, 12, 1. [Google Scholar] [CrossRef]
  13. Lin, I.-C.; Chang, C.-C.; Peng, C.-H. An anomaly-based IDS framework using centroid-based classification. Symmetry 2022, 14, 105. [Google Scholar] [CrossRef]
  14. Protic, D.; Stankovic, M.; Antic, V. WK-FNN design for detection of anomalies in the computer network traffic. Facta Univ. Ser. Electron. Energetics 2022, 35, 269–282. [Google Scholar] [CrossRef]
  15. Protic, D.; Stankovic, M. A hybrid model for anomaly-based intrusion detection in complex computer networks. In Proceedings of the 21st International Arab Conference on Information Technology (ACIT), Giza, Egypt, 28–30 November 2020; pp. 1–8. [Google Scholar] [CrossRef]
  16. Protic, D.; Stankovic, M. Detection of anomalies in the computer network behaviour. Eur. J. Eng. Form. Sci. 2021, 4, 10–17. [Google Scholar] [CrossRef]
  17. Ahmed, I.; Shin, H.; Hong, M. Fast content-based file type identification. In Advances in Digital Forensics VII; Springer: Berlin/Heidelberg, Germany, 2011. [Google Scholar] [CrossRef]
  18. Ruggieri, S. Complete search for feature selection decision trees. J. Mach. Learn. Res. 2019, 20, 1–34. [Google Scholar]
  19. Pham, B.T.; Jaafari, A.; Avand, M.; Al-Ansari, N.; Du, T.D.; Yen, H.P.H.; Phong, T.V.; Nguyen, D.H.; Le, H.V.; Mafi-Gholami, D.; et al. Performance evaluation of machine learning methods for forest fire modeling and prediction. Symmetry 2020, 12, 1022. [Google Scholar] [CrossRef]
  20. Hardesty, L. Explained: Neural networks. MIT News, 14 April 2017. Available online: https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414 (accessed on 11 July 2021).
  21. Yusof, N.N.M.; Sulaiman, N.S. Cyber attack detection dataset: A review. J. Phys. Conf. Ser. 2022, 2319, 1–6. [Google Scholar] [CrossRef]
  22. Song, J.; Takakura, H.; Okabe, Y.; Eto, M.; Inoue, D.; Nakao, K. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, Salzburg, Austria, 10–13 April 2011; pp. 29–36. [Google Scholar] [CrossRef]
  23. Mills, R.; Marnerides, A.K.; Broadbent, M.; Race, N. Practical Intrusion Detection of Emerging Threat. Available online: https://eprints.lancs.ac.uk/id/eprint/156068/1/TNSM_Paper_Accepted_Version.pdf (accessed on 5 September 2022).
  24. Levenberg, K. A method for the solution of certain problems in least squares. Q. Appl. Math. 1944, 5, 164–168. [Google Scholar] [CrossRef]
  25. Marquardt, D. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 1963, 11, 431–441. [Google Scholar] [CrossRef]
  26. Su, P.; Chen, Y.; Lu, M. Smart city information processing under internet of things and cloud computing. J. Supercomput. 2022, 78, 3676–3695. [Google Scholar] [CrossRef]
  27. Raza, S.; Wallgren, L.; Voigt, T. SVELTE: Real-time intrusion detection in the Internet of Things. Ad Hoc Netw. 2013, 11, 2661–2674. [Google Scholar] [CrossRef]
  28. Shrestha, R.; Omidkar, A.; Ahmadi Roudi, S.; Abbas, R.; Kim, S. Machine-learning enabled intrusion detection system for cellular connected UAV Networks. Electronics 2021, 10, 1549. [Google Scholar] [CrossRef]
  29. Alsheikh, M.A.; Lin, S.; Niyato, D.; Tan, H.P. Machine learning in wireless sensor networks: Algorithms, strategies, and applications. IEEE Commun. Surv. Tutor. 2014, 16, 1996–2018. [Google Scholar] [CrossRef]
  30. Kumar, Y.V.; Kamatchi, K. Anomaly based network intrusion detection using ensemble machine learning technique. Int. J. Res. Eng. Sci. Manag. 2020, 3, 290–297. [Google Scholar]
  31. Pai, V.; Devidas, B.; Adesh, N.D. Comparative analysis of machine learning algorithms for intrusion detection. IOP Conf. Ser. Mater. Sci. Eng. 2021, 1013, 1–7. [Google Scholar] [CrossRef]
  32. Ring, M.; Wunderlich, S.; Scheuring, D.; Landes, D.; Hotho, A.A. A survey of network-based intrusion detection data sets. Comput. Secur. 2019, 86, 147–167. [Google Scholar] [CrossRef]
  33. Abiodun, O.I.; Jantan, A.; Omolara, A.E.; Dada, K.V.; Mohamed, N.A.; Arshad, H. State-of-the-art in artificial neural network applications: A survey. Heliyon 2018, 4, e00938. [Google Scholar] [CrossRef] [PubMed]
  34. Band, S.S.; Ardabili, S.; Sookhak, M.; Chronopoulos, A.T.; Elnaffar, S.; Moslehpour, M.; Csaba, M.; Torok, B.; Pai, H.T.; Mosavi, A. When smart cities get smarter via machine learning: An in-depth literature review. IEEE Access 2022, 10, 60985–61015. [Google Scholar] [CrossRef]
  35. SIGKDD-KDD Cup. KDD Cup 1999: Computer Network Intrusion Detection. 2018. Available online: www.kdd.org (accessed on 21 September 2022).
  36. McCarthy, R. Network Analysis with the Bro Security Monitor. 2014. Available online: https://www.admin-magazine.com/Archive/2014/24/Network-analysis-with-the-Bro-Network-Security-Monitor (accessed on 21 September 2022).
  37. Ambusaidi, M.A.; He, X.; Nanda, P.; Tan, Z. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans. Comput. 2016, 65, 2986–2998. [Google Scholar] [CrossRef]
  38. Bistron, M.; Piotrowsk, Z. Artificial intelligence applications in military systems and their influence on sense of security of citizens. Electronics 2021, 10, 871. [Google Scholar] [CrossRef]
  39. Maza, S.; Touahria, M. Feature selection algorithms in intrusion detection system: A survey. KSII Trans. Internet Inf. Syst. 2018, 12, 5079–5099. [Google Scholar] [CrossRef]
  40. Kousis, A.; Tjortjis, C. Data mining algorithms for smart cities: A bibliometric analysis. Algorithms 2021, 14, 242. [Google Scholar] [CrossRef]
  41. Cheong, Y.G.; Park, K.; Kim, H.; Kim, J.; Hyun, S. Machine learning based intrusion detection systems for class imbalanced datasets. J. Korea Inst. Inf. Secur. Cryptol. 2017, 27, 1385–1395. [Google Scholar] [CrossRef]
  42. Nawi, N.M.; Atomi, W.H.; Rehman, M.Z. The effect of data preprocessing on optimizing training on artificial neural network. Procedia Technol. 2013, 11, 23–39. [Google Scholar] [CrossRef]
  43. Weston, J.; Elisseff, A.; Schoelkopf, B.; Tipping, M. Use of the zero norm with linear models and kernel methods. J. Mach. Learn. Res. 2003, 3, 1439–1461. [Google Scholar]
  44. Song, L.; Smola, A.; Gretton, A.; Borgwardt, K.; Bedo, J. Supervised feature selection via dependence estimation. In Proceedings of the International Conference on Machine Learning, 2007, Corvallis, OR, USA, 20–24 June 2007; Available online: http://www.gatsby.ucl.ac.uk/~gretton/papers/SonSmoGreetal07.pdf (accessed on 7 August 2022).
  45. Dy, J.G.; Brodley, C.E. Feature selection for unsupervised learning. J. Mach. Learn. Res. 2005, 5, 845–889. [Google Scholar]
  46. Mitra, P.; Murthy, C.A.; Pal, S. Unsupervised feature selection using feature similarity. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 301–312. [Google Scholar] [CrossRef]
  47. Zhao, Z.; Liu, H. Semi-supervised feature selection via spectral analysis. In Proceedings of the SIAM International Conference on Data Mining, Minneapolis, MN, USA, 26–28 April 2007; pp. 641–646. [Google Scholar] [CrossRef]
  48. Xu, Z.; Jin, R.; Ye, J.; Lyu, M.; King, I. Discriminative semi-supervised feature selection via manifold regularization. IEEE Trans. Neural Netw. 2010, 21, 1033–1047. [Google Scholar] [PubMed]
  49. Swathi, K.; Rao, B.B. Impact of PDS based kNN classifiers on Kyoto dataset. Int. J. Rough Sets Data Anal. 2019, 6, 61–72. [Google Scholar] [CrossRef]
  50. Uhm, Y.; Pak, W. Service-aware two-level partitioning for machine learning-based network intrusion detection with high performance and high scalability. IEEE Access 2020, 9, 6608–6622. [Google Scholar] [CrossRef]
  51. Singh, A.P.; Kaur, A. Flower pollination algorithm for feature analysis of Kyoto 2006+ dataset. J. Inf. Optim. Sci. 2019, 40, 467–478. [Google Scholar]
  52. Garcia, S.; Luengo, J.; Herera, F. Data preparation basic models. In Data Preprocessing in Data Mining; Intelligent System Reference Library; Springer: Berlin/Heidelberg, Germany, 2015; Volume 72, pp. 39–57. [Google Scholar] [CrossRef]
  53. Al-Imran, M.; Ripon, S.H. Network Intrusion Detection: An analytical assessment using deep learning and state-of-the-art machine learning models. Int. J. Comput. Intell. Syst. 2021, 14, 1–20. [Google Scholar] [CrossRef]
  54. Obaid, H.S.; Dheyab, S.A.; Sabry, S.S. The impact of data pre-processing techniques and dimensionality reduction on the accuracy of machine learning. In Proceedings of the 9th Annual Information Technology, Electromechanical Engineering and Microelectronics Conference (IEMECON), Jaipur, India, 13–15 March 2019; pp. 279–283. [Google Scholar] [CrossRef]
  55. Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 1–22. [Google Scholar] [CrossRef]
  56. Ferryian, A.; Thamrin, A.H.; Takeda, K.; Murai, J. Generating network intrusion detection dataset based on real and encrypted synthetic attack traffic. Appl. Sci. 2021, 11, 7868. [Google Scholar] [CrossRef]
  57. Soltani, M.; Siavoshani, M.J.; Jahangir, A.H. A content-based deep intrusion detection system. Int. J. Inf. Secur. 2022, 21, 547–562. [Google Scholar] [CrossRef]
  58. Tsai, C.-F.; Hsu, Y.-F.; Lin, C.-Y.; Lin, W.-Y. Intrusion detection by machine learning. Expert Syst. Appl. 2009, 36, 11994–12000. [Google Scholar] [CrossRef]
  59. Serkani, E.; Gharaee, H.; Mohammadzadeh, N. Anomaly detection using SVM as classifier and DT for optimizing feature vectors. ISeCure 2019, 11, 159–171. [Google Scholar]
  60. Rahman, A.; Islam, Z. AWST: A novel attribute weight selection technique for data clustering. In Proceedings of the 13th Australasian Data Mining Conference (AusDM 2015), Sydney, Australia, 8–9 August 2015; pp. 51–58. [Google Scholar]
  61. Rahman, M.A.; Islam, M.Z. CRUDAW: A novel fuzzy technique for clustering records following user defined attribute weights. In Proceedings of the Tenth Australasian Data Mining Conference (AusDM 2012), Sydney, Australia, 5–17 December 2012; Volume 134, pp. 27–42. [Google Scholar]
  62. Lampton, M. Damping-undamping strategies for Levenberg-Marquardt least-squares method. Comput. Phys. 2019, 11, 110–115. [Google Scholar] [CrossRef]
  63. Dinov, I.D. Data Science and Predictive Analytics; Springer: Ann Arbor, MI, USA, 2018. [Google Scholar]
  64. Allier, S.; Anquetil, N.; Hora, A.; Ducasse, S. A framework to compare alert ranking algorithms. In Proceedings of the 19th Working Conference on Reverse Engineering, 2012, Kingston, ON, Canada, 15–18 October 2012. [Google Scholar]
  65. Zhao, N.; Jin, P.; Wang, L.; Yang, X.; Liu, R.; Zhang, W.; Sui, K.; Pei, D. Automatically and adaptively identifying severe alerts for online service systems. In Proceedings of the IEEE INFOCOM 2020—IEEE Conference on Computer Communications, Toronto, ON, Canada, 6–9 July 2020. [Google Scholar]
  66. Gaur, L.; Solanki, A.; Jain, V.; Khazanchi, D. Handbook of Research on Engineering Innovations and Technology Management in Organizations; ICI Global: Brussels, Belgium, 2020. [Google Scholar]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
