Electronics
  • Editor’s Choice
  • Article
  • Open Access

12 March 2024

IoT-Based Intrusion Detection System Using New Hybrid Deep Learning Algorithm

Department of Information Security Engineering, Graduate School of Natural and Applied Sciences, Gazi University, 06560 Ankara, Turkey
*
Author to whom correspondence should be addressed.
This article belongs to the Section Artificial Intelligence

Abstract

The most significant threat that networks established in IoT may encounter is cyber attacks. The most commonly encountered attacks among these threats are DDoS attacks. After attacks, the communication traffic of the network can be disrupted, and the energy of sensor nodes can quickly deplete. Therefore, the detection of occurring attacks is of great importance. Considering numerous sensor nodes in the established network, analyzing the network traffic data through traditional methods can become impossible. Analyzing this network traffic in a big data environment is necessary. This study aims to analyze the obtained network traffic dataset in a big data environment and detect attacks in the network using a deep learning algorithm. This study is conducted using PySpark with Apache Spark in the Google Colaboratory (Colab) environment. Keras and Scikit-Learn libraries are utilized in the study. ‘CICIoT2023’ and ‘TON_IoT’ datasets are used for training and testing the model. The features in the datasets are reduced using the correlation method, ensuring the inclusion of significant features in the tests. A hybrid deep learning algorithm is designed using one-dimensional CNN and LSTM. The developed method was compared with ten machine learning and deep learning algorithms. The model’s performance was evaluated using accuracy, precision, recall, and F1 parameters. Following the study, an accuracy rate of 99.995% for binary classification and 99.96% for multiclassification is achieved in the ‘CICIoT2023’ dataset. In the ‘TON_IoT’ dataset, a binary classification success rate of 98.75% is reached.

1. Introduction

Wireless sensor networks serve as a bridge between the real world and the digital world. A network created by connecting sensors to each other to sense the real world and relay these data to the digital world is generally called a Wireless Sensor Network (WSN). WSNs provide real-time data flow in various fields, such as military surveillance, battlefield monitoring, forest fire tracking, building security monitoring, and healthcare services. A WSN is part of the Internet of Things (IoT), and the collected data are processed, analyzed, and presented to the user with the help of a base station. A WSN typically includes at least one gateway node that serves as a link between the network and the external world [1].
Communication between these sensors and the gateway node should be conducted in the most energy-efficient way because sensor nodes have limited energy and their batteries cannot be recharged. Due to the characteristic features of wireless sensor networks, the communication methods should be simple, efficient, and easily adaptable to different scenarios. As a result, the resources they can use are limited. Due to factors such as low power consumption, processor constraints, and the inability to add some devices due to cost, wireless sensor networks are vulnerable to attacks. Security and privacy are crucial considerations in these systems. Enhancing them and adding new features will require overcoming obstacles, such as energy constraints and processor limitations. Applying traditional security measures, such as cryptography, to such networks is challenging because WSNs are highly vulnerable to attacks due to their open and distributed structures and the limited resources of sensor nodes. Additionally, frequent broadcasting of packets in WSNs may be necessary, and sensor nodes can be randomly deployed in an environment, making it easy for a malicious attack to be injected into the WSN [2].
An aggressive sensor can compromise the network, eavesdrop on messages, inject spoofed messages, alter the integrity of data, and waste network resources. Denial of Service (DoS) attacks are considered one of the most common and dangerous threats to WSN security. DoS attacks remain a significant challenge today. There are various forms of this attack, and its main purpose is to interrupt or suspend the services provided by WSNs [2]. The destructive impact of DoS attacks is that they consume the power resources of nodes and significantly shorten their operational lifetimes. Therefore, sensors can die quickly because of DoS attacks. Nodes that run out of power become useless; hence, the intended use of the WSN is compromised.
As modern technology is based on data, DoS attacks are a very dangerous and significant attack type. Data is known as raw information and becomes meaningful after processing. With the beginning of the computer age, the amount of data used has significantly increased. Logs left by network traffic, system events, and system components can be included in big data. Thanks to big data analytics and related technologies, data streams can be continuously monitored, and anomalies and changes in the network can be detected to ensure network security. By working together, big data and artificial intelligence algorithms can analyze the past and present data of the network, determining whether the current network traffic is normal or attacked [3].
In networks with limited resources that are vulnerable to attacks, there are systems capable of detecting these attacks in real time and alerting the relevant sensor node. These systems are called Intrusion Detection Systems (IDS). An IDS is a proactive attack detection tool used to detect and classify unauthorized entries, attacks, or violations of security policies in a timely manner [4]. Due to the limited resources of sensor nodes, these intrusion detection systems should have high accuracy and should not impose an additional burden on the network’s resource consumption.
In this study, a new intrusion detection system for a big data environment is developed with a hybrid deep learning algorithm. The algorithm is implemented in PySpark, the Python API of Apache Spark, using the Google Colab environment. Apache Spark is preferred because of its fast execution and because the data used by the algorithm are big data. This intrusion detection algorithm is trained and tested using the CICIoT2023 and TON_IOT datasets. The system is evaluated for both binary and multiclass classification, using evaluation parameters such as accuracy, precision, recall, and F1-score. The developed model was compared with ten traditional machine learning and deep learning algorithms (Random Forest (RF), Decision Tree (DT), Gradient Boost (GB), AdaBoost (ADA), Naive Bayes (NB), Logistic Regression (LR), K-Nearest Neighbor (KNN), Convolutional Neural Network (CNN), Multi-layer Perceptron (MLP) and Long Short-Term Memory (LSTM)).
The contributions of the proposed study to the literature are as follows:
  • A new IDS has been developed efficiently in a big data environment using a new hybrid deep learning algorithm.
  • The developed algorithm has been tested for both binary and multiclass classification and achieved high accuracy in both cases.
  • The developed hybrid algorithm has been compared with ten widely used machine learning and deep learning algorithms. The results showed that the proposed hybrid method has better accuracy than traditional methods.
  • Deep learning algorithms, such as CNN and LSTM, were individually tested. It was observed that the hybrid algorithm created using CNN and LSTM performs better than using them separately.
  • High accuracy has been achieved, without the use of any balancing methods, on a large dataset such as CICIoT2023, which exhibits an imbalanced class distribution.
  • The addition of a second dataset to the study resulted in a high intrusion detection rate in a different dataset.
The study consists of eight sections. Section 2 describes the studies conducted with the CICIoT2023 and TON_IOT datasets. Section 3 defines the Distributed Denial of Service (DDoS) attacks present in the dataset and provides a general description of anomaly detection systems developed to prevent them. Section 4 describes the preprocessing stages of the used dataset and introduces the deep learning algorithms employed in the study. Section 5 explains the developed algorithm. The evaluation results of the model and their comparison are given in Section 6. Section 7 includes a discussion of the study. Section 8 presents the conclusion and future work.

3. DDoS Attacks and Intrusion Detection System for IoTs

This section provides information about DDoS attacks in IoT and the Intrusion Detection Systems (IDS) used to detect them.

3.1. Intrusion Detection Systems

As communication technologies continue to evolve, the security of the devices used and of the networks they form has become crucial. Cyber attacks are defined as all attempts that can threaten the confidentiality, integrity, and accessibility of information [23]. The integrity and confidentiality of the data stored must be protected from these attacks. Intrusion detection systems are used for this purpose. These systems distinguish attacks from normal traffic and can warn the user in case of an attack. An IDS, whose general working mechanism is given in Figure 1, records all activities that differ from normal traffic as anomalies. These systems are also prone to false alarms. Therefore, it is very important that the IDSs that are developed have high accuracy. In this study, both attack detection and attack type determination were performed with high accuracy.
Figure 1. Intrusion detection systems.

3.2. DDoS

A DDoS attack is a malicious attempt to disrupt the normal traffic of the target system. DDoS attacks aim to disrupt service by sending packets that exceed the capacity of targeted source machines to respond to requests. Attackers use zombie computers created with malware inserted into victims’ computers to send large amounts of packets. DDoS attacks cause high network traffic with packets sent over the network, causing the system not to respond to the requests of normal users who want to receive service [4]. A DDoS attack is one of the biggest threats to internet-based applications and their resources. The aim of this attack is to incapacitate internet-based services by transmitting a substantial volume of attack traffic [13]. Since the attack types are given in the main category in the TON_IOT dataset, the details of these attacks are not known. For example, it is not given which types are included under the main category of DDoS. In the CICIoT2023 dataset, subcategories of attack types are also given. This dataset includes different attack types under the headings of flood and fragmentation. These attacks are described below.
In a SYN flood attack, the attacker consumes the resources of IoT devices by repeatedly sending synchronization (SYN) packets that leave TCP connection requests half-open. These connections are left open for further communication [4]. The victim machine, with all available ports in use, may respond slowly or not at all to legitimate traffic. A UDP flood attack is an attack in which large packets are sent by attackers without any permission using the User Datagram Protocol (UDP), a fast data-sharing protocol [4]. The ICMP (Internet Control Message Protocol) is the network protocol used for IP control and error reporting. In an ICMP flood attack, the attacker aims to take the network offline by sending too many ICMP requests [24]. RSTFIN flood attacks are carried out with the FIN and RST packets of the TCP protocol. The FIN packet is sent to gracefully terminate the TCP connection between the client and server. The RST packet is sent by the server in abnormal situations and is used to forcibly close the connection. In an RSTFIN flood attack, the attacker congests the system by sending FIN and RST packets that do not belong to the target network [24]. An HTTP flood attack focuses on generating attack traffic that closely resembles normal network traffic. Thus, it becomes difficult for the victim to distinguish between legitimate traffic and attack traffic. The aim is to exhaust the server’s resources by making session connection request rates higher than those generated by legitimate users [25]. In a Slowloris attack, the attacker sends an HTTP request slowly and in pieces, and the request is never completed. As a result, the server keeps the connection waiting for the remaining data. Over time, the number of open connections grows and congests the system [25].
The PSH packet used in the PSHACK flood attack is used to ensure that the client receiving this command sends all data to a specified application and that the data are processed. Packets with combinations of PSH and ACK are often seen in normal incoming traffic [26]. The attacker can create a PSHACK flood attack by intensively directing these packet combinations to the target server. A synonymous IP flood attack is a type of DDoS attack that aims to consume the resources of DNS servers by sending a high volume of requests for a nonexistent domain. This attack, which uses the TCP protocol, uses high-speed packets [27].
In fragmentation attacks, the aim is to send packets larger than the MTU (Maximum Transmission Unit), the largest size that can be transferred at the network entrance, thus ensuring that they are fragmented before being sent. This size is 1500 bytes in Ethernet networks. Fragmentation attacks are performed by sending frames larger than this value. An ACK fragmentation attack is a version of the ACK and PUSH-ACK flood attack. Fragmented packets pass through switches, firewalls, IDS, and IPS because the router does not reassemble fragmented frames. These packets may contain random and irrelevant information. With this attack, the aim is to make the victim consume resources [28]. In the ICMP fragmentation attack, fragmented ICMP packets are used.
The victim is exposed to ICMP packets that cannot be reassembled. Since these packets contain random and irrelevant information, the victim’s resources are consumed by trying to combine them [28]. A UDP fragmentation attack is an adaptation of a UDP flood attack. Since the fragmented UDP packets are deceptive and unrelated to each other, the target server wastes its resources by trying to reassemble them. This type of attack causes the victim’s CPUs to overheat and consume their resources unnecessarily [28].

4. Materials and Methods

In this section, information about the CICIoT2023 and TON_IOT datasets used in the study is provided. Afterwards, the preprocessing steps of the dataset are explained. Then, the deep learning algorithms used in the study are defined.

4.1. Dataset

Information about the CICIoT2023 and TON_IOT datasets used in the study is given in this section.

4.1.1. CICIoT2023

This dataset was produced by Neto et al. [21] and published in the University of New Brunswick (UNB)—Canadian Institute for Cybersecurity (CIC) database. An IoT topology consisting of 105 IoT devices was established. A total of 67 IoT devices were directly involved in the attacks, and another 38 Zigbee and Z-Wave devices were connected to five hubs. This topology is designed to mimic a real physical IoT smart home environment. The testbed consists of smart home devices, cameras, sensors, and microcontrollers that are connected and configured to allow the execution of various attacks. The test environment is also equipped with various tools and software that allow it to perform various attacks and capture both benign and malicious attack traffic. The testbed produced 33 different attack types. These attacks are classified into seven categories: DDoS, DoS, Recon, Web-based, brute force, spoofing, and Mirai. The dataset contains 47 features. Details of these features are given in Table 2.
Table 2. List of features of CICIoT2023.
Table 3 shows the number of attacks in the dataset and the attack classes they belong to. Figure 2 shows the distribution graph of attacks according to attack class. As can be seen in the figure, the DDoS is the most common class in the dataset, with 73%. This is followed by the DoS class, with 17%. Web-based and brute force attack classes are below 1%.
Table 3. Count of attack type of CICIoT2023.
Figure 2. Distribution of CICIoT2023.

4.1.2. TON_IOT

The TON_IoT [29] dataset was collected in a realistic, large-scale test environment at the UNSW Canberra Cyber Institute’s IoT Lab in 2019. The dataset includes a number of modern IoT attacks, such as scanning, DoS, DDoS, ransomware, backdoor, injection, cross-site scripting (XSS), password cracking, and Man-In-The-Middle (MITM) attacks. Figure 3 shows a data structure of the TON_IoT dataset. In this study, the Processed Windows 10 dataset, which is a subset of the TON_IoT, was used.
Figure 3. Data structure of TON_IoT dataset.
The test environment where the dataset would be recorded was designed with three layers consisting of edge, fog, and cloud to emulate a realistic physical IoT network. The edge layer includes the physical devices and operating systems required for the infrastructure required for configuration and virtualization technologies to be used in the fog and cloud layers. This layer includes multiple IoT/IIoT devices, such as Modbus, light bulb sensors, smartphones, smart TVs; host systems, such as workstations and servers used to intercept IoT/IIoT devices; hypervisors; and physical network relays [30]. The fog layer encompasses the virtualization technology responsible for managing virtual machines (VMs) and their associated services. This layer enables the creation of a dynamic experimental IoT/IIoT network within the ToN IoT framework, allowing communication between the edge, fog, and cloud layers [30]. The cloud layer contains online-configured cloud services within the testbed. The fog and edge services connect cloud virtualization and cloud data analytics services. Additionally, the public vulnerable website is used to create injection hacking events against websites. The other cloud services are set up to transmit sensor data to the cloud and visualize the corresponding patterns [30].
The created dataset was made more suitable for artificial intelligence training and testing by using the performance monitoring tool and extracting information such as disk, process, processor, and memory in CSV format. The Windows 10 dataset contains 125 features and 2 class labels. Descriptions of the features in TON_IoT dataset are available on UNSW Canberra Cyber Institute database [31]. As given in Table 4, there are 35,974 records collected for the Windows 10 dataset.
Table 4. Count of attack type Processed Windows 10 of TON_IOT.

4.2. Preprocessing

It is not appropriate to use datasets in deep learning algorithms without preprocessing. Data collected from real-world environments often contain many errors and irregularities and need to be cleaned. For example, if there are string values in the dataset, they cannot be used in deep learning training without numerical conversion. Preprocessing aims to provide the algorithm with smoother data, thereby enhancing the efficiency of the model. Figure 4 presents a flow diagram of the data processing stages of proposed algorithm.
Figure 4. Flow diagram of preprocess.
The first operation performed on the dataset is to delete cells that do not contain any data and remove blank values. Rows containing empty data were removed to prevent any negative effects on the model.
It is not always beneficial to include all features of a large dataset in training. Features in the dataset may be correlated with each other and may not benefit the result. Additionally, having too many features increases the cost of training. In order to identify unnecessary features, the correlation matrix of the features in the dataset is computed, and the features with high correlation values are removed from the dataset. In this study, feature selection was performed using the Pearson correlation coefficient (PCC) method. The PCC Formula (1) is given below. Here, μ is the mean of a variable, and σ is its standard deviation.
$$\mathrm{PCC}_{X,Y} = \frac{\sum_{i} (X_i - \mu_x)(Y_i - \mu_y)}{\sigma_x \cdot \sigma_y} \qquad (1)$$
The correlation value lies between −1 and 1. When the PCC approaches +1, it signifies a positive correlation between the two variables: when one variable decreases or increases, the other moves in the same direction. Similarly, two negatively correlated variables move in opposite directions [32]. Features with correlation values above a chosen threshold are removed. The correlation matrix of the CICIoT2023 dataset is given in Figure 5. Based on the correlation matrix, the threshold was set to 0.99. As a result, 40 features were selected in the CICIoT2023 dataset and 85 features were selected in the Processed Windows 10 of TON_IOT dataset.
Figure 5. Diagonal correlation matrix of CICIoT2023 dataset.
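The correlation-based selection described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the feature names and toy values are hypothetical, the use of the absolute correlation value and the rule of keeping the first feature of each highly correlated pair are assumptions, and the coefficient is computed in the standard normalized form so that it lies in [−1, 1].

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(columns, threshold=0.99):
    """Keep a feature only if |PCC| with every already-kept feature
    does not exceed the threshold (assumed selection rule)."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data with hypothetical feature names (not the real dataset columns).
toy = {
    "rate": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rate_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with "rate"
    "duration": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_correlated(toy)
print(selected)
```

In this toy case the redundant column is discarded while the weakly correlated one is kept; on a real dataset the same loop would run over all feature columns.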
The next process applied to the dataset is label encoding, which involves converting non-numeric features into numerical values. Label encoding has been applied to categorical features within the dataset. There is no need to apply this process to numeric features in the dataset. Following the numerical conversion of string values, normalization (2) has been performed on the dataset.
x′ = (x − μ)/σ
x is the original value, x′ is the normalized value, and μ and σ are the mean and standard deviation values, respectively. Thanks to the normalization process, numerically large features are prevented from negatively affecting the result and performance of the deep learning model [3]. Concerning the label attribute of the dataset, if binary evaluation is performed, normal values are labeled as 0 and attack values are labeled as 1. This step is not performed if multiclass classification is conducted. The next process is to add a new dimension to the data shape to make it compatible with the CNN layer. The last stage of data processing is to divide the dataset into two parts: training and testing. The split ratio was chosen as 0.8.
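The preprocessing steps described above (label encoding, normalization, binary labeling, adding a dimension, and splitting) can be illustrated with a small standard-library sketch. The column names, label strings, and toy values are hypothetical, and interpreting the ratio of 0.8 as the training share is an assumption; in practice, library utilities such as those in Scikit-learn would normally be used instead.

```python
import random
import statistics

def label_encode(values):
    """Map each distinct string to an integer code (codes follow sorted order)."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]

def standardize(column):
    """z-score normalization: x' = (x - mean) / std."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]

def train_test_split(rows, labels, train_ratio=0.8, seed=0):
    """Shuffle indices and split them; 0.8 is assumed to be the training share."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_ratio)
    train, test = idx[:cut], idx[cut:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])

# Toy records with hypothetical column names and label strings.
protocol = ["tcp", "udp", "tcp", "icmp", "udp"]
rate = [10.0, 200.0, 30.0, 400.0, 50.0]
raw_labels = ["normal", "attack", "normal", "attack", "attack"]

encoded_protocol = label_encode(protocol)            # strings -> integers
features = [standardize([float(c) for c in encoded_protocol]),
            standardize(rate)]
binary_labels = [0 if lab == "normal" else 1 for lab in raw_labels]

# One row per sample; each value is wrapped in a list to add the extra
# (channel) dimension expected by a 1D convolution layer.
rows = [[[col[i]] for col in features] for i in range(len(rate))]
x_train, y_train, x_test, y_test = train_test_split(rows, binary_labels)
```

Each sample ends up with shape (number of features, 1), which is the layout a one-dimensional convolution layer typically expects.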
After the steps shown in Figure 4 were completed in order, the attack classes in the dataset were selected. There are 12 different attack types under the category of DDoS in the CICIoT2023 dataset. By removing the least common DDoS attacks, which are UDP fragmentation, ACK fragmentation, HTTP_Flood and SlowLoris, 8 DDoS attack types were selected. Since there is no subtype of DDoS attack in the TON_IoT dataset, only binary evaluation was made here by taking only DDoS classes.

4.3. Deep Learning Algorithms

The fundamental logic in machine learning and deep learning algorithms involves creating a model by learning features extracted from the dataset during training. Subsequently, this model is used to make predictions for unknown data. Artificial intelligence-based anomaly detection systems utilize this structure to enable the developed model to detect abnormal situations in the network. This section explains the deep learning algorithms used in the study.

4.3.1. Convolutional Neural Network (CNN)

The CNN algorithm is a deep learning algorithm based on artificial neural networks. The basic algorithm of CNN is given in Figure 6. It consists of convolution, pooling, flattening, and fully connected layers. The convolution layer is the cornerstone of CNN; it is responsible for processing data from a receptive field. The size of the output volume (Wo) is given by Equation (3), where P is the stride, Wi is the size of the input volume, S is the kernel size of the convolutional layer neurons, and M is the amount of zero padding [8].
$$W_o = \frac{W_i - S + 2M}{P} + 1 \qquad (3)$$
Figure 6. CNN basic algorithm.
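As a worked example of the output-size relation, the sketch below evaluates it for a chain of two convolution and two pooling steps. The kernel and pooling sizes are illustrative assumptions (the actual values of Table 5 are not reproduced here); only the starting length of 40 comes from the number of features selected for CICIoT2023. Pooling is treated with the same formula, using the pool size as both kernel size and stride.

```python
def conv_output_size(w_in, kernel, padding=0, stride=1):
    """W_o = (W_i - S + 2M) / P + 1, using floor division for partial windows."""
    return (w_in - kernel + 2 * padding) // stride + 1

length = 40                                                  # selected features
after_conv1 = conv_output_size(length, kernel=3)             # assumed kernel size 3
after_pool1 = conv_output_size(after_conv1, kernel=2, stride=2)  # assumed pool size 2
after_conv2 = conv_output_size(after_pool1, kernel=3)        # assumed kernel size 3
after_pool2 = conv_output_size(after_conv2, kernel=2, stride=2)  # assumed pool size 2

print(after_conv1, after_pool1, after_conv2, after_pool2)
```

With these assumed sizes the sequence length shrinks from 40 to 38, 19, 17, and finally 8 before it reaches the recurrent layer.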
Next, convolution is performed with a filter that extracts the characteristics of the input. The feature map is created by this layer. In order to reduce the number of parameters to be calculated and to make training easier, the size of the input data is reduced by pooling. The pooling layer can operate in two different ways: taking the largest of the values within the area of the selected size (max pooling) or the average of the values (average pooling). There may be more than one convolution/pooling layer in the created algorithm. This stage is known as feature extraction. The extracted features then become available for calculation. The CNN algorithm can be one-dimensional, two-dimensional, or three-dimensional. One-dimensional CNN was used in the model. The differences between 1D, 2D, and 3D CNN are as follows:
  • In 1D CNN, the filter moves in one dimension. Input and output data must be two-dimensional. It can be used in time series-type data.
  • In 2D CNN, the filter moves in two dimensions. Input and output data must be three-dimensional. It can be used in algorithms that use images as input.
  • In 3D CNN, the filter moves in three dimensions. Input and output data must be four-dimensional. It can be used in algorithms that use video as input.
The classification region consists of layers known as flattened and fully connected. After the convolutional stage, the data need to be flattened to be usable in the fully connected stage. This step is carried out in the flatten layer. The fully connected layer uses the classical artificial neural network model. Classification results are determined by calculating the weight values in this layer. In order to apply CNN to nonimage data, the dimensions of input must be transformed. Thus, CNN can be used with one-dimensional convolutional layers [33].
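To make the feature-extraction stage concrete, the following sketch applies a one-dimensional convolution, a ReLU activation, and both pooling variants to a short toy sequence. The input values and the two-element filter are arbitrary illustrations; a trained CNN would learn its filter weights.

```python
def conv1d(signal, kernel):
    """Stride-1 sliding dot product without padding, as computed by CNN layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def pool1d(signal, size, mode="max"):
    """Non-overlapping pooling; leftover values at the end are dropped."""
    def average(window):
        return sum(window) / len(window)
    reduce = max if mode == "max" else average
    return [reduce(signal[i:i + size])
            for i in range(0, len(signal) - size + 1, size)]

sample = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0]        # toy input sequence
feature_map = conv1d(sample, [1.0, -1.0])       # difference-style filter
activated = [max(0.0, v) for v in feature_map]  # ReLU removes negative values
max_pooled = pool1d(activated, 2, mode="max")
avg_pooled = pool1d(activated, 2, mode="avg")

print(feature_map, activated, max_pooled, avg_pooled)
```

The filter moves along a single axis only, which is why this form suits tabular or time-series records rather than images.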

4.3.2. Long Short-Term Memory (LSTM)

The LSTM algorithm is a type of RNN that can learn long-term dependencies and retain sequential data in memory. It solves the vanishing gradient problem caused by the gradual decay of gradients during backpropagation. LSTM is an algorithm suitable for time-series problems [34]. Thanks to these features, it can be used in applications such as language processing, video processing, and speech recognition. The LSTM algorithm consists of memory blocks called cells, and these are the main components of the algorithm. The LSTM algorithm is shown in Figure 7. An LSTM cell consists of three gates: forget gate (ft), input gate (it), and output gate (ot). Input and output gates represent the input and output of data at time t. The forget gate decides whether the data will be forgotten or not by comparing instantaneous data inputs with the previous data state [3].
Figure 7. LSTM basic algorithm.
The mathematical equation describing the relationship between the gates in an LSTM cell is as follows [3]:
it = σ(wi · [ht−1, Xt])
ft = σ(wf · [ht−1, Xt])
ot = σ(wo · [ht−1, Xt])
C̃t = tanh(wc · [ht−1, Xt])
Ct = ft × Ct−1 + it × C̃t
ht = ot × tanh(Ct)
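The gate equations above can be traced with a single LSTM step written in plain Python. This is a sketch under stated assumptions: bias terms are omitted because the equations above omit them, the weight matrices are passed in explicitly (with w_c denoting the candidate-state weights), and the zero weights in the example are chosen only so the result is easy to verify by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def lstm_step(x_t, h_prev, c_prev, w_i, w_f, w_o, w_c):
    """One LSTM time step following the gate equations (no bias terms)."""
    z = h_prev + x_t                                   # concatenation [h_{t-1}, X_t]
    i_t = [sigmoid(a) for a in matvec(w_i, z)]         # input gate
    f_t = [sigmoid(a) for a in matvec(w_f, z)]         # forget gate
    o_t = [sigmoid(a) for a in matvec(w_o, z)]         # output gate
    c_hat = [math.tanh(a) for a in matvec(w_c, z)]     # candidate cell state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hidden size 2, input size 1; all-zero weights make every gate equal 0.5.
zeros = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
h_t, c_t = lstm_step(x_t=[0.7], h_prev=[0.1, -0.1], c_prev=[1.0, -2.0],
                     w_i=zeros, w_f=zeros, w_o=zeros, w_c=zeros)
print(h_t, c_t)
```

With zero weights the forget gate passes exactly half of the previous cell state and the candidate contributes nothing, which shows how the cell state carries information across time steps.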

5. Definition of Model

A hybrid deep learning model was developed using one-dimensional (1D) CNN and LSTM algorithms to detect DDoS attacks. The model has a sequential algorithm consisting of layers. Figure 8 shows the flow diagram of the proposed model.
Figure 8. Flow diagram of proposed model.
The developed sequential model starts with the input layer. This layer describes the dataset size used. As seen in Figure 8, two CNN algorithms were applied. The parameter settings of the CNNs used were adjusted to provide the best results. These parameters are presented in tabular form in Table 5. After the convolution process, the pooling layer was applied. The main task of the pooling layer is to reduce the dimensionality of the extracted feature matrix. While the computational load is reduced in the pooling layer, important information is preserved [35]. Max pooling was applied between the first and second CNN, and average pooling was applied after the second CNN. After the CNN stage, the LSTM algorithm was applied sequentially. The parameters of the LSTM algorithm used are set to units = 140, dropout = 0.2, and recurrent_dropout = 0.4. After the LSTM algorithm, the flatten layer reshaped the data to make it suitable for the dense layer. The flatten layer is followed by dense layers, also known as fully connected layers. Dropout was applied at a rate of 30%. The reason for using the dropout layer is to prevent the algorithm from overfitting. The function of this layer is to ignore some nodes randomly. This reduces the extent to which neurons come to depend on other neurons to correct their errors [36].
Table 5. Used CNN parameters.
The ReLU function (10) was used as the activation function in both CNN algorithms. It was chosen because it is computationally simple and eliminates negative values.
f(x) = max(0,x)
The output layer of the model ends with a softmax activation function. For each sample in a multiclass problem, this function assigns a probability to every class, and the class with the highest probability is taken as the predicted label [37]. With the softmax function, the result is produced as a probability distribution.
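The two activation functions used in the model can be illustrated as follows. This is a generic sketch of ReLU and softmax, not code from the study; the example scores are arbitrary, and subtracting the maximum score before exponentiation is a common numerical-stability measure assumed here.

```python
import math

def relu(x):
    """f(x) = max(0, x)."""
    return max(0.0, x)

def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    largest = max(scores)                      # shift for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]                       # arbitrary scores for three classes
probs = softmax(scores)
predicted_class = probs.index(max(probs))      # class with the highest probability
print(probs, predicted_class)
```

The probabilities always sum to one, so the predicted label is simply the index of the largest entry.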

6. Experiments and Results

In this section, evaluation parameters and test results are given. In the study, PySpark, which provides the opportunity to write in Python programming language on Apache Spark, was used through the Google Colab platform. Scikit-learn and Keras libraries were used to create deep learning algorithms.
Training and testing of the model was conducted on a computer with the following configuration:
  • macOS v12.6 operating system;
  • M1 Apple Silicon (2020);
  • 13.3″ screen;
  • 8-core CPU;
  • 8-core GPU;
  • 8 GB RAM;
  • 256 GB SSD.
The CICIoT2023 dataset consists of multiple data files, and combining these files yields a considerable amount of data to process. Among the studies examined, some were conducted on samples taken from the dataset [10,13]. Sampling reduces the training cost without seriously affecting the outcome. By contrast, using the entire dataset exhausts computer resources and makes processing impractical. Sampling also eliminates the need for high-capacity computers and servers, which are expensive and difficult to access. Therefore, instead of the entire dataset, a subset reduced to 20% of the original was used. The attack class ratios of the subset are the same as in the original dataset. Thus, training and testing costs and time were saved. In addition to the proposed algorithm tests, the dataset was also tested with ten machine learning and deep learning algorithms.
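A subset that keeps the same class ratios as the full dataset can be drawn by sampling each class separately. The sketch below shows one way to do this with the standard library; the function name, the seed, and the toy label list are hypothetical, with the 73% and 17% shares loosely mirroring the DDoS and DoS proportions reported for CICIoT2023.

```python
import random
from collections import defaultdict

def stratified_sample(labels, fraction=0.2, seed=0):
    """Return sorted indices of a sample that preserves each class's share."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        count = max(1, round(len(members) * fraction))  # keep at least one record
        chosen.extend(rng.sample(members, count))
    return sorted(chosen)

# Toy label column: 73% "ddos", 17% "dos", 10% "other" (hypothetical names).
labels = ["ddos"] * 730 + ["dos"] * 170 + ["other"] * 100
subset = stratified_sample(labels, fraction=0.2)

share_ddos = sum(1 for i in subset if labels[i] == "ddos") / len(subset)
print(len(subset), share_ddos)
```

Because every class is reduced by the same fraction, rare classes stay represented in the subset in proportion to their original frequency.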
The parameter values of the machine learning and deep learning algorithms used are given below:
  • Random forest: max_depth = 4, n_estimators = 100;
  • Decision tree: max_depth = 5, random_state = 0;
  • Gradient boost: n_estimators = 10, max_depth = 3, learning_rate = 0.1;
  • AdaBoost: n_estimators = 10, learning_rate = 0.1, random_state = 0;
  • Naive Bayes: default;
  • Logistic regression: default;
  • K-nearest neighbour: n_neighbors = 3, leaf_size = 50;
  • MLP: hidden_layer_sizes = (5,10,5), max_iter = 5;
  • CNN: filters = 64, kernel_size = 2, activation = ‘relu’;
  • LSTM: units = 100, dropout = 0.2, and recurrent_dropout = 0.2.
The results obtained in the study were evaluated from different perspectives. Firstly, multiclass and binary evaluations of DDoS attack classes in the CICIoT2023 dataset were made. Secondly, binary evaluation was made for the TON_IOT-Windows10 dataset. The developed algorithm was compared with ten machine learning and deep learning algorithms in both datasets. Details of the hybrid algorithm used are explained in Section 5, “Definition of Model”. The evaluations were made with the parameters accuracy, precision, recall, and F1 Score. Additionally, ROC curves and confusion matrix graphics were also created and included in the study.
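For reference, the four evaluation parameters can be computed from the counts of a binary confusion matrix as sketched below. The label vectors are toy values rather than results from the study, and the sketch covers only the binary case, since the averaging scheme used for the multiclass results is not stated here.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels 0 (normal) and 1 (attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy ground truth and predictions (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced dataset, precision, recall, and F1 are more informative than accuracy alone, because a model that always predicts the majority class can still score a high accuracy.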

6.1. CICIoT2023 Dataset Results

After preprocessing, the CICIoT2023 dataset was first evaluated for binary classification with the proposed hybrid algorithm. The developed algorithm was compared with ten machine learning and deep learning algorithms: random forest, decision tree, gradient boost, AdaBoost, naive Bayes, logistic regression, K-nearest neighbour, CNN, MLP, and LSTM. Table 6 shows the binary evaluation results of the algorithms for the CICIoT2023 dataset, and Figure 9 presents the same results graphically.
Table 6. CICIoT2023 dataset binary classification results (%).
Figure 9. CICIoT2023 dataset binary classification results graph.
As can be seen in Table 6 and Figure 9, the proposed algorithm reached the highest binary classification accuracy, followed by the GB, MLP, and CNN algorithms. The NB algorithm produced the lowest result. The confusion matrix for the binary test of the developed hybrid algorithm is given in Figure 10, and the corresponding ROC curve is presented in Figure 11.
Figure 10. Confusion matrix for binary classification of CICIoT2023 dataset.
Figure 11. ROC curves for binary classification of CICIoT2023 dataset.
According to the confusion matrix in Figure 10, the false positive rate (FPR) is almost negligible; only about a hundred records were misclassified. The true positive rate (TPR) was quite high. According to the ROC curve in Figure 11, the AUC-ROC value was above 0.99.
The algorithms were also evaluated on multiclass classification. The developed algorithm and the same ten machine learning and deep learning algorithms were tested. Table 7 shows the multiclass evaluation results of the algorithms for the CICIoT2023 dataset, and Figure 12 presents the results graphically.
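The relationship between TPR, FPR, and the AUC-ROC value can be made concrete with a small sketch. It uses the pairwise-ranking interpretation of AUC (the probability that a randomly chosen attack record scores higher than a randomly chosen benign one); the function names and the toy scores are hypothetical and do not reproduce the study's data.

```python
def rates_at_threshold(y_true, scores, threshold=0.5):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def auc_pairwise(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of
    records, so only suitable for small illustrative inputs.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores).
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
tpr, fpr = rates_at_threshold(y_true, scores, threshold=0.5)
auc = auc_pairwise(y_true, scores)
print(tpr, fpr, auc)
```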
Table 7. CICIoT2023 dataset multiclass classification results.
Figure 12. CICIoT2023 dataset multiclass classification result graph.
As can be seen in Table 7 and Figure 12, the proposed algorithm reached the highest multiclass classification accuracy, followed by the MLP, CNN, and GB algorithms. The lowest results belong to the ADA and DT algorithms. The confusion matrix for the multiclass test of the developed hybrid algorithm is given in Figure 13, and the corresponding ROC curves are presented in Figure 14.
Figure 13. Confusion matrix for multiclassification of CICIoT2023 dataset.
Figure 14. ROC curves for multiclassification of CICIoT2023 dataset.
According to the confusion matrix in Figure 13, false positives remained at very low levels, with at most 41 records in any combination of classes. The true positive counts show that the classes were matched one-to-one with high accuracy. According to the ROC curves of the multiclass classification in Figure 14, the AUC-ROC value is close to 0.99 for all attack classes. Table 8 compares the performance of our work with other state-of-the-art methods tested on the CICIoT2023 dataset. The comparison is made with respect to model, dataset, and accuracy.
Table 8. Comparison of other works using CICIoT2023 dataset.
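Reading per-class false positives and true positives off a multiclass confusion matrix, as done for Figure 13, can be sketched in a one-vs-rest fashion. The sketch assumes rows are true classes and columns are predicted classes; the function name `per_class_rates` and the 3x3 toy matrix are hypothetical and unrelated to the study's figures.

```python
def per_class_rates(cm):
    """One-vs-rest counts and rates for each class of a confusion matrix.

    cm[r][c] is the number of records of true class r predicted as c.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # row k, off-diagonal
        fp = sum(cm[r][k] for r in range(n)) - tp   # column k, off-diagonal
        tn = total - tp - fn - fp
        results.append({
            "tp": tp,
            "fp": fp,
            "fn": fn,
            "tn": tn,
            "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
            "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        })
    return results


# Toy matrix (hypothetical): three classes.
cm = [
    [50, 2, 0],
    [1, 45, 4],
    [0, 3, 47],
]
rates = per_class_rates(cm)
# Largest single off-diagonal cell, i.e. the worst class-to-class confusion.
max_off_diagonal = max(
    cm[r][c] for r in range(len(cm)) for c in range(len(cm)) if r != c
)
print(rates[0], max_off_diagonal)
```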
According to the evaluation results above, there is a significant difference between binary classification and multiclass classification. As the number of attack classes in the dataset increases, the accuracy of the algorithms decreases. In binary classification on the CICIoT2023 dataset, the machine learning and deep learning algorithms tested produced broadly similar results. In multiclass classification, the accuracy of the DT and AdaBoost algorithms dropped significantly, whereas the developed hybrid algorithm showed no serious decrease. In studies conducted using the CICIoT2023 dataset, the best results were achieved with the proposed hybrid algorithm, with rates of 99.995% in attack detection and 99.96% in attack type detection. The proposed algorithm reaches the highest accuracy compared to other studies and the other tested algorithms.

6.2. TON_IOT Dataset Results

The presented hybrid algorithm was also evaluated using the TON_IOT dataset. The attack detection accuracy obtained on the ProcessedWindowsDataset-Windows10 subset of TON_IOT is given in Table 9, and Figure 15 presents the results graphically.
Table 9. TON_IOT-Processed-Windows10 dataset binary classification results (%).
Figure 15. TON_IOT dataset binary classification results graph.
As can be seen in Table 9 and Figure 15, the proposed algorithm reached the highest binary classification accuracy on the TON_IOT dataset, followed by the MLP, KNN, and CNN algorithms. The lowest result belongs to the NB algorithm. The confusion matrix for the binary test of the developed hybrid algorithm on the TON_IOT dataset is given in Figure 16, and the corresponding ROC curve is presented in Figure 17.
Figure 16. Confusion matrix for binary classification of TON_IOT dataset.
Figure 17. ROC curves for binary classification of TON_IOT dataset.
According to the confusion matrix in Figure 16, false positives remained quite low, with a total of 87 records. According to the ROC curves of the binary classification in Figure 17, the AUC-ROC value is close to 0.99. Table 10 lists previous studies using the TON_IOT dataset.
Table 10. Comparison of other works using TON-IoT dataset.
Testing the developed algorithm on a second dataset, TON_IOT, supports its reliability. An attack detection accuracy of 98.75% was achieved on the TON_IOT dataset, the highest among the state-of-the-art studies compared. As can be seen in Tables 8 and 10, the proposed algorithm achieved better accuracy than the other studies performed on the two datasets.

7. Discussion

LSTM was used in the algorithm because LSTMs, which are deep learning algorithms, are effective in capturing flow dynamics and retaining information over time. The LSTM algorithm can learn long-term dependencies and keep sequential data in memory. Its forget gate decides whether previous data are retained or discarded. Because its calculations operate on sequential data, this structure was considered appropriate for intense attacks such as DDoS. The other algorithm used in the classification stage is CNN. CNNs give successful results in image, audio, and video classification and can handle classification problems flexibly by varying the depth and width of the network. Thanks to the convolution operation, a CNN can detect time-sensitive attack situations with fewer connections and parameters than a standard feedforward neural network with a similar number of layers. Since many features are extracted from the incoming data, it is effective in detecting attack types with distinctive features, such as DDoS [6]. In the tests conducted, combining LSTM and CNN to exploit their complementary strengths yielded better results than using either one alone.
Moreover, analyzing the dataset and removing missing data in the preprocessing steps made the data usable. Selecting the most relevant features from the dataset reduces the computational load of the algorithm, which lowers training and testing costs. The preprocessing steps employed in our algorithm ensure that both the training and test data are processed in a simplified form.
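The point about fewer parameters can be checked with simple arithmetic. The sketch below compares the trainable parameters of one 1D convolutional layer with those of a fully connected layer that produces an output of the same size. The `filters = 64` and `kernel_size = 2` values are borrowed from the baseline CNN settings listed earlier; the feature count of 40, the single input channel, and the "valid" padding are assumptions for illustration, not the hybrid model's actual configuration.

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter; shared across all positions."""
    return (kernel_size * in_channels + 1) * filters


def dense_params(n_inputs, n_outputs):
    """One weight per input-output pair plus one bias per output."""
    return (n_inputs + 1) * n_outputs


n_features = 40          # hypothetical number of selected features
kernel_size = 2          # from the baseline CNN settings
filters = 64             # from the baseline CNN settings

# Output length of a stride-1 convolution without padding.
conv_out_len = n_features - kernel_size + 1

conv_total = conv1d_params(kernel_size, in_channels=1, filters=filters)
dense_total = dense_params(n_features, conv_out_len * filters)

print("Conv1D parameters:", conv_total)
print("Dense parameters for the same output size:", dense_total)
print("Ratio:", dense_total / conv_total)
```

Because the convolution reuses the same small kernel at every position, its parameter count does not grow with the number of input features, whereas the dense layer's does.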
Consequently, the developed algorithm has achieved a higher accuracy rate than all tested algorithms and state-of-the-art studies. The scope of the study has been expanded by evaluating the developed algorithm in multiclass classification and by assessing its performance on a different dataset.
Classification accuracy may differ between datasets even when the same algorithms are used. This can be observed in the evaluation results of the TON_IOT dataset, which was used as the second dataset in the study. Compared to CICIoT2023, the accuracy values of the ADA and DT algorithms increased on the TON_IOT dataset. Although the NB algorithm gave high results in the multiclass evaluation, it had the lowest accuracy in both binary evaluations. The NB algorithm does not consider interdependencies between features, which affects its accuracy [42]. The low binary results of the NB algorithm may be explained by strong dependencies between features in the binary evaluation, which reduce its accuracy.
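The independence assumption behind naive Bayes is usually written as below. This is the standard textbook form, not a formula taken from the study; the symbols $y$ (class label) and $x_1, \dots, x_n$ (feature values) are introduced here only for illustration.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

When several features are strongly correlated, the product counts the same evidence more than once, which is one plausible reason for the lower accuracy discussed above.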

8. Conclusions and Future Works

In our modern world, needs such as internet use and communication between devices are inevitable. Alongside the benefits these technologies provide, there are also cases of their abuse. One form of malicious use of network and communication channels is cyber attacks. The most commonly used method is the DDoS attack, which aims to restrict access to target systems or make them completely inaccessible. Detecting DDoS attacks is crucial to being able to counter them. In this study, a new hybrid deep learning algorithm using CNN and LSTM deep learning models was developed to detect DDoS attacks. The CICIoT2023 and TON_IOT datasets, which are current datasets, were used in training and testing this algorithm. First, preprocessing and feature selection steps were applied to the datasets. Afterwards, the proposed algorithm was tested on binary classification and then on multiclass classification with the CICIoT2023 dataset. The algorithm was evaluated by calculating accuracy, precision, recall, F1-score, and ROC data. As a result of this training and testing, a 99.995% attack detection rate and a 99.96% attack type detection rate were achieved. This high accuracy provides a reference point for future studies and contributes to the literature. In the evaluation using the TON_IOT dataset, an attack detection rate of 98.75% was reached. The hybrid deep learning algorithm developed in this study was designed to reach the highest possible accuracy.
The accuracy of the developed hybrid algorithm may be increased by optimizing the deep learning algorithm parameters. One such optimization method is the metaheuristic approach. In future studies, the algorithm can be combined with modern and effective metaheuristic techniques to further reduce its errors [43].
Achieving high accuracy requires a large volume of data, and a large volume of data in turn increases training and testing times. In the physical world, it is crucial to detect attack traffic that requires rapid intervention, such as DDoS, while using system resources as efficiently as possible. As future work, optimizing these training times and developing intrusion detection systems that combine high accuracy with low cost will be a valuable contribution to the literature.

Author Contributions

Conceptualization, S.Y. and M.D.; methodology, S.Y.; software, S.Y.; validation, S.Y.; formal analysis, S.Y. and M.D.; investigation, S.Y.; resources, S.Y. and M.D.; data curation, S.Y.; writing – original draft preparation, S.Y.; writing – review and editing, S.Y. and M.D.; visualization, S.Y. and M.D.; supervision, M.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

We used the CICIoT2023 and TON_IoT datasets, which are publicly accessible (https://www.unb.ca/cic/datasets/iotdataset-2023.html, https://ieee-dataport.org/documents/toniot-datasets) (accessed on 24 November 2023), for the evaluation of the proposed IDS.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Dener, M.; Al, S.; Orman, A. STLGBM-DDS: An Efficient Data Balanced DoS Detection System for Wireless Sensor Networks on Big Data Environment. IEEE Access 2022, 10, 92931–92945.
  2. Batchu, R.K.; Seetha, H. A generalized machine learning model for DDoS attacks detection using hybrid feature selection and hyperparameter tuning. Comput. Netw. 2021, 200, 108498.
  3. Al, S.; Dener, M. STL-HDL: A new hybrid network intrusion detection system for imbalanced dataset on big data environment. Comput. Secur. 2021, 110, 102435.
  4. Cil, A.E.; Yildiz, K.; Buldu, A. Detection of DDoS attacks with feed forward based deep neural network model. Expert Syst. Appl. 2021, 169, 114520.
  5. Almaraz-Rivera, J.G.; Perez-Diaz, J.A.; Cantoral-Ceballos, J.A. Transport and Application Layer DDoS Attacks Detection to IoT Devices by Using Machine Learning and Deep Learning Models. Sensors 2022, 22, 3367.
  6. Jia, Y.; Zhong, F.; Alrawais, A.; Gong, B.; Cheng, X. Flowguard: An intelligent edge defense mechanism against IoT DDoS attacks. IEEE Internet Things J. 2020, 7, 9552–9562.
  7. Alghazzawi, D.; Bamasag, O.; Ullah, H.; Asghar, M.Z. Efficient detection of DDoS attacks using a hybrid deep learning model with improved feature selection. Appl. Sci. 2021, 11, 11634.
  8. Ferrag, M.A.; Shu, L.; Djallel, H.; Choo, K.-K.R. Deep learning-based intrusion detection for distributed denial of service attack in Agriculture 4.0. Electronics 2021, 10, 1257.
  9. Mamoudan, M.M.; Ostadi, A.; Pourkhodabakhsh, N.; Fathollahi-Fard, A.M.; Soleimani, F. Hybrid neural network-based metaheuristics for prediction of financial markets: A case study on global gold market. J. Comput. Des. Eng. 2023, 10, 1110–1125.
  10. Wei, Y.; Jang-Jaccard, J.; Sabrina, F.; Singh, A.; Xu, W.; Camtepe, S. Ae-mlp: A hybrid deep learning approach for ddos detection and classification. IEEE Access 2021, 9, 146810–146821.
  11. Kumar, P.; Bagga, H.; Netam, B.S.; Uduthalapally, V. SAD-IoT: Security analysis of ddos attacks in iot networks. Wirel. Pers. Commun. 2022, 122, 87–108.
  12. Alzahrani, R.J.; Alzahrani, A. Security Analysis of DDoS Attacks Using Machine Learning Algorithms in Networks Traffic. Electronics 2021, 10, 2919.
  13. Patil, N.V.; Krishna, C.R.; Kumar, K. SSK-DDoS: Distributed stream processing framework based classification system for DDoS attacks. Clust. Comput. 2022, 25, 1355–1372.
  14. Haq, M.A.; Khan, M.A.R.; AL-Harbi, T. Development of PCCNN-Based Network Intrusion Detection System for EDGE Computing. Comput. Mater. Contin. 2021, 71, 1769.
  15. Iwendi, C.; Rehman, S.U.; Javed, A.R.; Khan, S.; Srivastava, G. Sustainable Security for the Internet of Things Using Artificial Intelligence Architectures. ACM Trans. Internet Technol. 2021, 21, 1–22.
  16. Gamal, M.; Abbas, H.M.; Moustafa, N.; Sitnikova, E.; Sadek, R.A. Few-Shot Learning for Discovering Anomalous Behaviors in Edge Networks. Comput. Mater. Contin. 2021, 69, 1823–1837.
  17. Gad, A.R.; Nashat, A.A.; Barkat, T.M. Intrusion Detection System Using Machine Learning for Vehicular Ad Hoc Networks Based on ToN-IoT Dataset. IEEE Access 2021, 9, 142206–142217.
  18. Disha, R.A.; Waheed, S. Performance analysis of machine learning models for intrusion detection system using Gini Impurity-based Weighted Random Forest (GIWRF) feature selection technique. Cybersecurity 2022, 5, 1–22.
  19. Kaur, J.; Agrawal, A.; Khan, R.A. P2ADF: A privacy-preserving attack detection framework in fog-IoT environment. Int. J. Inf. Secur. 2023, 22, 749–762.
  20. Verma, R.; Chandra, S. RepuTE: A soft voting ensemble learning framework for reputation-based attack detection in fog-IoT milieu. Eng. Appl. Artif. Intell. 2023, 118, 105670.
  21. Neto, E.C.P.; Dadkhah, S.; Ferreira, R.; Zohourian, A.; Lu, R.; Ghorbani, A.A. CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment. Sensors 2023, 23, 5941.
  22. Wang, Z.; Chen, H.; Yang, S.; Luo, X.; Li, D.; Wang, J. A lightweight intrusion detection method for IoT based on deep learning and dynamic quantization. PeerJ Comput. Sci. 2023, 9, e1569.
  23. Guven, E.N. Examination, Design and Implementation of Intelligent Intrusion Detection Systems. Master’s Thesis, Gazi University Graduate School of Natural and Applied Sciences, Ankara, Turkey, 2007.
  24. Cebeloglu, F.S.; Karakose, M. A cyber security analysis used for unmanned aerial vehicles in the smart city. In Proceedings of the 2019 1st International Informatics and Software Engineering Conference (UBMYK), Ankara, Turkey, 6–7 November 2019; pp. 1–6.
  25. Sreeram, I.; Vuppala, V.P.K. HTTP flood attack detection in application layer using machine learning metrics and bio inspired bat algorithm. Appl. Comput. Inform. 2019, 15, 59–66.
  26. Chen, E.Y. Detecting TCP-based DDoS attacks by linear regression analysis. In Proceedings of the Fifth IEEE International Symposium on Signal Processing and Information Technology, Athens, Greece, 21 December 2005; pp. 381–386.
  27. Raptis, G.E.; Katsini, C.; Alexakos, C. Towards Automated Matching of Cyber Threat Intelligence Reports based on Cluster Analysis in an Internet-of-Vehicles Environment. In Proceedings of the 2021 IEEE International Conference on Cyber Security and Resilience (CSR), Rhodes, Greece, 26–28 July 2021; pp. 366–371.
  28. Kumari, P.; Jain, A.K. A Comprehensive Study of DDoS Attacks over IoT Network and Their Countermeasures. Comput. Secur. 2023, 127, 103096.
  29. Ton IoT Dataset. Available online: https://www.unsw.adfa.edu.au/unsw-canberra-cyber/cybersecurity/ADFA-ton-iot-Datasets/ (accessed on 22 October 2023).
  30. Moustafa, N.; Keshky, M.; Debiez, E.; Janicke, H. Federated TON_IoT Windows datasets for evaluating AI-based security applications. In Proceedings of the IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; IEEE: Piscataway, NJ, USA; pp. 848–855.
  31. Description of Windows 10 Features. Available online: https://cloudstor.aarnet.edu.au/plus/s/ds5zW91vdgjEj9i?path=%2FDescription_stats_datasets%2FDescription_stats_Windows_dataset#pdfviewer (accessed on 23 October 2023).
  32. Kumar, P.; Gupta, G.P.; Tripathi, R. TP2SF: A Trustworthy Privacy-Preserving Secured Framework for sustainable smart cities by leveraging blockchain and machine learning. J. Syst. Arch. 2020, 115, 101954.
  33. Tsimenidis, S.; Lagkas, T.; Rantos, K. Deep Learning in IoT Intrusion Detection. J. Netw. Syst. Manag. 2022, 30, 8.
  34. Lin, S.; Tian, H. Short-Term Metro Passenger Flow Prediction Based on Random Forest and LSTM. In Proceedings of the 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Chongqing, China, 12–14 June 2020; pp. 2520–2526.
  35. Khattak, A.; Asghar, M.Z.; Ali, M.; Batool, U. An efficient deep learning technique for facial emotion recognition. Multimedia Tools Appl. 2021, 81, 1649–1683.
  36. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
  37. Chartuni, A.; Márquez, J. Multi-Classifier of DDoS Attacks in Computer Networks Built on Neural Networks. Appl. Sci. 2021, 11, 10609.
  38. Kumar, P.; Gupta, G.P.; Tripathi, R. An ensemble learning and fog-cloud architecture-driven cyber-attack detection framework for IoMT networks. Comput. Commun. 2020, 166, 110–124.
  39. Mohamed, R.H.; Mosa, F.A.; Sadek, R.A. Efficient Intrusion Detection System for IoT Environment. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 4.
  40. Hairab, B.I.; Aslan, H.K.; Elsayed, M.S.; Jurcut, A.D.; Azer, M.A. Anomaly Detection of Zero-Day Attacks Based on CNN and Regularization Techniques. Electronics 2023, 12, 573.
  41. Dobrojevic, M.; Zivkovic, M.; Chhabra, A.; Sani, N.S.; Bacanin, N.; Amin, M.M. Addressing Internet of Things security by enhanced sine cosine metaheuristics tuned hybrid machine learning model and results interpretation based on SHAP approach. PeerJ Comput. Sci. 2023, 9, e1405.
  42. Asharf, J.; Moustafa, N.; Khurshid, H.; Debie, E.; Haider, W.; Wahab, A. A Review of Intrusion Detection Systems Using Machine and Deep Learning in Internet of Things: Challenges, Solutions and Future Directions. Electronics 2020, 9, 1177.
  43. Zhan, C.; Zhang, X.; Yuan, J.; Chen, X.; Zhang, X.; Fathollahi-Fard, A.M.; Wang, C.; Wu, J.; Tian, G. A hybrid approach for low-carbon transportation system analysis: Integrating CRITIC-DEMATEL and deep learning features. Int. J. Environ. Sci. Technol. 2023, 1–14.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
