An Investigation into the Application of Deep Learning in the Detection and Mitigation of DDoS Attacks on SDN Controllers

Abstract: Software-Defined Networking (SDN) is a new paradigm that revolutionizes the idea of a software-driven network through the separation of control and data planes. It addresses the problems of traditional network architecture. Nevertheless, this brilliant architecture is exposed to several security threats, e.g., the distributed denial of service (DDoS) attack, which is hard to contain in such software-based networks. The concept of a centralized controller in SDN makes it a single point of attack as well as a single point of failure. In this paper, deep learning-based models, long short-term memory (LSTM) and convolutional neural network (CNN), are investigated, and their feasibility and efficiency in detecting and mitigating DDoS attacks are illustrated. The paper focuses on TCP, UDP, and ICMP flood attacks that target the controller. The performance of the models was evaluated based on accuracy, recall, and true-negative rate, and compared with that of classical machine learning models. We further provide details on the time taken to detect and mitigate the attack. Our results show that RNN LSTM is a viable deep learning algorithm that can be applied in the detection and mitigation of DDoS on the SDN controller. Our proposed model produced an accuracy of 89.63%, which outperformed linear-based models such as SVM (86.85%) and Naive Bayes (82.61%). Although KNN, which is a linear-based model, outperformed our proposed model (achieving an accuracy of 99.4%), our proposed model provides a good trade-off between precision and recall, which makes it suitable for DDoS classification. In addition, it was realized that the split ratio of the training and testing datasets can influence the performance of a deep learning algorithm in a specific task. The model achieved the best performance when a 70/30 split was used, in comparison to the 80/20 and 60/40 split ratios.


Introduction
With the current surge in the number of devices with networking capabilities, complex management strategies are required to provide a good quality of service (QoS). Achieving a good QoS becomes a hurdle in current traditional networks due to the vertical integration of the control and data planes. Furthermore, network optimization becomes difficult due to a high dependence on vendor-specific hardware and software.
Software-Defined Networking (SDN) is a new paradigm that solves the issues existing in traditional Internet architectures. It provides flexibility in management by making the network programmable from a logically centralized control point. SDN decouples the control plane from the data plane present in traditional networks and deploys it in a remote device called the controller or control layer, as shown in Figure 1. Its benefits include centralized control functionality, applications running on a network operating system, a unique global view of the architecture, open northbound and southbound interfaces, and dynamic programmability of packet forwarding. Devices in the data plane, such as switches, forward packets according to the control decisions or rules sent from the controller. The controller communicates with the application layer through the northbound application programming interface (API) and with the data plane through the southbound API. Controller-switch communication is carried out using the OpenFlow protocol [1]. Due to the flexibility in network control it offers, SDN has become an alternative to traditional security infrastructures. However, the security of the entire system is at stake if the SDN framework itself is compromised, and the controller is always prone to being a single point of failure. Hence, an attack on the controller can lead to the failure of the entire network [3].
Major security problems in SDN include unauthorized controller access (intrusion), man-in-the-middle attacks, and flow rule changes that modify packets. Other pertinent issues are malicious packets hijacking the controller, denial of service through switch-controller communication floods, and configuration problems. Distributed denial of service (DDoS) is one of the most common and dreadful threats, aimed at disrupting regular traffic from reaching the controller. The attack is achieved by flooding the controller with more malicious packets than it can accommodate, thus rendering it inoperable. It is made possible by using multiple compromised switches (bots) to produce malicious packets. The attacker forms a botnet, a group of bots, from the switches connected to the controller and, after rendering the controller inoperable, gains control over the entire network, as depicted in Figure 2. It is, therefore, necessary to implement a system that addresses this security threat. Traditional methods are insufficient, so machine learning-based DDoS detection techniques have received more attention. In this paper, we investigate the feasibility and efficiency of applying variants of deep neural networks, namely the convolutional neural network (CNN) and long short-term memory (LSTM), to train an ML model that detects and mitigates DDoS attacks on SDN controllers. LSTM is an artificial recurrent neural network (RNN) architecture that is well suited for classifying and processing data and making predictions. According to the literature, many machine learning algorithms, such as support vector machine (SVM), k-nearest neighbor (KNN), artificial neural network (ANN), and naïve Bayes (NB), have been explored for detecting DDoS attacks in the various layers of the SDN architecture. However, only the deep reinforcement learning-based algorithm has been applied in the application layer of the SDN to mitigate such attacks.
The main contributions of this work include:
• A new dataset comprising normal and malicious (DDoS) traffic, developed using Mininet and the Floodlight controller.
• A DDoS defence mechanism, based on the trained model, for the identification and mitigation of DDoS attacks on the SDN controller.
• A comparison of the performance of the selected deep learning candidate with that of other machine learning linear models: k-nearest neighbor (KNN), logistic regression, linear support vector classifier (LinearSVC), support vector classifier (SVC), decision tree, random forest, gradient boosting, Gaussian naïve Bayes (NB), Bernoulli NB, and multinomial NB. These models and the selected candidate model are trained on the same generated dataset.
• A performance analysis of linear-based ML and neural network models in the detection and mitigation of DDoS flood attacks, using various train-test split ratios (60/40, 70/30, and 80/20).
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 presents the proposed model and methodology. Section 4 discusses the results. Section 5 concludes the paper.

Related Work
In the field of network security, the advent of SDN has provided researchers with unparalleled control over network infrastructure by establishing a single control point for the data flows that traverse the entire network [5]. A range of literature relevant to this work was reviewed, and its highlights are presented below. To handle the issue of DDoS attacks in SDN, researchers have proposed and implemented DDoS identification mechanisms based on artificial intelligence, mainly machine learning.
In [6], the authors used KNN, SVM, and Naïve Bayes to detect DDoS packets. KNN was the most suitable with 97% accuracy, while SVM had 82% and Naïve Bayes 83%. The authors in [7] used a support vector machine (SVM) together with their own proposed idle timeout adjustment (IA) algorithm, and showed that this approach outperformed the previous methods. Neural network, Naïve Bayes, and SVM models were used in [3]; the neural network and Naïve Bayes models provided 100% accuracy, while SVM presented 99% accuracy.
In [8], the authors used a support vector machine (SVM), and their results showed an average accuracy of 95.24%. The authors in [9] used linear regression, Naïve Bayes, KNN, decision tree, random forest, SVM, and ANN. Their linear regression model achieved the highest accuracy, precision, and recall at 98.65%, while Naïve Bayes showed the worst result at 97.45%; all the other models fell between the two. In [10], the authors used Naïve Bayes and recorded an average precision of 0.98 for the training dataset with all features included, and 0.81 with seven of the features removed. SVM has also been used to detect DDoS attacks; the model produced an accuracy of 99.8% [11].
In [12], the authors used Naïve Bayes, SVM, and Neural network. Naïve Bayes had an accuracy of up to 70% while SVM and the neural network had the same accuracy of 80%. The authors in [13] used SVM for DDoS attack detection. It was observed that the SVM algorithm achieved more than 98% accuracy on both the attacker and victim side for SYN flooding, ICMP flooding, and DNS reflection attacks. In [14], the authors used a deep neural network signature-based Intrusion Detection System (IDS). Their results show that the collaborative detection mechanism developed produced a true-positive rate of more than 90% with less than 5% false positives.
In [15], the authors worked on a reinforcement learning-based smart DDoS flood mitigation agent. Their findings demonstrate that the agent could effectively mitigate DDoS flood attacks of various protocols. Deep learning algorithms have also been used in SDN-based architectures to solve the problem of intrusion detection [16][17][18]. Other deep learning algorithms [19,20] have been applied in non-SDN architectures to detect DDoS and intrusion detection.
From the related works discussed, it is evident that machine learning has been used to identify DDoS attacks at all levels of the SDN architecture. Deep learning has been used in both SDN and non-SDN architectures for intrusion detection, but not for DDoS classification in SDN [17,18]. This makes it necessary to explore the feasibility and efficiency of applying CNN or RNN LSTM algorithms in the identification and mitigation of DDoS attacks on the controller. Table 1 shows a summary of the related works; two of its entries are:
• Detection of distributed denial of service attacks using machine learning algorithms in software-defined networks [12]: Naïve Bayes, SVM, and neural network. Naïve Bayes had an accuracy of up to 70%, while SVM and the neural network both had 80%. Remarks: two-feature processed dataset; only TCP flood used.
• Multi-SDN based cooperation scheme for DDoS attack defence [13]: SVM. The SVM algorithm achieved more than 98% accuracy on both the attacker and victim sides for SYN flooding, ICMP flooding, and DNS reflection attacks. Remarks: used TCP, UDP, and ICMP floods; training and testing split ratio not mentioned.

Methodology
Our anomaly detection technique is based on gathering certain parameters (features) of the network both when it operates normally and when it is subjected to a DDoS attack. The following assumptions were made in this research:
• The normal operation of the network is constant (the exchange of information between nodes follows a particular profile), which forms the basis of our anomaly detection and defence mechanism.
• The training of the detection engine is done off-device; the model is only exported to and used on the controller.

Architecture
A three-tier architecture consisting of seven switches, eight hosts (two hosts per edge switch), and an external controller connected as a single host to a switch was used in this research. Figure 3 shows the three-tier topology implemented in Mininet.
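For concreteness, the following is a minimal Mininet sketch of one plausible arrangement consistent with this description (one core, two aggregation, and four edge switches, with two hosts per edge switch); the controller address and OpenFlow port are assumptions, not the exact settings used in this work.

# Sketch of the three-tier topology: 1 core + 2 aggregation + 4 edge
# switches (7 total), two hosts per edge switch (8 hosts).
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.topo import Topo
from mininet.cli import CLI

class ThreeTierTopo(Topo):
    def build(self):
        core = self.addSwitch('s1')
        aggs = [self.addSwitch('s%d' % i) for i in (2, 3)]
        edges = [self.addSwitch('s%d' % i) for i in (4, 5, 6, 7)]
        for agg in aggs:
            self.addLink(core, agg)
        for i, edge in enumerate(edges):
            self.addLink(aggs[i // 2], edge)   # two edge switches per aggregation
            for j in (1, 2):                   # two hosts per edge switch
                host = self.addHost('h%d' % (2 * i + j))
                self.addLink(edge, host)

if __name__ == '__main__':
    # Floodlight is assumed to listen on 127.0.0.1:6653 (default OpenFlow port).
    net = Mininet(topo=ThreeTierTopo(),
                  controller=lambda name: RemoteController(name, ip='127.0.0.1', port=6653))
    net.start()
    CLI(net)
    net.stop()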

Simulation Test Bed
The work was simulated using Mininet with Floodlight as an external controller. The Mininet simulator was used to create the three-tier data center topology (shown in Figure 3). The Floodlight controller and OpenFlow switches were deployed on a virtual machine running Ubuntu. After setting up the network, the hping3 tool [21] was used to generate data traffic. Using hping3, we first simulated normal TCP, UDP, and ICMP traffic between two endpoints in the network and labeled it as normal traffic. Afterwards, we generated TCP, UDP, and ICMP flood attacks with hping3 and labeled the resulting traffic as malicious. The statistics of the various switches were collected during both phases. In total, 10,031 records were collected, 4270 (approximately 43%) being malicious traffic and 5761 (approximately 57%) normal traffic.
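As a rough illustration, normal and flood traffic of the three protocols could be generated from a host as follows; the victim address, ports, and the 10 ms inter-packet interval are illustrative assumptions, and hping3 must be installed on the traffic-generating hosts.

import subprocess

TARGET = '10.0.0.2'   # assumed victim address inside the Mininet network

# Normal background traffic: one packet every 10 ms from a fixed source.
normal = {
    'tcp':  ['hping3', '-S', '-p', '80', '-i', 'u10000', TARGET],
    'udp':  ['hping3', '--udp', '-p', '53', '-i', 'u10000', TARGET],
    'icmp': ['hping3', '--icmp', '-i', 'u10000', TARGET],
}

# Flood traffic: send as fast as possible with spoofed random sources.
flood = {
    'tcp':  ['hping3', '-S', '--flood', '--rand-source', '-p', '80', TARGET],
    'udp':  ['hping3', '--udp', '--flood', '--rand-source', '-p', '53', TARGET],
    'icmp': ['hping3', '--icmp', '--flood', '--rand-source', TARGET],
}

def run(cmd, seconds):
    # Run an hping3 command for a fixed duration, then stop it.
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.terminate()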

Scenarios Considered
The data gathered were used to build binary classification models using the following ML models: k-nearest neighbor (KNN), logistic regression, linear SVC, SVC, decision tree, random forest, gradient boosting, and the naïve Bayes classifiers (Gaussian, Bernoulli, and multinomial), as well as the main algorithms being investigated in this work, namely RNN LSTM and CNN. The model summaries for LSTM and CNN are shown in Figures 4 and 5, respectively. The performance of each model was evaluated based on the following key performance indicators: recall, accuracy, true-negative rate, and the time taken to identify and mitigate the DDoS attacks. Accuracy is the proportion of correct predictions out of the total dataset. Recall is the percentage of data predicted as normal out of the total amount of normal data presented. The true-negative rate measures the number of true negatives against the total number of actual negatives; it reflects the model's ability to flag malicious data out of the total amount of malicious data presented.
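For illustration, the sketch below shows an LSTM binary classifier of the general shape described and how the three rates follow from a confusion matrix; the layer sizes, dropout, and optimizer are assumptions rather than the exact configuration of Figure 4, and normal traffic is taken as the positive class.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from sklearn.metrics import confusion_matrix

def build_lstm(timesteps, n_features):
    # Inputs are assumed scaled and reshaped to (samples, timesteps, features).
    model = Sequential([
        LSTM(64, input_shape=(timesteps, n_features)),
        Dropout(0.2),
        Dense(1, activation='sigmoid'),   # binary: normal vs. malicious
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

def evaluate(y_true, y_pred):
    # Positive class = normal traffic; negative class = malicious (DDoS).
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)    # normal traffic correctly passed
    tnr = tn / (tn + fp)       # malicious traffic correctly flagged
    return accuracy, recall, tnr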
Three scenarios were considered in the research. In the first scenario, 80% of the data were used in training and the remaining 20% for testing. The second scenario utilized 70% of the data for training and 30% for testing. In the third scenario, 60% of the data were used for training, while 40% were used for testing.
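With scikit-learn, the three scenarios amount to varying the test_size argument of train_test_split; the stratification and fixed random seed below are assumptions added for reproducibility, not settings stated in this work.

from sklearn.model_selection import train_test_split

# X: feature matrix, y: binary labels (normal vs. malicious) from the dataset.
for test_size in (0.2, 0.3, 0.4):   # the 80/20, 70/30, and 60/40 scenarios
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=42)
    # train each candidate model on X_train and evaluate it on X_test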

Detection and Defence Mechanism
The model was exported after the evaluations and used in an application that runs on the controller. The application was used to measure the detection time when the controller was subjected to a DDoS attack. If the model detects a DDoS attack, its output is fed into the defence engine. The defence/mitigation engine is built on top of NetFilterQueue [22], a Linux facility through which packets can be accepted, dropped, altered, or marked from user space. Ordinarily, the rules to match packets have to be set manually, which is not scalable; we therefore implemented the defence mechanism so that packets are matched automatically based on the output of the detection algorithm.
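A minimal sketch of such an automated defence engine is shown below, assuming the Python netfilterqueue and scapy packages and an iptables rule that diverts inbound packets to queue 1 (e.g., iptables -I INPUT -j NFQUEUE --queue-num 1); the blacklist set is a hypothetical hand-off point fed by the detection model, not part of this paper's code.

from netfilterqueue import NetfilterQueue
from scapy.all import IP

blacklist = set()   # source IPs flagged as DDoS by the detection model

def verdict(pkt):
    ip = IP(pkt.get_payload())
    if ip.src in blacklist:
        pkt.drop()       # drop packets from flagged sources
    else:
        pkt.accept()     # let normal traffic through

nfqueue = NetfilterQueue()
nfqueue.bind(1, verdict)
try:
    nfqueue.run()        # blocks; processes packets until interrupted
finally:
    nfqueue.unbind()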

Results and Discussion
The performance of the models in terms of precision and recall is shown in Figures 6-11. In terms of precision, the linear-based models (GradientBoosting, KNN, DecisionTree, GaussianNB, MultinomialNB, SVC, and LinearSVC) outperformed the LSTM model (by a marginal difference of 1.4%) for all the split ratios considered. Nonetheless, the LSTM model achieved a good tradeoff between high recall and precision, as shown in Figures 12-14, which makes it suitable for DDoS classification. A summary of the performance of the models is shown in Table 2. The values in bold indicate the highest accuracy, recall, and true-negative rate of each model.

Detection of DDoS Attack Using LSTM Model
This part of the experiment created DDoS scenarios in the form of ICMP, UDP, and TCP flood attacks on the controller and determined how long the trained LSTM model took to detect the attacks. Ten scenarios were considered. Figure 15 shows the time it took the trained model to detect each flood attack when a 60/40 train-test ratio was used: the highest time to detect the TCP DDoS flood among the 10 attempts was 18.70 s and the lowest was 12.83 s; the lowest and highest times to detect the UDP DDoS flood were 11.73 and 15.90 s, respectively; and the lowest and highest times to detect the ICMP flood were 11.76 and 15.73 s, respectively. Figure 16 shows the detection time of the LSTM model when a 70/30 train-test ratio was used: the highest time to detect the TCP flood was 18.71 s and the lowest was 12.84 s; the lowest and highest times to detect the UDP flood were 11.81 and 15.89 s, respectively; and the lowest and highest times to detect the ICMP flood were 11.68 and 15.76 s, respectively. Figure 17 shows the detection time of the LSTM model when an 80/20 train-test ratio was used: the highest time to detect the TCP flood was 18.55 s and the lowest was 12.67 s; the lowest and highest times to detect the UDP flood were 11.62 and 15.70 s, respectively; and the lowest and highest times to detect the ICMP flood were 11.56 and 15.63 s, respectively. Table 3 gives a summary of the detection times of the LSTM model.
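For reference, the per-scenario detection time can be measured with a harness of the following shape; launch_attack and detector are hypothetical hooks standing in for the hping3 flood and the exported model's output, not functions from this work.

import time

def measure_detection_time(launch_attack, detector, poll_interval=0.1):
    # Return seconds from attack launch until the detector flags it.
    start = time.perf_counter()
    launch_attack()
    while not detector.flagged():
        time.sleep(poll_interval)
    return time.perf_counter() - start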

Mitigation of DDoS Attack Using LSTM Model
This part of the experiment created DDoS scenarios in the form of ICMP, UDP, and TCP flood attacks on the controller and determined how long the trained LSTM model took to mitigate an attack after detection. Once the controller detects the DDoS attack, the source IP, protocol type, and destination port are sent to the mitigation engine, which instantly drops all packets from that particular source IP. Ten scenarios (attempts) were considered with train-test ratios of 60/40, 70/30, and 80/20. Figure 18 shows the mitigation time of the LSTM model when a 60/40 train-test ratio was used: the highest time to mitigate the TCP DDoS flood was 4.75 s and the lowest was 3.01 s; the lowest and highest times to mitigate the UDP flood were 3.28 and 4.51 s, respectively; and the lowest and highest times to mitigate the ICMP flood were 3.18 and 4.45 s, respectively. Figure 19 shows the mitigation time when a 70/30 train-test ratio was used: the highest time to mitigate the TCP flood was 4.68 s and the lowest was 2.99 s; the lowest and highest times to mitigate the UDP flood were 3.28 and 4.54 s, respectively; and the lowest and highest times to mitigate the ICMP flood were 3.16 and 4.45 s, respectively. Figure 20 shows the mitigation time when an 80/20 train-test ratio was used: the highest time to mitigate the TCP flood was 4.86 s and the lowest was 3.16 s; the lowest and highest times to mitigate the UDP flood were 3.42 and 4.65 s, respectively; and the lowest and highest times to mitigate the ICMP flood were 3.36 and 4.57 s, respectively. Table 4 gives a summary of the mitigation times of the LSTM model.

Comparison of the LSTM Model with the Best Performing Linear-Based ML Models
We compared the best performing linear-based ML models with the LSTM model in terms of detection and mitigation time. Among the three split ratios, the 80/20 split yielded the best times for detecting DDoS attacks. Hence, these values were compared with those of the best performing linear models at the same split ratio, as shown in Figures 21-24. It was observed that the LSTM model took longer to detect DDoS attacks on the SDN controller. However, its detection time was within 4 s of the highest time recorded for the linear-based models, which is a good result.
In addition, from the mitigation times for all three split ratios, it was observed that the 70/30 split yielded the best times for mitigating the DDoS attacks. Hence, these values were compared with those of the best performing linear models at the same split ratio, as shown in Figures 25-28. In Figure 28, all the classification models (both the linear models and the LSTM model), after detecting a DDoS flood attack, took almost the same time to mitigate it. Hence, it can be concluded that the LSTM model performs just as well as the linear models in the mitigation of DDoS attacks on the SDN controller. It was also observed that, aside from KNN and gradient boosting (GB), the RNN LSTM model performed better on some protocols than the three remaining linear models. Table 5 shows a comparison of the classification models from this research and other related works. It can be observed that the LSTM model (which achieved an accuracy of 89.63%) compares well with the linear-based ML models. Furthermore, the LSTM model has a good tradeoff between precision and recall, which makes it a good classification model for DDoS detection (Figures 12-14).

Conclusions and Future Work
In this research, we demonstrated that RNN LSTM is a viable deep learning algorithm for the detection and mitigation of DDoS attacks on the SDN controller. In addition, it was observed that the split ratio of the training and testing dataset can give different results in the performance of a deep learning algorithm used in a specific work; a 70/30 split produced better model accuracy than the 80/20 and 60/40 split ratios. It can be concluded that RNN LSTM is a good model for the identification and mitigation of DDoS attacks in the SDN architecture. The software-defined network used in this work was designed and tested within a virtual environment that simulates software running on a set of network devices. Future work may therefore be carried out on a real SDN architecture to test how this application works in real time. Future work will also explore gathering a larger dataset, in-depth feature selection analysis, and hyper-parameter tuning to achieve better performance with the neural network models.

Data Availability Statement:
The data used in this study are available at https://github.com/jayluxferro/SDN-DoS/blob/master/README.md.

Conflicts of Interest:
The authors declare no conflict of interest.