Two-Phase Deep Learning-Based EDoS Detection System

Abstract: Cloud computing is currently considered the most cost-effective platform for offering business and consumer IT services over the Internet. However, it is prone to new vulnerabilities. A new type of attack called an economic denial of sustainability (EDoS) attack exploits the pay-per-use model to scale up resource usage over time until the cloud user has to pay an unexpected usage charge. To prevent EDoS attacks, a few solutions have been proposed, including hard-threshold and machine learning-based solutions. Among them, long short-term memory (LSTM)-based solutions achieve much higher accuracy and lower false-alarm rates than hard-threshold and other machine learning-based solutions. However, LSTM requires a long input sequence, which degrades performance by increasing the amount of computation, the detection time, and the computing resources consumed by the defense system. We therefore propose a two-phase deep learning-based EDoS detection scheme that uses an LSTM model to detect each abnormal flow in network traffic but requires an input sequence length of only five. Thus, the proposed scheme retains the efficiency of the LSTM algorithm in detecting each abnormal flow in network traffic while reducing the required sequence length of the input data. A comprehensive performance evaluation shows that our proposed scheme outperforms the existing solutions in terms of accuracy and resource consumption.


Introduction
Cloud computing has become one of the fastest-growing segments in the IT industry. Cloud computing is increasingly attracting big, medium, and small businesses by offering on-demand inexpensive and scalable resources for achieving the system requirements. However, security is still a significant concern with this emerging technology. An economic denial of sustainability (EDoS) attack is currently becoming one of the most challenging cloud security issues [1].
An EDoS attack exploits the pay-per-use and auto-scaling features of cloud computing to charge a cloud adopter an excessive bill, leading to a large-scale service withdrawal or bankruptcy. An EDoS attack is a new variant of a distributed denial of service (DDoS) attack. Unlike a DDoS attack, which can prevent legitimate users from accessing a service for a certain amount of time, an EDoS attack can restrain a cloud adopter from delivering service indefinitely, leading to bankruptcy [2]. Because EDoS is a relatively new form of attack with a tricky nature, it is more challenging to detect an EDoS attack than a DDoS attack [1].
To the best of our knowledge, very few solutions can tackle EDoS attacks efficiently. Most of the existing solutions are hard-threshold-based solutions with high false-alarm rates or high time complexity, such as graphical Turing tests and crypto-puzzle solutions [3][4][5][6][7][8][9][10][11][12]. A machine learning-based approach is presented in [13] to resolve the issues caused by a hard threshold and to decrease the high false-positive rate. However, this method does not achieve high accuracy. Moreover, the method in [13] only detects that an attack is happening in the network traffic and cannot identify which flow is abnormal. Because an EDoS attack is a type of slow-rate attack, the attack rate looks similar to that of legitimate network traffic from the victim end during each time period. To efficiently detect this type of slow-rate attack, it is necessary to trace or collect the historical information of the attack source. Considering that network traffic also has a sequential relationship in the time dimension, two multivariate time-series-based algorithms are proposed in [14,15], built on two variants of a recurrent neural network (RNN), i.e., long short-term memory (LSTM) and a gated recurrent unit (GRU), to detect and mitigate EDoS attacks in each network flow. These solutions achieve high accuracy by using a dynamic threshold that reduces the high false-alarm rate. However, a disadvantage of an LSTM or other RNN variants in anomaly detection is the long sequence length the model requires, which increases the calculation and detection times of the defense system and consumes a large amount of computing resources. The LSTM model presented in [14] requires a sequence length of 250 for the input data. In some cases, the network traffic can contain a huge number of network flows. Because the systems in [14,15] inspect each network flow, the calculation time and the delay of these schemes will increase significantly.
To address the above issues, we propose a two-phase deep learning-based EDoS detection scheme that uses the LSTM algorithm to detect and mitigate each abnormal flow with a significantly reduced sequence length. In the first phase, an artificial neural network (ANN) observes the network traffic within a time interval to check whether there is an attack during that period. We call the detector in the first phase the period detector because it detects the period under attack. If an attack is detected in the first phase, the second phase, which uses the LSTM algorithm, is triggered to detect the attack flows. The second detector, called the flow detector, determines precisely which flow is abnormal. By using the period detector before the flow detector, we know when an attack has occurred and which data are the most critical within the long sequence of input data for the LSTM model. As a result, we can reduce the sequence length of the LSTM input data and avoid the case where the flow detector has to process a huge number of network flows even though none of them is abnormal. Thus, by applying the two phases of detection, our model exploits the advantages of an LSTM while reducing the sequence length of the input data to five (the shortest length yet) and significantly decreasing the calculation time and the delay of the system.
Our contributions in this study are as follows:
• First, we propose a novel solution using two-phase detectors to efficiently detect EDoS attacks. It is known that solutions using an LSTM or other variants of an RNN algorithm to tackle EDoS attacks achieve higher accuracy and lower false-alarm rates than existing hard-threshold-based solutions. However, the LSTM-based solutions require a long sequence of input data, increasing the detection time and computational overhead of the defense system. The proposed scheme exploits the advantages of LSTM algorithms, i.e., high accuracy and a low false-alarm rate, and overcomes the shortcoming of requiring a long sequence of input data.
• Second, we implement an EDoS detection and prevention system using two-phase detectors to detect common types of EDoS attacks. Whereas the existing schemes only detect whether an attack is happening and warn the cloud provider to react, the implemented system detects and mitigates each abnormal flow in the network traffic.
• Finally, we conduct extensive experiments to demonstrate the effectiveness of our scheme. We collect and analyze the experimental results and make detailed comparisons with other algorithms to illustrate that the proposed scheme outperforms the others in terms of accuracy and resource consumption.
The rest of this paper is organized as follows: In Section 2, we present related studies. The background knowledge is described in Section 3. In Section 4, we describe our proposed scheme in detail. Section 5 describes our testbed. The performance evaluation is presented in Section 6, followed by the discussion in Section 7. Finally, we provide some concluding remarks and discuss future areas of research in Section 8.

Related Works
Several techniques for alleviating EDoS attacks can be found in the literature, each with its own benefits and limitations. Self-verifying proof of work (sPoW) [4], proposed by Khor and Nakao, acts like a DNS server but returns a crypto puzzle with an encryption key instead of the IP address. The client must solve this puzzle to receive authorization to use the cloud service. However, attackers can easily launch a puzzle accumulation attack without solving the crypto puzzles they request. In addition, the scheme is prone to false positives because some legitimate users might be denied service owing to the difficulty of the puzzle. In [5,6], the authors proposed EDoS-Shield to lessen the impact of EDoS attacks in cloud computing. Two main components, virtual firewalls (VFs) and authentication nodes (V-nodes), have major roles in detecting EDoS attacks. The authentication nodes conduct a verification process using a graphical Turing test for requests sent from a new IP address and place this address in either a whitelist or a blacklist. Subsequent requests from an IP address in the blacklist are blocked by the virtual firewall. Although EDoS-Shield is quite effective at preventing botnet-generated traffic, it relies on a graphical Turing test, which is a weak form of verification. Such a test can lead to a high false-positive rate and increase end-to-end latency. Two other studies, which also use a graphical Turing test and a crypto puzzle technique to mitigate EDoS attacks, are described in [7,8], respectively. The limitations of these mechanisms are similar to those of EDoS-Shield and sPoW, i.e., a high false-positive rate and increased end-to-end latency.
Some statistical methods, such as entropy-based [16] and fuzzy-based [17] methods, have been proposed to detect EDoS attacks. In [16], the authors achieved a good detection accuracy. However, the method was evaluated on an extremely simple testbed, which raises doubts regarding its real-world performance. The fuzzy-entropy-based EDoS mitigation described in [17] produces high error rates because of its predefined rules.
Some researchers have recently proposed machine-learning-based techniques to handle EDoS attacks. In [13], the authors analyze execution traces to detect different EDoS attacks. They use a support vector machine (SVM) and a neural network and propose a set of features to detect three types of EDoS attacks. However, this mechanism only detects whether an attack is occurring and warns the cloud to react. If there is an attack happening, the cloud provider will not scale up the system. In addition, it cannot detect EDoS attacks at the level of individual flows. Recognizing that network traffic is a type of multivariate time-series data, the authors in [14,15] proposed using an LSTM and a gated recurrent unit (GRU), two variant forms of a recurrent neural network (RNN). These algorithms handle problems involving sequential data extremely effectively. The mechanisms in [14,15] achieve high accuracy and are evaluated through several metrics such as accuracy, detection time, cost, and complexity. However, using an LSTM or a GRU leads to high resource consumption because the input sequence required by both algorithms is long. The LSTM in [14] requires a sequence length of 250, and the GRU in [15] requires a length of 100. This lengthens the detection time and increases the resource consumption of the defense system.
As shown in the above review, no existing proposal simultaneously achieves high accuracy, uses few resources, and detects each flow of network traffic when tackling EDoS attacks. Based on an analysis of the LSTM algorithm and on defense recommendations from two recent in-depth studies on EDoS characteristics [18,19], we propose a two-phase deep learning-based EDoS detection mechanism that takes advantage of the LSTM and eliminates the limitations caused by the model complexity.

EDoS Attack Analysis
As mentioned in [19], an EDoS attack is a type of low-rate DDoS attack. Unlike a high-rate DDoS attack, however, an EDoS attack is sophisticated and hard to detect because of its low-rate traffic and stealthy behavior. EDoS and high-rate DDoS attacks also differ in their purpose. In high-rate DDoS attacks, the attacker's goal is to disrupt the services offered by a cloud service provider. Therefore, high-rate DDoS attackers launch attacks aggressively over a short amount of time with maximum resources. Conversely, EDoS attacks target the financial component of the service provider. As depicted in Figure 1, EDoS attacks exploit the auto-scaling feature of cloud computing and cause an unwanted installation of new virtual machines. The costs associated with this unpaid malicious usage burden the cloud service provider. EDoS attackers gradually push illegitimate traffic over a longer period of time. Because EDoS attack traffic looks similar to benign traffic, its detection is a challenging task. Like DDoS attacks, EDoS attacks can be categorized into two types: bandwidth depletion and resource depletion attacks [20]. In bandwidth depletion attacks, attackers flood a victim with unwanted traffic, which exhausts the network bandwidth of the victim. The source IP address of an attacker generates one or two flows with a high packet volume in each flow. ICMP flooding, Smurf, and Fraggle attacks are representatives of this type. In resource depletion attacks, the attacker aims to generate a large number of flows to a victim address. Each generated flow contains a small number of packets. Because these packets are spoofed IP packets, there will be no reply packets from the victim server. Thus, these flows remain alive during the attack. This consumes a significant amount of the victim's resources, including CPU and memory, and the victim server has to request more resources from the cloud service provider. A TCP SYN flooding attack is a representative of this type. In [13,21], an attack that targets a specific application (such as a database API request attack) and a Yo-Yo attack are introduced. However, when we study the characteristics of these attacks, we see that they can be classified into the above two types. The common types of EDoS attacks are summarized in Table 1.

Basics of Artificial Neural Network
In this study, an ANN model is used in the first phase of the proposed two-phase deep learning-based EDoS detection mechanism. The ANN module collects the traffic information during a time period and detects whether an attack is occurring during this time period. An ANN is a biologically inspired form of distributed computation [22] and is composed of simple processing units and connections between them. In this study, we employ classic feed-forward neural networks trained using a back-propagation algorithm.
A feed-forward neural network has an input layer and an output layer, with one or more hidden layers in between. The ANN functions as follows: each node $i$ in the input layer has a signal $x_i$ as the network input, multiplied by a weight value $w_{ij}$ between the input layer and the hidden layer. Each node $j$ in the hidden layer receives the signal $a_j$ according to this formula:
$$a_j = \sum_{i} x_i w_{ij} + \theta_j \tag{1}$$
The output $a_j$ of each node $j$ of the hidden layer is then broadcast to the output layer:
$$y_k = \sum_{j=1}^{n} a_j w_{jk} + \theta_k, \quad k = 1, \dots, m \tag{2}$$
where $w_{jk}$ is the weight between the hidden layer and the output layer; $\theta_j$ and $\theta_k$ are the biases in the hidden layer and output layer, respectively; and $n$ and $m$ are the numbers of nodes in the hidden layer and output layer. The output of the output layer is passed through an activation function. The activation could be a sigmoid function for a binary classification problem or a softmax function for a multi-class classification problem. The output obtained from the activation function is compared with the target. In this study, we used the mean square error as the error function:
$$E_m = \frac{1}{m} \sum_{i=1}^{m} (T_i - Y_i)^2 \tag{3}$$
where $T_i$ and $Y_i$ are the target value and output value, respectively. Training is the process in which the parameters $w_{ij}$ of the ANN model are updated to drive the error function $E_m$ toward zero.
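As a rough illustration of the layer-by-layer computation described above, the following Python sketch evaluates the weighted sums and the mean square error for a network with one hidden layer. It assumes the usual feed-forward form, in which a hidden signal is the bias plus the sum of `x[i] * w_ih[i][j]` and an output is the bias plus the sum of `a[j] * w_ho[j][k]`. The function names, the weight layout, and all numbers are illustrative and are not taken from the paper.

```python
def forward(x, w_ih, theta_h, w_ho, theta_o):
    """Return the output-layer values y_k before the activation function.

    x       : list of input signals x_i
    w_ih    : w_ih[i][j] is the weight from input node i to hidden node j
    theta_h : biases of the hidden layer (one per hidden node)
    w_ho    : w_ho[j][k] is the weight from hidden node j to output node k
    theta_o : biases of the output layer (one per output node)
    """
    # Hidden layer: a_j = sum_i x_i * w_ij + theta_j
    a = [sum(x[i] * w_ih[i][j] for i in range(len(x))) + theta_h[j]
         for j in range(len(theta_h))]
    # Output layer: y_k = sum_j a_j * w_jk + theta_k
    return [sum(a[j] * w_ho[j][k] for j in range(len(a))) + theta_o[k]
            for k in range(len(theta_o))]


def mean_square_error(targets, outputs):
    """Mean of the squared differences between targets and outputs."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)


# Illustrative values only: 2 inputs, 2 hidden nodes, 1 output node.
y = forward(x=[1.0, 2.0],
            w_ih=[[0.5, -0.5], [0.25, 0.25]],
            theta_h=[0.0, 0.0],
            w_ho=[[1.0], [1.0]],
            theta_o=[0.5])
error = mean_square_error([1.0], y)
```

Back-propagation would then adjust the weights in the direction that lowers `error`; that update step is left out of this sketch.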

Basics of Long Short-Term Memory
LSTM, which is a variant form of an RNN [23], is known as a solution to the major disadvantage of an RNN, i.e., an inability to process long sequences of data or information. Thus, an LSTM is commonly used for anomaly detection problems such as network intrusion detection, because this problem requires a long sequence of network traffic information. An LSTM takes the form of a repeating cell chain. The cell contains four neural network layers that interact in a special way to enable the network to remember historical information. The LSTM protects and controls the state of the cells through the input gate, output gate, and forget gate. Figure 2 depicts the architecture of an LSTM cell. The internal calculation of the LSTM cell is defined as follows:
$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{4}$$
$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{5}$$
$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{6}$$
$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C) \tag{7}$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{8}$$
$$h_t = o_t \odot \tanh(C_t) \tag{9}$$
where $i$, $o$, and $f$ indicate the input gate, output gate, and forget gate, respectively. Here, $W$ is a weight matrix, and $b$ is the bias. In addition, $\tilde{C}$ and $C$ are the candidate state and new state, $h$ is the output, $x$ is the input, $t$ is the time step of the input, $\sigma$ denotes a sigmoid function, $[h_{t-1}, x_t]$ is the concatenation of the previous output and the current input, and $\odot$ denotes element-wise multiplication. However, the disadvantage of an LSTM is that it requires long sequential input data. Because the LSTM model does not know where the most important part of the sequential input data is located, it requires a long sequence to extract the historical information and make a prediction or classify the input data. In this study, we combine an ANN and an LSTM to classify each flow in the network traffic. The ANN model acts as a period detector that detects when an attack occurs; from this, the LSTM model, which plays the role of a flow detector, knows where the important part of the sequential information is located, so the sequential input data can be shortened.
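To make the gate mechanics concrete, here is a minimal pure-Python sketch of one LSTM cell update and of running it over a short sequence. It assumes the standard LSTM formulation (sigmoid gates, a tanh candidate state, and weights acting on the concatenation of the previous output and the current input); the names `lstm_step`, `run_lstm`, and `params` are hypothetical, and the weights would normally come from training rather than being set by hand.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, b, v):
    """Return W v + b for a weight matrix W (list of rows) and a bias list b."""
    return [sum(w * s for w, s in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One cell update. params maps a gate name to its (W, b) pair."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    i = [sigmoid(v) for v in affine(*params["i"], z)]        # input gate
    f = [sigmoid(v) for v in affine(*params["f"], z)]        # forget gate
    o = [sigmoid(v) for v in affine(*params["o"], z)]        # output gate
    c_hat = [math.tanh(v) for v in affine(*params["c"], z)]  # candidate state
    # New state: keep part of the old state and add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_hat)]
    # Output: gated, squashed view of the new state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_lstm(sequence, params, hidden_size):
    """Feed a sequence of feature vectors through the cell, one step at a time."""
    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    outputs = []
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, params)
        outputs.append(h)
    return outputs
```

The cost of `run_lstm` grows linearly with the number of steps, which is why shortening the input sequence directly reduces the per-flow computation.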

System Design
In this section, our research goal and system design analysis are first introduced. Then, the components and workflow are described. Finally, the internal modules of our proposed scheme are thoroughly explained.

An Objective and System Design Analysis
Cloud consumers are typically monitored using a multivariate time series, whose anomaly detection is critical for service quality management [14]. For instance, as shown in Figure 3, the parameters of a cloud instance, such as CPU utilization and memory consumption are collected and tracked by a network administrator. When an EDoS attack is launched, the parameters of the cloud consumer will suddenly change in value. An EDoS attack is similar to a low-rate DDoS attack in terms of the characteristics [18]. The networking traffic launched by this attack is not changed too dramatically during each time period, similar to a conventional DDoS attack. As discussed in [14,15], an LSTM or a GRU, two variant forms of an RNN, are chosen for EDoS detection. These algorithms are not only able to learn from historical data but also to simultaneously keep track of the multivariate time series data to detect an EDoS attack. These methods will collect the sequential multiple variables information of the attacker at each observation and learn not only the past values of each variable but also the correlation information between variables. Nevertheless, the complexity of the model is a limitation of these algorithms. The algorithms also require a long sequence of input data. The sequence length can be 100 or longer (in [14], 250-sequence-length input data are required). This increases the calculation time overhead and response time of the entire system. In particular, the calculation time required to detect each flow is even longer. Thus, in this paper, a two-phase deep learning-based mechanism is proposed to overcome this limitation of an LSTM in EDoS attack detection. The first phase in the proposed two-phase scheme is a period detector. The period detector will detect if an EDoS attack has occurred within an observed time period. An ANN is the core algorithm in the period detector. The second phase is a flow detector. This phase will accurately filter out abnormal flows. 
LSTM is the core of the flow detector. The aim of the period detector is to reduce the sequence length of the input data for the LSTM model in the flow detector. If we used the flow detector directly, each flow would require an input sequence of a hundred or more flow-feature vectors, because the model cannot know when the abnormal period, which is the most important part of a long input sequence, occurred. By using the period detector before the flow detector, we know when an attack happened and where the most important part of the long input sequence is for the LSTM model. We can therefore reduce the sequence length of the LSTM input data, which matters especially when the network traffic contains a huge number of normal flows. If we used only the flow detector, the calculation time and response time of the system would increase considerably because every network flow would be inspected, even when this is unnecessary. In brief, the period detector helps reduce the calculation time of the flow detector.
Figure 4 depicts an overview of our proposed mechanism aimed at detecting EDoS attacks. This conceptual architecture includes four main modules: a raw data processing scheme, a period detector, a flow detector, and a firewall. The system workflow and mission of each module are explained in detail in the next sections.
Figure 5 presents a detailed architecture of the two-phase deep learning-based EDoS detection mechanism. During the data preprocessing stage, the collector runs the Wireshark tool every 5 s to capture all packets going through a switch created with OpenVswitch; the system resources of the victim server are also collected. The network packets are captured into a pcap file and saved to disk on the collector machine. Afterward, the feature extractor loads the pcap file from the collector to extract the data attributes. These data attributes are also transformed and standardized to fit the ANN model.
After preprocessing, the appropriate data are sent to the first detector, the ANN model, to detect whether an attack happened during that 5-s time period. If the ANN model, which is a binary classifier, reports an attack, the observed pcap file of the 5-s time period is split into five consecutive pcap files of 1-s time periods. The feature extractor is called again to extract the flow-based attributes of each flow in the five consecutive pcap files. The extracted flow-based features are assembled into sequential data with a length of five. The sequential data are sent to the flow detector, i.e., the LSTM model, to classify whether each flow in the observed time period is abnormal. If a flow is detected as abnormal, the source address of this flow is added to the blacklist of the firewall. Note that, to adapt our proposed framework effectively to different network systems, the ANN and LSTM models are replaced at preset intervals by new ANN and LSTM models trained on the updated database. The workflow of the two-phase deep learning-based EDoS detection mechanism is summarized in Algorithm 1.
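The workflow described above can be sketched in a few lines of Python. This is only an outline under stated assumptions: the two detectors are passed in as plain callables that return an attack probability, `extract_flow_sequences` is a hypothetical stand-in for the second call to the feature extractor, and the 0.5 decision threshold is an assumed value, not one given in the text.

```python
def two_phase_detect(window_features, extract_flow_sequences,
                     period_detector, flow_detector, blacklist,
                     threshold=0.5):
    """Run one 5-s detection cycle and return the flagged source addresses.

    window_features        : feature vector of the whole 5-s window
    extract_flow_sequences : callable returning {source_ip: five feature vectors};
                             it is invoked only if phase 1 raises an alarm
    period_detector        : callable giving an attack probability for the window
    flow_detector          : callable giving an attack probability for one flow
    blacklist              : set of source addresses, updated in place
    """
    # Phase 1: is there an attack anywhere in this 5-s window?
    if period_detector(window_features) < threshold:
        return []  # normal period: per-flow extraction and LSTM are skipped

    # Phase 2: split into 1-s slices, build per-flow sequences, classify each.
    flow_sequences = extract_flow_sequences()
    flagged = [src for src, seq in flow_sequences.items()
               if flow_detector(seq) >= threshold]

    # Mitigation: abnormal sources go to the firewall blacklist.
    blacklist.update(flagged)
    return flagged
```

Passing the extractor as a callable keeps the expensive per-flow work lazy, which mirrors the point of the design: normal periods never reach the flow detector.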

Internal Modules
Herein, we present all components of the two-phase deep learning-based EDoS detection mechanism, as shown in Figure 5.

Collector
The collector module captures all network packets going through a virtual switch created with OpenVswitch and collects the system resources of the victim server. The collector uses the Wireshark tool for this task. Wireshark is the world's most popular network protocol analyzer [24] and provides tools for capturing, viewing, and analyzing data packets. Every 5 s, the collector runs the Wireshark tool to capture all of the network packets and saves them into a pcap file. The pcap files are placed into storage on the collector machine.
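One possible way to script such a periodic capture is through `tshark`, the command-line companion that ships with Wireshark. The sketch below only builds the command for a single 5-s capture; the interface name, output directory, and file naming scheme are hypothetical, and the text does not state whether the collector uses the GUI, `tshark`, or another front end.

```python
import time


def capture_command(interface, out_dir, duration_s=5, now=None):
    """Build a tshark command that captures for duration_s seconds into a pcap file."""
    stamp = int(now if now is not None else time.time())
    out_file = f"{out_dir}/capture_{stamp}.pcap"
    return [
        "tshark",
        "-i", interface,                 # capture interface
        "-a", f"duration:{duration_s}",  # stop automatically after N seconds
        "-F", "pcap",                    # write classic pcap instead of pcapng
        "-w", out_file,                  # output file
    ]


# A collector loop could run the command repeatedly, for example:
#   import subprocess
#   subprocess.run(capture_command("eth0", "/var/pcaps"), check=True)
```

Back-to-back invocations leave a small gap between captures while the process restarts, so a long-running capture with file rotation may be preferable in practice.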

Feature Extractor
This module loads the pcap file from the pcap file storage and extracts the appropriate features, as shown in Table 2. These attributes are then normalized before being fed to the ANN or LSTM model. These key features are selected by applying the correlation-based feature selection process proposed in [25]. The purpose of feature selection is to increase the detection accuracy of the model and decrease the computing resources it needs. Using irrelevant or redundant features decreases the accuracy and wastes computing resources. As proposed in [25], a feature set $S$ is evaluated using the metric $M_S$:
$$M_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}$$
where $k$ is the number of features in the feature set $S$, $\overline{r_{cf}}$ is the mean correlation between the features $f$ in $S$ and the class label $c$, and $\overline{r_{ff}}$ is the average inter-correlation between two features in $S$. In addition, $\overline{r_{cf}}$ and $\overline{r_{ff}}$ are calculated by the information gain (IG), which measures the correlation between two random variables $X$ and $Y$:
$$IG(X; Y) = H(X) + H(Y) - H(X, Y)$$
where $H(X)$, $H(Y)$, and $H(X, Y)$ are calculated as:
$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$
$$H(Y) = -\sum_{y} p(y) \log_2 p(y)$$
$$H(X, Y) = -\sum_{x} \sum_{y} p(x, y) \log_2 p(x, y)$$
We calculate $M_S$ for each random subset of features and choose the subset $S$ that has the largest value of $M_S$.
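The selection criterion can be illustrated with a small Python sketch. It assumes the standard correlation-based feature selection merit, with the information gain computed from empirical entropies in base 2, and it assumes discrete-valued feature columns (continuous features would need to be binned first). The function names are hypothetical, and the toy columns below are made up for illustration.

```python
import math
from collections import Counter


def entropy(values):
    """Empirical Shannon entropy (base 2) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(xs, ys):
    """IG(X; Y) = H(X) + H(Y) - H(X, Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def merit(features, labels):
    """Merit of a feature subset: high relevance to the label, low redundancy."""
    k = len(features)
    # Mean feature-to-class correlation.
    r_cf = sum(information_gain(f, labels) for f in features) / k
    # Mean feature-to-feature correlation over all unordered pairs.
    pairs = [(a, b) for idx, a in enumerate(features) for b in features[idx + 1:]]
    r_ff = sum(information_gain(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Toy example: f1 predicts the label perfectly, f2 carries no information.
labels = [0, 0, 1, 1]
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
best_alone = merit([f1], labels)
with_noise = merit([f1, f2], labels)
```

In this toy case adding the uninformative column lowers the merit, which is the behavior the selection process relies on when discarding irrelevant features.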
The module runs a script that takes advantage of the Apache Spark framework [26], an open-source unified analytics engine for large-scale data processing, to speed up the calculation. After being extracted, these features need to be normalized because they do not have similar ranges of values. The normalization can be expressed as follows:
$$x' = \frac{x - \mu}{\sigma}$$
where $x'$ is the standardized value, $x$ is the original value, $\mu$ is the mean of the distribution, and $\sigma$ is the standard deviation of the distribution. This module is called again to extract and calculate features from the five pcap files of 1-s time periods to build the sequential data for the LSTM model when the ANN model detects an attack during a 5-s time period.
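A minimal sketch of this standardization step, written with the Python standard library rather than Spark, is shown below. It assumes the population standard deviation and maps a constant column to zeros to avoid division by zero; both choices are assumptions, since the text does not specify them.

```python
import statistics


def standardize(column):
    """Rescale one feature column to zero mean and unit standard deviation."""
    mu = statistics.fmean(column)
    sigma = statistics.pstdev(column)  # population standard deviation (assumed)
    if sigma == 0:
        return [0.0 for _ in column]   # constant feature: nothing to scale
    return [(x - mu) / sigma for x in column]


# Example with made-up packet counts per window.
scaled = standardize([100.0, 200.0, 300.0])
```

In deployment, the mean and standard deviation would typically be computed once on the training data and reused for incoming windows, so that training and detection see features on the same scale.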

ANN Model-Period Detector
The ANN module aims to learn a set of features, as shown in Table 2, to detect whether there is an attack during a time period of 5 s. In this study, we employ classic feed-forward neural networks trained with a back-propagation algorithm. The ANN model parameters are presented in Table 3. Our proposed ANN model has an input layer of 10 nodes, 3 hidden layers, and an output layer of 1 node. Because our ANN model is a binary classifier, the activation function of the output layer is a sigmoid function:
$$f(z) = \frac{1}{1 + e^{-z}}$$
A 10-dimensional vector $x$ whose elements correspond to the input variables is fed into the ANN model. The trained model uses its learned parameters to compute a value $z$, which the sigmoid function maps to a new value $f(z)$ between zero and one. Based on this $f(z)$ value, the period detector classifies whether the observed time period is abnormal or normal.
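The final decision step can be sketched as follows. The 0.5 cut-off and the convention that values near one mean "abnormal" are assumptions made for this illustration; the text only states that the decision is based on the sigmoid output.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify_period(z, threshold=0.5):
    """Map the network output z to a label and the underlying probability."""
    p = sigmoid(z)
    label = "abnormal" if p >= threshold else "normal"
    return label, p
```

Raising the threshold would trade missed attack periods for fewer unnecessary activations of the flow detector, so the cut-off is a tunable design choice rather than a fixed constant.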

LSTM Model-Flow Detector
The LSTM model classifies each individual flow during the attack time period detected by the ANN model as a normal or abnormal flow. For each flow, the LSTM model considers a time series $X = \{x^{(1)}, x^{(2)}, x^{(3)}, \dots, x^{(n)}\}$, where each step $x^{(t)}$ in the time series is an $m$-dimensional feature vector whose elements correspond to the features in Table 4. For our proposed LSTM model, the value of the parameter $n$ is five. In other words, the sequence length of the input data is five, which is the shortest length among the LSTM models applied to the same problem. By using the period detector, we can determine when an attack occurs in the network traffic. We then only need to focus on the time series of each flow during that attacked time period. Thus, we can reduce the sequence length of the input data for the LSTM model. Figure 6 shows the model architecture of the proposed LSTM. Our proposed model includes an LSTM layer, a dropout layer to avoid over-fitting, a fully connected layer, and an output layer of a single node. The sigmoid function is used as the activation function for the output layer because the LSTM model is a binary classifier. Time series data $X = \{x^{(t-4)}, x^{(t-3)}, x^{(t-2)}, x^{(t-1)}, x^{(t)}\}$ are fed into the LSTM layer. The LSTM layer learns both the temporal and spatial representation from the input sequential data and outputs a vector $h = \{h^{(t-4)}, h^{(t-3)}, h^{(t-2)}, h^{(t-1)}, h^{(t)}\}$ following Equations (4) to (9). Each element in the output vector acts as a unit in the input layer of the fully connected layer. The fully connected layer works as an artificial neural network. It outputs a value $z$, which then goes through the sigmoid function to yield a classification of zero (normal flow) or one (abnormal flow). The model parameters are presented in Table 5.
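The length-five input for each flow could be assembled roughly as follows. The sketch assumes that the feature extractor yields, for each of the five 1-s slices, a dictionary from a flow identifier to its feature vector, and that a flow missing from a slice is padded with zeros; the padding rule and the function name are assumptions, not details from the text.

```python
def build_flow_sequences(per_second_features, n_features):
    """Turn five per-second feature dictionaries into per-flow sequences.

    per_second_features : list of dicts {flow_id: feature_vector}, one per 1-s slice
    n_features          : length of each feature vector (used for zero padding)
    """
    # Collect every flow seen in any of the slices.
    flow_ids = set().union(*per_second_features)
    zero = [0.0] * n_features
    # For each flow, take its vector from each slice in time order.
    return {
        fid: [list(sec.get(fid, zero)) for sec in per_second_features]
        for fid in sorted(flow_ids)
    }


# Illustrative input: flow "a" is active in every second, flow "b" only in the third.
slices = [
    {"a": [1.0, 2.0]},
    {"a": [1.5, 2.5]},
    {"a": [2.0, 3.0], "b": [9.0, 9.0]},
    {"a": [2.5, 3.5]},
    {"a": [3.0, 4.0]},
]
sequences = build_flow_sequences(slices, n_features=2)
```

Each value in `sequences` is a list of five vectors, which is the per-flow shape a length-five LSTM input would need.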
The virtual firewall (vFirewall) works as a filter mechanism that filters the incoming requests to the cloud services based on a comparison with a regularly updated blacklist. If the IP address of a packet matches an IP address listed in the blacklist, vFirewall executes a drop action to drop this packet. The blacklist is updated by the flow detector. Packets whose IP addresses do not match any entry in the blacklist are forwarded to the cloud server by a forward action.
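The drop-or-forward decision can be sketched as a simple set lookup. The packet representation (a dictionary with a `src` key) and the helper names are hypothetical; `drop_rule` shows one plausible way to express a blacklist entry as an iptables rule, assuming filtering on the INPUT chain, which the text does not specify.

```python
import ipaddress


def filter_packets(packets, blacklist):
    """Split packets into (forwarded, dropped) according to the source blacklist."""
    forwarded, dropped = [], []
    for pkt in packets:
        (dropped if pkt["src"] in blacklist else forwarded).append(pkt)
    return forwarded, dropped


def drop_rule(source_ip):
    """Build an iptables command that drops all traffic from one source address."""
    ipaddress.ip_address(source_ip)  # raises ValueError for malformed addresses
    return ["iptables", "-A", "INPUT", "-s", source_ip, "-j", "DROP"]


blacklist = {"10.0.0.5"}
packets = [{"src": "10.0.0.5", "len": 60}, {"src": "10.0.0.9", "len": 1500}]
forwarded, dropped = filter_packets(packets, blacklist)
```

Validating the address before building the rule guards against passing malformed flow-detector output to the firewall command line.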

Experimental Setup
In this section, we present our attack scenarios for the simulation. Next, the description of our dataset is given. Finally, we describe our test bed in detail.

Attack Scenarios
As discussed in Section 3, there are two common types of EDoS attacks: bandwidth depletion and resource depletion. To demonstrate the efficiency of our proposed mechanism, we simulate an ICMP flooding attack, a TCP SYN flooding attack, and an HTTP flooding attack. The ICMP flooding attack is representative of bandwidth depletion attacks, and the others are resource depletion attacks.
• An ICMP flooding attack, also known as a ping flood attack, is a common EDoS attack in which an attacker takes down a victim server by overwhelming it with ICMP echo requests, i.e., ping requests. The network bandwidth of the victim server is overloaded by the attacker's ping requests, and thus the victim server has to request more resources to reply to other normal requests.
• A TCP SYN flooding attack, also known as a "half-open attack", exploits part of the normal TCP three-way handshake to consume resources on the victim server and render it unresponsive. An attacker continuously sends initial connection requests to the victim server, making all ports unavailable to respond to upcoming legitimate traffic.
• A TCP-HTTP flooding attack is an EDoS attack in which a web server is exploited by an attacker through seemingly legitimate HTTP GET or POST requests. The attacker forces the targeted web server to allocate the utmost resources for each request. Because the attacker does not use spoofed IP requests, unlike in a TCP SYN flooding attack, this kind of attack is established in the application layer rather than in the network layer.
EDoS attacks are similar to low-rate DDoS attacks [19]. An attack rate of 10,000 packets per second is considered the standard threshold for differentiating between low-rate and high-rate attacks [15]. Following the instructions in [14,15], we use the Bonesi tool [27] to simulate an ICMP flooding attack and a TCP SYN flooding attack. A real EDoS attack scenario contains not only attack traffic but also normal traffic. To create a realistic EDoS attack, we therefore mix EDoS attack traffic with normal traffic. We simulated different levels of EDoS attacks whose request rates ranged from 1000 to 7000 requests per second, mixed with a fixed rate of 400 normal requests per second, following the EDoS evaluation scheme proposed by Al-Haidari et al. [28], as shown in Table 6.
Table 6. EDoS attack simulation with different numbers of requests.

Dataset Description
Following the recommendations in [14,15,29,30], we choose the SMD dataset and the UNSW-NB15 dataset for training the ANN and LSTM models, respectively. The SMD dataset was published at KDD 2019 and was collected from real network traffic statistics of a large internet company. It is also used in many anomaly detection studies. Each SMD sample has 38 features. Using a correlation-based feature selection process, we chose 10 key features for the ANN model. UNSW-NB15 was created with the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Center for Cyber Security. A raw network packet file (pcap file) of 50.2 GB is provided. We processed this pcap file to compute a multivariate time series of features for the LSTM model. The shapes of the training, validation, and testing sets of the ANN model are (28479, 10), (4257, 10), and (20300, 10), respectively. For the LSTM model, these shapes are (30400, 10), (6010, 10), and (18379, 10), respectively.

Figure 7 shows our testing topology, which consists of a virtual machine acting as the victim server; another virtual machine acting as a normal user, with the Packet Sender tool [31] installed to generate normal traffic; two virtual machines with the Bonesi tool installed, acting as attackers; and a physical machine on which Wireshark and the Apache Spark framework are installed, acting as the Collector, the Feature Extractor, and the two detectors, i.e., an ANN model and an LSTM model. On the victim machine, we installed an Nginx server following the instructions in [32]. Linux iptables [33] is deployed on the victim virtual machine. Open vSwitch is installed in a virtual machine. All of the virtual machines in our testbed run Ubuntu 20.04 and were created with Oracle VirtualBox version 5.2.42 [34]. The physical machine has an Intel(R) Core(TM) i7-4770 CPU at 3.40 GHz and 48 GB of memory, and runs 64-bit Ubuntu Linux 18.04.

For training and implementing the two deep learning models, we use the Keras 2.4.3 library, an open-source software library that provides a Python interface [35]. The models are also trained on the physical machine mentioned earlier.
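The LSTM flow detector consumes short sequences of feature vectors rather than single samples. The following is a minimal sketch of how a per-flow feature series could be cut into overlapping windows of five time steps. The helper name `make_windows`, the stride of one, and the toy data are assumptions made for illustration; they are not taken from the original implementation.

```python
from typing import List, Sequence

SEQ_LEN = 5      # sequence length used by the LSTM flow detector in this paper
N_FEATURES = 10  # features per time step, matching the dataset shapes above


def make_windows(series: Sequence[Sequence[float]],
                 seq_len: int = SEQ_LEN) -> List[List[List[float]]]:
    """Cut a (time, features) series into overlapping windows of seq_len steps.

    A stride of one time step is assumed. A series shorter than seq_len
    yields no windows.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    return [
        [list(step) for step in series[start:start + seq_len]]
        for start in range(len(series) - seq_len + 1)
    ]


# Toy series: 8 time steps, each a vector of 10 hypothetical feature values.
toy_series = [[float(t)] * N_FEATURES for t in range(8)]
windows = make_windows(toy_series)

# Number of windows, steps per window, features per step.
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window has the (time steps, features) layout that a recurrent layer expects for one sample, so a shorter `SEQ_LEN` directly reduces the amount of data the detector must process per flow.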

Result Analysis
A successful EDoS attack mitigation mechanism requires correct identification of attacks and a quick response time (quality of service) while minimizing resource consumption [1]. In this section, we evaluate our proposed model based on two main criteria: quality of service and resource consumption. In addition, we evaluate the efficiency of the proposed model at the cloud service provider's end by comparing the CPU usage of the victim server when it is protected by our proposed model with that when it is not protected.

Quality of Service
To demonstrate the effectiveness of the proposed EDoS detection system, we compare our model with existing works under the same environmental setup. To the best of our knowledge, the works in [13,15] are the most recent state-of-the-art studies that apply machine learning to detect multiple types of EDoS attacks, so we use them as baselines. To assess detection accuracy, we compare our approach with the two approaches in [13], i.e., a support vector machine (SVM) and a neural network (NN). Our approach is designed to eliminate the calculation and delay overhead of RNN-based methods. Thus, to assess the calculation time, the response of the system, and the resource usage, we compare our approach with the model proposed in [15], which uses a GRU, another RNN variant, with a longer sequential input (100) than the one we use for the LSTM model. Four metrics, i.e., accuracy (AC), detection rate (R), F1-score (F1), and false alarm rate, are calculated to evaluate the anomaly detection capacity of our proposed mechanism. The four metrics are defined as follows:

• Accuracy is the proportion of correct detections over the total number of flows: $AC = \frac{TP + TN}{TP + TN + FP + FN}$
• Detection rate is the proportion of detected abnormal flows to all abnormal flows: $R = \frac{TP}{TP + FN}$
• False alarm rate is the ratio of abnormal flows falsely classified as normal flows.
• F1-score is the harmonic mean of P and R: $F1 = \frac{2 \cdot P \cdot R}{P + R}$

Here, $P = \frac{TP}{TP + FP}$ is the ratio of correctly detected abnormal flows to all flows detected as abnormal. TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.
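As an illustration of the metric definitions above, the following sketch computes accuracy, detection rate, precision, and F1-score from confusion-matrix counts, using their standard formulas. The function name `detection_metrics` and the example counts are hypothetical and are not results from this paper. The false alarm rate is left out because its exact formula is not given in this section.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute AC, R, P, and F1 from confusion-matrix counts.

    Assumes there is at least one abnormal flow (tp + fn > 0) and at least
    one true positive (tp > 0), so that no denominator is zero.
    """
    if tp <= 0 or min(fp, tn, fn) < 0:
        raise ValueError("tp must be positive and all counts non-negative")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    detection_rate = tp / (tp + fn)   # R, also called recall
    precision = tp / (tp + fp)        # P
    f1 = 2 * precision * detection_rate / (precision + detection_rate)
    return {
        "accuracy": accuracy,
        "detection_rate": detection_rate,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, chosen only to exercise the formulas.
example = detection_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.3%}")
```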

Evaluate the Period Detector
We first evaluate the performance of the period detector. Although the period detector cannot identify exactly which flow is abnormal, it helps the system detect when an attack occurs and reduces the sequence length of the input data for the flow detector, which shortens the per-flow detection time of the whole system. We evaluate the period detector to show that the parameters chosen for this ANN model are effective. Figure 8 presents the detection performance of the period detector for three common types of EDoS attacks, using the model parameters listed in Table 3. The results are summarized in Table 7. When detecting an ICMP flooding attack, the accuracy, detection rate, false alarm rate, and F1-score are 95.896%, 94.208%, 4.104%, and 95.831%, respectively. The results show that our model can accurately detect when an EDoS attack occurs, which helps the flow detector quickly detect abnormal flows.

We then evaluate the detection performance of the flow detector. The flow detector is triggered when the period detector detects an attack and raises an alert; it then determines exactly which flows are abnormal. Figure 9 shows the detection performance of four solutions for the three common types of EDoS attacks: R-EDoS [15], SVM-Abbasi et al. [13], NN-Abbasi et al. [13], and our proposed scheme. The average results are presented in Table 8. Our proposed model achieves the highest detection rate, accuracy, and F1-score (98.139%, 98.163%, and 98.163%, respectively), which is slightly higher than the R-EDoS system and clearly better than the two solutions (SVM and NN) in [13]. Our system also produces the fewest false alarms: its false alarm rate is only 1.837%, whereas that of R-EDoS is 4.333%, and those of SVM and NN are 13.779% and 21.113%, respectively.
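The interaction between the two phases can be sketched as a simple gate: the flow detector runs only for periods that the period detector flags. In the sketch below, the function `two_phase_detect`, the flow identifiers, and both toy detectors are hypothetical stand-ins; the actual system uses a trained ANN and a trained LSTM in their place.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Window = Sequence[Vector]  # a short sequence of per-flow feature vectors


def two_phase_detect(
    period_features: Vector,
    flow_windows: Dict[str, Window],
    period_detector: Callable[[Vector], bool],
    flow_detector: Callable[[Window], bool],
) -> List[str]:
    """Return the IDs of flows classified as abnormal in one time period."""
    # Phase 1: decide whether an attack is occurring in this period at all.
    if not period_detector(period_features):
        return []  # no alert, so the costlier per-flow model is skipped
    # Phase 2: inspect each flow's short window to find the abnormal ones.
    return [fid for fid, window in flow_windows.items() if flow_detector(window)]


# Toy stand-ins for the trained models (thresholds are arbitrary).
def toy_period_detector(features: Vector) -> bool:
    return sum(features) / len(features) > 0.5


def toy_flow_detector(window: Window) -> bool:
    return all(step[0] > 0.8 for step in window)


flows = {"flow-a": [[0.9]] * 5, "flow-b": [[0.1]] * 5}
under_attack = two_phase_detect([0.7, 0.9], flows, toy_period_detector, toy_flow_detector)
quiet_period = two_phase_detect([0.1, 0.2], flows, toy_period_detector, toy_flow_detector)
print(under_attack, quiet_period)
```

One consequence of this structure is that per-flow inference cost is paid only during flagged periods, which is consistent with the motivation for using a lightweight first-phase model.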
R-EDoS and our approach achieve much better results than the SVM and NN in [13] because they learn from both current and historical information about attackers, whereas SVM and NN only observe the attacker's information in the current time period. For example, when a normal TCP session has just sent its first packets to the victim, the short observation time makes the collected TCP flow look like a TCP SYN flooding flow, so the system in [13] classifies it as an abnormal flow. This indicates that recurrent neural network-based algorithms outperform other machine learning-based algorithms in EDoS attack detection.

The detection time and response time are two crucial metrics in evaluating an EDoS detection mechanism [18]. The detection time measures how quickly a new attack is detected. We simulated a SYN flooding attack with an attack rate of 7000 requests per second and calculated the detection time for each classified flow. The results in Figure 10 show that our flow detector outperforms the three other solutions in terms of detection time. In particular, our model requires only 0.54 s to detect each flow because our LSTM model uses a sequential input length of 5 instead of the 100 that R-EDoS uses for its GRU-based model. This shortens the calculation time of our proposed model compared with the GRU-based model in [15], and it also allows our model to outperform the three other methods in terms of response time.

The response time is the time from when a cloud user makes a request until the corresponding response is sent back to the user. Overall, the response times of our proposed model (varying within 21-29 ms) and R-EDoS, both of which are RNN variants, are similar and shorter than those of the SVM-based and NN-based models [13]. Our proposed model, however, is better than R-EDoS in terms of response time (varying from 21-25 ms) because of its shorter sequential input length.
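A rough way to see why the sequence length matters is to count the multiply-accumulate operations in the recurrent gates, which scale linearly with the number of time steps. The sketch below is a back-of-the-envelope estimate only: the hidden size of 64 and the assumption that both models see 10 features per step are hypothetical, and biases, activations, and output layers are ignored. It uses the standard facts that an LSTM cell has four gate blocks and a GRU cell has three.

```python
def recurrent_macs(seq_len: int, n_features: int, hidden_units: int, n_gates: int) -> int:
    """Approximate multiply-accumulates for one pass over one input sequence.

    Each gate multiplies the concatenated [hidden state, input] vector by a
    weight matrix of shape (hidden_units, hidden_units + n_features).
    """
    per_step = n_gates * hidden_units * (hidden_units + n_features)
    return seq_len * per_step


HIDDEN = 64    # hypothetical hidden size, assumed equal for both models
FEATURES = 10  # features per time step (assumed the same for both models)

lstm_short = recurrent_macs(seq_len=5, n_features=FEATURES, hidden_units=HIDDEN, n_gates=4)
gru_long = recurrent_macs(seq_len=100, n_features=FEATURES, hidden_units=HIDDEN, n_gates=3)

print(lstm_short, gru_long, gru_long / lstm_short)
```

Under these assumptions the ratio reduces to (3 × 100) / (4 × 5) = 15 regardless of the hidden size, so the shorter input more than offsets the extra gate of the LSTM. This estimate says nothing about measured wall-clock times, which also depend on feature extraction and the runtime environment.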

Resources Consumption
In this section, we evaluate the efficiency of our proposed mechanism in terms of CPU usage and memory usage, which are two representative resource consumption metrics. Figures 12 and 13 show the CPU and memory utilization, respectively. These results were obtained when we simulated EDoS attacks at rates of 1000-7000 requests per second during a 100-s period. The CPU usage reported is that of the physical machine while extracting the features and detecting each flow. Because the data sequence length is five, our model has the lowest CPU usage (32% to 36.3%) among the compared solutions, i.e., SVM-Abbasi et al. and NN-Abbasi et al. [13], and R-EDoS [15] (40% to 43.2%). With respect to memory utilization, our proposed scheme consumes only 2.8% to 3.5%, whereas the consumption of the GRU-based R-EDoS method is approximately 4% higher than that of our two-phase model. Overall, our proposed model consumes fewer resources than the SVM-Abbasi et al., NN-Abbasi et al., and GRU-based R-EDoS models.

Performance Analysis at the Victim Server
In this section, we present the experiment results of the CPU usage of a victim server under the control of our proposed EDoS detection mechanism to evaluate the efficiency of our proposed model at the service provider's end. We launched a TCP-SYN flooding attack with an attack rate of 7000 requests per second and a normal traffic rate of 400 requests per second during a 1-h period.
During the first 5 min, we only launched normal traffic, after which we launched a TCP-SYN flooding attack. Figure 14 shows the victim's CPU usage with and without the protection of our proposed EDoS detection model. Without protection, the victim server has to handle a large number of abnormal requests, which makes it consume a large amount of resources. When the TCP-SYN flooding attack is launched (at minute five), the CPU usage increases from 40% to over 50% and reaches 100% after 50 min. Most cloud-based systems set an upper threshold of 80% CPU usage to trigger another virtual machine. Therefore, without the protection of our proposed model, the cloud system will need a second virtual machine to be allocated 30 min after the attack starts, which increases the cost the user must pay to the cloud service provider.
However, when being protected by our proposed model, the average CPU usage of the victim server is only approximately 40% during a 1-h period. This means the cloud user does not need to pay for a second virtual machine when being protected by our proposed model while under EDoS attacks.
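The scale-out reasoning above can be illustrated with a small threshold check on a per-minute CPU trace. The traces below are synthetic: a linear ramp from 50% at the start of the attack to 100% fifty minutes later is an assumption made for illustration and is not the measured data of Figure 14. The helper name `first_minute_at_or_above` is likewise hypothetical.

```python
from typing import Optional, Sequence

SCALE_OUT_THRESHOLD = 80.0  # CPU % at which another virtual machine is triggered
ATTACK_START_MIN = 5        # attack begins after 5 min of normal traffic


def first_minute_at_or_above(cpu_trace: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first minute whose CPU usage reaches the threshold, if any."""
    for minute, usage in enumerate(cpu_trace):
        if usage >= threshold:
            return minute
    return None


# Synthetic unprotected trace: 40% before the attack, then an assumed linear
# ramp of one percentage point per minute starting at 50%, capped at 100%.
unprotected = [40.0] * ATTACK_START_MIN + [
    min(100.0, 50.0 + (minute - ATTACK_START_MIN))
    for minute in range(ATTACK_START_MIN, 61)
]
# Synthetic protected trace: flat at roughly 40% for the whole hour.
protected = [40.0] * 61

unprotected_hit = first_minute_at_or_above(unprotected, SCALE_OUT_THRESHOLD)
protected_hit = first_minute_at_or_above(protected, SCALE_OUT_THRESHOLD)
print(unprotected_hit, protected_hit)
```

Under the assumed ramp, the threshold is reached at minute 35, i.e., 30 min after the attack starts, whereas the protected trace never reaches it and no second virtual machine would be triggered.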

Discussion
Based on the comprehensive results given above, we summarize some of the outstanding points demonstrating the effectiveness of the two-phase deep learning-based EDoS detection system in detecting EDoS attacks conducted on our practical testbed:

• Our proposed model achieves a high detection rate, accuracy, and F1-score, and a low error rate, which shows that it outperforms other state-of-the-art solutions. More specifically, the proposed two-phase deep learning-based EDoS detection system achieves 98.163%, 98.139%, 98.163%, and 1.837% for accuracy, detection rate, F1-score, and false alarm rate, respectively. These values are much better than those of the SVM (86.220%, 85.392%, 78.859%, 13.779%) and neural network-based (78.887%, 77.901%, 78.859%, 21.113%) models in [13]. Compared with the R-EDoS system, which uses a GRU (another variant of RNN models), the proposed model also performs better in terms of accuracy, detection rate, F1-score, and false alarm rate.
• Our model can defend against three common types of EDoS attacks, i.e., HTTP flooding, TCP-SYN flooding, and ICMP flooding attacks.
• Our scheme can detect and mitigate each abnormal flow in network traffic within a very short period of time, i.e., 0.54 s. This result clearly outperforms other existing approaches, especially R-EDoS [15], which uses another variant of RNN and achieves fairly high accuracy in EDoS detection. The quick detection time makes the response time of the entire system much lower than that of other existing solutions despite the large amount of network traffic.
• Our proposed EDoS detection model consumes few CPU (32% to 36.3%) and memory (2.8% to 3.5%) resources.
• With our defense system, a cloud user can avoid being forced to pay the unexpected costs caused by EDoS attacks.

In other words, our two-phase deep learning-based EDoS defense system is the most suitable approach to protect cloud systems from various EDoS attacks, and it brings better service quality to the protected cloud services. Although our proposed system is very efficient in handling EDoS attacks, it does have a limitation: the accuracy of the period detector, i.e., the ANN model, is not very high (from 95.896% to 96.439%). Some recent advanced image processing algorithms proposed in [36,37] could be considered for period detection. Both are CNN-based models that have been improved to reduce computation and enhance detection speed.

Conclusions
In this study, we propose a novel mechanism to handle EDoS attacks in each flow coming into a cloud system. This approach not only protects cloud users from paying much more money because of various EDoS attacks, such as TCP-HTTP flooding, ICMP flooding, and TCP SYN flooding, but also helps cloud service providers improve their service quality. We present a two-phase deep learning-based detector for EDoS detection that combines the advantages of the ANN and LSTM algorithms. By using the period detector, i.e., the ANN model, to detect when an attack occurs, we can take advantage of the accuracy of the LSTM while eliminating the greatest disadvantage of this algorithm, i.e., the long sequence length of the input data. Finally, our mechanism can be applied to different network systems and adapts well because the deep learning-based detectors are replaced periodically using an updated training database. The evaluation described in Section 6 shows that our proposed mechanism is highly efficient in terms of accuracy, detection time, response time, and resource consumption.
In future work, we expect to improve the mechanism by using an SDN-based model to enhance the process of mitigating attacks and collecting network features. Moreover, as discussed in Section 7, we will improve the performance of the period detector by considering the two models mentioned in [36,37]. In addition, we plan to compare the proposed scheme with other EDoS defense systems using more evaluation criteria.

Conflicts of Interest:
The authors declare no conflict of interest.