1. Introduction
In recent years, the development of Artificial Intelligence (AI) technology has contributed significantly to various disciplines, including cybersecurity [1]. One significant issue in cybersecurity is Distributed Denial of Service (DDoS) attacks, which have escalated over many years [2]. DDoS attacks disrupt legitimate users’ access to services by rapidly injecting enormous volumes of malicious traffic, costing the victims their reputation, resources, and potential clients [3]. The outbreak of the COVID-19 pandemic in 2020 increased reliance on network infrastructure, leading to a notable surge in DDoS attacks [4,5]. Since many businesses function as service providers, they must maintain uninterrupted operations. Consequently, any disruption stemming from a compromised network or service can lead to significant financial and reputational damage [6]. According to the quarterly DDoS report from Cloudflare, a Content Delivery Network (CDN) vendor, a considerable number of DDoS attacks are launched each month [7]. Even though most malicious traffic stays below 500 Mbps, such a volume can still temporarily interrupt multiple enterprise systems. Every quarter, targeted attacks peaking at up to 100 Gbps occur, leading to extensive service interruptions, possible data center shutdowns, and revenue losses for service providers.
DDoS attack techniques evolve constantly [8], and outdated countermeasures are insufficient to protect against novel threats [9]. As a result, there is a need for an approach that enables an existing intrusion detection system (IDS) to identify previously unseen traffic characteristics. Such a mechanism would assist telecommunications technicians in detecting covert intrusions. In recent years, significant progress has been achieved in AI technology, and the associated work has been applied in various fields, including cybersecurity [10]. Several deep-learning-based IDSs have been developed, and they all exhibit remarkable accuracy. Relevant experiments demonstrate that the accuracy of identifying standard DDoS attacks can exceed 90% [11,12,13]. However, when a conventional IDS faces a novel form of attack, it does not classify the traffic as unknown and is therefore unable to address it. Consequently, there is a need for an IDS that promptly notifies the telecom technician of any unfamiliar traffic for examination at the onset of an attack, instead of forcing a positive or negative decision. This is particularly crucial when previous and current threats differ markedly. The response of the defense system is of utmost importance when an attack is built from distinct fundamental components. In such cases, the problem no longer lies in the efficacy of the training process. One possible solution is to update both the training and test datasets. However, unknown traffic remains a significant challenge for the model, because the open-set scenario is more complex than the closed-set one.
This study addresses the limitations of existing IDS architectures, which often struggle to detect unknown traffic in DDoS attacks, by proposing a novel IDS architecture that leverages deep learning technology. Our approach combines deep learning techniques and geometrical metrics to improve both accuracy and the detection of unknown traffic. The model’s backbone, CNN-Geo, is based on a CNN architecture and incorporates a geometrical metric, which offers enhanced detection capabilities. Furthermore, the system’s incremental learning module allows it to adapt to new attack patterns by incorporating newly labeled samples provided by telecom engineers, continuously improving its defensive performance. Because it combines supervised learning, a CNN, geometrical metrics, and an incremental learning module to incorporate new information and adapt to new attack patterns, the system can be regarded as intelligent. This allows the framework to detect unexpected DDoS attacks with high accuracy and to overcome the restrictions of classical approaches.
The practical value of our IDS architecture lies in its ability to protect networks and systems against DDoS attacks more effectively than classical approaches. A detection rate of over 99% against conventional attacks from the well-known CICIDS2017 dataset demonstrates its efficacy. Moreover, the model reaches an accuracy of 99.8% on unknown attacks when tested on the recent CICIDDoS2019 open datasets. Our findings suggest that the proposed IDS architecture can significantly improve the detection of and defense against DDoS attacks, ensuring the security and reliability of network systems in real-world applications.
The remainder of this paper is organized in the following manner:
Section 2 offers an overview of relevant literature.
Section 3 outlines the underlying assumptions and the proposed detection framework.
Section 4 presents the experimental findings, while
Section 5 concludes the study and discusses potential avenues for future research.
3. Proposed Methodology
We propose a framework that uses a CNN architecture to classify conventional traffic. The open set recognition (OSR) obstacle in identifying DDoS attacks is addressed by combining this classifier with a geometrical metrics calculation module and an incremental learning approach. The operation of the proposed framework is illustrated in Figure 1.
To equip the model with the ability to identify unknown samples, the study adopts a geometrical metrics calculation module, which computes a metric threshold and enables the identification of samples that fall outside the distribution. Once the threshold is defined, classification proceeds only for the elements that satisfy the threshold condition, whereas samples whose calculated metrics fall below the threshold are treated as outliers. Using a CNN model has several advantages, including its ability to recognize spatial and temporal patterns in the input data. Additionally, the sparse categorical cross-entropy loss function allows for a simpler optimization process, and the adopted coding approach mitigates issues of linear dependence between labels. Moreover, the geometrical metrics used by the calculation module help identify data that deviate from the distribution, thereby augmenting the model’s capability to detect unknown samples.
3.1. CNN Classifier
This study employs a CNN as the basis of the framework because of its aptitude for identifying patterns in data, particularly in high-dimensional datasets, as depicted in Figure 2. CNNs can learn intricate features from raw data, making them a suitable option for identifying unfamiliar traffic in DDoS attacks. The framework takes as input a 9 × 9 matrix representing a network flow and outputs predictions for two classes, Benign and Attack. The proposed classifier uses a CNN-based architecture with several convolutional layers followed by batch normalization, dropout, and fully connected layers. The number of filters and the filter size are progressively reduced, resulting in smaller feature maps, which helps the model capture increasingly complex patterns in the network flow data. The batch normalization and dropout layers help to reduce overfitting and improve the convergence of the model during training. The model achieved promising results in accurately identifying different types of conventional DDoS attacks.
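The text does not describe how a flow record becomes a 9 × 9 input. As a minimal sketch, assuming the flow is available as a flat list of at most 81 numeric features, one simple option is to zero-pad the vector and reshape it row by row. The function name `flow_to_matrix` and the 78-feature example are hypothetical and are not taken from this study.

```python
def flow_to_matrix(features, side=9):
    """Zero-pad a flat feature vector and reshape it row-major into side x side.

    Assumption: zero-padding and row-major order are illustrative choices only.
    """
    size = side * side
    values = [float(v) for v in features]
    if len(values) > size:
        raise ValueError(f"expected at most {size} features, got {len(values)}")
    # Fill the unused trailing cells with zeros.
    padded = values + [0.0] * (size - len(values))
    # Cut the padded vector into `side` consecutive rows.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical flow with 78 numeric features (values 0..77).
matrix = flow_to_matrix(range(78))
```

The resulting nested list has nine rows of nine values and can be converted to whatever tensor type the chosen deep learning library expects.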
3.2. Density and Coverage
The ability to accurately assess the similarity between a real distribution and a generative model is crucial in machine learning applications. To achieve this objective, it is imperative to devise an algorithm capable of assessing the likelihood that two sets of samples, the real samples $\{X_i\}$ and the unknown samples $\{Y_j\}$, originate from the same distribution. Density and coverage have been proposed as two metrics that can effectively assess the performance of generative models.
3.2.1. Density Metric
Density is a metric that quantifies the degree to which the neighborhoods of real samples overlap with those of unknown samples. Specifically, density counts the number of real-sample neighborhood spheres $B(X_i, \mathrm{NND}_k(X_i))$ that contain $Y_j$. Here, $B(x, r)$ represents the sphere in $\mathbb{R}^D$ centered around $x$ with a radius of $r$, and $\mathrm{NND}_k(X_i)$ denotes the distance from $X_i$ to the $k$-th nearest neighbor among $\{X_i\}$, excluding itself. The manifold consists of the superimposition of the neighborhood spheres $\{B(X_i, \mathrm{NND}_k(X_i))\}$, and an expected likelihood of unknown samples is measured. The density metric is defined as formula (1) and illustrated according to Figure 3, where $k$ represents the number of nearest neighbors, $N$ the number of real samples, and $M$ the number of unknown samples:

$$\mathrm{density} := \frac{1}{kM} \sum_{j=1}^{M} \sum_{i=1}^{N} \mathbf{1}_{Y_j \in B(X_i, \mathrm{NND}_k(X_i))} \qquad (1)$$

By taking into account the degree to which unknown samples overlap with real samples in densely packed regions, the density metric is less vulnerable to the effects of outliers.
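Density is calculated following Algorithm 1.
Algorithm 1 Calculation of Density
Input: real samples $\{X_i\}$, unknown samples $\{Y_j\}$, and the number of nearest neighbors $k$, from which the array counting the real-sample neighbourhood spheres that contain each unknown sample is built.
Output: density value
1. $N \leftarrow$ number of real samples
2. $M \leftarrow$ number of unknown samples
3. $r_i$: distance from $X_i$ to its $k$-th nearest real neighbor, for $i = 1, \dots, N$
4. $d_{ij}$: distance between $X_i$ and $Y_j$
5. count = array[$M$] of zeros
6. for $j = 1, \dots, M$:
7.     for $i = 1, \dots, N$:
8.         if $d_{ij} \le r_i$:
9.             count[$j$] $\leftarrow$ count[$j$] + 1
10.     end for
11. end for
12. density $\leftarrow \frac{1}{kM} \sum_{j=1}^{M}$ count[$j$]
13. return density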
Figure 4 provides a detailed flowchart of Algorithm 1, illustrating the key stages and components involved.
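As a minimal sketch of the density calculation, the following pure-Python code counts, for each unknown sample, how many real-sample neighborhood spheres contain it and normalizes the total by $k$ times the number of unknown samples, following the definition in [40]. The function names `knn_radii` and `density`, the Euclidean distance, and the brute-force neighbor search are assumptions made for illustration and are not necessarily the implementation used in this study.

```python
import math


def knn_radii(real, k):
    """Distance from each real sample to its k-th nearest real neighbor (itself excluded)."""
    if not 1 <= k <= len(real) - 1:
        raise ValueError("k must be between 1 and len(real) - 1")
    radii = []
    for i, x in enumerate(real):
        dists = sorted(math.dist(x, other) for l, other in enumerate(real) if l != i)
        radii.append(dists[k - 1])
    return radii


def density(real, unknown, k):
    """Average number of real-sample k-NN spheres containing an unknown sample, divided by k."""
    radii = knn_radii(real, k)
    counts = [0] * len(unknown)
    for j, y in enumerate(unknown):
        for x, r in zip(real, radii):
            if math.dist(x, y) <= r:  # y lies inside the sphere around x
                counts[j] += 1
    return sum(counts) / (k * len(unknown))


# Toy data: four real samples on the corners of the unit square.
real_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
center_value = density(real_samples, [(0.5, 0.5)], k=1)
far_value = density(real_samples, [(10.0, 10.0)], k=1)
```

In this toy case the unknown sample at the center lies inside all four spheres, so `center_value` is 4.0, while the distant sample yields 0.0. The example also shows that density, unlike coverage, is not bounded above by 1.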
3.2.2. Coverage Metric
Coverage, on the other hand, quantifies diversity by measuring the extent to which unknown samples cover the real samples. In other words, coverage is the ratio of real samples that are covered by unknown samples. To improve the accuracy of coverage, the nearest neighbor manifolds are built around the real samples instead of the unknown ones, as the former are less prone to outliers. Moreover, the manifold needs to be computed only once per dataset instead of once per model, which reduces the heavy nearest neighbor computations required by recall. The coverage metric is defined as formula (2) [40], illustrated by Figure 5, and represents the fraction of real samples whose neighborhoods contain at least one unknown sample:

$$\mathrm{coverage} := \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}_{\exists\, j \ \text{s.t.}\ Y_j \in B(X_i, \mathrm{NND}_k(X_i))} \qquad (2)$$

where $B(X_i, \mathrm{NND}_k(X_i))$ is the sphere centered at the real sample $X_i$ whose radius is the distance to its $k$-th nearest real neighbor, and $N$ is the number of real samples. The coverage metric ranges from 0 to 1.
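Coverage is calculated following Algorithm 2.
Algorithm 2 Calculation of Coverage
Input: real samples $\{X_i\}$, unknown samples $\{Y_j\}$, and the number of nearest neighbors $k$, from which the array marking the real-sample neighbourhood spheres that contain at least one unknown sample is built.
Output: coverage value
1. $N \leftarrow$ number of real samples
2. $M \leftarrow$ number of unknown samples
3. $r_i$: distance from $X_i$ to its $k$-th nearest real neighbor, for $i = 1, \dots, N$
4. $d_{ij}$: distance between $X_i$ and $Y_j$
5. covered = array[$N$] of zeros
6. for $i = 1, \dots, N$:
7.     for $j = 1, \dots, M$:
8.         if $d_{ij} \le r_i$:
9.             covered[$i$] $\leftarrow$ 1
10.             break
11.     end for
12. end for
13. coverage $\leftarrow \frac{1}{N} \sum_{i=1}^{N}$ covered[$i$]
14. return coverage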
Figure 6 provides a detailed flowchart of Algorithm 2, illustrating the key stages and components involved.
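The coverage calculation can be sketched in the same way. The code below marks each real sample whose neighborhood sphere contains at least one unknown sample and returns the fraction of marked samples, following the definition in [40]. As before, the function names, the Euclidean distance, and the brute-force search are illustrative assumptions rather than the implementation used in this study.

```python
import math


def knn_radii(real, k):
    """Distance from each real sample to its k-th nearest real neighbor (itself excluded)."""
    if not 1 <= k <= len(real) - 1:
        raise ValueError("k must be between 1 and len(real) - 1")
    radii = []
    for i, x in enumerate(real):
        dists = sorted(math.dist(x, other) for l, other in enumerate(real) if l != i)
        radii.append(dists[k - 1])
    return radii


def coverage(real, unknown, k):
    """Fraction of real samples whose k-NN sphere contains at least one unknown sample."""
    radii = knn_radii(real, k)
    covered = [0] * len(real)
    for i, (x, r) in enumerate(zip(real, radii)):
        for y in unknown:
            if math.dist(x, y) <= r:
                covered[i] = 1
                break  # one unknown sample is enough to cover x
    return sum(covered) / len(real)


# Toy data: four real samples on the corners of the unit square.
real_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
partial_value = coverage(real_samples, [(0.1, 0.0)], k=1)
full_value = coverage(real_samples, [(0.5, 0.5)], k=1)
```

Here the unknown sample at (0.1, 0.0) falls inside the spheres of only the two bottom corners, so `partial_value` is 0.5, whereas the sample at the center covers all four real samples and `full_value` is 1.0.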
3.2.3. Density and Coverage Behavior Analysis
To verify the effectiveness of the density and coverage metrics, it is necessary to examine whether they reach their best values when the intended criteria are met. Analyzing the expected values $\mathbb{E}[\mathrm{density}]$ and $\mathbb{E}[\mathrm{coverage}]$ for identical real and unknown distributions reveals that these metrics approach 100% as the sample sizes $(N, M)$ and the number of neighborhoods $k$ increase. This analysis further leads to a systematic algorithm for selecting the hyperparameters $(k, N, M)$ for generative models. Specifically, the algorithm can be used to determine the hyperparameter values that maximize the effectiveness of the density and coverage metrics in assessing the similarity between the real and unknown distributions. We derive the expected values of density and coverage under identical real and unknown data in formulas (3) and (4) [40]:

$$\mathbb{E}[\mathrm{density}] = 1 \qquad (3)$$

$$\mathbb{E}[\mathrm{coverage}] = 1 - \frac{(N-1)\cdots(N-k)}{(M+N-1)\cdots(M+N-k)} \qquad (4)$$

As $M = N \to \infty$, $\mathbb{E}[\mathrm{coverage}] \to 1 - \frac{1}{2^k}$.
By taking into account the degree to which unknown samples overlap with real samples in densely packed regions and measuring the extent to which unknown samples cover the real samples, these metrics offer a comprehensive evaluation of the dataset’s distribution. Additionally, analyzing the expected values of density and coverage for identical real and unknown distributions makes it possible to select the hyperparameters systematically, so that the metrics perform as intended.
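To illustrate how this analysis can guide hyperparameter selection, the sketch below evaluates the closed-form expected coverage for identical distributions given in [40], $1 - \prod_{t=1}^{k} \frac{N-t}{M+N-t}$, for $N$ real samples, $M$ unknown samples, and $k$ neighbors. The function name `expected_coverage` and the example sizes are assumptions made for illustration.

```python
def expected_coverage(n_real, n_unknown, k):
    """Expected coverage when real and unknown samples share one distribution.

    A real sample is uncovered only if its k nearest neighbors among the other
    n_real - 1 real samples and the n_unknown unknown samples are all real.
    """
    if not 1 <= k <= n_real - 1:
        raise ValueError("k must be between 1 and n_real - 1")
    prob_uncovered = 1.0
    for t in range(1, k + 1):
        prob_uncovered *= (n_real - t) / (n_real + n_unknown - t)
    return 1.0 - prob_uncovered


# With equally sized batches of 10,000 samples, larger k pushes the value toward 1.
values_by_k = {k: expected_coverage(10_000, 10_000, k) for k in (1, 3, 5, 10)}
```

For equal, large sample sizes the result is close to $1 - 2^{-k}$, so a practitioner can pick the smallest $k$ for which the expected coverage of identical data is acceptably close to 1.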
3.3. Unknown Identification Module
The proposed Unknown Identification Module is designed to address the challenge of identifying unknown attacks in the cybersecurity domain. The need for such a module arises from the ever-evolving threat landscape and the difficulty of identifying and isolating unknown attacks. The module is designed to work with the CICIDS2017-Wednesday dataset, a widely used benchmark for network intrusion detection systems. We divide the original CICIDS2017-Wednesday data into batches of 10,000 samples, each preserving the ratio of the original labels in the dataset. Dividing the dataset into batches lets us assess the similarity between each batch and the baseline dataset without processing the entire dataset at once, which improves the speed and efficiency of the module.
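The batching step can be sketched as follows. The exact splitting procedure is not specified in the text, so this example assumes a simple scheme: the indices of each label are shuffled and dealt round-robin across batches, which keeps the label ratio of every batch close to that of the whole dataset. The function name `stratified_batches` and the toy label counts are hypothetical.

```python
import random
from collections import defaultdict


def stratified_batches(labels, batch_size, seed=0):
    """Split sample indices into batches that approximately preserve label ratios."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    n_batches = max(1, len(labels) // batch_size)
    batches = [[] for _ in range(n_batches)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every batch receives about the same share.
        # Leftover samples land in the first batches, which may slightly exceed batch_size.
        for pos, idx in enumerate(indices):
            batches[pos % n_batches].append(idx)
    return batches


# Toy labels with an 80/20 class ratio; the study itself uses batches of 10,000 samples.
toy_labels = ["benign"] * 80 + ["attack"] * 20
toy_batches = stratified_batches(toy_labels, batch_size=25)
```

Each returned batch is a list of indices into the original dataset, so the density and coverage of a batch can then be computed on the corresponding feature rows.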
Subsequently, the density and coverage metrics are computed to determine the correlation between the data batches. As described in Section 3.2, density quantifies the degree to which the neighborhoods of real samples overlap with those of unknown samples, whereas coverage quantifies diversity by measuring the extent to which unknown samples cover the real samples. Calculating both metrics lets us determine the similarity of each batch to the baseline dataset, which is crucial for identifying unknown attacks. To evaluate outliers, we build thresholds from the averages of the density and coverage values over all batches, as in formulas (5) and (6) [40]:

$$T_{\mathrm{density}} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{density}_i \qquad (5)$$

$$T_{\mathrm{coverage}} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{coverage}_i \qquad (6)$$

where $N$ is the number of batches, and $\mathrm{density}_i$ and $\mathrm{coverage}_i$ are the values computed for the $i$-th batch.
The threshold is an important component of the proposed module, allowing us to distinguish between known and unknown attacks. When the density and coverage values of incoming data, computed against the baseline dataset, fall below the threshold, the data are considered outliers. This step enables us to identify unknown attacks that are not present in the baseline dataset and isolate them from the network. The proposed module can thus serve as a valuable instrument for strengthening network cybersecurity, and its efficacy lies in its ability to detect outliers. Furthermore, the module is scalable, as it can be adapted to work with additional datasets. The present study employs a double-index approach, using both density and coverage, for categorization in the unknown identification module. The schematic representation of the strategy architecture is depicted in Figure 7.
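The thresholding logic can be sketched as below, assuming the per-batch density and coverage values have already been computed. The thresholds are the averages over all batches, and incoming data are flagged when their metrics fall below them. Whether both metrics or only one must fall below its threshold is not stated explicitly, so the `require_both` switch, like the function names and the numeric values, is an assumption made for illustration.

```python
def mean_threshold(values):
    """Threshold defined as the average of the per-batch metric values."""
    if not values:
        raise ValueError("at least one batch value is required")
    return sum(values) / len(values)


def is_outlier(density_value, coverage_value,
               density_threshold, coverage_threshold, require_both=True):
    """Flag data whose density and coverage fall below the thresholds.

    Assumption: with require_both=True both metrics must be below their
    thresholds; with require_both=False one low metric is sufficient.
    """
    low_density = density_value < density_threshold
    low_coverage = coverage_value < coverage_threshold
    if require_both:
        return low_density and low_coverage
    return low_density or low_coverage


# Hypothetical per-batch metric values measured against the baseline dataset.
density_threshold = mean_threshold([0.5, 1.0, 1.5])
coverage_threshold = mean_threshold([0.75, 0.5, 1.0])
suspicious = is_outlier(0.25, 0.25, density_threshold, coverage_threshold)
```

Data flagged in this way would be forwarded to a technician for labeling instead of being forced into the Benign or Attack class.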
3.4. Incremental Learning
The model features an identification module capable of detecting unknown samples. When unidentified traffic is detected, telecommunications technicians are alerted to label the data for subsequent model retraining. To this end, the proposed framework employs a fine-tuning strategy that updates specific modules within the model architecture, allowing the model to acquire new knowledge by adding classes. Additionally, the learning rate is kept moderate during this training to mitigate the risk of catastrophic forgetting of previously learned information.
5. Conclusions
Existing studies primarily focus on general categories, which limits the ability of intrusion detection systems to detect unknown attacks. This study presents the novel CNN-Geo framework, a hybrid network architecture that combines features of unsupervised and supervised networks to address these challenges. Using datasets such as CICIDS2017-Wed and CICIDDoS2019, the framework effectively detects unknown cyber-attacks by employing DL techniques and geometrical metric calculation during training, alongside an incremental learning solution. Our comprehensive comparison of CNN-Geo with traditional ML algorithms and state-of-the-art approaches demonstrates its superior performance in detecting conventional and unknown DDoS attacks. The experimental results validate the proposed architecture’s effectiveness, achieving a detection rate of more than 99% for conventional attacks in the CICIDS2017-Wed dataset and reaching 99.8% on unknown attacks in the recent, previously unseen CICIDDoS2019 datasets. CNN-Geo adapts to evolving threats by relying on telecommunications technicians to label traffic and on incremental learning. The verified benefits of this research lie in the enhanced detection of unknown traffic in DDoS attacks and in the framework’s ability to incorporate new information and adapt to new attack patterns, making it a powerful and intelligent solution for intrusion detection systems.
The CNN-Geo system was initially developed to protect against L3 and L4 DDoS attacks. It is currently incapable of mitigating the latest attack techniques, such as Connection-less Lightweight Directory Access Protocol (CLDAP) or L7 DDoS attacks, as reported by Cloudflare. This limitation stems from the lack of a dataset that covers the corresponding attack patterns. The L7 attack poses a significant challenge because its traffic can appear to originate from a legitimate source. One avenue for enhancing the efficacy of the model is to integrate deep learning models with metaheuristic optimization algorithms such as Particle Swarm Optimization (PSO). Integrating deep learning and PSO can potentially optimize the model, resulting in an enhanced and flexible intrusion detection system. Future work will add supplementary modules aimed at tackling these issues. Once the efficacy of this research framework is confirmed, we expect that it can be deployed within an intranet setting as a cybersecurity solution for enterprises.