Toward Generating a New Cloud-Based Distributed Denial of Service (DDoS) Dataset and Cloud Intrusion Traffic Characterization
Abstract
1. Introduction
- Introducing BCCC-cPacket-Cloud-DDoS-2024 [9], a new cloud-based DDoS dataset.
- Design and development of a Benign User Profiler (BUP) [10] tool to generate benign background traffic.
- Design and development of a DDoS characterization model.
- Introducing the cloud-based network traffic dataset creation roadmap.
2. Literature Review
2.1. Cloud-Based DDoS Attack Detection and Traffic Analysis
2.2. Available DDoS Attack Datasets
- KDD99 1998-99 [22];
- CAIDA (2004) [23];
- CAIDA (2007) [24];
- CAIDA (2017) [25];
- CAIDA (2021) [26];
- CDX (2009) [27];
- Kyoto (2009) [28];
- ISCX2012 [29];
- ADFA (2013) [30];
- CTU-13 [31];
- UNSW-NB15 [32];
- CIC-IDS2017 [33];
- CSE-CIC-IDS2018 [34];
- CIC-DDoS2019 [35];
- SR-BH 2020 [36];
- CUPID (2022) [37].
- Imbalanced class distribution: Class imbalance in datasets often mirrors real-world conditions, where certain DDoS attacks are more prevalent than others. Addressing this limitation is vital, as it ensures that detection models are trained on data that accurately reflects the distribution of attacks in the wild.
- Limited diversity of attacks: Datasets with limited diversity fail to capture the full spectrum of DDoS attacks encountered in real-world networks. This shortfall hampers the effectiveness of detection methods by neglecting to train models on a comprehensive range of attack types and techniques.
- Outdated threat scenarios: Datasets that include outdated threat scenarios may lead to detection models that are ill-equipped to handle emerging DDoS threats. This limitation highlights the need for datasets that are continuously updated to reflect the changing landscape of DDoS attacks in real-world environments.
- Lack of realistic network traffic: Realistic network traffic patterns are essential for training accurate DDoS detection models. Datasets lacking such traffic fail to capture the intricacies of network behavior, which hinders the effectiveness of detection methods in real-world deployments.
- Absence of encrypted traffic: With an increasing prevalence of encryption in network communications, datasets lacking encrypted traffic fail to simulate real-world conditions accurately. Including encrypted traffic in datasets is crucial for effectively training detection models capable of handling encrypted DDoS attacks.
- Insufficient labeling accuracy: Inaccurate labeling of data instances undermines the reliability of datasets and, consequently, the effectiveness of detection models trained on them. Ensuring high labeling accuracy is paramount to developing robust DDoS detection mechanisms.
- Limited incorporation of user behavior: User behavior plays a significant role in DDoS attack detection, yet datasets often overlook this aspect. Incorporating user behavior data into datasets enhances the realism of training data, leading to more effective detection models in real-world scenarios.
- Incompatibility with modern protocols: Datasets that do not support modern network protocols fail to reflect the current state of network communications. Ensuring compatibility with modern protocols is essential for developing detection models that address contemporary DDoS threats.
- Limited exploration of low-rate DDoS attacks: Low-rate DDoS attacks pose unique challenges that are usually overlooked in datasets. By exploring these attack types, datasets can better prepare detection models to identify and mitigate low-rate DDoS attacks in real-world scenarios.
- Lack of realistic DDoS traffic variability: Variability in DDoS traffic patterns is essential for training robust detection models capable of adapting to evolving attack strategies. Datasets lacking such variability fail to adequately prepare detection mechanisms for real-world deployment.
- Absence of hybrid DDoS scenarios: Hybrid DDoS attacks combine multiple attack vectors, presenting complex challenges for detection and mitigation. Including hybrid attack scenarios in datasets is crucial for training detection models capable of identifying and mitigating these sophisticated threats.
- Insufficient exploration of DDoS amplification techniques: Datasets often overlook the exploration of DDoS amplification techniques, which attackers commonly use to magnify the impact of their attacks. Understanding and mitigating these techniques requires datasets that adequately represent such attack scenarios.
- Inadequate representation of application-layer DDoS attacks: Application-layer DDoS attacks target specific services or applications, posing unique challenges for detection and mitigation. Datasets must include instances of application-layer attacks to train detection models effectively.
- Non-inclusion of insider threats: Insider threats present a significant risk to network security, yet datasets often overlook this threat vector. Including instances of insider threats in datasets is essential for training detection models capable of identifying and mitigating such risks.
- Absence of multi-modal data: Multi-modal data, incorporating various data types such as network traffic, system logs, and user behavior, provides a more comprehensive view of DDoS attacks. Datasets lacking multi-modal data fail to capture the complexity of real-world attack scenarios, limiting the effectiveness of detection models.
3. Dataset Creation Roadmap
- 1. Scope Definition: Defining the scope of the target network demands a deep understanding of the specific environment under study, such as an e-commerce company network encompassing diverse user interactions and transactions. This necessitates comprehensive data collection and analysis, often complicated by the sheer volume and variety of network activities.
- 2. Infrastructure Preparation: The preparation of infrastructure, whether cloud-based or otherwise, constitutes a critical initial step in dataset creation. A robust infrastructure ensures scalability, reliability, and performance, essential for generating and analyzing network traffic data. However, configuring and maintaining the infrastructure can be arduous, requiring expertise in network administration and resource optimization to mitigate potential bottlenecks and ensure seamless operation.
- 3. Defining Users and Entities: Defining the corresponding users and entities within the network, along with their respective profiles, is essential for generating realistic traffic patterns. This involves categorizing users based on their roles, behaviors, and privileges and identifying network entities such as servers, clients, and applications. However, accurately characterizing user profiles and entity interactions poses challenges, particularly in large-scale networks with diverse user demographics and complex system architectures.
- 4. Designing Benign Traffic Generator: Designing a benign traffic generator based on the defined user profiles is crucial for simulating legitimate, realistic network activities. However, developing an effective traffic generator requires balancing realism with efficiency and scalability. Generating diverse and realistic traffic patterns while avoiding bias or over-representation of specific user behaviors requires careful consideration of traffic generation and profile definition, often necessitating iterative refinement and validation.
- 5. Studying Attack Trends: Analyzing historical attack trends is essential for understanding prevalent threats and vulnerabilities in network environments. However, identifying relevant attack vectors and trends amidst evolving cyber threats can be challenging, requiring continuous monitoring and analysis of security incidents and threat intelligence sources. Moreover, extrapolating past attack trends to anticipate future threats necessitates robust analytical frameworks and predictive modeling techniques.
- 6. Attack Selection and Implementation: Selecting suitable attack scenarios and implementing them within the network environment involve various complexities. Identifying realistic attack scenarios that align with the network’s characteristics and threat landscape requires in-depth knowledge of common attack methodologies and their potential impact on network infrastructure. Furthermore, developing and deploying attack implementations necessitates expertise in security testing methodologies and adherence to ethical considerations to prevent unintended consequences or system compromise.
- 7. Data Capturing and Analysis: Capturing raw network data in the form of PCAP files is essential for preserving the intricacies of network traffic and facilitating subsequent analysis. However, capturing and storing network traffic data at scale poses challenges regarding data volume, storage capacity, and processing overhead. Moreover, ensuring the integrity and confidentiality of captured data while adhering to privacy regulations requires robust data anonymization and encryption mechanisms.
- 8. Development of Traffic Analyzer: Designing and developing a network traffic analyzer to convert raw PCAP files into analyzed data (e.g., CSV files) is crucial for extracting meaningful insights from captured network traffic. However, developing an efficient and accurate traffic analyzer involves addressing various technical challenges, such as packet parsing, protocol decoding, and traffic classification. Additionally, ensuring the scalability and reliability of the analyzer across diverse network environments and traffic patterns requires rigorous testing and optimization.
- 9. Data Labeling and Testing: Labeling the resulting dataset and conducting comprehensive testing and analysis are essential for validating its quality and reliability. However, manually labeling network traffic data for attack and benign activities can be labor-intensive and error-prone, necessitating automated labeling techniques and human validation processes. Moreover, thorough testing and analysis of the dataset against predefined metrics and ground truth scenarios are crucial for assessing its effectiveness in simulating real-world network conditions and evaluating defense mechanisms.
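The automated labeling mentioned in step 9 can be sketched as a lookup of each flow against the attack schedule. The following is a minimal illustration, not the authors' labeling code: the function name `label_flow`, the `ATTACK_WINDOWS` structure, and the choice to match on the target's private IP are assumptions made here, and only the first two windows from the Appendix A schedule are included. It also ignores the "suspicious" class and does not separate benign flows that reach a target while an attack is running.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Two windows copied from the Appendix A schedule (hypothetical data structure);
# a full run would list every scheduled attack.
ATTACK_WINDOWS = [
    # (start, end, attack name, target private IP)
    ("2023-12-18 09:00", "2023-12-18 09:20", "TCP-SYN (Valid SYN)", "10.0.9.208"),
    ("2023-12-18 09:30", "2023-12-18 09:50", "TCP-BYPass-V1", "10.0.6.142"),
]


def label_flow(start_time: datetime, dst_ip: str) -> str:
    """Return the scheduled attack name for a flow, or 'Benign' if none matches."""
    for start, end, attack, target_ip in ATTACK_WINDOWS:
        window_start = datetime.strptime(start, FMT)
        window_end = datetime.strptime(end, FMT)
        # A flow is attributed to an attack only if it targets the scheduled
        # machine and starts inside the scheduled window.
        if dst_ip == target_ip and window_start <= start_time <= window_end:
            return attack
    return "Benign"
```

A flow starting at 09:05 towards 10.0.9.208 would receive the first attack label, while the same destination at 09:25 falls between windows and would be labeled benign.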
4. The New Dataset
4.1. Infrastructure
4.2. Attack Scenarios
4.3. Benign User Profiling
4.3.1. Available Benign User Traffic Generators
4.3.2. Proposed Benign User Traffic Generator
- Web Browsing (normal and admin user): Web browsing behavior encompasses various types of websites that reflect normal user interactions on an average weekday. Drawing on insights from previous works [48,50,51], the following website categories are integrated into the web browsing behavior:
- To ensure authenticity, the Firefox browser is used for this study. This choice is driven by its widespread usage, open-source nature, and reputation for user privacy and security features [54]. Importantly, our approach uses a real browser for interactions rather than relying on scripted requests, which distinguishes this work from other benign user profilers and contributes to the realism of the generated traffic across various online activities. Additionally, factors such as the number of open tabs, the time of day, and the time spent on each website [48] are considered for a comprehensive and accurate generation of benign network traffic.
- Emailing (normal and admin user): The Gmail web server is selected as the primary platform for simulating email-related behaviors, including sending and receiving. This decision is grounded in the widespread use of Gmail, ensuring that the benign network traffic generated accurately reflects typical email interactions. Configuring users to send emails to each other at regular intervals ensures a controlled and reliable simulation of both sending and receiving activities. Additionally, the selected approach allows attachments to be included with each email, providing a comprehensive representation of benign traffic associated with email communication.
- Systemic (normal and admin user): This category pertains to the traffic related to operating system (OS) services. The choice to focus on systemic activities stems from the need to capture network traffic associated with routine system-level operations. It is crucial to clarify that this approach does not involve generating such traffic; instead, the natural network activity of the OS on each machine is enabled for routine service updates and other essential functions. This intentional focus contributes to the dataset’s authenticity, facilitating a more comprehensive evaluation of DDoS detection methods in routine system-level operation scenarios.
- Command Line (admin user): This category explicitly addresses admin user behaviors and involves activities related to the Linux terminal. It encompasses tasks such as updating package lists, installing packages, creating, modifying, and deleting directories and files, and other administrative tasks. Simulating these command-line activities generates benign network traffic representative of the administrative functions carried out through the terminal interface.
- SSH or Remote Command Line (admin user): Like the command-line activity, the SSH or remote command-line category involves executing commands through an SSH session to a remote machine. This distinct category acknowledges the unique nature of remote command-line operations. These interactions are simulated to generate benign network traffic representative of administrative tasks conducted remotely, thereby enhancing the authenticity of the generated benign traffic across diverse scenarios.
- File Transfer, FTP server (admin user): This category is dedicated to activities related to FTP operations and focuses on benign network traffic associated with file transfers. Different file sizes and formats are simulated for both downloading from and uploading to an FTP server. This part ensures comprehensive coverage of everyday file transfer activities associated with administrative tasks.
- File Transfer, SCP (admin user): This category addresses secure file transfers between different machines using SCP. Although SCP is associated with SSH, it is treated as a separate category. Simulating sending and receiving files with various formats and sizes to and from another machine allows us to generate benign network traffic that accurately reflects the secure file transfer activities associated with machine-to-machine interactions. This approach enhances the authenticity of benign traffic generation and contributes to a comprehensive understanding of secure file transfer activities in admin user behaviors.
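The idea of triggering a behavior "at regular intervals" can be illustrated with a small schedule generator. This is a sketch only and does not reproduce the BUP tool's configuration format or API; the function name `start_times` and its parameters are hypothetical. The 25-minute interval and the 20 repetitions are inferred from the email start times listed in Appendix A (sending from 09:00, reading from 09:17).

```python
from datetime import datetime, timedelta


def start_times(first: str, interval_min: int, count: int) -> list[str]:
    """Return `count` HH:MM start times spaced `interval_min` minutes apart."""
    t0 = datetime.strptime(first, "%H:%M")
    return [
        (t0 + timedelta(minutes=i * interval_min)).strftime("%H:%M")
        for i in range(count)
    ]


# Assumed parameters, chosen to match the Appendix A email schedule.
sending = start_times("09:00", 25, 20)
reading = start_times("09:17", 25, 20)
```

Offsetting the reading schedule from the sending schedule, as the appendix does, keeps the two activities from starting at the same moment on a machine.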
4.3.3. Benign Scenarios
4.4. Data Capture
4.5. Data Labeling (CSV File Generation)
5. Proposed Traffic Characterization Model
6. Experimental Results
6.1. Feature Selection
6.2. Selecting and Implementing the Learning Algorithms
6.3. Experiment Scenarios and Performance Results
- Task 1: It involves classifying data into three categories: benign, suspicious, and attack.
- Task 2: It focuses on identifying specific attack activities within the dataset.
- Task 3: It concentrates on the identification of different benign activities.
- Task 4: It involves identifying both suspicious and benign activities.
- Task 5: It extends the identification challenge to both suspicious and attack activities.
- Task 6: It entails identifying attack activities and the benign label.
- Task 7: The final task encompasses the broadest identification challenge, requiring the model to classify all activities in the dataset.
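One possible reading of how the seven tasks select rows and targets is sketched below. The field names `label` (benign, suspicious, or attack) and `activity` (the specific scenario name) are hypothetical and not taken from the dataset's CSV schema, and the row-selection rules are an interpretation of the task descriptions above rather than the authors' experimental code.

```python
from typing import Optional

# Which coarse labels take part in tasks 2-5 (assumed interpretation).
INCLUDED = {
    2: {"attack"},
    3: {"benign"},
    4: {"suspicious", "benign"},
    5: {"suspicious", "attack"},
}


def task_target(task: int, label: str, activity: str) -> Optional[str]:
    """Return the class to predict for a row under `task`, or None if excluded."""
    if task == 1:
        return label            # three-way: benign / suspicious / attack
    if task == 7:
        return activity         # every activity in the dataset
    if task == 6:
        # Attack activities keep their names; benign rows share one label.
        if label == "attack":
            return activity
        return "benign" if label == "benign" else None
    return activity if label in INCLUDED[task] else None
```

Under this reading, a benign email flow is excluded from Task 2, collapsed to the single label "benign" in Task 6, and keeps its activity name in Tasks 3, 4, and 7.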
7. Analysis and Discussion
7.1. Feature Selection Analysis
7.1.1. Selected Features Analysis
- Header-related features: Examining header-related features across three feature selection algorithms unveils a consistent trend where the top 10 selected features consistently pertain to header bytes. This motivates a detailed analysis of header bytes, examining specific scenarios and characteristics.
  The TCP header values exhibit limited patterns for each network flow in benign scenarios. For instance, in a benign context, a standardized handshake procedure occurs at the beginning of each flow, resulting in uniform header values. Any deviations or anomalies in these header-related patterns, particularly those at the onset of a flow, become easily detectable. Such anomalies include DDoS TCP handshake, DDoS TCP SYN (various TCP SYN scenarios), and DDoS TCP SYN-ACK, where attackers aim to exhaust system resources by initiating and keeping open connections to prevent benign connections from being established. Notably, features like the handshake state and flow duration offer valuable insights into the underlying behavior and nature of the flow.
  The selected features underscore the significance of both the header and handshake categories in effectively distinguishing between DDoS attacks and benign data. However, these features alone may not suffice in complex attack scenarios that employ a regular handshake; additional features become necessary in such cases. The prominence of header-related features arises from the fact that, in general DDoS scenarios, attackers often employ predefined packets or requests, increasing the likelihood of similar header-related options. This similarity simplifies the differentiation between DDoS and benign data. For instance, a SYN flood might generate numerous connection requests to a destination port while changing only the source port value in the TCP header.
  Furthermore, as detailed in Section 4.2, a significant proportion of DDoS attacks in this dataset manipulate header values. Consequently, features related to the TCP header emerge as the most informative. Notably, among the top 40 selected features, approximately 75% are directly derived from header values. These feature categories include header bytes, init win bytes, flag percentage, and handshake-related metrics. In contrast, categories such as delta time, IAT, and packet rate, which are not directly linked to the TCP header, contribute to a comprehensive approach to detecting TCP-based DDoS attacks.
  This underscores the importance of considering all aspects of the TCP header for effective detection. For example, a TCP-ACK flood attack inundates the target with a high volume of ACK requests, disrupting the flag distribution within header bytes compared to regular traffic.
- Flag-related features: DDoS attacks often manipulate TCP flags, such as SYN, RST, and PSH, to disrupt standard communication patterns. Within this category, anomalies in the distribution of flags can signal specific attack types. For instance, an abnormal SYN–ACK ratio may indicate a TCP-SYN flood attack, while a dynamic flag distribution strategy could mimic standard traffic patterns, challenging detection mechanisms. This highlights the importance of analyzing flag percentages for a nuanced understanding of attack tactics.
- Delta-time-related features: Attackers disrupt regular communication by introducing variations in the time intervals between successive packets. Unusual values in “delta time” features can indicate irregular packet transmission patterns. For example, pulsing DDoS attacks involve rhythmic variations in inter-packet delta times, making it challenging for defenders to predict attack patterns. Conversely, specific DDoS attacks use consistently low delta time between packets to maximize traffic volume and increase the likelihood of successful disruption. Identifying and analyzing these delta time patterns are crucial to robust detection mechanisms.
- Inter-arrival time (IAT)-related features: The inter-arrival time reflects variations in packet transmission intervals. Bursty DDoS traffic exhibits irregular spikes in packet transmission, causing variations in inter-arrival times. Recognizing and distinguishing these bursts is critical for identifying potential attacks amidst benign traffic. Moreover, sophisticated DDoS attacks may involve coordinated sequences with specific IAT patterns, such as rapid bursts followed by brief periods of inactivity. Understanding and detecting these coordinated sequences enhances the accuracy of identifying intricate attack strategies.
- Rate-related features: DDoS attacks typically exhibit abnormally high packet rates to cause network congestion and service disruption. Elevated values in the “packet rate” feature can signal the presence of a DDoS attack, especially when compared to the baseline packet rates observed during regular network activity. A packet rate that changes during an attack can indicate an adaptive strategy, where attackers dynamically adjust the rate to adapt to changing network conditions or evade static detection thresholds. Conversely, some DDoS attacks intentionally maintain a low and stealthy packet rate to avoid detection. Identifying and analyzing deviations from the expected baseline requires a nuanced analysis of packet rate dynamics.
- All together: The consistent presence of anomalies across multiple feature categories collectively enhances attack detection accuracy. Recognizing patterns across these diverse features contributes to a more robust detection mechanism. Examining the correlations between different feature categories reveals more comprehensive attack signatures. For instance, a high packet rate combined with abnormal flag percentages may indicate a sophisticated DDoS strategy. Understanding these interdependencies allows for a deeper understanding of the evolving nature of attacks and improves detection accuracy.
  Moreover, attackers may dynamically adjust their strategies throughout an attack. Continuous monitoring of features enables the identification of evolving attack patterns. A nuanced understanding of the dynamic nature of attacks is becoming imperative for developing adaptive detection mechanisms capable of responding to emerging threats in real time. Taking diverse, informative feature categories into consideration provides a holistic and detailed perspective on network traffic. This inclusive approach supports the development of robust and adaptive detection models adept at identifying various DDoS attack tactics. By comprehensively examining correlations and evolving patterns, these models can stay ahead of attackers and respond dynamically to the intricate landscape of cyber threats.
- Online detection strategy: In an online detection system, where computing a single feature value for a potentially high number of open flows can be resource-intensive and time-consuming, it is crucial to adopt an approach that optimizes both time and resources. The strategy involves a structured, multi-layered framework, where features are computed in an order that facilitates early detection with minimal computational cost.
  The first layer of this framework prioritizes features that are easy to calculate and contribute to early detection. Notably, features associated with the handshake scenario emerge as critical components of this initial layer. The rationale behind this prioritization is twofold. Firstly, in the initial stages of a network flow, decisions about its malicious nature cannot be accurately made by calculating features such as header bytes mean or packet rate. Secondly, by focusing on features related to the handshake process, the system can efficiently identify and halt potentially malicious incoming traffic lacking a valid handshake process. This early intervention conserves system resources by avoiding the calculation of additional feature values for such connections. Furthermore, this proactive approach ensures that more resources are available for benign users.
  Another set of informative features for early detection includes “init win bytes”. Attackers may manipulate the initial window size in TCP packets during DDoS attacks to impact the target’s resource utilization. Anomalies or irregularities in the values of the “init win bytes” feature serve as indicators of potential attempts to exploit vulnerabilities, overwhelm network resources, or establish malicious connections. Beyond the early detection phase, attention shifts to flows characterized by a normal handshake process, usual flags, and regular features. A normal handshake process denotes that all TCP steps have been executed within a reasonable timeframe. The system monitors and calculates the other most informative features (other selected features) for such flows.
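The layered strategy and the feature categories discussed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the NTLFlowLyzer implementation: the `Packet` class, the function names, and the verdict strings are hypothetical, packet direction is ignored, and the handshake check only looks at the flags of the first three packets of a flow.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    timestamp: float   # seconds since the start of capture
    flags: frozenset   # TCP flags set on the packet, e.g. frozenset({"SYN"})


def handshake_completed(packets):
    """Cheap layer-1 check: the first three packets are SYN, SYN-ACK, ACK."""
    expected = [{"SYN"}, {"SYN", "ACK"}, {"ACK"}]
    return [set(p.flags) for p in packets[:3]] == expected


def flag_percentage(packets, flag):
    """Share of packets in the flow carrying `flag`, in percent."""
    return 100.0 * sum(flag in p.flags for p in packets) / len(packets)


def mean_iat(packets):
    """Mean inter-arrival time between successive packets, in seconds."""
    gaps = [b.timestamp - a.timestamp for a, b in zip(packets, packets[1:])]
    return sum(gaps) / len(gaps) if gaps else 0.0


def packet_rate(packets):
    """Packets per second over the observed flow duration."""
    duration = packets[-1].timestamp - packets[0].timestamp
    return len(packets) / duration if duration > 0 else 0.0


def triage(packets):
    """Return (verdict, features); costlier features only for layer-2 flows."""
    if len(packets) < 3:
        return "pending", {}              # too early to judge the handshake
    if not handshake_completed(packets):
        return "suspect_handshake", {}    # layer 1: stop before costly features
    features = {                          # layer 2: remaining selected features
        "syn_pct": flag_percentage(packets, "SYN"),
        "mean_iat": mean_iat(packets),
        "packet_rate": packet_rate(packets),
    }
    return "monitor", features


normal_flow = [
    Packet(0.0, frozenset({"SYN"})),
    Packet(0.5, frozenset({"SYN", "ACK"})),
    Packet(1.0, frozenset({"ACK"})),
    Packet(2.0, frozenset({"ACK", "PSH"})),
]
syn_flood = [Packet(0.001 * i, frozenset({"SYN"})) for i in range(3)]
```

Here `triage(syn_flood)` stops at the first layer without computing any rate or timing features, whereas `triage(normal_flow)` proceeds to the second layer and returns the feature dictionary for further classification.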
7.1.2. Not-Selected Features Analysis
7.2. Performance Analysis
7.3. Addressing Previous Shortcomings
- Imbalanced class distribution: The dataset achieves a more balanced distribution, with a ratio of 60% benign to 40% non-benign data, including 8% labeled as suspicious. This balance ensures a more representative dataset for training and evaluation.
- Limited diversity of attacks: This work incorporates a wide range of DDoS attacks, totaling 17 different attack types, surpassing the diversity found in existing datasets. This ensures comprehensive coverage of various attack scenarios and enhances the fidelity of the new dataset.
- Outdated threat scenarios: A thorough analysis was conducted, leveraging reports from Microsoft [56,57,58] and Cloudflare [59] to identify and prioritize recent attack trends. Additionally, utilizing a third-party service for attack execution ensures the incorporation of up-to-date attack methodologies, addressing concerns regarding outdated threat scenarios.
- Lack of realistic network traffic: The proposed approach includes the development of a Benign User Traffic generator capable of producing realistic benign user data. This contrasts with previous approaches that relied on simulated or less accurate benign data, thereby enhancing the realism of the new dataset.
- Absence of encrypted traffic: The benign traffic generator is configured to generate diverse traffic, including encrypted traffic, mirroring real-world network scenarios more accurately. This ensures that the dataset encompasses the complexities of encrypted communication, which are often overlooked in previous datasets.
- Insufficient labeling accuracy: Benign and attack scenarios were meticulously scheduled to ensure precise labeling, supported by experimental results validating the accuracy of the labeling process across different algorithms. This meticulous approach enhances the reliability and trustworthiness of the new dataset.
- Limited incorporation of user behavior: The analysis of previous works in user network behavior informs the configuration of diverse user profiles within the benign traffic generator. This includes various behaviors such as web browsing, file transfer, and email checking, ensuring a more comprehensive representation of user activity within the dataset.
- Incompatibility with modern protocols: The generated dataset mirrors realistic network traffic, encompassing a wide array of protocols, including modern ones like QUIC, DNS, and HTTPS. This ensures compatibility with modern network environments, addressing concerns about protocol compatibility present in previous datasets.
- Limited exploration of low-rate DDoS attacks: The new dataset includes diverse attack strategies, ranging from low-rate to high-rate attacks, as evidenced by Table 3. This comprehensive coverage ensures that the dataset adequately represents the variability of DDoS attack intensities encountered in real-world scenarios.
- Lack of realistic DDoS traffic variability: Utilizing a third-party service for DDoS attack execution ensures the incorporation of realistic attack data with diverse strategies. This contrasts with previous approaches that may have utilized packet generator tools without specific attack strategies, enhancing the variability and realism of the new dataset.
7.4. Comparison with Previous Datasets
8. Conclusions and Future Works
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
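Date | Time | Attack | Target | Target IP | Capturer |
---|---|---|---|---|---|
Monday, 18 December 2023 | 9:00–9:20 | (1) TCP-SYN (Valid SYN) | Windows-machine-1 | 3.96.128.96 (10.0.9.208) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 9:30–9:50 | (2) TCP-BYPass-V1 | Linux-admin-1 | 99.79.45.168 (10.0.6.142) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 10:00–10:20 | (3) Killall-v2 | Linux-webserver | 15.222.45.224 (10.0.4.57) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 10:32–10:52 | (4) TCP-IGMP | Windows-machine-2 | 35.182.194.19 (10.0.4.132) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 11:00–11:20 | (5) TCP-SYN | Linux-admin-1 | 99.79.45.168 (10.0.6.142) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 11:30–11:50 | (6) Killer-TCP | Windows-machine-4 | 35.183.15.52 (10.0.3.52) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 13:00–13:20 | (7) TCP-Control | Linux-webserver | 15.222.45.224 (10.0.4.57) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 13:30–13:50 | (8) TCP-MIX | Linux-admin-1 | 99.79.45.168 (10.0.6.142) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 14:00–14:20 | (9) TCP-SYN (syn flags only) | Windows-machine-2 | 35.182.194.19 (10.0.4.132) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 14:30–14:50 | (10) TCP-ACK | Windows-machine-3 | 3.99.186.200 (10.0.11.84) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 15:00–15:20 | (11) TCP-SYN-ACK | Linux-webserver | 15.222.45.224 (10.0.4.57) | 35.183.206.0 (10.0.17.180) |
Monday, 18 December 2023 | 15:30–15:50 | (12) TCP-ACK-PSH | Windows-machine-4 | 35.183.15.52 (10.0.3.52) | 35.183.206.0 (10.0.17.180) |
Tuesday, 19 December 2023 | 9:00–9:20 | (13) TCP-RST-ACK | Linux-webserver | 15.222.45.224 (10.0.4.57) | 3.99.150.239 (10.0.17.180) |
Tuesday, 19 December 2023 | 9:30–9:50 | (14) TCP-SYN-TFO | Windows-machine-3 | 3.99.186.200 (10.0.11.84) | 3.99.150.239 (10.0.17.180) |
Tuesday, 19 December 2023 | 10:00–10:20 | (15) TCP-SYN-TIME | Linux-admin-1 | 99.79.45.168 (10.0.6.142) | 3.99.150.239 (10.0.17.180) |
Tuesday, 19 December 2023 | 10:50–11:10 | (16) TCP-OSYN | Windows-machine-1 | 3.96.128.96 (10.0.9.208) | 3.99.150.239 (10.0.17.180) |
Tuesday, 19 December 2023 | 11:20–11:40 | (17) TCP-OSYNP | Linux-webserver | 15.222.45.224 (10.0.4.57) | 3.99.150.239 (10.0.17.180) |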
Start Times | Behavior | Detail |
---|---|---|
09:00, 09:25, 09:50, 10:15, 10:40, 11:05, 11:30, 11:55, 12:20, 12:45, 13:10, 13:35, 14:00, 14:25, 14:50, 15:15, 15:40, 16:05, 16:30, 16:55 | Email (Sending) | All emails contain attachments as well. |
09:17, 09:42, 10:07, 10:32, 10:57, 11:22, 11:47, 12:12, 12:37, 13:02, 13:27, 13:52, 14:17, 14:42, 15:07, 15:32, 15:57, 16:22, 16:47, 17:12 | Email (Reading) | All emails contain attachments as well. |
09:40 | Web Browsing (Music Streaming) | It is a live stream. |
12:30 | Web Browsing (Video Watching) | YouTube |
09:05 | Web Browsing (Video Watching) | Continued with new video after finishing each one. |
09:40, 13:40 | Web Browsing (News Checking) | CBC.ca |
10:00, 10:40, 14:20, 14:50, 15:20, 11:30, 12:03, 12:36, 15:25, 15:35, 15:45 | Web Browsing (Downloading) | File sizes: 5 GB, 228 MB, 4 MB, 4 MB, 4 MB, 1.7 KB, 1.7 KB, 1.7 KB, 100 MB, 100 MB, 100 MB |
11:43 | Web Browsing (Food) | UberEats |
10:02, 10:22, 10:25, 10:45 | Web Browsing (Shopping) | Amazon |
13:30, 13:40 | Web Browsing (Shopping) | Bestbuy |
15:13 | Web Browsing (Social Media) | |
16:00 | Web Browsing (Taxi) | Uber |
References
- Aljuhani, A. Machine learning approaches for combating distributed denial of service attacks in modern networking environments. IEEE Access 2021, 9, 42236–42264. [Google Scholar] [CrossRef]
- Bawany, N.Z.; Shamsi, J.A.; Salah, K. DDoS attack detection and mitigation using SDN: Methods, practices, and solutions. Arab. J. Sci. Eng. 2017, 42, 425–441. [Google Scholar] [CrossRef]
- Agarwal, A.; Khari, M.; Singh, R. Detection of DDOS attack using deep learning model in cloud storage application. In Wireless Personal Communications; Springer: Berlin, Germany, 2021; Volume 127, pp. 1–21. [Google Scholar]
- Aamir, M.; Zaidi, M.A. A survey on DDoS attack and defense strategies: From traditional schemes to current techniques. Interdiscip. Inf. Sci. 2013, 19, 173–200. [Google Scholar] [CrossRef]
- Singh, J.; Behal, S. Detection and mitigation of DDoS attacks in SDN: A comprehensive review, research challenges and future directions. Comput. Sci. Rev. 2020, 37, 100279. [Google Scholar] [CrossRef]
- Zeadally, S.; Adi, E.; Baig, Z.; Khan, I.A. Harnessing artificial intelligence capabilities to improve cybersecurity. IEEE Access 2020, 8, 23817–23837. [Google Scholar] [CrossRef]
- Wu, H.; Han, H.; Wang, X.; Sun, S. Research on artificial intelligence enhancing internet of things security: A survey. IEEE Access 2020, 8, 153826–153848. [Google Scholar] [CrossRef]
- Thakkar, A.; Lohiya, R. A review of the advancement in intrusion detection datasets. Procedia Comput. Sci. 2020, 167, 636–645. [Google Scholar] [CrossRef]
- BCCC-Dataset. BCCC CPacket Cloud-based DDoS 2024. Behaviour-Centric Cybersecurity Center (BCCC). Available online: https://www.yorku.ca/research/bccc/ucs-technical/cybersecurity-datasets-cds (accessed on 8 March 2024).
- BCCC-BUP. Benign User Profiler (BUP). Behaviour-Centric Cybersecurity Center (BCCC). Available online: https://github.com/ahlashkari/Benign-User-Profiler-BUP (accessed on 8 March 2024).
- BCCC-NTLFlowLyzer. Network and Transport Layer Flow Analyzer (NTLFlowLyzer), Retrieved 10 February 2024. Behaviour-Centric Cybersecurity Center (BCCC). Available online: https://github.com/ahlashkari/NTLFlowLyzer (accessed on 8 September 2023).
- Tabrizchi, H.; Kuchaki Rafsanjani, M. A survey on security challenges in cloud computing: Issues, threats, and solutions. J. Supercomput. 2020, 76, 9493–9532. [Google Scholar] [CrossRef]
- Saxena, R.; Dey, S. DDoS attack prevention using collaborative approach for cloud computing. Clust. Comput. 2020, 23, 1329–1344. [Google Scholar] [CrossRef]
- Zekri, M.; El Kafhali, S.; Aboutabit, N.; Saadi, Y. DDoS attack detection using machine learning techniques in cloud computing environments. In Proceedings of the 2017 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech), Rabat, Morocco, 24–26 October 2017. [Google Scholar]
- Kautish, S.; Reyana, A.; Vidyarthi, A. SDMTA: Attack detection and mitigation mechanism for DDoS vulnerabilities in hybrid cloud environment. IEEE Trans. Ind. Inform. 2022, 18, 6455–6463. [Google Scholar] [CrossRef]
- Wani, A.R.; Rana, Q.; Saxena, U.; Pandey, N. Analysis and detection of DDoS attacks on cloud computing environment using machine learning techniques. In Proceedings of the 2019 Amity International Conference on Artificial Intelligence (AICAI), Dubai, United Arab Emirates, 4–6 February 2019; pp. 870–875. [Google Scholar]
- Choi, J.; Choi, C.; Ko, B.; Kim, P. A method of DDoS attack detection using HTTP packet pattern and rule engine in the cloud computing environment. Soft Comput. 2014, 18, 1697–1703. [Google Scholar] [CrossRef]
- Mugunthan, S. Soft computing based autonomous low rate DDOS attack detection and security for cloud computing. J. Soft Comput. Paradig. 2019, 1, 80–90. [Google Scholar]
- Virupakshar, K.B.; Asundi, M.; Channal, K.; Shettar, P.; Patil, S.; Narayan, D. Distributed denial of service (DDoS) attacks detection system for OpenStack-based private cloud. Procedia Comput. Sci. 2020, 167, 2297–2307. [Google Scholar] [CrossRef]
- Jindal, R.; Anwar, A. Emerging Trends of Recently Published Datasets for Intrusion Detection Systems (IDS): A Survey. arXiv 2021, arXiv:2110.00773. [Google Scholar]
- Chang, V.; Golightly, L.; Modesti, P.; Xu, Q.A.; Doan, L.M.T.; Hall, K.; Boddu, S.; Kobusińska, A. A survey on intrusion detection systems for fog and cloud computing. Future Internet 2022, 14, 89. [Google Scholar] [CrossRef]
- Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp. 1–6. [Google Scholar]
- Koga, R. Spoofer Data. Available online: https://catalog.caida.org/dataset/spoofer_data (accessed on 8 September 2023).
- DDoS 2007 Attack. Available online: https://catalog.caida.org/dataset/ddos_attack_2007 (accessed on 8 September 2023).
- CAIDA Randomly and Uniformly Spoofed Denial-of-Service Attack Metadata. Available online: https://catalog.caida.org/dataset/2017imcrsdostargets (accessed on 8 September 2023).
- Aggregated Daily RSDoS Attack Metadata (Corsaro 2). Available online: https://catalog.caida.org/dataset/telescope_corsaro2_daily_rsdos (accessed on 8 September 2023).
- Sangster, B.; O’Connor, T.; Cook, T.; Fanelli, R.; Dean, E.; Morrell, C.; Conti, G.J. Toward Instrumenting Network Warfare Competitions to Generate Labeled Datasets. In Proceedings of the 2nd Conference on Cyber Security Experimentation and Test (CSET), Montreal, QC, Canada, 10 August 2009. [Google Scholar]
- Song, J.; Takakura, H.; Okabe, Y.; Eto, M.; Inoue, D.; Nakao, K. Statistical analysis of honeypot data and building of Kyoto 2006+ dataset for NIDS evaluation. In Proceedings of the First Workshop on Building Analysis Datasets and Gathering Experience Returns for Security, Salzburg, Austria, 10 April 2011; pp. 29–36. [Google Scholar]
- Shiravi, A.; Shiravi, H.; Tavallaee, M.; Ghorbani, A.A. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput. Secur. 2012, 31, 357–374. [Google Scholar] [CrossRef]
- Creech, G.; Hu, J. Generation of a new IDS test dataset: Time to retire the KDD collection. In Proceedings of the 2013 IEEE Wireless Communications and Networking Conference (WCNC), Shanghai, China, 7–10 April 2013; pp. 4487–4492. [Google Scholar]
- Garcia, S.; Grill, M.; Stiborek, J.; Zunino, A. An empirical comparison of botnet detection methods. Comput. Secur. 2014, 45, 100–123. [Google Scholar] [CrossRef]
- Moustafa, N.; Slay, J. UNSW-NB15: A comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In Proceedings of the 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, Australia; 2015; pp. 1–6. [Google Scholar]
- Lashkari, A.H.; Draper-Gil, G.; Mamun, M.S.I.; Ghorbani, A.A. Characterization of tor traffic using time-based features. In Proceedings of the 3rd International Conference on Information Systems Security and Privacy (ICISSP), Porto, Portugal, 19–21 February 2017; pp. 253–262. [Google Scholar]
- Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP 2018, 1, 108–116. [Google Scholar]
- Sharafaldin, I.; Lashkari, A.H.; Hakak, S.; Ghorbani, A.A. Developing realistic distributed denial of service (DDoS) attack dataset and taxonomy. In Proceedings of the 2019 International Carnahan Conference on Security Technology (ICCST), Chennai, India, 1–3 October 2019; pp. 1–8. [Google Scholar]
- Riera, T.S.; Higuera, J.R.B.; Higuera, J.B.; Herraiz, J.J.M.; Montalvo, J.A.S. A new multi-label dataset for Web attacks CAPEC classification using machine learning techniques. Comput. Secur. 2022, 120, 102788. [Google Scholar] [CrossRef]
- Lawrence, H.; Ezeobi, U.; Tauil, O.; Nosal, J.; Redwood, O.; Zhuang, Y.; Bloom, G. CUPID: A labeled dataset with Pentesting for evaluation of network intrusion detection. J. Syst. Archit. 2022, 129, 102621. [Google Scholar] [CrossRef]
- Alhijawi, B.; Almajali, S.; Elgala, H.; Salameh, H.B.; Ayyash, M. A survey on DoS/DDoS mitigation techniques in SDNs: Classification, comparison, solutions, testing tools and datasets. Comput. Electr. Eng. 2022, 99, 107706. [Google Scholar] [CrossRef]
- Packeth Sourceforge. Available online: http://packeth.sourceforge.net (accessed on 8 September 2023).
- Iperf GitHub Page. Available online: https://github.com/esnet/iperf (accessed on 8 September 2023).
- Distributed Internet Traffic Generator. Available online: http://traffic.comics.unina.it/software/ITG/ (accessed on 8 September 2023).
- Ostinato. Available online: https://ostinato.org/ (accessed on 8 September 2023).
- Solarwinds Traffic Generator Wan Killer. Available online: https://www.solarwinds.com/engineers-toolset/use-cases/traffic-generator-wan-killer (accessed on 8 September 2023).
- Packet Sender. Available online: https://packetsender.com/ (accessed on 8 September 2023).
- NMap. Available online: https://nmap.org/nping (accessed on 8 September 2023).
- Net Scan Tools. Available online: https://www.netscantools.com/ (accessed on 8 September 2023).
- Trex-tgn CISCO. Available online: https://trex-tgn.cisco.com (accessed on 8 September 2023).
- Duarte Torres, S.; Weber, I.; Hiemstra, D. Analysis of search and browsing behavior of young users on the web. ACM Trans. Web (TWEB) 2014, 8, 1–54. [Google Scholar] [CrossRef]
- Kumar, R.; Tomkins, A. A characterization of online browsing behavior. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 561–570. [Google Scholar]
- Wu, I.C.; Yu, H.K. Sequential analysis and clustering to investigate users’ online shopping behaviors based on need-states. Inf. Process. Manag. 2020, 57, 102323. [Google Scholar] [CrossRef]
- Möller, J.; van de Velde, R.N.; Merten, L.; Puschmann, C. Explaining online news engagement based on browsing behavior: Creatures of habit? Soc. Sci. Comput. Rev. 2020, 38, 616–632. [Google Scholar] [CrossRef]
- Bakhshi, T.; Ghita, B. User traffic profiling. In Proceedings of the 2015 Internet Technologies and Applications (ITA), Wrexham, UK, 8–11 September 2015; pp. 91–97. [Google Scholar]
- Varet, A.; Larrieu, N. Realistic network traffic profile generation: Theory and practice. Comput. Inf. Sci. 2014, 7, 1. [Google Scholar] [CrossRef]
- Nelson, R.; Shukla, A.; Smith, C. Web Browser Forensics in Google Chrome, Mozilla Firefox, and the Tor Browser Bundle. In Digital Forensic Education: An Experiential Learning Approach; Springer Book: Berlin, Germany, 2020; pp. 219–241. [Google Scholar]
- Aouini, Z.; Pekar, A. NFStream: A flexible network data analysis framework. Comput. Netw. 2022, 204, 108719. [Google Scholar] [CrossRef]
- Azure DDoS Protection—2021 Q1 and Q2 DDoS Attack Trends. Available online: https://azure.microsoft.com/en-us/blog/azure-ddos-protection-2021-q1-and-q2-ddos-attack-trends/ (accessed on 8 September 2023).
- Azure DDoS Protection—2021 Q3 and Q4 DDoS Attack Trends. Available online: https://azure.microsoft.com/en-us/blog/azure-ddos-protection-2021-q3-and-q4-ddos-attack-trends/ (accessed on 8 September 2023).
- 2022 in Review: DDoS Attack Trends and Insights. Available online: https://www.microsoft.com/en-us/security/blog/2023/02/21/2022-in-review-ddos-attack-trends-and-insights/ (accessed on 8 September 2023).
- Cloudflare DDoS Reports. Available online: https://radar.cloudflare.com/reports?q=DDoS (accessed on 8 September 2023).
Date | # of IP Packets | # of TCP Packets | # of UDP Packets |
---|---|---|---|
Thursday 14 December | 37,552,636 (99.75%) | 35,213,784 (93.54%) | 2,335,531 (6.20%) |
Saturday 16 December | 38,823,009 (99.81%) | 36,151,292 (92.94%) | 2,666,822 (6.86%) |
Monday 18 December | 28,302,904 (99.85%) | 24,628,272 (86.89%) | 3,671,949 (12.95%) |
Tuesday 19 December | 23,550,922 (99.89%) | 21,653,847 (91.84%) | 1,890,837 (8.02%) |
Sum of All | 128,229,471 | 117,647,195 | 10,565,139 |
Date | # of Benign Flows | # of Attack Flows | # of Suspicious Flows | Sum of All Flows |
---|---|---|---|---|
Thursday 14 December | 105,087 | 0 | 0 | 105,087 |
Saturday 16 December | 189,678 | 0 | 0 | 189,678 |
Monday 18 December | 68,444 (∼21%) | 220,276 (∼69%) | 31,810 (∼10%) | 320,530 |
Tuesday 19 December | 49,990 (∼58%) | 8193 (∼10%) | 27,296 (∼32%) | 85,479 |
Sum of All Flows | 413,199 (∼59%) | 228,469 (∼33%) | 59,106 (∼8%) | 700,774 |
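The approximate percentages in the flow table above follow directly from the per-day counts. As a quick consistency check, the sketch below recomputes them in Python. The counts are copied from the table; the variable and function names are illustrative only.

```python
# Per-day flow counts from the table: (benign, attack, suspicious).
flows = {
    "Thursday 14 December": (105_087, 0, 0),
    "Saturday 16 December": (189_678, 0, 0),
    "Monday 18 December": (68_444, 220_276, 31_810),
    "Tuesday 19 December": (49_990, 8_193, 27_296),
}


def shares(counts):
    """Return the total and the rounded percentage share of each class."""
    total = sum(counts)
    return total, tuple(round(100 * c / total) for c in counts)


# Column-wise sums over all four capture days.
overall_counts = tuple(sum(day[i] for day in flows.values()) for i in range(3))
overall_total, overall_pct = shares(overall_counts)
monday_total, monday_pct = shares(flows["Monday 18 December"])
tuesday_total, tuesday_pct = shares(flows["Tuesday 19 December"])
```

The recomputed totals and rounded shares agree with the values reported in the table.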
ID | Activity | Thursday | Saturday | Monday | Tuesday | Sum |
---|---|---|---|---|---|---|
1 | Benign | 85,853 | 159,007 | 28,746 | 28,678 | 302,284 |
2 | Benign-SSH | 1333 | 1410 | 120 | 122 | 2985 |
3 | Benign-FTP | 329 | 97 | 28 | 29 | 483 |
4 | Benign-Email-Receive | 480 | 458 | 245 | 212 | 1395 |
5 | Benign-Email-Send | 596 | 558 | 442 | 342 | 1938 |
6 | Benign-Systemic | 2814 | 14,333 | 17,590 | 13,337 | 48,074 |
7 | Benign-Web Browsing HTTP-S | 10,471 | 12,603 | 21,114 | 7084 | 51,272 |
8 | Benign-TELNET | 3211 | 1212 | 159 | 186 | 4768 |
9 | Suspicious | - | - | 31,810 | 27,296 | 59,106 |
10 | Attack-TCP-Valid-SYN | - | - | 8043 | - | - |
11 | Attack-TCP-BYPass-V1 | - | - | 138,368 | - | - |
12 | Attack-Killall-v2 | - | - | 6033 | - | - |
13 | Attack-TCP-IGMP | - | - | 7251 | - | - |
14 | Attack-TCP-SYN | - | - | 6953 | - | - |
15 | Attack-Killer-TCP | - | - | 6254 | - | - |
16 | Attack-TCP-Control | - | - | 5744 | - | - |
17 | Attack-TCP-Flag-MIX | - | - | 7416 | - | - |
18 | Attack-TCP-Flag-SYN | - | - | 7845 | - | - |
19 | Attack-TCP-Flag-ACK | - | - | 10,683 | - | - |
20 | Attack-TCP-Flag-SYN-ACK | - | - | 8204 | - | - |
21 | Attack-TCP-Flag-ACK-PSH | - | - | 7482 | - | - |
22 | Attack-TCP-Flag-RST-ACK | - | - | - | 1445 | - |
23 | Attack-TCP-Flag-SYN-TFO | - | - | - | 3631 | - |
24 | Attack-TCP-Flag-SYN-TIME | - | - | - | 1360 | - |
25 | Attack-TCP-Flag-OSYN | - | - | - | 867 | - |
26 | Attack-TCP-Flag-OSYNP | - | - | - | 890 | - |
Method | 1st 10 Features | 2nd 10 Features | 3rd 10 Features | 4th 10 Features |
---|---|---|---|---|
Analysis of Variance (ANOVA) | max hdr byte, min hdr byte, mean hdr byte, med hdr byte, mode hdr byte, F max hdr byte, F min hdr byte, F mean hdr byte, F std hdr byte, F med hdr byte | F cov hdr byte, F mode hdr byte, F var hdr byte, B std hdr byte, B cov hdr byte, B var hdr byte, F init win byte, B init win byte, rst flag counts, B rst flag counts | psh flag % in total, rst flag % in total, F psh flag % in total, F syn flag % in total, B psh flag % in total, F psh flag % in F pkts, B psh flag % in B pkts, B rst flag % in B pkts, B pkts IAT mean, B pkts IAT max | B pkts IAT min, B pkts IAT total, B pkts IAT med, B pkts IAT mode, handshake duration, handshake state, mean B pkts DT, med B pkts DT, skew pkts DL, mode F pkts DL |
Information Gain | duration, total hdr byte, max hdr byte, min hdr byte, mean hdr byte, med hdr byte, mode hdr byte, F total hdr byte, F max hdr byte, F min hdr byte | F mean hdr byte, F med hdr byte, F mode hdr byte, F init win byte, pkts rate, B pkts rate, F pkts rate, syn flag % in total, ack flag % in total, F syn flag % in total | pkts IAT mean, packet IAT max, packet IAT min, packet IAT total, pkts IAT med, pkts IAT mode, F pkts IAT mean, F pkts IAT max, F pkts IAT min, F pkts IAT total | F pkts IAT med, F pkts IAT mode, B pkts IAT total, handshake duration, mean pkts DT, var pkts DT, std pkts DT, med pkts DT, med B pkts DT, med F pkts DT |
Extra Tree | total hdr byte, max hdr byte, min hdr byte, mean hdr byte, med hdr byte, mode hdr byte, F total hdr byte, F max hdr byte, F min hdr byte, F mean hdr byte | F med hdr byte, F mode hdr byte, F init win byte, B init win byte, pkts rate, B pkts rate, F pkts rate, rst flag counts, B rst flag counts, syn flag % in total | rst flag % in total, F syn flag % in total, B rst flag % in total, F psh flag % in F pkts, F syn flag % in F pkts, B psh flag % in B pkts, pkts IAT mean, packet IAT max, packet IAT min, packet IAT total | pkts IAT med, pkts IAT mode, F pkts IAT mean, F pkts IAT max, F pkts IAT min, F pkts IAT total, F pkts IAT med, F pkts IAT mode, B pkts IAT total, B pkts IAT mode |
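One way to read the feature-selection table above is to look for features that all three methods place in their first group of ten. The sketch below computes that overlap in Python. The feature names are copied from the first column of ten for each method; treating those columns as directly comparable is an assumption made for illustration and is not a step described by the authors.

```python
# First ten features listed for each selection method in the table.
anova = [
    "max hdr byte", "min hdr byte", "mean hdr byte", "med hdr byte",
    "mode hdr byte", "F max hdr byte", "F min hdr byte", "F mean hdr byte",
    "F std hdr byte", "F med hdr byte",
]
info_gain = [
    "duration", "total hdr byte", "max hdr byte", "min hdr byte",
    "mean hdr byte", "med hdr byte", "mode hdr byte", "F total hdr byte",
    "F max hdr byte", "F min hdr byte",
]
extra_tree = [
    "total hdr byte", "max hdr byte", "min hdr byte", "mean hdr byte",
    "med hdr byte", "mode hdr byte", "F total hdr byte", "F max hdr byte",
    "F min hdr byte", "F mean hdr byte",
]

# Features selected by all three methods within their first ten.
common = set(anova) & set(info_gain) & set(extra_tree)
for name in sorted(common):
    print(name)
```

Under this reading, the header-byte statistics (max, min, mean, median, and mode, plus the forward max and min) are shared by all three methods.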
Task | Model | Precision | Recall | F1-Score | Model | Precision | Recall | F1-Score |
---|---|---|---|---|---|---|---|---|
Task 1 | NB | 0.71 | 0.46 | 0.48 | Logistic Reg. | 0.62 | 0.62 | 0.62 |
Task 1 | SVM | 0.56 | 0.60 | 0.58 | KNN | 0.91 | 0.91 | 0.91 |
Task 1 | RF | 0.94 | 0.94 | 0.94 | Decision Tree | 0.84 | 0.85 | 0.84 |
Task 1 | XGBoost | 0.94 | 0.94 | 0.94 | Extra Tree | 0.94 | 0.94 | 0.94 |
Task 1 | Proposed | 0.94 | 0.94 | 0.94 | Bagging | 0.93 | 0.94 | 0.93 |
Task 2 | NB | 0.55 | 0.58 | 0.56 | Logistic Reg. | 0.58 | 0.59 | 0.58 |
Task 2 | SVM | 0.55 | 0.58 | 0.56 | KNN | 0.75 | 0.70 | 0.72 |
Task 2 | RF | 0.79 | 0.72 | 0.75 | Decision Tree | 0.71 | 0.69 | 0.70 |
Task 2 | XGBoost | 0.85 | 0.71 | 0.76 | Extra Tree | 0.75 | 0.75 | 0.75 |
Task 2 | Proposed | 0.79 | 0.77 | 0.78 | Bagging | 0.80 | 0.72 | 0.75 |
Task 3 | NB | 0.76 | 0.13 | 0.10 | Logistic Reg. | 0.58 | 0.49 | 0.53 |
Task 3 | SVM | 0.50 | 0.45 | 0.48 | KNN | 0.89 | 0.88 | 0.89 |
Task 3 | RF | 0.96 | 0.94 | 0.95 | Decision Tree | 0.85 | 0.82 | 0.83 |
Task 3 | XGBoost | 0.96 | 0.95 | 0.95 | Extra Tree | 0.93 | 0.93 | 0.93 |
Task 3 | Proposed | 0.96 | 0.96 | 0.96 | Bagging | 0.94 | 0.92 | 0.93 |
Task 4 | NB | 0.64 | 0.11 | 0.08 | Logistic Reg. | 0.45 | 0.49 | 0.47 |
Task 4 | SVM | 0.37 | 0.40 | 0.38 | KNN | 0.91 | 0.90 | 0.91 |
Task 4 | RF | 0.95 | 0.92 | 0.93 | Decision Tree | 0.86 | 0.84 | 0.85 |
Task 4 | XGBoost | 0.95 | 0.94 | 0.94 | Extra Tree | 0.95 | 0.92 | 0.93 |
Task 4 | Proposed | 0.96 | 0.92 | 0.93 | Bagging | 0.93 | 0.93 | 0.93 |
Task 5 | NB | 0.41 | 0.46 | 0.43 | Logistic Reg. | 0.51 | 0.56 | 0.53 |
Task 5 | SVM | 0.41 | 0.46 | 0.43 | KNN | 0.69 | 0.69 | 0.69 |
Task 5 | RF | 0.75 | 0.73 | 0.73 | Decision Tree | 0.61 | 0.67 | 0.64 |
Task 5 | XGBoost | 0.78 | 0.74 | 0.74 | Extra Tree | 0.74 | 0.74 | 0.74 |
Task 5 | Proposed | 0.88 | 0.84 | 0.86 | Bagging | 0.72 | 0.71 | 0.71 |
Task 6 | NB | 0.58 | 0.29 | 0.19 | Logistic Reg. | 0.37 | 0.48 | 0.40 |
Task 6 | SVM | 0.35 | 0.50 | 0.40 | KNN | 0.85 | 0.85 | 0.85 |
Task 6 | RF | 0.88 | 0.86 | 0.87 | Decision Tree | 0.81 | 0.82 | 0.81 |
Task 6 | XGBoost | 0.91 | 0.86 | 0.87 | Extra Tree | 0.84 | 0.86 | 0.85 |
Task 6 | Proposed | 0.97 | 0.96 | 0.97 | Bagging | 0.82 | 0.80 | 0.81 |
Task 7 | NB | 0.51 | 0.27 | 0.17 | Logistic Reg. | 0.48 | 0.63 | 0.53 |
Task 7 | SVM | 0.29 | 0.46 | 0.35 | KNN | 0.84 | 0.84 | 0.84 |
Task 7 | RF | 0.85 | 0.85 | 0.85 | Decision Tree | 0.74 | 0.78 | 0.75 |
Task 7 | XGBoost | 0.86 | 0.86 | 0.85 | Extra Tree | 0.85 | 0.86 | 0.85 |
Task 7 | Proposed | 0.91 | 0.91 | 0.91 | Bagging | 0.84 | 0.84 | 0.84 |
Dataset | Date | # Labels | # Features | Realistic Traffic | Data Distribution Benign–Malicious | Analyzer | User Profile | Cloud Env. |
---|---|---|---|---|---|---|---|---|
ISCX2012 | 2012 | 6 | 18 | ✗ | 97-3 | ISCXFlowMeter | ✗ | ✗ |
CTU-13 | 2013 | 14 | 84 | ✓ | - | Argus-NetFlow | ✗ | ✗ |
UNSW-NB15 | 2015 | 10 | 157 | ✗ | 87-13 | Argus-Bro-IDS | ✗ | ✗ |
CICIDS2017 | 2017 | 14 | 80 | ✓ | 78-22 | CICFlowMeter | ✓ | ✗ |
CSE-CIC-IDS2018 | 2018 | 15 | 80 | ✓ | 83-17 | CICFlowMeter | ✓ | ✗ |
CIC-DDoS2019 | 2019 | 15 | 80 | ✓ | 10-90 | CICFlowMeter | ✓ | ✗ |
SR-BH2020 | 2020 | 13 | 32 | ✓ | 58-42 | Not Public | ✗ | ✗ |
CUPID | 2022 | 2 | 80 | ✗ | 88-12 | CICFlowMeter | ✗ | ✗ |
BCCC-cPacket-Cloud-DDoS-2024 | 2024 | 26 | 322 | ✓ | 60-40 | NTLFlowLyzer | ✓ | ✓ |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shafi, M.; Lashkari, A.H.; Rodriguez, V.; Nevo, R. Toward Generating a New Cloud-Based Distributed Denial of Service (DDoS) Dataset and Cloud Intrusion Traffic Characterization. Information 2024, 15, 195. https://doi.org/10.3390/info15040195