1. Introduction
In today’s rapidly evolving, interconnected digital world, cybersecurity stands as a critical cornerstone of organizational and governmental defense against a growing range of threats, from data breaches to network attacks. Among these threats, distributed denial of service (DDoS) attacks form a critical security challenge in modern networks, frequently targeting IoT devices [1,2]. Cybersecurity professionals show increasing concern about the growing incidence and impact of DDoS attacks. The authors in [3] indicated that the number of major DDoS attacks rose by 288% between 2021 and 2022. Defenses are also becoming more complicated as attackers shift their focus from network infrastructure to application layers [4]. In addition, the growing number of IoT devices greatly increases the possible attack surface, making the IoT more vulnerable to such attacks [5]. Cloudflare [6] reported the prevention of over 21.3 million DDoS attempts in 2024, a 53% increase over 2023, averaging 4870 attacks per hour. An extraordinary 5.6 Tbps attack around Halloween [6] highlighted the serious danger that DDoS attacks represent, particularly their capacity to interfere with essential web services and cause serious operational and financial harm. In the financial sector, DDoS attacks can cause significant operational delays and financial losses [7]. Beyond financial losses, these attacks can also damage reputation, especially for banks. The effects are not limited to the main target of the attack; they also extend to shared infrastructure, cloud providers, and energy usage [8]. The rapid increase in IoT devices raises cybersecurity risks in many areas, particularly for financial institutions, making them more exposed to attacks [9]. Devices such as cameras, sensors, and door controls are vulnerable; after compromising these devices, attackers could interrupt daily operations or gain access to sensitive information [10].
The IoT consists of interconnected devices that are capable of sharing data over the internet. The IoT network is undergoing exponential growth, from 30 billion devices in 2021 to an expected more than 75 billion in 2025 [11]. IoT devices vary widely in design and purpose, encompassing everything from smart household appliances and thermostats to industrial sensors and autonomous vehicles. Although the interconnectivity between IoT devices enhances data gathering and system operations, it introduces significant security vulnerabilities. Of particular concern are DDoS attacks, in which attackers exploit vulnerable IoT devices to create large botnets capable of overwhelming network resources and disrupting major and critical services [1,12]. These disruptions could affect many services, particularly in the health and financial sectors. Within the health sector, such attacks could disrupt critical operations and impair patient monitoring and decision-making in healthcare IoT environments [13]. Similarly, any interruption of services in the financial sector negatively affects the stock market and can cause losses of thousands of dollars for every hour of downtime [14].
The detection and classification of DDoS attacks face several complex challenges in modern network environments. The effectiveness of traditional signature-based detection methods has been reduced by the widespread adoption of traffic encryption [15]. Furthermore, attackers continuously evolve their attack patterns to circumvent detection mechanisms, creating ongoing challenges for security professionals. Additionally, the significant growth in legitimate traffic volume adds further complexity to differentiating between legitimate and attack traffic [15]. Studies in [16,17] show that the increase in IoT device usage has introduced new attack vectors that traditional detection systems struggle to discover effectively.
DDoS attacks present a real threat to IoT environments due to their inherent vulnerabilities, including limited memory and processing capacity. The complexity and magnitude of modern DDoS attacks are not adequately addressed by conventional security measures [15]. Cyber threat detection and mitigation may be enhanced by machine learning (ML) solutions, including supervised and deep learning algorithms, which can handle enormous datasets, discover complex patterns, and provide adaptive detection [18]. The effectiveness of these ML algorithms in detecting DDoS attacks in IoT systems should be thoroughly investigated. Although earlier studies have examined ML techniques, they have not focused on the importance of feature selection for a model’s effectiveness. Moreover, most previous efforts used accuracy as the primary measure of a model’s efficacy, neglecting factors such as false-positive rates and training/detection time. Taking all of these factors into account provides insight into the optimal detection strategies to employ, given the limited resources and scalability of IoT devices. As a result, choosing appropriate ML models for network security in general, and for DDoS attacks in particular, has become crucial in addressing these challenges.
Different ML and DL models have shown varying capabilities in handling network traffic characteristics, processing speed, and detection accuracy. Recent studies [19,20] showed that model selection can affect detection accuracy by up to approximately 25% and processing speed by up to 69–76% under similar conditions in various domains. Furthermore, feature selection techniques effectively minimize training time and enhance the accuracy of ML-based detection [20,21]. The efficacy of different ML models varies noticeably, with some models showing superior performance in detecting certain attack types while struggling with others.
Using the Edge-IIoTset dataset [22], this paper aims to thoroughly evaluate the effectiveness of different supervised ML classification algorithms in enhancing the detection of DDoS attacks in IoT environments. The most accurate and efficient models for attack detection are recommended by analyzing computational training and detection time in addition to comparing standard performance metrics.
The research methodology employs a systematic comparative analysis using the Edge-IIoTset dataset to assess multiple ML classification algorithms. The experimental framework covers data pre-processing, feature selection, and model training phases. The evaluated classification models are LSTM, RF, CNN, KNN with varying k values (3, 5, 7), and SVM. Performance evaluation is carried out through statistical analysis of standard metrics derived from the confusion matrix, while training time and detection speed are measured to evaluate computational efficiency.
The primary contributions of this research include:
Presenting a detailed comparison of supervised ML and DL techniques in DDoS attack detection on IoT devices using the Edge-IIoTset dataset. Designed for IoT and Industrial Internet of Things (IIoT), this dataset provides a realistic and comprehensive cybersecurity resource.
Investigating the important effect of feature selection on the accuracy and efficiency of detection models, an area that previous studies have not sufficiently emphasized.
Evaluating several useful performance indicators, such as false-positive rates, training time, detection accuracy, and detection speed.
Identifying the best algorithms for resource-constrained IoT devices through practical suggestions that take computational efficiency and performance into account.
The rest of this paper is structured as follows. The generic detection approaches and related ML algorithms are introduced in Section 2. Then the related work is presented in Section 3. In Section 4, the implementation details are explained, including the data engineering and preparation framework, model selection and training details, and model evaluation. The results and model performance metrics are analyzed in Section 5. Finally, Section 6 outlines the study’s limitations and future research directions, while the conclusion is presented in Section 7.
3. Related Work
IoT has rapidly expanded to link many systems and devices, and with this linkage come security concerns that must be addressed. Researchers [50,51,52] have proposed taxonomies to classify IoT risks into several levels: service, device, infrastructure, and communication. IoT systems’ physical objects, protocols, data, and software components are all open to security breaches [53]. To handle these problems, researchers have proposed several countermeasures, including key management, authentication, access control, and privacy preservation strategies [51,54,55]. They have also focused on developing robust detection methods capable of identifying attack signatures within the unique constraints of IoT environments [50,56]. However, the protection of IoT ecosystems remains a challenge due to device limitations and non-standard IoT settings [51,57,58].
Agent-based approaches have recently been investigated to reduce DDoS attacks on IoT systems. The researchers in [59,60] explored adaptive traffic filtering, anomaly detection using ML techniques, and collaborative agent-based detection systems. Some researchers have built agent-based simulators to study protection measures against DDoS attacks [61], and others suggest lightweight agents for IoT to identify and mitigate attacks [60,62]. Blockchain-based collaborative detection and learning-driven detection and mitigation have also been investigated as novel techniques [63,64,65]. Although these approaches improve IoT network resilience against evolving DDoS attacks while accounting for scalability and resource limitations, they still present drawbacks. Agent-based techniques commonly face challenges related to resource consumption when deployed on resource-limited IoT devices, potentially creating performance bottlenecks at scale. Collaborative detection systems, while effective in theory, face significant challenges with secure communication between agents and can fail when attackers target the coordination mechanisms themselves. Approaches such as adaptive traffic filtering, blockchain-based detection, and learning-driven mitigation suffer from high false-positive rates when facing sophisticated low-rate attacks and typically require extensive training data that may not be representative of evolving attack patterns.
Ferrag et al. [22] presented Edge-IIoTset, a comprehensive cybersecurity dataset for IoT and IIoT applications, which includes realistic network traffic from more than 10 types of IoT devices across a seven-layer testbed architecture, thereby addressing constraints of existing datasets. Deep neural networks (DNNs) in centralized and federated learning environments, as well as traditional ML techniques such as DT, RF, SVM, and K-Nearest Neighbor (KNN), were evaluated. Accuracies of 99.99% for binary classification and 94.67% for 15-class classification were obtained using DNNs. The dataset outperforms existing IoT/IIoT datasets and supports privacy-preserving federated learning, making it a useful resource for building intrusion detection systems in IoT/IIoT environments.
Rashid et al. [66] presented a federated learning architecture designed especially for intrusion detection in IIoT systems. Their method uses CNNs and RNNs as baseline classifiers within a federated learning system. The system was evaluated on the Edge-IIoT dataset and achieved an accuracy of 92.49%. Although less accurate than centralized models (93.92%), the proposed architecture offers significant advantages by lowering bandwidth use and ensuring data privacy. The key contribution of this study is the capacity of edge devices to perform intrusion detection independently, without a constant connection to a central server, which directly addresses the fundamental bandwidth and privacy constraints of IoT networks.
Focusing on the DenseNet and Inception Time architectures across three main datasets, ToN-IoT, UNSW-NB15, and Edge-IIoT, the authors in [67] performed a detailed comparative evaluation of DL models for IoT cybersecurity. They modified DenseNet for one-dimensional input vectors and used sliding-window methods with Inception Time to enhance temporal feature extraction. Inception Time performed remarkably well, achieving 100% accuracy on the ToN-IoT dataset, 94.9% on Edge-IIoT, and 98.6% on UNSW-NB15. These results demonstrate the effectiveness of the Inception Time architecture.
Hnamte and Hussain [68] proposed a hybrid DL architecture that combines Deep Convolutional Neural Networks (DCNN) with Bidirectional Long Short-Term Memory (BiLSTM) networks. The proposed approach captures temporal correlations and spatial features by combining BiLSTM components with CNN layers. The authors report a low false-positive rate, with an accuracy of 100% on the CICIDS2018 dataset and 99.64% on the Edge-IIoT dataset. These results illustrate the benefits of hybrid designs over single-model methods.
Khacha et al. [69] developed a tailored hybrid DL model that combines CNN and LSTM architectures. The CNN-LSTM model achieved 100% accuracy in binary classification and performed well in multi-class scenarios compared to both traditional ML models and individual DL models. Thiyam and Dey [70] tackled the challenge of class imbalance in cybersecurity datasets by adopting a unique feature assessment approach. The approach combined feature shuffling using RF for optimal feature selection with a hybrid resampling technique integrating SMOTE (Synthetic Minority Over-sampling Technique) and TOMEK links. Using six ML techniques on the Edge-IIoT and CIC-DDoS2019 datasets, the decision tree (DT) classifier achieved accuracies of 99.32% and 99.87%, respectively. This clearly shows how addressing class imbalance improves detection performance.
Table 1 highlights studies using the Edge-IIoT dataset for multi-class classification, presenting the Edge-IIoT results from studies that evaluated multiple datasets.
4. Materials and Methods
This section presents the methodological framework for the implementation and evaluation of DDoS detection algorithms in IoT systems. This research assesses the efficacy of a variety of ML techniques, including KNN with varying k values, RF, SVM, LSTM, and CNN, in identifying DDoS attacks within IoT traffic patterns. The approach comprises three primary phases: data engineering and preparation, model selection and training, and performance evaluation. To extract differentiating attributes and address class imbalance issues, the Edge-IIoTset dataset [22] is investigated during the data preparation phase. The model development phase configures the selected ML algorithms with appropriate architectural parameters and implements training protocols optimized for IoT constraints. Algorithm performance is then evaluated using criteria such as performance metrics (detection accuracy, precision, recall, F1-score), computational efficiency, and detection speed.
4.1. Data Engineering and Preparation Framework
4.1.1. Dataset Acquisition and Cleaning
The selection of data is a vital step in developing any ML or DL model. It is a highly important stage in the construction of a robust DDoS attack detection model, since every model requires proper training and validation on sufficiently large datasets that are free from noise, outliers, missing data, and similar issues. With the right dataset selected for training, an ML model is able to identify unseen traffic patterns more accurately. The Edge-IIoTset dataset is a popular Kaggle dataset specifically created for DDoS attack identification using both ML and DL frameworks [22]. The Edge-IIoTset corpus is made up of 49 data files organized into three distinct sub-directories: normal IoT and IoT application traffic, malicious IoT and IoT application traffic, and a specific dataset prepared for ML and DL.
Data cleaning is the act of fixing problems such as outliers and missing values in order to prepare a dataset for analysis. Missing values can arise for several reasons, including lost packets or insufficient logging. Methods such as mean/median imputation, forward/backward fill, or deletion are used to deal with them. The identification and elimination of outliers is also essential, since outliers can distort the model’s perception of legitimate versus malicious traffic. This process also removes unwanted and redundant columns, making the data more suitable for analysis. A minimal cleaning sketch is given below.
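The following sketch illustrates this cleaning step with pandas; the file name, the example column, and the IQR-based outlier rule are illustrative assumptions rather than the exact procedure used in this study.

```python
import pandas as pd

# Load the ML/DL-ready portion of the Edge-IIoTset corpus
# (the file name below is an assumed placeholder).
df = pd.read_csv("DNN-EdgeIIoT-dataset.csv", low_memory=False)

# Drop rows with missing values; alternatively, impute with the median, e.g.:
# df["tcp.len"] = df["tcp.len"].fillna(df["tcp.len"].median())  # column name is hypothetical
df = df.dropna()

# Remove duplicate records that add no information.
df = df.drop_duplicates()

# Illustrative IQR-based outlier filter on numeric columns (an assumed rule).
numeric_cols = df.select_dtypes("number").columns
q1, q3 = df[numeric_cols].quantile(0.25), df[numeric_cols].quantile(0.75)
iqr = q3 - q1
outlier_mask = ((df[numeric_cols] < (q1 - 1.5 * iqr)) |
                (df[numeric_cols] > (q3 + 1.5 * iqr))).any(axis=1)
df = df[~outlier_mask]
```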
4.1.2. Data Transformation
Data transformation refers to the encoding and normalization of the data parameters or features to make them suitable for the model. Encoding transforms categorical parameters into a numerical format that the models can readily interpret. For the purpose of this research, two types of encoding are used:
Label encoding: it directly replaces each categorical value with a numerical substitute.
One-hot encoding: it converts categorical variables into a set of binary columns, ensuring that the model does not infer ordinal relationships between categories. Each category is encoded by a binary vector with exactly one element set to ‘1’ and the remaining elements set to ‘0’.
Each parameter has a different range of values, so all parameter values must be brought into a common range. Normalization is used to scale the numerical values to a common range; min–max normalization is used in this study. A minimal sketch of these transformations is shown below.
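The sketch below shows these transformations with scikit-learn, continuing from the cleaned DataFrame of the previous step; the column names and the choice of which columns are label-encoded versus one-hot encoded are assumptions made for illustration.

```python
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
import pandas as pd

# Label-encode the multi-class target (the 'Attack_type' column name is assumed).
label_enc = LabelEncoder()
df["Attack_type"] = label_enc.fit_transform(df["Attack_type"])

# One-hot encode nominal categorical features (illustrative column names).
df = pd.get_dummies(df, columns=["http.request.method", "dns.qry.name"])

# Min-max normalization: scale every feature column to the [0, 1] range.
feature_cols = df.columns.drop("Attack_type")
scaler = MinMaxScaler()
df[feature_cols] = scaler.fit_transform(df[feature_cols])

X, y = df[feature_cols].to_numpy(), df["Attack_type"].to_numpy()
```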
4.1.3. Feature Extraction and Selection
In developing an effective ML model, proper pre-processing of the data should be conducted. This pre-processing includes feature selection and extraction, which transform raw network data into structured representations suitable for classification algorithms. Several network attributes can be utilized for DDoS detection, such as packet size distributions, flow duration metrics, protocol type indicators, TCP flag configurations, and network addressing parameters (source/destination IP addresses and ports). The selection method uses both statistical techniques and domain knowledge to extract the most informative attributes and eliminate redundant features. This approach enhances computational efficiency without compromising classification performance, as demonstrated by feature selection reducing RF training time from 21.028 s to 16.455 s while increasing accuracy from 93.20% to 99.99%, and reducing SVM training time from 334.456 s to 85.322 s while improving accuracy from 75.06% to 99.26% (a 24.20% improvement). In addition to computational efficiency, this approach improves detection time and other performance criteria. Thorough analysis and quantitative evidence supporting these improvements are presented in Section 5. In this research, principal component analysis (PCA) was applied to reduce dimensionality and transform potentially correlated features into linearly uncorrelated variables ordered by explained variance. This method preserves essential information while reducing computational requirements. A minimal PCA sketch is given below.
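The PCA step can be sketched with scikit-learn as follows; the retained-variance threshold of 95% is an illustrative assumption, not the exact setting used in this study.

```python
from sklearn.decomposition import PCA

# X is the scaled feature matrix produced by the preceding transformation step.
# Keep enough principal components to explain ~95% of the variance (assumed threshold).
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(f"Original features: {X.shape[1]}, retained components: {pca.n_components_}")
print("Explained variance ratios:", pca.explained_variance_ratio_)
```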
4.1.4. Data Splitting and Imbalanced Classes
The whole dataset is split into three independent subsets: a training set, a validation set, and a testing set. In total, 80% of the data corpus is used to train the model and 20% is used for model evaluation and testing. The training portion is further split into training and validation sets using an 80:20 ratio. The training set is used to train the model, and the validation set is used to tune the hyper-parameters and avoid overfitting. Data splitting may reveal imbalanced classes, which can be addressed by either undersampling or oversampling. In this study, class weights were assigned for each class throughout the training phase in order to handle the unbalanced data. The detailed class distribution demonstrating this imbalance is provided in Appendix A. A minimal sketch of the splitting and class-weighting steps is shown below.
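A minimal sketch of this splitting and class-weighting procedure with scikit-learn is given below; the stratified split and the balanced weighting scheme are reasonable assumptions for handling the imbalance described above, not a verbatim reproduction of the study’s configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# 80/20 split into train+validation and test, then a further 80/20 split for validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X_reduced, y, test_size=0.2, stratify=y, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.2, stratify=y_trainval, random_state=42)

# Per-class weights inversely proportional to class frequency (balanced scheme).
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weights = dict(zip(classes, weights))
```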
Real-time threat identification capabilities are an important aspect of IoT DDoS detection. Detection time is an essential performance parameter, alongside accuracy, in IoT environments, since these environments are characterized by devices with limited processing capacity and resources. Consequently, to ensure practical deployment viability in real-world IoT scenarios, where immediate threat response is crucial for preventing service disruption and network compromise, both high accuracy and quick detection are established as essential requirements.
4.2. Model Selection and Training
4.2.1. Model Selection
While DL methods are traditionally associated with image and sequential data, their application to network traffic analysis is well justified by the inherent characteristics of DDoS attack patterns and IoT network flows. Sequential connectivity and temporal interconnections in network traffic data demand advanced pattern recognition capabilities. By treating feature vectors as 1D sequences, CNNs can capture local feature patterns and complex interactions within network flow characteristics, which enables the identification of attack signatures across multiple feature dimensions simultaneously. The comprehensive evaluation of both traditional ML and DL methods addresses the variety of computing needs and deployment scenarios found in IoT systems. For IoT systems with limited resources, this makes it possible to create hybrid frameworks that combine the computational efficiency of conventional techniques with the advanced pattern recognition abilities of DL. LSTMs are particularly suitable for this domain because DDoS attacks exhibit sequential dependencies and temporal behavioral patterns that traditional ML methods cannot fully capture. In this work, five distinct ML and DL models are evaluated: RF, SVM, KNN with varying k values, LSTM, and CNN.
4.2.2. Model Architecture
RF: the RF classifier is created with 100 trees in the forest. The maximum depth of each tree was not set, allowing trees to grow until all leaves were pure or a node contained too few samples to split. The model parameters specified a minimum of two samples for internal node splitting and one sample per leaf. These settings were chosen to balance the model’s complexity and its ability to generalize well.
SVM: it uses a regularization parameter (C) of 1.0 to balance low training error and weight norm optimization. A linear kernel was used, making the decision boundary linear. The influence of a single training example is determined by the gamma parameter, which was set to its default value of 1/(number of features).
KNN: the KNN algorithm type was left at auto to allow the model to choose the optimal method for the dataset’s structure. Predictions were generated with k values of 3, 5, and 7 neighbors. A uniform weight function was used so that all points in each neighborhood are weighted equally, regardless of their distance from the query point. A configuration sketch of these three classifiers is shown below.
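These settings correspond to the following scikit-learn instantiations; this is a minimal sketch, and any parameter not stated above (such as random_state) is an assumption.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

# RF: 100 trees, unlimited depth, default split/leaf thresholds.
rf = RandomForestClassifier(n_estimators=100, max_depth=None,
                            min_samples_split=2, min_samples_leaf=1,
                            random_state=42)  # random_state is an assumption

# SVM: linear kernel, C = 1.0, gamma = 1 / n_features ('auto' in scikit-learn).
svm = SVC(C=1.0, kernel="linear", gamma="auto")

# KNN: uniform weights, automatic algorithm selection, k in {3, 5, 7}.
knn_models = {k: KNeighborsClassifier(n_neighbors=k, weights="uniform",
                                      algorithm="auto")
              for k in (3, 5, 7)}

rf.fit(X_train, y_train)  # trained on the split produced earlier
```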
LSTM: the LSTM network’s ability to learn and retain information over extended sequences makes it well suited to finding patterns in the time-series data associated with DDoS attacks. The sequential architecture employs four LSTM layers containing 50, 50, 100, and 100 units, respectively, interspersed with dropout layers for regularization, followed by a final dense layer with 15 output neurons for multi-class DDoS detection.
CNN: the detailed structure is made up of convolutional, pooling, and fully connected layers tailored to the dataset’s specific attributes. CNNs can detect complex trends in network traffic data because they readily identify spatial hierarchies in the data. The model consists of five one-dimensional convolutional layers using 32, 64, 128, 128, and 128 filters, respectively. Each convolutional layer is followed by a max pooling layer to reduce the feature size. After these layers, the output is flattened and passed through two fully connected (dense) layers, first with 64 neurons and then with 15 neurons, which produce the final classification. Keras-style sketches of both architectures are given below.
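The two DL architectures can be sketched in Keras as follows. The layer counts and unit/filter sizes follow the descriptions above, while the dropout rates, kernel sizes, activations, padding, and input reshaping (each feature vector treated as a 1D sequence with one channel) are assumptions made for illustration.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (LSTM, Dense, Dropout, Conv1D,
                                     MaxPooling1D, Flatten)

n_features = X_train.shape[1]  # samples are reshaped to (n_features, 1) before training

# LSTM: four recurrent layers (50, 50, 100, 100 units) with dropout, 15-class output.
lstm_model = Sequential([
    LSTM(50, return_sequences=True, input_shape=(n_features, 1)),
    Dropout(0.2),                      # dropout rate assumed
    LSTM(50, return_sequences=True),
    Dropout(0.2),
    LSTM(100, return_sequences=True),
    Dropout(0.2),
    LSTM(100),
    Dropout(0.2),
    Dense(15, activation="softmax"),
])

# CNN: five Conv1D layers (32, 64, 128, 128, 128 filters), each followed by max pooling.
# 'same' padding keeps short post-PCA feature vectors valid through all pooling stages.
cnn_model = Sequential([Conv1D(32, 3, padding="same", activation="relu",
                               input_shape=(n_features, 1)),
                        MaxPooling1D(2, padding="same")])
for filters in (64, 128, 128, 128):
    cnn_model.add(Conv1D(filters, 3, padding="same", activation="relu"))
    cnn_model.add(MaxPooling1D(2, padding="same"))
cnn_model.add(Flatten())
cnn_model.add(Dense(64, activation="relu"))
cnn_model.add(Dense(15, activation="softmax"))

for m in (lstm_model, cnn_model):
    m.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```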
4.2.3. Model Training
In ML and DL model training, data are fed into the model and the model parameters are continuously adjusted to minimize a loss function. DL uses multi-layer neural networks along with very large datasets and substantial computing capacity to discover intricate patterns in the data. In this study, 80% of the entire dataset is used for training and validation. Throughout the training process, several strategies were used to prevent overfitting and help the models generalize appropriately. To improve generalization and avoid relying too heavily on particular neurons, dropout layers were added to the neural networks in the DL models (CNN and LSTM). The dataset was carefully split into training, validation, and testing sets to make sure that the models were tested on unseen data. Traditional ML models such as RF, KNN, and SVM employed built-in regularization methods, and hyperparameter optimization was conducted to achieve a balance between model complexity and generalization performance. Cross-validation was applied when needed to ensure robust model selection and parameter tuning, maintaining a reliable evaluation framework for practical IoT deployment scenarios.
Specific stopping criteria were defined for each model type to ensure optimal performance. Early stopping was used for the DL models (CNN and LSTM) by monitoring validation accuracy and automatically terminating training when validation performance stopped improving. Traditional ML models implemented algorithm-specific stopping criteria: RF used a predetermined number of estimators, KNN performs distance-based classification without iterative training, and SVM optimized using convergence tolerance parameters. A minimal sketch of the early-stopping setup is shown below.
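A minimal sketch of the early-stopping setup for the DL models follows; the patience value, epoch cap, and batch size are assumptions, as they are not specified above.

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop training when validation accuracy stops improving; restore the best weights.
early_stop = EarlyStopping(monitor="val_accuracy", patience=5,  # patience assumed
                           restore_best_weights=True)

history = cnn_model.fit(
    X_train.reshape(-1, n_features, 1), y_train,
    validation_data=(X_val.reshape(-1, n_features, 1), y_val),
    epochs=100, batch_size=256,        # epoch cap and batch size assumed
    class_weight=class_weights,        # per-class weights from the splitting step
    callbacks=[early_stop])
```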
4.3. Model Evaluation
The efficacy of a trained model is evaluated using a variety of critical metrics, such as precision, recall, accuracy, and F1-score. Computational efficiency, including both training and detection time, is evaluated alongside confusion matrix analysis within the evaluation framework to ensure a comprehensive performance assessment for IoT deployment scenarios. In this way, the evaluation covers not only classification accuracy but also the practical feasibility of implementing these models in IoT environments with limited resources. Typically, this approach includes an unbiased evaluation of the model’s performance using an independent test dataset that was not utilized during training. In general, a model’s evaluation helps determine its performance in real scenarios and provides direction for further model selection or adjustment. All testing and model evaluations were carried out on a Dell Precision WorkStation T7500, featuring two Intel Xeon X5570 processors running at 2.93 GHz, 64 GB of RAM, and Windows 10 Pro Education. Python 3.11.11 was used for all experiments. A minimal sketch of the metric and timing computation is shown below.
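These metrics and timings can be computed as sketched below; this is a minimal illustration using scikit-learn, with the RF classifier standing in for any of the evaluated models and weighted averaging assumed for the multi-class metrics.

```python
import time
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

# Training time.
start = time.perf_counter()
rf.fit(X_train, y_train)
train_time = time.perf_counter() - start

# Detection (inference) time over the held-out test set.
start = time.perf_counter()
y_pred = rf.predict(X_test)
detect_time = time.perf_counter() - start

print(f"Accuracy : {accuracy_score(y_test, y_pred):.4f}")
print(f"Precision: {precision_score(y_test, y_pred, average='weighted'):.4f}")
print(f"Recall   : {recall_score(y_test, y_pred, average='weighted'):.4f}")
print(f"F1-score : {f1_score(y_test, y_pred, average='weighted'):.4f}")
print(f"Training time: {train_time:.3f} s, detection time: {detect_time:.3f} s")
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```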
6. Limitations and Future Work
Though the Edge-IIoTset dataset provides a rather structured view of attack paths, its reflection of the real-world diversity and complexity of DDoS threats targeting IoT environments remains limited. Controlled testing scenarios often fail to capture the adaptive nature of modern attacks, particularly with respect to adversarial strategies and concept drift, where attack behaviors evolve in unanticipated ways. The models’ architectures, which were precisely tuned to the dataset, represent another limitation. While they perform well within that controlled context, their generalizability remains questionable. Real-world IoT deployments often involve different traffic patterns, attack types, and noise levels compared to those represented in curated datasets. Without appropriate architectural adaptation, these differences can significantly impair model performance. Simply put, a model designed for one dataset is unlikely to hold up across different real-world scenarios.
Several key research directions remain open for future exploration: automated feature engineering using AutoML techniques, model compression and optimization strategies, hybrid model approaches, unified learning methodologies, and cross-dataset validation using diverse IoT security datasets. Cross-dataset evaluation, including validation on real-world IoT traffic from operational deployments, would enhance generalizability and provide insights into algorithmic performance across heterogeneous IoT environments. In particular, AutoML-driven feature engineering should be further investigated to help identify optimal feature subsets tailored to specific DDoS variants. Research should also be conducted on model compression and optimization techniques specifically tailored for DDoS detection in resource-constrained environments such as the IoT. It would also be beneficial to investigate ensemble approaches combining high-accuracy DL models with efficient traditional approaches. Further research on transfer learning techniques could enhance applicability across a range of IoT deployment scenarios by lowering the processing requirements of DL models while still preserving detection performance.