Drift Adaptive Online DDoS Attack Detection Framework for IoT System

Internet of Things (IoT) security is becoming important with the growing popularity of IoT devices and their wide applications. Recent network security reports revealed a sharp increase in the type, frequency, sophistication, and impact of distributed denial of service (DDoS) attacks on IoT systems, making DDoS one of the most challenging threats. DDoS is used to commit actual, effective, and profitable cybercrimes. Current machine learning-based IoT DDoS attack detection systems use batch learning techniques and hence are unable to maintain their performance over time in a dynamic environment. The dynamicity of heterogeneous IoT data causes concept drift issues that result in performance degradation and automation difficulties in detecting DDoS. In this study, we propose an adaptive online DDoS attack detection framework that detects and adapts to concept drifts in streaming data using a number of features often used in DDoS attack detection. This paper also proposes a novel accuracy update weighted probability averaging ensemble (AUWPAE) approach to detect concept drift and optimize zero-day DDoS detection. We evaluated the proposed framework using the IoTID20 and CICIoT2023 datasets, which contain benign and DDoS traffic data. The results show that the proposed adaptive online DDoS attack detection framework is able to detect DDoS attacks with an accuracy of 99.54% and 99.33% for the respective datasets.


Introduction
The Internet of Things (IoT) has grown exponentially in the past ten years by changing ordinary objects into smart and intelligent ones that can work together to make decisions. The desire for smart apps and devices that can work autonomously without requiring human involvement has been one of the main drivers in this field. The development of efficient applications, enhanced communication protocols, and breakthroughs in embedded system design, along with end user demand, have all contributed to the acceleration of IoT growth. It is predicted that the number of connected IoT devices used around the world will increase at an average annual rate of 12%, from 27 billion in 2017 to 125 billion in 2030 [1].
Despite the wide range of applications and technological developments, IoT is susceptible to cyber attacks [2]. The absence of built-in security features has been one of the most obvious shortcomings of IoT systems. This flaw results from the fact that the majority of embedded devices lack the computational power necessary to implement sophisticated security procedures and encryption techniques.
Recent network security reports show a sharp increase in the type, frequency, sophistication, and impact of distributed denial of service (DDoS) attacks on IoT systems, making DDoS one of the most challenging threats [3]. Information and network security measures such as encryption, authentication, and access control techniques are not sufficient to defend against DDoS attacks on IoT infrastructure [4]. An effective DDoS attack detection solution is required to supplement current security measures. DDoS attack detection monitors the network traffic records that are generated within IoT networks in order to detect DDoS attacks [5]. Such systems can be developed and implemented at IoT network gateways to alert network managers and prevent DDoS attacks.
In recent years, machine learning has grown in popularity as a technique for DDoS attack detection in IoT networks. Machine learning-based approaches use historical normal and DDoS traffic data from IoT systems to train a model and detect DDoS attacks. Such an approach, however, is not effective for detecting DDoS attacks in IoT systems. Its low performance is mainly attributed to IoT's unique properties, such as resource-constrained devices, enormous volumes of data, and real-time requirements. IoT systems have notable security issues because the majority of current industrial security solutions, including machine learning approaches, require heavy-weight computations and large amounts of memory [4]. To address these problems, researchers from academia and industry have conducted several studies on machine learning-based detection techniques that attempt to deliver effective data-driven detection [6,7]. Several DDoS attack detection models are batch learning-based machine learning techniques. Batch learning techniques frequently require access to a complete dataset for model training. Learning from massive IoT datasets demands a significant amount of retraining time, computational resources, and memory, which conflicts with the real-time nature of the environment. Online learning is therefore more appropriate than batch learning for IoT DDoS attack detection, given the real-time nature of many IoT systems [8].
Another challenge for machine learning-based IoT DDoS attack detection approaches is concept drift, which is caused by the dynamic nature of IoT environments. Concept drift is a situation where the statistical properties of the target variable, i.e., attack, change over time. Concept drift makes already trained models less useful in recognizing zero-day attacks. A good data analytics model must accurately detect and adapt to observed drifts in order to prevent concept drift issues and maintain high prediction accuracy. Concept drift can be categorized into sudden, incremental, gradual, recurring, and noise drifts [5]. Concept drifts are a challenge for IoT DDoS attack detection since the distribution of fault patterns varies over time. IoT data are also imbalanced because data with flaws make up only a small portion of all data.
In this work, a novel Accuracy Update Weighted Probability Averaging Ensemble (AUWPAE) framework is proposed for DDoS attack detection using real-time data streaming. The proposed framework relies on the dynamic nature of incoming streaming data, which leads to rapid changes in data distributions, to build a model that detects concept drifts. The proposed framework reacts to different types of concept drift to perform effectively. The two commonly used drift detection methods, ADWIN [9] and DDM [10], are used in the proposed ensemble framework to detect gradual drift and sudden concept drift, respectively. ARF [11], SRPs [12], and KNN [13] are used as base learners to construct the ensemble framework. The proposed framework is evaluated on two benchmark IoT datasets: IoTID20 and CICIoT2023. The results show that the proposed adaptive online DDoS attack detection framework is able to detect DDoS attacks on IoT systems with an accuracy of 99.32% and 99.25% for the two datasets, respectively.
The contributions of this paper are as follows:
1. We proposed a novel Accuracy Update Weighted Probability Averaging Ensemble (AUWPAE) framework for online DDoS detection that addresses gradual drift and sudden concept drift issues in a dynamic environment.
2. We evaluated the proposed framework using two well-known, publicly available IoT security datasets as a case study. We investigated and compared the proposed framework with various state-of-the-art online ensemble learning techniques.
The rest of the paper is organized as follows. Section 2 presents related works on DDoS attack detection, and concept drift detection and adaptation. Section 3 describes the proposed framework for adaptive online DDoS attack detection. Section 4 presents the experimental environment, performance metrics, and datasets used in the experiment. Section 5 discusses the experiment results, while Section 6 presents the practical scenario in which the proposed solution could be deployed. Finally, Section 7 presents the conclusion of the paper.

DDoS Attack Detection
The development of efficient and effective DDoS attack detection techniques in IoT systems has received research attention in the past decade [8]. Special focus has been given to implementing DDoS detection based on network traffic analysis. Machine learning-based DDoS detection techniques, in particular, have been hailed as promising for making inferences about DDoS attacks. Chen et al. [7] proposed a machine learning-based multi-layer IoT DDoS attack detection framework that includes IoT devices, IoT gateways, software-defined network (SDN) switches, and cloud servers. The authors constructed eight smart poles equipped with different sensors on campus networks and collected sensor data as datasets over wired and wireless networks. The experimental findings demonstrated that the multi-layer DDoS detection system achieves accuracy above 97% for different datasets. Additionally, the SDN controller can efficiently block malicious devices based on blacklists generated by the proposed DDoS attack detection framework.
Attota et al. [14] proposed an ensemble multi-view federated method for IoT intrusion detection. The authors addressed the limitations of centralized deployment by using edge computing paradigms for maintaining data privacy. Three artificial neural network (ANN) models were trained using the bidirectional traffic flow, unidirectional traffic flow, and packet features of network traffic. To select the optimum set of network traffic features for ANN model training, the grey wolf optimization (GWO) technique was used. This reduced the amount of memory space needed to store the training data by reducing the dimensionality of the network traffic features. The outputs of the ANN models are fed into a random forest (RF) model to predict attacks. The hyperparameters of the models, however, are not disclosed. The proposed approach was evaluated using the MQTT dataset, which lacks samples of IoT DDoS attacks.
Nguyen et al. [15] demonstrated the vulnerability of federated learning-based intrusion detection systems to backdoor attacks. A backdoor attack is a type of poisoning attack in which the attacker corrupts specific inputs to a model in order to make it produce incorrect predictions. The anomaly detection system used the gated recurrent unit (GRU), which is a type of recurrent neural network (RNN), for detecting anomalous behavior of IoT devices. The experiment results showed the effectiveness of backdoor attacks in circumventing state-of-the-art defenses against federated learning poisoning. The attacker performed white box poisoning on the training data by stealthily injecting malicious traffic into the benign training dataset. As a result, the model incorrectly classified malicious traffic as benign.
Ullah et al. [6] developed anomaly-based intrusion detection models for IoT networks using convolutional neural network (CNN) models. The proposed CNN model was implemented using 1D, 2D, and 3D network architectures. CNN and transfer learning were employed as deep learning models for multi-class and binary classification on the BoT-IoT, IoT network intrusion, MQTT-IoT-IDS2020, and IoT-23 intrusion detection datasets.
Cheng et al. [16] proposed a federated transfer learning approach for intrusion detection in mobile edge computing. The federated learning approach uses transfer learning to speed up model training, lower computational costs, boost communication effectiveness, and enhance classification performance. A CNN architecture that included three convolutional layers, two max-pooling layers, a batch normalization layer, a dropout layer, and two dense layers was utilized for binary classification. To evaluate the performance of the method, the NSL-KDD dataset was used to train the model in the source domain, and the UNSW-NB15 dataset was utilized to finish the training in the target domain.
Zainudin et al. [17] proposed a CNN and LSTM hybrid model for DDoS attack classification. The authors utilized the extreme gradient boosting (XGBoost) feature selection technique in order to determine the top 10 relevant features. The proposed model was evaluated using the CIC-DDoS2019 dataset, which includes DDoS attack and benign data. The experiment results show an accuracy of 99.5%.
Kumar et al. [18] developed an LSTM-based DDoS attack detection model using the CICDDoS2019 dataset. The proposed model performs binary classification to distinguish between DDoS attacks and benign traffic. The experimental results show that the proposed model achieved an accuracy of 98.6%.
The recent research papers on machine learning-based DDoS attack detection for IoT devices are summarized in Table 1. None of these works considered the dynamicity of the IoT environment, which could result in concept drift. The learning methods proposed in these works are also not online attack detection methods. In this research, we propose an adaptive online DDoS attack detection framework for IoT systems.

Concept Drift
The majority of IoT devices used in IoT systems have limited computing power and storage [4]. This limited memory capacity hinders their ability to handle and retain massive amounts of data and complex learning models. Therefore, it is crucial to develop analytics models with low computing complexity. Online learning methods that enable real-time analytics are able to fulfill the time and memory requirements of IoT systems. Online learning-based approaches can continuously update the learning model with new data samples as they arrive, within a short execution time, while batch learning-based approaches need to frequently retrain the learning model on the entire training dataset, which takes a relatively longer execution time. IoT data samples are usually produced dynamically in constantly changing IoT environments. The dynamic nature of streaming traffic also makes already trained models less useful in recognizing zero-day attacks.
Data analytics frequently experiences concept drift issues in real-world applications due to changes in IoT data distribution over time. Concept drift is caused by three main data distribution changes: recurring, gradual, and sudden [19]. Sudden drift happens when rapid irreversible changes occur in a short period of time. Gradual drift happens when a new data distribution gradually replaces the old one over time. Recurring drift happens when a previous data distribution reappears over time.
Concept drift is formally defined as follows. Consider a set of samples, indicated as $S_{0,t} = \{d_0, \ldots, d_t\}$ for a time interval of $[0, t]$, where $d_i = (x_i, y_i)$ represents a single observation, $x_i$ is the feature vector, $y_i$ is the label, and $S_{0,t}$ adheres to a specific distribution $f_{0,t}(x, y)$. If $f_{0,t}(x, y) \neq f_{t+1,\infty}(x, y)$, denoted as $\exists\, t : p_t(x, y) \neq p_{t+1}(x, y)$, then concept drift happens at timestamp $t+1$. Concept drift at time $t$ can also be defined as the change in the joint probability of $x$ and $y$ at time $t$, expressed as $p_t(x, y) = p_t(x) \times p_t(y \mid x)$. Concept drift will happen under the following three conditions.

• $p_t(x) \neq p_{t+1}(x)$ while $p_t(y \mid x) = p_{t+1}(y \mid x)$. This type of drift is known as virtual drift because the change in $p_t(x)$ does not affect the decision boundary.
• $p_t(y \mid x) \neq p_{t+1}(y \mid x)$ while $p_t(x) = p_{t+1}(x)$, that is, $p_t(x)$ remains unchanged. This is considered actual drift because it affects the decision boundary and also leads to a decline in learning accuracy.
• The combination of the first two, $p_t(x) \neq p_{t+1}(x)$ and $p_t(y \mid x) \neq p_{t+1}(y \mid x)$.
Concept drift poses significant issues when building machine learning models because it can cause changes in data distribution that make model performance gradually decline over time [20]. Hence, advanced online machine learning models need to be developed in order to detect and adapt to the concept drift that occurs in IoT data streams.

Drift Detection
Drift detection is a crucial element for adaptive machine learning models that can solve concept drift issues. The two primary categories of drift detection methods are performance-based and distribution-based [21].

Performance-Based Methods
Performance-based concept drift detection methods are based on changes in the metrics used to evaluate model performance. Common indicators of concept drift are a decline in accuracy and an increase in error rate. If the error rate of a learner gradually decreases or remains constant as more samples are learned, it often indicates a constant data distribution without drift. On the other hand, if a learner's error rate drastically increases as more data is processed, it often indicates the presence of concept drift. The two well-known performance-based drift detection methods are the drift detection method (DDM) and the early drift detection method (EDDM), which are able to detect concept drift by keeping track of degradations in performance. The DDM is a popular performance-based drift detection method that measures changes in the model error rate and its standard deviation using two predefined thresholds: the warning threshold and the drift threshold [19]. DDM often performs well on data streams with sudden drift, but its reaction time is often too slow for detecting gradual drift. The EDDM is an enhanced version of the DDM that detects concept drift using the same detection mechanism [10]. The EDDM often performs well on data streams with gradual drift. Even though the EDDM frequently performs better than the DDM, it still falls short for sudden drift. Furthermore, due to its sensitivity to noise, it may mistake noise for drift, resulting in false alarms.
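The two-threshold mechanism of the DDM can be illustrated with a minimal Python sketch. It follows the commonly described formulation in which the running error rate p and its standard deviation s are compared against the lowest p + s observed so far. The class name `SimpleDDM`, the default levels of 2 and 3 standard deviations, and the 30-sample warm-up are assumptions made for illustration and are not taken from the experiments in this paper.

```python
import math


class SimpleDDM:
    """Illustrative DDM sketch over a stream of 0/1 prediction errors."""

    def __init__(self, warning_level=2.0, drift_level=3.0, min_samples=30):
        self.warning_level = warning_level  # assumed default: 2 std devs
        self.drift_level = drift_level      # assumed default: 3 std devs
        self.min_samples = min_samples      # warm-up before any decision
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 0.0            # running error rate
        self.p_min = math.inf   # error rate at the best point so far
        self.s_min = math.inf   # std dev at the best point so far

    def update(self, error):
        """Feed 1 for a misclassification, 0 otherwise; return the state."""
        self.n += 1
        self.p += (error - self.p) / self.n
        s = math.sqrt(max(0.0, self.p * (1.0 - self.p)) / self.n)
        if self.n < self.min_samples:
            return "stable"
        # Remember the point where p + s was lowest.
        if self.p + s <= self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, s
        if self.p + s > self.p_min + self.drift_level * self.s_min:
            self.reset()  # start over on the new concept
            return "drift"
        if self.p + s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "stable"


# Synthetic error stream: 10% errors, then a sudden jump to 50% errors.
errors = [1 if i % 10 == 9 else 0 for i in range(1000)]
errors += [i % 2 for i in range(1000)]
detector = SimpleDDM()
drift_points = [i for i, e in enumerate(errors) if detector.update(e) == "drift"]
print("first drift signalled at sample", drift_points[0])
```

In this synthetic stream the error rate jumps abruptly, which is the sudden-drift situation in which the DDM is reported to work well.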

Distribution-Based Methods
Distribution-based concept drift detection is based on changes in data distributions. Data distribution changes can be measured using statistical variables such as mean, variance, and information entropy. Adaptive windowing (ADWIN) is one such widely used method that uses adaptive sliding windows to detect concept drift based on the statistical difference between two adjacent sub-windows. The adaptive windowing method uses characteristic values such as the mean and variance, as well as variable-size sliding windows, to detect concept drift. The window size is dynamically increased if there is no concept drift and reduced when concept drift is detected [9]. IoT systems with less memory often adopt distribution-based methods since they only need to retrain on the most recent samples. Windowing methods are often quick and simple to use. However, they could miss certain important historical data.
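The sub-window comparison behind adaptive windowing can be sketched as follows. This is a deliberately simplified version: it checks every split point of a bounded window directly, using a Hoeffding-style cut threshold, rather than the compressed bucket structure of the original ADWIN algorithm. The class name `SimpleADWIN`, the `delta` confidence value, the minimum sub-window size of 5, and the `max_window` cap are illustrative assumptions.

```python
import math
from collections import deque


class SimpleADWIN:
    """Simplified adaptive window: cut when two sub-window means differ."""

    def __init__(self, delta=0.002, max_window=500, min_side=5):
        self.delta = delta        # assumed confidence parameter
        self.min_side = min_side  # assumed minimum size of each sub-window
        self.window = deque(maxlen=max_window)

    def update(self, value):
        """Add a value in [0, 1]; return True if the window was shrunk."""
        self.window.append(value)
        drift = False
        while self._shrink_once():
            drift = True
        return drift

    def _shrink_once(self):
        n = len(self.window)
        if n < 2 * self.min_side:
            return False
        values = list(self.window)
        total = sum(values)
        delta_prime = self.delta / n
        head_sum = 0.0
        for n0 in range(1, n):
            head_sum += values[n0 - 1]
            n1 = n - n0
            if n0 < self.min_side or n1 < self.min_side:
                continue
            mean_old = head_sum / n0
            mean_new = (total - head_sum) / n1
            m = 1.0 / (1.0 / n0 + 1.0 / n1)  # harmonic-mean sample size
            eps_cut = math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))
            if abs(mean_old - mean_new) > eps_cut:
                # Drop the older sub-window; keep only recent samples.
                for _ in range(n0):
                    self.window.popleft()
                return True
        return False


# The mean of the monitored value shifts from 0.1 to 0.9 at sample 300.
stream = [0.1] * 300 + [0.9] * 300
adwin = SimpleADWIN()
cuts = [i for i, v in enumerate(stream) if adwin.update(v)]
print("window cut at samples:", cuts, "| final window size:", len(adwin.window))
```

The window keeps growing while the mean is stable and is cut back to the recent samples once the two sub-windows disagree, which mirrors the grow-and-shrink behaviour described above.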

Drift Adaptation
Concept drift must be handled after it is detected by updating the current models using appropriate drift adaptation methods. Drift adaptation is a procedure used to update a model automatically when concept drift occurs, in order to enhance the performance of the model and detect zero-day attacks. These procedures usually either fully retrain or alter the learning model using new data. The three categories of drift adaptation methods are model retraining, incremental learning, and ensemble learning.

Model Retraining
One of the simplest and most straightforward methods to handle concept drift is model retraining. Offline models are often unable to accurately predict unseen incoming streaming data. This problem can be addressed by retraining the model using the most recent data streams. However, employing this technique can cause unnecessary model retraining or drift adaptation delays. Therefore, it is crucial to use learning models together with an appropriate drift detection method to determine when to retrain the learning model for timely and necessary updates.
There are two types of model retraining methods: full retraining and partial retraining. Retraining the learning model using the entire dataset and all available samples is known as full retraining, while partial retraining trains the model on selected parts of the dataset. The window-based method is used to partially retrain a model using the most recent data. This reduces training times but may result in the loss of historical patterns. Hence, selecting the right window size is crucial. ADWIN is a drift detection method that uses model retraining. ADWIN performs better because it uses a dynamic window to fit new data [9]. The learning model is partially retrained on only the new concept samples in order to save training time.

Incremental Learning Methods
Incremental learning has become commonly used in data stream analytics research. Incremental learning involves updating the learning model as each instance is processed. Incremental learning methods often partially update the learning model to fit a new data sample [22]. Because they support progressive learning, they do not require a large amount of data to be available before training begins. However, only a few machine learning algorithms, such as MLP and multinomial NB, support partial updates.
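The idea of a per-instance partial update can be illustrated with a small Python sketch. An online logistic regression trained by stochastic gradient descent is used here purely as a stand-in because it is compact; it is not one of the learners evaluated in this paper, and the class name, learning rate, and toy stream are assumptions.

```python
import math


class OnlineLogisticRegression:
    """Binary classifier that is updated one sample at a time (SGD)."""

    def __init__(self, n_features, lr=0.5):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr  # assumed learning rate

    def predict_proba_one(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # keep exp() numerically safe
        return 1.0 / (1.0 + math.exp(-z))

    def predict_one(self, x):
        return 1 if self.predict_proba_one(x) >= 0.5 else 0

    def learn_one(self, x, y):
        """Partial update: one gradient step on a single (x, y) pair."""
        err = self.predict_proba_one(x) - y
        self.w = [wi - self.lr * err * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * err


# Toy stream: the label is 1 when the single feature is positive.
model = OnlineLogisticRegression(n_features=1)
for i in range(200):
    x = ((i % 10) + 1) / 10.0 * (1 if i % 2 == 0 else -1)
    model.learn_one([x], 1 if x > 0 else 0)
print(model.predict_one([0.5]), model.predict_one([-0.5]))
```

No sample is stored after it has been used, which is the property that keeps the memory footprint of incremental learners low.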

Ensemble Learning Methods
Ensemble learning approaches have been developed to provide powerful learners for data stream analytics and to enable stronger concept drift adaptation. Ensemble learning combines multiple base learners to tackle the same problem [23]. The base learners of an ensemble can be constructed using different algorithms and different hyperparameter configurations. Ensemble learning models are often more generalizable than single models because they combine the outputs of multiple base learners. On data streams with recurrent concept drift, reusing existing models in an ensemble is significantly more effective for concept drift adaptation than training new models [24].
Block-based ensembles and online ensembles are the two main categories of ensemble techniques used in data stream analytics [25]. Block-based ensembles divide data streams into fixed-size blocks and then train a base learner on each block. The base learners are evaluated and updated each time a new block arrives. Block-based ensembles react to gradual drifts accurately, but they frequently take longer to respond to sudden drifts. Three common block-based ensembles are the accuracy-weighted ensemble (AWE) [26], the accuracy-updated ensemble (AUE) [27], and the streaming ensemble algorithm (SEA) [25]. AUE often performs best among the block-based ensembles [26]. AUE uses non-linear error functions to apply weights to base learners in order to improve performance.
To enhance the learning performance of online ensembles, different incremental learning models such as Hoeffding trees (HTs) are included. Gomes et al. [11] proposed the adaptive random forest (ARF) approach, which makes use of HTs as base learners and ADWIN as a drift detector. The drift detection mechanism replaces underperforming base trees with new trees that better fit the new concept. Since random forest is a well-known, effective machine learning algorithm, ARF frequently outperforms many other methods. ARF also includes a powerful re-sampling method and the flexibility to accommodate various drift types.
Gomes et al. [12] also proposed streaming random patches (SRPs) as an adaptive ensemble approach. Although their execution time is usually longer, SRPs often exhibit slightly better prediction accuracy than ARF. Leveraging bagging (LB) [28] is another online ensemble that constructs base learners using bootstrap samples. Although LB is simple to construct, it usually performs worse than SRPs and ARF. Despite the fact that there are several concept drift adaptation strategies in use today, their efficacy is constrained by slow drift response and poor prediction accuracy.
Incremental learning methods often perform poorly due to their low model complexity and limited drift adaptability [29], whereas block-based ensembles face significant difficulties in determining block size and responding quickly to drift. Online ensembles, like ARF and SRPs, often outperform incremental learning and block-based ensemble methods. However, because of their randomization strategies, they give unstable learning models, adding more unpredictability to DDoS attack detection.
In line with the previous research, this research aims to detect DDoS attacks in IoT systems while addressing the concept drift issue. In particular, this paper proposes adaptive online DDoS attack detection using the ensemble learning method to address the issue of concept drift in IoT system DDoS attack detection.

Overview of Proposed Framework
The proposed online DDoS attack detection framework using online data stream analytics in IoT systems has three main steps: preprocessing IoT data, DDoS detection using base models, and DDoS detection using online ensemble methods. The K-means clustering approach is used to acquire a more representative subset of the incoming IoT data streams. In order to standardize the sample data distribution and scale every feature, Z-score and min-max normalization are used. This step is discussed in Section 3.2. To handle concept drift adaptation and perform DDoS attack detection, four base learners, ARF-ADWIN, ARF-DDM, SRPs-DDM, and KNN-ADWIN, are developed. The development of the base learners is presented in Section 3.3. These base learners are made to adapt to the changing data distribution and detect DDoS attacks in the data stream. Finally, the proposed AUWPAE approach, which is based on the ensemble model, is discussed in Section 3.4. AUWPAE combines the predictions of the base models using their real-time mean square error rates. Figure 1 shows the proposed drift adaptive online DDoS attack detection framework for IoT systems.
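The general idea of combining base-model probabilities with error-dependent weights can be sketched as below. The exact AUWPAE weighting is defined in Section 3.4 and is not reproduced here; this sketch assumes, purely for illustration, that each base learner is weighted by the inverse of its running mean squared error and that class probabilities are averaged with those weights. The class name and the `eps` smoothing constant are hypothetical.

```python
class WeightedProbabilityEnsemble:
    """Illustrative error-weighted probability averaging (not AUWPAE itself)."""

    def __init__(self, n_learners, n_classes=2, eps=1e-6):
        self.n_classes = n_classes
        self.eps = eps                      # avoids division by zero
        self.sq_err = [0.0] * n_learners    # accumulated squared error
        self.seen = 0

    def weights(self):
        n = len(self.sq_err)
        if self.seen == 0:
            return [1.0 / n] * n
        mse = [e / self.seen for e in self.sq_err]
        raw = [1.0 / (m + self.eps) for m in mse]  # assumed inverse-MSE rule
        total = sum(raw)
        return [r / total for r in raw]

    def predict_proba(self, learner_probas):
        """Average the per-learner class probabilities using the weights."""
        w = self.weights()
        return [
            sum(w[k] * p[c] for k, p in enumerate(learner_probas))
            for c in range(self.n_classes)
        ]

    def update(self, learner_probas, true_label):
        """Update each learner's running squared error once the label is known."""
        for k, p in enumerate(learner_probas):
            self.sq_err[k] += sum(
                (p[c] - (1.0 if c == true_label else 0.0)) ** 2
                for c in range(self.n_classes)
            )
        self.seen += 1


# Two hypothetical base learners: one mostly right, one mostly wrong.
ensemble = WeightedProbabilityEnsemble(n_learners=2)
good, bad = [0.9, 0.1], [0.2, 0.8]
for _ in range(10):
    ensemble.update([good, bad], true_label=0)
print(ensemble.weights(), ensemble.predict_proba([good, bad]))
```

Because the weights are recomputed from errors observed on the stream, a base learner that degrades after a drift loses influence without being removed from the ensemble.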

Data Preprocessing
Data preprocessing is a basic step in all machine learning applications, including DDoS attack detection. The performance and accuracy of the detection method can be significantly impacted by the representation, size, and quality of the incoming data. In particular, selecting a dataset with high dimensionality and a large number of duplicate and irrelevant features will affect training. To address these issues, in the data preprocessing phase, we used data cleaning, data encoding, data normalization, and feature selection techniques.

Data Cleaning
Each selected dataset consists of a number of distinct files that are merged into a single file. The merged file is referred to as the combined dataset. IoT data values are generated in real-world applications as words or strings. Real-world datasets frequently have missing values due to data accessibility issues or challenges in gathering the required data. Missing values such as blank spaces, NaNs, and erroneous data types can be represented using null values. Most machine learning models, however, cannot deal with missing values directly or are negatively impacted by them during learning. The goal of data cleaning is to fill the gaps left by missing data with acceptable values [28]. A number of basic cleaning techniques can be used to fill missing values. In this research, we replaced the missing values with the mean of each column.
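Column-mean imputation can be sketched in a few lines of Python. The function name and the convention that a missing value is represented as `None` or NaN are assumptions made for this example; the fallback of 0.0 for a column with no observed values is also an illustrative choice.

```python
import math


def _is_missing(value):
    return value is None or (isinstance(value, float) and math.isnan(value))


def impute_column_means(rows):
    """Replace None/NaN entries with the mean of the observed column values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not _is_missing(r[j])]
        # Assumed fallback: use 0.0 when a column has no observed value.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if _is_missing(v) else v for j, v in enumerate(r)]
        for r in rows
    ]


rows = [[1.0, None], [3.0, 4.0], [float("nan"), 8.0]]
print(impute_column_means(rows))
```

In a streaming deployment the column means would have to come from a reference sample or be maintained incrementally, since the full dataset is not available in advance.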
Electronics 2024, 13, x FOR PEER REVIEW 8 of 21

Data Sampling
The selection of highly representative data samples requires the use of an efficient data sampling method. The proposed framework uses the K-means-based cluster sampling method to obtain a representative subset. Clustering algorithms play an important role in unsupervised learning by grouping data samples based on their similarities. One of the common clustering methods is K-means, which divides an unlabeled dataset into K clusters based on the degree of similarity between data points.
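A minimal sketch of K-means-based cluster sampling is given below. It assumes that the same fraction of points is drawn at random from every cluster; the paper does not state the per-cluster sampling rule, so this proportional rule, the function names, and the fixed number of iterations are illustrative assumptions.

```python
import random


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns a cluster index for every point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_sample(points, k, fraction, seed=0):
    """Return sorted indices of a subset drawn proportionally per cluster."""
    labels = kmeans_labels(points, k, seed=seed)
    rng = random.Random(seed)
    chosen = []
    for c in range(k):
        idx = [i for i, l in enumerate(labels) if l == c]
        if idx:
            take = min(len(idx), max(1, round(fraction * len(idx))))
            chosen.extend(rng.sample(idx, take))
    return sorted(chosen)


# Two well-separated groups of 50 points each; keep 20% of every cluster.
points = [(i * 0.01, i * 0.01) for i in range(50)]
points += [(10 + i * 0.01, 10 + i * 0.01) for i in range(50)]
subset = cluster_sample(points, k=2, fraction=0.2)
print(len(subset), "samples kept out of", len(points))
```

Sampling within clusters rather than over the whole dataset keeps small but distinct traffic patterns represented in the reduced subset.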

Data Encoding
In real-world IoT data, values are generated as words or strings to make them human-readable. Data encoding involves the conversion of string information into numerical features that machine learning models are able to understand and process. Label, one-hot, and target encoding are examples of frequently used encoding methods. In this research, we used target encoding to replace categorical values with the means of the target variable.
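Target encoding can be sketched as follows: each category is replaced by the mean of the target over the rows in which it occurs. The function names, the example protocol values, and the use of the global target mean for unseen categories are assumptions made for illustration.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets):
    """Return ({category: mean target}, global target mean)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, t in zip(categories, targets):
        sums[c] += t
        counts[c] += 1
    mapping = {c: sums[c] / counts[c] for c in sums}
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean


def apply_target_encoding(categories, mapping, default):
    """Encode categories; unseen values fall back to `default` (assumed rule)."""
    return [mapping.get(c, default) for c in categories]


# Hypothetical categorical feature (protocol) with a binary attack label.
protocols = ["tcp", "udp", "tcp", "icmp"]
labels = [1, 0, 0, 1]
mapping, prior = fit_target_encoding(protocols, labels)
print(mapping, apply_target_encoding(["udp", "arp"], mapping, prior))
```

The mapping should be fitted on training data only, since computing it on data that is later used for evaluation leaks label information into the feature.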

Data Normalization
The attributes in the combined dataset have different value ranges: some features have values that are spread out over a large range, while others are spread out over a small range. This variation could affect the performance of the models. To solve this issue, we used normalization techniques to scale the feature values to the same range. Two common normalization methods for data analytics are Z-score and min-max normalization. The min-max normalization approach transforms the original data linearly to achieve a balance of comparative values between the data before and after processing. The min-max normalization method scales a feature's values to a range between 0 and 1. The min-max scaling approach is shown in Equation (1) [30]:

$$X_{new} = \frac{X - Min}{Max - Min} \qquad (1)$$

where $X_{new}$ is the result of normalization and $X$ is the value to be normalized; $Min$ and $Max$ are the minimum and maximum values in each feature. The Z-score normalization method is based on the mean and standard deviation of the data. This method is helpful if the minimum and maximum values of the data are unknown. Equation (2) shows the formula for Z-score normalization [31]:

$$X_{new} = \frac{X - \mu}{\delta} \qquad (2)$$

where $X_{new}$ is the result of normalization, $X$ is the value to be normalized, $\mu$ is the population mean, and $\delta$ is the standard deviation. Min-max normalization can keep outliers in datasets; hence, it is more appropriate for DDoS attack detection models. On the other hand, Z-score normalization is robust against outliers; hence, it usually works well for data analytics problems related to non-outliers. As a result, the proposed framework uses both Z-score and min-max normalization to handle this issue.
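Both normalization methods can be written directly from their definitions, with min-max scaling using the minimum and maximum of a feature and Z-score scaling using its population mean and standard deviation. The function names and the handling of constant features (returning zeros) are assumptions for this sketch; how the two methods are combined inside the framework is not shown here.

```python
import statistics


def min_max_normalize(values):
    """Scale values linearly to [0, 1] using the feature's min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:  # assumed convention for a constant feature
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_normalize(values):
    """Center on the population mean and divide by the population std dev."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:  # assumed convention for a constant feature
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


feature = [2.0, 4.0, 6.0]
print(min_max_normalize(feature))
print(z_score_normalize(feature))
```

On a data stream the minimum, maximum, mean, and standard deviation are not known in advance, so in practice they would be estimated from an initial sample or updated incrementally.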

Feature Selection
Feature selection is used to choose a subset of the original feature set in order to increase the performance and speed of machine learning models. The best-performing minimal feature set can be identified by evaluating all feature combinations. However, with a high number of features, optimization approaches can be used to examine the feature search space and determine the optimal feature set. Information gain (IG) and Pearson correlation methods are used to remove irrelevant and duplicate features, respectively.
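The two filters can be sketched together: features whose information gain with respect to the label is below a threshold are treated as irrelevant, and a feature that is highly correlated with an already selected feature is treated as a duplicate. The thresholds `ig_min` and `corr_max`, the function names, and the restriction to discrete feature values (continuous features would first need binning) are assumptions for this example.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG of a discrete feature: H(labels) - H(labels | feature)."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_features(columns, labels, ig_min=0.01, corr_max=0.95):
    """columns maps a feature name to its list of discrete numeric values."""
    relevant = [
        name for name, col in columns.items()
        if information_gain(col, labels) > ig_min
    ]
    selected = []
    for name in relevant:
        if all(abs(pearson(columns[name], columns[s])) <= corr_max
               for s in selected):
            selected.append(name)
    return selected


labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "a": [0, 0, 1, 1, 0, 1, 0, 1],      # informative
    "b": [0, 0, 2, 2, 0, 2, 0, 2],      # duplicate of "a" (scaled copy)
    "c": [0, 1, 1, 1, 0, 1, 0, 0],      # partly informative
    "noise": [1, 1, 1, 1, 1, 1, 1, 1],  # constant, carries no information
}
print(select_features(columns, labels))
```

Applying the information gain filter first means the pairwise correlation step only has to compare the features that survive it.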

Drift Adaptation Base Model Selection
The proposed framework consists of four base models that are used to construct the online ensemble model (see Figure 1).The framework aims to balance learning performance and efficiency since it is designed for online IoT systems.Most of the existing data stream classification algorithms specialize in only one type of drift.Some classifiers are best suited for gradual drift while others are suited for sudden drifts.Our research aims to develop a data stream classifier that reacts to both types of concept drift.In this research, the two commonly used drift detection methods, ADWIN and DDM, are used together to detect concept drift.ADWIN's sliding window can be expanded to a large-size window to detect long-term changes.This makes ADWIN an ideal match for data streams with gradual drift.DDM often works well on data streams with sudden drift, but its reaction time is often too slow for detecting gradual drift.
Online ensembles can improve learning performance using various incremental learning models such as Hoeffding trees (HTs). The adaptive random forest (ARF) method uses HTs as base learners and ADWIN as a default drift detector [11]. The drift detection mechanism replaces underperforming base trees with new trees that fit the new concept. Since the random forest method is an effective technique, ARF often outperforms a variety of other methods. ARF makes effective use of re-sampling, and it adapts to a variety of drift types.
Streaming random patches (SRPs) is an alternative adaptive ensemble method for streaming data analytics [12]. SRPs use a combination of online bagging and random subspace algorithms for predictions. The approach is similar to ARF, except that SRPs employ global subspace randomization whereas ARF uses local subspace randomization. Global subspace randomization is a flexible technique for increasing the diversity of base learners. Although the execution time of SRPs is longer than that of ARF, their prediction accuracy is frequently slightly higher.
ARF and SRPs are advanced drift adaptation approaches that have shown better performance in experimental research than other drift adaptation approaches [11,12]. ARF is an advanced ensemble model that builds HTs for drift adaptation using local subspace randomization and uses a drift detector for drift detection. ARF has demonstrated good performance and excellent execution time compared to other methods in solving data stream analytics problems [11]. As a result, ARF with the ADWIN drift detector (ARF-ADWIN) and ARF with the DDM drift detector (ARF-DDM) are selected as base models for the proposed ensemble framework. The ensemble uses two ARF models with different drift detectors to preserve high accuracy.
The other two models are then selected from among seven online learning methods (LB, SRPs, OPA, PWPAE, SRPs-ADWIN, SRPs-DDM, and KNN-ADWIN) by considering execution time and performance. LB [32], SRPs, and SRPs-ADWIN can handle concept drift effectively, but they are not as effective as ARF. Although the computational cost of HTs is low, this method is not selected due to performance limitations. OPA and PWPAE [29] are not selected because of their high computational complexity. Hence, ARF-ADWIN, ARF-DDM, SRPs-DDM, and KNN-ADWIN are selected for our proposed online learning framework. Below is a summary of the justifications for selecting these models.

1. ADWIN performs better for detecting gradual drift whereas DDM performs effectively for detecting sudden drift. Combining these two makes it possible to detect both sudden and gradual concept drifts.

2. Experimental results demonstrate that ARF-ADWIN and ARF-DDM perform better at handling concept drift than other drift adaptation approaches do. Since ARF-ADWIN and ARF-DDM are extensions of the adaptive random forest (ARF) method, the proposed framework inherits the ability to adapt to concept drift and improve overall system performance.

3. The selected models, ARF-ADWIN, ARF-DDM, SRPs-DDM, and KNN-ADWIN, result in better performance while maintaining a shorter execution time. Hence, they are well suited for online DDoS attack detection and concept drift adaptation.

Accuracy Update Weighted Probability Averaging Ensemble (AUWPAE)
In this research, a novel Accuracy Update Weighted Probability Averaging Ensemble (AUWPAE) is proposed for combining the base learners for IoT data stream analytics. AUWPAE makes use of the advantages of the previous approaches while mitigating their limitations. AUWPAE assigns dynamic weights to base learners in accordance with their real-time performance. It uses a weighted average: the prediction probability of each base learner is multiplied by the weight assigned to that learner, and the products are summed to determine the final prediction probability. The predicted class of the proposed model is the one with the highest average probability [22]. Consider a data stream, $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, with $c$ different target classes, $y \in \{1, \ldots, c\}$. The predicted target class, $\hat{y}$, for each data input, $x$, can be expressed mathematically using Equation (3) [20]:

$$\hat{y} = \arg\max_{i \in \{1, \ldots, c\}} \frac{\sum_{j=1}^{b} w_j \, p_j(y = i \mid L_j, x)}{b} \quad (3)$$

where $L_j$ stands for the $j$th base learner model; $p_j(y = i \mid L_j, x)$ is the probability of predicting class value $i$ on a data sample $x$ using the $j$th base learner, $L_j$; $b$ is the number of base learner models, which in our case is $b = 4$; and $w_j$ denotes the weight of each base learner model, $L_j$. The weight of a base learner can be quantitatively assessed using the accuracy-updated ensemble method [26]. For every data stream, the weight of the base learner model, $w_j$, is calculated by estimating the mean square error rate on the data stream $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, as shown in Equations (4)-(6):

$$MSE_{ij} = \frac{1}{|x_i|} \sum_{(x, y) \in x_i} \left(1 - f_y^j(x)\right)^2 \quad (4)$$

$$MSE_r = \sum_{y} p(y) \left(1 - p(y)\right)^2 \quad (5)$$

$$w_{ij} = \frac{1}{MSE_r + MSE_{ij} + \epsilon} \quad (6)$$

where the function $f_y^j(x)$ represents the probability given by the base learner model, $L_j$, that $x$ is an instance of class $y$. The accuracy-updated ensemble algorithm considers probabilities for all classes instead of making predictions for a single class. $MSE_{ij}$ evaluates the prediction error of the base learner model, $L_j$, on datum $x_i$. $MSE_r$ is the mean square error of a randomly predicting base learner model, computed from the current class distribution $p(y)$, and is used as a reference point for the current class distribution. To avoid division by zero, a very small positive value, $\epsilon$, is also added to the equation. Equation (6) is used as the weighting formula to combine the accuracy of the base learner model with the current class distribution. In addition to assigning ensemble members new weights for each datum, $x_i$, a candidate base learner model, $L_j$, is built from instances in the most recent data. $L_j$ is treated as a "perfect" base learner model, since it is trained on the most recent data. Equation (7) is used to determine its weight:

$$w_{L_j} = \frac{1}{MSE_r + \epsilon} \quad (7)$$
Unlike the function used to weight the members of the current ensemble, the weight of the candidate base learner model, L j , does not take the prediction error of L j into consideration. This approach is based on the assumption that the most recent data reflect the distribution of the present and near-future data. L j is considered to be the best base learner model available because it is trained using the most recent data. The proposed Accuracy Update Weighted Probability Averaging Ensemble algorithm is given in Algorithm 1.
The computational complexity of the AUWPAE model is primarily determined by the complexity of the selected base models; the AUWPAE method itself has a linear computational complexity of O(pwb), where p is the prediction probability of the base learner, b is the number of base learner models (b = 4 in our case), and w denotes the weight of each base learner model. Most of the performance comparisons are conducted with the base learners used in the proposed framework. AUWPAE adds a weighting algorithm of linear complexity on top of the base learners. Hence, the proposed framework should have nearly the same complexity as its base learners, and this complexity can be reduced further by substituting lower-complexity base learners.
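The weighting and combination steps can be illustrated with a short Python sketch. This is not the authors' Algorithm 1: it assumes the accuracy-updated weighting in its commonly stated form (learner weight = 1 / (MSE_r + MSE_j + ε), with MSE_r computed from the class priors) and a weighted probability average divided by the number of learners b. All function names, the learner labels, and the probability values are hypothetical.

```python
EPSILON = 1e-9  # small constant that avoids division by zero

def mse_learner(true_class_probs):
    """Mean of (1 - f_y(x))^2 over recent samples, where f_y(x) is the
    probability the learner assigned to the true class of each sample."""
    return sum((1.0 - p) ** 2 for p in true_class_probs) / len(true_class_probs)

def mse_random(class_priors):
    """Error of a learner that predicts at random according to the class priors."""
    return sum(p * (1.0 - p) ** 2 for p in class_priors.values())

def learner_weight(mse_j, mse_r):
    """Lower recent error gives a higher weight."""
    return 1.0 / (mse_r + mse_j + EPSILON)

def weighted_vote(probas, weights):
    """Average the weighted class probabilities over the b learners and take the argmax."""
    classes = sorted({c for p in probas.values() for c in p})
    b = len(probas)
    scores = {
        c: sum(weights[name] * p.get(c, 0.0) for name, p in probas.items()) / b
        for c in classes
    }
    return max(scores, key=scores.get), scores

# Hypothetical recent performance of b = 4 base learners.
recent_true_class_probs = {
    "arf_adwin": [0.9, 0.8, 1.0],
    "arf_ddm": [0.9, 0.9, 0.9],
    "srp_ddm": [0.6, 0.7, 0.5],
    "knn_adwin": [0.5, 0.5, 0.5],
}
class_priors = {"benign": 0.5, "ddos": 0.5}
mse_r = mse_random(class_priors)
weights = {
    name: learner_weight(mse_learner(history), mse_r)
    for name, history in recent_true_class_probs.items()
}

# Hypothetical class probabilities from each learner for one new sample.
probas = {
    "arf_adwin": {"benign": 0.3, "ddos": 0.7},
    "arf_ddm": {"benign": 0.3, "ddos": 0.7},
    "srp_ddm": {"benign": 0.7, "ddos": 0.3},
    "knn_adwin": {"benign": 0.8, "ddos": 0.2},
}
label, scores = weighted_vote(probas, weights)

# For comparison: the same vote with equal weights.
plain_label, _ = weighted_vote(probas, {name: 1.0 for name in probas})
```

In this toy case the unweighted average favours "benign", while the accuracy-based weights shift the decision to "ddos" because the two learners with the lowest recent error agree on it. Since the weights are not normalized, the resulting scores are used only for ranking the classes and are not probabilities.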
We developed AUWPAE to address both sudden and gradual drifts and to enhance the detection and adaptability capabilities of current methods, building on the advantages of the previous approaches while mitigating their limitations. Further, this study aims to propose a novel ensemble framework that achieves a trade-off between execution time and predictive accuracy.
To evaluate the proposed online DDoS detection framework, we used two security datasets: CICIoT2023 and IoTID20. The CICIoT2023 dataset was developed in 2023 by the Canadian Institute for Cybersecurity [33]. To generate the attack traffic, 33 attacks were executed with 105 IoT devices as targets. The traffic includes normal traffic and attack traffic from the DDoS, DoS, Recon, Web-based, brute force, spoofing, and Mirai categories. The dataset contains the most common and current attacks as of this time. However, for this experiment, we use only DDoS and normal traffic. CICIoT2023 is a new, realistic IoT dataset that was created by generating real IoT device traffic data from both legitimate and malicious IoT devices and that includes different DDoS attacks.
The IoTID20 dataset has been used for developing DDoS attack detection in several studies [34,35]. The authors of the IoTID20 dataset [36] performed binary and multiclass classification and reported the accuracy scores for different classifier methods. The DDoS attack types included in the IoTID20 dataset are Mirai-ACK flooding, Mirai brute force, Mirai-HTTP flooding, and Mirai-UDP flooding. The IoTID20 dataset is a relatively new dataset that considers IoT devices while containing DDoS attacks, and several recent works have used it to develop IDS [37]. IoTID20 focuses on IoT security and provides a wide range of attack and normal samples from various IoT devices.

Performance Metrics
The proposed framework is analyzed from a variety of perspectives to provide a comprehensive view of the experiment. Four performance metrics, accuracy, precision, recall, and F1-score, are the primary metrics we used to evaluate the performance of the proposed framework. Latency and throughput are also measured to evaluate the learning and detection speed.
In this case, latency refers to the delay in the response time to a specific input. It represents the typical processing time needed by our model to analyze and classify each sample. In contrast, throughput is a measure of how much data the system can process in a given amount of time. In our context, it corresponds to the number of samples our model can analyze and classify per unit of time. Low latency and high throughput are two essential performance criteria for machine learning and data analytics models. Efficient learning models should maintain a balance between prediction accuracy and latency in order to achieve real-time analytics.
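The metrics above can be computed as in the following sketch. It assumes binary classification with DDoS as the positive class and sequential, one-sample-at-a-time processing; the function names, the confusion-matrix counts, and the stand-in classifier are hypothetical and only serve to illustrate the definitions.

```python
import time

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def measure_speed(classify, samples):
    """Average per-sample latency in ms and throughput in samples per second."""
    start = time.perf_counter()
    for sample in samples:
        classify(sample)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)
    latency_ms = 1000.0 * elapsed / len(samples)
    throughput = len(samples) / elapsed
    return latency_ms, throughput

# Hypothetical counts: 90 attacks caught, 10 false alarms, 880 benign passed, 20 attacks missed.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)

# Stand-in classifier used only to exercise the timing helper.
samples = [[0.1, 0.2, 0.3]] * 1000
latency_ms, throughput = measure_speed(lambda s: sum(s) > 1.0, samples)
```

Under sequential processing, throughput is simply the reciprocal of the average latency, so a model with an average latency of about 1 ms handles roughly 1000 samples per second.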

Experiment Environment
The proposed online DDoS detection framework aims to detect DDoS attacks on IoT systems. To observe the performance of the proposed approach, we developed a prototype using the Python 3.10 programming language in the Jupyter Notebook environment. The River library [35] was used for data stream analytics and for addressing concept drift through machine learning. The experiment was conducted on a machine with an Intel(R) Xeon(R) CPU @ 2.20 GHz and 16 GB of RAM.

Experimental Results and Discussion
The datasets used during the experiments are IoTID20 and CICIoT2023. The attack patterns in both datasets change over time, resulting in three and seven concept drifts in the IoTID20 and CICIoT2023 datasets, respectively. These changes, and in particular the occurrence of concept drifts in the datasets, show the dynamic nature of IoT streaming traffic. The concept drifts are marked in Figures 2 and 3 using black arrows. Except for the third and the last drifts shown for the CICIoT2023 dataset in Figure 3, which are gradual drifts, all concept drifts shown for the IoTID20 dataset in Figure 2 and the CICIoT2023 dataset in Figure 3 are sudden drifts. The performance of the proposed model, AUWPAE, is compared with that of other state-of-the-art online adaptive learning methods, namely ARF-ADWIN, ARF-DDM, SRPs-ADWIN, SRPs-DDM, KNN-ADWIN, HTs, LB, and PWPAE, using the two datasets. The experimental results are presented in Figures 2 and 3 and Tables 2 and 3.
Figure 2 and Table 2 show performance comparisons of the proposed model, AUWPAE, with other previously proposed models, i.e., ARF-ADWIN, ARF-DDM, SRPs-ADWIN, SRPs-DDM, HTs, LB, KNN-ADWIN, and PWPAE, using the IoTID20 dataset. The performance metrics used for comparison were accuracy, precision, recall, F1-score, latency, and throughput. In terms of accuracy, the proposed model achieved an average accuracy of 99.54%. This result shows that AUWPAE outperforms the other online adaptive learning methods. The better performance of the proposed model is mainly attributed to the weighting algorithm used for the ensemble: the proposed model is an extension of the AUE, and the weighting of classifiers is performed using a non-linear error function, which contributes to performance enhancement. The two selected base models, SRPs-DDM and KNN-ADWIN, also achieved good accuracy, 98.84% and 99.13%, respectively, on the IoTID20 dataset. In terms of precision, recall, and F1-score, the experimental results show that the proposed model achieved 99.51%, 99.99%, and 99.76%, respectively. These results show that the proposed approach has more accurate and precise DDoS attack detection capability than the other solutions and is robust to concept drifts.
In terms of latency, the proposed model takes an average of 2.73 ms to classify a given sample as attack or normal on the IoTID20 dataset (see Figure 2 and Table 2). The other online adaptive models HTs, ARF-ADWIN, ARF-DDM, KNN-ADWIN, and LB performed better than the proposed approach, AUWPAE, in terms of processing time; AUWPAE is faster only than PWPAE, SRPs-ADWIN, and SRPs-DDM. The longer processing time of the proposed model is mainly attributed to the ensemble algorithm. However, within the proposed ensemble model, ARF-ADWIN and ARF-DDM contribute positively to latency, since these two models achieved excellent latency compared to the other models: 0.24 ms and 0.26 ms, respectively, on the IoTID20 dataset. ARF-ADWIN is the second fastest approach in the experiment, while KNN-ADWIN is the third.
For this experimental setup, the last performance metric used was throughput. The experimental results show that the average throughput achieved by the proposed model (AUWPAE) is 365 samples/second. This means that AUWPAE achieves higher throughput than PWPAE, SRPs-DDM, and SRPs-ADWIN, while ARF-DDM, LB, HTs, and KNN-ADWIN achieve higher throughput than the proposed model. The relatively high throughput of these methods corresponds inversely to their latency. Their accuracy and F1-score values, however, are lower than those of the proposed approach. This shows that even though the proposed system is slower to classify data samples, by a margin on the order of milliseconds, it is able to detect DDoS attacks better than the other methods in the presence of concept drift.
Figure 3 and Table 3 show the experimental results of the proposed model and the other methods considered in this research on the CICIoT2023 dataset. In terms of accuracy, the proposed model, AUWPAE, achieved 99.33%, which is the highest among the methods in this experiment. Moreover, in terms of precision, recall, and F1-score, AUWPAE achieved 98.88%, 96.53%, and 97.98%, respectively, which are better than the values of the other methods. These results indicate that the proposed approach is able to detect DDoS attacks more accurately than the other methods in the presence of concept drifts.
The latency of AUWPAE is 1.29 ms. AUWPAE is relatively slow compared with HTs, ARF-ADWIN, KNN-ADWIN, and LB, and fast compared with PWPAE, SRPs-DDM, and SRPs-ADWIN. LB is the fastest method but has lower accuracy and a lower F1-score than AUWPAE. The higher latency of AUWPAE is mainly attributed to the ensemble algorithm. However, the ensemble is also what gives it the advantage of detecting DDoS attacks accurately. This result is consistent with the result on the IoTID20 dataset.
AUWPAE showed better accuracy and F1-score values for both datasets while taking relatively more time (in ms) to process the data. This indicates that AUWPAE is able to classify normal and DDoS traffic accurately even when three or more concept drifts occur in a short period of time. In real-world contexts, however, concept drifts would happen less frequently. Hence, AUWPAE is expected to perform better in a real-world scenario. Even though AUWPAE is relatively slow, we believe that the accuracy of the classification plays an important role in making a decision to protect an IoT system. Moreover, the latency of AUWPAE is less than the latency requirement of IoT production applications [38].

Proposed Solution Deployment Location
This section examines the potential deployment and expected performance of the proposed adaptive online DDoS attack detection framework. IoT endpoints are commonly used for IoT data stream collection, IoT edge servers are used for preliminary DDoS attack detection analytics, and IoT cloud servers are used for comprehensive DDoS attack detection. The proposed AUWPAE model could be deployed in the IoT cloud, on edge servers for IoT DDoS attack detection, or on the control server located at the edge layer. The first option is deploying it on IoT edge devices, which provide quick data processing and reduce the volume of data transferred over long distances but typically have limited computational capacity. Edge computing allows the detection of local DDoS attacks on edge servers or control servers at the edge layer. Control servers can perform preliminary and fundamental IoT DDoS attack detection jobs locally, including data pre-processing and feature selection. Deploying high-performance computing equipment at the IoT edge layer will significantly reduce latency. Hence, the proposed deployment location places the DDoS attack detection solution at the IoT edge using high-performance computing equipment.
The other option is deploying it on IoT cloud servers for DDoS attack detection. IoT cloud servers often include several cloud machines with high computational power and resources, allowing them to use cloud computing to carry out complex DDoS attack detection activities. However, deploying the proposed framework on a cloud server poses a high risk to the privacy of the data. Additionally, the classification of IoT data is delayed by the requests to and responses from the central cloud server. Figures 4 and 5 show the proposed framework's deployment location at edge servers and in the IoT cloud, respectively.

Conclusions
IoT DDoS attack detection solutions are usually developed to protect IoT systems from DDoS attacks using IoT data stream analytics. However, IoT data are usually dynamic and could have concept drifts. This paper provides a novel Accuracy Update Weighted Probability Averaging Ensemble (AUWPAE) approach to detect concept drift and perform zero-day DDoS detection. We evaluated the proposed model using the IoTID20 and CICIoT2023 datasets with benign and DDoS traffic data that had concept drifts. The results show that AUWPAE achieved better accuracies, 99.54% and 99.33% for the respective datasets, than the other eight models. This result indicates that the proposed adaptive online DDoS attack detection framework, which uses AUWPAE, is able to detect DDoS attacks in the presence of concept drifts. In this paper, we also presented the IoT DDoS attack detection solution deployment framework for IoT systems.
As part of future work, we plan to implement the two deployment scenarios of the proposed approach described in Section 6 in a real-world setting.We also plan to further investigate the use of other algorithms as base learners.


Figure 4. Proposed framework deployment location at IoT Edge Server.

Table 1. Summary of related works. ✗ indicates absence of drift detection while ✓ indicates presence of drift detection.