Optimized MLP-CNN Model to Enhance Detecting DDoS Attacks in SDN Environment

Abstract: Distributed Denial of Service (DDoS) attacks have emerged as an exceedingly pernicious threat, particularly for networks managed with technologies such as Software-Defined Networking (SDN). With the increasing intricacy and sophistication of DDoS attacks, the need for effective countermeasures has led to the adoption of Machine Learning (ML) techniques. Nevertheless, despite substantial advancements in this field, challenges persist that adversely affect the accuracy of ML-based DDoS-detection systems. This article introduces a model designed to detect DDoS attacks. The model combines a Multilayer Perceptron (MLP) and a Convolutional Neural Network (CNN) to enhance the performance of ML-based DDoS-detection systems within SDN environments. We use the SHapley Additive exPlanations (SHAP) feature-selection technique and a Bayesian optimizer for hyperparameter tuning to optimize our model. To further solidify the relevance of our approach within SDN environments, we evaluate our model on an open-source SDN dataset known as InSDN. Furthermore, we apply our model to the CICDDoS-2019 dataset. Our experimental results show an overall accuracy of 99.95% on CICDDoS-2019 and 99.98% on InSDN. These outcomes underscore the effectiveness of our proposed DDoS-detection model within SDN environments compared to existing techniques.


Introduction
A Distributed Denial of Service (DDoS) attack seeks to incapacitate a service or resource by inundating it with an overwhelming traffic volume, rendering it inaccessible to legitimate users. These attacks are typically orchestrated through networks of compromised devices known as botnets, which unleash a deluge of traffic upon the target [1]. DDoS attacks can wreak havoc on online services, and combating them is exceptionally challenging because the traffic originates from many sources [2,3].
Recently, there has been a surge in high-profile DDoS attacks targeting various entities and employing many techniques, including amplification attacks that exploit vulnerabilities in Internet of Things (IoT) devices. This evolving landscape poses significant challenges to cybersecurity. Software-Defined Networking (SDN) enables a centralized controller to oversee and configure the entire network, and this centralization has made SDN a prime target for DDoS attacks, as depicted in Figure 1 [3,4]. Attackers have leveraged the unique architecture of SDN infrastructures to develop increasingly sophisticated and potent DDoS threats, raising serious concerns for both the online community and service providers. In response, Machine Learning (ML) has emerged as a crucial defense mechanism [5].
The integration of ML into DDoS detection has gained momentum, positioning it as an indispensable tool for safeguarding SDN environments. ML algorithms can learn from historical data and adapt to evolving SDN dynamics, substantially enhancing the efficiency and effectiveness of DDoS detection. Nevertheless, given the ever-evolving nature of these threats, refining ML-based DDoS detection remains paramount [6,7]. Numerous approaches can enhance the effectiveness of ML in DDoS-attack detection, including feature selection, Deep Learning (DL), and optimization techniques. Feature selection aims to pinpoint the most relevant attributes within a dataset, thereby strengthening the ML model's ability to discern DDoS attacks [4]. Reducing the dataset's dimensionality and discarding extraneous or duplicate features improves model performance and guards against pitfalls such as overfitting and poor generalization. Furthermore, hybrid DL models are a formidable asset in ML [3]. A hybrid model unites disparate neural network architectures or DL models and combines their strengths to address specific tasks and challenges, which can yield higher performance on intricate problems. Moreover, ensemble techniques such as stacking, bagging, and boosting combine or select among a pool of trained models, further improving the efficacy of DDoS-detection systems [8].
This article introduces a model to enhance the detection of DDoS attacks within SDN environments. The method combines a feature-selection technique with an optimized hybrid DL model to build a robust classifier. The first step applies the SHAP feature-selection technique to identify the most relevant attributes for DDoS detection. The core of the model is a hybrid neural network design that blends the capabilities of a Multilayer Perceptron (MLP) and Convolutional Neural Networks (CNNs). This combination is adept at capturing the intricate components and temporal dependencies within network traffic data, making it well suited to the task at hand. Furthermore, hyperparameter tuning and the Adaptive Moment Estimation (ADAM) optimizer are employed to fine-tune the proposed method. The effectiveness of the OptMLP-CNN model is assessed through evaluations conducted on two diverse and widely used datasets: InSDN and CICDDoS-2019. This evaluation supports the robustness of the proposed model and its practicality within real-world SDN environments. The main contributions are:
• Introduction of OptMLP-CNN: This research introduces a novel DDoS-attack-detection method referred to as "OptMLP-CNN". This method combines two critical elements: SHAP feature selection and a hybrid neural network architecture. The primary goal is to create an optimized and effective DDoS attack detector.
• Optimization Using Bayesian and ADAM Optimizers: We optimized the proposed model by applying two optimization techniques: the Bayesian optimizer and the ADAM optimizer. These optimization methods fine-tune the model to enhance its performance and effectiveness in detecting DDoS attacks.
• In-Depth Analysis of DDoS-Attack-Detection Systems: The research extends beyond introducing a new model by conducting a detailed analysis of DDoS-attack-detection systems. This analysis covers systems that incorporate DL techniques and were selected based on specific criteria: their DDoS-detection performance, the datasets used, the optimization methods applied, and the types of systems they are designed to protect.
This paper unfolds as follows: Section 2 surveys related research on advancing DDoS detection with ML techniques. Section 3 explores the foundational concepts and background of this domain. Section 4 introduces a novel model for DDoS detection in SDN, incorporating a feature-selection method and an optimized hybrid DL model. Section 5 presents an in-depth analysis of the proposed model's performance, along with a detailed comparative assessment of the experimental outcomes. Section 6 concludes the paper, synthesizing the findings and outlining potential future research directions for DDoS detection in SDN environments.

Motivation
This research is motivated by the pressing need to combat the increasing menace of DDoS attacks. These attacks have grown in frequency, sophistication, and ability to disrupt essential network services, leading to severe financial and reputational repercussions for organizations. This trend demands an urgent response in the form of improved detection and mitigation mechanisms.
One key area of concern is the vulnerability of networks operating under the SDN paradigm. While SDN offers numerous advantages in centralized control and dynamic configuration, it also introduces novel vulnerabilities that make SDN environments attractive targets for DDoS attacks. Safeguarding SDN environments from such threats has become a vital concern, necessitating advanced DDoS-detection solutions tailored to this network architecture.
ML and DL techniques have emerged as potent tools for enhancing network security, given their capacity to process vast volumes of network traffic data and identify anomalous patterns. However, to fully harness the potential of these models, they require careful optimization, customization, and adaptation to the specific dynamics of SDN environments.
Optimization plays a crucial role in the effectiveness of DDoS-detection systems. It entails fine-tuning model hyperparameters, selecting relevant features, and refining the model architecture. In SDN, where network behavior can differ significantly from traditional networking environments, this customization and rigorous optimization are vital to achieving higher accuracy and efficiency in DDoS detection.
SDN introduces unique characteristics such as centralized control and dynamic reconfiguration. To counter DDoS attacks in SDN effectively, detection systems must be tailored to these attributes and adapt to evolving network conditions. Beyond accuracy, DDoS-detection models should offer interpretability, allowing network administrators to understand the rationale behind detection decisions. Furthermore, these systems must remain robust in the face of rapidly evolving attack techniques and changes in network configuration.
Developing an effective DDoS-detection solution for SDN requires a comprehensive evaluation within a realistic SDN environment. A rigorous assessment provides critical insights into the model's capacity to distinguish legitimate traffic from DDoS attacks, ultimately determining its overall efficacy.
In light of these motivations, this research develops an "Optimized MLP-CNN Model" explicitly designed to enhance the detection of DDoS attacks within SDN environments. By coupling the capabilities of ML and DL with careful optimization, this study aims to provide a robust, adaptable, and transparent solution that reinforces SDN environments against the mounting menace of DDoS attacks.

Comprehensive Overview
This section reviews contemporary and widely used methods for detecting DDoS attacks with ML techniques. Table 1 summarizes an exhaustive comparison of recent related works on improving ML-based DDoS detection. In [9], Thakkar and Lohiya introduced a model that used statistical importance, specifically standard deviation, to select features for a deep neural network. Their work sought to bolster the efficiency of intrusion detection and classification. To assess their approach, they conducted evaluations on three datasets: NSL-KDD, UNSW_NB-15, and CIC-IDS-2017. In [4], the authors presented an optimized approach for identifying DDoS attacks within Software-Defined Internet of Things (SDIoT) networks. This approach combined Autoencoder and XGBoost algorithms with feature selection and hyperparameter tuning. The method's robustness was evaluated on two SDN-IoT datasets: SDN-NF-TJ and SDNIoT.
The authors in [10] presented a DDoS-detection analysis by using seven ML algorithms, four DL, and five unsupervised models. They evaluated the performance of 15 feature-selection methods on the UNSW_NB-15 dataset. The authors of [11] addressed the security of the SDN paradigm compared to traditional Vehicular Ad Hoc Networking (VANET); they simulated a dataset and proposed a system based on the Minimum Redundancy Maximum Relevance process to deal with DDoS attacks. This approach was implemented in a Bayesian optimization-based decision tree classifier.
In [12], the authors evaluated the performance of detecting DDoS attacks by using logistic regression, a decision tree classifier, a linear support vector machine, k-nearest neighbors, Gaussian Naive Bayes, a Random Forest classifier, XGBoost, ANN, and CNN. They also used hyperparameter optimization and applied various feature-selection techniques on KDD Cup 99 and UNSW-NB15. An extreme learning machine was used in [13] to improve the performance of a DDoS-attack-detection system; the authors suggested a method based on memory optimization and a hybrid approach for selecting features. They conducted the experiments by using the CICDDoS-2019 dataset.
In the context of SDN infrastructures, the article [14] simulated an SDN environment to evaluate DDoS-attack detection in SDN by using supervised learning techniques; the 1999 DARPA, DDoS-attack SDN dataset (DASD), and InSDN datasets were also used for this evaluation. In [15], the authors employed SHAP feature weights within recursive feature exclusion by using five base classifiers to detect DDoS attacks.
In [16,25], many experiments were conducted to evaluate the impact of feature selection on the performance of Ensemble Learning techniques, which combine several ML models to improve overall performance. Ensemble Learning is based on the principle that multiple models working together often achieve better performance than a single model alone. In [17,24,27], the authors proposed hybrid feature-selection methods to enhance DDoS-detection systems. Selecting features according to the Information Gain technique was also evaluated in [18,19] to deal with DDoS attacks.
The authors of [20] proposed a method to select features and detect DDoS attacks based on flow classification and threshold tuning for optimization. The authors of [26] generated a dataset in an SDN-simulated network and evaluated performance against DDoS attacks with different feature-selection methods and ML models. The paper [21] evaluated feature-selection methods by using Majority Voting and the UNSW_NB-15 dataset. The authors of [22,23] exploited SDN infrastructures to evaluate DDoS-attack-detection systems. In [3], the authors investigated a methodology based on dataset understanding, feature modeling, and dimensionality reduction to improve DDoS detection via ML-based systems in the SDN environment.
The approach introduced by Cauteruccio et al. [28] leverages edge-data analysis combined with cloud-data analysis to autonomously detect anomalies within heterogeneous sensor networks. Their methodology utilizes a fully unsupervised artificial neural network algorithm for edge-data analysis, while cloud-data analysis involves the application of the multiparameterized edit-distance algorithm.

Key Findings
Research on detecting DDoS attacks has surged substantially. These investigations have homed in on the strategic fusion of SDN infrastructures and the capabilities of DL models within diverse domains. The pivotal findings are as follows:
• Network-Specific Innovations: A significant portion of this research has been tailored to the nuances of specific network types. Notably, SDN and SD-VANET networks have emerged as focal points of interest. Researchers have recognized the importance of customizing DDoS-detection approaches to suit these specialized network environments. These tailored strategies address the unique characteristics and challenges that SDN and SD-VANET networks present, paving the way for more effective ML-based DDoS detection within these domains.
• Feature Selection and Deep Learning's Prowess in SDN: In SDN environments, feature-selection techniques have proven powerful. These methods sift through data to pinpoint the most pertinent features, substantially elevating the performance of ML-based DDoS-detection systems. Furthermore, researchers have assessed the contribution of DL techniques to DDoS detection in SDN. DL models have shown a strong aptitude for capturing intricate data patterns and relationships, positioning them as valuable assets against DDoS threats.
• Fine Tuning via Hyperparameter Optimization: Research efforts have introduced fine-tuning mechanisms to optimize the performance of DDoS-detection systems. Hyperparameter-optimization techniques, including Bayesian optimization and GridSearchCV, have played a central role in systematically adjusting the settings of ML models. This systematic calibration bolsters the accuracy and efficiency of these models for DDoS detection.
• The Rise of Hybrid Deep Learning Models: A prominent finding is the emergence of hybrid DL models. These architectures merge multiple DL frameworks or combine DL with traditional ML techniques. The overarching objective is to enhance the interpretability and robustness of DDoS-detection systems. By amalgamating diverse models, these hybrids provide complementary information that guides the final decision. This synergy improves the accuracy of DDoS detection and makes the findings more comprehensible and actionable.
In essence, research in DDoS detection has coalesced into a set of strategies tailored to the nuances of SDN environments and propelled by the capabilities of DL models. These strategies encompass network-specific tailoring, feature selection, hyperparameter optimization, and hybrid models. Collectively, these findings fortify the security of networks and equip them to confront evolving DDoS attacks with heightened resilience.

Background
ML constitutes a pivotal realm within cognitive computing, focusing on creating intelligent systems capable of autonomously acquiring knowledge from data to make predictions and decisions without explicit programming. This field encompasses various ML categories: supervised, unsupervised, semisupervised, and reinforcement learning [29]. ML has gained significant prominence in detecting DDoS attacks in recent years, primarily due to its aptitude for continual learning and performance enhancement. This ongoing refinement is achieved through techniques such as feature selection, optimization, and hybrid DL methodologies [3,5].

Feature Selection
Feature selection is an important step in DDoS-attack detection, primarily due to the often high dimensionality of network data. This process improves the performance and interpretability of ML models by selecting a subset of the most essential features. High dimensionality can lead to computational inefficiencies and overfitting, while including irrelevant or noisy features can obscure the model's decision-making process. Feature selection therefore aims to mitigate these challenges by reducing dimensionality, eliminating noise, and enhancing interpretability. Feature-selection methods fall into three main categories: filter methods, which evaluate feature relevance independently of the classifier; wrapper methods, which involve the ML classifier in exploring feature subsets; and embedded methods, which integrate feature selection into the ML algorithm itself. The feature-selection method should be tailored to the dataset's characteristics, the available computational resources, and the specific ML model in use, as it significantly impacts the effectiveness of DDoS-attack detection, particularly within SDN environments [3,4].
SHAP feature selection is a technique for assessing the significance of individual features within a dataset while constructing predictive models. Rooted in the principles of cooperative game theory, SHAP assigns a numerical value to each feature, quantifying its impact on model predictions by considering every possible combination of features. A SHAP value with a larger magnitude signifies that the feature has more influence on the model's predictions. With this knowledge, data scientists can rank features according to their SHAP values and pinpoint the attributes most essential for model precision and interpretability. This process facilitates selecting a subset of the most informative features, streamlining model development and improving overall predictive performance [4,13].
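The ranking step described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes the SHAP values have already been computed by some explainer, and the feature names, the values in `shap_values`, and the helper names `rank_features` and `select_top_k` are all hypothetical.

```python
# Hypothetical, precomputed SHAP values: one row per sample, one column per feature.
feature_names = ["flow_duration", "pkt_count", "byte_rate", "idle_time"]
shap_values = [
    [0.40, -0.05, 0.10, 0.00],
    [-0.30, 0.02, 0.20, 0.01],
    [0.50, -0.01, -0.15, 0.00],
]


def rank_features(values, names):
    """Rank features by mean absolute SHAP value, largest first."""
    n_samples = len(values)
    importance = {
        name: sum(abs(row[j]) for row in values) / n_samples
        for j, name in enumerate(names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


def select_top_k(values, names, k):
    """Keep the names of the k features with the highest mean |SHAP|."""
    return [name for name, _ in rank_features(values, names)[:k]]


top2 = select_top_k(shap_values, feature_names, 2)
print(top2)
```

The mean absolute value is used so that features pushing predictions strongly in either direction rank highly; other aggregations are possible.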
The SHAP value is formally defined by the Shapley value from cooperative game theory [4]. For a specific player, denoted as i, the Shapley value is a weighted sum of the player's marginal contributions, v(S ∪ {i}) − v(S), over all coalitions S that do not include i. Each coalition is weighted by the fraction of the N! player orderings in which exactly the members of S precede i, so the result quantifies the individual player's average contribution to the coalition. The formula for the Shapley value is expressed as

$$\phi_i(v) = \sum_{S \subseteq \{1,\dots,N\} \setminus \{i\}} \frac{|S|!\,(N - |S| - 1)!}{N!} \left[ v(S \cup \{i\}) - v(S) \right] \quad (1)$$

In this equation, |S| represents the size of the coalition, and N stands for the total number of players involved. The summation encompasses all feasible coalitions that exclude player i. Furthermore, v(S) denotes the value attributed to the coalition S.
The Shapley value satisfies two important properties, namely "dummy" and "efficiency", which are described by Equations (2) and (3). In cooperative games, a player who contributes nothing to any coalition is referred to as a "dummy" player and receives a Shapley value of zero:

$$v(S \cup \{i\}) = v(S) \;\; \text{for all } S \subseteq \{1,\dots,N\} \setminus \{i\} \;\Longrightarrow\; \phi_i(v) = 0 \quad (2)$$

The principle of efficiency ensures that the summation of Shapley values for all players equals the total outcome of the game, that is, the value of the coalition of all players (with $v(\emptyset) = 0$):

$$\sum_{i=1}^{N} \phi_i(v) = v(\{1,\dots,N\}) \quad (3)$$
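To make the definition concrete, the following sketch computes exact Shapley values for a tiny cooperative game by enumerating every coalition, then checks the dummy and efficiency properties. The three-player game `v` and the player names are invented for illustration; exact enumeration is only feasible for a handful of players, and SHAP implementations rely on approximations in practice.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions that exclude each player."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (v(s | {i}) - v(s))  # marginal contribution of i
        phi[i] = total
    return phi


def v(coalition):
    """Hypothetical game: 'a' and 'b' add value and have synergy; 'c' is a dummy."""
    value = 0.0
    if "a" in coalition:
        value += 10.0
    if "b" in coalition:
        value += 20.0
    if "a" in coalition and "b" in coalition:
        value += 6.0
    return value


players = ["a", "b", "c"]
phi = shapley_values(players, v)
print(phi)
```

In this toy game the synergy of 6 is split equally between "a" and "b", the dummy player "c" receives zero, and the three values sum to the value of the full coalition.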

Optimization
Optimization is a fundamental aspect of enhancing the effectiveness and efficiency of ML models. For DDoS-attack detection, fine-tuning model hyperparameters, selecting relevant features, and optimizing the model architecture are crucial. An optimization algorithm iteratively adjusts the quantities being tuned to minimize an objective function; the techniques considered here are Bayesian optimization and the ADAM optimizer.
Bayesian optimization is a strategy rooted in Bayesian statistical principles and designed to optimize complex and computationally demanding functions. The technique is particularly useful for vast search spaces and resource-intensive evaluations, as commonly encountered in hyperparameter tuning and feature selection. At its core, Bayesian optimization builds a probabilistic model, often termed a "surrogate model," that approximates the target objective function. This surrogate model, which can take the form of a Gaussian process (GP) or even a deep neural network, plays a crucial role in estimating the objective function's behavior [30].
Surrogate Modeling: Bayesian optimization begins by establishing a prior distribution over the objective function. In conjunction with the available data, this prior allows the surrogate model to approximate the objective function's behavior. The surrogate model is typically represented as a GP, which predicts function values, together with an uncertainty estimate, for different inputs. In their standard form, the prior and posterior distributions of the GP are defined as follows:

$$f(x) \sim \mathcal{GP}\big(m(x),\, k(x, x')\big)$$

where $m(x)$ is the mean function and $k(x, x')$ is the covariance (kernel) function, and

$$f(x) \mid \mathcal{D} \sim \mathcal{N}\big(\mu(x),\, \sigma^2(x)\big)$$

where $\mathcal{D}$ is the set of points evaluated so far, and $\mu(x)$ and $\sigma^2(x)$ are the posterior mean and variance at $x$.
Point Selection: The acquisition function is employed to choose the next point for evaluation. This point is then assessed on the true objective function. The point to be evaluated next, $x_{next}$, is chosen by maximizing the acquisition function:

$$x_{next} = \arg\max_{x} \alpha(x) \quad (7)$$

In Equation (7), $x_{next}$ represents the next point selected for evaluation, and $\alpha(x)$ is the acquisition function, such as Expected Improvement (EI), that guides the choice of the next evaluation point. This equation plays a central role in the iterative process of Bayesian optimization, where new points are selected based on the acquisition function to iteratively find the optimal configuration.
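The point-selection step can be illustrated with a short sketch. It assumes a surrogate that already returns a posterior mean and standard deviation for each candidate, and it uses the standard closed form of Expected Improvement for a maximization problem. The candidate learning rates, the numbers in `posterior_table`, and the function names are hypothetical and are not taken from the paper's experiments.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """EI for maximization, given posterior mean mu and standard deviation sigma."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * normal_cdf(z) + sigma * normal_pdf(z)


def select_next(candidates, posterior, best):
    """Pick the candidate that maximizes the acquisition function."""
    return max(candidates, key=lambda x: expected_improvement(*posterior(x), best))


# Hypothetical surrogate output: learning rate -> (predicted accuracy, uncertainty).
posterior_table = {0.001: (0.90, 0.01), 0.01: (0.93, 0.02), 0.1: (0.88, 0.10)}
best_so_far = 0.92  # best validation accuracy observed so far (illustrative)

x_next = select_next(list(posterior_table), lambda x: posterior_table[x], best_so_far)
print(x_next)
```

With these illustrative numbers, the candidate with the lowest predicted mean but the highest uncertainty is selected, which shows how EI trades exploitation against exploration.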
ADAM, in contrast, is an optimization algorithm commonly used in DL to train models. It is particularly well suited for optimizing neural networks, which have become fundamental to many modern AI applications. The core idea behind ADAM is to adapt the learning rate for each parameter during training, which helps the algorithm converge faster and more efficiently. Traditional gradient-based optimization algorithms, such as Stochastic Gradient Descent (SGD), apply a single learning rate to all parameters, whereas ADAM maintains a separate adaptive learning rate for each parameter. This adaptive learning rate is based on two moving averages: the first moment of the gradients (the mean) and the second moment (the uncentered variance) [31].
The process begins with a set of inputs, including the learning rate ($\alpha$), the exponential decay rates for the moment estimates ($\beta_1$ and $\beta_2$), the objective function to minimize ($f(\theta)$), and the initial parameter vector ($\theta_0$).
Initialization: The process commences by initializing two moment vectors: the first-moment vector, $m_0$, and the second-moment vector, $v_0$, both set to zero. The iteration counter, $t$, is initialized to zero as well.
Iteration: The algorithm iterates until the parameters converge. In each iteration, the counter is incremented ($t \leftarrow t + 1$), the gradient of the objective function ($\nabla_\theta f_t(\theta_{t-1})$) is computed at the current parameter vector, and the first-moment estimate ($m_t$) is updated through an exponential moving average with decay rate $\beta_1$. Similarly, the second-moment estimate ($v_t$) is updated by using $\beta_2$ and the square of the gradient. The bias-corrected estimates, $\hat{m}_t$ and $\hat{v}_t$, compensate for the zero initialization of $m_t$ and $v_t$, which biases the moving averages toward zero in early iterations. The parameter vector $\theta_t$ is then updated by using these moments to determine step sizes, scaled by $\alpha$ and stabilized numerically with a small constant $\epsilon$.
Convergence: The algorithm continues these steps until the parameters converge, returning the final parameter vector $\theta_t$.
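The mathematical equations for the ADAM optimizer can be summarized as follows:

$$g_t = \nabla_\theta f_t(\theta_{t-1})$$
$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$
$$\theta_t = \theta_{t-1} - \alpha\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$

These equations capture the essence of the ADAM optimizer's behavior and its ability to efficiently update the parameters during the training of deep neural networks.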
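As an illustration of the update rule, the sketch below applies ADAM to a one-dimensional quadratic, $f(\theta) = (\theta - 3)^2$, whose minimum is known. The objective, the step count, and the learning rate of 0.05 are assumptions chosen for the example; the decay rates and $\epsilon$ are the commonly used defaults, not values reported for OptMLP-CNN.

```python
from math import sqrt


def adam_minimize(grad, theta0, alpha=0.05, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Scalar ADAM: returns the parameter after a fixed number of updates."""
    theta, m, v = theta0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(theta)                      # gradient at the current parameter
        m = beta1 * m + (1.0 - beta1) * g    # first-moment moving average
        v = beta2 * v + (1.0 - beta2) * g * g  # second-moment moving average
        m_hat = m / (1.0 - beta1 ** t)       # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        theta -= alpha * m_hat / (sqrt(v_hat) + eps)
    return theta


# Gradient of the illustrative objective f(theta) = (theta - 3)^2.
theta_star = adam_minimize(lambda th: 2.0 * (th - 3.0), theta0=0.0)
print(theta_star)
```

A fixed step budget replaces the convergence test here to keep the sketch short; a practical training loop would monitor the loss or the gradient norm instead.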

Deep Learning
DL is a subfield of ML that aims to mimic how the human brain processes information by utilizing neural networks composed of multiple layers, referred to as deep neural networks (DNNs). These networks are designed to automatically learn and extract hierarchical features from raw data, making them well suited for tasks that involve large and complex datasets. DL models have demonstrated remarkable success in various applications, including image recognition, natural language processing, and speech recognition, owing to their capacity to uncover intricate patterns within data that may not be evident to traditional ML algorithms [4].
In the context of detecting DDoS attacks, DL offers several advantages. DDoS attacks are prevalent cyber threats that inundate a network or system with an overwhelming traffic volume, disrupting regular operations. They are particularly challenging to detect because they can mimic legitimate traffic patterns, rendering traditional rule-based methods less effective. DL models excel at discerning nuanced and evolving patterns within network traffic data, which can aid in the early identification of DDoS attacks. DL techniques can be applied to DDoS-attack detection in various ways. One common approach is to use DNNs, such as CNNs, to process network traffic data. CNNs are adept at extracting spatial features from data, making them suitable for tasks involving structured inputs, like network packet headers. One of the critical advantages of DL in DDoS detection is its adaptability: DNNs can be retrained to learn new attack strategies, making them effective against evolving threats. Additionally, feature engineering, a labor-intensive step in traditional ML, is often unnecessary with DL, because these models can extract relevant features from the raw data, reducing the burden on security analysts. Nonetheless, DL models are not without their challenges. They typically require large volumes of labeled data for training, which can be scarce in the case of DDoS attacks. Ensuring the interpretability of deep models is another concern, as black-box models may not provide insights into why an attack was flagged. Despite these challenges, adopting DL in DDoS detection shows the potential for more accurate and adaptive security systems that identify previously unseen attack patterns, strengthening network resilience and security.

Proposed Model
The proposed "OptMLP-CNN" is a comprehensive model developed to enhance the detection of DDoS attacks within SDN environments. It leverages a combination of advanced techniques, including SHAP feature selection, a fused MLP and CNN architecture, and the application of Bayesian optimization and the ADAM optimizer.

MLP
An MLP is a type of artificial neural network that serves as a foundational building block for DL models. MLPs are feedforward neural networks, meaning that information flows in one direction, from the input layer to the output layer. They are widely used for various ML tasks, including classification, regression, and feature learning [32,33]. An MLP consists of multiple layers of interconnected neurons, and three main types of layers characterize its structure:
• Input Layer: This is the network's first layer, receiving the raw input data. Each neuron in the input layer corresponds to a feature or variable in the dataset, so the layer's width matches the data's dimensionality.
• Hidden Layers: MLPs can have one or more hidden layers between the input and output layers. These layers are called "hidden" because they are not directly connected to the outside world (i.e., input or output). Neurons in hidden layers take the weighted sum of inputs from the previous layer, apply an activation function, and pass the result to the next layer. The number of hidden layers and the number of neurons in each layer are hyperparameters that can be adjusted to optimize the model's performance.
• Output Layer: The final layer produces the model's output or prediction. The structure of this layer depends on the problem the MLP is designed to solve. For example, in binary classification, there may be a single neuron with a sigmoid activation function, while in multiclass classification, there might be multiple neurons, each representing a class, with a softmax activation function applied across them.
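The two output-layer choices mentioned in the list can be sketched in a few lines. The logits below are arbitrary illustrative numbers, and the variable names are assumptions; the functions are the standard sigmoid and a numerically stable softmax.

```python
from math import exp


def sigmoid(z):
    """Maps one logit to a probability for binary classification."""
    return 1.0 / (1.0 + exp(-z))


def softmax(logits):
    """Maps a list of logits to class probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


p_attack = sigmoid(0.0)                 # binary case: attack vs. benign
class_probs = softmax([2.0, 1.0, 0.1])  # multiclass case: one logit per class
predicted_class = class_probs.index(max(class_probs))
print(p_attack, class_probs, predicted_class)
```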
Each neuron in an MLP performs the following operations:

Weighted Sum
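For each neuron in a layer (except the input layer), the input is a weighted sum of the outputs from the previous layer. This weighted sum is often referred to as the "pre-activation" of the neuron. Mathematically, the weighted sum $z_j^{(l)}$ of neuron j in layer l can be expressed as

$$z_j^{(l)} = \sum_{i} w_{ji}^{(l)}\, a_i^{(l-1)} + b_j^{(l)}$$

where $w_{ji}^{(l)}$ is the weight connecting neuron i in layer l − 1 to neuron j in layer l, $a_i^{(l-1)}$ is the activation of neuron i in layer l − 1, and $b_j^{(l)}$ is the bias of neuron j in layer l.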

Activation Function
The weighted sum is passed through a nonlinear activation function to introduce nonlinearity into the model. Common activation functions include the sigmoid function, the Rectified Linear Unit (ReLU), and the hyperbolic tangent function (tanh). The output of the neuron is

$$a_j^{(l)} = f\big(z_j^{(l)}\big)$$

where $a_j^{(l)}$ is the output of neuron j in layer l and f is the activation function.

Feedforward Propagation
During the feedforward process, the activations are computed for each neuron by applying the activation function to the weighted sum. The output of one layer becomes the input for the next layer, propagating information from the input layer to the output layer:

$$a_j^{(l)} = f\left(\sum_{i} w_{ij}^{(l)} a_i^{(l-1)} + b_j^{(l)}\right)$$
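The feedforward rule above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this work: the layer sizes (20 inputs, hidden layers of 128 and 64 units, one sigmoid output) follow the figures quoted elsewhere in this article, while the random weights, the helper names, and the choice of ReLU for hidden layers are assumptions made for the example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, weights, biases):
    """Forward pass a^(l) = f(W^(l) a^(l-1) + b^(l)).

    weights[l] has shape (n_out, n_in): row j holds the weights w_ij feeding neuron j.
    Hidden layers use ReLU; the last layer uses a sigmoid (binary classification).
    """
    a = x
    last = len(weights) - 1
    for l, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b                      # weighted sum z^(l)
        a = sigmoid(z) if l == last else relu(z)
    return a

# Illustrative (untrained) parameters; values are random placeholders.
rng = np.random.default_rng(0)
sizes = [20, 128, 64, 1]
weights = [rng.normal(0.0, 0.1, size=(n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = rng.normal(size=20)                    # one flow record with 20 features
p_attack = mlp_forward(x, weights, biases) # array of shape (1,), value in (0, 1)
```

With trained weights, the single output would be read as the estimated probability that the record belongs to the attack class.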

CNNs
CNNs are a class of DL models specifically designed for processing structured grid data, such as images. CNNs are widely used for tasks like image classification, object detection, and image segmentation [33–35]. The components of a CNN can be summarized as follows:

Convolution Layer
The fundamental operation in CNNs is the convolution operation. Given an input feature map $I$ and a convolution kernel $K$, the convolution operation is defined as:

$$(I * K)(x, y) = \sum_{i} \sum_{j} I(x + i, y + j)\, K(i, j)$$

Here, $(x, y)$ represents the spatial position in the output feature map, and $I(x + i, y + j)$ denotes the input at spatial position $(x + i, y + j)$. The convolution operation computes the weighted sum of the input values by sliding the kernel over the input feature map.

Activation Function
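After the convolution operation, an activation function is applied element-wise to introduce nonlinearity. Common activation functions include ReLU and sigmoid. Mathematically, this operation is expressed as

$$A(x, y) = f\big((I * K)(x, y)\big)$$

where $A$ is the activated feature map and $f$ is the activation function.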

Pooling Layer
Pooling layers reduce the spatial dimensions of the feature map. The most common pooling operation is max pooling, where the maximum value in a local region is retained. The max-pooling operation can be defined as

$$P(x, y) = \max_{i, j} A(x \cdot s + i,\; y \cdot s + j)$$

Here, $P(x, y)$ represents the output of the pooling layer, $s$ is the stride, and $A(x \cdot s + i, y \cdot s + j)$ denotes the input to the pooling layer.

Fully Connected Layer
In the final layers of the CNN, one or more fully connected layers are used for tasks such as classification. These layers flatten the output feature map and connect every neuron to the previous layer's neurons.
The entire operation of a CNN, from convolution to fully connected layers, can be mathematically expressed as

$$Y = f(W * X + B)$$

Here, $Y$ is the output, $W$ represents the weights (learned during training), $X$ is the input data, $B$ is the bias term, $*$ denotes the convolution operation, and $f$ represents the activation function.
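The convolution, activation, pooling, and fully connected steps described above can be traced on a small array. The sketch below is illustrative only: the 5×5 input, the 2×2 kernel, and the dense-layer weights are made-up values, and the loops favor readability over speed.

```python
import numpy as np

def conv2d(I, K):
    """out(x, y) = sum_i sum_j I(x + i, y + j) * K(i, j), no padding, stride 1."""
    kh, kw = K.shape
    out = np.zeros((I.shape[0] - kh + 1, I.shape[1] - kw + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            out[x, y] = np.sum(I[x:x + kh, y:y + kw] * K)
    return out

def relu(A):
    return np.maximum(A, 0.0)

def max_pool(A, size=2, stride=2):
    """P(x, y) = max over i, j in [0, size) of A(x*stride + i, y*stride + j)."""
    oh = (A.shape[0] - size) // stride + 1
    ow = (A.shape[1] - size) // stride + 1
    P = np.zeros((oh, ow))
    for x in range(oh):
        for y in range(ow):
            P[x, y] = A[x * stride:x * stride + size,
                        y * stride:y * stride + size].max()
    return P

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

I = np.arange(25, dtype=float).reshape(5, 5)   # toy input, I[x, y] = 5x + y
K = np.array([[0.0, 1.0],
              [1.0, 0.0]])                     # toy kernel

feature_map = relu(conv2d(I, K))               # shape (4, 4)
pooled = max_pool(feature_map)                 # shape (2, 2)
flat = pooled.ravel()                          # flatten before the dense layer

W_fc = np.full((1, flat.size), 0.01)           # placeholder dense weights
b_fc = np.zeros(1)
y_out = sigmoid(W_fc @ flat + b_fc)            # single sigmoid output
```

For this input, each entry of the convolved map equals `10x + 2y + 6`, so the pooled map is `[[18, 22], [38, 42]]`, which makes the example easy to verify by hand.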

OptMLP-CNN Detector
The methodology behind the OptMLP-CNN Detector represents an advanced approach to detecting DDoS attacks, leveraging a combination of MLP, CNN, SHAP-feature selection, Bayesian optimization, and the Adam optimizer. This methodology outlines a systematic process for deploying and maintaining a robust cybersecurity model, making it a valuable asset for safeguarding network infrastructure.
As illustrated in Figure 2, the flowchart begins with data preprocessing, the initial phase in which the input dataset undergoes normalization, augmentation, and formatting to prepare it for subsequent modeling. It then progresses into feature selection, incorporating SHAP to identify the features most essential for robust DDoS-attack detection. This phase refines the dataset, laying the foundation for the modeling steps that follow. A crucial "Class Distribution Evaluation" step at this junction determines the next path: proceeding directly to SHAP-feature selection, or applying Up-Sampling before SHAP-feature selection.
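The class-distribution check and optional up-sampling step can be illustrated as follows. This is a sketch under assumptions: the flowchart does not state which up-sampling method or imbalance threshold is used, so the example uses plain random oversampling with replacement and a hypothetical `max_ratio` of 1.5; the function names and toy data are likewise invented for illustration.

```python
import numpy as np

def is_imbalanced(y, max_ratio=1.5):
    """True if the largest class is more than max_ratio times the smallest."""
    counts = np.unique(y, return_counts=True)[1]
    return counts.max() / counts.min() > max_ratio

def upsample_minority(X, y, seed=0):
    """Resample every smaller class (with replacement) up to the largest class size."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        if n < target:
            extra = rng.choice(idx, size=target - n, replace=True)
            idx = np.concatenate([idx, extra])
        parts.append(idx)
    order = rng.permutation(np.concatenate(parts))  # shuffle the balanced set
    return X[order], y[order]

# Toy data: 20 benign (0) and 80 attack (1) records, one feature each.
X = np.arange(100, dtype=float).reshape(100, 1)
y = np.array([0] * 20 + [1] * 80)

if is_imbalanced(y):
    X_bal, y_bal = upsample_minority(X, y)
else:
    X_bal, y_bal = X, y
```

Oversampling is normally applied to the training split only, so that duplicated records cannot leak into the test set.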
The heart of the flowchart is the construction of the combined model, which comprises both MLP and CNN architectures. This merger aims to harness the strengths of both models, leveraging MLP's prowess in handling structured data alongside CNN's adeptness in spatial and image data analysis. The step-by-step delineation within the flowchart shows the layers and connections within the combined architecture.
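One plausible way to combine the two branches is sketched below. This section does not spell out how the MLP and CNN outputs are fused, so everything specific here is an assumption made for illustration: the feature vector is fed to both branches, the CNN branch treats it as a 1-D sequence, the branch outputs are concatenated, and a single sigmoid unit produces the prediction. Layer sizes and weights are placeholders.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_branch(x, W, b):
    """Dense layer with ReLU over the flat feature vector."""
    return relu(W @ x + b)

def cnn_branch(x, kernels, pool=2):
    """1-D cross-correlation + ReLU + max pooling over the feature vector."""
    # np.convolve flips its second argument, so reversing each kernel
    # yields out[n] = sum_i x[n + i] * k[i].
    maps = np.stack([np.convolve(x, k[::-1], mode="valid") for k in kernels])
    maps = relu(maps)
    usable = (maps.shape[1] // pool) * pool      # drop a trailing column if needed
    pooled = maps[:, :usable].reshape(len(kernels), -1, pool).max(axis=2)
    return pooled.ravel()

def combined_forward(x, params):
    h_mlp = mlp_branch(x, params["W_mlp"], params["b_mlp"])
    h_cnn = cnn_branch(x, params["kernels"])
    merged = np.concatenate([h_mlp, h_cnn])      # assumed fusion: concatenation
    prob = sigmoid(params["w_out"] @ merged + params["b_out"])
    return prob, merged

rng = np.random.default_rng(0)
n_features = 20
params = {
    "W_mlp": rng.normal(0.0, 0.1, (16, n_features)),
    "b_mlp": np.zeros(16),
    "kernels": rng.normal(0.0, 0.1, (4, 3)),     # 4 kernels of width 3
    "w_out": rng.normal(0.0, 0.1, 16 + 4 * 9),   # 16 MLP units + 4 x 9 pooled values
    "b_out": 0.0,
}
x = rng.normal(size=n_features)
prob, merged = combined_forward(x, params)
```

Concatenation keeps both representations intact and lets the output layer learn how much weight to give each branch; other fusion schemes (for example, stacking the CNN in front of the MLP) would be equally consistent with the description.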
The flowchart then transitions into the compilation and training phases. Here, the loss function, optimizer, and evaluation metrics are defined, followed by training on the designated dataset. The model is then evaluated on a separate testing dataset, and performance metrics are calculated to validate its accuracy and efficacy. The flowchart does not conclude with model training but extends into hyperparameter optimization by using Bayesian techniques. This stage refines the model's efficiency by fine-tuning hyperparameters, enhancing its precision in identifying and mitigating DDoS attacks. The option for model deployment, continuous monitoring, reporting, and alert mechanisms for detected attacks further solidifies the adaptive and proactive nature of the OptMLP-CNN Detector.
The flowchart is a visual representation of the intricate and comprehensive workflow that underlies the OptMLP-CNN Detector, offering a structured roadmap from data preprocessing and feature selection to model training, refinement, and potential deployment.
The proposed MLP Architecture algorithm outlines the design and implementation of our MLP model, which is fundamental for classification tasks (see Algorithm 1). Its workflow commences with data preprocessing, encompassing normalization, augmentation, and dataset formatting, which are crucial preparatory steps for model training. The MLP architecture consists of an input layer, two hidden layers, and an output layer. The input layer's size matches the dataset's feature count, while the hidden layers, comprising 128 and 64 units, use the ReLU activation function. The output layer's configuration adapts to the classification task, employing softmax for multiclass or sigmoid for binary classification.
Upon model construction, the algorithm proceeds to compile the MLP model by defining appropriate loss functions, optimizers, and evaluation metrics. Subsequently, it undergoes training on the designated training dataset across specified epochs. After training, the model undergoes rigorous evaluation by using the testing dataset, assessing diverse performance metrics encompassing the accuracy, precision, recall, and F1-score. Additionally, the algorithm accommodates hyperparameter tuning and provides options for deploying the trained model. It strongly advocates continuous monitoring and iterative updates to ensure adaptability and performance refinement.
The proposed CNN Architecture algorithm delineates the creation of a CNN model, paramount for image-based classification tasks (refer to Algorithm 2). As with the MLP Architecture, the algorithm begins with data preprocessing, ensuring dataset normalization, augmentation, and formatting for the CNN model. The CNN architecture encompasses two convolutional layers, each paired with ReLU activation and same padding. Max-pooling layers follow these layers to reduce feature map dimensionality. Further components include a flattening layer and two fully connected layers, employing softmax for multiclass or sigmoid for binary classification in the last layer. After construction, the algorithm compiles the CNN model, specifying the loss function, optimizer, and evaluation metrics. Training on the provided dataset then occurs over a defined number of epochs. Upon completion, the model undergoes rigorous evaluation on the testing dataset, where diverse performance metrics are computed. Similar to the MLP Architecture, provisions for hyperparameter tuning, optional deployment, and continuous model monitoring and updating are integrated into this CNN model.
The combined MLP-CNN Detector, as detailed in Algorithm 3, merges the MLP and CNN architectures to optimize DDoS-attack detection. It integrates preprocessing for dataset refinement and employs SHAP for feature selection, which is crucial for identifying key DDoS-related features. This amalgamation paves the way for a robust combined MLP-CNN model that leverages MLP's structured data handling and CNN's image-analysis prowess. The algorithm initiates with meticulous data preprocessing, normalizing and enhancing the dataset, and SHAP-feature selection. This preparatory phase primes the data for modeling. The heart of the algorithm lies in merging the MLP and CNN architectures to craft a powerful model adept at decoding complex patterns associated with DDoS attacks.
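To make the idea behind SHAP-based feature selection concrete, the sketch below computes exact Shapley values for a tiny model by enumerating all feature coalitions and then ranks the features by absolute contribution. It is not the SHAP library and not the procedure used in this work: the linear scoring function, the all-zero background sample, and the top-2 cutoff are assumptions chosen so the result can be checked by hand. Exact enumeration is only feasible for a handful of features; practical tools approximate it.

```python
import itertools
import math
import numpy as np

def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition S take their value from `background`;
    phi[j] is the weighted average gain from adding feature j to S.
    """
    n = len(x)
    phi = np.zeros(n)
    for j in range(n):
        others = [k for k in range(n) if k != j]
        for r in range(n):
            weight = (math.factorial(r) * math.factorial(n - r - 1)
                      / math.factorial(n))
            for S in itertools.combinations(others, r):
                idx = np.array(S, dtype=int)
                z_without = background.copy()
                z_without[idx] = x[idx]          # coalition S uses the real values
                z_with = z_without.copy()
                z_with[j] = x[j]                 # add feature j to the coalition
                phi[j] += weight * (predict(z_with) - predict(z_without))
    return phi

# Hypothetical linear scoring model over four features.
w = np.array([3.0, -1.0, 0.5, 0.0])
predict = lambda z: float(w @ z + 0.25)

x = np.array([1.0, 2.0, 3.0, 4.0])
background = np.zeros(4)

phi = shapley_values(predict, x, background)     # per-feature contributions
ranking = np.argsort(-np.abs(phi))               # most influential first
selected = ranking[:2]                           # keep the top-k features (k = 2 here)
```

For a linear model the Shapley value of feature j reduces to `w[j] * (x[j] - background[j])`, so `phi` is `[3.0, -2.0, 1.5, 0.0]` and the contributions sum to `predict(x) - predict(background)`.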

Experimental Results
In this section, we provide a detailed evaluation of the proposed model. We begin by outlining the experimental setup and introducing the evaluation metrics employed to assess the model's performance. Subsequently, the model's performance is examined under various scenarios, including exploring different activation functions and input-weight ranges. To gauge the effectiveness of our model, we compare the results with those of recent state-of-the-art methods. The experiments were conducted by using a Jupyter Notebook with Python, leveraging libraries such as scikit-learn, Pandas, and Matplotlib. The computational environment featured a 2.60 GHz Intel Core i7 10G processor, 4 GB NVidia GTX 1650 Ti graphics, 16 GB of RAM, and the Windows operating system. This rigorous evaluation validates our proposed model's robustness and effectiveness.

Dataset
In our study, we harnessed two datasets: InSDN [36] and CICDDoS-2019 [37]. The CICDDoS-2019 dataset [37] provides a contemporary repository of DDoS-attack traces that manifest at the application layer, employing TCP/UDP-based protocols. This dataset classifies DDoS attacks into two distinct categories: reflection based and exploitation based. Reflection-based attacks involve using a reflector server to redirect malicious traffic towards the target, masking the source IP address. In contrast, exploitation-based attacks directly target the victim without needing a reflector server. The CICDDoS-2019 dataset encompasses a rich feature set comprising 88 distinct features.
The InSDN dataset, as detailed by the authors of [36], aligns its focus with SDN infrastructures. This dataset was meticulously curated by deploying four virtual machines and incorporating a broad spectrum of attack classes from the SDN network's internal and external sources. It also includes a diverse representation of normal traffic, reflecting numerous application services. The InSDN dataset encompasses a substantial collection of 361,317 instances, including 68,424 instances of benign or normal traffic and 292,893 instances of attack traffic. The dataset comprises 84 features that serve as the foundation for our research endeavors.

Performance Metrics
Evaluating the effectiveness and accuracy of the DDoS-detection model is paramount.The literature consistently employs a set of well-established evaluation metrics based on four fundamental elements: true positive (TP), false positive (FP), true negative (TN), and false negative (FN).
Accuracy: This widely used metric provides a holistic assessment of the classifier's correctness. It is calculated by dividing the number of correct predictions by the total number of predictions made:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Precision: Precision measures the proportion of true positive predictions out of all positive predictions:

$$\text{Precision} = \frac{TP}{TP + FP}$$

Recall: Recall quantifies the percentage of actual positive events correctly predicted:

$$\text{Recall} = \frac{TP}{TP + FN}$$

F1-Score: This metric harmonizes precision and recall, offering a balanced assessment of a model's performance. It is computed as the harmonic mean of precision and recall:

$$\text{F1} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Area Under the Receiver Operating Characteristic Curve (AUC-ROC): The ROC curve is a graphical representation of a classifier's performance. It plots the true positive rate against the false positive rate across decision thresholds. The AUC value summarizes this curve in a single number, with higher values indicating superior performance.
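The four count-based metrics follow directly from the confusion-matrix entries, as the short sketch below shows. The counts used in the example are invented for illustration and are not results from this study; the function name is likewise hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (no positive predictions / no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts: 90 attacks caught, 20 missed, 10 false alarms, 880 benign passed.
m = classification_metrics(tp=90, fp=10, tn=880, fn=20)
```

Here accuracy is 0.97 while recall is only about 0.82, which shows why accuracy alone can look flattering on imbalanced traffic and why precision, recall, and F1 are reported alongside it.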
The choice of performance evaluation metrics, including accuracy, precision, recall, F1-score, and AUC-ROC, reflects a careful consideration of the assessment dimensions essential for a holistic model evaluation. Accuracy stands as a fundamental indicator, quantifying the ratio of correctly predicted instances to the total. Precision reflects the model's ability to identify true positives among all positive predictions, while recall reflects its capacity to capture all actual positives. The F1-score, a harmonic mean of precision and recall, offers a balanced assessment. Additionally, the AUC-ROC provides insights into the model's discriminative ability across classes. This suite of metrics ensures a comprehensive understanding of the model's performance, offering nuanced insights into its strengths and potential limitations across diverse performance aspects.
This scrutiny steers the strategic decision-making process regarding feature inclusion or exclusion, consequently bolstering the model's proficiency and accuracy in detecting DDoS attacks within the intricate realms of SDN environments. Figures 3 and 4 facilitate the enhancement of feature-selection methodologies, thereby contributing substantively to the evolution of more robust and effective DDoS-detection models.

Detection Phase
The correlation matrices depicted in Figures 5 and 6 offer comprehensive insight into the inter-relationships among the 20 selected features obtained through the SHAP-selection method applied to the CICDDoS-2019 and InSDN datasets, respectively. Each cell in these matrices represents the correlation coefficient between a pair of features, ranging from −1 to 1. A coefficient of −1 indicates a perfect negative correlation, signifying that as one feature increases, the other decreases in proportion. Conversely, a coefficient of 1 signifies a perfect positive correlation, indicating that both features increase or decrease together. A correlation coefficient of 0 denotes no linear relationship between the features. This visualization serves as a pivotal tool to uncover redundant features and discern the most influential ones in shaping the DDoS-detection model's performance within these datasets.
The performance evaluation of the optimized MLP-CNN model, detailed in Table 2 and Figure 7, showcases its effectiveness in detecting DDoS attacks, a critical concern within network security, especially in SDN environments. Our assessment covers two distinctive datasets, CICDDoS-2019 and InSDN, providing insights into attack patterns in conventional and SDN infrastructures. The chosen performance metrics offer a comprehensive overview of the model's accuracy and efficacy, and our model demonstrates remarkable prowess in identifying malicious network traffic. Moreover, on the CICDDoS-2019 dataset, the F1-score of 0.999381, a balance between precision and recall, highlights the model's ability to accurately classify attacks while minimizing false negatives. This underscores its capability to achieve high accuracy while effectively managing false alarms. Additionally, the AUC score of 0.997901, observed in the AUC-ROC, emphasizes the model's robust discriminatory power in distinguishing benign from malicious network traffic.
The model maintains its exceptional performance on the InSDN dataset, which reflects the unique landscape of SDN network environments. With an accuracy score of 0.999802, the model demonstrates its efficacy in classifying network traffic, even within the specialized realm of SDN. The precision score is equally impressive at 0.999858, highlighting the model's ability to accurately classify instances as DDoS attacks within the SDN context. Furthermore, the recall score of 0.999717 signifies the model's capability to identify most actual DDoS attacks in SDN networks while minimizing false negatives. The F1-score of 0.999787 strikes an optimal balance between precision and recall, reinforcing the model's robust performance in DDoS-attack detection. The AUC score of 0.999601 signifies its discriminatory ability within SDN environments.
Table 3 provides a comprehensive assessment of various ML techniques, including DNN, ELM, GB, and our novel OptMLP-CNN, in their capacity to detect DDoS activities across different datasets, namely NSL-KDD, UNSW_NB-15, CIC-IDS-2017, CICDDoS-2019, and InSDN. The findings reveal that all models, both the well-established benchmark models [9,13,24] and the proposed methods, exhibit a commendable performance across various evaluation metrics, including accuracy, precision, recall, and F1-score. Notably, the OptMLP-CNN model achieves the highest accuracy of 99.98% when employed with the InSDN dataset, underscoring its proficiency in accurately classifying network traffic. Precision, a crucial metric for minimizing false positives, is also notably high across all models. Among the benchmark models, model [13] exhibits the highest precision at 99.88%, while the proposed OptMLP-CNN model, when applied to the InSDN dataset, boasts the highest precision overall at 99.99%. This indicates the model's ability to minimize misclassifications of benign traffic as DDoS attacks.
The recall metric, which measures the capacity to capture actual DDoS attacks effectively, is quite impressive across the benchmark models, ranging from 98.81% [9] to 99.99% [13]. Furthermore, the proposed OptMLP-CNN, when utilized with the CICDDoS-2019 dataset, demonstrates a recall of 99.98%, while it maintains a recall of 99.97% when applied to the InSDN dataset. This highlights the model's ability to detect most genuine DDoS attacks while keeping false negatives minimal. The F1-score, which combines precision and recall, ranges from 99.37% [9] to 99.94% [13] in the benchmark models. In the case of the OptMLP-CNN model, the F1-score attains 99.94% when employed with the CICDDoS-2019 dataset and achieves an even higher F1-score of 99.98% when applied to the InSDN dataset.
It is important to note that the AUC, representing the classifier's discriminatory ability, was reported for all models except model [9].The AUC values for the OptMLP-CNN model are notable, reaching 99.79% with the CICDDoS-2019 dataset and an impressive 99.96% within the InSDN dataset.
The presented results indicate that the proposed OptMLP-CNN model consistently exhibits a remarkable performance across diverse datasets, often outperforming or at least matching the capabilities of other established techniques.These findings attest to the model's robustness and potential to bolster network security against the evolving landscape of DDoS threats, with its most exceptional performance observed in the InSDN dataset.
The proposed model leverages a combined MLP-CNN architecture, amalgamating the strengths of both models. This fusion enables the model to capture intricate patterns in structured and spatial data, significantly enhancing its adaptability and accuracy across

• $f(x)$ represents the function modeled by the GP.
• $\mu(x)$ is the mean function of the GP.
• $\kappa(x, x')$ is the GP's covariance function (kernel).
• $X$ is the set of input points.
• $y$ is the observed data.
• $x$ is the point for which we want to make predictions.
• $\mu'(x)$ is the mean function of the posterior GP.
• $\kappa'(x, x')$ is the covariance function (kernel) of the posterior GP.

Equations (4) and (5) capture the prior and posterior distributions used in Surrogate Modeling in Bayesian optimization, where the GP approximates the objective function based on available data.

The acquisition function: It guides the search for the optimum by quantifying the value of evaluating the objective function at different points. It balances exploration (sampling from areas where the surrogate model is uncertain) and exploitation (sampling where the surrogate model predicts high objective values). The acquisition function, often represented as $\alpha(x)$, guides the selection of the next evaluation point. One commonly used acquisition function is the Expected Improvement (EI).
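For reference, a standard textbook form of EI under a GP surrogate is given below. It is a sketch under stated assumptions rather than necessarily the exact expression used in this work: the objective is being maximized, $f(x^{+})$ denotes the best value observed so far, and $\mu_n(x)$ and $\sigma_n(x)$ denote the posterior mean and standard deviation of the GP at $x$ after $n$ observations (all three symbols are introduced here for illustration). $\Phi$ and $\phi$ are the standard normal cumulative distribution and density functions.

```latex
\alpha_{\mathrm{EI}}(x)
  = \mathbb{E}\!\left[\max\bigl(f(x) - f(x^{+}),\, 0\bigr)\right]
  = \bigl(\mu_n(x) - f(x^{+})\bigr)\,\Phi(Z) + \sigma_n(x)\,\phi(Z),
\qquad
Z = \frac{\mu_n(x) - f(x^{+})}{\sigma_n(x)}, \quad \sigma_n(x) > 0 .
% When \sigma_n(x) = 0 the expectation reduces to \max(\mu_n(x) - f(x^{+}), 0).
```

The first term rewards points whose predicted mean already exceeds the incumbent (exploitation), while the second term rewards points with large predictive uncertainty (exploration).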

$w_{ij}^{(l)}$ is the weight of the connection between neuron $i$ in layer $l-1$ and neuron $j$ in layer $l$; $a_i^{(l-1)}$ is the output of neuron $i$ in layer $l-1$; $b_j^{(l)}$ is the bias of neuron $j$ in layer $l$.

128
20: Activation function: ReLU
21: Fully connected layer 2 (output layer):
22: Number of units: number of classes
23: Activation function: softmax (for multiclass) or sigmoid (for binary classification)
24: Compile the CNN model: define loss, optimizer, and evaluation metrics
25: Train the CNN model on D_train for a set number of epochs
26: Evaluate the model on D_test and calculate performance metrics
27: Fine-tune hyperparameters as needed
28: Optionally deploy the model
29: Continuously monitor and update the model

Algorithm 3 Combined MLP-CNN Detector
Require: training data D_train, testing data D_test
Require: hyperparameters for MLP and CNN architectures
Require: SHAP-feature selection and Bayesian optimization parameters
Require: Adam optimizer hyperparameters
1: Preprocess the data: normalize, augment, and format the dataset
2: Select the most important features using SHAP
3: Build the combined model: combining MLP and CNN architectures
4: Compile the combined model: define loss, optimizer, and evaluation metrics
5: Train the combined model on D_train for a set number of epochs
6: Evaluate the model on D_test and calculate performance metrics
7: Perform Bayesian optimization to fine-tune hyperparameters
8: Retrain the model with optimal hyperparameters
9: Optionally deploy the model
10: Continuously monitor and update the model
11: Implement reporting and alert mechanisms for detected DDoS attacks
12: Maintain documentation of the model architecture and training process
13: Update and maintain the model regularly

Subsequent steps entail model compilation, defining loss functions and metrics, and rigorous training. This training phase refines the model's parameters, improving its recognition of DDoS indicators. Post-training, the model undergoes a thorough evaluation by using a separate dataset and employs Bayesian optimization to fine-tune hyperparameters, enhancing precision in identifying and mitigating DDoS attacks. The algorithm's adaptive nature includes provisions for model deployment in DDoS-detection systems, advocating continuous monitoring and updates to counter evolving threats effectively. It highlights the implementation of alert mechanisms for detected attacks, ensuring swift responses, and emphasizes meticulous documentation for comprehensive records of the model's architecture and training processes.

Figures 3 and 4 portray the outcomes of the SHAP feature-selection experiment on the CICDDoS-2019 and InSDN datasets, respectively, within the framework of the OptMLP-CNN model. These visual representations show how feature importance, quantified through Shapley values, varies across subset sizes. They offer insight into the pivotal features that significantly influence the OptMLP-CNN model's efficacy in identifying and

Figure 6. Correlation of selected features in InSDN.
• Evaluation of Effectiveness of Using Public Datasets: This study evaluates the proposed method by applying it to two publicly available datasets. One of these datasets is deliberately constructed to simulate an SDN environment. The results of this evaluation demonstrate that OptMLP-CNN, the proposed method, achieves a high accuracy rate and outperforms other existing methods in detecting DDoS attacks in the context of SDN infrastructure.
• Promising Solution for SDN Security: This research's overarching contribution is the presentation of a promising solution for enhancing the security of SDN and addressing the growing threat of DDoS attacks.By introducing and optimizing the OptMLP-CNN method, this study aims to enhance the resilience of SDN environments against DDoS attacks, which have significant security implications in contemporary networked environments.

Table 1. Comparison of recent literature.

Algorithm 1 Proposed MLP Architecture
Require: training data D_train, testing data D_test
Compile the MLP model: define loss, optimizer, and evaluation metrics
10: Train the MLP model on D_train for a set number of epochs
11: Evaluate the model on D_test and calculate performance metrics
12: Fine-tune hyperparameters as needed
13: Optionally deploy the model
14: Continuously monitor and update the model

Algorithm 2 Proposed CNN Architecture
Require: training data D_train, testing data D_test
Require: hyperparameters for CNN architecture
1: Preprocess the data: normalize, augment, and format the dataset

Table 2. OptMLP-CNN evaluation.
For the CICDDoS-2019 dataset, the model's accuracy score of 0.999504 signifies that it predicts class labels with minimal classification errors. With a precision score of 0.999009, the model accurately identifies DDoS attacks while maintaining a low rate of false positives, which is crucial for minimizing false alarms and ensuring accurate attack predictions. Its recall score of 0.999752, representing the true positive rate, showcases the model's efficiency in capturing genuine DDoS attacks with very few false negatives.

Table 3. Comparison with previous studies.