Article

An Improved Method for Anomalous Traffic Detection in SDN Based on Gated Feature Fusion

1
School of Computer Science and Engineering, University of Emergency Management, Langfang 065201, China
2
Langfang Key Laboratory of Network Emergency Protection and Network Security, Langfang 065201, China
*
Author to whom correspondence should be addressed.
Future Internet 2026, 18(5), 270; https://doi.org/10.3390/fi18050270
Submission received: 24 April 2026 / Revised: 18 May 2026 / Accepted: 18 May 2026 / Published: 20 May 2026
(This article belongs to the Section Cybersecurity)

Abstract

Existing anomalous traffic detection methods based on feature fusion in Software-Defined Networking (SDN) lack adaptability in their weight allocation mechanisms. Consequently, their detection accuracy and generalization capabilities fail to meet practical security requirements. To address these limitations, this paper proposes a refined detection method based on hybrid feature selection and gated fusion. First, the framework employs XGBoost combined with the Recursive Feature Elimination (RFE) algorithm to identify shallow statistical features with high discriminative power. Simultaneously, the method utilizes a 1D Convolutional Neural Network (1D-CNN) integrated with a Squeeze-and-Excitation (SE) block to extract deep temporal semantic features. Subsequently, a tailored gated fusion mechanism incorporating linear projection layers for feature alignment adaptively integrates these two categories of features. The fused features are then input into a Multilayer Perceptron (MLP) to perform anomalous traffic detection. Experimental results demonstrate that the proposed method achieves superior performance. Specifically, on the InSDN Dataset, the binary and multi-class classification accuracies reach 99.91% and 99.88%, respectively. Similarly, the accuracies on the NSL-KDD dataset are 99.78% and 99.76%. Finally, in a local simulation environment, the method attains an average precision exceeding 93% for anomalous traffic detection in simulated real scenarios.

1. Introduction

SDN [1] improves network flexibility through centralized control and programmable management. However, this architecture faces issues such as Distributed Denial of Service (DDoS) [2] attacks and Advanced Persistent Threats (APT) [3]. If these attacks compromise the SDN controller, the entire network risks paralysis. Therefore, anomalous traffic detection in SDN environments is critical for security assurance.
To counter these threats, SDN-based Intrusion Detection Systems (IDS) [4] have become core components of defense frameworks. Traditional IDS primarily rely on signature matching of anomalous traffic. Yu et al. [5] proposed a method combining unsupervised pre-training via stacked Dilated Convolutional Autoencoders (DCAEs) with supervised fine-tuning. This method demonstrated high accuracy and low false alarm rates in multiple classification tasks. However, the model training process is time-consuming. Furthermore, it lacks robustness against novel or obfuscated attacks and fails to identify them accurately. Consequently, research has shifted toward anomalous traffic detection methods based on Machine Learning (ML) and Deep Learning (DL). DL models, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), learn features automatically and capture complex, high-dimensional patterns from raw traffic. This has led to significant breakthroughs in detection accuracy. Yin et al. [6] proposed an intrusion detection method based on RNNs (RNN-IDS). Experimental results show that this method outperforms traditional machine learning methods in accuracy.
Despite the success of deep learning methods, these approaches face two core challenges in practical applications. First, most models focus on deep semantic features from packet content or temporal relationships. These models often neglect shallow statistical features, including traffic rate, packet length distribution, and connection duration, even though such features carry critical information and intuitively reflect the macro-level behavioral state of the network. Furthermore, as demonstrated by [7], these features are vital for distinguishing normal traffic from specific attacks. Second, existing research mainly uses static strategies like simple concatenation or weighted averaging to fuse features from different sources. These methods cannot dynamically adjust feature importance based on traffic context. This leads to information redundancy or to critical signals being obscured by noise. Ultimately, this limits model generalization and robustness.
To solve the problem of effectively fusing shallow statistical features with deep temporal semantic features, this paper proposes an anomalous traffic detection method in SDN based on hybrid feature selection and gated fusion. This method extracts two types of features in parallel. It uses XGBoost and RFE to identify discriminative shallow statistical features. Simultaneously, it uses a 1D-CNN and an SE block to extract deep temporal semantic features. Finally, a tailored gated fusion mechanism incorporating specialized projection layers assigns weights to both feature types adaptively based on input traffic characteristics. This achieves dynamic information complementarity by aligning the heterogeneous representation spaces of the extracted features.
The main contributions of this paper are summarized as follows:
(1)
We propose a hybrid feature extraction framework. It combines shallow statistical feature selection based on XGBoost-RFE with deep temporal feature extraction based on 1D-CNN-Attention. This effectively solves the problem of insufficient expressive power of single features.
(2)
We design a refined anomalous traffic detection method utilizing an adapted gated fusion unit. By integrating learnable linear transformations, this method dynamically adjusts fusion weights between shallow statistical features (XGBoost-RFE) and deep temporal semantic features (1D-CNN-Attention) based on input samples. This specialized design optimizes the integration of disparate SDN feature modalities, effectively solving the problems of information redundancy and noise interference.
The remainder of this paper is organized as follows: Section 2 reviews the related work in SDN security and anomaly detection techniques. Section 3 details the proposed improved anomalous traffic detection method based on gated feature fusion, including the model architecture and data preprocessing procedures. Section 4 presents the experimental analysis and results, providing a comprehensive evaluation of the model’s performance on the InSDN dataset. Finally, Section 5 concludes the paper and discusses potential directions for future research.

2. Related Work

The SDN architecture consists of three layers: the Data Plane, the Control Plane, and the Application Plane. SDN adopts a centralized Control Plane and a distributed Data Plane. This design achieves the separation of these two planes. The Control Plane utilizes southbound interfaces to manage interface forwarding. This facilitates the centralized management of network devices on the Data Plane. Furthermore, the Control Plane provides flexible programmable control to the Application Plane through northbound interfaces. This architecture improves network programmability and simplifies configuration operations for users. However, it simultaneously introduces new attack surfaces. Consequently, traditional intrusion detection solutions face numerous challenges in this environment. Therefore, various methods exist for anomalous traffic detection and defense in SDN. These methods are primarily categorized into statistical learning, traditional machine learning and deep learning, and feature fusion methods.
Intrusion detection systems based on statistical learning typically utilize traffic statistical features, information entropy, or distribution changes to identify anomalies. Braga et al. [7] proposed a lightweight detection method based on Self-Organizing Maps (SOM). However, its reliance on statistical traffic features for classification inherently limits its adaptability and robustness against novel or low-rate attacks. Aladaileh et al. [8] proposed a detection method based on generalized Rényi joint entropy. However, this method was primarily validated on UDP traffic. Consequently, its adaptability to complex attack types, such as TCP SYN or hybrid attacks, is insufficient. Aladaileh et al. [9] proposed a DDoS attack detection method based on entropy. However, it exhibits high false positive rates and low detection accuracy when facing low-rate, multi-target attacks.
Statistical-based methods demonstrate certain effectiveness in handling specific attack types. However, they typically rely on predefined thresholds and simple traffic characteristics. To solve these limitations, researchers apply machine learning methods to IDS. This aims to enhance detection accuracy and compensate for the deficiencies of statistical approaches. Song et al. [10] proposed Eunoia, a real-time threat awareness framework based on random forests and decision trees. However, as its cost function considers solely link utilization and latency metrics, the model demonstrates insufficient capabilities for fine-grained identification and differentiated responses to complex multi-stage attacks. Kokila et al. [11] proposed a DDoS attack detection framework based on Support Vector Machines (SVM). This approach emphasizes the deployability of single classifiers in low-overhead scenarios. However, it exhibits inadequate characterization of low-rate, multi-stage, or encrypted DDoS traffic, and lacks a closed-loop mechanism for online incremental learning and real-time flow rule updates. Mohd Mat Isa et al. [12] introduced an intrusion detection framework AERF. Nevertheless, when attack characteristics shift, the reconstruction distribution of its autoencoder is prone to variation, requiring model retraining to maintain accuracy and consequently leading to temporary detection blind spots.
Traditional machine learning methods provide satisfactory accuracy and efficiency when manual feature selection is sufficient. However, their adaptability to high-dimensional and dynamic data is limited, and the computational overhead remains significant in most scenarios. Consequently, deep learning has emerged as a core research direction for IDS. Brandão Lent et al. [13] proposed an unsupervised GAN model for DDoS detection in SDN. However, it requires alternating optimization of the generator and discriminator and is highly sensitive to hyperparameters, leading to unstable training. Chaganti et al. [14] proposed a deep learning method based on Long Short-Term Memory (LSTM) networks. However, model training relies heavily on large amounts of labeled data, which results in identification challenges when encountering unknown attack types. Ataa et al. [15] proposed an intrusion detection method for SDN environments. They constructed two deep learning models based on a CNN-LSTM hybrid architecture and a Transformer encoder architecture, respectively. However, model training and inference require considerable time. This is particularly true for the Transformer architecture, which demands high computational resources. Consequently, deployment becomes difficult.
While deep learning exhibits significant potential in automated feature extraction, existing models tend to overemphasize deep temporal semantic information, often at the expense of shallow statistical features that possess high discriminative value. To address this imbalance, researchers have introduced feature fusion techniques as a robust solution. Saheed et al. [16] proposed a hybrid feature selection method for intrusion detection systems. However, its performance heavily depends on the parameter settings of the Bat Algorithm (BA) and the modulus selection in the Residue Number System (RNS); thus, the stability of its generalization ability across diverse network environments warrants further validation. Damtew et al. [17] introduced a Heterogeneous Ensemble Feature Selection (HEFS) method. Nevertheless, this approach remains sensitive to feature selection thresholds and exhibits insufficient accuracy in identifying minority class samples. Ramkumar et al. [18] developed an intrusion detection method based on Deep Residual Networks (DRNs) by integrating RV coefficient feature fusion with the Exponential Sea Lion Optimization (ExpSLO) algorithm. However, its reliance on the Spark platform and protracted training times lead to suboptimal performance in high-throughput network environments.
In summary, existing research indicates that single detection models often struggle with accuracy and efficiency when facing high-dimensional and dynamic network traffic, primarily due to limitations in feature extraction dimensions. While feature fusion has emerged as a mainstream approach to enhancing detection performance, most current studies rely on static fusion strategies. Such methods lack the capability to adaptively adjust the weight distribution between shallow and deep features based on input sample characteristics, thereby restricting the model’s detection ceiling for novel or complex attacks. Table 1 provides a categorized summary of the aforementioned literature.
To address the aforementioned limitations, this paper proposes an improved detection method based on hybrid feature selection and gated fusion. First, the framework utilizes the XGBoost model combined with the RFE algorithm to select shallow statistical features with high discriminative power. Simultaneously, the method employs a 1D-CNN integrated with an SE block to extract deep temporal semantic features. Subsequently, the paper designs a dynamic fusion module based on a gating mechanism. This mechanism fuses the two feature types dynamically, with fusion weights that adjust adaptively according to input samples. Consequently, the method enhances detection accuracy and generalization capabilities. Meanwhile, the approach maintains model interpretability and computational controllability. Finally, the system feeds the fused features into a Multilayer Perceptron (MLP) to complete anomalous traffic detection. To validate the effectiveness of the method, we conducted extensive comparative and ablation experiments on datasets such as InSDN [19] and NSL-KDD [20]. A detailed description of these datasets, including traffic characteristics and sample distribution, is provided in Section 3.1.1. Furthermore, we designed and deployed a local simulation environment. This environment verifies the applicability and accuracy of the proposed method.

3. Improved Anomalous Traffic Detection Method Based on Gated Feature Fusion

To address the issues of limited accuracy, insufficient generalization capability, and inflexible feature fusion weights in existing single-stage detection models, this paper proposes a workflow, as illustrated in the technical architecture of Figure 1.
This workflow integrates data preprocessing, feature extraction, feature fusion, and anomaly detection. After the raw data are standardized through the preprocessing pipeline, the framework executes two parallel extraction processes: an XGBoost-based RFE algorithm for capturing shallow statistical features, and a 1D-CNN integrated with an SE block for extracting deep temporal semantic features. These heterogeneous feature streams are dynamically integrated via an improved gated fusion module. Finally, an MLP performs anomalous traffic detection.

3.1. Data Preprocessing

Effective data preprocessing is essential to ensure the quality of the input traffic features and the subsequent performance of the detection models. In this section, we first provide a detailed description of the datasets used for evaluation in Section 3.1.1. Subsequently, the specific data processing procedures, including data cleaning, normalization, Label Encoding, and data splitting, are elaborated in Section 3.1.2.

3.1.1. Dataset Description

The InSDN Dataset [19], released in 2020, aims to bridge the gap in appropriate datasets for training models in SDN settings. Unlike traditional datasets collected from non-SDN environments, InSDN incorporates attack patterns with features unique to SDN architectures. These specific attributes are crucial for evaluating threat detection technologies in SDN contexts. Simulated attacks must account for the novel network architecture; for instance, attackers may employ techniques like “IPsweep” and “Portscan” to flood SDN controllers with unmatched flow packets, triggering frequent Packet-In events that consume significant bandwidth and controller resources. Consequently, such actions cause network instability and generate a novel form of DDoS attacks that do not conform to the traditional definition but remain applicable within SDN networks. While traditional datasets do not represent these scenarios, the InSDN Dataset effectively covers them, which is a primary reason for selecting it in this paper.
The InSDN Dataset encompasses SDN-specific attacks targeting conventional network components along with attacks targeting various standard traffic types. Specific attack categories cover multiple forms, including Denial of Service (DoS), DDoS, Probe, Botnet, Exploitation, Brute Force, and Web attacks. By employing multiple internal and external attack vectors, the dataset effectively simulates realistic and complex environments.
Furthermore, this paper employs the NSL-KDD benchmark dataset to validate the generalization capability and robustness of the proposed model. By applying identical preprocessing and training protocols to both datasets, we ensure an objective assessment of model adaptability across diverse network traffic environments.
The NSL-KDD dataset [20] serves as an enhanced version of the classic KDD Cup 1999 dataset. This improved benchmark resolves inherent flaws found in KDD99, such as redundant records and data imbalance, providing more reliable data for IDS and machine learning research. The NSL-KDD dataset comprises 41 features, ranging from protocol types and durations to login attempt count and connections to the same target host. The processed labels are categorized into normal data and four attack classes: R2L, DoS, U2R, and Probe. In the experiments of this paper, the framework employs the training set and test set of the NSL-KDD dataset for model training and evaluation, respectively.

3.1.2. Dataset Processing

Data preprocessing in this paper encompasses multiple steps, including data cleaning, normalization, and label encoding, to transform raw data into a standardized format suitable for model training. The specific implementation logic is summarized in Algorithm 1.
Algorithm 1 Data Preprocessing and Normalization Strategy
Require: Raw dataset $D_{raw}$ in CSV format
Ensure: Normalized training set $D_{train}^{norm}$ and testing set $D_{test}^{norm}$
Initialize $D \leftarrow D_{raw}$
// Part 1: Data Cleaning
for each row $r$ in $D$ do
    if $r$ contains any null or missing values then
        Remove $r$ from $D$
    end if
end for
if ‘id’ column exists in $D$ then
    Drop ‘id’ column from $D$
end if
// Part 2: Label Encoding
Let $L$ be the ‘Label’ column in $D$
Create a mapping $M$ from each unique category in $L$ to an integer
Apply mapping $M$ to transform column $L$ into numerical labels
// Part 3: Data Splitting (Prevent Data Leakage)
Split $D$ into training set $D_{train}$ (70%) and testing set $D_{test}$ (30%)
// Part 4: Feature Normalization (Min-Max Scaling)
Let $C_{numeric}$ be the set of all numeric feature columns
for each column $c$ in $C_{numeric}$ do
    $c_{min} \leftarrow \min(D_{train}[c])$    // Calculate stats on train set ONLY
    $c_{max} \leftarrow \max(D_{train}[c])$
    for each value $x_{train}$ in $D_{train}[c]$ do
        $x_{train} \leftarrow (x_{train} - c_{min}) / (c_{max} - c_{min})$
    end for
    for each value $x_{test}$ in $D_{test}[c]$ do
        $x_{test} \leftarrow (x_{test} - c_{min}) / (c_{max} - c_{min})$
    end for
end for
Return $D_{train}^{norm}$, $D_{test}^{norm}$
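The steps of Algorithm 1 can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names (`clean_rows`, `encode_labels`, `fit_min_max`, `apply_min_max`), the toy rows, and the fixed slice used in place of the 70/30 split are all assumptions made for illustration. The point it shows is that the minimum and maximum are computed on the training rows only and then reused for the test rows.

```python
def clean_rows(rows, drop_columns=("id",)):
    """Drop rows with missing values and remove identifier columns."""
    cleaned = []
    for row in rows:
        if any(v is None or v == "" for v in row.values()):
            continue  # skip incomplete records
        cleaned.append({k: v for k, v in row.items() if k not in drop_columns})
    return cleaned


def encode_labels(rows, label_key="Label"):
    """Replace each distinct label with an integer (sorted for reproducibility)."""
    mapping = {name: idx for idx, name in enumerate(sorted({r[label_key] for r in rows}))}
    for r in rows:
        r[label_key] = mapping[r[label_key]]
    return mapping


def fit_min_max(train_rows, numeric_keys):
    """Compute per-column (min, max) on the training rows only."""
    return {k: (min(r[k] for r in train_rows), max(r[k] for r in train_rows))
            for k in numeric_keys}


def apply_min_max(rows, stats):
    """Scale rows with the training statistics; constant columns map to 0.0."""
    scaled = []
    for r in rows:
        out = dict(r)
        for k, (lo, hi) in stats.items():
            out[k] = 0.0 if hi == lo else (r[k] - lo) / (hi - lo)
        scaled.append(out)
    return scaled


# Toy data (hypothetical values); "Tot Fwd Pkts" is a feature name from the InSDN Dataset.
raw = [
    {"id": 1, "Tot Fwd Pkts": 10.0, "Label": "Normal"},
    {"id": 2, "Tot Fwd Pkts": 30.0, "Label": "DDoS"},
    {"id": 3, "Tot Fwd Pkts": None, "Label": "DDoS"},
    {"id": 4, "Tot Fwd Pkts": 50.0, "Label": "Normal"},
]
rows = clean_rows(raw)
label_map = encode_labels(rows)
train, test = rows[:2], rows[2:]  # stand-in for the 70/30 split
stats = fit_min_max(train, ["Tot Fwd Pkts"])
train_norm = apply_min_max(train, stats)
test_norm = apply_min_max(test, stats)
```

In this toy run the single test value scales to 2.0, outside [0, 1]. This is a direct consequence of using training statistics only; Algorithm 1 does not state whether such values are clipped, so the sketch leaves them as computed.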
(1)
Data Cleaning
To further ensure data integrity, we perform a comprehensive inspection of all records. Upon detecting missing values or empty rows, the process removes the corresponding records. Furthermore, identifier columns that lack discriminative information are discarded, ensuring that the model focuses on critical features.
(2)
Normalization
The continuous features in the InSDN Dataset exhibit significant variations in numerical ranges. For example, the range for Tot Fwd Pkts (Total Forwarded Packets) is [0, 16,928]. Meanwhile, the range for Fwd Pkt Len Mean (Forward Packet Length Mean) is [0, 3900]. Consequently, the scales of different features vary substantially. To simplify arithmetic operations and unify value ranges, this paper uses Min-Max normalization. This method maps each feature linearly to the [0, 1] interval, thereby preventing features with larger magnitudes from dominating the learning process.
(3)
Label Encoding
To accommodate the numerical requirements of the classifiers, categorical attributes in the InSDN Dataset, such as protocol type (proto), connection state (state), and service type (service), must be transformed. While One-Hot Encoding is a common alternative, this paper employs the Label Encoding technique, which assigns a unique numerical label to each distinct category. This approach effectively preserves categorical information while facilitating feature extraction. Finally, the framework applies this encoding technique to the Label column to obtain numerical targets.
(4)
Data Splitting
To evaluate the model’s practical defense capability in dynamic SDN environments, this study adopts an Open-set Recognition (OSR) evaluation protocol rather than a standard closed-set split. In real-world SDN deployments, network administrators frequently encounter “zero-day” attacks or novel variants of existing threats that are not present in historical training logs. Traditional supervised learning, which assumes that the training and testing sets share the same label space, often fails to generalize to these unobserved categories.
Consequently, our data partitioning strategy is designed to simulate this “unseen threat” scenario. We define a subset of minority classes as “Unknown Attacks” and exclude them entirely from the training phase. For the InSDN dataset, these include BFA, BOTNET, U2R, and Web-Attack; for the NSL-KDD dataset, R2L and U2R are reserved. Specific information is shown in Table 2. The remaining “Known” categories are split in a 7:3 ratio for training and baseline testing. The excluded minority samples are reintroduced only during the testing phase to assess the model’s binary classification capability, specifically its ability to correctly identify these completely novel patterns as “Anomalous.”
We avoid traditional oversampling techniques such as SMOTE. Studies [21,22] have shown that SMOTE generates synthetic samples by interpolating between a small number of existing samples, effectively “leaking” the distribution of these specific attacks into the training process. While this can improve closed-set metrics, it often leads to overly optimistic performance estimates that do not hold when the model faces structurally different, truly unknown attacks. In contrast, our approach forces the model to learn robust latent representations of “normal” and “general attack” features, thereby enhancing its generalization ability to typical open-set challenges in SDN environments.
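The open-set partitioning described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: the function name `open_set_split`, the `(features, label)` sample layout, the label string "Normal", and the toy class counts are hypothetical. Only the held-out class names and the 7:3 ratio come from the text.

```python
import random


def open_set_split(samples, unknown_labels, normal_label="Normal",
                   train_ratio=0.7, seed=0):
    """Split (features, label) pairs so that `unknown_labels` never reach training.

    Returns (train, test). Training keeps the original known labels; the test
    set is relabelled for binary evaluation (0 = normal, 1 = anomalous).
    """
    known = [s for s in samples if s[1] not in unknown_labels]
    unknown = [s for s in samples if s[1] in unknown_labels]
    rng = random.Random(seed)
    rng.shuffle(known)
    cut = round(len(known) * train_ratio)
    train = known[:cut]
    # Held-out known samples plus every unknown-attack sample form the test set.
    test = [(x, 0 if y == normal_label else 1) for x, y in known[cut:] + unknown]
    return train, test


# Toy data: 10 normal flows, 10 DDoS flows, 3 BOTNET flows (counts are illustrative).
samples = ([([i], "Normal") for i in range(10)]
           + [([i], "DDoS") for i in range(10)]
           + [([i], "BOTNET") for i in range(3)])
insdn_unknown = {"BFA", "BOTNET", "U2R", "Web-Attack"}
train, test = open_set_split(samples, insdn_unknown)
```

With these toy counts, 14 known samples go to training and the test set holds the 6 remaining known samples plus the 3 BOTNET samples, all of which carry the "anomalous" target.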

3.2. Feature Extraction

High-dimensional network traffic data often contains redundant or irrelevant attributes that may increase computational overhead and degrade model performance. Particularly when handling imbalanced or complex traffic patterns, certain features can drive model overfitting. Therefore, feature selection is essential for enhancing efficiency and generalization by identifying and retaining the most informative attributes. While dimensionality reduction improves generalization, it may occasionally discard features containing complementary information. In complex tasks like network intrusion detection, relying solely on single source features often proves insufficient to capture both underlying patterns and high-level semantic behaviors. To overcome this limitation, this paper employs a feature fusion strategy, the specific workflow of which is shown in Algorithm 2.
Algorithm 2 Feature Extraction and Gated Fusion
Require: Training set $D_{train}$, test set $D_{test}$, number of features to select $k = 45$
Ensure: Fused feature vectors $F_{fused}$ for classification
// Part 1: Shallow Feature Selection (XGBoost + RFE)
Initialize XGBoost model $M_{xgb}$ and RFE selector with $M_{xgb}$
Train $M_{xgb}$ on $D_{train}$ using 5-fold cross-validation to get feature importances
Rank features based on average importance scores
Use RFE to select the top $k$ features from $X$
Apply the selected feature indices to $D_{test}$ to ensure consistent feature dimensionality without leakage
Let $F_{shallow}$ be the resulting set of $k$-dimensional shallow feature vectors
// Part 2: Deep Feature Extraction (1D-CNN + SE Block)
Initialize 1D-CNN model $M_{cnn}$ with a Squeeze-and-Excitation (SE) block
Reshape each input sample $X_i$ to a 3D tensor
$F_{conv} \leftarrow M_{cnn}.\mathrm{Conv1D}(X)$
$F_{se} \leftarrow M_{cnn}.\mathrm{SEBlock}(F_{conv})$
$F_{deep} \leftarrow M_{cnn}.\mathrm{AdaptiveAvgPool}(F_{se})$    // Extract deep feature vectors
// Part 3: Gated Fusion Mechanism
Initialize linear projection layers $P_s$, $P_d$ and gating layer $L_g$
for each sample $i$ from 1 to $N$ do
    $F_{s,i} \leftarrow P_s(F_{shallow,i})$    // Project shallow features
    $F_{d,i} \leftarrow P_d(F_{deep,i})$    // Project deep features
    $F_{concat,i} \leftarrow \mathrm{concatenate}(F_{s,i}, F_{d,i})$
    $g_i \leftarrow \sigma(L_g(F_{concat,i}))$    // Calculate gate vector using Sigmoid $\sigma$
    $F_{fused,i} \leftarrow g_i \odot F_{s,i} + (1 - g_i) \odot F_{d,i}$    // $\odot$ is element-wise product
end for
Return $F_{fused}$
This approach integrates heterogeneous or multidimensional features to leverage the strengths of different feature subsets. The method combines shallow statistical characteristics with deep temporal semantic features. In this process, to ensure the rigor of the evaluation and prevent data leakage, the feature selection process using XGBoost-RFE is conducted strictly within the training folds. Specifically, the importance ranking and recursive elimination are determined solely based on training data, and the resulting feature subset is then applied to the test set for evaluation. By maintaining such strict data isolation, feature fusion effectively enriches the descriptive dimensions and depth of the input space without introducing evaluation bias. This mechanism constructs a more comprehensive representation of network behavior, and ultimately the fusion improves detection accuracy and model robustness.
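Part 3 of Algorithm 2 can be illustrated with a small NumPy forward pass. This is a sketch under assumptions, not the trained module: the class name `GatedFusion`, the random (untrained) weights, the shared projection width of 32, and the choice to give the gate the same width as the projected features are all hypothetical. The input widths 45 and 16 follow the shallow and deep feature dimensions stated in the paper.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class GatedFusion:
    """Forward pass of g = sigmoid(L_g([F_s; F_d])), F_fused = g * F_s + (1 - g) * F_d."""

    def __init__(self, shallow_dim, deep_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Linear projections P_s and P_d align both inputs to hidden_dim.
        self.W_s = rng.normal(0.0, 0.1, (shallow_dim, hidden_dim))
        self.b_s = np.zeros(hidden_dim)
        self.W_d = rng.normal(0.0, 0.1, (deep_dim, hidden_dim))
        self.b_d = np.zeros(hidden_dim)
        # Gating layer L_g maps the concatenation (2 * hidden_dim) to hidden_dim.
        self.W_g = rng.normal(0.0, 0.1, (2 * hidden_dim, hidden_dim))
        self.b_g = np.zeros(hidden_dim)

    def project(self, f_shallow, f_deep):
        return f_shallow @ self.W_s + self.b_s, f_deep @ self.W_d + self.b_d

    def __call__(self, f_shallow, f_deep):
        f_s, f_d = self.project(f_shallow, f_deep)
        gate = sigmoid(np.concatenate([f_s, f_d], axis=1) @ self.W_g + self.b_g)
        # Element-wise convex combination of the two projected feature vectors.
        return gate * f_s + (1.0 - gate) * f_d, gate


rng = np.random.default_rng(1)
f_shallow = rng.random((4, 45))   # batch of 4 samples, 45 selected shallow features
f_deep = rng.random((4, 16))      # batch of 4 samples, 16-dimensional deep features
fusion = GatedFusion(45, 16, hidden_dim=32)
fused, gate = fusion(f_shallow, f_deep)
```

Because each gate value lies strictly between 0 and 1, every fused component falls between the corresponding projected shallow and deep components. The gate is computed per sample, which is what distinguishes this scheme from a fixed weighted average.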

3.2.1. Shallow Statistical Feature Selection Method Based on XGBoost and Recursive Feature Elimination

To identify the most discriminative features and reduce dimensionality, this study employs a hybrid strategy combining XGBoost with RFE. As an ensemble learning method, XGBoost excels in feature ranking and classification due to its capability to capture complex nonlinear interactions. The implementation process is described below.
Let Equation (1) denote the original sample set:
$$D = \{(x_i, y_i)\}_{i=1}^{N} \tag{1}$$
where $x_i \in \mathbb{R}^d$ denotes the input vector with $d$ features, and $y_i$ represents the corresponding class label.
As a gradient boosting tree model, XGBoost constructs the final prediction function by progressively fitting residual terms through additive models. Equation (2) presents the mathematical form of the prediction model.
$$\hat{y}_i = \sum_{t=1}^{T} f_t(x_i), \quad f_t \in \mathcal{F} \tag{2}$$
where $T$ denotes the total number of trees (set to 100), $f_t$ represents each individual regression tree, and $\mathcal{F}$ represents the function space of trees. Equation (3) presents the objective function.
$$\mathcal{L}(\theta) = \sum_{i=1}^{N} l(y_i, \hat{y}_i) + \sum_{t=1}^{T} \Omega(f_t) \tag{3}$$
Here, the first term represents the loss function, while the second term denotes the structural regularization term used to control model complexity. Equation (4) defines this regularization term:
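$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 \tag{4}$$
where, within this term, $T$ denotes the number of leaf nodes of the tree $f$ (not the number of trees in Equation (2)), $w_j$ is the weight of leaf $j$, $\gamma$ penalizes the number of leaves, and $\lambda$ controls the L2 regularization of the leaf node weights.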
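As a quick numerical check of Equation (4), consider a single tree with three leaves, so that $T = 3$ counts leaf nodes here. The leaf weights $w = (0.5, -0.2, 0.1)$ and the settings $\gamma = 1$, $\lambda = 2$ are chosen purely for illustration and are not taken from the paper's configuration:

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2 = 1 \cdot 3 + \frac{1}{2} \cdot 2 \cdot \left(0.5^2 + (-0.2)^2 + 0.1^2\right) = 3 + 0.30 = 3.30$$

The first term grows with every additional leaf, while the second term grows with the magnitude of the leaf weights, so both larger trees and more extreme leaf predictions are penalized.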
Following the training phase, this method evaluates feature importance based on the usage frequency or average gain. To enhance the stability of the assessment, this study uses 5-fold cross-validation. For each fold $k \in \{1, 2, \ldots, 5\}$, the system trains an XGBoost model separately. The process accumulates the score $I_j^{(k)}$ for each feature. Finally, Equation (5) presents the calculation for the average importance score:
$$\bar{I}_j = \frac{1}{K} \sum_{k=1}^{K} I_j^{(k)}, \quad K = 5 \tag{5}$$
The framework sorts the features in descending order based on these average scores, selecting the top k features as the candidate feature set. To further refine this subset, the RFE method is employed as a wrapper-based selection tool. The core principle involves evaluating the contribution of each feature during each iteration and removing the “least important” attribute until a target number of features is reached.
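The fold-averaging of Equation (5) and the subsequent ranking can be sketched in a few lines of Python. The helper names `average_importance` and `top_k`, the feature names `feat_b` and `feat_c`, and all score values are hypothetical; in the paper's pipeline the per-fold scores would come from the trained XGBoost models.

```python
def average_importance(fold_scores):
    """Average per-feature importance over K folds.

    fold_scores: list of K dicts mapping feature name -> importance in that fold.
    """
    k = len(fold_scores)
    return {f: sum(fold[f] for fold in fold_scores) / k for f in fold_scores[0]}


def top_k(avg_scores, k):
    """Return the k feature names with the highest average importance."""
    return sorted(avg_scores, key=avg_scores.get, reverse=True)[:k]


# Hypothetical importance scores from 5 folds for three features.
fold_scores = [
    {"Tot Fwd Pkts": 0.40, "feat_b": 0.10, "feat_c": 0.50},
    {"Tot Fwd Pkts": 0.42, "feat_b": 0.08, "feat_c": 0.50},
    {"Tot Fwd Pkts": 0.38, "feat_b": 0.12, "feat_c": 0.50},
    {"Tot Fwd Pkts": 0.41, "feat_b": 0.09, "feat_c": 0.50},
    {"Tot Fwd Pkts": 0.39, "feat_b": 0.11, "feat_c": 0.50},
]
avg = average_importance(fold_scores)
selected = top_k(avg, 2)
```

Averaging over folds dampens the effect of any single partition on the ranking: a feature that scores highly in only one fold does not displace features that score consistently.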
Let $S^{(t)}$ denote the feature subset at the current iteration. Equation (6) defines the update rule for each iteration:
$$S^{(t)} = S^{(t-1)} \setminus \left\{ \arg\min_{x_j \in S^{(t-1)}} \mathrm{Score}(x_j) \right\} \tag{6}$$
Here, $\mathrm{Score}(x_j)$ denotes the impact of feature $x_j$ on model performance. The process continues until the number of remaining features reaches the predefined limit $k$. Although the study initially evaluates a total of 55 features, Table 3 explicitly presents the specific details of the Top-45 features identified as the optimal subset based on final experiments. This selection suggests that the 45-feature threshold represents an optimal trade-off point where the discriminative information is maximized; further increasing the feature count introduces redundant information or noise that detracts from the model’s generalization. The remaining 10 features that were not adopted are cataloged and analyzed in Section 4.2.1 to provide a comparative baseline. To ensure the statistical stability of this cutoff, we verified the consistency of the XGBoost importance scores across all 5 folds of the cross-validation process. The ranking of the Top-45 features remained highly consistent, with the core feature set showing minimal membership changes across folds. This stability indicates that the selected threshold captures the intrinsic characteristics of the traffic patterns and serves as a robust basis for the subsequent gated fusion mechanism. Consequently, the framework utilizes this optimized 45-feature combination for subsequent model training. Section 4, “Experimental Analysis and Results”, presents the comprehensive experimental details and the omitted feature comparison.
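The elimination rule of Equation (6) amounts to a simple loop: score the current subset, drop the lowest-scoring feature, and repeat until $k$ features remain. The sketch below shows only that control flow. The function name, the `score_fn` callback, and the static toy scores are assumptions; in the paper's setting the scores would be recomputed from an XGBoost model refit on each reduced subset.

```python
def recursive_feature_elimination(features, score_fn, k):
    """Remove the lowest-scoring feature one at a time until k features remain.

    score_fn(subset) must return a dict mapping each feature in `subset`
    to its score when the model is fit on that subset.
    """
    selected = list(features)
    while len(selected) > k:
        scores = score_fn(selected)            # re-score on the current subset
        worst = min(selected, key=scores.__getitem__)
        selected.remove(worst)                 # S(t) = S(t-1) \ {argmin Score}
    return selected


# Toy scoring function with fixed, hypothetical scores (no model is trained here).
static_scores = {"f1": 0.40, "f2": 0.05, "f3": 0.30, "f4": 0.25}


def toy_score_fn(subset):
    return {f: static_scores[f] for f in subset}


kept = recursive_feature_elimination(["f1", "f2", "f3", "f4"], toy_score_fn, k=2)
```

Here `f2` is removed first and `f4` second, leaving `f1` and `f3`. scikit-learn's `sklearn.feature_selection.RFE` wraps the same idea around any estimator that exposes feature importances; whether the authors used that class is not stated in the paper.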

3.2.2. Deep Temporal Semantic Feature Extraction Method Based on Convolution and SE Mechanism

To effectively extract high-order temporal features and local dependencies from network traffic data, this study utilizes a specialized architecture integrating a 1D-CNN with an SE block. The specific implementation is described below.
After preprocessing, normalization, and tensor conversion, the process generates a three-dimensional tensor $X \in \mathbb{R}^{B \times C \times L}$. Here, $B$ denotes the batch size. $C = 1$ represents the channel count, indicating that the raw features serve as single-channel inputs. $L$ denotes the original feature dimension. Subsequently, the model uses a one-dimensional convolutional layer to extract local features, as described in Equation (7). This layer utilizes 16 convolutional kernels with a size of 3 and applies same-padding to better model adjacent relationships between features.
$$F_{\mathrm{conv}} = \mathrm{ReLU}(\mathrm{Conv1D}(X)) \in \mathbb{R}^{B \times 16 \times L} \tag{7}$$
To further enhance the modeling capability for key channel-dimension features, an SE block is integrated following the convolutional layer. The module first performs channel dimension compression (squeeze), followed by recalibration (excitation) via a two-layer fully connected network to generate channel attention weights. Equation (8) illustrates the core principle:
$$s = \sigma\left(W_2 \cdot \mathrm{ReLU}\left(W_1 \cdot \mathrm{AvgPool}(F_{\mathrm{conv}})\right)\right) \tag{8}$$
Here, $W_1 \in \mathbb{R}^{\frac{C}{r} \times C}$ and $W_2 \in \mathbb{R}^{C \times \frac{C}{r}}$ represent the weight matrices, with the reduction ratio $r$ set to 8, and $\sigma$ denotes the Sigmoid function. The system then performs channel-wise weighted fusion, as shown in Equation (9):
$$F_{\mathrm{att}} = F_{\mathrm{conv}} \odot s \tag{9}$$
This mechanism assigns weights across different channels to effectively suppress redundant feature responses and highlight key channels. Finally, the model employs adaptive average pooling (AdaptiveAvgPool1d(1)) to further compress the weighted features $F_{\mathrm{att}}$, resulting in a 16-dimensional fixed-length vector representation. This vector serves as the deep feature output, defined by Equation (10).
$$f_{\mathrm{out}} = \mathrm{Dropout}\left(\mathrm{Pool}(F_{\mathrm{att}})\right) \in \mathbb{R}^{B \times 16} \tag{10}$$
In this method, the dropout rate is set to 0.5 to prevent overfitting. To ensure the scientific validity and robustness of the hyperparameters, this study employs a 5-fold Stratified K-Fold cross-validation strategy, aligning with the XGBoost-RFE feature selection method. The results confirm that the selected hyperparameters exhibit strong generalization capabilities.
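The squeeze, excitation, and pooling steps of Equations (8) to (10) can be traced on a single sample with plain Python lists. This is a sketch under stated assumptions: the sizes (4 channels, reduction ratio 2, length 3) and the weight values are toy choices made for readability, whereas the paper uses 16 channels and a reduction ratio of 8. Dropout is omitted, which corresponds to inference-time behavior.

```python
import math
from typing import List

Matrix = List[List[float]]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def se_block(f_conv: Matrix, w1: Matrix, w2: Matrix) -> Matrix:
    """Apply squeeze-and-excitation to one sample of shape C x L."""
    squeezed = [sum(ch) / len(ch) for ch in f_conv]       # squeeze: per-channel mean
    hidden = [max(0.0, v) for v in matvec(w1, squeezed)]  # W1 then ReLU, length C/r
    s = [1.0 / (1.0 + math.exp(-v)) for v in matvec(w2, hidden)]  # W2 then sigmoid
    return [[s_c * v for v in ch] for s_c, ch in zip(s, f_conv)]  # channel-wise rescale


def global_avg_pool(f_att: Matrix) -> List[float]:
    """Collapse the length axis, leaving one value per channel."""
    return [sum(ch) / len(ch) for ch in f_att]


# Toy input and weights (assumed values, not from the paper).
f_conv = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [2.0, 2.0, 2.0], [4.0, 0.0, 2.0]]
w1 = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]         # shape (C/r) x C
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]   # shape C x (C/r)

f_att = se_block(f_conv, w1, w2)
f_out = global_avg_pool(f_att)
print(f_out)
```

Because each attention weight lies strictly between 0 and 1, the block can only attenuate a channel, never amplify it; channels with larger weights are attenuated less.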

3.3. Dynamic Feature Fusion Method Based on Gating Mechanism

The preceding subsections detailed the feature extraction tasks at two distinct levels: first, the selection of discriminative shallow statistical features via XGBoost, and second, the extraction of deep temporal semantic features using a 1D-CNN integrated with an SE block. Since these two feature types describe network traffic behavior from complementary perspectives, their integration provides a more holistic representation. However, direct concatenation or averaging risks neglecting the varying importance of features across different samples, potentially compromising the model’s discriminative capability. To address this issue, this paper employs a dynamic fusion strategy based on a gating mechanism to achieve adaptive information integration at the feature level.
Specifically, let $f_{\mathrm{xgb}} \in \mathbb{R}^{d_1}$ denote the 45-dimensional feature vector. This vector is selected by the feature selection method based on XGBoost and RFE. Additionally, let $f_{\mathrm{cnn}} \in \mathbb{R}^{d_2}$ represent the 16-dimensional deep feature representation output by the CNN module. To integrate these heterogeneous inputs, the architecture first employs two sets of learnable linear projection layers. These layers project the features into a unified fusion space $\mathbb{R}^{d}$ to bridge the representation gap between discrete statistical metrics and continuous neural embeddings. Equations (11) and (12) illustrate the specific formulas.
$$h_{\mathrm{xgb}} = W_{\mathrm{xgb}} f_{\mathrm{xgb}} + b_{\mathrm{xgb}} \tag{11}$$
$$h_{\mathrm{cnn}} = W_{\mathrm{cnn}} f_{\mathrm{cnn}} + b_{\mathrm{cnn}} \tag{12}$$
Subsequently, the framework concatenates these two feature sets and feeds them into a gated network equipped with a Sigmoid activation function to generate a sample-specific fusion weight vector $g$. Equation (13) presents the specific formula.
$$g = \sigma\left(W_g \left[h_{\mathrm{xgb}} \,\|\, h_{\mathrm{cnn}}\right] + b_g\right), \quad g \in [0, 1]^{d} \tag{13}$$
Equation (14) presents the expression for the final output fused features $f_{\mathrm{fused}}$.
$$f_{\mathrm{fused}} = g \odot h_{\mathrm{xgb}} + (1 - g) \odot h_{\mathrm{cnn}} \tag{14}$$
Here, $\odot$ denotes the element-wise multiplication operation. This refined gating mechanism offers a significant advantage over simple concatenation or conventional GMU formulations. While traditional methods treat all features equally or rely on fixed weightings, our approach dynamically recalibrates the importance of expert-selected statistical features versus automated deep features based on the real-time traffic context. For instance, during a high-volume DDoS attack, the gate can learn to prioritize statistical rate features, whereas for stealthy APT probing, it may focus more on deep temporal patterns.
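The projection, gating, and blending steps of Equations (11) to (14) can be sketched for a single sample as below. This is an illustrative sketch, not the authors' code: the dimensions (3 and 2 inputs projected to a fusion space of size 2) and all weight values are assumptions chosen so the arithmetic is easy to follow; the paper uses 45-dimensional and 16-dimensional inputs.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(w: Matrix, x: Vector, b: Vector) -> Vector:
    """Affine map w @ x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def gated_fusion(f_xgb: Vector, f_cnn: Vector,
                 w_xgb: Matrix, b_xgb: Vector,
                 w_cnn: Matrix, b_cnn: Vector,
                 w_g: Matrix, b_g: Vector) -> Vector:
    h_xgb = linear(w_xgb, f_xgb, b_xgb)                        # Eq. (11)
    h_cnn = linear(w_cnn, f_cnn, b_cnn)                        # Eq. (12)
    g = [sigmoid(v) for v in linear(w_g, h_xgb + h_cnn, b_g)]  # Eq. (13); + concatenates lists
    return [gi * a + (1.0 - gi) * c for gi, a, c in zip(g, h_xgb, h_cnn)]  # Eq. (14)


# Toy parameters (assumed values, not from the paper).
f_xgb, f_cnn = [1.0, 2.0, 3.0], [4.0, 6.0]
w_xgb, b_xgb = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]], [0.0, 0.0]  # projects to h_xgb = [1, 5]
w_cnn, b_cnn = [[0.5, 0.0], [0.0, 0.5]], [0.0, 0.0]            # projects to h_cnn = [2, 3]
w_g_zero = [[0.0] * 4, [0.0] * 4]

neutral = gated_fusion(f_xgb, f_cnn, w_xgb, b_xgb, w_cnn, b_cnn, w_g_zero, [0.0, 0.0])
xgb_heavy = gated_fusion(f_xgb, f_cnn, w_xgb, b_xgb, w_cnn, b_cnn, w_g_zero, [50.0, 50.0])
print(neutral, xgb_heavy)
```

With a zero gate the output is the plain average of the two projections, which is the fixed-weight case; a strongly positive gate pre-activation pushes the output toward the statistical branch. A trained gate produces such values per sample and per dimension.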
To achieve precise identification of abnormal traffic, this study inputs the dynamically fused feature vector $f_{\mathrm{fused}}$ into an MLP classification module. This classifier employs a hidden layer with 32 neurons. The layer utilizes the ReLU activation function to enhance nonlinear expressive capability. To accommodate different detection tasks, two separate model instances with task-specific output layers are implemented: for binary classification, the output layer consists of 2 neurons, while for multi-class classification, it is configured with 4 neurons. Subsequently, the Softmax function calculates category probabilities. Regarding the training configuration, this study employs the Adam optimizer with an initial learning rate of 0.001. The batch size is set to 64, with a maximum of 50 epochs. To further enhance generalization and prevent overfitting, the model applies a Dropout rate of 0.5 after the MLP hidden layer. Additionally, the training process implements an early stopping strategy based on the validation set’s macro-F1 score. Finally, the entire training process employs 5-fold stratified cross-validation to ensure the stability and robustness of the classification performance.
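The early stopping rule on validation macro-F1 can be illustrated with a small helper. The text does not state a patience value, so `patience=3` and the score history below are assumptions made purely for the example.

```python
from typing import List, Tuple


def early_stopping_epoch(val_macro_f1: List[float], patience: int) -> Tuple[int, float]:
    """Return (best_epoch, best_score), stopping after `patience` epochs without improvement."""
    best_epoch, best_score, waited = -1, float("-inf"), 0
    for epoch, score in enumerate(val_macro_f1):
        if score > best_score:
            best_epoch, best_score, waited = epoch, score, 0
        else:
            waited += 1
            if waited >= patience:
                break  # training would stop here
    return best_epoch, best_score


# Invented validation scores; the last value is never reached because training stops first.
history = [0.91, 0.95, 0.97, 0.969, 0.968, 0.966, 0.99]
best_epoch, best_score = early_stopping_epoch(history, patience=3)
print(best_epoch, best_score)
```

The example also shows the usual cost of the rule: a late improvement after the patience window is missed, which is why the patience value matters when the epoch budget is as small as 50.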

4. Experimental Analysis and Results

This section provides a comprehensive evaluation of the proposed anomaly detection framework through various experimental scenarios. We begin by defining the evaluation metrics used to quantify model performance in Section 4.1. Then, an ablation study is presented in Section 4.2 to verify the effectiveness of the gated fusion module. Section 4.3 and Section 4.4 analyze the experimental results for binary and multi-class classification, respectively. Finally, Section 4.5 discusses the model’s performance on the validation set to demonstrate its generalization capability.

4.1. Evaluation Metrics

To comprehensively evaluate the performance of the constructed intrusion detection model, this study employs four standard classification evaluation metrics: Accuracy, Precision, Recall, and F1-score. These metrics effectively measure the discriminative capability of the model across different categories. To ensure the reliability and stability of the experimental results, all metrics are reported as the arithmetic mean derived from 5-fold cross-validation. This approach avoids the bias associated with single point estimates and provides a more objective assessment of the model’s generalization ability, particularly in security scenarios characterized by class imbalance or high costs associated with false positives and false negatives.
Specifically, the overall performance is defined as the average of the results across $K$ folds. The following formulas define the calculations for these four metrics:
$$\mathrm{Accuracy} = \frac{1}{K} \sum_{i=1}^{K} \frac{TP_i + TN_i}{TP_i + TN_i + FP_i + FN_i} \tag{15}$$
$$\mathrm{Precision} = \frac{1}{K} \sum_{i=1}^{K} \frac{TP_i}{TP_i + FP_i} \tag{16}$$
$$\mathrm{Recall} = \frac{1}{K} \sum_{i=1}^{K} \frac{TP_i}{TP_i + FN_i} \tag{17}$$
$$\mathrm{F1\text{-}score} = \frac{1}{K} \sum_{i=1}^{K} \frac{2 \cdot \mathrm{Precision}_i \cdot \mathrm{Recall}_i}{\mathrm{Precision}_i + \mathrm{Recall}_i} \tag{18}$$
  • True Positive (TP): The number of positive samples correctly predicted as positive by the model. For example, the model correctly identifies attack traffic as an attack.
  • True Negative (TN): The number of negative samples correctly predicted as negative by the model. For example, the model correctly identifies normal traffic as normal.
  • False Positive (FP): The number of negative samples incorrectly predicted as positive by the model. This error represents normal traffic being misclassified as attack traffic.
  • False Negative (FN): The number of positive samples incorrectly predicted as negative by the model. This error represents attack traffic being missed and classified as normal traffic.
Additionally, different classification tasks allow for the aggregation of these metrics using various averaging methods. To mitigate the interference of sample size discrepancies on overall metrics in multi-class tasks, this study employs the macro-average method for each fold. This approach involves calculating the metrics for each class individually and then computing their arithmetic mean, as shown in Equation (19).
$$\mathrm{Metric}_{\mathrm{macro}} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{Metric}_j \tag{19}$$
Here, N denotes the total number of categories. In binary classification tasks, the study employs the binary averaging method, positioning the positive class as the core of the evaluation. The final results reported in subsequent sections represent the mean and standard deviation across all folds, ensuring a rigorous statistical foundation for the performance evaluation.
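The fold-averaged metrics and the macro average defined above translate directly into code. The confusion-matrix counts below are invented for illustration, and the sketch assumes every denominator is nonzero.

```python
from typing import Dict, List

Fold = Dict[str, int]


def fold_metrics(c: Fold) -> Dict[str, float]:
    """Compute the four metrics for one fold from its confusion-matrix counts."""
    tp, tn, fp, fn = c["TP"], c["TN"], c["FP"], c["FN"]
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


def mean_over_folds(folds: List[Fold]) -> Dict[str, float]:
    """Average each metric over the K folds."""
    per_fold = [fold_metrics(c) for c in folds]
    return {name: sum(m[name] for m in per_fold) / len(per_fold) for name in per_fold[0]}


def macro_average(per_class_values: List[float]) -> float:
    """Unweighted mean of a metric over the N classes."""
    return sum(per_class_values) / len(per_class_values)


# Invented counts for two folds (illustration only).
folds = [
    {"TP": 90, "TN": 95, "FP": 5, "FN": 10},
    {"TP": 95, "TN": 90, "FP": 10, "FN": 5},
]
avg = mean_over_folds(folds)
print(avg)
print(macro_average([0.9, 1.0, 0.8]))
```

Note that the F1-score is computed per fold and then averaged, so it generally differs slightly from the harmonic mean of the averaged precision and recall.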

4.2. Ablation Experiments

Ablation experiments play a crucial role in model performance evaluation. These experiments assist researchers in gaining deeper insights into the influence of individual components on the final performance. By progressively removing or modifying specific components, ablation experiments reveal the contribution of various design choices to the overall model. Consequently, this process provides theoretical support for model improvement and optimization.
This study conducts ablation experiments on the InSDN Dataset to validate the contribution of each module. The analysis specifically focuses on the effects of fusion strategies and gating mechanisms. This paper compares several distinct model configurations. These configurations cover baseline models like XGBoost-MLP and CNN-MLP, along with several feature fusion approaches. Section 4.2.3 lists the specific results.

4.2.1. Comparison of Optimal Number of Features

The feature selection method based on XGBoost and RFE initially identified the top 55 features. Among these, the 10 features that were ultimately not adopted in the final optimal subset are detailed in Table 4, which serves as a subsequent extension to Table 3.
To further enhance accuracy and reduce computational complexity, this study conducted a secondary selection phase. This process focused on optimizing the number of features. The experiment integrated the selected feature counts with the subsequent processing steps. Figure 2 presents the final observed results.
Experimental validation confirms that model performance gradually improves as the number of features increases. The performance reaches its peak when the system utilizes the top 45 features. Consequently, this study selects the top 45 features as the optimal feature set.

4.2.2. Comparison of Baseline Models

In the initial phase of the ablation experiments, this paper first examined two baseline models: XGBoost-MLP and CNN-MLP. The XGBoost-MLP model achieved an accuracy of 97.27%, precision of 97.58%, recall of 96.74%, and F1-score of 97.13%. In contrast, the CNN-MLP model demonstrated significantly lower results. This model recorded an accuracy of 95.64%, precision of 95.51%, recall of 93.24%, and F1-score of 94.15%. The comparison reveals that the CNN-MLP model falls far short of the performance level of XGBoost-MLP. These results indicate that standalone CNN or XGBoost models exhibit limitations in handling this task. Consequently, these single-model approaches fail to fully leverage the potential of the data.

4.2.3. Comparison of Different Fusion Strategies

Next, this paper compares several distinct feature fusion strategies. Table 5 presents the specific data. First, the experiment evaluates concatenation fusion without the gating mechanism. This model achieves an accuracy of 99.61%, precision of 99.45%, recall of 99.53%, and F1-score of 99.49%. This result indicates that model performance improves significantly through simple concatenation fusion. However, this approach does not yet reach the optimal level.
In the fixed-weight fusion experiment, the system sets the weights for both XGBoost and CNN to 0.5. In this configuration, the model achieves an accuracy of 99.54%, precision of 99.38%, recall of 99.43%, and F1-score of 99.41%. Compared to concatenation fusion, the performance of the fixed-weight model declines slightly. However, this strategy still delivers satisfactory results.
Subsequently, the study employs a random weight fusion strategy. This method uses a random function to assign fusion weights during each iteration. Consequently, the model accuracy further improves to 99.78%, with precision at 99.73%, recall at 99.72%, and F1-score at 99.73%. Random weight fusion significantly enhances performance. This finding demonstrates the effectiveness of dynamically adjusting weights during feature fusion.
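The three non-gated strategies compared in this subsection can be sketched as follows. The sketch assumes both inputs have already been projected to the same dimension, and it assumes that random weight fusion draws a single scalar weight `w` in [0, 1) with the complementary weight `1 - w`; the text does not specify the sampling scheme, so that detail is hypothetical.

```python
import random
from typing import List

Vector = List[float]


def concat_fusion(h_xgb: Vector, h_cnn: Vector) -> Vector:
    """Concatenation: keep both vectors side by side."""
    return h_xgb + h_cnn


def weighted_fusion(h_xgb: Vector, h_cnn: Vector, w: float) -> Vector:
    """Convex combination with one shared scalar weight."""
    return [w * a + (1.0 - w) * c for a, c in zip(h_xgb, h_cnn)]


def random_weight_fusion(h_xgb: Vector, h_cnn: Vector, rng: random.Random) -> Vector:
    """Draw a fresh weight on every call (assumed sampling scheme)."""
    return weighted_fusion(h_xgb, h_cnn, rng.random())


h_xgb, h_cnn = [1.0, 5.0], [2.0, 3.0]
concat = concat_fusion(h_xgb, h_cnn)
fixed = weighted_fusion(h_xgb, h_cnn, 0.5)
rand = random_weight_fusion(h_xgb, h_cnn, random.Random(0))
print(concat, fixed, rand)
```

All three apply the same rule to every sample. The gated variant differs in that its weight is a learned function of the input and is a vector rather than a scalar.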

4.2.4. Anomalous Traffic Detection Method Based on Improved Gated Fusion Features

The experimental data demonstrate that the anomalous traffic detection method based on improved gated fusion features outperforms the random weight fusion model. This superiority is evident across all performance metrics. Specifically, the proposed model achieves an accuracy of 99.88%. Precision and F1-score both stabilize at 99.86%. Additionally, the recall reaches 99.85%. These results indicate that the dynamic weights computed through the gating mechanism capture complementary information between heterogeneous features more accurately. While the performance of the random weight fusion strategy is numerically competitive, our proposed gated fusion model consistently achieves superior results with significantly higher stability, as evidenced by the reduced standard deviation. This suggests that the learned gating mechanism effectively converges to an optimal feature-weighting configuration, whereas random weighting introduces greater performance volatility. The consistent improvement across all metrics, combined with the reduction in variance, demonstrates that the gating mechanism provides a more reliable and robust solution for SDN anomaly detection.

4.3. Results of Binary Classification Models

Deep learning models, particularly CNN, LSTM, and GAN, have achieved remarkable results in intrusion detection. However, despite their outstanding accuracy, challenges remain. Specifically, enhancing generalization capabilities while reducing false positive and false negative rates is difficult. To evaluate the performance of the proposed model, this study compares it against several baselines. These baselines include the cuDNN-based LSTM model (cuDNN-LSTM), Generative Adversarial Network (GAN), combined CNN and LSTM model (CNN-LSTM), Gated Recurrent Unit model (GRU), and Bidirectional LSTM model (Bi-LSTM). To ensure a fair and rigorous comparison, all baseline models were independently evaluated under the exact same experimental framework as our proposed method. This includes the use of identical data preprocessing procedures, the same Top-45 feature selection results, and the specific Open-set Recognition (OSR) data partitioning protocol described earlier. We did not adopt the performance metrics directly from the original cited literature, as those results were obtained under different data split conditions and preprocessing pipelines, which would render the comparison invalid. To ensure the statistical rigor of the comparison, a paired t-test was conducted between the proposed model and the strongest baseline in each category. The results consistently yielded $p$-values less than 0.05, confirming that the performance improvements are statistically significant despite the high saturation of the metrics.
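A paired t-test over per-fold scores can be computed by hand as sketched below. The per-fold accuracies are invented for illustration and are not results from the paper. With 5 folds there are 4 degrees of freedom, for which the two-sided critical value at a significance level of 0.05 is 2.776.

```python
import math
from typing import List


def paired_t_statistic(a: List[float], b: List[float]) -> float:
    """t statistic for paired samples: mean difference over its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)


# Invented per-fold accuracies in percent (illustration only).
proposed = [99.90, 99.92, 99.91, 99.89, 99.93]
baseline = [99.80, 99.85, 99.79, 99.82, 99.84]

t_stat = paired_t_statistic(proposed, baseline)
T_CRIT_DF4 = 2.776  # two-sided critical value, alpha = 0.05, 4 degrees of freedom
significant = abs(t_stat) > T_CRIT_DF4
print(round(t_stat, 2), significant)
```

Pairing by fold matters here: because both models see the same splits, fold-to-fold difficulty cancels in the differences, so small but consistent gaps can reach significance even when the raw metrics are nearly saturated.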
Table 6 presents the performance comparison of different models on the experimental dataset. The results demonstrate that the proposed model achieves optimal performance in Accuracy (99.91%), Recall (99.94%), and F1-Score (99.78%). This outcome indicates outstanding performance in detection rate and overall balance. However, regarding Precision, cuDNN-LSTM (99.94%), GRU (99.91%), and Bi-LSTM (99.90%) slightly outperform the proposed model (99.63%).
This result suggests that in rare instances, the proposed method exhibits a higher false positive rate compared to certain recurrent neural networks. This discrepancy likely stems from the design emphasis on enhancing recognition capabilities for minority class samples. Specifically, the model strengthens feature extraction and fusion to boost Recall. Consequently, the system tends to classify certain borderline samples as positive. While this strategy incurs a slight loss in Precision, it significantly boosts Recall. This trade-off leads to a further reduction in false negatives. False negatives are often more severe than false positives in intrusion detection tasks. The risk of undetected attacks far outweighs the cost of generating additional alerts. Therefore, this strategy offers greater advantages in practical applications.
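As a quick consistency check on this trade-off, the harmonic mean of the reported binary-classification Precision (99.63%) and Recall (99.94%) can be computed directly. This is only an approximation of the reported F1-Score, since the paper averages F1 per fold rather than combining the averaged Precision and Recall.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in, same units out)."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_from(99.63, 99.94)
print(round(f1, 2))
```

The result rounds to about 99.78, in line with the reported F1-Score, and illustrates how a small loss in Precision is offset by the higher Recall.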
Regarding the performance comparison of the models, cuDNN-LSTM achieves the highest Precision (99.94%). However, its Recall (99.80%) and F1-Score (99.37%) are slightly lower than those of the proposed model. This result indicates insufficient coverage of abnormal samples. GRU and Bi-LSTM maintain relatively high levels of Accuracy and Precision. However, their Recall values are 98.68% and 98.99%, respectively. These figures fall short of the detection capability achieved by the proposed method. CNN-LSTM performs well in terms of F1-Score (99.08%). Nevertheless, its overall performance still lags behind Bi-LSTM and the proposed method. GAN achieves a high Recall (98.37%) but suffers from extremely low Precision (65.37%). Consequently, this imbalance results in excessive false positive rates.
Overall, the proposed model achieves optimal results in Recall and F1-Score. While its Precision is slightly lower than that of some recurrent neural networks, the model demonstrates superior comprehensive performance and practical value. In intrusion detection scenarios, high Recall effectively reduces attack false negatives. The proposed method further enhances Recall while maintaining high Precision. Therefore, this approach proves more practical in environments with stringent security requirements.

4.4. Multi-Class Classification Model Results

The proposed model demonstrates outstanding performance in multi-class classification tasks. It surpasses existing mainstream models across multiple evaluation metrics. Table 7 presents the experimental results for different models in multi-class classification tasks. The proposed method achieves optimal performance across all four metrics. Specifically, the Accuracy is 99.88%. Precision and Recall reach 99.86% and 99.85%, respectively. Consequently, the F1-Score maintains the highest level at 99.86%. This result demonstrates that the proposed approach maintains exceptionally high detection accuracy and stability. This capability persists even when confronting complex multi-class traffic patterns.
Regarding the overall trend, RNNs and their variants continue to demonstrate strong advantages. GRU, cuDNN-LSTM, and Bi-LSTM all achieve Accuracy scores above 99.6% in the multi-classification task. Among these, Bi-LSTM achieves an Accuracy of 99.75% and an F1-Score of 99.67%. This model performs closest to the proposed method among the comparison approaches. This result validates the effectiveness of the bidirectional structure in capturing temporal dependencies. GRU also demonstrates stable performance, with an Accuracy of 99.73% and an F1-Score of 99.64%. This performance indicates its ability to balance efficiency and accuracy in multi-class scenarios.
In contrast, CNN-LSTM achieves an Accuracy and F1-Score of 99.38% and 99.20%, respectively. These figures are slightly lower than those of RNNs. This discrepancy indicates that merely combining local convolutional features with temporal information has certain limitations. Specifically, this combination lacks sufficient category discrimination capability. The GAN model performs the least satisfactorily. Although its Accuracy and Recall approach 99.5%, its F1-Score is only 99.32%. This score is significantly lower than that of other models. This gap likely stems from the limitations of GAN in sample generation and class boundary delineation. Consequently, the model struggles to achieve stable classification performance under complex multi-class distributions.
Notably, some comparison models, such as Bi-LSTM and GRU, approach the Precision or Recall of the proposed model. However, they still lag in composite metrics. The proposed approach achieves the highest Precision of 99.86%. Simultaneously, it maintains the optimal Recall of 99.85%. This dual advantage demonstrates the ability to effectively distinguish between different categories. Furthermore, the method minimizes both false negatives and false positives in multi-class scenarios. Ultimately, the improvement in F1-Score fully validates the overall superiority of the model in multi-class classification tasks.

4.5. Validation Set Results

During model training, the validation set serves as a crucial tool for evaluating model generalization capabilities. It assesses model performance on unseen data. Additionally, the process guides model tuning. This study employs the NSL-KDD dataset as the validation set. The experiment utilizes the same training workflow and hyperparameter tuning strategy as applied to the InSDN Dataset. This approach further validates the effectiveness of the proposed model.
(1) Binary Classification Task
For the binary classification task, the proposed method demonstrates outstanding performance on the validation set. Table 8 presents the specific results. The proposed model achieves an accuracy of 99.78%, precision of 99.90%, recall of 99.69%, and F1-score of 99.80%. Compared to the cuDNN-LSTM and CNN-LSTM models, the proposed method improves accuracy by 0.15% and 0.06%, respectively. In terms of precision, the model outperforms cuDNN-LSTM by 0.17%. However, it is marginally lower than CNN-LSTM by 0.03%. Furthermore, the F1-score outperforms both cuDNN-LSTM and CNN-LSTM. This result indicates that the proposed approach achieves superior accuracy. Moreover, it demonstrates greater robustness in balancing precision and recall.
(2) Multi-Class Classification Task
For the multi-class classification task, the proposed method also demonstrates outstanding performance. Table 9 presents the specific results. On the validation set, the model achieves an accuracy of 99.76%, precision of 99.34%, recall of 99.79%, and F1-score of 99.56%. Specifically, the accuracy of the proposed method is 0.13% higher than that of cuDNN-LSTM and 0.35% higher than that of CNN-LSTM. Additionally, the F1-score improves by 0.24% and 0.59%, respectively. These results demonstrate that the proposed model excels not only in binary classification tasks but also in multi-class classification tasks. This adaptability highlights the significant advantages of the approach.
In summary, the validation set results further confirm the efficiency and broad applicability of this method. The data demonstrates the ability of the model to deliver outstanding performance consistently across diverse tasks. Consequently, the proposed approach possesses strong practical value.
(3) Simulation Experiment
To evaluate the practical applicability and operational feasibility of the proposed model, a prototype system was deployed in a virtualized SDN environment. The specific structure is shown in Figure 3. The control plane, managed by the Ryu controller (version 4.3.4), communicates with the data plane via the OpenFlow v1.3 protocol. The topology consists of two core switches (S1, S2), four aggregation layer switches (S3–S6), and six edge switches (S7–S12). Additionally, twelve hosts (H1–H12) connect to this infrastructure, implemented within Mininet (version 2.3.0). The entire simulation was hosted on an Ubuntu 20.04 LTS virtual machine allocated with 4 vCPUs and 16 GB of RAM.
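The node inventory of this topology can be written down as plain data, as in the sketch below. The node counts come from the text; the wiring between layers is not specified there, so the link layout (every aggregation switch uplinked to both core switches, edge switches spread round-robin over the aggregation layer, two hosts per edge switch) is an assumption made for illustration.

```python
from typing import List, Tuple

Link = Tuple[str, str]


def build_topology() -> Tuple[List[str], List[str], List[Link]]:
    """Return (switches, hosts, links) for the three-layer topology."""
    core = ["s1", "s2"]
    agg = [f"s{i}" for i in range(3, 7)]
    edge = [f"s{i}" for i in range(7, 13)]
    hosts = [f"h{i}" for i in range(1, 13)]

    links: List[Link] = []
    # Assumed wiring: every aggregation switch uplinks to both core switches.
    links += [(c, a) for c in core for a in agg]
    # Assumed wiring: edge switches are spread round-robin over the aggregation layer.
    links += [(agg[i % len(agg)], e) for i, e in enumerate(edge)]
    # Assumed wiring: two hosts attach to each edge switch.
    links += [(edge[i // 2], h) for i, h in enumerate(hosts)]
    return core + agg + edge, hosts, links


switches, hosts, links = build_topology()
print(len(switches), len(hosts), len(links))
```

In a Mininet script, these names and pairs could be passed to `addSwitch`, `addHost`, and `addLink` of a `Topo` subclass to instantiate the network.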
To ensure a rigorous evaluation of real-time operational performance, diverse abnormal traffic patterns were simulated by carefully defining their type, structure, and velocity to match the constraints of a virtualized network stack. Specifically, DoS attacks were simulated using Hping3 on host H1 to generate TCP SYN flood traffic. This attack involved a randomized source IP structure with a velocity maintained at approximately 1000 to 3000 packets per second, a range specifically chosen to test the stress limits of the Ryu controller’s asynchronous message handler without inducing an immediate system crash. For DDoS scenarios, a coordinated distributed UDP flood was launched from both H1 and H2 using LOIC, generating a combined throughput of 120–180 Mbps. This volume is sufficient to saturate the virtual link bandwidth and trigger sustained Packet-in requests, thereby simulating the link congestion typically observed during volumetric attacks. Additionally, probe attacks were executed from H1 using Nmap to perform aggressive port scanning and OS fingerprinting, creating a complex structure of TCP/UDP packets across a wide range of ports to evaluate the model’s sensitivity to low-volume reconnaissance patterns. Concurrently, Iperf3 was employed between the remaining hosts to maintain a steady-state normal background traffic of approximately 40–60 Mbps, providing a realistic baseline for evaluating the model’s discriminative capability under mixed traffic conditions. The detection model is integrated into the Ryu controller as an asynchronous detection module. Considering the resource constraints of the virtual machine environment, the quantitative performance evaluation is summarized in Table 10.
To further evaluate the model’s discriminative capability across different threat categories in a dynamic environment, the specific classification performance for each simulated traffic type was recorded. Unlike the offline tests on static datasets, the simulation involves transient feature fluctuations and background noise from normal network operations. The classification results, summarized in Table 11, demonstrate that the proposed model maintains high precision and recall even under the resource constraints and timing sensitivities of the Ryu controller.
The experimental results presented in Table 10 and Table 11 provide a comprehensive validation of the model’s operational feasibility and robustness in a real-world SDN simulation. The performance data in Table 10 reveal that the system effectively manages a throughput of over 420 Mbps with a manageable CPU overhead, confirming that the gated fusion architecture is efficient enough to operate alongside core controller functions without inducing systemic instability. This computational overhead is primarily driven by the CNN-SE feature extraction branch and the continuous gating computation during high-load scenarios. Although the detection latency of 24.8 ms is slightly higher than that of hardware-based deployments, it remains well within the acceptable threshold for mitigating rapid-fire attacks like SYN floods before they can fully compromise switch flow tables. Furthermore, the analysis of Table 11 indicates that while the precision and recall metrics across different attack categories show a slight decrease compared to the saturated results of the offline evaluation, this reflects a more realistic assessment of the model’s performance. This marginal decline is primarily attributed to the transient nature of flow features and the inevitable noise generated by concurrent normal background traffic, which can occasionally obscure the distinct signatures of low-volume probe attacks or high-velocity bursts.
The significant performance in detecting DDoS and DoS attacks, despite the high-load conditions of the virtualized environment, underscores the effectiveness of the gated fusion mechanism in extracting critical features from saturated traffic streams. To further mitigate resource consumption in large-scale deployments, future iterations could incorporate model quantization or a selective activation mechanism to engage the deep branch only for borderline samples. By achieving high accuracy and low packet drop rates under simulated stress, the prototype demonstrates that it can bridge the gap between theoretical deep learning research and practical network defense. Consequently, these findings suggest that the proposed method offers a reliable, low-overhead solution for real-time threat detection, providing evidence that it is capable of maintaining network integrity in the face of diverse and evolving cybersecurity threats.

5. Conclusions

This paper proposes a novel anomalous traffic detection framework based on improved gated feature fusion in an SDN environment. The proposed methodology addresses the detection accuracy bottlenecks caused by single-dimensional feature representations and rigid fusion strategies in existing methods. First, the data preprocessing module executes cleaning, Min-Max normalization, and label encoding on raw traffic data to construct a standardized input space. Building upon this foundation, the feature engineering phase employs a dual-track parallel strategy. The first track utilizes an XGBoost model combined with the RFE algorithm to select the top 45 most discriminative shallow statistical features, thereby preserving key macro-level network behavioral attributes. Simultaneously, the second track deploys a 1D-CNN with an SE block to extract deep temporal semantic features, enhancing the representation capability for complex attack patterns. Subsequently, the framework integrates these heterogeneous features via an improved gated fusion mechanism that employs a gated vector generated by a Sigmoid activation function. Consequently, the system adaptively adjusts the fusion weights based on the contextual characteristics of input samples, effectively mitigating information redundancy and noise interference inherent in static fusion. Finally, the resulting weighted vectors are fed into an MLP classifier to achieve high-precision anomaly identification.
Experimental analysis demonstrates that the proposed method exhibits competitive performance across multiple key metrics. Test results on both InSDN and NSL-KDD datasets indicate that this approach yields improvements over several existing models, such as CNN-LSTM and GAN. Specifically, the method excels in recall and F1-scores, which directly reflects a significant reduction in the false negative rate for anomalous samples. Additionally, ablation experiments confirm the value of the gating mechanism, demonstrating that the dynamic fusion of heterogeneous features effectively enhances model consistency within complex network environments.
The prototype deployment on Mininet and the Ryu controller provides an assessment of the model’s operational potential in simulated SDN environments. The system demonstrates robust detection capabilities across diverse threats, maintaining F1-scores between 91.5% and 96.4%. Under high-intensity attack scenarios, the prototype efficiently processes traffic at 420.5 Mbps with a detection latency of 24.8 ms, while incurring a peak CPU utilization of 38.6% and a 412 MB memory footprint.
In summary, the proposed method reduces false negative rates while maintaining high detection accuracy, offering a viable approach for enhancing security defense in SDN environments. Future work will focus on optimizing computational overhead, specifically through model quantization and selective activation strategies, to further improve real-time efficiency in resource-constrained scenarios. Furthermore, we aim to evaluate the framework’s generalization across broader, real-world network conditions.

Author Contributions

Conceptualization, R.G. and X.W.; methodology, R.G.; validation, R.G., F.C. and G.Y.; formal analysis, S.L. and P.Q.; data curation, F.C., S.L. and R.G.; writing—original draft preparation, R.G. and G.Y.; writing—review and editing, R.G. and X.W.; project administration, R.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Innovation Program for Postgraduate students in IDP subsidized by Fundamental Research Funds for the Central Universities (Grant No. ZY20260317). In addition, this work was supported in part by the Research on Feasibility Schemes and Simulation Environment Design for Unmanned Emergency Rescue in the Complex Environments of Northern Guangdong; Research on the Reform of Comprehensive Practical Teaching Content for AI-Empowered Network and Information Security Innovation and Entrepreneurship (Grant No. JY2025B01); and the 2025 Hebei Provincial Innovation and Entrepreneurship Course (Specialty-Innovation Integration Course) Project “Network and Information Security” (Grant No. 2025cxkc183).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kreutz, D.; Ramos, F.M.; Verissimo, P.E.; Rothenberg, C.E.; Azodolmolky, S.; Uhlig, S. Software-defined networking: A comprehensive survey. Proc. IEEE 2015, 103, 14–76.
  2. Srivastava, A.; Gupta, B.B.; Tyagi, A.; Sharma, A.; Mishra, A. A recent survey on DDoS attacks and defense mechanisms. In Advances in Parallel Distributed Computing; Communications in Computer and Information Science (CCIS); Springer: Berlin/Heidelberg, Germany, 2011; Volume 203, pp. 570–580.
  3. Singh, S.; Sharma, P.K.; Moon, S.Y.; Moon, D.; Park, J.H. A comprehensive study on APT attacks and countermeasures for future networks and communications: Challenges and solutions. J. Supercomput. 2019, 75, 4543–4574.
  4. Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20.
  5. Yu, Y.; Long, J.; Cai, Z. Network intrusion detection through stacking dilated convolutional autoencoders. Secur. Commun. Netw. 2017, 2017, 4184196.
  6. Yin, C.; Zhu, Y.; Fei, J.; He, X. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 2017, 5, 21954–21961.
  7. Braga, R.; Mota, E.; Passito, A. Lightweight DDoS flooding attack detection using NOX/OpenFlow. In Proceedings of the 2010 IEEE 35th Conference on Local Computer Networks (LCN); IEEE: Piscataway, NJ, USA, 2010; pp. 408–415.
  8. Aladaileh, M.; Anbar, M.; Hasbullah, I.H.; Sanjalawe, Y.K.; Chong, Y.-W. Entropy-based approach to detect DDoS attacks on software defined networking controller. Comput. Mater. Contin. 2021, 69, 373–391.
  9. Aladaileh, M.A.; Anbar, M.; Hintaw, A.J.; Hasbullah, I.H.; Bahashwan, A.A.; Al-Amiedy, T.A.; Ibrahim, D.R. Effectiveness of an entropy-based approach for detecting low- and high-rate DDoS attacks against the SDN controller: Experimental analysis. Appl. Sci. 2023, 13, 775.
  10. Song, C.; Park, Y.; Golani, K.; Kim, Y.; Bhatt, K.; Goswami, K. Machine-learning based threat-aware system in software defined networks. In Proceedings of the 26th International Conference on Computer Communications and Networks; IEEE: Piscataway, NJ, USA, 2017; pp. 1–9.
  11. Kokila, R.; Selvi, S.T.; Govindarajan, K. DDoS detection and analysis in SDN-based environment using support vector machine classifier. In Proceedings of the 6th International Conference on Advanced Computing (ICoAC); IEEE: Piscataway, NJ, USA, 2014; pp. 205–210.
  12. Isa, M.M.; Mhamdi, L. Native SDN intrusion detection using machine learning. In Proceedings of the IEEE Eighth International Conference on Communications and Networking (ComNet 2020); IEEE: Piscataway, NJ, USA, 2020; pp. 1–7.
  13. Lent, D.M.B.; da Silva Ruffo, V.G.; Carvalho, L.F.; Lloret, J.; Rodrigues, J.J.P.C.; Proença, M.L. An unsupervised generative adversarial network system to detect DDoS attacks in SDN. IEEE Access 2024, 12, 70690–70706.
  14. Chaganti, R.; Suliman, W.; Ravi, V.; Dua, A. Deep learning approach for SDN-enabled intrusion detection system in IoT networks. Information 2023, 14, 41.
  15. Ataa, M.S.; Sanad, E.E.; El-Khoribi, R.A. Intrusion detection in software defined network using deep learning approaches. Sci. Rep. 2024, 14, 8830.
  16. Saheed, Y.K.; Kehinde, T.O.; Raji, M.A.; Baba, U.A. Feature selection in intrusion detection systems: A new hybrid fusion of Bat algorithm and Residue Number System. J. Inf. Telecommun. 2024, 8, 189–207.
  17. Damtew, Y.G.; Chen, H.; Yuan, Z. Heterogeneous ensemble feature selection for network intrusion detection system. Int. J. Comput. Intell. Syst. 2023, 16, 165.
  18. Ramkumar, M.; Reddy, P.B.; Thirukrishna, J.; Vidyadhari, C. Intrusion detection in big data using hybrid feature fusion and optimization enabled deep learning based on spark architecture. Comput. Secur. 2022, 116, 102668.
  19. Elsayed, M.S.; Le-Khac, N.-A.; Jurcut, A.D. InSDN: A novel SDN intrusion dataset. IEEE Access 2020, 8, 165263–165284.
  20. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA); IEEE: Piscataway, NJ, USA, 2009; pp. 1–6.
  21. Shanmugam, V.; Razavi-Far, R.; Hallaji, E. Addressing class imbalance in intrusion detection: A comprehensive evaluation of machine learning approaches. Electronics 2025, 14, 69.
  22. Meng, D.; Li, Y. An imbalanced learning method by combining SMOTE with Center Offset Factor. Appl. Soft Comput. 2022, 120, 108618.
  23. Al Razib, M.; Javeed, D.; Khan, M.T.; Alkanhel, R.; Muthanna, M.S.A. Cyber threats detection in smart environments using SDN-enabled DNN-LSTM hybrid framework. IEEE Access 2022, 10, 53015–53026.
  24. Zabeehullah; Arif, F.; Haq, Q.M.U.; Khan, N.A.; Din, I.U.; Almogren, A.; Khan, M.A.; Alsaleh, O.; Guizani, M. Hybrid CNN-LSTM model for DDoS attack detection in Internet of Things-based healthcare industry 5.0. IEEE Internet Things J. 2025, 12, 46075–46082.
  25. Assis, M.V.; Carvalho, L.F.; Lloret, J.; Proença, M.L., Jr. A GRU deep learning system against attacks in software defined networks. J. Netw. Comput. Appl. 2021, 177, 102892.
  26. Alotaibi, J. A hybrid software-defined networking approach for enhancing IoT cybersecurity with deep learning and blockchain in smart cities. Peer-Peer Netw. Appl. 2025, 18, 123.
Figure 1. Technical workflow of the proposed model.
Figure 2. Comparison of the Number of Selected Features.
Figure 3. Experimental Network Topology.
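Table 1. Summary of Related Work.
| Ref. | Year | Model | Dataset | No. of Features | Highest Accuracy (Binary) | Highest Accuracy (Multi-Class) |
| --- | --- | --- | --- | --- | --- | --- |
| [7] | 2010 | SOM-based detection | Self-collected dataset | 6 | 98.61% | - |
| [8] | 2021 | GEADDDC (Rényi joint entropy-based detection) | Self-collected dataset | 2 | 99.72% | - |
| [9] | 2023 | Entropy-based detection | Self-collected dataset | - | 97% | - |
| [10] | 2017 | Decision Tree | KDD99 | 10 | 82.48% | - |
| [10] | 2017 | Decision Tree | KDD99 | 15 | 91.17% | - |
| [11] | 2014 | SVM | DARPA2000 | - | - | 95.11% |
| [12] | 2017 | Autoencoder and random forest | NSL-KDD | - | - | 98.40% |
| [13] | 2024 | GAN | CICIDS2019 | 6 | 99.45% | - |
| [14] | 2023 | LSTM | DS1 | 25 | 97.70% | 97.10% |
| [15] | 2024 | CNN-LSTM | InSDN | 48 | - | 99.08% |
| [15] | 2024 | CNN-LSTM | InSDN | 6 | - | 99.19% |
| [15] | 2024 | CNN-LSTM | InSDN | 25 | - | 98.86% |
| [15] | 2024 | Transformer | InSDN | 48 | - | 99.16% |
| [15] | 2024 | Transformer | InSDN | 6 | - | 99.16% |
| [15] | 2024 | Transformer | InSDN | 25 | - | 98.93% |
| [16] | 2023 | Bat-RNS+PCA+NB | NSL-KDD | 16 | - | 97.82% |
| [17] | 2023 | Heterogeneous Ensemble Feature Selection (HEFS) | NSL-KDD | 10 | 99.6% | - |
| [18] | 2022 | RV coefficient + ExpSLO-based DRN | MQTT; Apache web server 2021 | - | 87.25% | - |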
Table 2. InSDN Dataset Details.
| No. | Type | Description | Count |
| --- | --- | --- | --- |
| 1 | DDoS | Distributed Denial of Service | 121,942 |
| 2 | Probe | Probe Attack | 98,129 |
| 3 | Normal | Normal Traffic | 68,424 |
| 4 | DoS | Denial of Service | 53,616 |
| 5 | BFA | Brute Force Attack | 1405 |
| 6 | Web-Attack | Web Attack | 192 |
| 7 | BOTNET | Botnet | 164 |
| 8 | U2R | User-to-Root Attack | 17 |
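The class counts in Table 2 are heavily skewed, and the skew can be quantified directly. The following is a minimal sketch, not part of the authors' pipeline: it computes each class share, the majority-to-minority ratio, and inverse-frequency class weights. The weighting formula, total / (number of classes × class count), is a common heuristic assumed here purely for illustration.

```python
# Class counts copied from Table 2 (InSDN dataset).
insdn_counts = {
    "DDoS": 121_942,
    "Probe": 98_129,
    "Normal": 68_424,
    "DoS": 53_616,
    "BFA": 1_405,
    "Web-Attack": 192,
    "BOTNET": 164,
    "U2R": 17,
}

total = sum(insdn_counts.values())
n_classes = len(insdn_counts)

# Ratio between the largest and the smallest class.
imbalance_ratio = max(insdn_counts.values()) / min(insdn_counts.values())

# Inverse-frequency weights (illustrative heuristic, not taken from the paper):
# rare classes receive proportionally larger weights.
class_weights = {
    name: total / (n_classes * count) for name, count in insdn_counts.items()
}

for name, count in insdn_counts.items():
    share = 100.0 * count / total
    print(f"{name:<11}{count:>8}  {share:7.3f}%  weight={class_weights[name]:.3f}")
print(f"total={total}, majority/minority ratio={imbalance_ratio:.0f}:1")
```

Under these counts the dataset holds 343,889 flows, of which U2R contributes only 17, so per-class metrics are more informative than overall accuracy for the rare attack types.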
Table 3. Selected Top-45 Features.
| Rank | Feature Name | Description |
| --- | --- | --- |
| 1 | Protocol | Type of protocol, e.g., tcp, udp, etc. |
| 2 | Pkt Len Max | Max of the length of a packet |
| 3 | Fwd Header Len | Total bytes used for headers in the forward direction |
| 4 | Tot Fwd Pkts | Total packets in the forward direction |
| 5 | Dst Port | Destination port number |
| 6 | TotLen Fwd Pkts | Total size of packet in forward direction |
| 7 | Bwd Header Len | Total bytes used for headers in the backward direction |
| 8 | Init Bwd Win Byts | The total number of bytes sent in initial window in the backward direction |
| 9 | SYN Flag Cnt | Number of packets with SYN |
| 10 | Fwd Pkts/s | Number of forward packets per second |
| 11 | Fwd Pkt Len Min | Min of the size of packet in forward direction |
| 12 | Bwd Pkt Len Mean | Mean of the size of packet in backward direction |
| 13 | Fwd Pkt Len Mean | Mean of the size of packet in forward direction |
| 14 | Bwd IAT Max | Max of the time between two packets sent in the backward direction |
| 15 | Idle Mean | Mean time flow was idle before becoming active |
| 16 | TotLen Bwd Pkts | Total size of packet in backward direction |
| 17 | Pkt Size Avg | Average size of packet |
| 18 | Bwd PSH Flags | Number of times the PSH flag was set in packets travelling in the backward direction (0 for UDP) |
| 19 | Active Min | Min of the time flow was active before becoming idle |
| 20 | Pkt Len Std | Standard deviation of the length of a packet |
| 21 | RST Flag Cnt | Number of packets with RST |
| 22 | Bwd Pkt Len Std | Standard deviation of the size of packet in backward direction |
| 23 | Flow IAT Min | Min of the time between two packets sent in the flow |
| 24 | Tot Bwd Pkts | Total packets in the backward direction |
| 25 | Bwd Pkts/s | Number of backward packets per second |
| 26 | Pkt Len Mean | Mean of the length of a packet |
| 27 | Bwd IAT Std | Standard deviation of the time between two packets sent in the backward direction |
| 28 | Src Port | Source port number |
| 29 | Idle Max | Max time flow was idle before becoming active |
| 30 | Pkt Len Min | Min of the length of a packet |
| 31 | Flow Pkts/s | Number of flow packets per second |
| 32 | Bwd Pkt Len Max | Max of the size of packet in backward direction |
| 33 | Fwd IAT Min | Min of the time between two packets sent in the forward direction |
| 34 | Flow IAT Std | Standard deviation of the time between two packets sent in the flow |
| 35 | Fwd IAT Mean | Mean of the time between two packets sent in the forward direction |
| 36 | Flow Byts/s | Number of flow bytes per second |
| 37 | Idle Min | Min time flow was idle before becoming active |
| 38 | FIN Flag Cnt | Number of packets with FIN |
| 39 | Flow Duration | Duration of the flow in microseconds |
| 40 | Fwd Pkt Len Std | Standard deviation of the size of packet in forward direction |
| 41 | Fwd IAT Max | Max of the time between two packets sent in the forward direction |
| 42 | Fwd Act Data Pkts | Count of packets with at least 1 byte of TCP data payload in the forward direction |
| 43 | Bwd Pkt Len Min | Min of the size of packet in backward direction |
| 44 | Bwd IAT Tot | Total of the time between two packets sent in the backward direction |
| 45 | Fwd Pkt Len Max | Max of the size of packet in forward direction |
Table 4. Features excluded from the optimal subset.
| Rank | Feature Name | Description |
| --- | --- | --- |
| 46 | Active Mean | Mean of the time flow was active before becoming idle |
| 47 | ACK Flag Cnt | Number of packets with ACK |
| 48 | Fwd IAT Tot | Total of the time between two packets sent in the forward direction |
| 49 | Down/Up Ratio | Download and upload ratio |
| 50 | Flow IAT Mean | Mean of the time between two packets sent in the flow |
| 51 | Flow IAT Max | Max of the time between two packets sent in the flow |
| 52 | Bwd IAT Min | Min of the time between two packets sent in the backward direction |
| 53 | Active Max | Max of the time flow was active before becoming idle |
| 54 | Active Std | Standard deviation of the time flow was active before becoming idle |
| 55 | Bwd IAT Mean | Mean of the time between two packets sent in the backward direction |
Table 5. Comparison of Ablation Experiments.
| Index | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- | --- |
| 1 | XGBoost-MLP | 97.27 ± 0.12 | 97.58 ± 0.15 | 96.74 ± 0.18 | 97.13 ± 0.14 |
| 2 | CNN-MLP | 95.64 ± 0.25 | 95.51 ± 0.28 | 93.24 ± 0.35 | 94.15 ± 0.31 |
| 3 | Concatenation fusion | 99.61 ± 0.06 | 99.45 ± 0.08 | 99.53 ± 0.07 | 99.49 ± 0.07 |
| 4 | Fixed weight fusion | 99.54 ± 0.08 | 99.38 ± 0.09 | 99.43 ± 0.10 | 99.41 ± 0.09 |
| 5 | Random weight fusion | 99.78 ± 0.05 | 99.73 ± 0.06 | 99.72 ± 0.06 | 99.73 ± 0.05 |
| 6 | This model | 99.88 ± 0.02 | 99.86 ± 0.03 | 99.85 ± 0.02 | 99.86 ± 0.02 |
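Rows 3 to 6 of Table 5 differ only in how the shallow and deep feature vectors are combined. The sketch below contrasts a fixed-weight combination with an input-dependent, per-dimension gate. It is a minimal illustration under stated assumptions, not the authors' implementation: the gate form g = sigmoid(W[s; d] + b), the 64-dimensional deep vector, the 8-dimensional shared space, and the random untrained weights are all hypothetical. Only the count of 45 shallow features comes from Table 3.

```python
import math
import random


def linear(x, weights, bias):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Random, untrained parameters standing in for learned weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def blend(s, d, gate):
    """Per-dimension convex combination of two aligned vectors."""
    return [g * a + (1.0 - g) * b for g, a, b in zip(gate, s, d)]


def gated_fusion(shallow, deep, params):
    # Linear projections align both feature types to a shared dimension.
    s = linear(shallow, *params["proj_shallow"])
    d = linear(deep, *params["proj_deep"])
    # Assumed gate: sigmoid of an affine map over the concatenated projections.
    gate = [1.0 / (1.0 + math.exp(-z)) for z in linear(s + d, *params["gate"])]
    return blend(s, d, gate), gate, s, d


rng = random.Random(0)
N_SHALLOW, N_DEEP, DIM = 45, 64, 8  # 45 from Table 3; 64 and 8 are hypothetical
params = {
    "proj_shallow": init_layer(rng, DIM, N_SHALLOW),
    "proj_deep": init_layer(rng, DIM, N_DEEP),
    "gate": init_layer(rng, DIM, 2 * DIM),
}
shallow = [rng.random() for _ in range(N_SHALLOW)]  # stand-in statistical features
deep = [rng.random() for _ in range(N_DEEP)]        # stand-in CNN features

fused, gate, s, d = gated_fusion(shallow, deep, params)
fixed = blend(s, d, [0.5] * DIM)  # fixed-weight baseline: same weight everywhere

print("gate values :", [round(g, 3) for g in gate])
print("gated fusion:", [round(v, 3) for v in fused])
print("fixed fusion:", [round(v, 3) for v in fixed])
```

In this formulation, holding every gate value constant recovers fixed-weight fusion, so the gated variant generalizes that baseline by letting the mixing weight depend on the input.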
Table 6. Binary Classification Results on InSDN Dataset.
| Reference | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- | --- |
| [13] | GAN | 89.25 ± 0.45 | 65.37 ± 0.85 | 98.37 ± 0.32 | 78.54 ± 0.65 |
| [23] | cuDNN-LSTM | 99.75 ± 0.06 | 99.94 ± 0.02 | 99.80 ± 0.05 | 99.37 ± 0.08 |
| [24] | CNN-LSTM | 99.63 ± 0.08 | 99.39 ± 0.10 | 98.78 ± 0.12 | 99.08 ± 0.11 |
| [25] | GRU | 99.72 ± 0.07 | 99.91 ± 0.03 | 98.68 ± 0.09 | 99.29 ± 0.07 |
| [26] | Bi-LSTM | 99.78 ± 0.05 | 99.90 ± 0.04 | 98.99 ± 0.06 | 99.45 ± 0.05 |
|  | This model | 99.91 ± 0.02 | 99.63 ± 0.03 | 99.94 ± 0.02 | 99.78 ± 0.03 |
Table 7. Multi-Class Classification Results on the InSDN Dataset.
| Reference | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- | --- |
| [13] | GAN | 99.47 ± 0.08 | 99.25 ± 0.10 | 99.40 ± 0.09 | 99.32 ± 0.09 |
| [23] | cuDNN-LSTM | 99.68 ± 0.04 | 99.55 ± 0.05 | 99.61 ± 0.04 | 99.58 ± 0.05 |
| [24] | CNN-LSTM | 99.38 ± 0.09 | 99.11 ± 0.12 | 99.31 ± 0.10 | 99.20 ± 0.11 |
| [25] | GRU | 99.73 ± 0.05 | 99.62 ± 0.06 | 99.67 ± 0.05 | 99.64 ± 0.06 |
| [26] | Bi-LSTM | 99.75 ± 0.04 | 99.66 ± 0.05 | 99.69 ± 0.04 | 99.67 ± 0.05 |
|  | This model | 99.88 ± 0.02 | 99.86 ± 0.03 | 99.85 ± 0.02 | 99.86 ± 0.02 |
Table 8. Binary Classification Results on the NSL-KDD Dataset.
| Reference | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- | --- |
| [13] | GAN | 93.46 ± 0.35 | 97.56 ± 0.42 | 90.13 ± 0.55 | 93.70 ± 0.48 |
| [23] | cuDNN-LSTM | 99.63 ± 0.08 | 99.73 ± 0.06 | 99.58 ± 0.10 | 99.66 ± 0.08 |
| [24] | CNN-LSTM | 99.72 ± 0.06 | 99.93 ± 0.04 | 99.55 ± 0.08 | 99.74 ± 0.06 |
| [25] | GRU | 99.68 ± 0.07 | 99.75 ± 0.05 | 99.65 ± 0.09 | 99.70 ± 0.07 |
| [26] | Bi-LSTM | 99.53 ± 0.09 | 99.75 ± 0.07 | 99.37 ± 0.12 | 99.56 ± 0.10 |
|  | This model | 99.78 ± 0.03 | 99.90 ± 0.04 | 99.69 ± 0.04 | 99.80 ± 0.05 |
Table 9. Multi-Class Classification Results on the NSL-KDD Dataset.
| Reference | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- | --- |
| [13] | GAN | 97.04 ± 0.25 | 96.04 ± 0.38 | 95.28 ± 0.42 | 95.65 ± 0.40 |
| [23] | cuDNN-LSTM | 99.63 ± 0.08 | 99.21 ± 0.12 | 99.44 ± 0.10 | 99.32 ± 0.11 |
| [24] | CNN-LSTM | 99.41 ± 0.10 | 98.84 ± 0.15 | 99.11 ± 0.13 | 98.97 ± 0.14 |
| [25] | GRU | 99.71 ± 0.06 | 99.43 ± 0.08 | 99.66 ± 0.07 | 99.55 ± 0.07 |
| [26] | Bi-LSTM-HBA | 99.57 ± 0.09 | 99.23 ± 0.10 | 99.35 ± 0.09 | 99.29 ± 0.09 |
|  | This model | 99.76 ± 0.03 | 99.34 ± 0.05 | 99.79 ± 0.03 | 99.56 ± 0.04 |
Table 10. System Performance in a Virtualized Environment.
| Metric | Description | Value (Mean ± Std) |
| --- | --- | --- |
| Throughput | Maximum bandwidth processed in the VM environment | 420.5 ± 25.8 Mbps |
| Detection Latency | End-to-end processing time | 24.8 ± 3.2 ms |
| Inference Overhead | Peak CPU utilization of the Ryu process during attack | 38.6% ± 5.4% |
| Memory Footprint | Additional RAM used by the detection module | 412 ± 35 MB |
| Packet Drop Rate | Packet loss under high-frequency Packet-in requests | 0.85% ± 0.12% |
Table 11. Classification Performance across Categories.
| Type | Tool | Precision (%) | Recall (%) | F1-Score (%) |
| --- | --- | --- | --- | --- |
| Normal | Iperf3 | 96.4 ± 1.2 | 95.8 ± 0.8 | 96.1 ± 1.0 |
| DoS | Hping3 | 94.2 ± 1.5 | 93.5 ± 1.8 | 93.9 ± 1.6 |
| DDoS | LOIC | 92.7 ± 2.1 | 91.4 ± 2.4 | 92.1 ± 2.2 |
| Probe | Nmap | 91.5 ± 2.8 | 90.2 ± 3.1 | 90.8 ± 2.9 |
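The per-category scores in Table 11 can be cross-checked against the definition F1 = 2PR / (P + R). The sketch below recomputes F1 from the reported mean precision and recall and averages precision over the four categories. All values are copied from Table 11. Because the table reports means over repeated runs, an F1 recomputed from mean precision and recall may differ slightly from the reported mean F1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# (mean precision %, mean recall %, reported mean F1 %) from Table 11.
table_11 = {
    "Normal": (96.4, 95.8, 96.1),
    "DoS": (94.2, 93.5, 93.9),
    "DDoS": (92.7, 91.4, 92.1),
    "Probe": (91.5, 90.2, 90.8),
}

for name, (p, r, reported) in table_11.items():
    recomputed = f1_score(p, r)
    print(f"{name:<7} recomputed F1={recomputed:.2f}  reported F1={reported:.1f}")

# Unweighted mean of the per-category precision values.
mean_precision = sum(p for p, _, _ in table_11.values()) / len(table_11)
print(f"mean precision={mean_precision:.1f}%")
```

The recomputed values agree with the reported F1-scores to within 0.1 percentage points, and the unweighted mean precision comes to 93.7%, consistent with the abstract's statement of an average precision above 93%.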
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Gu, R.; Wang, X.; Cui, F.; Yang, G.; Liu, S.; Qi, P. An Improved Method for Anomalous Traffic Detection in SDN Based on Gated Feature Fusion. Future Internet 2026, 18, 270. https://doi.org/10.3390/fi18050270
