Enhancing Network Intrusion Detection Under Class Imbalance Using a Three-Discriminator Generative Adversarial Network

Kim, Taesu; Park, Hyoseong; Shin, Dongil; Shin, Dongkyoo

doi:10.3390/electronics15061253


¹ Department of Computer Engineering, Sejong University, Seoul 05006, Republic of Korea

² Department of Convergence Engineering for Intelligent Drones, Sejong University, Seoul 05006, Republic of Korea

* Author to whom correspondence should be addressed.

Electronics 2026, 15(6), 1253; https://doi.org/10.3390/electronics15061253

Submission received: 24 February 2026 / Revised: 12 March 2026 / Accepted: 16 March 2026 / Published: 17 March 2026

(This article belongs to the Special Issue Machine Learning and Cybersecurity—Trends and Future Challenges, 2nd Edition)


Abstract

Network Intrusion Detection Systems (NIDS) play a crucial role in protecting network environments against cyberattacks. However, traditional NIDS rely heavily on predefined attack signatures, which limits their ability to detect zero-day attacks. Although machine learning-based intrusion detection techniques have been widely adopted in Network Intrusion Prevention Systems (NIPS), publicly available network traffic datasets often suffer from severe class imbalance, leading to biased learning and degraded detection performance. To address this issue, this study proposes a data augmentation framework based on a 3D-GAN (Three-Discriminator Generative Adversarial Network). The proposed architecture integrates an autoencoder, a CNN (Convolutional Neural Network), and an LSTM (Long Short-Term Memory) network as parallel discriminators to capture the statistical, spatial, and temporal characteristics of network traffic. By jointly optimizing multiple discriminator losses, the framework enhances training stability and generates high-quality synthetic samples. Experiments were conducted on the CIC-UNSW-NB15 dataset using Random Forest-, XGBoost (eXtreme Gradient Boosting)-, and BiGRU (Bidirectional Gated Recurrent Unit)-based classifiers. Two augmented datasets were constructed to address class imbalance, containing approximately 100,000 and 350,000 samples, respectively. Among them, Dataset 2, augmented using the proposed 3D-GAN, demonstrated the most significant performance improvement. Compared to the original imbalanced dataset, the XGBoost classifier trained on Dataset 2 achieved approximately a 4% increase in both accuracy and F1-score, while reducing the false positive rate and false negative rate by approximately 3.5%. Furthermore, the optimal configuration attained an F1-score of 0.9816, indicating superior capability in modeling complex network traffic patterns.
Overall, this study highlights the potential of GAN-based data augmentation for alleviating class imbalance and improving the robustness and generalization of intrusion detection systems.

Keywords:

network traffic data; intrusion prevention; data augmentation

1. Introduction

With the rapid expansion of Internet services and network infrastructure, network connectivity has become ubiquitous, increasing exposure to security threats [1]. Network intrusions violate fundamental security objectives, including confidentiality, integrity, and privacy, making Network Intrusion Prevention Systems (NIPS) essential for modern network protection [2].

Traditional intrusion detection systems mainly rely on signature-based techniques that match predefined attack patterns. Although effective for known attacks, these methods are vulnerable to evasion through payload obfuscation and pattern modification. Furthermore, the increasing prevalence of zero-day attacks has significantly reduced the effectiveness of conventional approaches [3]. Zero-day attacks represent one of the most critical threats in modern cybersecurity environments because they exploit previously unknown vulnerabilities. Since signature-based detection systems rely on predefined attack patterns, they are inherently ineffective against such attacks. Therefore, data-driven intrusion detection systems that can learn complex traffic distributions have become essential for identifying previously unseen attack behaviors.

To address these limitations, machine learning-based intrusion detection methods have been widely studied. However, publicly available network traffic datasets often suffer from severe class imbalance, where normal traffic dominates and attack samples are underrepresented. In many benchmark intrusion detection datasets, several attack categories exhibit extremely low frequencies, leading to biased learning and limited practical applicability.

Various data augmentation techniques have been proposed to alleviate class imbalance. Traditional oversampling and SMOTE (Synthetic Minority Over-sampling Technique)-based methods generate samples through linear interpolation, which fails to capture nonlinear characteristics and complex feature dependencies. Recently, GAN (Generative Adversarial Network)-based approaches have been introduced for attack data generation. Nevertheless, conventional GANs employ a single discriminator, limiting their ability to simultaneously model statistical, temporal, and structural properties of network traffic.

In this study, we propose a 3D-GAN that integrates an Autoencoder (AE), a Convolutional Neural Network (CNN), and a Long Short-Term Memory (LSTM) network as parallel discriminators. This architecture enables the generator to receive complementary feedback from multiple perspectives and produce high-quality synthetic attack samples. The generated data are used to augment training datasets for Random Forest-, XGBoost-, and BiGRU-based classifiers.

The main contributions of this study are as follows:

A novel multi-discriminator GAN framework for network traffic data augmentation under severe class imbalance;
A parallel discriminator architecture that captures statistical, spatial, and temporal characteristics of network traffic;
Extensive experimental validation on an imbalanced dataset demonstrating superior performance over existing methods.

These results confirm that the proposed framework effectively enhances intrusion detection performance and improves the robustness and generalization of learning-based security systems.

2. Related Work

2.1. Network Intrusion Detection

Network intrusion detection systems are generally classified into misuse-based, anomaly-based, and hybrid approaches [4]. Misuse-based systems rely on predefined attack signatures and are effective for detecting known threats. However, they often fail to identify novel or zero-day attacks. In contrast, anomaly-based methods attempt to detect abnormal patterns that deviate from normal network behavior, enabling them to identify previously unseen attacks.

Recent studies have explored deep learning-based hybrid architectures to capture spatial and temporal features of network traffic. For example, CNN-RNN (Convolutional Neural Network–Recurrent Neural Network) frameworks have demonstrated promising performance in modeling complex traffic patterns and improving detection accuracy [5,6].

2.2. Network Traffic Datasets

Early intrusion detection studies mainly relied on the KDD99 dataset, which suffers from redundancy and severe class imbalance [7]. The NSL-KDD dataset was later introduced to alleviate these limitations by removing redundant records and improving the balance between training and testing sets [8].

More recently, UNSW-NB15 has been widely adopted as a benchmark due to its realistic traffic patterns and diverse attack categories [9]. Nevertheless, class imbalance remains a major challenge in most public datasets.

2.3. Data Augmentation for Intrusion Detection

Network intrusion detection datasets often suffer from severe class imbalance, where normal traffic samples significantly outnumber attack samples. This imbalance negatively affects the performance of machine learning-based intrusion detection systems, particularly in detecting rare attack categories and emerging threats. To address this issue, various data augmentation techniques have been proposed to increase minority class samples and improve model generalization.

Traditional oversampling techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) generate synthetic minority samples through interpolation between existing samples [10]. Although SMOTE is simple and efficient, it often fails to capture complex nonlinear relationships in high-dimensional network traffic data, which may result in unrealistic samples.

To overcome these limitations, recent studies have explored generative models for data augmentation. In particular, Generative Adversarial Networks (GANs) have demonstrated the ability to learn the underlying distribution of network traffic and generate realistic synthetic samples. GAN-based augmentation methods have been shown to improve intrusion detection performance by producing more representative minority attack samples [11].

Generative models designed for tabular data, such as CTGAN, have also shown strong capability in modeling complex feature dependencies and generating high-quality synthetic samples for imbalanced datasets [12]. However, many existing augmentation approaches still rely on relatively simple generative structures, which may limit their ability to capture diverse characteristics of network traffic data.

2.4. Generative Adversarial Networks

Generative Adversarial Networks (GANs) are widely used generative modeling frameworks consisting of a generator and a discriminator. The generator produces synthetic samples that resemble real data, while the discriminator attempts to distinguish between real and generated samples through an adversarial training process [13].

Various GAN variants have been proposed to improve training stability and generation performance. Conditional GANs (cGANs), for example, incorporate class labels to guide the generation process, enabling the model to produce class-specific samples [14].

Recent studies have applied GAN-based frameworks to cybersecurity and anomaly detection tasks. These approaches demonstrate that GAN models can generate realistic network traffic samples and improve intrusion detection performance, particularly in imbalanced datasets [15].

Another promising direction is the use of multi-discriminator architectures, which provide diverse feedback signals to the generator and help model more complex data distributions [16]. However, most GAN-based intrusion detection approaches still rely on single-discriminator structures.

To address this limitation, this study proposes a three-discriminator GAN framework integrating Autoencoder-, CNN-, and LSTM-based discriminators. Each discriminator captures different characteristics of network traffic data, enabling the generator to learn richer data representations and generate higher-quality synthetic samples for intrusion detection.

3. Proposed Three-Discriminator GAN Framework

This study is inspired by the parallel discriminator structure of MD-GAN. We propose a novel adversarial learning framework, named Three-Discriminator Generative Adversarial Network (3D-GAN), which consists of one generator and three parallel discriminators. The proposed framework is designed to effectively learn complex network traffic distributions and generate high-quality synthetic samples to alleviate data imbalance.

Conventional GANs with a single discriminator are prone to overfitting specific statistical characteristics or patterns, which often leads to mode collapse and low-quality sample generation. To address these limitations, the proposed 3D-GAN employs three discriminators specifically designed to capture different data characteristics. Each discriminator focuses on distinct aspects of network traffic, including global statistical patterns, temporal structures, and local variations with noise features.

In the proposed framework, the generated samples $x_{g} = G(z)$ are independently evaluated by three discriminators, $D_{1}$, $D_{2}$, and $D_{3}$. Let $D_{i}(x_{g})$ denote the probability that the discriminator $D_{i}$ classifies a generated sample $x_{g}$ as real. This probability is obtained through a sigmoid activation function applied to the final layer of the discriminator network. The value of $D_{i}(x_{g})$ depends on the parameters learned during training and reflects how similar the generated sample is to real data in the feature space learned by the discriminator. Furthermore, $I_{i}(x_{g})$ represents the corresponding binary decision function, as defined in Equation (1).

A majority voting mechanism is then applied, in which a generated sample is accepted only when at least two discriminators classify it as real. The voting strategy is formally expressed in Equation (2).

$$I_{i}(x_{g}) = \begin{cases} 1 \ (\text{real}), & \text{if } D_{i}(x_{g}) \geq \tau_{i} \\ 0 \ (\text{fake}), & \text{otherwise} \end{cases} \tag{1}$$

$$x_{g} \text{ is accepted} \iff \sum_{i=1}^{3} I_{i}(x_{g}) \geq 2 \tag{2}$$

The overall architecture of the proposed model based on this design is illustrated in Figure 1.
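The decision rule in Equations (1) and (2) can be illustrated with a minimal Python sketch. The threshold values and discriminator scores below are assumptions chosen for illustration; the article does not report the values of τ_i.

```python
from typing import Sequence


def indicator(prob: float, tau: float) -> int:
    """Binary decision I_i(x_g): 1 (real) if D_i(x_g) >= tau_i, else 0 (fake)."""
    return 1 if prob >= tau else 0


def accept_sample(probs: Sequence[float], taus: Sequence[float],
                  min_votes: int = 2) -> bool:
    """Majority vote: accept when at least `min_votes` discriminators say real."""
    votes = sum(indicator(p, t) for p, t in zip(probs, taus))
    return votes >= min_votes


# Hypothetical scores from the three discriminators for one generated sample.
scores = [0.71, 0.44, 0.63]
thresholds = [0.5, 0.5, 0.5]  # assumed values, not taken from the article
print(accept_sample(scores, thresholds))
```

With these example scores, two of the three discriminators vote "real", so the sample passes the vote.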

3.1. Generator Architecture

The CIC-UNSW-NB15 dataset used in this study consists of 76-dimensional tabular features. Unlike image or sequential data, tabular data lack explicit spatial and temporal local patterns. Therefore, MultiLayer Perceptron (MLP)-based architectures are more suitable than convolutional or recurrent neural networks for modeling global nonlinear relationships [17].

Accordingly, the generator is designed using a fully connected MLP architecture. To effectively capture complex nonlinear patterns and class-dependent relationships, linear layers are combined with batch normalization and LeakyReLU activation functions. Batch normalization mitigates internal covariate shift during training, while LeakyReLU preserves gradient flow in negative regions, contributing to stable optimization [18].

In addition, a Tanh activation function is applied at the output layer to constrain generated values within a range from −1 to 1, thereby enhancing training stability and computational efficiency [19].

Since this study focuses on augmenting minority attack classes without modifying normal samples, the generator adopts a conditional generation framework. Class labels are embedded and integrated into the latent vector through an embedding layer. The latent vector is progressively expanded via multiple fully connected layers, each consisting of a linear transformation followed by batch normalization and LeakyReLU activation.

The generator consists of four fully connected layers. The input latent vector is first projected to a 256-dimensional space and subsequently expanded to 512 and 1024 dimensions. The final output layer produces feature vectors with the same dimensionality as the original data. Class-conditioned synthetic samples are generated by combining label embeddings with the transformed latent representations.
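A possible PyTorch sketch of such a conditional generator is shown below. The layer widths (256, 512, 1024), the 76-dimensional output, the Tanh output, and the latent dimension of 100 follow the text. The embedding size, the concatenation of the label embedding with the latent vector, the LeakyReLU slope, and the number of classes (normal plus three attack categories) are assumptions made only for illustration.

```python
import torch
import torch.nn as nn


class ConditionalGenerator(nn.Module):
    """Illustrative MLP generator; not the authors' exact implementation."""

    def __init__(self, latent_dim=100, num_classes=4, embed_dim=16, out_dim=76):
        super().__init__()
        # embed_dim and num_classes are assumed values.
        self.label_embedding = nn.Embedding(num_classes, embed_dim)

        def block(in_features, out_features):
            # Linear -> BatchNorm -> LeakyReLU, as described in the text.
            return [
                nn.Linear(in_features, out_features),
                nn.BatchNorm1d(out_features),
                nn.LeakyReLU(0.2),  # slope 0.2 is an assumption
            ]

        self.net = nn.Sequential(
            *block(latent_dim + embed_dim, 256),
            *block(256, 512),
            *block(512, 1024),
            nn.Linear(1024, out_dim),
            nn.Tanh(),  # constrains outputs to [-1, 1]
        )

    def forward(self, z, labels):
        # Conditioning by concatenation is one common choice (assumed here).
        h = torch.cat([z, self.label_embedding(labels)], dim=1)
        return self.net(h)


gen = ConditionalGenerator()
z = torch.randn(8, 100)
y = torch.randint(0, 4, (8,))
x_g = gen(z, y)
print(x_g.shape)
```

The four `nn.Linear` layers correspond to the four fully connected layers mentioned above; only the last one omits normalization so that Tanh acts directly on the output features.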

3.2. Multi-Discriminator Architecture

In the proposed framework, three types of discriminators—AE, CNN, and LSTM—are employed in parallel for a single generator. This design aims to capture complementary characteristics of network traffic from multiple perspectives.

Increasing the number of discriminators beyond this configuration may lead to redundant feature representations and overlapping decision boundaries, which can negatively affect training stability. Moreover, additional discriminators inevitably increase model complexity, computational cost, and parameter size, resulting in diminishing performance gains. Therefore, three discriminators are selected to achieve a balance between modeling capacity and computational efficiency.

Each discriminator evaluates the authenticity of generated samples from a distinct viewpoint, encouraging the generator to produce samples that satisfy diverse quality criteria. This multi-perspective assessment enhances data diversity and contributes to improved generalization, training stability, and overall generation quality.

3.2.1. AE-Based Discriminator

An AE is an unsupervised neural network composed of an encoder and a decoder. The encoder compresses high-dimensional input data into a low-dimensional latent representation, while the decoder reconstructs the original input. The difference between the input and reconstructed output is defined as the reconstruction error, which is minimized during training [20].

In this study, the autoencoder-based discriminator evaluates the authenticity of generated samples using reconstruction errors. Since low-quality or abnormal samples tend to produce higher reconstruction errors, this mechanism effectively captures structural consistency and latent patterns in network traffic data.

To ensure balanced training among discriminators and reduce computational overhead, a lightweight three-layer architecture is adopted for 76-dimensional tabular data. The encoder projects input features to 128 dimensions, followed by ReLU activation and dropout, and compresses them into a 32-dimensional latent space. The decoder restores the latent vector to the original input dimension using symmetric layers.

The mean squared error (MSE) between the reconstructed vector $\hat{X}$ and the original input $X$ is used as the reconstruction loss and serves as the primary criterion for discrimination. In practical applications, the acceptable reconstruction error threshold cannot be determined theoretically because it depends on the data distribution and the training characteristics of the autoencoder. Therefore, the threshold value is typically determined empirically based on the reconstruction error distribution observed on real samples. In this study, the threshold was experimentally selected to distinguish between realistic and unrealistic generated samples.
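One common way to pick such a threshold empirically is to take a high percentile of the reconstruction errors measured on real samples. The sketch below assumes this percentile rule and uses made-up error values; the article states only that the threshold was chosen experimentally, so both the rule and the numbers are illustrative.

```python
from typing import Sequence


def reconstruction_mse(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def percentile_threshold(errors: Sequence[float], percent: int = 95) -> float:
    """Nearest-rank percentile of the errors observed on real samples."""
    ordered = sorted(errors)
    rank = max(1, -(-percent * len(ordered) // 100))  # ceil(percent * n / 100)
    return ordered[rank - 1]


# Hypothetical reconstruction errors of real samples (illustrative only).
real_errors = [0.010, 0.012, 0.015, 0.011, 0.020,
               0.013, 0.014, 0.016, 0.018, 0.050]
tau = percentile_threshold(real_errors, percent=90)

# A generated sample is treated as realistic when its error is within tau.
for err in (0.017, 0.080):
    print(err, err <= tau)
```

A percentile-based rule tolerates a few outliers among the real samples, here the single error of 0.050, without loosening the threshold for everything else.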

Figure 2 illustrates the overall architecture of the autoencoder-based discriminator.

3.2.2. CNN-Based Discriminator

Traditional artificial neural networks (ANNs) consist of neurons that are optimized through training. However, conventional ANNs suffer from high computational complexity, which motivated the development of convolutional neural networks (CNNs) [21]. CNNs effectively learn local features through convolution operations while preserving the spatial structure of input data, thereby reducing dimensionality and improving generalization performance. In addition, CNNs automatically extract features by exploiting spatial and temporal correlations, demonstrating strong performance in analyzing high-dimensional data with diverse distributions [22].

Therefore, in this study, a CNN-based discriminator was adopted to evaluate the structural authenticity of samples generated from high-dimensional tabular data. The proposed discriminator processes the input data in a one-dimensional format and extracts discriminative features for classification.

For tabular data with an input dimension of 76, the discriminator is designed as follows. First, the input vectors are reshaped into $(N, 1, 76)$ by adding a channel dimension, where $N$ denotes the batch size. One-dimensional convolution is then applied for feature extraction. The batch size was selected empirically to ensure stable training while maintaining computational efficiency.

The first convolutional layer uses a kernel size of 3, a stride of 1, and padding to prevent boundary information loss, followed by a LeakyReLU activation function, producing 16 feature channels. The second convolutional layer maintains the same kernel size, stride, and padding, expands the number of channels to 32, and applies batch normalization and LeakyReLU activation to enhance training stability. Finally, the extracted features are flattened and passed through fully connected layers, and the output layer applies a sigmoid activation function to produce a probability value between 0 and 1.

Figure 3 illustrates the overall architecture of the CNN-based discriminator.
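The feature-map sizes implied by this description can be checked with the standard one-dimensional convolution length formula. The sketch assumes a padding of 1, which preserves the length for a kernel of size 3 and stride 1; the article says only that padding is used to prevent boundary information loss.

```python
def conv1d_out_len(length: int, kernel: int, stride: int = 1,
                   padding: int = 0) -> int:
    """Output length of a 1D convolution (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


length = 76  # number of input features treated as a 1D signal
for out_channels in (16, 32):
    # padding=1 is an assumed value, not stated in the article
    length = conv1d_out_len(length, kernel=3, stride=1, padding=1)
    print(out_channels, length)

flattened = 32 * length  # size of the vector fed to the fully connected layers
print(flattened)
```

Under this assumption both convolutional layers keep the length at 76, so the flattened representation has 32 × 76 = 2432 elements.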

3.2.3. LSTM-Based Discriminator

Recurrent Neural Networks (RNNs) have been widely used for processing sequential data such as text, audio, video, and time-series signals. However, conventional RNNs suffer from limitations in learning long-term dependencies due to gradient vanishing and the use of Tanh-based activation functions, especially when input variations are large [23].

Network traffic data exhibit strong temporal continuity and class imbalance characteristics. As a result, unidirectional LSTM models are limited in capturing future contextual information. To overcome this limitation, this study adopts a Bidirectional LSTM (BiLSTM) architecture, which enables simultaneous learning of past and future dependencies and effectively analyzes the sequential patterns of encrypted network traffic [24].

The proposed LSTM-based discriminator uses 76-dimensional tabular data as input, which are reshaped into the form (N, 76, 1). The model consists of three BiLSTM layers with 128 hidden units in each direction. The final forward and backward outputs are concatenated to generate a 256-dimensional sequence representation. To improve generalization performance, a dropout rate of 0.2 is applied. The extracted features are then processed by a fully connected layer followed by a Sigmoid activation function to produce the probability that a sample is real. Figure 4 illustrates the overall architecture of the LSTM-based discriminator.
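A minimal PyTorch sketch of this discriminator is given below. The three BiLSTM layers, 128 hidden units per direction, the dropout rate of 0.2, the 256-dimensional concatenated representation, and the sigmoid output come from the text. Where exactly dropout is applied (between LSTM layers and before the final layer) is an assumption.

```python
import torch
import torch.nn as nn


class BiLSTMDiscriminator(nn.Module):
    """Illustrative BiLSTM discriminator; not the authors' exact code."""

    def __init__(self, hidden=128, layers=3, dropout=0.2):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=1,
            hidden_size=hidden,
            num_layers=layers,
            batch_first=True,
            bidirectional=True,
            dropout=dropout,  # applied between stacked LSTM layers
        )
        self.head = nn.Sequential(
            nn.Dropout(dropout),  # placement is an assumption
            nn.Linear(2 * hidden, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        seq = x.unsqueeze(-1)  # (N, 76) -> (N, 76, 1)
        _, (h_n, _) = self.lstm(seq)
        # h_n has shape (layers * 2, N, hidden); its last two entries are the
        # final forward and backward states of the top layer.
        features = torch.cat([h_n[-2], h_n[-1]], dim=1)  # (N, 256)
        return self.head(features).squeeze(1)


disc = BiLSTMDiscriminator()
batch = torch.rand(4, 76) * 2 - 1  # dummy features scaled to [-1, 1]
probs = disc(batch)
print(probs.shape)
```

Each of the 76 features is treated as one time step with a single input channel, which matches the (N, 76, 1) reshaping described above.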

3.3. Training Strategy

The proposed 3D-GAN is trained in an adversarial manner using one generator and three parallel discriminators, including Autoencoder, CNN, and LSTM models. The generator receives a random latent vector z and a conditional label y to generate synthetic samples, which are evaluated by all discriminators.

To ensure balanced and stable training, a unified loss function is adopted. The generator is optimized using the average loss obtained from the three discriminators, as shown in Equation (3). Each discriminator evaluates the generated samples from different perspectives, including reconstruction consistency, spatial feature representation, and temporal dependency.

$$L_{G} = \frac{1}{3}\left( \mathbb{E}_{z \sim p(z)}\left[D_{AE}(G(z))\right] + \mathbb{E}_{z \sim p(z)}\left[\left(D_{CNN}(G(z)) - 1\right)^{2}\right] + \mathbb{E}_{z \sim p(z)}\left[\left(D_{LSTM}(G(z)) - 1\right)^{2}\right] \right) \tag{3}$$

The Autoencoder-based discriminator evaluates generated samples based on reconstruction consistency between the input and reconstructed outputs. The reconstruction error is measured using the mean squared error (MSE), enabling the discriminator to capture statistical characteristics of the data. In addition, a BEGAN-inspired equilibrium mechanism is employed to maintain a stable balance between the generator and the discriminator during training. The CNN-based discriminator focuses on extracting spatial features from the generated samples, while the LSTM-based discriminator evaluates temporal dependencies within the data.
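As a worked illustration of Equation (3), the sketch below averages the three generator loss terms over a small batch. It assumes that the Autoencoder term is the per-sample reconstruction error of generated samples (as in BEGAN-style discriminators) and that the CNN and LSTM terms use the discriminators' output probabilities; the numeric values are invented for the example.

```python
from typing import Sequence


def generator_loss(d_ae: Sequence[float], d_cnn: Sequence[float],
                   d_lstm: Sequence[float]) -> float:
    """Average of the three generator loss terms, with batch means as expectations."""
    n = len(d_ae)
    ae_term = sum(d_ae) / n                               # reconstruction error
    cnn_term = sum((p - 1.0) ** 2 for p in d_cnn) / n     # squared distance to 1
    lstm_term = sum((p - 1.0) ** 2 for p in d_lstm) / n   # squared distance to 1
    return (ae_term + cnn_term + lstm_term) / 3.0


# Hypothetical discriminator outputs for a batch of two generated samples.
loss = generator_loss(d_ae=[0.02, 0.04], d_cnn=[0.9, 0.7], d_lstm=[0.8, 0.6])
print(round(loss, 4))  # 0.06
```

Here the three terms are 0.03, 0.05, and 0.10, so the averaged loss is 0.18 / 3 = 0.06. The loss shrinks as the reconstruction error falls and as the CNN and LSTM outputs approach 1.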

The CNN and LSTM discriminators aim to classify real and generated samples by minimizing prediction errors. In particular, the LSTM-based discriminator utilizes Binary Cross-Entropy (BCE) loss, as shown in Equation (4).

$$L_{BCE} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_{i}\log(x_{i}) + (1 - y_{i})\log(1 - x_{i})\right] \tag{4}$$

where $y_{i} \in \{0, 1\}$ denotes the ground-truth label and $x_{i}$ represents the predicted probability.
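The binary cross-entropy in Equation (4) can be computed directly, as in the following sketch. Clipping the predictions by a small `eps` to avoid taking the logarithm of zero is an added assumption for numerical safety and is not part of the equation.

```python
import math
from typing import Sequence


def bce_loss(y_true: Sequence[int], y_pred: Sequence[float],
             eps: float = 1e-12) -> float:
    """Binary cross-entropy averaged over N samples."""
    total = 0.0
    for y, x in zip(y_true, y_pred):
        x = min(max(x, eps), 1.0 - eps)  # clipping is an assumption
        total += y * math.log(x) + (1 - y) * math.log(1.0 - x)
    return -total / len(y_true)


# A real sample scored 0.9 and a generated sample scored 0.2.
print(bce_loss([1, 0], [0.9, 0.2]))
```

A discriminator that outputs 0.5 for every sample yields a loss of ln 2 ≈ 0.693, which is a useful reference point when monitoring training.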

All discriminators are optimized using backpropagation with the Adam optimizer. Through iterative adversarial training, the generator learns to produce high-quality synthetic samples that satisfy multiple evaluation criteria, while the discriminators progressively improve their ability to distinguish real and generated samples.

4. Experimental Setup and Results

4.1. Experimental Environment

The experimental environment is summarized in Table 1. All experiments were conducted using Python. The external libraries included PyTorch 2.6, scikit-learn 1.6.1, and imbalanced-learn 0.13.

The training parameters are summarized in Table 2. The number of training epochs was set to 200 with a batch size of 64. The latent dimension of the generator was fixed at 100. The Adam optimizer was employed to ensure stable convergence through adaptive learning based on first and second moment estimates.

A sensitivity analysis was performed using the XGBoost (3.0.3) classifier to evaluate the impact of the generator’s learning rate and latent dimension on detection performance. Figure 5 shows the results, with the learning rate on the x-axis and one curve per latent dimension (50, 100, and 150). The optimal configuration (learning rate = 0.0002, latent dimension = 100) achieved the highest F1-score of 0.9816, demonstrating a balanced trade-off between convergence stability and representational capacity.

Since this study aims to alleviate data imbalance in the CIC-UNSW-NB15 dataset and improve network intrusion detection performance, three classification models were selected for evaluation: Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Bidirectional Gated Recurrent Unit (BiGRU).

For Random Forest, the number of trees was set to 100, and the Gini impurity criterion was used for node splitting. XGBoost was configured with a maximum tree depth of 6, a learning rate of 0.1, and 100 boosting rounds. BiGRU consisted of two GRU layers and was trained using the cross-entropy loss with a learning rate of 0.001.

These models were selected to evaluate the effectiveness and generalization capability of the proposed GAN-based data augmentation method across both traditional machine learning and deep learning classifiers.

Performance was evaluated using four metrics derived from the confusion matrix: accuracy, false positive rate (FPR), false negative rate (FNR), and F1-score.
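For reference, the sketch below computes these four metrics from the counts of a binary confusion matrix. The counts are invented for illustration, and the binary (one-vs-rest) formulation is an assumption: the article does not state how the metrics are averaged across the four classes.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, FPR, FNR, and F1-score from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fpr": fp / (fp + tn),   # normal traffic flagged as attack
        "fnr": fn / (fn + tp),   # attacks that were missed
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, not results from the article.
metrics = confusion_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these counts the accuracy is 0.925, the FPR is 0.05, the FNR is 0.10, and the F1-score is 180/195 ≈ 0.9231.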

4.2. Dataset Description and Preprocessing

In this study, the CIC-UNSW-NB15 dataset was used, which was reconstructed from the original UNSW-NB15 dataset using CIC-FlowMeter developed by the Canadian Institute for Cybersecurity [25]. This process improves data quality by removing missing values, duplicate records, and label inconsistencies, resulting in more reliable training data.

The dataset assigns label 0 to normal traffic and labels 1 to 9 to the nine attack types. The detailed distribution of the original classes is presented in Table 3. However, some classes, including Analysis, Backdoor, and Worms, contain extremely few samples, which may cause overfitting during data augmentation. Therefore, based on the attack taxonomy proposed by Hansman and Hunt, the original attack types were reorganized into three categories: Information Gathering and Analysis, System Compromise and Malicious Activities, and Denial of Service. The reconstructed class distribution is shown in Table 4. This reconstruction leads to a more balanced class distribution and facilitates stable training.

Since CIC-UNSW-NB15 features are derived from network flow statistics, they exhibit large variations in scale and range. Such differences make model training unstable and may degrade the quality of generated samples. To address this issue, all features were normalized using Min–Max normalization.

To support the hyperbolic tangent activation function, feature values were scaled to the range [−1, 1], as defined in Equation (5).

$$x' = 2 \times \frac{x - \min(x)}{\max(x) - \min(x) + \varepsilon} - 1 \tag{5}$$

where $x$ represents the original feature value, $\min(x)$ and $\max(x)$ denote the minimum and maximum values of each feature, and $\varepsilon$ is a small constant for numerical stability.

This normalization improves convergence speed, training stability, and quality of generated samples.
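Equation (5) can be applied per feature column as in the sketch below. The value of `eps` is an assumption, since the article describes it only as a small constant.

```python
from typing import List, Sequence


def minmax_scale(column: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Scale one feature column to approximately [-1, 1]."""
    lo, hi = min(column), max(column)
    return [2.0 * (v - lo) / (hi - lo + eps) - 1.0 for v in column]


# Example flow-statistic column with a wide range (illustrative values).
scaled = minmax_scale([0.0, 5.0, 10.0])
print(scaled)

# A constant column maps to -1 instead of dividing by zero.
print(minmax_scale([3.0, 3.0, 3.0]))
```

Because of the `eps` term, the maximum maps to a value marginally below 1, and a constant feature is mapped to −1 rather than raising a division error.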

4.3. Results

Data augmentation was conducted in two settings. In the first setting, minority classes were augmented to approximately one-third of the normal samples. In the second setting, the number of samples was increased to a level comparable to the normal class. This design was intended to examine whether increasing the amount of augmented data consistently improves model training and detection performance.

To evaluate the effectiveness of the proposed method, comparative experiments were performed using SMOTE and a baseline GAN-based augmentation approach. Table 5 and Table 6 summarize the augmentation results under the two settings.

For performance evaluation, three intrusion detection models were employed: Random Forest ($RF$), XGBoost ($XGB$), and $BiGRU$. Subscripts were used to distinguish augmentation methods: S for SMOTE, G for the baseline GAN, and 3D for the proposed 3D-GAN (e.g., $RF_{S}$, $RF_{G}$, $RF_{3D}$).

Table 7 presents the detection results using the augmented datasets corresponding to Table 5. In the case of SMOTE, the $RF$ model showed minimal performance variation, while $XGB$ and $BiGRU$ exhibited performance degradation. This suggests that SMOTE, although effective in balancing class distributions, has limitations in capturing nonlinear relationships and temporal characteristics inherent in network traffic data.

In contrast, both GAN and 3D-GAN improved accuracy, F1-score, false positive rate (FPR), and false negative rate (FNR) compared to the non-augmented baseline. Furthermore, the proposed 3D-GAN consistently outperformed the baseline GAN. In particular, the $XGB_{3D}$ model achieved the best overall performance, with an accuracy and F1-score of approximately 0.9602 and an FPR and FNR of approximately 0.0398. It also reduced FPR and FNR by 0.0104 compared to the baseline GAN.

These results indicate that incorporating temporal feature learning through the LSTM-based discriminator contributes significantly to realistic network traffic generation and improved intrusion detection performance.

Table 8 presents the intrusion detection results under the second augmentation setting, where the number of augmented samples increased from 100,000 to approximately 350,000.

Under this setting, the SMOTE-based augmentation showed performance improvement only for the $RF_{S}$ model, while $XGB_{S}$ and $BiGRU_{S}$ exhibited performance degradation. This trend is consistent with the observations from the first setting and further indicates that SMOTE has limitations in effectively modeling the nonlinear relationships and temporal dependencies inherent in network traffic data.

In contrast, the proposed 3D-GAN again achieved the highest overall performance, particularly in the $XGB_{3D}$ model. Moreover, across all classifiers, the 3D-GAN consistently demonstrated slight but stable performance improvements over the baseline GAN. These results suggest that the proposed 3D-GAN more effectively captures the structural and temporal characteristics of network traffic compared to the conventional GAN-based augmentation approach.

Furthermore, when the number of augmented samples increased from 100,000 to 350,000, the detection performance improved by approximately 1.5%, indicating that the proposed model maintains scalability while preserving data quality. This consistent performance gain suggests that the 3D-GAN generates high-fidelity synthetic samples that contribute positively to classifier learning.

To further investigate the contribution of each discriminator component in the proposed architecture, an ablation study was conducted by progressively increasing the structural complexity of the GAN model. Four configurations were evaluated: a baseline GAN with a multilayer perceptron (MLP) discriminator, a GAN with a CNN-based discriminator, a GAN combining CNN and Autoencoder discriminators, and the full proposed 3D-GAN incorporating CNN, Autoencoder, and LSTM discriminators.

The results show that the detection performance gradually improves as additional discriminators are integrated into the model. The CNN-based discriminator contributes to learning spatial feature representations from the network traffic data, while the Autoencoder discriminator enhances the reconstruction consistency between real and generated samples. Furthermore, the LSTM-based discriminator enables the model to capture temporal dependencies in sequential traffic patterns.

In addition to the performance improvement, the training time per epoch remains relatively stable across different configurations, ranging from 31 to 33 s. This indicates that incorporating additional discriminators does not introduce a significant computational overhead while still improving detection performance.

The full 3D-GAN configuration achieved the best overall performance among all evaluated models, confirming that the combination of Autoencoder, CNN, and LSTM discriminators effectively improves the quality of generated samples and enhances intrusion detection performance. The detailed results of the ablation study are presented in Table 9.
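The step-by-step contribution of each discriminator can be read directly from Table 9. The short sketch below simply recomputes the incremental F1 gains and the spread in per-epoch training time from the reported values; the variable and function names are illustrative assumptions.

```python
# F1-scores and per-epoch training times (seconds) copied from Table 9.
ablation = [
    ("Baseline GAN (MLP)", 32, 0.9752),
    ("GAN + CNN", 31, 0.9754),
    ("GAN + CNN + AE", 33, 0.9780),
    ("Proposed 3D-GAN", 33, 0.9816),
]


def incremental_gains(rows):
    """Return (name, F1 gain over the previous configuration) pairs."""
    return [
        (curr[0], round(curr[2] - prev[2], 4))
        for prev, curr in zip(rows, rows[1:])
    ]


gains = incremental_gains(ablation)
total_gain = round(ablation[-1][2] - ablation[0][2], 4)
times = [seconds for _, seconds, _ in ablation]
time_spread = max(times) - min(times)

for name, gain in gains:
    print(f"{name}: +{gain:.4f} F1")
print(f"Total gain: +{total_gain:.4f} F1, time spread: {time_spread} s")
```

On the reported numbers, the final step (adding the LSTM discriminator) gives the largest single gain in F1-score, while the per-epoch training time varies by only 2 s across all four configurations.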

Overall, the experimental results demonstrate that the proposed 3D-GAN framework effectively improves intrusion detection performance by generating high-quality augmented data for imbalanced network traffic datasets. The ablation analysis further confirms that integrating heterogeneous discriminators, including CNN, Autoencoder, and LSTM, progressively enhances the quality of generated samples and contributes to improved detection performance. Moreover, the results show that these architectural enhancements can be achieved without introducing significant computational overhead, as the training time per epoch remains nearly constant across different model configurations. These findings indicate that the proposed approach provides a robust and efficient data augmentation framework for network intrusion detection systems.

5. Conclusions

In this study, a novel data augmentation framework, termed 3D-GAN, was proposed to address the class imbalance problem in network traffic datasets. The proposed model integrates three parallel discriminators—Autoencoder, CNN, and LSTM—to effectively capture structural and temporal characteristics of network traffic data.

Experimental results demonstrated that 3D-GAN consistently outperformed the conventional SMOTE method and the baseline GAN-based augmentation approach. In particular, under the second augmentation setting with approximately 350,000 generated samples, the $XGB_{3D}$ model achieved the best overall intrusion detection performance. Compared to the original non-augmented dataset, classification accuracy and F1-score increased by approximately 4%, while the false positive rate (FPR) and false negative rate (FNR) decreased by approximately 3.5%.

Similarly, the $BiGRU$ model exhibited an improvement of over 1.5% in F1-score compared to the baseline GAN-based augmentation. These quantitative differences confirm that incorporating temporal feature learning through the LSTM-based discriminator enables the generation of more realistic network traffic patterns, leading to measurable gains in detection performance.

Overall, the proposed 3D-GAN more effectively reflects the intrinsic characteristics of network traffic data, producing higher-quality synthetic samples that enhance the performance of various intrusion detection models.

In practical deployment scenarios, the proposed framework does not necessarily require high computational resources during the operational phase. Although the training process involves multiple discriminators, the synthetic sample generation can be performed offline in advance. Once the augmented dataset is constructed, the trained intrusion detection models can operate independently without the need to run the GAN architecture in real time. Therefore, the proposed 3D-GAN framework can still be applicable in environments with limited computational resources, as the augmentation process mainly affects the offline training stage rather than the runtime inference stage.

For future work, several architectural refinements will be explored to further enhance the performance and stability of the proposed 3D-GAN. First, we plan to investigate adaptive loss integration strategies, such as dynamically weighted multi-discriminator loss functions, to balance the contributions of the autoencoder, CNN, and LSTM discriminators during training. This may improve convergence stability and prevent dominance of a specific discriminator.
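As a rough illustration of what a dynamically weighted multi-discriminator loss could look like, the sketch below uses inverse-magnitude weighting so that each discriminator term contributes equally to the combined generator loss. This is only one possible strategy and is not taken from the paper; the function names, the weighting rule, and the loss values are all hypothetical.

```python
def balanced_weights(losses, eps=1e-8):
    """Weights inversely proportional to each loss, normalized to sum to 1.

    In a real training loop the loss values used here would be detached
    scalars, so that no gradient flows through the weights themselves.
    """
    inverse = {name: 1.0 / (value + eps) for name, value in losses.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combined_generator_loss(losses, weights):
    """Weighted sum of the per-discriminator generator losses."""
    return sum(weights[name] * losses[name] for name in losses)


# Hypothetical per-discriminator generator losses at one training step.
losses = {"ae": 0.2, "cnn": 0.8, "lstm": 1.6}
weights = balanced_weights(losses)
contributions = {name: weights[name] * losses[name] for name in losses}
total_loss = combined_generator_loss(losses, weights)
```

With this rule, a discriminator whose loss term is currently large receives a proportionally smaller weight, so no single discriminator dominates the update. Other schemes, such as learned or uncertainty-based weights, would fit the same interface.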

Second, the current metric-based voting mechanism among discriminators will be extended to a confidence-aware ensemble scheme, where discriminator outputs are aggregated using reliability-aware weighting rather than simple majority voting. Such an approach is expected to enhance the robustness of synthetic sample selection.
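The difference between simple majority voting and a reliability-weighted aggregation can be sketched as follows. The scores are taken to be each discriminator's estimated probability that a generated sample is realistic, and the reliability values are hypothetical per-discriminator weights; neither the names nor the numbers come from the paper.

```python
def majority_vote(scores, threshold=0.5):
    """Accept a sample if more than half of the discriminators accept it."""
    votes = sum(1 for score in scores.values() if score >= threshold)
    return votes * 2 > len(scores)


def reliability_weighted_vote(scores, reliability, threshold=0.5):
    """Accept a sample if the reliability-weighted mean score passes the threshold."""
    total = sum(reliability[name] for name in scores)
    weighted = sum(reliability[name] * score for name, score in scores.items()) / total
    return weighted >= threshold, weighted


# Hypothetical discriminator outputs for one generated sample.
scores = {"ae": 0.55, "cnn": 0.60, "lstm": 0.10}
# Hypothetical reliability weights, e.g. derived from validation performance.
reliability = {"ae": 0.2, "cnn": 0.2, "lstm": 0.6}

accepted_by_majority = majority_vote(scores)
accepted_by_weighted, weighted_score = reliability_weighted_vote(scores, reliability)
```

In this example two of the three discriminators marginally accept the sample, so majority voting keeps it, whereas the weighted scheme rejects it because the most reliable discriminator scores it very low.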

In addition, more advanced architectural components, including attention mechanisms and residual connections, will be incorporated into the discriminator networks to better capture long-range dependencies and complex feature interactions in network traffic data.

Finally, to validate the generalizability of the proposed framework, extensive experiments will be conducted on diverse and large-scale intrusion detection datasets with varying traffic distributions and attack types. Cross-dataset evaluation will be performed to assess robustness under different network environments and imbalance ratios.

Author Contributions

Conceptualization, T.K. and D.S. (Dongil Shin); methodology, T.K. and D.S. (Dongkyoo Shin); software, T.K.; validation, H.P. and T.K.; formal analysis, H.P.; investigation, T.K. and D.S. (Dongkyoo Shin); resources, H.P.; data curation, T.K.; writing—original draft preparation, T.K. and D.S. (Dongkyoo Shin); writing—review and editing, T.K., D.S. (Dongkyoo Shin) and D.S. (Dongil Shin); visualization, T.K.; supervision, D.S. (Dongkyoo Shin) and D.S. (Dongil Shin); project administration, D.S. (Dongkyoo Shin) and D.S. (Dongil Shin); funding acquisition, D.S. (Dongkyoo Shin) and D.S. (Dongil Shin). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Future Challenge Defense Technology Research and Development Project (9150921) hosted by the Agency for Defense Development Institute in 2023.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. 3D-GAN Architecture.

Figure 2. Autoencoder-based Discriminator Architecture.

Figure 3. CNN-based Discriminator Architecture.

Figure 4. LSTM-based Discriminator Architecture.

Figure 5. Sensitivity Analysis of learning rate and latent dimension using XGBoost (F1-Score).

Table 1. Experiment environment.

Component	Specification
OS	Windows 11 Pro
CPU	AMD Ryzen 7 9700X 8-Core Processor
RAM	64.00 GB
GPU	NVIDIA GeForce RTX 3090
Language	Python 3.7.0
Library	Torch, imbalanced-learn, scikit-learn

Table 2. Training parameters.

Parameter	Setting
Epoch	200
Batch size	64
Latent Dimension	100
Optimizer	Adam (Adaptive Moment Estimation)
Learning Rate	0.0002

Table 3. Original class distribution of CIC-UNSW-NB15.

Label	Sample Size	Label Number
Benign	358,332	0
Analysis	385	1
Backdoor	452	2
DoS	4467	3
Exploits	30,951	4
Fuzzers	29,613	5
Generic	4632	6
Reconnaissance	16,735	7
Shellcode	2102	8
Worms	246	9

Table 4. Reorganized class distribution used in this study.

Category	Label	Sample Size	Label Number
Benign	Benign	358,332	0
Information Gathering and Analysis	Analysis, Reconnaissance	17,120	1
System Compromise and Malicious Activities	Backdoor, Exploits, Generic, Shellcode, Worms	38,383	2
Denial of Service	DoS, Fuzzers	34,080	3

Table 5. Data Augmentation Result 1.

Label	Original	Augmentation
0	358,332	358,332
1	17,120	100,000
2	38,383	100,000
3	34,080	100,000

Table 6. Data Augmentation Result 2.

Label	Original	Augmentation
0	358,332	358,332
1	17,120	350,000
2	38,383	350,000
3	34,080	350,000
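The class counts in Tables 4 to 6 imply how many synthetic samples each minority class needs and how the imbalance ratio changes. The sketch below works this out under the assumption that the "Augmentation" column gives the total class size after augmentation (original plus synthetic samples); the paper does not state this explicitly, and the function names are illustrative.

```python
# Reorganized class sizes from Table 4 (label number -> sample size).
original = {0: 358_332, 1: 17_120, 2: 38_383, 3: 34_080}


def synthetic_needed(counts, target, majority_label=0):
    """Synthetic samples required to bring each minority class up to `target`.

    Assumes `target` is the total class size after augmentation.
    """
    return {
        label: max(target - count, 0)
        for label, count in counts.items()
        if label != majority_label
    }


def imbalance_ratio(counts):
    """Ratio of the largest class to the smallest class."""
    return max(counts.values()) / min(counts.values())


need_100k = synthetic_needed(original, 100_000)
need_350k = synthetic_needed(original, 350_000)

augmented_100k = {0: original[0], 1: 100_000, 2: 100_000, 3: 100_000}
augmented_350k = {0: original[0], 1: 350_000, 2: 350_000, 3: 350_000}

ratios = {
    "original": imbalance_ratio(original),
    "result_1": imbalance_ratio(augmented_100k),
    "result_2": imbalance_ratio(augmented_350k),
}
```

Under that assumption, the largest-to-smallest class ratio drops from roughly 21:1 in the original data to about 3.6:1 for Augmentation Result 1 and close to 1:1 for Augmentation Result 2.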

Table 7. Performance evaluation of the network attack detection model using Augmentation Result 1.

Model	Accuracy	F1-Score	FNR	FPR
$RF$	0.9397	0.9397	0.0603	0.0603
$XGB$	0.9424	0.9424	0.0576	0.0576
$BiGRU$	0.9300	0.9300	0.0700	0.0700
$RF_{S}$-1	0.9367	0.9367	0.0633	0.0633
$XGB_{S}$-1	0.8913	0.8913	0.1087	0.1087
$BiGRU_{S}$-1	0.8737	0.8737	0.1263	0.1263
$RF_{G}$-1	0.9532	0.9532	0.0468	0.0468
$XGB_{G}$-1	0.9564	0.9564	0.0436	0.0436
$BiGRU_{G}$-1	0.9439	0.9439	0.0561	0.0561
$RF_{3D}$-1	0.9587	0.9587	0.0413	0.0413
$XGB_{3D}$-1	0.9602	0.9602	0.0398	0.0398
$BiGRU_{3D}$-1	0.9535	0.9535	0.0465	0.0465

Table 8. Performance evaluation of the network attack detection model using Augmentation Result 2.

Model	Accuracy	F1-Score	FNR	FPR
$RF$	0.9397	0.9397	0.0603	0.0603
$XGB$	0.9424	0.9424	0.0576	0.0576
$BiGRU$	0.9300	0.9300	0.0700	0.0700
$RF_{S}$-2	0.9589	0.9589	0.0411	0.0411
$XGB_{S}$-2	0.8353	0.8353	0.1647	0.1647
$BiGRU_{S}$-2	0.8150	0.8150	0.1850	0.1850
$RF_{G}$-2	0.9746	0.9746	0.0264	0.0264
$XGB_{G}$-2	0.9752	0.9752	0.0258	0.0258
$BiGRU_{G}$-2	0.9749	0.9749	0.0262	0.0262
$RF_{3D}$-2	0.9812	0.9812	0.0188	0.0188
$XGB_{3D}$-2	0.9816	0.9816	0.0184	0.0184
$BiGRU_{3D}$-2	0.9780	0.9780	0.0220	0.0220

Table 9. Ablation study results of discriminator configurations in the proposed GAN architecture.

Model	Training Time (Per Epoch)	F1-Score
Baseline GAN (MLP)	32 s	0.9752
GAN + CNN	31 s	0.9754
GAN + CNN + AE	33 s	0.9780
Proposed 3D-GAN (GAN + CNN + AE + LSTM)	33 s	0.9816

