Article

SPE-ACGAN: A Resampling Approach for Class Imbalance Problem in Network Intrusion Detection Systems

1 State Grid Jiangxi Electric Power Research Institute, Nanchang 330096, China
2 School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(15), 3323; https://doi.org/10.3390/electronics12153323
Submission received: 18 June 2023 / Revised: 31 July 2023 / Accepted: 1 August 2023 / Published: 3 August 2023
(This article belongs to the Special Issue Security and Privacy in Networks and Multimedia)

Abstract:
Network Intrusion Detection Systems (NIDSs) play a vital role in detecting and stopping network attacks. However, the prevalent imbalance of training samples in network traffic interferes with NIDS detection performance. This paper proposes a resampling method based on Self-Paced Ensemble and Auxiliary Classifier Generative Adversarial Networks (SPE-ACGAN) to address the imbalance problem of sample classes. To deal with the class imbalance problem, SPE-ACGAN oversamples the minority class samples by ACGAN and undersamples the majority class samples by SPE. In addition, we merged the CICIDS-2017 dataset and the CICIDS-2018 dataset into a more imbalanced dataset named CICIDS-17-18 and validated the effectiveness of the proposed method using the three datasets mentioned above. SPE-ACGAN is more effective than other resampling methods in improving NIDS detection performance. In particular, SPE-ACGAN improved the F1-score of Random Forest, CNN, GoogLeNet, and CNN + WDLSTM by 5.59%, 3.75%, 3.60%, and 3.56% after resampling.

1. Introduction

Since 2020, Corona Virus Disease 2019 (COVID-19) has spread worldwide, dramatically changing people’s lifestyles and forcing learning, work, and entertainment activities to shift from offline to online. However, as networks continue to develop, constantly evolving attack techniques such as worms and buffer overflows threaten the transportation, energy, education, medical, and other industries. Many companies and organizations lack the experience and skills to confront network attacks in real time, so such attacks are difficult to detect. A NIDS monitors network traffic, detects suspicious attack activity within it, and raises alerts in a timely manner so that the network can be protected before intruders succeed.
There are three main types of NIDS: misuse-based, anomaly-based, and hybrid NIDS [1]. A misuse-based NIDS [2,3] maintains a library of features or fingerprints of known attacks and matches each traffic flow against it; if the match succeeds, the traffic is judged malicious. An anomaly-based NIDS [4,5] models normal behavior and does not require attacks to be explicitly identified in the training data. The model describes the traffic activity of the protected system in its normal state, and any network traffic that does not match the modeled behavior is captured and reported. A hybrid NIDS [6,7] combines the misuse-based and anomaly-based approaches, which gives it a lower false alarm rate and higher accuracy than either method alone. Using deep learning to implement hybrid NIDS is the dominant approach today: by learning the characteristics of network traffic, attack signatures can be obtained and anomalous behavior can also be identified, compensating for the shortcomings of purely misuse-based or anomaly-based NIDS. Deep learning models such as Convolutional Neural Networks (CNNs) [8], Long Short-Term Memory networks (LSTM) [9] and GoogLeNet [10] have proven effective in detecting attacks.
However, the extremely unbalanced distribution of network traffic [11,12] greatly hinders further development of deep learning-based intrusion detection: most Internet user behavior is normal, and only a small fraction of traffic is malicious. As a result, cyberspace generates a large volume of traffic samples of which only a small percentage is malicious, and the distribution across the different malicious classes is also uneven. Deep learning models require large numbers of samples for training and perform more robustly when ample data are available, but their performance degrades significantly when learning from unbalanced data [13,14,15]. In general, a shortage of a given class of network traffic may cause the NIDS to favor the majority classes and neglect learning from the minority classes. Therefore, balancing network traffic is necessary for the NIDS to fully learn the features of every class.
Many studies resample network traffic when training the NIDS model, either oversampling the minority classes or undersampling the majority classes, using techniques such as the Synthetic Minority Over-Sampling Technique (SMOTE) [16], Random Undersampling (RUS) [17] and Generative Adversarial Networks (GANs) [18]. SMOTE must traverse every minority class sample and compute its distance to neighboring samples, which is very resource intensive on a large dataset; moreover, the samples generated by interpolation increase the chance of class overlap and of overfitting the classification model. RUS removes majority class samples at random, so it may discard samples that lie on the classification boundary, causing information loss. The samples generated by a plain GAN are random in nature and cannot be generated on demand, so most of them struggle to fit the features of the minority classes.
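For concreteness, the sketch below applies these two classical baselines, SMOTE oversampling and random undersampling, to a synthetic imbalanced dataset using the scikit-learn and imbalanced-learn packages; the class ratio and feature count are invented for the illustration and are not taken from this paper.

```python
# Sketch: classical resampling baselines (SMOTE and RUS) on a synthetic
# imbalanced dataset. Requires scikit-learn and imbalanced-learn.
from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Toy traffic-like data: 1% "attack" class, 99% "benign" class.
X, y = make_classification(n_samples=20_000, n_features=20,
                           weights=[0.99, 0.01], random_state=0)
print("original:", Counter(y))

# Oversample the minority class by interpolating between neighbors.
X_over, y_over = SMOTE(random_state=0).fit_resample(X, y)
print("after SMOTE:", Counter(y_over))

# Undersample the majority class by randomly discarding samples.
X_under, y_under = RandomUnderSampler(random_state=0).fit_resample(X, y)
print("after RUS:", Counter(y_under))
```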
In this work, we introduce a novel resampling method, SPE-ACGAN, which combines the Auxiliary Classifier GAN (ACGAN) [19] and the Self-Paced Ensemble (SPE) [20] to deal with imbalanced network traffic. The imbalanced dataset is split into a minority class subset and a majority class subset; the minority subset is fed into ACGAN to generate a specified number of synthetic samples, and the majority subset is fed into SPE, which removes majority class samples until their number reaches the specified value.
The main contributions of this work are described as follows:
  • For NIDS, we propose SPE-ACGAN, a resampling method combining SPE and ACGAN, to alleviate the data imbalance problem; it reduces the majority class samples and increases the minority class samples to make the training set more balanced.
  • We merge the CICIDS-2017 dataset and the CICIDS-2018 dataset into a new dataset, named CICIDS-17-18. The CICIDS-17-18 dataset is a more imbalanced dataset with a larger amount of data to show the effectiveness of SPE-ACGAN.
  • We evaluate the proposed method on the three datasets above and compare it with several existing resampling methods. The performance metrics of several typical NIDS models improve after applying our method.
The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 presents the proposed method. Section 4 compares and analyzes the performance of the proposed method and existing methods. Finally, Section 5 concludes the paper.

2. Related Work

Nowadays, deep learning-based methods are the essential classification models used by NIDS to identify different types of attacks. An imbalance of training samples can lead to overfitting of the classification model and hurt its generalization ability. We therefore review related work on deep learning-based methods and on sample resampling.
Deep learning-based network intrusion detection trains models to learn the latent features of data samples for classification and prediction, and can be divided into supervised and unsupervised learning. Supervised learning includes Long Short-Term Memory networks (LSTM) [9], Convolutional Neural Networks (CNNs) [8], etc. Yang et al. [21] proposed a Gradient-Boosting Decision Tree (GBDT)–parallel quadratic ensemble learning method for intrusion detection that uses a Gated Recurrent Unit (GRU) model and specially modified network traffic to handle temporal data. Experimental results on the CICIDS-2017 dataset show that this ensemble-learning-based temporal intrusion detection system achieves better accuracy, recall, precision and F1 scores than existing methods. Unsupervised learning mainly consists of Auto Encoders (AE) [22,23] and Self-supervised Learning (SSL), which can learn from large numbers of unlabeled samples and effectively learn the features of different classes of traffic data [24]. Vaiyapuri et al. [25] proposed an unsupervised IDS model that uses a deep autoencoder (DAE) to learn traffic features and then a one-class support vector machine (OCSVM) to find the decision hyperplane; the model was validated on the NSL-KDD and UNSW-NB15 datasets and showed good performance.
When the data are imbalanced, the minority classes tend to be overfitted during training, and the model’s predictions are biased towards the majority classes, making it less accurate at identifying malicious attacks. Many researchers have therefore studied how to handle the extremely unbalanced distribution of network traffic. Yan et al. [26] proposed an improved locally adaptive composite minority sampling algorithm (LA-SMOTE) to deal with network traffic imbalance and then detected traffic anomalies with a deep learning GRU neural network. Abdulhammed et al. [27] used oversampling and undersampling to handle the imbalanced CIDDS-001 dataset and evaluated it with deep neural network, random forest and variational autoencoder classifiers. Andresini et al. [28] used ACGAN to synthesize minority class samples on the CICIDS-2017 dataset and then classified with a CNN, reaching a final overall accuracy of 99.48% and an F1-score of 98.71%. Park et al. [29] used a GAN to synthesize minority class attack data in the training phase and then used an Auto Encoder (AE) to refine the generated data; experiments on the NSL-KDD, UNSW-NB15 and IoT datasets show that reasonably augmenting the data can improve the performance of existing deep learning-based NIDS by alleviating the data imbalance problem. Table 1 compares typical resampling methods.
In this paper, we propose SPE-ACGAN, a resampling method that combines the supervised ACGAN with SPE to resample unbalanced network traffic for NIDS models. ACGAN extends the GAN with the ability to generate samples of a specified class, so minority class samples of a given category can be generated on demand, and its discriminator continuously improves the quality of the generated data. SPE efficiently reduces the number of majority class samples while retaining most of the samples that lie on the classification boundary.

3. Methods and Materials

In this section, we first introduce the overall structure of the proposed SPE-ACGAN and the working principle of each module. We then present the datasets used and the implementation details of the algorithm.

3.1. SPE-ACGAN

Considering the imbalance of the network traffic training samples in NIDS, we resample the training set along two dimensions: SPE decreases the number of samples in the majority classes, and ACGAN increases the number of samples in the minority classes.

3.1.1. SPE

SPE is a framework for imbalanced classification [20]. Its core idea is the notion of classification hardness: the majority class is undersampled in a self-paced way, guided by the hardness of each sample, to generate a new undersampled dataset. The process of SPE is shown in Figure 1.
SPE first divides the input imbalanced dataset into a majority set N and a minority set P. Each sample in N is then placed into one of k bins according to its classification hardness, and SPE keeps the total hardness of the bins approximately equal so that a balanced subset can be drawn from them. These steps are repeated until the number of majority class samples equals the number of minority class samples (or a specified value), completing the resampling. The hardness value comes from a hardness function H, a “classification hardness function” such as absolute error, MSE or cross entropy. For a given model F(x), the classification hardness of a sample (x, y) is given by Equation (1):
H_x = H(x, y, F)
The l-th hardness bin B_l is given by Equation (2):
B_l = { (x, y) : (l − 1)/k ≤ H_x = H(x, y, F) ≤ l/k },  l = 1, …, k,  where H(·) ∈ [0, 1]
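The following NumPy sketch illustrates the binning idea behind Equation (2): majority samples are grouped into k equal-width hardness bins and a roughly equal number is drawn from each bin. It simplifies the original algorithm (no ensemble iteration or self-paced hardness weighting), and the hardness function shown in the comment is only one possible choice.

```python
import numpy as np

def undersample_by_hardness(X_maj, y_maj, hardness, n_keep, k=10, rng=None):
    """Simplified SPE-style undersampling: split the majority samples into
    k equal-width hardness bins and draw roughly n_keep / k samples from
    each bin, so both easy and hard regions stay represented."""
    rng = np.random.default_rng(rng)
    # Bin index of every sample: k equal-width intervals over the hardness range.
    edges = np.linspace(hardness.min(), hardness.max(), k + 1)
    bins = np.clip(np.digitize(hardness, edges[1:-1]), 0, k - 1)

    keep_idx = []
    per_bin = max(1, n_keep // k)
    for b in range(k):
        members = np.flatnonzero(bins == b)
        if members.size == 0:
            continue
        take = min(per_bin, members.size)
        keep_idx.append(rng.choice(members, size=take, replace=False))
    keep_idx = np.concatenate(keep_idx)
    return X_maj[keep_idx], y_maj[keep_idx]

# Example hardness choice: absolute error of a classifier's predicted probability,
# e.g. hardness = np.abs(y_maj - clf.predict_proba(X_maj)[:, 1])
```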

3.1.2. ACGAN

ACGAN consists mainly of a generator (G) and a discriminator (D); its structure is shown in Figure 2. ACGAN works as follows: the network first draws a set of random noise values z. Given the specified class label as input, the generator G transforms z into synthetic samples X_fake of the corresponding class. The discriminator D, trained on real samples X_real, judges whether X_fake is real and, if it is synthetic, estimates the probability of it belonging to each class; the error is measured by the loss function and used to update the parameters of the generator G.
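A minimal TensorFlow/Keras sketch of such a generator and discriminator is shown below. The noise dimension, number of classes, feature dimension and layer widths are assumptions for illustration; the paper does not specify its exact network configuration, and the adversarial training loop is omitted.

```python
# Minimal ACGAN sketch for tabular flow features (TensorFlow/Keras).
# NOISE_DIM, NUM_CLASSES and FEATURE_DIM are assumed values, not taken from the paper.
import tensorflow as tf
from tensorflow.keras import layers

NOISE_DIM, NUM_CLASSES, FEATURE_DIM = 100, 15, 78

def build_generator():
    noise = layers.Input(shape=(NOISE_DIM,))
    label = layers.Input(shape=(1,), dtype="int32")
    # Condition the noise on the requested traffic class.
    label_emb = layers.Flatten()(layers.Embedding(NUM_CLASSES, NOISE_DIM)(label))
    h = layers.Multiply()([noise, label_emb])
    h = layers.Dense(256, activation="relu")(h)
    h = layers.Dense(512, activation="relu")(h)
    x_fake = layers.Dense(FEATURE_DIM, activation="tanh")(h)  # synthetic flow features
    return tf.keras.Model([noise, label], x_fake, name="G")

def build_discriminator():
    x = layers.Input(shape=(FEATURE_DIM,))
    h = layers.Dense(512, activation="relu")(x)
    h = layers.Dense(256, activation="relu")(h)
    validity = layers.Dense(1, activation="sigmoid")(h)        # real vs. fake head
    aux = layers.Dense(NUM_CLASSES, activation="softmax")(h)   # auxiliary class head
    return tf.keras.Model(x, [validity, aux], name="D")

# The discriminator is trained with a real/fake loss plus an auxiliary class loss:
D = build_discriminator()
D.compile(optimizer="adam",
          loss=["binary_crossentropy", "sparse_categorical_crossentropy"])
```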

3.1.3. Overall Model Architecture

The SPE-ACGAN resampling method works in two steps, performed by ACGAN and SPE, respectively. The first step oversamples the minority classes, which are fed into ACGAN to increase their number of samples; the second step undersamples the majority classes, which are fed into SPE to reduce their number of samples. The workflow of the SPE-ACGAN resampling method is shown in Figure 3.
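A compact sketch of this two-step workflow is given below. The helpers acgan_oversample() and spe_undersample() are hypothetical stand-ins for the ACGAN generator and the SPE undersampler described above, and target_counts encodes the per-class sample budgets.

```python
# Sketch of the two-step SPE-ACGAN resampling workflow (helper functions are
# hypothetical placeholders for the components described in Sections 3.1.1-3.1.2).
import numpy as np

def resample_spe_acgan(X, y, target_counts, acgan_oversample, spe_undersample):
    """Return a rebalanced training set.

    target_counts: dict mapping class label -> desired sample count.
    acgan_oversample(X_cls, label, n_new): returns n_new synthetic rows.
    spe_undersample(X_cls, y_cls, n_keep): returns a hardness-aware subset.
    """
    parts_X, parts_y = [], []
    for label, n_target in target_counts.items():
        mask = (y == label)
        X_cls, y_cls = X[mask], y[mask]
        if len(X_cls) < n_target:                    # minority class: ACGAN oversampling
            X_new = acgan_oversample(X_cls, label, n_target - len(X_cls))
            X_cls = np.vstack([X_cls, X_new])
            y_cls = np.full(len(X_cls), label)
        elif len(X_cls) > n_target:                  # majority class: SPE undersampling
            X_cls, y_cls = spe_undersample(X_cls, y_cls, n_target)
        parts_X.append(X_cls)
        parts_y.append(y_cls)
    return np.vstack(parts_X), np.concatenate(parts_y)
```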

3.2. Details of the SPE-ACGAN

3.2.1. Dataset

CICIDS-2017 and CICIDS-2018 [30] are network intrusion detection datasets published by the Canadian Institute for Cybersecurity (CIC) on Amazon Web Services (AWS) in 2017 and 2018. The two datasets cover as many as 14 attack types, such as DDoS, XSS, Heartbleed and brute force, and each is over 100 GB in size, which makes them the richest publicly available datasets in terms of attack categories.
In addition, the samples of each category in both datasets are imbalanced. In the CICIDS-2017 dataset, Heartbleed, Infiltration and Web Attack-Sql Injection have only 11, 36 and 21 samples, respectively, while Benign has 2,273,097. In the CICIDS-2018 dataset, Heartbleed and Port Scan have 0 samples, while Benign has 6,376,223. To exacerbate this imbalance and stress-test our proposed resampling method, we merge the CICIDS-2017 and CICIDS-2018 datasets to form the CICIDS-17-18 dataset. The distribution of data before and after merging is shown in Table 2. After merging, the proportions of FTP-Patator, SSH-Patator, Bot, etc., increase, and the gap between them and Heartbleed, Infiltration and Web Attack-Sql Injection grows even wider. This aggravates the imbalance and makes it harder for a NIDS to learn the features of Heartbleed, Infiltration and Web Attack-Sql Injection.
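The merge itself is a straightforward concatenation of the per-day CSV files released for the two datasets; a pandas sketch is shown below. The directory layout and the "Label" column name are assumptions (CICIDS headers often carry stray spaces, hence the strip).

```python
# Sketch: building the merged CICIDS-17-18 table from the per-day CSV files
# of the two source datasets. File paths are placeholders.
import glob
import pandas as pd

frames = []
for path in glob.glob("cicids2017/*.csv") + glob.glob("cicids2018/*.csv"):
    df = pd.read_csv(path, low_memory=False)
    df.columns = df.columns.str.strip()   # CICIDS headers often have leading spaces
    frames.append(df)

merged = pd.concat(frames, ignore_index=True)
print(merged["Label"].value_counts())     # per-class counts, as summarized in Table 2
```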

3.2.2. Dataset Resampling

We randomly split each dataset into a training set and a test set at a ratio of 8:2. SPE-ACGAN then resampled the training sets of the CICIDS-2017, CICIDS-2018 and CICIDS-17-18 datasets with the aim of moderating their extreme imbalance. ACGAN synthesized data for the minority categories and SPE trimmed the majority categories, so that the amount of benign traffic is close to the total amount of malicious traffic and the proportions of the different malicious classes are close to one another.
The resampling amount is an important factor in resampling quality. After experimenting with minority-class thresholds of 5000, 10,000 and 20,000, we define a category with fewer than 10,000 samples as a minority category, which is oversampled to increase its count by 10,000. Categories with more than 20,000 and fewer than 50,000 samples are reduced by 50% (rounded down), and categories with more than 50,000 samples are reduced by 70% (rounded down). Finally, the amount of benign traffic is adjusted to be approximately equal to the total amount of malicious traffic.
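The threshold rule can be written as a small helper, as sketched below. The treatment of classes between 10,000 and 20,000 samples is inferred from Tables 3 to 5 (they appear to be left unchanged), so that boundary is an assumption; the Benign budget is set separately to roughly match the malicious total.

```python
def target_count(n):
    """Per-class resampling target following the thresholds described above
    (Benign is budgeted separately to roughly match the malicious total)."""
    if n < 10_000:          # minority class: add 10,000 synthetic samples
        return n + 10_000
    if n <= 20_000:         # mid-sized class: left unchanged (inferred from Tables 3-5)
        return n
    if n <= 50_000:         # 20,000-50,000 samples: cut by 50%, rounded down
        return n // 2
    return int(n * 0.3)     # over 50,000 samples: cut by 70%, rounded down

# Examples consistent with Table 3: Heartbleed 8 -> 10,008, DDoS 102,421 -> 30,726.
```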
When generating synthetic samples, ACGAN produces 10 samples of the specified type per round, so the required count is reached by generating multiple batches of that type. When SPE undersamples, we use the classification hardness as the hardness index and compute it for each sample; the samples are reordered after each training round, and once the number of iterations is reached, samples are selected from front to back according to the required undersampling count. Table 3, Table 4 and Table 5 show the data distribution of the three datasets before and after resampling by SPE-ACGAN.

4. Experimentation and Result Analysis

In this section, we detail the experimental setup and evaluation metrics and present the experimental results to demonstrate the validity of the proposed method.

4.1. Experimental Setup

In this work, all experiments used the following environment: the deep learning framework is TensorFlow 2.4, the operating system is Windows 10 Professional, the processor is an Intel(R) Core(TM) i5-10400F CPU @ 2.90 GHz, the memory size is 32 GB, the graphics card is a single NVIDIA GeForce GTX 1080 Ti, the development environment is PyCharm and Anaconda3, and the development language is Python.
A machine learning NIDS, Random Forest [31], and three deep learning network intrusion detection models, CNN + WDLSTM (weight-dropped LSTM) [9], CNN [32] and GoogLeNet [10], are used as the validation models to verify the effectiveness of the SPE-ACGAN resampling method proposed in this paper.
The CICIDS-2017 dataset, CICIDS-2018 dataset and CICIDS-17-18 dataset are used as datasets for validating the SPE-ACGAN resampling method, and RUS [17], SMOTE [16], ACGAN and SPE are used as the resampling methods for comparison.

4.2. Performance Metrics

In network intrusion detection, many evaluation metrics are available. In this paper, we use Precision (P), Recall (R) and F1-score (F1) as the criteria to evaluate model performance.
P: the proportion of samples predicted as attacks by the classifier that are actually attacks; the formula is shown in (3):
P = TP / (TP + FP)
R: the proportion of actual attack samples that are correctly classified as attacks by the classifier; the formula is shown in (4):
R = TP / (TP + FN)
F1: the harmonic mean of precision and recall, used to check the stability of the system by considering precision and recall together; the formula is shown in (5):
F1 = 2 × P × R / (P + R)
True Positive (TP) means the classifier correctly predicts a positive sample as positive; True Negative (TN) means it correctly predicts a negative sample as negative; False Positive (FP) means it incorrectly predicts a negative sample as positive; and False Negative (FN) means it incorrectly predicts a positive sample as negative.
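The sketch below computes these three metrics with scikit-learn on dummy predictions. Since the paper reports one P/R/F1 value per model on a multi-class test set but does not state the averaging scheme, the weighted average used here is an assumption.

```python
from sklearn.metrics import precision_recall_fscore_support

# Dummy multi-class labels and predictions, purely for illustration.
y_true = [0, 0, 1, 2, 2, 2, 1, 0]
y_pred = [0, 1, 1, 2, 2, 0, 1, 0]

# One summary P/R/F1 over all classes; the weighted average is an assumed choice.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0)
print(f"P={precision:.2%}  R={recall:.2%}  F1={f1:.2%}")
```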
Table 6 summarizes the performance changes of each NIDS after SPE-ACGAN resampling of the CICIDS-2017 and CICIDS-2018 datasets. After resampling of the CICIDS-2017 dataset, Random Forest achieved 93.03%, 94.93% and 93.97% in Precision, Recall and F1-score, 0.86%, 1.14% and 1.00% higher than before resampling. CNN + WDLSTM achieved 98.68%, 98.88% and 98.78%, 0.61%, 0.46% and 0.54% higher than before resampling. CNN achieved 96.85%, 98.11% and 97.48%, 0.17%, 0.06% and 0.12% higher than before resampling.
In addition, after the CICIDS-2018 dataset was resampled by SPE-ACGAN, Random Forest achieved 92.70%, 90.64% and 91.66% in the Precision, Recall and F1-score metrics, 1.02%, 0.99% and 1.01% higher than before resampling. CNN + WDLSTM achieved 95.92%, 96.13% and 96.02% in the Precision, Recall and F1-score metrics, 0.95%, 1.25% and 1.39% higher than before resampling. CNN achieved 94.71%, 93.33% and 94.01% in the Precision, Recall and F1-score metrics, 1.09%, 1.23% and 1.67% higher than before resampling. GoogLeNet achieved 93.17%, 92.43% and 92.80% in the Precision, Recall and F1-score metrics, 0.23%, 1.04% and 1.09% higher than before resampling. The above experimental results all show that the SPE-ACGAN resampling method can moderate the network traffic imbalance problem.
Table 7 summarizes the comparison of SPE-ACGAN with other methods in the CICIDS-17-18 dataset experiments. After resampling by SPE-ACGAN, Random Forest achieved 75.63%, 77.14% and 76.38% in Precision, Recall and F1-score, 2.02%, 2.77% and 5.59% higher than before resampling. CNN + WDLSTM achieved 82.23%, 82.54% and 82.38%, 2.77%, 3.30% and 3.56% higher than before resampling. CNN achieved 83.94%, 82.78% and 81.66%, 5.41%, 5.48% and 3.75% higher than before resampling. GoogLeNet achieved 77.57%, 80.20% and 78.86%, 3.41%, 3.80% and 3.60% higher than before resampling. The performance of each NIDS on the Precision, Recall and F1-score metrics before and after SPE-ACGAN resampling is shown in Figure 4, Figure 5 and Figure 6.
Furthermore, after resampling by the proposed SPE-ACGAN method, the F1-scores of Random Forest, CNN + WDLSTM, CNN and GoogLeNet are 76.38%, 82.38%, 81.66% and 78.86%, respectively. After resampling by RUS, the F1-scores of Random Forest, GoogLeNet, CNN and CNN + WDLSTM are 75.14%, 77.94%, 75.03% and 78.36%. After resampling by SMOTE, they are 76.23%, 78.96%, 75.32% and 81.11%. After resampling by SPE, they are 75.40%, 76.57%, 80.32% and 81.42%. After resampling by ACGAN, they are 74.49%, 76.52%, 78.08% and 77.70%. Comparing these results shows that the resampling method proposed in this paper yields the best overall improvement across the NIDS models considered.

5. Conclusions

The sample imbalance of network traffic is one of the important factors affecting the detection performance of NIDS classifiers. The rationale behind our proposed resampling approach is to balance the amount of malicious traffic against the amount of benign traffic and, likewise, to balance the amounts of the different categories of malicious traffic. We propose SPE-ACGAN, a resampling method combining ACGAN and SPE, which balances network traffic by eliminating majority class samples and generating minority class samples. Compared to existing oversampling methods, ACGAN can generate data of a specified category, whereas a plain GAN generates data randomly and its output must be filtered again to find data matching the features of the desired category. Moreover, unlike SMOTE, ACGAN does not need to traverse the neighbors of minority class samples, which greatly improves efficiency. Compared to RUS, SPE largely retains the samples that lie on the classification boundary rather than removing majority class samples at random. Experimental results show that the proposed resampling method alleviates the sample imbalance problem of NIDS, improving the performance of multi-class NIDS and achieving a better improvement than other resampling methods.
In NIDS, capturing attack samples is difficult, and generating attack samples is even harder because verifying the validity of generated samples is not easy. Handling few-shot and zero-shot attack classes will therefore be an important direction for NIDS.
In addition, class imbalance occurs in many everyday scenarios, such as the gap between rare disease diagnoses and healthy cases. Resampling techniques enable a model to cope with the imbalance by letting it learn the features of the minority class samples during training.

Author Contributions

Conceptualization, H.Y., J.X., Y.X. and L.H.; methodology, J.X. and H.Y.; validation, L.H. and J.X.; formal analysis, J.X. and Y.X.; resources, J.X.; writing—original draft preparation, J.X. and L.H.; writing—review and editing, L.H. and Y.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets utilized in this paper are the CICIDS-2017 dataset (https://www.unb.ca/cic/datasets/ids-2017.html, accessed on 7 June 2022) and the CICIDS-2018 dataset (https://www.unb.ca/cic/datasets/ids-2018.html, accessed on 7 June 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Molina-Coronado, B.; Mori, U.; Mendiburu, A.; Miguel-Alonso, J. Survey of network intrusion detection methods from the perspective of the knowledge discovery in databases process. IEEE Trans. Netw. Serv. 2020, 4, 2451–2479. [Google Scholar]
  2. Viegas, E.K.; Santin, A.O.; Oliveira, L.S. Toward a reliable anomaly-based intrusion detection in real-world environments. Comput. Netw. 2017, 11, 200–216. [Google Scholar] [CrossRef]
  3. Aggarwal, C.C. Data Mining: The Textbook; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
  4. Shyu, M.L.; Chen, S.C.; Sarinnapakorn, K.; Chang, L.W. A novel anomaly detection scheme based on principal component classifier. In Proceedings of the IEEE Foundation and New Direction of Data Mining Workshop, Melbourne, FA, USA, 19–22 November 2003; pp. 172–179. [Google Scholar]
  5. Goodall, J.R.; Ragan, E.D.; Steed, C.A.; Reed, J.W. Situ: Identifying and explaining suspicious behavior in networks. IEEE Trans. Vis. Comput. Graph. 2019, 1, 204–214. [Google Scholar] [CrossRef] [PubMed]
  6. Depren, O.; Topallar, M.; Anarim, E.; Ciliz, M.K. An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Syst. Appl. 2005, 4, 713–722. [Google Scholar] [CrossRef]
  7. Bhuyan, M.H.; Bhattacharyya, D.K.; Kalita, J.K. A multi-step outlier-based anomaly detection approach to network-wide traffic. Inf. Sci. 2016, 6, 243–271. [Google Scholar] [CrossRef]
  8. Wu, K.; Chen, Z.; Li, W. A novel intrusion detection model for a massive network using convolutional neural networks. IEEE Access 2018, 9, 50850–50859. [Google Scholar] [CrossRef]
  9. Hassan, M.M.; Gumaei, A.; Alsanad, A.; Alrubaian, M.; Fortino, G. A hybrid deep learning model for efficient intrusion detection in big data environment. Inf. Sci. 2019, 3, 386–396. [Google Scholar] [CrossRef]
  10. Li, Z.P.; Qin, Z.; Huang, K.; Yang, X.; Ye, S.X. Intrusion detection using convolutional neural networks for representation learning. In Proceedings of the NIP 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 858–866. [Google Scholar]
  11. Bedi, P.; Gupta, N.; Jindal, V. I-SiamIDS: An improved Siam-IDS for handling class imbalance in network-based intrusion detection systems. Appl. Intell. 2021, 2, 1133–1151. [Google Scholar] [CrossRef]
  12. Bedi, P.; Gupta, N.; Jindal, V. Siam-IDS: Handling class imbalance problem in Intrusion Detection Systems using Siamese Neural Network. In Proceedings of the Third International Conference on Computing and Network Communications, Vellore, India, 30–31 March 2019; Elsevier: Amsterdam, The Netherlands, 2019; pp. 780–789. [Google Scholar]
  13. Apruzzese, G.; Colajanni, M.; Ferretti, L.; Guido, A.; Marchetti, M. On the effectiveness of machine and deep learning for cyber security. In Proceedings of the International Conference on Cyber Conflict, Swissotel Tallinn, Estonia, 29 May–1 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 371–390. [Google Scholar]
  14. Dong, B.; Wang, X. Comparison deep learning method to traditional methods using for network intrusion detection. In Proceedings of the IEEE International Conference on Communication Software & Networks, Beijing, China, 4–6 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 581–585. [Google Scholar]
  15. Wang, S.; Liu, W.; Wu, J.; Cao, L.; Meng, Q.; Kennedy, P.J. Training deep neural networks on imbalanced data sets. In Proceedings of the International Joint Conference on Neural Networks, Vancouver, BC, Canada, 24–29 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 4368–4374. [Google Scholar]
  16. Ma, X.Y.; Shi, W. Aesmote: Adversarial reinforcement learning with smote for anomaly detection. IEEE Trans. Netw. Sci. Eng. 2021, 2, 943–956. [Google Scholar] [CrossRef]
  17. Tahir, M.A.; Kittler, J.; Mikolajczyk, K.; Yan, F. A multiple expert approach to the class imbalance problem using inverse random under sampling. In Proceedings of the International Workshop on Multiple Classifier Systems, Reykjavik, Iceland, 10–12 June 2009; Springer: Berlin, Germany, 2009; pp. 82–91. [Google Scholar]
  18. Lee, J.; Park, K. AE-CGAN model based high performance network intrusion detection system. Appl. Sci. 2019, 9, 4221. [Google Scholar] [CrossRef] [Green Version]
  19. Odena, A.; Olah, C.; Shlens, J. Conditional image synthesis with auxiliary classifier GANs. In Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; PMLR: New York, NY, USA, 2017; pp. 2642–2651. [Google Scholar]
  20. Liu, Z.; Cao, W.; Gao, Z.; Bian, J.; Chen, H. Self-paced Ensemble for Highly Imbalanced Massive Data Classification. In Proceedings of the 36th IEEE International Conference on Data Engineering, Dallas, TX, USA, 20–24 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 841–852. [Google Scholar]
  21. Yang, J.; Sheng, Y.; Wang, J. A GBDT-paralleled quadratic ensemble learning for intrusion detection system. IEEE Access 2020, 8, 175467–175482. [Google Scholar] [CrossRef]
  22. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A.; Lloret, J. Conditional variational autoencoder for prediction and feature recovery applied to intrusion detection in IoT. Sensors 2017, 17, 1967. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  23. Rifai, S.; Vincent, P.; Muller, X.; Glorot, X.; Bengio, Y. Contractive auto-encoders: Explicit invariance during feature extraction. In Proceedings of the ICML 2011, Bellevue, WA, USA, 28 June–2 July 2011; ACM: New York, NY, USA, 2011; pp. 833–840. [Google Scholar]
  24. Wang, Z.; Li, Z.; Wang, J.; Li, D. Network intrusion detection model based on improved BYOL self-supervised learning. Secur. Commun. Netw. 2021, 2021, 9486949. [Google Scholar] [CrossRef]
  25. Vaiyapuri, T.; Binbusayyis, A. Enhanced deep autoencoder based feature representation learning for intelligent intrusion detection system. CMC—Comput. Mater. Contin. 2021, 3, 3271–3288. [Google Scholar] [CrossRef]
  26. Yan, B.H.; Han, G.D. LA-GRU: Building combined intrusion detection model based on imbalanced learning and gated recurrent unit neural network. Secur. Commun. Netw. 2018, 2018, 6026878. [Google Scholar] [CrossRef] [Green Version]
  27. Abdulhammed, R.; Faezipour, M.; Abuzneid, A.; Abumallouh, A. Deep and machine learning approaches for anomaly-based intrusion detection of imbalanced network traffic. IEEE Sens. Lett. 2019, 1, 7101404. [Google Scholar] [CrossRef]
  28. Andresini, G.; Appice, A.; Rose, L.D.; Malerba, D. GAN augmentation to deal with imbalance in imaging-based intrusion detection. Futur. Gener. Comp. Syst. 2021, 123, 108–127. [Google Scholar] [CrossRef]
  29. Park, C.; Lee, J.; Kim, Y.; Park, J.-G.; Kim, H.; Hong, D. An enhanced AI-based network intrusion detection system using generative adversarial networks. IEEE. IoT-J. 2023, 10, 2330–2345. [Google Scholar] [CrossRef]
  30. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A.A. Toward generating a new intrusion detection dataset and intrusion traffic characterization. In Proceedings of the International Conference on Information Systems Security & Privacy, Funchal, Portugal, 22–24 January 2018; Elsevier: London, UK, 2018; pp. 108–116. [Google Scholar]
  31. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  32. Ho, S.; Jufout, S.A.; Dajani, K.; Mozumdar, M. A novel intrusion detection model for detecting known and innovative cyberattacks using convolutional neural network. IEEE Open J. Comput. Soc. 2021, 2, 14–25. [Google Scholar] [CrossRef]
Figure 1. The process of SPE.
Figure 2. The architecture of ACGAN.
Figure 3. The architecture of the proposed resampling model.
Figure 4. The performance of Precision before and after resampling.
Figure 5. The performance of Recall before and after resampling.
Figure 6. The performance of F1-score before and after resampling.
Table 1. The typical resampling methods.
Method | Oversampling | Undersampling
SMOTE | ✓ | —
RUS | — | ✓
GAN | ✓ | —
SPE-ACGAN (our method) | ✓ | ✓
Table 2. Quantity distribution of the CICIDS-17-18 dataset before and after consolidation.
Class | Samples of CICIDS-2017 | Composition (%) | Samples of CICIDS-2018 | Composition (%) | Samples of CICIDS-17-18 | Composition (%)
Benign | 2,273,097 | 80.301 | 6,376,223 | 76.041 | 8,649,320 | 76.023
FTP-Patator | 7938 | 0.281 | 193,353 | 2.306 | 201,291 | 1.771
SSH-Patator | 5897 | 0.209 | 187,588 | 2.237 | 193,485 | 1.702
Bot | 1966 | 0.070 | 285,289 | 3.402 | 287,255 | 2.523
DDos | 128,027 | 4.523 | 687,840 | 8.203 | 815,867 | 7.176
Dos GoldenEye | 10,293 | 0.364 | 461,911 | 5.509 | 472,204 | 4.155
Dos Hulk | 231,073 | 8.163 | 41,507 | 0.495 | 272,580 | 2.401
Dos Slowhttptest | 5499 | 0.195 | 139,889 | 1.668 | 145,388 | 1.282
Dos Slowloris | 5796 | 0.205 | 10,989 | 0.131 | 16,785 | 0.148
Heartbleed | 11 | 0.001 | 0 | 0 | 11 | 0.001
Infiltration | 36 | 0.001 | 161,095 | 1.921 | 161,131 | 1.416
Port Scan | 158,930 | 5.615 | 0 | 0 | 158,930 | 1.397
Web Attack-Brute Force | 1507 | 0.054 | 610 | 0.001 | 2117 | 0.001
Web Attack-Sql Injection | 21 | 0.001 | 86 | 0.001 | 107 | 0.001
Web Attack-XSS | 652 | 0.024 | 229 | 0.001 | 881 | 0.001
Table 3. CICIDS-2017 distribution of the number of training sets before and after resampling.
Class | Before Resampling | Composition (%) | After Resampling | Composition (%)
Benign | 1,818,477 | 80.301 | 300,000 | 52.685
FTP-Patator | 7938 | 0.281 | 17,938 | 3.153
SSH-Patator | 6350 | 0.209 | 16,350 | 2.874
Bot | 1572 | 0.07 | 11,572 | 2.034
DDos | 102,421 | 4.523 | 30,726 | 5.401
Dos GoldenEye | 8234 | 0.364 | 18,234 | 3.205
Dos Hulk | 184,858 | 8.163 | 55,457 | 9.748
Dos Slowhttptest | 4399 | 0.195 | 15,499 | 2.724
Dos Slowloris | 4636 | 0.205 | 14,636 | 2.573
Heartbleed | 8 | 0.001 | 10,008 | 1.759
Infiltration | 28 | 0.001 | 10,028 | 1.763
Port Scan | 127,144 | 5.615 | 38,143 | 6.705
Web Attack-Brute Force | 1295 | 0.054 | 11,295 | 1.810
Web Attack-Sql Injection | 16 | 0.001 | 10,016 | 1.761
Web Attack-XSS | 521 | 0.023 | 10,521 | 1.847
Table 4. CICIDS-2018 distribution of the number of training sets before and after resampling.
Class | Before Resampling | Composition (%) | After Resampling | Composition (%)
Benign | 509,778 | 76.041 | 509,778 | 49.708
FTP-Patator | 154,682 | 2.306 | 46,404 | 4.569
SSH-Patator | 150,070 | 2.237 | 45,021 | 6.742
Bot | 228,231 | 3.402 | 68,469 | 8.427
DDos | 550,272 | 8.203 | 165,081 | 16.255
Dos GoldenEye | 369,528 | 5.509 | 11,858 | 1.671
Dos Hulk | 33,205 | 0.495 | 16,602 | 11.257
Dos Slowhttptest | 111,911 | 1.668 | 33,573 | 1.635
Dos Slowloris | 8791 | 0.131 | 18,791 | 1.850
Heartbleed | 0 | 0 | 0 | 0
Infiltration | 128,876 | 1.921 | 38,662 | 3.807
Port Scan | 0 | 0 | 0 | 0
Web Attack-Brute Force | 488 | 0.001 | 10,488 | 1.033
Web Attack-Sql Injection | 68 | 0.001 | 10,068 | 0.991
Web Attack-XSS | 183 | 0.001 | 10,183 | 0.992
Table 5. CICIDS-17-18 distribution of the number of training sets before and after resampling.
Class | Before Resampling | Composition (%) | After Resampling | Composition (%)
Benign | 6,919,456 | 76.023 | 700,000 | 49.793
FTP-Patator | 161,032 | 1.771 | 48,309 | 3.463
SSH-Patator | 154,788 | 1.702 | 46,436 | 3.328
Bot | 229,804 | 2.523 | 68,941 | 4.942
DDos | 652,693 | 7.176 | 195,807 | 14.035
Dos GoldenEye | 377,763 | 4.155 | 113,328 | 8.123
Dos Hulk | 218,064 | 2.401 | 65,419 | 4.689
Dos Slowhttptest | 116,311 | 1.282 | 34,893 | 2.501
Dos Slowloris | 13,428 | 0.148 | 13,428 | 0.963
Heartbleed | 8 | 0.001 | 10,008 | 0.713
Infiltration | 128,904 | 1.416 | 38,671 | 0.646
Port Scan | 127,144 | 1.397 | 38,143 | 2.771
Web Attack-Brute Force | 1693 | 0.001 | 11,634 | 0.834
Web Attack-Sql Injection | 86 | 0.001 | 10,086 | 0.723
Web Attack-XSS | 704 | 0.001 | 10,704 | 0.761
Table 6. The outcome of the proposed method before and after resampling.
Method | CICIDS-2017 P (%) | CICIDS-2017 R (%) | CICIDS-2017 F1 (%) | CICIDS-2018 P (%) | CICIDS-2018 R (%) | CICIDS-2018 F1 (%)
Random Forest | 92.17 | 93.79 | 92.97 | 91.68 | 89.65 | 90.65
GoogLeNet | 92.88 | 94.53 | 93.69 | 92.94 | 91.39 | 91.71
CNN | 96.68 | 98.05 | 97.36 | 93.62 | 92.10 | 92.34
CNN + WDLSTM | 98.07 | 98.42 | 98.24 | 94.97 | 94.88 | 94.63
Our Proposed + Random Forest | 93.03 | 94.93 | 93.97 | 92.70 | 90.64 | 91.66
Our Proposed + GoogLeNet | 93.34 | 94.10 | 93.72 | 93.17 | 92.43 | 92.80
Our Proposed + CNN | 96.85 | 98.11 | 97.48 | 94.71 | 93.33 | 94.01
Our Proposed + CNN + WDLSTM | 98.68 | 98.88 | 98.78 | 95.92 | 96.13 | 96.02
Table 7. Comparison of the proposed method and different methods.
Method | CICIDS-17-18 P (%) | CICIDS-17-18 R (%) | CICIDS-17-18 F1 (%)
Random Forest | 73.34 | 74.37 | 70.79
GoogLeNet | 74.16 | 76.40 | 75.26
CNN | 78.53 | 77.30 | 77.91
CNN + WDLSTM | 79.46 | 79.24 | 78.82
RUS + Random Forest | 75.13 | 75.15 | 75.14
RUS + GoogLeNet | 76.58 | 79.36 | 77.94
RUS + CNN | 78.38 | 78.23 | 75.03
RUS + CNN + WDLSTM | 77.68 | 79.06 | 78.36
SMOTE + Random Forest | 74.76 | 79.87 | 76.23
SMOTE + GoogLeNet | 77.82 | 80.14 | 78.96
SMOTE + CNN | 78.09 | 81.96 | 75.32
SMOTE + CNN + WDLSTM | 80.55 | 81.67 | 81.11
SPE + Random Forest | 75.58 | 75.22 | 75.40
SPE + GoogLeNet | 75.69 | 77.58 | 76.57
SPE + CNN | 80.53 | 80.12 | 80.32
SPE + CNN + WDLSTM | 81.02 | 81.82 | 81.42
ACGAN + Random Forest | 74.44 | 74.55 | 74.49
ACGAN + GoogLeNet | 75.55 | 77.52 | 76.52
ACGAN + CNN | 78.98 | 77.21 | 78.08
ACGAN + CNN + WDLSTM | 78.36 | 77.06 | 77.70
Our Proposed + Random Forest | 75.63 | 77.14 | 76.38
Our Proposed + GoogLeNet | 77.57 | 80.20 | 78.86
Our Proposed + CNN | 83.94 | 82.78 | 81.66
Our Proposed + CNN + WDLSTM | 82.23 | 82.54 | 82.38
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yang, H.; Xu, J.; Xiao, Y.; Hu, L. SPE-ACGAN: A Resampling Approach for Class Imbalance Problem in Network Intrusion Detection Systems. Electronics 2023, 12, 3323. https://doi.org/10.3390/electronics12153323