MCGAN: Modified Conditional Generative Adversarial Network (MCGAN) for Class Imbalance Problems in Network Intrusion Detection System

Abstract: With developing technologies, network security is critical for modern, highly active, and distributed ad hoc networks. An intrusion detection system (IDS) plays a vital role in cyber security by detecting malicious activities in network traffic. However, class imbalance poses a challenging issue, where some classes have many more instances than others. As a result, traditional classifiers struggle to classify malicious activities and exhibit low robustness to unidentified anomalies. This paper introduces a novel technique based on a modified conditional generative adversarial network (MCGAN) to address the class imbalance problem. The proposed MCGAN handles the class imbalance issue by generating oversamples to balance the minority and majority classes. Then, the Bi-LSTM technique is incorporated to classify multi-class intrusions efficiently. The formulated model is evaluated on the NSL-KDD+ dataset using accuracy, precision, recall, FPR, and F-score to validate its efficacy. The simulation results of the proposed method are compared with other existing models. It achieved an accuracy of 95.16%, precision of 94.21%, FPR of 2.1%, and F1-score of 96.7% on the NSL-KDD+ dataset with 20 selected features.


Introduction
In recent years, the evolution of information technology and security protocols has led to an exponential increase in network traffic data [1]. Most computer applications are connected to cyberspace for the efficient delivery of services such as browsing, social media, e-mail, etc. In addition, different security modules are built into network applications to tackle network intrusions. Network intrusions are unsolicited traffic behaviors that are prone to malicious attacks and harmful to host networks. These hostile invasions take various forms, including denial-of-service (DoS) attacks, stealing user information through identity theft, phishing attacks, etc. This creates security problems in cloud storage and can leak the confidentiality of users' data in a shared environment. Intruders execute these attacks with malicious nodes or malware to compromise the host system. Hence, network security researchers introduced intrusion detection systems to handle anomalous network behavior [2].
IDS is recognized as one of the most powerful and promising techniques. It aids in perceiving threats and malicious actions by monitoring computer traffic information, and alerts are raised when threats are noticed. Generally, the detection of malicious activities falls into two categories: signature-based discovery and anomaly-based discovery. The signature-based detection method works like an antivirus application, comparing the current task with historical task features. In contrast, the anomaly-based detection method compares current behavior against regular traffic to make a decision. In the NSL-KDD dataset, the network attacks are characterized into four major divisions, namely remote-to-local (R2L), denial-of-service (DoS), probe, and user-to-root (U2R) attacks [3].
Many network security researchers have recently incorporated learning models into intrusion detection systems to determine network attacks accurately. Learning models are widely used because of their efficacy in processing high-dimensional data, their evolution-based learning capability, and automatic feature extraction. Most recent works used traditional machine learning models to handle intrusion detection, namely support vector machines [4], XGBoost [5], naive Bayes [6], KNN [7], and random forests [8]. In addition, deep learning models have been utilized in several recent works, namely recurrent neural networks [9], multilayer perceptrons [10], and convolutional neural networks [11]. They have proven their efficacy in detecting attacks with improved accuracy.
Nevertheless, although the existing techniques have made thoughtful improvements, class-imbalanced data remain a challenging issue that hampers the performance of most IDS. Class imbalance arises when the number of standard trials is significantly higher than the number of intruder trials. Standard activities therefore dominate in real networks, leading to the misclassification of intrusions [12]. A quantitative indicator of the severity of class imbalance is the disparity ratio between the dominant and marginal classes. For illustration, a practical network intrusion dataset has nearly 2.2 million standard trials and only 0.5 million intrusion trials, giving an imbalance ratio of 4.4:1. Most samples therefore belong to the majority classes, while the minority samples are neglected. The trials of minority classes are minimal and insufficient for traditional techniques, which cannot learn adequately from them, so the outcomes are biased towards the majority. Misdetecting minority samples (intrusions) is much more critical than flagging standard trials as intrusions.
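The imbalance ratio mentioned above is simple arithmetic; as a quick sanity check (the function name is ours, not from the paper):

```python
# Imbalance ratio between the majority (standard) and minority (intrusion)
# classes, using the counts from the illustration in the text:
# ~2.2 million standard trials versus ~0.5 million intrusion trials.
def imbalance_ratio(majority_count: int, minority_count: int) -> float:
    """Return the majority:minority ratio as a float."""
    return majority_count / minority_count

ratio = imbalance_ratio(2_200_000, 500_000)
print(f"imbalance ratio = {ratio:.1f}:1")  # imbalance ratio = 4.4:1
```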
In this work, we introduce a novel learning technique, the modified conditional generative adversarial network (MCGAN), to handle the class disparity issue by generating adequate trials for the minority classes. The MCGAN technique filters the generated data to guarantee that only minority-class samples are produced, improving real intrusion discovery. An MCGAN-based IDS model is built to handle the class imbalance issue and incorporates three strategies: feature extraction, CGAN, and a deep neural network.
The main objectives of this work are described as follows:
• A novel technique, namely the modified conditional generative adversarial network (MCGAN), is introduced to mitigate the class disparity issue.
• A linear correlation-based feature selection method is introduced to select the significant features, and the Bi-LSTM technique is used to classify the sub-classes of intrusion.
• The proposed technique is experimented on the NSL-KDD+ dataset, and the efficiency of attack detection is analyzed with measurable estimations.
• The outcome of the proposed technique is compared with traditional techniques under various selected feature sets to validate the system's efficacy.
The rest of this paper is organized as follows. Section 2 discusses the existing methodologies and their limitations. Section 3 illustrates the datasets and problem formulation. Section 4 discusses the proposed model's merits in handling the class imbalance problem and the classification of multi-class intrusions. Section 5 describes the proposed model's experimental setup and analyzes its outcome against existing methods. Finally, Section 6 concludes the work and outlines future directions.

Literature Survey
With the massive increase in network traffic data, learning-based network intrusion detection systems (NIDS) have been introduced to eradicate unauthentic network actions and possible hidden threats [13]. However, various issues exist in the design and implementation of learning-based NIDS. (1) Standard network intrusion discovery approaches frequently use dimension reduction, compression, and filtering to reduce measurement noise and tackle the intricacy of dealing with large-scale, high-dimensional data. Consequently, extracting features for incursion behaviors is likely to eliminate concealed but essential information, which might result in high false discovery rates. (2) For learning-based NIDS to accurately identify the aspects of intrusion activities, a significant volume of labelled data samples is often needed. The amount and quality of the labelled training set significantly impact how well the NIDS performs. Generating superior, significant training instances is challenging because training samples are traditionally labelled manually, which is a time-consuming and error-prone process. (3) NIDS must react to intrusion behaviors in real time to minimize the loss under an attack. For instance, overflow attacks are frequently concealed in network traffic that can get past the firewall. If such assaults are not quickly identified and stopped, the perpetrator may use them as a launchpad to transmit a flood of malicious traffic to the intranet and leave a backdoor open in the breached system. (4) Class-imbalanced data, consisting of minimal minority classes and huge majority classes, lead to misclassification. Hence, handling class imbalance requires an optimized and parallelized architecture for a real-time intrusion detection system [14].
The deep learning model has attracted researchers and industry personnel to handle complex problems. It has been given significant importance in research on cyber security to improve the research quality [15]. Yin et al. [9] proposed an RNN-based IDS to classify multi-class intrusion on the NSL-KDD dataset. They tested the model by varying numerous hidden layers and learning factors. The model outcome in terms of accuracy is satisfactory. However, the multi-class classification, especially on U2R and Probe, is unsatisfactory. The author in [16] introduced a distributed approach using DBN and an ensemble multi-layer SVM model for large-scale NIDS. The DBN model was utilized to extract the features; the extracted features were then provided as input to the ensemble SVM model. The outcome was performed based on the majority voting technique. The model was validated on four different datasets and offers better accuracy in detecting abnormal behaviors.
Vinayakumar et al. [17] designed a scalable DNN model to address both HIDS and NIDS abnormal behavior. They utilized the Apache Spark cluster tool for experimental purposes, and six different datasets were used to validate the model's performance, which is superior to the other compared models. However, the authors did not address the class imbalance issue, which limits the model. Punam et al. [18] addressed NIDS issues using a Siamese neural network, concentrating on eradicating class imbalance issues. The model combines oversampling and random undersampling mechanisms to mitigate the problem. However, the model's performance is not satisfactory in classifying multi-class intrusions. Later, the same authors [19] improved the Siamese neural network to improve classification accuracy, incorporating a DL method to enrich the detection accuracy rate.
Gupta et al. [20] introduced the LIO-IDS framework to handle class imbalance issues and to improve classification accuracy. The model incorporates the LSTM technique to classify the multi-class intrusions efficiently. The experimentation was carried out on standard datasets and attained better accuracy than other compared models. Tang et al. [21] proposed a DNN-based IDS to secure the software-defined networking platform. They achieved an accuracy of 75% on the standard dataset of NSL-KDD. Later, Wang et al. [22] presented a hybrid CNN and LSTM model to extract the three-dimensional and time-based features. The model provides better outcomes in detecting the intrusion on the standard NSL-KDD dataset.
Based on the numerous studies in the literature [23][24][25], we observed that only a few works address the class imbalance problem. Hence, a promising model is required to handle the class disparity issue and improve multi-class classification performance. This work introduces a novel network intrusion detection technique using a modified deep learning model to mitigate the imbalance issue and improve detection accuracy.

Problem Definition
Let us consider Ψ as a feature set with η features, and assume $(A_i, B_i)$ as samples, where $A_i \in \vartheta$ denotes actual network traffic trials and $B_i \in \gamma$ represents the actual labelled classes, with γ denoting the set of class types. The objective of the IDS is to acquire a classifier $f: \vartheta \rightarrow \gamma$ that accurately characterizes the arriving network traffic. The motive of the attacker is to create unnoticeable attack data ρ that is incorporated into the actual sample $A_i$ to establish an attacker sample $A^* = A_i + \rho$, which is still classified as $B_i$. In this work, we formulated a novel framework with the aid of a modified GAN to generate sufficient $A^*$ samples that allow ML/DL techniques to gain sufficient knowledge and balance the minority and majority classes. Furthermore, we also introduce a conditional GAN-based IDS system that strengthens ML/DL approaches for the efficient classification of attacker samples.

Dataset: NSL-KDD
The NSL-KDD dataset is considered one of the standard benchmark datasets for validating IDS models. This dataset is collected from a real-time network scenario and consists of KDD Train+ and Test+ samples. In addition, it includes four major categories of malicious traffic, namely remote-to-local (R2L), probing (probe), denial-of-service (DoS), and user-to-root (U2R) attacks. The dataset contains 41 features, with 9 discrete features and 32 continuous features. Based on the analysis of each element, these features are classified into four significant sets, namely "content", "host-based traffic", "intrinsic", and "time-based traffic" [26].

Proposed Methodology
This section describes four processes: first, a data preprocessing technique to prepare the data; second, a modified CGAN technique to handle the imbalance issue; third, a feature selection technique to select the optimal features; and finally, the Bi-LSTM technique for efficient classification.

Data Preprocessing
Numeric adaptation and regularization are used to preprocess the data before it is fed into the models, since NSL-KDD has many feature types and ranges. There are three embedded non-numerical characteristics (protocol type, service, and flag). For instance, the three values of "protocol type", TCP, UDP, and ICMP, are transformed into one-hot vectors. The min-max regularization approach converts the numeric features of the data into the range [0, 1] to remove the range influence among feature values in the input vectors. The min-max regularization is formulated as

$$\tilde{s} = \frac{s - \beta_L}{\beta_U - \beta_L}$$

where $\beta_U$ and $\beta_L$ specify the maximal and minimal dimensional limits, respectively.
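A minimal sketch of the preprocessing described above, assuming plain Python and illustrative values (the paper's own implementation is not given):

```python
# Sketch of the two preprocessing steps: one-hot encoding of the
# "protocol type" attribute and min-max scaling into [0, 1].
# Sample values are illustrative, not taken from NSL-KDD itself.

PROTOCOLS = ["tcp", "udp", "icmp"]

def one_hot_protocol(value):
    """Map a protocol string to a one-hot vector over PROTOCOLS."""
    return [1 if value == p else 0 for p in PROTOCOLS]

def min_max_scale(column):
    """Scale a numeric feature column into [0, 1] via min-max regularization."""
    lo, hi = min(column), max(column)
    if hi == lo:                       # constant column: no range to remove
        return [0.0] * len(column)
    return [(v - lo) / (hi - lo) for v in column]

print(one_hot_protocol("udp"))                 # [0, 1, 0]
print(min_max_scale([0.0, 2.0, 5.0, 10.0]))    # [0.0, 0.2, 0.5, 1.0]
```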

Modified Conditional Generative Adversarial Network (MCGAN)
A generative adversarial network (GAN) is a deep learning technique that mimics the game-theoretic concept of a two-player zero-sum game and is utilized to handle large-scale, real-time complex information. This technique is used to oversample the available data by balancing the minority and majority classes [27]. GAN incorporates two neural networks, namely a generator (g) and a discriminator (D). Let g be used to analyze the distribution of actual input samples $S = \{s_1, s_2, \ldots, s_n\}$ and create a new set of data samples, and let D be a binary classifier utilized to specify whether s is original or generated data (z). The classified outcome is fed back to g and D to reduce the loss. The process repeats until D can no longer reliably distinguish original from generated data samples. The learning method is a mini-max game that seeks a Nash equilibrium between g and D. The optimization function for GAN is represented as

$$\min_{g}\max_{D} V(D, g) = \mathbb{E}_{s \sim p(s)}[\log D(s)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(g(z)))] \tag{1}$$

where $p(s)$ specifies the distribution of actual data instances, $g(z)$ maps the noise information z to the sample space, and $D(s)$ denotes the probability that instance s is actual data. Moreover, $D(s)$ must score higher than $D(g(z))$ to differentiate the actual and generated data. The traditional GAN technique has the drawback of mode collapse, in which the generator covers only one mode of the data instead of the entire distribution. This problem may occur when the actual sample distribution is multi-modal. To handle the above issue, we adopt the conditional generative adversarial network (CGAN), a modified GAN in which the categorical (class) data x and the noise information are combined with the actual samples as input to g and D with a conditional loss:

$$\min_{g}\max_{D} V(D, g) = \mathbb{E}_{s \sim p(s)}[\log D(s|x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(g(z|x)))] \tag{2}$$

where x denotes the class details and the other parameters are as specified in Equation (1). CGAN is effective at learning conditioned on the existing distribution samples. The working process of CGAN is presented in Figure 1. Generally, GAN and CGAN create a set of instances and minimize class disparity issues. Nevertheless, their use of the Jensen-Shannon divergence requires overlap between the distributions of actual and generated cases, which is often unrealistic. When D is trained to optimality, this can lead to mode collapse and vanishing-gradient issues [28].
In this work, we modified CGAN by merging the Lipschitz boundary and the Wasserstein distance to handle the above issues. These incorporated techniques act on D and the z vector and map the class labels to g. Here, we utilize D to score the actual and created instances s. If D fails to distinguish the genuine and generated s, we fine-tune g and train D, and vice versa. We repeat the process until the loss rate stabilizes at about 0.5. The proposed model can create data of a specified pattern to balance the imbalanced dataset by eradicating the vanishing gradients that affect D during training. The fitness function of MCGAN is presented below:

$$L = \mathbb{E}_{\tilde{s} \sim p(g)}\big[D(\tilde{s}|x)\big] - \mathbb{E}_{s \sim p(s)}\big[D(s|x)\big] + \phi\,\mathbb{E}_{s \sim p(\omega)}\Big[\big(\|\nabla_{s} D(s|x)\| - 1\big)^{2}\Big] \tag{3}$$

where φ denotes an artificial factor (the penalty coefficient), $\|\nabla_{s} D(s|x)\|$ specifies the gradient norm of $D(s)$ with respect to s, and $s \sim p(\omega)$ samples the centre locations of the lines linking points drawn from $p(s)$ and $p(g)$.
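The Wasserstein gradient-penalty idea behind the MCGAN fitness function can be illustrated with a toy example. The sketch below assumes a deliberately simple linear critic D(s) = w·s + b, whose gradient with respect to s is just w, so the penalty term is computable in closed form; all names and values are illustrative, not the paper's implementation:

```python
import numpy as np

# Gradient-penalty term of a Wasserstein-style objective: penalize
# phi * (||grad_s D(s)|| - 1)^2 at points sampled uniformly on the lines
# joining real and generated samples. For a linear critic the gradient
# w.r.t. s is constant (= w), which keeps this sketch analytic.

rng = np.random.default_rng(0)
w = np.array([0.6, 0.8])   # toy critic weights; ||w|| is (numerically) 1
phi = 10.0                 # penalty coefficient (the "artificial factor")

def gradient_penalty(real, fake, w, phi):
    """Average penalty evaluated at interpolates between real and fake."""
    eps = rng.uniform(size=(real.shape[0], 1))
    interp = eps * real + (1.0 - eps) * fake  # line samples from p(omega)
    grad_norm = np.linalg.norm(w)             # gradient of a linear critic is w
    return phi * np.mean((np.full(len(interp), grad_norm) - 1.0) ** 2)

real = rng.normal(size=(4, 2))
fake = rng.normal(size=(4, 2))
print(gradient_penalty(real, fake, w, phi))   # ~0.0: ||w|| is already near 1
```

For a general (nonlinear) critic, the gradient at each interpolate would be obtained by automatic differentiation rather than in closed form.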

Feature Selection
The feature selection process is a vital method for handling high-dimensional data and reducing computational complexity. It selects the significant features from the various attributes in the problem datasets. In this work, we utilized the Nadam optimizer in the neural network to extract the components [29]. Later, we used a linear correlation-based feature selection technique that computes the linear correlation between two arbitrary feature vectors. For instance, let feature X with values $a_i$ and class Y with values $b_i$ be arbitrary vectors. Then the linear correlation C between the vectors is formulated as

$$C(X, Y) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_i (a_i - \bar{a})^2}\,\sqrt{\sum_i (b_i - \bar{b})^2}}$$

where $\bar{a}$ and $\bar{b}$ denote the expected values of X and Y. C equals zero if X and Y are linearly independent, and ±1 if they are wholly linearly dependent. The proposed feature selection technique uses this linear correlation coefficient to select the significant features and thereby minimize computational complexity.
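A hedged sketch of the linear-correlation-based selection step: compute C for each feature against the class label and keep the top k by absolute value (function names and sample data are illustrative):

```python
import numpy as np

# Rank features by |C(X, Y)| against the class label and keep the top k.
# The feature matrix X and labels y below are synthetic toy data.

def correlation(x, y):
    """Pearson linear correlation C between two vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    return 0.0 if denom == 0 else float((xc * yc).sum() / denom)

def select_top_k(features, labels, k):
    """Return (sorted) indices of the k features most correlated with labels."""
    scores = [abs(correlation(col, labels)) for col in features.T]
    return sorted(np.argsort(scores)[::-1][:k].tolist())

X = np.array([[0, 5, 1], [1, 3, 0], [2, 9, 1], [3, 1, 0]])
y = np.array([0, 1, 0, 1])
print(select_top_k(X, y, 2))   # [1, 2]: columns 1 and 2 track the label most
```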

Bidirectional Long Short-Term Memory (Bi-LSTM) Technique
LSTM is a variant of the recurrent neural network (RNN) that uses a gating mechanism to learn long-term dependencies, eradicating the vanishing-gradient issue that affects the training of generic RNNs. The LSTM technique uses different gates that allow it to retain memory over long and short time steps [30]. In this work, we utilized a bidirectional LSTM in which the first LSTM is applied directly to the input samples, whereas a second LSTM is applied to a reversed copy of the input samples. This helps the network learn more from the data, thereby improving classification accuracy.
Further, the original input is provided to the initial layer, and the reversed copy of the input samples is provided to the replicated layer. The training of Bi-LSTM is based on all earlier and forthcoming data residing within a specified time sequence. Bi-LSTM thus processes the input samples in two directions with the aid of a forward hidden layer and a backward hidden layer, which are specified mathematically as follows.
$$\overrightarrow{h}_t = \mathcal{H}\big(W_{x\overrightarrow{h}}\, x_t + W_{\overrightarrow{h}\overrightarrow{h}}\, \overrightarrow{h}_{t-1} + b_{\overrightarrow{h}}\big)$$

$$\overleftarrow{h}_t = \mathcal{H}\big(W_{x\overleftarrow{h}}\, x_t + W_{\overleftarrow{h}\overleftarrow{h}}\, \overleftarrow{h}_{t+1} + b_{\overleftarrow{h}}\big)$$

where $W_{x\overrightarrow{h}}$ and $W_{x\overleftarrow{h}}$ denote the forward and backward hidden input weight values; $b_{\overrightarrow{h}}$ and $b_{\overleftarrow{h}}$ specify the bias values of the forward and backward hidden layers; and $\mathcal{H}$ denotes the hidden-layer function, respectively. The detailed working process of the proposed model is offered in Algorithm 1. The proposed architecture of the multi-class Bi-LSTM is illustrated in Figure 2.
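The bidirectional recurrence described above can be sketched as follows. For brevity, a plain tanh recurrent cell stands in for the full LSTM gating; the point is how the forward pass (t = 1..T) and backward pass (t = T..1) are combined. Weights are random illustrative values, not trained parameters:

```python
import numpy as np

# Bidirectional recurrence sketch: run one recurrent pass forward in
# time and one over the reversed sequence, then concatenate the hidden
# states at each step. A tanh cell replaces full LSTM gating for brevity.

rng = np.random.default_rng(1)
D, H, T = 4, 3, 5                       # input dim, hidden dim, sequence length
Wx_f, Wh_f, b_f = rng.normal(size=(H, D)), rng.normal(size=(H, H)), np.zeros(H)
Wx_b, Wh_b, b_b = rng.normal(size=(H, D)), rng.normal(size=(H, H)), np.zeros(H)

def run_direction(xs, Wx, Wh, b):
    """Recurrent pass h_t = tanh(Wx x_t + Wh h_{t-1} + b) over xs."""
    h, out = np.zeros(H), []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h + b)
        out.append(h)
    return out

def bi_rnn(xs):
    """Concatenate forward and (re-aligned) backward hidden states."""
    fwd = run_direction(xs, Wx_f, Wh_f, b_f)
    bwd = run_direction(xs[::-1], Wx_b, Wh_b, b_b)[::-1]
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

seq = [rng.normal(size=D) for _ in range(T)]
outputs = bi_rnn(seq)
print(len(outputs), outputs[0].shape)   # 5 (6,)
```

In practice a deep learning framework's bidirectional LSTM layer would be used; this sketch only shows the two-direction bookkeeping.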

Experimentation and Result Analysis
In this section, we specify the experimental setup and examine the performance evaluation of the model. We also conducted sufficient experiments to validate the efficacy of the data augmentation and of the reduced-feature models discussed in Section 3. Furthermore, we compared the formulated outcome with other existing models.

Experimental Setup
In this work, all experiments were carried out on an Intel® Core™ i5-8250U processor @ 1.60 GHz (boost 1.80 GHz) with 8 GB RAM and the Windows 10 operating system. The coding and simulation of the model were executed in Python 3.8, with PyTorch 2.0 and the sklearn library. For testing and validation, samples were taken from the NSL-KDD+ dataset described in Section 3.1. The proposed model's outcome is validated with a stratified K-fold cross-validation approach with k fixed at 10. In addition, the proposed model's outcome is compared with other existing models, namely AE-CGAN-RF [28], LSSVM-IDS [31], RNN-IDS [9], and SSAE-LSTM [32], which were applied to the balanced NSL-KDD+ dataset.
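The stratified 10-fold validation can be approximated with a simple round-robin splitter; in practice a library routine such as scikit-learn's StratifiedKFold would be used, so the following is only an illustrative sketch:

```python
from collections import defaultdict

# Simplified stratified K-fold assignment: samples of each class are
# dealt round-robin across folds so every fold keeps roughly the same
# class proportions. Labels below are illustrative.

def stratified_folds(labels, k=10):
    """Return a fold index for every sample, stratified by class."""
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [0] * len(labels)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[idx] = pos % k
    return folds

labels = ["normal"] * 8 + ["dos"] * 4
print(stratified_folds(labels, k=4))
# [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] -- each fold gets 2 normal + 1 dos
```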

Performance Metrics
The standard evaluation counts of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) were utilized to validate the efficacy of the classification. In addition, the false-positive rate (FPR), precision (Ψ), recall (Υ), specificity (ϑ), accuracy (Φ), and F1-score were utilized to compare the efficacy of the formulated approach with the other compared models. The performance metrics are formulated as follows:

$$\Phi = \frac{TP + TN}{TP + TN + FP + FN}, \quad \Psi = \frac{TP}{TP + FP}, \quad \Upsilon = \frac{TP}{TP + FN}$$

$$\vartheta = \frac{TN}{TN + FP}, \quad FPR = \frac{FP}{FP + TN}, \quad F1 = \frac{2\,\Psi\,\Upsilon}{\Psi + \Upsilon}$$

Higher values of Φ, Υ, Ψ, ϑ, and F1-score indicate a better outcome of the proposed model, whereas the FPR must be low to ensure good classification quality.
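The metrics above follow directly from the four confusion counts; a small sketch with illustrative counts:

```python
# Compute the evaluation metrics from the four confusion counts.
# The tp/tn/fp/fn values below are illustrative, not from the paper.

def metrics(tp, tn, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                  # a.k.a. sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)                     # false-positive rate
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "accuracy": accuracy,
            "fpr": fpr, "f1": f1}

m = metrics(tp=90, tn=95, fp=5, fn=10)
print(f"accuracy={m['accuracy']:.3f}, fpr={m['fpr']:.3f}, f1={m['f1']:.3f}")
# accuracy=0.925, fpr=0.050, f1=0.923
```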

Result Analysis
We conducted experiments on a modified NSL-KDD+ dataset balanced by the MCGAN approach. In addition, we created the NSL-KDD+ and NSL-KDD+20 datasets, which include all 41 features and 20 selected features, respectively. To validate the outcome of the proposed model, we split the datasets into training (80%) and testing (20%) samples. The proposed model uses the linear correlation feature selection approach to reduce the features and select significant characteristics for training. In this work, we set 20 features that aid the model in learning from the selected low-dimensional parts to improve the classification outcome of the classifiers. Table 1 provides the class-wise outcome achieved by the proposed model on the NSL-KDD+ and NSL-KDD+20 datasets. In the NSL-KDD+ dataset, the F1-score outcome for the normal class was 95.78%, with a DoS of 94.31%, probe of 84.87%, R2L of 94.57%, and U2R of 81.45%. The false-positive rate (FPR) outcome for the normal class was 4.57%, with a DoS of 0.87%, probe of 2.14%, R2L of 0.51%, and U2R of 0.69%. As we specified earlier in Section 5.2, the FPR should be low to ensure that the proposed model achieves a better outcome. Further, we applied the proposed model to the 20 selected significant features, termed NSL-KDD+20. In the NSL-KDD+20 dataset, the F1-score result for the normal class was 96.91%, with a DoS of 94.87%, probe of 85.74%, R2L of 95.71%, and U2R of 82.97%. The false-positive rate (FPR) outcome for the normal class was 4.14%, with a DoS of 0.74%, probe of 2.45%, R2L of 0.47%, and U2R of 0.71%. Based on the analysis of both datasets, we noticed that the proposed model on the NSL-KDD+20 dataset provides a better outcome than on the NSL-KDD+ dataset. This outcome is achieved due to formal learning from samples by the proposed model. Figures 3-6 show the precision, recall, specificity, and F1-score of the different classes on the NSL-KDD+ and NSL-KDD+20 datasets.

Comparative Analysis of Proposed Model
To highlight the efficacy of the proposed model, its outcome is compared with the other existing approaches, which include LSSVM-IDS, AE-CGAN-RF, RNN-IDS, and SSAE-LSTM. The proposed model delivers higher accuracy than all of these compared models. Figure 7 provides the accuracy of the proposed approach and the other existing approaches on the NSL-KDD+ and NSL-KDD+20 datasets. The LSSVM-IDS model offers accuracies of 53.21% and 55.86% on the NSL-KDD+ and NSL-KDD+20 datasets, performing worse than the AE-CGAN-RF approach. AE-CGAN-RF offers 64.56% and 67.12% accuracy on the NSL-KDD+ and NSL-KDD+20 datasets, which is still unsatisfactory. The RNN-IDS and SSAE-LSTM approaches provide satisfactory accuracy outcomes of 81.42% and 84.93%, and 85.98% and 88.79%, respectively, on the NSL-KDD+ and NSL-KDD+20 datasets. Meanwhile, the formulated approach offers higher accuracies of 91.76% and 95.16% on the NSL-KDD+ and NSL-KDD+20 datasets.

Table 2 compares the overall performance of the proposed model with that of the other comparable models on the NSL-KDD+ and NSL-KDD+20 datasets. To ensure the effectiveness of the proposed model, the accuracy, recall, FAR, and F1-score of LSSVM-IDS, AE-CGAN-RF, RNN-IDS, and SSAE-LSTM were measured. Compared to these models, the proposed model has higher accuracy and recall rates. Furthermore, the proposed model achieves false alarm rates of 1.85% and 1.06% for the NSL-KDD+ and NSL-KDD+20 datasets, respectively, which the compared models fail to match. The proposed model's F1-score likewise outperforms the compared models on both datasets. Based on the comparative result analysis, we conclude that the proposed model identifies various classes of known and unknown attacks by boosting the learning accuracy of low-dimensional characteristics, and its overall performance surpasses the other models in terms of accuracy. Furthermore, the proposed model may be integrated into a real-time intrusion detection system to increase detection speed and accuracy.

Conclusions
Class imbalance is considered a severe issue that can lead to poor detection accuracy in network intrusion detection systems, and an efficient detection approach is necessary to eradicate it. This work introduced a modified conditional generative adversarial network (MCGAN) that handles the imbalance issue by generating a good set of samples. In addition, the Nadam optimizer was used for feature extraction, and linear-correlation-based feature selection was utilized to select the significant features. Later, the Bi-LSTM approach was used to classify the attacks into their different classes. The experimentation was carried out on the NSL-KDD+ dataset with balanced data samples and on NSL-KDD+20 with 20 selected features. The proposed model was applied to those datasets, and its performance was measured using standard performance metrics such as precision, recall, accuracy, specificity, false-positive rate, and F1-score. The outcome of the proposed model was compared with other state-of-the-art approaches, including LSSVM-IDS, AE-CGAN-RF, RNN-IDS, and SSAE-LSTM. The proposed model achieved an overall accuracy of 91.76% on NSL-KDD+ and 95.16% on NSL-KDD+20, better than all compared models. In future, this work can be extended by incorporating a meta-heuristic algorithm to choose the optimal features and to improve the model's accuracy while reducing computational complexity.