# Improved Bidirectional GAN-Based Approach for Network Intrusion Detection Using One-Class Classifier


## Abstract


## 1. Introduction

- We relax the requirement that the generator and the discriminator be trained in sync. In our proposed model, the generator (along with the encoder) goes through more training iterations so that it produces a more reliable synthetic data set that closely resembles the real traffic samples. This can effectively remove the overhead associated with an over-trained discriminator. Our proposed model shows that allowing the generator (and encoder) to train more often than the discriminator greatly improves the generator’s performance, which in turn also improves the discriminator’s performance.
- In our proposed model, a cross-entropy loss is used to keep track of the overall balance between the relative numbers of training iterations required for the generator and the discriminator. In addition, we employ the $-\log(D)$ trick, which inverts the label, so that the generator obtains sufficient gradient in the early training stage.
- We offer a new construction of a one-class classifier that uses the trained encoder and discriminator to separate anomalous traffic from normal traffic, instead of calculating anomaly scores or thresholds, which is computationally expensive and complex.
- Our experimental results show that the proposed method is highly effective for GAN-based network anomaly detection, achieving an F1-score of more than 92% on the NSL-KDD dataset and more than 99% on the CIC-DDoS2019 dataset.
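The $-\log(D)$ trick and the uneven update schedule named in the contributions above can be illustrated with a small, dependency-free sketch. It assumes that the discriminator output is a sigmoid of a logit `a`, and that each iteration performs one discriminator update followed by `steps` generator/encoder updates. Both are illustrative assumptions, and the helper names (`saturating_g_loss`, `non_saturating_g_loss`, `update_counts`) are hypothetical rather than taken from the authors' code.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def bce(label, p):
    """Binary cross-entropy of one probability p against a 0/1 label."""
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def saturating_g_loss(a):
    """Original minimax generator loss log(1 - D(G(z))), with D = sigmoid(a)."""
    return math.log(1.0 - sigmoid(a))


def non_saturating_g_loss(a):
    """-log(D(G(z))): cross-entropy of a fake sample against the inverted label 1."""
    return bce(1, sigmoid(a))


def grad(f, a, eps=1e-6):
    """Central-difference estimate of df/da."""
    return (f(a + eps) - f(a - eps)) / (2 * eps)


def update_counts(n_iterations, steps):
    """Count updates when G/E receive `steps` updates per discriminator update
    (an assumed schedule, not necessarily the one used in the paper)."""
    d_updates = g_updates = 0
    for _ in range(n_iterations):
        d_updates += 1           # one discriminator update
        for _ in range(steps):   # several generator/encoder updates
            g_updates += 1
    return d_updates, g_updates


a = -5.0  # early training: D confidently rejects the fake sample
g_sat = grad(saturating_g_loss, a)
g_ns = grad(non_saturating_g_loss, a)
print(f"saturating gradient:     {g_sat:.4f}")
print(f"non-saturating gradient: {g_ns:.4f}")
print(update_counts(n_iterations=1000, steps=5))
```

For a confidently rejected fake sample (`a = -5`), the inverted-label loss gives a gradient more than a hundred times larger in magnitude than the original loss, which is the behaviour the trick relies on in the early training stage.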

## 2. Related Work

## 3. Background

#### 3.1. Generic GAN

Algorithm 1: Training in Generic GAN

#### 3.2. Bidirectional GAN

Algorithm 2: Training in BiGAN

## 4. Our Proposed Model

#### 4.1. Main Components

#### 4.1.1. Encoder

#### 4.1.2. Generator

#### 4.1.3. Discriminator

#### 4.2. Training Phase

#### Training Loss Function

Algorithm 3: Training Phase of our proposed method

#### 4.3. Testing Phase

Algorithm 4: Testing Phase of our proposed method

#### 4.4. Putting It Together

## 5. Data and Preprocessing

#### 5.1. Datasets

#### 5.2. Data Preprocessing

## 6. Experimental Results

#### 6.1. Setup Environment

#### 6.2. Performance Metrics

#### 6.3. Results

#### 6.3.1. Training Loss and PCA

#### 6.3.2. Testing

#### 6.3.3. Benchmarking with Other Similar Models

## 7. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Jang-Jaccard, J.; Nepal, S. A survey of emerging threats in cybersecurity. J. Comput. Syst. Sci. **2014**, 80, 973–993.
2. Ahmad, Z.; Khan, A.S.; Shiang, C.W.; Abdullah, J.; Ahmad, F. Network intrusion detection system: A systematic study of machine learning and deep learning approaches. Trans. Emerg. Telecommun. Technol. **2021**, 32, e4150.
3. Zhu, J.; Jang-Jaccard, J.; Liu, T.; Zhou, J. Joint Spectral Clustering based on Optimal Graph and Feature Selection. Neural Process. Lett. **2021**, 53, 257–273.
4. Kingma, D.P.; Welling, M. Auto-encoding variational Bayes. arXiv **2013**, arXiv:1312.6114.
5. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. **2014**, 27, 1–9.
6. Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Schmidt-Erfurth, U.; Langs, G. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In Information Processing in Medical Imaging; Niethammer, M., Styner, M., Aylward, S., Zhu, H., Oguz, I., Yap, P.T., Shen, D., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 146–157.
7. Schlegl, T.; Seeböck, P.; Waldstein, S.M.; Langs, G.; Schmidt-Erfurth, U. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Med. Image Anal. **2019**, 54, 30–44.
8. Akcay, S.; Atapour-Abarghouei, A.; Breckon, T.P. GANomaly: Semi-supervised anomaly detection via adversarial training. In Computer Vision – ACCV 2018; Jawahar, C.V., Li, H., Mori, G., Schindler, K., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 622–637.
9. Chen, H.; Jiang, L. Efficient GAN-based method for cyber-intrusion detection. arXiv **2019**, arXiv:1904.02426.
10. Kaplan, M.O.; Alptekin, S.E. An improved BiGAN based approach for anomaly detection. Procedia Comput. Sci. **2020**, 176, 185–194.
11. Javaid, A.; Niyaz, Q.; Sun, W.; Alam, M. A deep learning approach for network intrusion detection system. EAI Endorsed Trans. Secur. Saf. **2016**, 3, e2.
12. An, J.; Cho, S. Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE **2015**, 2, 1–18.
13. Chang, Y.; Tu, Z.; Xie, W.; Yuan, J. Clustering driven deep autoencoder for video anomaly detection. In Computer Vision – ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 329–345.
14. Xu, W.; Jang-Jaccard, J.; Singh, A.; Wei, Y.; Sabrina, F. Improving Performance of Autoencoder-based Network Anomaly Detection on NSL-KDD dataset. IEEE Access **2021**, 9, 140136–140146.
15. Sadaf, K.; Sultana, J. Intrusion Detection Based on Autoencoder and Isolation Forest in Fog Computing. IEEE Access **2020**, 8, 167059–167068.
16. Aygun, R.C.; Yavuz, A.G. Network Anomaly Detection with Stochastically Improved Autoencoder Based Models. In Proceedings of the 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud), New York, NY, USA, 26–28 June 2017; pp. 193–198.
17. Zenati, H.; Foo, C.S.; Lecouat, B.; Manek, G.; Chandrasekhar, V.R. Efficient GAN-based anomaly detection. arXiv **2018**, arXiv:1802.06222.
18. Mohammadi, B.; Sabokrou, M. End-to-End Adversarial Learning for Intrusion Detection in Computer Networks. In Proceedings of the 2019 IEEE 44th Conference on Local Computer Networks (LCN), Osnabrueck, Germany, 14–17 October 2019; pp. 270–273.
19. Dumoulin, V.; Belghazi, I.; Poole, B.; Mastropietro, O.; Lamb, A.; Arjovsky, M.; Courville, A. Adversarially learned inference. arXiv **2016**, arXiv:1606.00704.
20. Donahue, J.; Krähenbühl, P.; Darrell, T. Adversarial feature learning. arXiv **2016**, arXiv:1605.09782.
21. Arjovsky, M.; Bottou, L. Towards Principled Methods for Training Generative Adversarial Networks. arXiv **2017**, arXiv:stat.ML/1701.04862.
22. Berthelot, D.; Schumm, T.; Metz, L. BEGAN: Boundary equilibrium generative adversarial networks. arXiv **2017**, arXiv:1703.10717.
23. Ieracitano, C.; Adeel, A.; Morabito, F.C.; Hussain, A. A novel statistical analysis and autoencoder driven intelligent intrusion detection approach. Neurocomputing **2020**, 387, 51–62.
24. Zenati, H.; Romain, M.; Foo, C.S.; Lecouat, B.; Chandrasekhar, V. Adversarially Learned Anomaly Detection. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; pp. 727–736.
25. Forestiero, A. Metaheuristic algorithm for anomaly detection in Internet of Things leveraging on a neural-driven multiagent system. Knowl. Based Syst. **2021**, 228, 107241.
26. Forestiero, A. Bio-inspired algorithm for outliers detection. Multimed. Tools Appl. **2017**, 76, 25659–25677.
27. Wei, Y.; Jang-Jaccard, J.; Sabrina, F.; Singh, A.; Xu, W.; Camtepe, S. AE-MLP: A hybrid deep learning approach for DDoS detection and classification. IEEE Access **2021**, 9, 146810–146821.
28. Zhu, J.; Jang-Jaccard, J.; Watters, P.A. Multi-Loss Siamese Neural Network with Batch Normalization Layer for Malware Detection. IEEE Access **2020**, 8, 171542–171550.
29. Zhu, J.; Jang-Jaccard, J.; Singh, A.; Watters, P.A.; Camtepe, S. Task-aware meta learning-based siamese neural network for classifying obfuscated malware. arXiv **2021**, arXiv:2110.13409.
30. Zhu, J.; Jang-Jaccard, J.; Singh, A.; Welch, I.; Al-Sahaf, H.; Camtepe, S. A Few-Shot Meta-Learning based Siamese Neural Network using Entropy Features for Ransomware Classification. arXiv **2021**, arXiv:2112.00668.
31. McIntosh, T.R.; Jang-Jaccard, J.; Watters, P.A. Large Scale Behavioral Analysis of Ransomware Attacks. In Neural Information Processing; Cheng, L., Leung, A.C.S., Ozawa, S., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 217–229.
32. McIntosh, T.; Jang-Jaccard, J.; Watters, P.; Susnjak, T. The Inadequacy of Entropy-Based Ransomware Detection. In Neural Information Processing; Gedeon, T., Wong, K.W., Lee, M., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 181–189.
33. Feng, S.; Liu, Q.; Patel, A.; Bazai, S.U.; Jin, C.K.; Kim, J.S.; Sarrafzadeh, M.; Azzollini, D.; Yeoh, J.; Kim, E.; et al. Automated pneumothorax triaging in chest X-rays in the New Zealand population using deep-learning algorithms. J. Med. Imaging Radiat. Oncol. **2022**, in press.

**Figure 1.** Structure of GAN. The generator G maps the input z (i.e., random noise) in latent space to a high-dimensional $G(z)$ (i.e., fake samples). The discriminator D is expected to separate x (i.e., real samples) from $G(z)$.

**Figure 2.** Structure of BiGAN. Note that z and $E(x)$ have the same dimensions, as do $G(z)$ and x. The concatenated pairs $[G(z), z]$ and $[x, E(x)]$ are the two input sources of the discriminator D. The generator G and the encoder E are optimized with the loss generated by the discriminator D.

**Figure 4.** BiGAN data flow. Encoder: input dimension 122, output dimension 10; generator: input dimension 10, output dimension 122. The discriminator concatenates the input and output of the encoder or the generator to form its own input and functions as a binary classifier.
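The data flow in the Figure 4 caption can be traced with a shape-only sketch. Random linear maps stand in for the trained encoder, generator, and discriminator, so the scores themselves carry no meaning; only the dimensions (122 and 10) come from the caption. The `classify` helper, its 0.5 cut, and the convention that a high score on the pair $[x, E(x)]$ means "normal" are assumptions made for illustration, not the authors' testing algorithm.

```python
import math
import random

X_DIM, Z_DIM = 122, 10  # dimensions quoted in the Figure 4 caption


def make_linear(in_dim, out_dim, rng):
    """Random linear map used as a stand-in for a trained network (assumption)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(vector):
        assert len(vector) == in_dim, "unexpected input dimension"
        return [sum(w * v for w, v in zip(row, vector)) for row in weights]

    return apply


rng = random.Random(0)
encoder = make_linear(X_DIM, Z_DIM, rng)         # E: x -> E(x)
generator = make_linear(Z_DIM, X_DIM, rng)       # G: z -> G(z)
disc_logit = make_linear(X_DIM + Z_DIM, 1, rng)  # D acts on a concatenated pair


def discriminator(sample, latent):
    """Score a concatenated (sample, latent) pair with a sigmoid output."""
    logit = disc_logit(list(sample) + list(latent))[0]
    return 1.0 / (1.0 + math.exp(-logit))


def classify(x, cut=0.5):
    """Hypothetical one-class decision: score [x, E(x)] with D and compare to a fixed cut."""
    return "normal" if discriminator(x, encoder(x)) >= cut else "anomalous"


x = [rng.random() for _ in range(X_DIM)]         # one preprocessed traffic record
z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]  # one latent noise vector

real_pair_score = discriminator(x, encoder(x))    # pair [x, E(x)]
fake_pair_score = discriminator(generator(z), z)  # pair [G(z), z]
label = classify(x)
print(len(encoder(x)), len(generator(z)), label)
```

Both pairs have length 132, which is why a single discriminator can consume either of them.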

**Figure 5.** Training losses vs. iterations. Eloss, Gloss, and Dloss denote the training losses of the encoder, generator, and discriminator, respectively.

**Figure 7.** The PCA visualization of concatenated outputs of the encoder and the generator after training on (**a**) NSL-KDD and (**b**) CIC-DDoS2019.

NSL-KDD | Total | Normal | Others |
---|---|---|---|
KDDTrain+ | 125,973 | 67,343 | 58,630 |
KDDTest+ | 22,544 | 9,711 | 12,833 |

No | Features | Type | No | Features | Type |
---|---|---|---|---|---|
0 | duration | int64 | 21 | is_guest_login | int64 |
1 | protocol_type | object | 22 | count | int64 |
2 | service | object | 23 | srv_count | int64 |
3 | flag | object | 24 | serror_rate | float64 |
4 | src_bytes | int64 | 25 | srv_serror_rate | float64 |
5 | dst_bytes | int64 | 26 | rerror_rate | float64 |
6 | land | int64 | 27 | srv_rerror_rate | float64 |
7 | wrong_fragment | int64 | 28 | same_srv_rate | float64 |
8 | urgent | int64 | 29 | diff_srv_rate | float64 |
9 | hot | int64 | 30 | srv_diff_host_rate | float64 |
10 | num_failed_logins | int64 | 31 | dst_host_count | int64 |
11 | logged_in | int64 | 32 | dst_host_srv_count | int64 |
12 | num_compromised | int64 | 33 | dst_host_same_srv_rate | float64 |
13 | root_shell | int64 | 34 | dst_host_diff_srv_rate | float64 |
14 | su_attempted | int64 | 35 | dst_host_same_src_port_rate | float64 |
15 | num_root | int64 | 36 | dst_host_srv_diff_host_rate | float64 |
16 | num_file_creations | int64 | 37 | dst_host_serror_rate | float64 |
17 | num_shells | int64 | 38 | dst_host_srv_serror_rate | float64 |
18 | num_access_files | int64 | 39 | dst_host_rerror_rate | float64 |
19 | num_outbound_cmds | int64 | 40 | dst_host_srv_rerror_rate | float64 |
20 | is_host_login | int64 | | | |
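The feature table lists three `object`-typed columns (`protocol_type`, `service`, `flag`) next to numeric ones, while the encoder input is wider than the 41 raw features. One common way to reach such a widened numeric vector is to one-hot encode the categorical columns and min-max scale the numeric ones. The sketch below assumes that recipe, which is not confirmed by the text above; the `preprocess` helper and the three toy records are made up for illustration.

```python
def preprocess(records, categorical, numeric):
    """One-hot encode categorical columns and min-max scale numeric ones.

    Assumed preprocessing recipe; column names follow the feature table,
    everything else is illustrative.
    """
    # Categories and value ranges are learned from the records themselves.
    categories = {c: sorted({r[c] for r in records}) for c in categorical}
    bounds = {n: (min(r[n] for r in records), max(r[n] for r in records))
              for n in numeric}

    rows = []
    for r in records:
        row = []
        for n in numeric:
            lo, hi = bounds[n]
            # Constant columns are mapped to 0.0 to avoid division by zero.
            row.append((r[n] - lo) / (hi - lo) if hi > lo else 0.0)
        for c in categorical:
            row.extend(1.0 if r[c] == value else 0.0 for value in categories[c])
        rows.append(row)
    return rows


# Toy records (made-up values, not taken from NSL-KDD).
records = [
    {"duration": 0, "src_bytes": 491, "protocol_type": "tcp", "service": "ftp_data", "flag": "SF"},
    {"duration": 2, "src_bytes": 146, "protocol_type": "udp", "service": "other", "flag": "SF"},
    {"duration": 0, "src_bytes": 0, "protocol_type": "tcp", "service": "private", "flag": "S0"},
]

rows = preprocess(
    records,
    categorical=["protocol_type", "service", "flag"],
    numeric=["duration", "src_bytes"],
)
print(len(rows[0]))  # 2 numeric + 2 + 3 + 2 one-hot columns
print(rows[0])
```

Each categorical column contributes one output column per distinct value, so the width of the encoded vector depends on how many distinct categories appear in the data.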

CIC-DDoS2019 | Total | BENIGN | ATTACKS |
---|---|---|---|
Training | 56,425 | 56,425 | - |
Test | 977,830 | 2,811 | 975,019 |

Unit | Description |
---|---|
Processor | 2 cores, 2.0 GHz |
GPU | Tesla P100 |
RAM | 16 GB |
OS | Linux 5.10.68+ |
Packages used | TensorFlow 2.6.0 |

Parameters | Values | Description |
---|---|---|
Batch size | 64 | Number of training examples in one forward/backward pass |
Learning rate | 0.002 | Step size used when training the neural networks; ranges between 0.0 and 1.0 |
N-iterations | 1000 | Total number of iterations in the training process |
Steps | 5 | Compensating training iterations for the generator and encoder |

Dataset | Accuracy | Precision | Recall | F1 Score | Time ($\mu \pm \sigma$) |
---|---|---|---|---|---|
KDDTest+ | 91.12% | 87.27% | 98.81% | 92.68% | $118\,\mathrm{ms} \pm 25\,\mathrm{ms}$ |
CIC-DDoS2019Test+ | 99.68% | 99.85% | 99.82% | 99.84% | $59\,\mathrm{ms} \pm 23\,\mathrm{ms}$ |
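The four reported metrics are related through the confusion matrix, and the F1 score is the harmonic mean of precision and recall. The sketch below states those standard definitions and recomputes F1 from the precision and recall values in the table above as a consistency check. Which class counts as "positive" is not stated here, so the `metrics` helper treats it as an assumption, and its toy counts are made up.

```python
def metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; the choice of positive class is an assumption."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the results table.
reported = {
    "KDDTest+": (0.8727, 0.9881, 0.9268),
    "CIC-DDoS2019": (0.9985, 0.9982, 0.9984),
}

recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.4f}, reported = {f1_reported:.4f}")

# Toy confusion counts (made up) to exercise the definitions.
toy = metrics(tp=8, fp=2, tn=9, fn=1)
```

Because the table's precision and recall are already rounded to two decimals, the recomputed F1 can differ from the reported value in the last digit.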

Method | Accuracy | Precision | Recall | F1 Score | Dataset |
---|---|---|---|---|---|
AE [23] | 84.21% | 87% | 80.37% | 81.98% | NSL-KDD |
AE [15] | 88.98% | 87.92% | 93.48% | 90.61% | NSL-KDD |
DAE [16] | 88.65% | 96.48% | 83.08% | 89.28% | NSL-KDD |
AnoGAN [24] | - | 87.86% | 82.97% | 88.65% | KDD99 |
BiGAN [9] | - | 93.24% | 94.73% | 93.98% | KDD99 |
BiGAN [10] | 89.5% | 83.6% | 99.4% | 90.8% | KDD99 |
Our approach | 91.12% | 87.27% | 98.81% | 92.68% | NSL-KDD |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Xu, W.; Jang-Jaccard, J.; Liu, T.; Sabrina, F.; Kwak, J.
Improved Bidirectional GAN-Based Approach for Network Intrusion Detection Using One-Class Classifier. *Computers* **2022**, *11*, 85.
https://doi.org/10.3390/computers11060085
