Journal of Cybersecurity and Privacy
  • Article
  • Open Access

12 September 2025

Enhancing SCADA Security Using Generative Adversarial Network

Department of AI and Software Engineering, School of Computing, Gachon University, Seongnam 13120, Republic of Korea
*
Author to whom correspondence should be addressed.
This article belongs to the Section Security Engineering & Applications

Abstract

Supervisory Control and Data Acquisition (SCADA) systems play a critical role in industrial processes by providing real-time monitoring and control of equipment across large-scale, distributed operations. In the context of cyber security, Intrusion Detection Systems (IDSs) help protect SCADA systems by monitoring for unauthorized access, malicious activity, and policy violations, providing a layer of defense against potential intrusions. Given the critical role of SCADA systems and the increasing cyber risks, this paper highlights the importance of transitioning from traditional signature-based IDS to advanced AI-driven methods. Particularly, this study tackles the issue of intrusion detection in SCADA systems, which are critical yet vulnerable parts of industrial control systems. Traditional Intrusion Detection Systems (IDSs) often fall short in SCADA environments due to data scarcity, class imbalance, and the need for specialized anomaly detection suited to industrial protocols like DNP3. By integrating GANs, this study mitigates these limitations by generating synthetic data, enhancing classification accuracy and robustness in detecting cyber threats targeting SCADA systems. Remarkably, the proposed GAN-based IDS achieves an outstanding accuracy of 99.136%, paired with impressive detection speed, meeting the crucial need for real-time threat identification in industrial contexts. Beyond these empirical advancements, this paper suggests future exploration of explainable AI techniques to improve the interpretability of IDS models tailored to SCADA environments. Additionally, it encourages collaboration between academia and industry to develop extensive datasets that accurately reflect SCADA network traffic.

1. Introduction

Cyber attacks have underscored the serious and widespread impact of digital security breaches on both companies and individuals. For instance, the 2023 ransomware attack on a major U.S. healthcare provider disrupted patient services across numerous facilities, delaying critical treatments and surgeries, which posed significant risks to patient well-being [1]. Such attacks often lead to massive data breaches, where sensitive information (such as social security numbers, financial data, and medical records) is stolen and misused, resulting in identity theft and financial fraud for individuals. On a corporate level, these attacks damage trust, lead to financial losses, and require costly mitigation efforts, such as system overhauls and security upgrades. The effects ripple through the economy as companies face regulatory fines, lawsuits, and reputational harm, ultimately eroding public confidence in digital systems.
As cyber threats become more sophisticated, these incidents highlight the urgent need for enhanced security measures to protect both personal and professional aspects of life, which are increasingly intertwined with digital technologies.
In the era of pervasive digitization, safeguarding the security and integrity of networked systems has become a critical concern, driving extensive research within the field of cybersecurity [2]. As cyber threats continue to evolve in sophistication, traditional security measures often prove inadequate, thus necessitating the development of advanced and adaptive Intrusion Detection Systems (IDSs). The progression of IDS technology reflects the dynamic landscape of cyber threats. Originating in the 1980s, IDSs evolved from basic anomaly detection to advanced frameworks incorporating machine learning and artificial intelligence (AI) techniques. Early IDS models relied heavily on signature-based methods that matched known attack patterns, which, although effective for previously seen threats, struggled with novel and evolving attack strategies. In response to these limitations, researchers have increasingly incorporated anomaly detection and AI-based methodologies, leveraging the adaptability and learning capabilities of modern algorithms.
Among these critical systems, Supervisory Control and Data Acquisition (SCADA) systems stand out due to their fundamental role in managing industrial and infrastructural processes. SCADA systems control and monitor essential services, such as power grids, water treatment facilities, and transportation networks, making them attractive targets for cyber adversaries. The distributed and interconnected nature of SCADA systems across large-scale industrial environments introduces significant vulnerabilities, as illustrated in recent studies highlighting targeted attacks on SCADA infrastructure [3,4]. These studies underscore the need for robust, SCADA-specific IDS mechanisms capable of defending against increasingly sophisticated threats.
SCADA systems, as the backbone of critical infrastructure, have become prime targets for sophisticated cyber-attacks. Traditional cybersecurity measures, including rule-based IDS, face limitations in adapting to evolving attack patterns and the high volume of data generated by modern SCADA environments. Generative Adversarial Networks (GANs) have emerged as a powerful tool in machine learning due to their ability to model complex data distributions and generate synthetic samples. In industrial cybersecurity, GANs offer a novel approach to intrusion detection by enhancing the diversity of training data and improving the detection of attacks.
Generative Adversarial Networks (GANs), first introduced by Goodfellow et al. in 2014 [5,6], offer a promising framework for enhancing IDS, particularly in SCADA environments. GANs consist of a generator that creates synthetic data and a discriminator that evaluates them, competing to refine detection accuracy. In cybersecurity contexts, GANs can mitigate common issues such as data scarcity and class imbalance, which frequently hinder IDS performance. Leveraging GANs for SCADA-focused IDS presents unique opportunities for creating adaptable, high-fidelity models that enhance network security and anomaly detection. Moreover, integrating GANs with explainable AI (XAI) methods can improve the interpretability of IDS outputs, enabling security professionals to better understand and act on anomaly classifications. In detail, GANs address imbalanced datasets by generating synthetic samples for minority classes, enhancing data diversity and improving model generalization. They also extract high-dimensional features, enabling better anomaly detection in IDS by capturing subtle patterns. Unlike traditional methods reliant on labeled data or predefined rules, GANs adapt dynamically to evolving threats. However, practical challenges like mode collapse, computational overhead, and synthetic data reliability may arise. GAN-based feature extraction is novel for its unsupervised learning capability and ability to generalize across unseen data. It surpasses traditional IDS approaches by learning data distributions, not just features. Despite limitations, GANs offer a promising solution for modern IDS applications with imbalanced data. Recent research into GAN-based IDS has shown encouraging results in detecting anomalies within network environments. Studies [7,8,9] demonstrate the efficacy of GANs in addressing data-related challenges, achieving enhanced performance in multi-class classification tasks. 
These models can distinguish between a broad spectrum of attack types, thereby improving detection capabilities over conventional methods. The conceptual foundation of IDS dates back to 1980 with the pioneering work of Anderson [10], establishing IDS as a critical combination of hardware and software for network protection. Modern IDS tools offer comprehensive functionalities, including alerting network administrators to potential internal and external threats, distinguishing unauthorized access attempts, and preemptively detecting vulnerability assessments conducted by potential attackers. This vigilance enables administrators to promptly address vulnerabilities and mitigate emerging threats, reinforcing the security infrastructure across industrial and enterprise networks. The use of artificial intelligence, particularly machine learning and deep learning, in IDS has led to flexible, adaptable systems capable of learning from new data [11]. Deep learning, a subset of machine learning [12], has seen rapid adoption across domains, including medicine, autonomous vehicles, and industrial automation [13,14,15,16]. By integrating AI-driven methods, IDS can detect threats with greater accuracy, utilizing analytics to enhance overall security levels [17]. Typically, IDSs utilize both anomaly-based and signature-based detection methods to identify potential intrusions, yet the effectiveness of deep learning-based IDS is often constrained by the need for large, labeled datasets [18]. Additionally, AI techniques such as deep learning models enable IDSs to automatically adapt to new attack patterns, reducing the need for frequent manual updates. Furthermore, AI-based IDSs can improve accuracy by reducing false positives, allowing cybersecurity teams to focus on genuine threats. 
The incorporation of AI into IDSs not only enhances detection accuracy but also contributes to more efficient, scalable, and resilient cybersecurity solutions, meeting the demands of today’s digital landscape.
The Distributed Network Protocol version 3 (DNP3) is a communication protocol widely used in industrial control systems, particularly within SCADA environments. DNP3 facilitates reliable, real-time data exchange between control stations and remote devices across large-scale infrastructures, such as power substations, water treatment facilities, and transportation networks. Designed to be robust and efficient, DNP3 supports asynchronous communication, which is critical for environments where timing and data integrity are essential. The protocol also incorporates error-checking mechanisms and time-stamping features to ensure the accuracy and security of transmitted data. Due to its widespread use and critical role in managing essential services, DNP3-based systems are frequent targets for cyber threats, making secure and reliable communication imperative in safeguarding industrial operations [19]. We propose an IDS model based on GANs, specifically tailored for SCADA environments, leveraging DNP3 protocol traffic analysis to detect anomalies with high precision. To the best of our knowledge, this is the first application of a GAN model for classifying abnormal traffic within DNP3 protocol-based SCADA systems. Our simulation results indicate that the proposed GAN model outperforms traditional methods, achieving an accuracy exceeding 99%.
The need for robust IDSs in cybersecurity has become paramount as cyber threats continue to grow in frequency, diversity, and sophistication. Modern organizations face a broad array of cyber-attacks, including malware, phishing, ransomware, and advanced persistent threats (APTs), all of which can result in data breaches, financial loss, and reputational damage. An effective IDS plays a vital role in monitoring network traffic and detecting suspicious activity, providing a critical line of defense by identifying and mitigating potential threats before they can cause significant harm. Traditional IDS techniques, often based on static rules or signature matching, struggle to keep pace with the rapidly changing tactics of cyber adversaries. Consequently, there is a pressing need for more adaptable and intelligent IDS solutions that can respond dynamically to evolving threats.
The remainder of this paper is organized as follows: Section 2 presents relevant background and prior research in SCADA and GAN-based IDS. Section 3 describes the proposed GAN-based model, with a focus on its application to DNP3 protocol analysis. Section 4 provides a detailed evaluation of the model’s performance, including metrics and analyses that assess its accuracy and efficiency. Section 5 concludes this paper by summarizing the findings and suggesting directions for future research.

3. Proposed IDS Based on GAN Model

DNP3 serves as a critical communication standard within SCADA systems, enabling reliable data transmission between master stations and field devices, such as RTUs and IEDs. DNP3’s extensive use in monitoring and controlling critical infrastructure, such as electrical grids and water management systems, makes it a crucial target for cyber-attacks. To mitigate these threats, identifying and extracting meaningful features from DNP3 traffic is essential.
Feature extraction transforms raw network traffic data into structured, informative features that represent the behavior of network communications. By capturing both normal and anomalous patterns, feature extraction enhances the accuracy and effectiveness of IDS in identifying attacks within DNP3-based SCADA environments.
The hybrid adoption of Information Gain (IG) and Correlation-based Feature Selection (CFS) is particularly suitable for processing time-series industrial protocol data. IG evaluates the contribution of each feature toward reducing classification uncertainty, enabling the selection of highly informative attributes from the high-dimensional protocol space. However, IG alone may still retain redundant or overlapping features. CFS complements this by ensuring that selected features are strongly correlated with the class variable while being minimally correlated with each other, thus eliminating redundancy. By combining IG and CFS, we obtain a compact yet discriminative feature subset, which is crucial for handling noisy, redundant, and high-volume industrial time-series data, ultimately improving detection efficiency and generalization in SCADA security tasks.
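The redundancy argument for CFS can be illustrated with the standard CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff is the mean feature-feature correlation of a subset of k features. The sketch below is a minimal illustration, not the authors' implementation: the use of absolute Pearson correlation, the function names, and the toy columns are all assumptions made for this example.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def cfs_merit(columns, labels):
    """CFS merit of a feature subset (assumption: absolute Pearson correlation)."""
    k = len(columns)
    r_cf = sum(abs(pearson(c, labels)) for c in columns) / k
    if k == 1:
        return r_cf  # no feature-feature pairs to penalize
    pairs = list(combinations(columns, 2))
    r_ff = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Toy data (hypothetical): 0 = normal, 1 = attack.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
payload_len = [10, 12, 11, 13, 40, 42, 41, 39]
payload_len_copy = [2 * v for v in payload_len]  # perfectly redundant feature

merit_single = cfs_merit([payload_len], labels)
merit_redundant = cfs_merit([payload_len, payload_len_copy], labels)
print(merit_single, merit_redundant)
```

In this toy case the redundant copy leaves the merit unchanged, which is the behavior the paragraph above relies on: a feature that duplicates another adds size to the subset without adding discriminative value.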

3.1. DNP3 Protocol and Feature Extraction

3.1.1. DNP3 Protocol

The DNP3 is widely used in SCADA systems for data acquisition and control. For our experiments, we employed a network software emulator, GNS3, to simulate a DNP3 network comprising one DNP3 master and one outstation. The network included two Linux hosts connected through a switch operating at a negotiated speed of 1000 Mbps. One host served as the DNP3 master, while the other acted as the outstation, both running OpenDNP3. Additionally, a Kali Linux host was introduced as an attacking node to perform penetration testing and simulate cyber-attacks. The primary attacks considered were as follows:
  • Denial-of-Service (DoS): Attack traffic was generated using hping3 to overwhelm port 20000 (DNP3 port) of the outstation node.
  • Packet Injection/Modification: This attack was executed using a man-in-the-middle (MITM) technique via ARP spoofing. The attacker manipulated communication by blocking unsolicited responses and executing a cold restart function code.
The attack scenarios aimed to emulate common DNP3-specific threats in SCADA environments, generating a labeled dataset of 861 instances: 470 normal instances, 10 instances of disabled unsolicited message attacks, 11 instances of cold restart command attacks, and 370 DoS attack instances.
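To make the class imbalance in this dataset concrete, the short sketch below computes the share of each class from the instance counts reported above. The dictionary keys are hypothetical label names chosen for this example; only the counts come from the text.

```python
# Instance counts as reported in the text; the key names are illustrative.
counts = {
    "normal": 470,
    "disable_unsolicited": 10,
    "cold_restart": 11,
    "dos": 370,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}
attack_total = total - counts["normal"]

for name, share in shares.items():
    print(f"{name:>20}: {counts[name]:4d} ({share:.1%})")
print(f"binary view: {counts['normal']} normal vs {attack_total} attack")
```

The two packet-injection classes together account for only 21 of the 861 instances, which is the kind of minority-class scarcity that synthetic sample generation is meant to address.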
Figure 2 illustrates the DNP3 network configuration used in our experimental setup.
Figure 2. DNP3 experiment configuration.

3.1.2. Feature Extraction

Feature extraction is a critical step in building an effective Intrusion Detection System (IDS). This phase transforms raw network traffic data into a structured form suitable for model training by generating and selecting representative features. The two primary stages of feature extraction are feature generation and feature selection.
Feature Generation: We utilized a window-based approach to capture temporal behaviors of the DNP3 traffic. This approach aggregates network packets within fixed windows, capturing time-series characteristics of the packet streams [73]. The generated dataset comprised 17 features commonly used in network intrusion detection [74], as detailed below:
  • Duration: Time taken for a connection to be established and terminated.
  • Source Bytes: Number of bytes sent from the source to the destination.
  • Destination Bytes: Number of bytes sent from the destination to the source.
  • Flag: Status of the connection (e.g., Normal or Error).
  • Count: Number of connections to the same destination within a two-second window.
  • Service Count: Number of connections to the same service within a two-second window.
  • Same Service Rate: Proportion of connections to a specific service.
  • Dst_host_count: Number of connections from hosts to the destination.
  • Dst_host_srv_count: Count of different services connecting to the destination.
  • Srv_Rate: Proportion of connections to a specific service.
  • Port Rate: Proportion of connections using the same source port.
  • Round Trip Time Delay (RTTD): Total time for a signal to travel and receive a response.
  • Contains DNP3 Packets: Indicates whether DNP3 packets are present.
  • DNP3 Payload Length: Length of DNP3 payload in a connection.
  • Min DNP3 Payload Length: Minimum payload length in the connection.
  • Cold Restart in DNP3 Packet: Boolean indicating the presence of a cold restart or disable unsolicited message command.
  • Function Code Not Supported Count: Boolean indicating changes in function codes.
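The window-based features in the list above (Count, Service Count, Same Service Rate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' extraction pipeline: the `Packet` record, the `window_counts` function, the choice to include the current packet in its own window, and the definition of the rate as a fraction of same-host traffic are all hypothetical choices made for this example.

```python
from collections import namedtuple

# Hypothetical minimal packet record: timestamp (s), destination host, destination port.
Packet = namedtuple("Packet", "ts dst port")

WINDOW_S = 2.0  # two-second trailing window, as in the feature list


def window_counts(packets, window=WINDOW_S):
    """Compute per-packet trailing-window counts (current packet included)."""
    packets = sorted(packets, key=lambda p: p.ts)
    rows = []
    start = 0
    for i, p in enumerate(packets):
        # Drop packets that are older than the trailing window.
        while packets[start].ts < p.ts - window:
            start += 1
        recent = packets[start:i + 1]
        count = sum(1 for q in recent if q.dst == p.dst)
        srv_count = sum(1 for q in recent if q.port == p.port)
        same_both = sum(1 for q in recent if q.dst == p.dst and q.port == p.port)
        rows.append({
            "ts": p.ts,
            "count": count,
            "srv_count": srv_count,
            "same_srv_rate": same_both / count,  # count >= 1: includes p itself
        })
    return rows


# Toy traffic (hypothetical addresses); port 20000 is the DNP3 port.
packets = [
    Packet(0.0, "10.0.0.2", 20000),
    Packet(0.5, "10.0.0.2", 20000),
    Packet(1.0, "10.0.0.2", 22),
    Packet(2.4, "10.0.0.2", 20000),
    Packet(5.0, "10.0.0.2", 20000),
]
rows = window_counts(packets)
for row in rows:
    print(row)
```

A DoS burst against port 20000 would show up here as a sharp rise in both `count` and `srv_count` within a single window, which is why these aggregates are useful inputs for the detector.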
Feature Selection: Feature selection further refines the extracted features by identifying the most informative subset for intrusion detection. This process minimizes dimensionality while preserving discriminative power. We employed statistical measures such as Information Gain (IG) and Correlation-based Feature Selection (CFS) to rank features based on their relevance. Additionally, domain-specific knowledge was incorporated to prioritize DNP3-related features (e.g., DNP3 payload length) due to their high relevance to SCADA security.
Formally, let $X = \{x_1, x_2, \ldots, x_n\}$ denote the feature set and $Y$ denote the target labels (Normal or Attack). Information Gain (IG) for feature $x_i$ is defined as follows:
$$IG(Y, x_i) = H(Y) - H(Y \mid x_i)$$
where $H(Y)$ is the entropy of the target variable and $H(Y \mid x_i)$ is the conditional entropy given feature $x_i$.
The final selected features, summarized in Table 1, were chosen based on their predictive power and relevance to network anomalies.
Table 1. List of input features.
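The Information Gain criterion used in feature selection, i.e., the entropy of the labels minus the conditional entropy given a feature, can be computed for discrete features as in the sketch below. The function names and toy columns are assumptions made for illustration; continuous features such as payload length would first need to be discretized, which is not shown here.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy H(Y) in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y, x) = H(Y) - H(Y | x) for a discrete feature x."""
    n = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    # Conditional entropy: entropy within each feature value, weighted by frequency.
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - h_cond


# Toy data (hypothetical): a flag that perfectly separates the classes,
# and a flag that is independent of the class.
labels = ["normal"] * 4 + ["attack"] * 4
cold_restart_flag = [0, 0, 0, 0, 1, 1, 1, 1]
unrelated_flag = [0, 1, 0, 1, 0, 1, 0, 1]

ig_informative = information_gain(cold_restart_flag, labels)
ig_uninformative = information_gain(unrelated_flag, labels)
print(ig_informative, ig_uninformative)
```

Ranking features by this score keeps the informative flag and discards the unrelated one; a redundancy-aware criterion such as CFS is still needed afterwards, because two copies of the same informative feature would receive the same high score.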

3.2. Proposed GAN Model for IDS

Our proposed IDS design leverages a GAN model to detect anomalies and attacks within SCADA networks. The system architecture is divided into three primary phases: Feature Extraction, Training, and Detection. The extracted features, as described earlier, serve as inputs to the GAN-based model.
  • GAN Architecture: The GAN model comprises two components: the Generator (G) and the Discriminator (D). The generator synthesizes realistic network traffic samples, while the discriminator distinguishes between real and synthetic samples. The adversarial training process aims to optimize the following objective function:
    $$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{real}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$
    Here, $p_{real}$ represents the distribution of real data, $p_z$ denotes the noise distribution used to generate synthetic data, and $\mathbb{E}$ is the expectation operator, i.e., the average over the corresponding distribution. $D(x)$ denotes the discriminator’s output for real data $x$, and $D(G(z))$ denotes the discriminator’s output for generated data $G(z)$.
  • Training and Detection Phases: During training, the discriminator learns to classify real and generated network traffic, while the generator improves its ability to produce realistic samples. The final discriminator acts as a binary classifier for intrusion detection. Figure 3 illustrates the architecture of the proposed GAN model.
    Figure 3. Proposed GAN model for intrusion detection system.
    The generator and discriminator models use convolutional layers, activation functions (LeakyReLU, ReLU), and dropout for regularization.
    The algorithm outlines the structure of the discriminator network used in the GAN model. Each step corresponds to a layer in the discriminator network, including details about kernel size, activation functions, output sizes, padding, and dropout rates. The architecture is designed for binary classification (e.g., normal vs. attack) in a SCADA network IDS using DNP3 protocol traffic.
    The GAN model outputs two classes: Normal (0) and Attack (1). This classification is achieved by training the discriminator using labeled instances of network traffic, enabling it to differentiate between normal and malicious behaviors within SCADA networks.
Algorithm 1 outlines the key steps of a GAN-based IDS for SCADA systems using the DNP3 protocol, with a focus on feature extraction, adversarial training, and detection.
The generator’s goal is to “fool” the discriminator by maximizing $D(\hat{x}_i)$, i.e., making the discriminator believe that the generated sample $\hat{x}_i$ is real. Training therefore minimizes the generator loss, improving the generator’s ability to produce data that are indistinguishable from real data.
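The adversarial objective and the generator's goal described above can be made concrete with a small numerical sketch. It estimates V(D, G) from batches of discriminator outputs and derives the two losses. The function names are hypothetical, and the generator loss shown is the commonly used non-saturating form (minimizing −log D(G(z))), which is an assumption here: the text states that the generator maximizes the discriminator's output on generated samples but does not specify the exact loss.

```python
import math


def value_function(d_real, d_fake):
    """Batch estimate of V(D, G) from discriminator outputs in (0, 1).

    d_real: D(x) for real samples; d_fake: D(G(z)) for generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def discriminator_loss(d_real, d_fake):
    """The discriminator maximizes V, i.e., minimizes -V."""
    return -value_function(d_real, d_fake)


def generator_loss(d_fake):
    """Non-saturating generator loss (assumption): lower when D(G(z)) is higher."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs.
confident = value_function([0.9, 0.8], [0.1, 0.2])   # D separates real from fake
confused = value_function([0.5, 0.5], [0.5, 0.5])    # D cannot tell them apart
print(confident, confused)
print(generator_loss([0.1, 0.2]), generator_loss([0.8, 0.9]))
```

When the discriminator outputs 0.5 everywhere, the estimate equals −2 log 2, the value of the objective at the theoretical equilibrium; a generator whose samples receive higher discriminator scores obtains a lower loss.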
Algorithm 1: GAN-Based Intrusion Detection System for SCADA using DNP3 Protocol

4. Experiments and Results

4.1. Network Training

To build and implement our model, we used the Keras API with TensorFlow. The generator is updated through the combined GAN model, which improves its ability to generate realistic samples in the next batch. Generator G produces samples from a random noise distribution, and these fake samples are mixed with the original labeled two-class training samples to form a new training set.
To determine the optimal hyperparameters, including the number of epochs, batch size, and learning rate, we conducted extensive experiments, testing various parameter configurations to fine-tune our model effectively. The parameters of the model are established as shown in Table 2.
Table 2. Hyperparameter optimization.
In our experiments, the data were split into a training set and a testing set containing 80% and 20% of the data, respectively, which yields 688 training and 173 test samples. Within the training set, 10% of the samples were randomly selected for validation. We used TensorFlow, an open-source library developed by Google for numerical computation with data flow graphs, to build and run the proposed GAN model. A Titan X GPU with 3584 cores running at 1.2 GHz was used to train and test the model.
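The reported split sizes can be reproduced with a simple shuffled index split, sketched below. The function name, the fixed seed, and the rounding convention (test size rounded up) are assumptions chosen so that 861 samples yield the 688/173 split stated in the text; the source does not say how the split was actually implemented.

```python
import math
import random


def split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = math.ceil(n_samples * test_fraction)  # assumption: round test size up
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(861)
print(len(train_idx), len(test_idx))
```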

4.2. Performance Metrics

We evaluate the model using Accuracy, Precision, Recall, F1-score, and AUC (see Appendix A for formal definitions). Table 3 defines the key components of these metrics: TP (True Positive), FP (False Positive), TN (True Negative), and FN (False Negative). Because accuracy can be misleading on imbalanced datasets such as ours, we supplement it with precision, recall, and the F1-score, which together provide a more reliable and comprehensive evaluation.
Table 3. Confusion matrix indicator for machine learning.
For classification problems, the AUC-ROC curve is a standard performance measure. AUC stands for Area Under the Curve and estimates the ability of a classification model to distinguish between classes. Specifically, AUC refers to the area under the Receiver Operating Characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. The higher the AUC, the better the model separates the classes. As shown in Figure 4, the AUC score of the proposed model is 0.994, which means that the model distinguishes reliably between normal and attack traffic.
Figure 4. The AUC of the proposed model.
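As a concrete reading of the AUC, the area under the ROC curve equals the probability that a randomly chosen attack sample receives a higher score than a randomly chosen normal sample, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the toy scores are illustrative and unrelated to the results in Figure 4.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 = attack (positive), 0 = normal (negative).
    scores: model outputs, higher meaning "more likely attack".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a test set of this size; a sort-based computation would be preferable for large datasets.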
As shown in Table 4, our approach achieves a classification accuracy of 99.136% and an F1-score of 99.37%, which outperform the comparative models (FNN: 98.75% accuracy, 98.12% F1-score; RNN: 98.68% accuracy, 98.96% F1-score; CNN: 98.68% accuracy, 97.69% F1-score; SVM: 97.7% accuracy, 97.6% F1-score). This quantitative comparison highlights that the proposed GAN-based IDS provides superior performance not only in accuracy but also in robustness when handling imbalanced datasets. These results strongly support the novelty and practical advantages of our method compared to existing techniques.
Table 4. Performance comparison.
The trained GAN model is tested with a test set, which contains data that the proposed model has never seen before. This evaluation step provides a crucial assessment of how effectively our model can classify unseen data, a fundamental requirement for any robust intrusion detection system (IDS). To quantify the model’s performance, we utilize a confusion matrix. This matrix visually displays the percentage of accurate and incorrect predictions made by the model for each class. By analyzing the matrix, we can extract four key metrics to assess the IDS’s overall efficiency. As shown in Figure 5, the confusion matrix reveals that the model achieved exceptional classification accuracy, with over 99% of test samples correctly predicted. This outstanding result demonstrates the proposed model’s effectiveness in distinguishing between normal and abnormal network behaviors.
Figure 5. Confusion matrix of GAN model.
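The four quantities read off a binary confusion matrix can be counted as in the following sketch. The function name and the toy label vectors are hypothetical and do not reproduce the values in Figure 5; the label convention (0 = Normal, 1 = Attack) follows the text.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary task (positive = attack)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # attack correctly flagged
            else:
                fp += 1  # normal traffic flagged as attack
        else:
            if truth == positive:
                fn += 1  # attack missed
            else:
                tn += 1  # normal traffic correctly passed
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


# Toy predictions (hypothetical).
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]
cm = confusion_counts(y_true, y_pred)
print(cm)
```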
Compared to existing works, our proposed GAN-based IDS for SCADA achieves a remarkable accuracy of 99.9%, with precision, recall, and F1-score all at 0.99, making it highly effective in detecting cyber intrusions in SCADA environments. Unlike previous studies that rely primarily on generic cybersecurity datasets (e.g., NSL-KDD, CICIDS2017, or BoT-IoT), our approach is tailored specifically for SCADA security, using a DNP3 SCADA dataset, ensuring domain-specific feature extraction and protocol-aware intrusion detection. Furthermore, our model addresses one of the major challenges in IDS research—computational efficiency. While prior models, such as CTGSM-DNN (2025) and WGAN (2024), required high training and inference times, our approach significantly reduces inference time to 20 ms, making it ideal for real-time industrial applications. The proposed method also improves generalizability, as it outperforms existing models across different attack scenarios, ensuring robustness in various SCADA security threats. However, while our model excels in accuracy, detection capability, and real-time inference, future improvements could focus on enhancing adaptability across different SCADA protocols beyond DNP3, and integrating additional low-latency adversarial learning techniques to further reduce computational overhead in real-world industrial control system deployments.

5. Conclusions

SCADA systems play a critical role in managing industrial processes, and their reliance on network separation as the primary security measure underscores the need for robust IDS. Enhancing security monitoring through advanced methodologies, such as leveraging network flows, application protocols, process-aware features, and deep learning techniques, is essential for protecting SCADA systems against evolving cyber threats. This study introduces a novel GAN-based IDS framework specifically designed to meet the unique requirements of SCADA environments. By utilizing GANs, our approach effectively addresses the challenges posed by limited and imbalanced datasets, which are common in SCADA intrusion detection scenarios.
The proposed GAN-based IDS demonstrates superior performance compared to existing methods, achieving higher accuracy and F1-scores in simulation experiments. These results highlight the potential of GANs in enhancing intrusion detection by generating realistic synthetic samples, thereby improving the robustness of the detection model. This study also explores the influence of various parameter configurations, identifying optimal settings that contribute to the overall effectiveness and flexibility of the proposed framework.
Despite these promising outcomes, certain limitations of the proposed approach warrant further attention. First, the scalability of the GAN-IDS framework to larger and more complex SCADA networks remains a significant challenge, as the computational overhead of GAN models increases with network complexity. This limitation could hinder the deployment of the system in large-scale industrial environments where real-time performance is critical. Additionally, the high computational cost associated with training and inference may pose difficulties in resource-constrained settings, such as remote or edge-based deployments.
Although our evaluation primarily focused on conventional network-based attacks (e.g., Denial of Service and ARP spoofing), it is important to recognize that GAN-based detection models, like most deep learning approaches, remain susceptible to adversarial perturbations. Such perturbations can be carefully crafted to manipulate input traffic in ways that appear legitimate to the model while bypassing detection, thereby posing a significant security risk in mission-critical SCADA environments. To address this limitation, future work will extend our study to include a broader spectrum of adversarial attack scenarios, such as evasion attacks, poisoning attacks, and timing-based manipulations, which more accurately reflect real-world adversarial behavior. In addition, we plan to investigate robustness-enhancing strategies, including adversarial training to expose the model to perturbed samples during learning, ensemble defenses that combine complementary detection mechanisms, and knowledge-guided defenses that embed protocol semantics or domain-specific constraints into the model. By pursuing these directions, we aim to significantly improve the resilience and reliability of AI-based intrusion detection systems in securing industrial control networks.

Author Contributions

H.N.N.: conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing—original draft, visualization. J.K.: investigation, methodology, project administration, resources, supervision, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Acknowledgments

The authors gratefully acknowledge our seniors for their insightful feedback on the manuscript structure, fresh perspective, and expert guidance on the GAN model. We also sincerely appreciate our colleagues for their patience, support, and valuable discussions, which were instrumental in completing this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACGAN: Auxiliary Classifier Generative Adversarial Network
AI: Artificial Intelligence
AIDS: Anomaly-based Intrusion Detection System
APT: Advanced Persistent Threat
ARP: Address Resolution Protocol
CNN: Convolution Neural Network
DDoS: Distributed Denial-of-Service
DL: Deep learning
DNP3: Distributed Network Protocol 3
DoS: Denial-of-Service
DRL: Distributional Reinforcement Learning
GAN: Generative Adversarial Network
GPRS: General Packet Radio Service
GSM: Global System for Mobile Communications
HIDS: Host-based Intrusion Detection System
HMI: Human–Machine Interface
HSDPA: High-Speed Downlink Packet Access
ICCP: Inter-control center communications
IDS: Intrusion Detection System
IoT: Internet-of-Things
IVNs: In-Vehicle Networks
MitM: Man-in-the-Middle
NIDS: Network-based Intrusion Detection System
RTU: Remote Terminal Unit
SCADA: Supervisory Control and Data Acquisition
TCP: Transmission Control Protocol
UAV: Unmanned Aerial Vehicle
Wi-Fi: Wireless Fidelity
XAI: eXplainable AI

Appendix A. Performance Metric Definitions

To evaluate the classification performance of the proposed GAN-based IDS, we use widely accepted metrics: Accuracy, Precision, Recall, F1-score, and AUC. These definitions are presented here for completeness.
  • Accuracy
    $$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
    where $TP$ = True Positives, $TN$ = True Negatives, $FP$ = False Positives, and $FN$ = False Negatives.
  • Precision
    $$\mathrm{Precision} = \frac{TP}{TP + FP}$$
    Precision measures the proportion of positive identifications that are actually correct.
  • Recall (Sensitivity)
    $$\mathrm{Recall} = \frac{TP}{TP + FN}$$
    Recall measures the proportion of actual positives correctly identified.
  • F1-Score
    $$F1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
    The F1-score balances Precision and Recall, and is particularly useful for imbalanced datasets.
  • Area Under the Curve (AUC)
    AUC refers to the area under the Receiver Operating Characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. Higher AUC values indicate better discrimination between normal and attack classes.
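As a minimal sketch of how the definitions above can be evaluated, the following Python code computes Accuracy, Precision, Recall, and F1 from confusion-matrix counts, and estimates AUC through its rank interpretation (the probability that a randomly chosen attack sample scores higher than a randomly chosen normal sample, which equals the area under the ROC curve). The names `ConfusionCounts`, `classification_metrics`, and `roc_auc`, the zero-denominator convention, and the example numbers are illustrative assumptions, not the code or results of this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # attacks flagged as attacks
    tn: int  # normal traffic passed as normal
    fp: int  # normal traffic flagged as attacks
    fn: int  # attacks missed


def _ratio(num, den):
    # Assumed convention: return 0.0 when the denominator is zero.
    return num / den if den else 0.0


def classification_metrics(c):
    """Accuracy, Precision, Recall, and F1 from confusion-matrix counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fp + c.fn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The cost is quadratic in the number
    of samples, which is fine for illustration but not for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, chosen only to exercise the functions.
metrics = classification_metrics(ConfusionCounts(tp=90, tn=95, fp=5, fn=10))
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(metrics, auc)
```

With these example counts, Accuracy is 185/200 and F1 reduces to 2TP/(2TP + FP + FN) = 180/195, which provides a quick consistency check between the Precision, Recall, and F1 formulas.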

References

  1. van Boven, L.S.; Kusters, R.W.; Tin, D.; van Osch, F.H.; De Cauwer, H.; Ketelings, L.; Rao, M.; Dameff, C.; Barten, D.G. Hacking Acute Care: A Qualitative Study on the Health Care Impacts of Ransomware Attacks Against Hospitals. Ann. Emerg. Med. 2024, 83, 46–56. [Google Scholar] [CrossRef]
  2. Nhung-Nguyen, H.; Girdhar, M.; Kim, Y.H.; Hong, J. Machine-Learning-Based Anomaly Detection for GOOSE in Digital Substations. Energies 2024, 17, 3745. [Google Scholar] [CrossRef]
  3. Lee, J.M.; Hong, S. Keeping Host Sanity for Security of the SCADA Systems. IEEE Access 2020, 8, 62954–62968. [Google Scholar] [CrossRef]
  4. Lee, J.M.; Hong, S. Host-Oriented Approach to Cyber Security for the SCADA Systems. In Proceedings of the 2020 6th IEEE Congress on Information Science and Technology (CiSt), Agadir-Essaouira, Morocco, 5–12 June 2021; pp. 151–155. [Google Scholar] [CrossRef]
  5. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661. [Google Scholar] [CrossRef]
  6. Nhung Nguyen, H.; Kim, Y.H. GAN-Based Driver’s Head Motion Using Millimeter-Wave Radar Sensor. IEEE Access 2025, 13, 108359–108367. [Google Scholar] [CrossRef]
  7. Lee, J.; Park, K. GAN-based imbalanced data intrusion detection system. Pers. Ubiquitous Comput. 2021, 25, 121–128. [Google Scholar] [CrossRef]
  8. Piplai, A.; Chukkapalli, S.S.L.; Joshi, A. NAttack! Adversarial Attacks to bypass a GAN based classifier trained to detect Network intrusion. In Proceedings of the 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS), Baltimore, MD, USA, 25–27 May 2020; pp. 49–54. [Google Scholar]
  9. Liao, D.; Huang, S.; Tan, Y.; Bai, G. Network Intrusion Detection Method Based on GAN Model. In Proceedings of the 2020 International Conference on Computer Communication and Network Security (CCNS), Xi’an, China, 21–23 August 2020; pp. 153–156. [Google Scholar]
  10. Anderson, J.P. Computer Security Threat Monitoring and Surveillance; Technical Report; James P. Anderson Company: Fort Washington, MD, USA, 1980. [Google Scholar]
  11. Liu, H.; Lang, B. Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey. Appl. Sci. 2019, 9, 4396. [Google Scholar] [CrossRef]
  12. Chollet, F. Deep Learning with Python, 1st ed.; Manning Publications Co.: Shelter Island, NY, USA, 2017. [Google Scholar]
  13. LeCun, Y. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  14. Nhung-Nguyen, H.; Youn, Y.W.; Kim, Y.H. A Deep Neural Network to Identify Vacuum Degrees in Vacuum Interrupter Based on Partial Discharge Diagnosis. IEEE Access 2022, 10, 95125–95131. [Google Scholar] [CrossRef]
  15. Hong, J.; Kim, Y.H.; Nhung-Nguyen, H.; Kwon, J.; Lee, H. Deep-Learning Based Fault Events Analysis in Power Systems. Energies 2022, 15, 5539. [Google Scholar] [CrossRef]
  16. Nguyen, H.N.; Lee, S.; Nguyen, T.T.; Kim, Y.H. One-shot learning-based driver’s head movement identification using a millimetre-wave radar sensor. IET Radar Sonar Navig. 2022, 16, 825–836. [Google Scholar] [CrossRef]
  17. Wang, W.; Lu, Z. Cyber security in the Smart Grid: Survey and challenges. Comput. Netw. 2013, 57, 1344–1371. [Google Scholar] [CrossRef]
  18. Vinayakumar, R.; Barathi Ganesh, H.B.; Poornachandran, P.; Anand Kumar, M.; Soman, K.P. Deep-Net: Deep Neural Network for Cyber Security Use Cases. arXiv 2018, arXiv:1812.03519. [Google Scholar] [CrossRef]
  19. IEEE Std 1815-2012; IEEE Standard for Electric Power Systems Communications-Distributed Network Protocol (DNP3). IEEE Standards Association: Piscataway, NJ, USA, 2012; pp. 1–821. [CrossRef]
  20. Dogaru, D.I.; Dumitrache, I. Cyber Security of Smart Grids in the Context of Big Data and Machine Learning. In Proceedings of the 2019 22nd International Conference on Control Systems and Computer Science (CSCS), Bucharest, Romania, 28–30 May 2019; pp. 61–67. [Google Scholar] [CrossRef]
  21. Rakas, S.V.B.; Stojanović, M.D.; Marković-Petrović, J.D. A Review of Research Work on Network-Based SCADA Intrusion Detection Systems. IEEE Access 2020, 8, 93083–93108. [Google Scholar] [CrossRef]
  22. Martins, I.; Resende, J.S.; Sousa, P.R.; Silva, S.; Antunes, L.; Gama, J. Host-based IDS: A review and open issues of an anomaly detection system in IoT. Future Gener. Comput. Syst. 2022, 133, 95–113. [Google Scholar] [CrossRef]
  23. Bulle, B.B.; Santin, A.O.; Viegas, E.K.; dos Santos, R.R. A Host-based Intrusion Detection Model Based on OS Diversity for SCADA. In Proceedings of the IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 18–21 October 2020; pp. 691–696. [Google Scholar] [CrossRef]
  24. da Conceição Aberto, H.; Dembele, J.M.; Diop, I.; Bah, A. Review of Intrusion Detection Systems for Supervisor Control and Data Acquisition: A Machine Learning Approach. In Communications in Computer and Information Science, Proceedings of the International Conference on Science, Engineering Management and Information Technology, Ankara, Turkey, 14–15 September 2023; Springer: Cham, Switzerland, 2023; pp. 28–51. [Google Scholar] [CrossRef]
  25. Al-Asiri, M.; El-Alfy, E.S.M. On Using Physical Based Intrusion Detection in SCADA Systems. Procedia Comput. Sci. 2020, 170, 34–42. [Google Scholar] [CrossRef]
  26. Kwon, H.Y.; Kim, T.; Lee, M.K. Advanced Intrusion Detection Combining Signature-Based and Behavior-Based Detection Methods. Electronics 2022, 11, 867. [Google Scholar] [CrossRef]
  27. Yang, Y.; McLaughlin, K.; Littler, T.; Sezer, S.; Wang, H. Rule-based intrusion detection system for SCADA networks. In Proceedings of the 2nd IET Renewable Power Generation Conference (RPG 2013), Beijing, China, 9–11 September 2013; pp. 1–4. [Google Scholar] [CrossRef]
  28. Adiban, M.; Siniscalchi, S.M.; Salvi, G. A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity. Neurocomputing 2023, 537, 296–308. [Google Scholar] [CrossRef]
  29. Park, C.H.; Jo, J.Y.; Kim, Y. Detecting Cyber Threats with Limited Dataset Using Generative Adversarial Network on SCADA System. In Proceedings of the 2023 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 13–15 December 2023; pp. 915–919. [Google Scholar] [CrossRef]
  30. Gunnam, S.R.; Vepuri, S.K.; Nallarasan, V. Detection of Real Time Malicious Intrusions Using GAN (Generative Adversarial Networks) in Cyber Physical System. In Proceedings of the 2024 5th International Conference for Emerging Technology (INCET), Belgaum, India, 24–26 May 2024; pp. 1–7. [Google Scholar] [CrossRef]
  31. Freitas de Araujo-Filho, P.; Kaddoum, G.; Campelo, D.R.; Gondim Santos, A.; Macêdo, D.; Zanchettin, C. Intrusion Detection for Cyber–Physical Systems Using Generative Adversarial Networks in Fog Environment. IEEE Internet Things J. 2021, 8, 6247–6256. [Google Scholar] [CrossRef]
  32. Benaddi, H.; Jouhari, M.; Ibrahimi, K.; Ben Othman, J.; Amhoud, E.M. Anomaly Detection in Industrial IoT Using Distributional Reinforcement Learning and Generative Adversarial Networks. Sensors 2022, 22, 8085. [Google Scholar] [CrossRef]
  33. Yalçın, N.; Çakır, S.; Ünaldı, S. Attack Detection Using Artificial Intelligence Methods for SCADA Security. IEEE Internet Things J. 2024, 11, 39550–39559. [Google Scholar] [CrossRef]
  34. Kim, J.Y.; Bu, S.J.; Cho, S.B. Malware detection using deep transferred generative adversarial networks. In Lecture Notes in Computer Science, Proceedings of the Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, 14–18 November 2017; Proceedings, Part I 24; Springer: Cham, Switzerland, 2017; pp. 556–564. [Google Scholar] [CrossRef]
  35. Seo, E.; Song, H.M.; Kim, H.K. GIDS: GAN based Intrusion Detection System for In-Vehicle Network. In Proceedings of the 2018 16th Annual Conference on Privacy, Security and Trust (PST), Belfast, Ireland, 28–30 August 2018; pp. 1–6. [Google Scholar] [CrossRef]
  36. Tabassum, A.; Erbad, A.; Lebda, W.; Mohamed, A.; Guizani, M. FEDGAN-IDS: Privacy-preserving IDS using GAN and Federated Learning. Comput. Commun. 2022, 192, 299–310. [Google Scholar] [CrossRef]
  37. Li, S.; Cao, Y.; Liu, S.; Lai, Y.; Zhu, Y.; Ahmad, N. HDA-IDS: A Hybrid DoS Attacks Intrusion Detection System for IoT by using semi-supervised CL-GAN. Expert Syst. Appl. 2024, 238, 122198. [Google Scholar] [CrossRef]
  38. Yoo, J.D.; Kim, H.; Kim, H.K. GUIDE: GAN-based UAV IDS Enhancement. Comput. Secur. 2024, 147, 104073. [Google Scholar] [CrossRef]
  39. Liu, X.; Li, T.; Zhang, R.; Wu, D.; Liu, Y.; Yang, Z. A GAN and Feature Selection-Based Oversampling Technique for Intrusion Detection. Secur. Commun. Netw. 2021, 2021, 9947059. [Google Scholar] [CrossRef]
  40. Kim, T.; Pak, W. Early Detection of Network Intrusions Using a GAN-Based One-Class Classifier. IEEE Access 2022, 10, 119357–119367. [Google Scholar] [CrossRef]
  41. Abu-Jassar, A.T.; Attar, H.; Yevsieiev, V.; Amer, A.; Demska, N.; Luhach, A.K.; Lyashenko, V. Electronic User Authentication Key for Access to HMI/SCADA via Unsecured Internet Networks. Comput. Intell. Neurosci. 2022, 2022, 5866922. [Google Scholar] [CrossRef]
  42. Yadav, G.; Paul, K. Architecture and security of SCADA systems: A review. Int. J. Crit. Infrastruct. Prot. 2021, 34, 100433. [Google Scholar] [CrossRef]
  43. Qian, J.; Du, X.; Chen, B.; Qu, B.; Zeng, K.; Liu, J. Cyber-Physical Integrated Intrusion Detection Scheme in SCADA System of Process Manufacturing Industry. IEEE Access 2020, 8, 147471–147481. [Google Scholar] [CrossRef]
  44. Anwar, M.; Lundberg, L.; Borg, A. Improving anomaly detection in SCADA network communication with attribute extension. Energy Inform. 2022, 5, 69. [Google Scholar] [CrossRef]
  45. Aboulsamh, R.M.; Albugaey, M.T.; Alghamdi, D.O.; Abujaid, F.H.; Alsubaie, S.N.; Saqib, N.A. Secure Communication Protocols for SCADA Systems: Analysis and Comparisons of Different Secure Communication Protocols. In Proceedings of the 2024 Seventh International Women in Data Science Conference at Prince Sultan University (WiDS PSU), Riyadh, Saudi Arabia, 3–4 March 2024; pp. 209–214. [Google Scholar] [CrossRef]
  46. Lin, C.Y.; Nadjm-Tehrani, S. Protocol study and anomaly detection for server-driven traffic in SCADA networks. Int. J. Crit. Infrastruct. Prot. 2023, 42, 100612. [Google Scholar] [CrossRef]
  47. Alsabbagh, W.; Langendörfer, P. Security of Programmable Logic Controllers and Related Systems: Today and Tomorrow. IEEE Open J. Ind. Electron. Soc. 2023, 4, 659–693. [Google Scholar] [CrossRef]
  48. Yang, K.; Wang, H.; Wang, H.; Sun, L. An effective intrusion-resilient mechanism for programmable logic controllers against data tampering attacks. Comput. Ind. 2022, 138, 103613. [Google Scholar] [CrossRef]
  49. Rencelj Ling, E.; Urrea Cabus, J.E.; Butun, I.; Lagerström, R.; Olegard, J. Securing Communication and Identifying Threats in RTUs: A Vulnerability Analysis. In Proceedings of the 17th International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; ARES ’22, pp. 1–7. [Google Scholar] [CrossRef]
  50. Cruz, T.; Rosa, L.; Proença, J.; Maglaras, L.; Aubigny, M.; Lev, L.; Jiang, J.; Simões, P. A Cybersecurity Detection Framework for Supervisory Control and Data Acquisition Systems. IEEE Trans. Ind. Inform. 2016, 12, 2236–2246. [Google Scholar] [CrossRef]
  51. Juma, M.; Alattar, F.; Touqan, B. Securing Big Data Integrity for Industrial IoT in Smart Manufacturing Based on the Trusted Consortium Blockchain (TCB). IoT 2023, 4, 27–55. [Google Scholar] [CrossRef]
  52. Lupascu, C.; Lupascu, A.; Bica, I. DLT Based Authentication Framework for Industrial IoT Devices. Sensors 2020, 20, 2621. [Google Scholar] [CrossRef]
  53. Ali, B.S.; Ullah, I.; Al Shloul, T.; Khan, I.A.; Khan, I.; Ghadi, Y.Y.; Abdusalomov, A.; Nasimov, R.; Ouahada, K.; Hamam, H. ICS-IDS: Application of big data analysis in AI-based intrusion detection systems to identify cyberattacks in ICS networks. J. Supercomput. 2024, 80, 7876–7905. [Google Scholar] [CrossRef]
  54. Abdullahi, M.; Alhussian, H.; Aziz, N.; Abdulkadir, S.J.; Alwadain, A.; Muazu, A.A.; Bala, A. Comparison and Investigation of AI-Based Approaches for Cyberattack Detection in Cyber-Physical Systems. IEEE Access 2024, 12, 31988–32004. [Google Scholar] [CrossRef]
  55. Hu, J.; Yang, H.; Lyu, M.R.; King, I.; Man-Cho So, A. Online Nonlinear AUC Maximization for Imbalanced Data Sets. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 882–895. [Google Scholar] [CrossRef]
  56. Yan, Y.; Liu, R.; Ding, Z.; Du, X.; Chen, J.; Zhang, Y. A Parameter-Free Cleaning Method for SMOTE in Imbalanced Classification. IEEE Access 2019, 7, 23537–23548. [Google Scholar] [CrossRef]
  57. Balla, A.; Habaebi, M.H.; Elsheikh, E.A.A.; Islam, M.R.; Suliman, F.M. The Effect of Dataset Imbalance on the Performance of SCADA Intrusion Detection Systems. Sensors 2023, 23, 758. [Google Scholar] [CrossRef] [PubMed]
  58. Sams Aafiya Banu, S.; Gopika, B.; Esakki Rajan, E.; Ramkumar, M.; Mahalakshmi, M.; Emil Selvan, G. Smote variants for data balancing in intrusion detection system using machine learning. In Proceedings of the International Conference on Machine Intelligence and Signal Processing; Springer: Singapore, 2022; pp. 317–330. [Google Scholar] [CrossRef]
  59. Abdelmoumin, G.; Rawat, D.B.; Rahman, A. Studying Imbalanced Learning for Anomaly-Based Intelligent IDS for Mission-Critical Internet of Things. J. Cybersecur. Priv. 2023, 3, 706–743. [Google Scholar] [CrossRef]
  60. Louk, M.H.L.; Tama, B.A. Exploring Ensemble-Based Class Imbalance Learners for Intrusion Detection in Industrial Control Networks. Big Data Cogn. Comput. 2021, 5, 72. [Google Scholar] [CrossRef]
  61. Khan, I.A.; Pi, D.; Khan, Z.U.; Hussain, Y.; Nawaz, A. HML-IDS: A Hybrid-Multilevel Anomaly Prediction Approach for Intrusion Detection in SCADA Systems. IEEE Access 2019, 7, 89507–89521. [Google Scholar] [CrossRef]
  62. Rajesh, L.; Satyanarayana, P. Evaluation of machine learning algorithms for detection of malicious traffic in scada network. J. Electr. Eng. Technol. 2022, 17, 913–928. [Google Scholar] [CrossRef]
  63. Yan, B.; Han, G.; Sun, M.; Ye, S. A novel region adaptive SMOTE algorithm for intrusion detection on imbalanced problem. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 1281–1286. [Google Scholar] [CrossRef]
  64. Sun, Y.; Liu, F. SMOTE-NCL: A re-sampling method with filter for network intrusion detection. In Proceedings of the 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016; pp. 1157–1161. [Google Scholar] [CrossRef]
  65. Ahmad, I.; Basheri, M.; Iqbal, M.J.; Rahim, A. Performance Comparison of Support Vector Machine, Random Forest, and Extreme Learning Machine for Intrusion Detection. IEEE Access 2018, 6, 33789–33795. [Google Scholar] [CrossRef]
  66. Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37. [Google Scholar] [CrossRef]
  67. Mohagheghi, S.; Stoupis, J.; Wang, Z. Communication protocols and networks for power systems-current status and future trends. In Proceedings of the 2009 IEEE/PES Power Systems Conference and Exposition, Seattle, WA, USA, 15–18 March 2009; pp. 1–9. [Google Scholar] [CrossRef]
  68. Mander, T.; Cheung, R.; Nabhani, F. Power System DNP3 Data Object Security Using Data Sets. Comput. Secur. 2010, 29, 487–500. [Google Scholar] [CrossRef]
  69. IEC 60870-6 TASE.2; Telecontrol Standard IEC 60870-6 TASE.2 Globally Adopted. Springer-Verlag Wien: Vienna, Austria, 1999.
  70. IEEE Std 1379-2000; IEEE Recommended Practice for Data Communications Between Remote Terminal Units and Intelligent Electronic Devices in a Substation. IEEE Standards Association: Piscataway, NJ, USA, 2001; pp. 1–72. [CrossRef]
  71. IEEE Std 1815-2010; IEEE Standard for Electric Power Systems Communications—Distributed Network Protocol (DNP3). IEEE Standards Association: Piscataway, NJ, USA, 2010; pp. 1–775. [CrossRef]
  72. Yin, X.C.; Liu, Z.G.; Nkenyereye, L.; Ndibanje, B. Toward an Applied Cyber Security Solution in IoT-Based Smart Grids: An Intrusion Detection System Approach. Sensors 2019, 19, 4952. [Google Scholar] [CrossRef]
  73. Linda, O.; Vollmer, T.; Manic, M. Neural Network based Intrusion Detection System for critical infrastructures. In Proceedings of the 2009 International Joint Conference on Neural Networks, Atlanta, GA, USA, 14–19 June 2009; pp. 1827–1834. [Google Scholar] [CrossRef]
  74. Altaha, M.; Lee, J.M.; Muhammad, A.; Hong, S. Network Intrusion Detection based on Deep Neural Networks for the SCADA system. J. Phys. Conf. Ser. 2020, 1585, 012038. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
