Article

Enhancing SCADA Security Using Generative Adversarial Network

by Hong Nhung Nguyen and Jakeoung Koo *
Department of AI and Software Engineering, School of Computing, Gachon University, Seongnam 13120, Republic of Korea
* Author to whom correspondence should be addressed.
J. Cybersecur. Priv. 2025, 5(3), 73; https://doi.org/10.3390/jcp5030073
Submission received: 4 August 2025 / Revised: 3 September 2025 / Accepted: 10 September 2025 / Published: 12 September 2025
(This article belongs to the Section Security Engineering & Applications)

Abstract

Supervisory Control and Data Acquisition (SCADA) systems play a critical role in industrial processes by providing real-time monitoring and control of equipment across large-scale, distributed operations. In the context of cyber security, Intrusion Detection Systems (IDSs) help protect SCADA systems by monitoring for unauthorized access, malicious activity, and policy violations, providing a layer of defense against potential intrusions. Given the critical role of SCADA systems and the increasing cyber risks, this paper highlights the importance of transitioning from traditional signature-based IDS to advanced AI-driven methods. Particularly, this study tackles the issue of intrusion detection in SCADA systems, which are critical yet vulnerable parts of industrial control systems. Traditional Intrusion Detection Systems (IDSs) often fall short in SCADA environments due to data scarcity, class imbalance, and the need for specialized anomaly detection suited to industrial protocols like DNP3. By integrating GANs, this study mitigates these limitations by generating synthetic data, enhancing classification accuracy and robustness in detecting cyber threats targeting SCADA systems. Remarkably, the proposed GAN-based IDS achieves an outstanding accuracy of 99.136%, paired with impressive detection speed, meeting the crucial need for real-time threat identification in industrial contexts. Beyond these empirical advancements, this paper suggests future exploration of explainable AI techniques to improve the interpretability of IDS models tailored to SCADA environments. Additionally, it encourages collaboration between academia and industry to develop extensive datasets that accurately reflect SCADA network traffic.

1. Introduction

Cyber attacks have underscored the serious and widespread impact of digital security breaches on both companies and individuals. For instance, the 2023 ransomware attack on a major U.S. healthcare provider disrupted patient services across numerous facilities, delaying critical treatments and surgeries, which posed significant risks to patient well-being [1]. Such attacks often lead to massive data breaches, where sensitive information (such as social security numbers, financial data, and medical records) is stolen and misused, resulting in identity theft and financial fraud for individuals. On a corporate level, these attacks damage trust, lead to financial losses, and require costly mitigation efforts, such as system overhauls and security upgrades. The effects ripple through the economy as companies face regulatory fines, lawsuits, and reputational harm, ultimately eroding public confidence in digital systems.
As cyber threats become more sophisticated, these incidents highlight the urgent need for enhanced security measures to protect both personal and professional aspects of life, which are increasingly intertwined with digital technologies.
In the era of pervasive digitization, safeguarding the security and integrity of networked systems has become a critical concern, driving extensive research within the field of cybersecurity [2]. As cyber threats continue to evolve in sophistication, traditional security measures often prove inadequate, thus necessitating the development of advanced and adaptive Intrusion Detection Systems (IDSs). The progression of IDS technology reflects the dynamic landscape of cyber threats. Originating in the 1980s, IDSs evolved from basic anomaly detection to advanced frameworks incorporating machine learning and artificial intelligence (AI) techniques. Early IDS models relied heavily on signature-based methods that matched known attack patterns, which, although effective for previously seen threats, struggled with novel and evolving attack strategies. In response to these limitations, researchers have increasingly incorporated anomaly detection and AI-based methodologies, leveraging the adaptability and learning capabilities of modern algorithms.
Among these critical systems, Supervisory Control and Data Acquisition (SCADA) systems stand out due to their fundamental role in managing industrial and infrastructural processes. SCADA systems control and monitor essential services, such as power grids, water treatment facilities, and transportation networks, making them attractive targets for cyber adversaries. The distributed and interconnected nature of SCADA systems across large-scale industrial environments introduces significant vulnerabilities, as illustrated in recent studies highlighting targeted attacks on SCADA infrastructure [3,4]. These studies underscore the need for robust, SCADA-specific IDS mechanisms capable of defending against increasingly sophisticated threats.
SCADA systems, as the backbone of critical infrastructure, have become prime targets for sophisticated cyber-attacks. Traditional cybersecurity measures, including rule-based IDS, face limitations in adapting to evolving attack patterns and the high volume of data generated by modern SCADA environments. Generative Adversarial Networks (GANs) have emerged as a powerful tool in machine learning due to their ability to model complex data distributions and generate synthetic samples. In industrial cybersecurity, GANs offer a novel approach to intrusion detection by enhancing the diversity of training data and improving the detection of attacks.
Generative Adversarial Networks (GANs), first introduced by Goodfellow et al. in 2014 [5,6], offer a promising framework for enhancing IDS, particularly in SCADA environments. A GAN consists of a generator that creates synthetic data and a discriminator that evaluates whether a sample is real or generated; the two networks are trained in competition, so that each improves against the other. In cybersecurity contexts, GANs can mitigate common issues such as data scarcity and class imbalance, which frequently hinder IDS performance. Leveraging GANs for SCADA-focused IDS presents unique opportunities for creating adaptable, high-fidelity models that enhance network security and anomaly detection. Moreover, integrating GANs with explainable AI (XAI) methods can improve the interpretability of IDS outputs, enabling security professionals to better understand and act on anomaly classifications.
In detail, GANs address imbalanced datasets by generating synthetic samples for minority classes, enhancing data diversity and improving model generalization. They also extract high-dimensional features, enabling better anomaly detection in IDS by capturing subtle patterns. Unlike traditional methods reliant on labeled data or predefined rules, GANs adapt dynamically to evolving threats. However, practical challenges such as mode collapse, computational overhead, and the reliability of synthetic data may arise. GAN-based feature extraction is novel for its unsupervised learning capability and its ability to generalize to unseen data. It surpasses traditional IDS approaches by learning data distributions, not just features. Despite these limitations, GANs offer a promising solution for modern IDS applications with imbalanced data.
Recent research into GAN-based IDS has shown encouraging results in detecting anomalies within network environments. Studies [7,8,9] demonstrate the efficacy of GANs in addressing data-related challenges, achieving enhanced performance in multi-class classification tasks.
These models can distinguish between a broad spectrum of attack types, thereby improving detection capabilities over conventional methods. The conceptual foundation of IDS dates back to 1980 with the pioneering work of Anderson [10], establishing IDS as a critical combination of hardware and software for network protection. Modern IDS tools offer comprehensive functionalities, including alerting network administrators to potential internal and external threats, distinguishing unauthorized access attempts, and preemptively detecting vulnerability assessments conducted by potential attackers. This vigilance enables administrators to promptly address vulnerabilities and mitigate emerging threats, reinforcing the security infrastructure across industrial and enterprise networks. The use of artificial intelligence, particularly machine learning and deep learning, in IDS has led to flexible, adaptable systems capable of learning from new data [11]. Deep learning, a subset of machine learning [12], has seen rapid adoption across domains, including medicine, autonomous vehicles, and industrial automation [13,14,15,16]. By integrating AI-driven methods, IDS can detect threats with greater accuracy, utilizing analytics to enhance overall security levels [17]. Typically, IDSs utilize both anomaly-based and signature-based detection methods to identify potential intrusions, yet the effectiveness of deep learning-based IDS is often constrained by the need for large, labeled datasets [18]. Additionally, AI techniques such as deep learning models enable IDSs to automatically adapt to new attack patterns, reducing the need for frequent manual updates. Furthermore, AI-based IDSs can improve accuracy by reducing false positives, allowing cybersecurity teams to focus on genuine threats. 
The incorporation of AI into IDSs not only enhances detection accuracy but also contributes to more efficient, scalable, and resilient cybersecurity solutions, meeting the demands of today’s digital landscape.
The Distributed Network Protocol version 3 (DNP3) is a communication protocol widely used in industrial control systems, particularly within SCADA environments. DNP3 facilitates reliable, real-time data exchange between control stations and remote devices across large-scale infrastructures, such as power substations, water treatment facilities, and transportation networks. Designed to be robust and efficient, DNP3 supports asynchronous communication, which is critical for environments where timing and data integrity are essential. The protocol also incorporates error-checking mechanisms and time-stamping features to ensure the accuracy and security of transmitted data. Due to its widespread use and critical role in managing essential services, DNP3-based systems are frequent targets for cyber threats, making secure and reliable communication imperative in safeguarding industrial operations [19]. We propose an IDS model based on GANs, specifically tailored for SCADA environments, leveraging DNP3 protocol traffic analysis to detect anomalies with high precision. To the best of our knowledge, this is the first application of a GAN model for classifying abnormal traffic within DNP3 protocol-based SCADA systems. Our simulation results indicate that the proposed GAN model outperforms traditional methods, achieving an accuracy exceeding 99%.
The need for robust IDSs in cybersecurity has become paramount as cyber threats continue to grow in frequency, diversity, and sophistication. Modern organizations face a broad array of cyber-attacks, including malware, phishing, ransomware, and advanced persistent threats (APTs), all of which can result in data breaches, financial loss, and reputational damage. An effective IDS plays a vital role in monitoring network traffic and detecting suspicious activity, providing a critical line of defense by identifying and mitigating potential threats before they can cause significant harm. Traditional IDS techniques, often based on static rules or signature matching, struggle to keep pace with the rapidly changing tactics of cyber adversaries. Consequently, there is a pressing need for more adaptable and intelligent IDS solutions that can respond dynamically to evolving threats.
The remainder of this paper is organized as follows: Section 2 presents relevant background and prior research in SCADA and GAN-based IDS. Section 3 describes the proposed GAN-based model, with a focus on its application to DNP3 protocol analysis. Section 4 provides a detailed evaluation of the model’s performance, including metrics and analyses that assess its accuracy and efficiency. Section 5 concludes this paper by summarizing the findings and suggesting directions for future research.

2. Related Works

2.1. Intrusion Detection Systems (IDSs)

Cyber security strategies have three parts: network separation, communication message security, and monitoring. Monitoring is an essential part of attack detection and reporting. The primary tool for realizing security monitoring is the intrusion detection system [20]. IDSs are critical in protecting SCADA systems by identifying, monitoring, and responding to threats. Due to SCADA’s central role in critical infrastructure management, robust security measures, including IDSs, are essential for preventing malicious activities [21].
IDSs can be broadly classified into the following categories: host-based, network-based, signature-based, anomaly-based, and hybrid systems.
  • Host-Based IDS (HIDS): HIDS operates at the host level, monitoring processes, file integrity, and logs to detect unusual behavior or unauthorized access. Recent works by Martins et al. [22] and Bulle et al. [23] highlight the effectiveness of HIDS in mitigating internal threats in SCADA systems, with a focus on identifying anomalies in control servers. However, scalability and resource constraints are limitations often cited in these studies.
  • Network-Based IDS (NIDS): NIDS detects intrusions by analyzing network traffic. NIDS is widely used to protect SCADA systems by monitoring protocols such as Modbus, DNP3, and IEC 60870-5-104. Works by Rakas et al. [21] and Aberto et al. [24] explore the use of machine learning techniques for detecting network-level attacks in SCADA systems, particularly emphasizing the protection of legacy protocols that may lack inherent security.
  • Signature-Based IDS: Signature-based IDS detects intrusions based on known attack signatures. This approach has been extensively used due to its high accuracy in detecting previously known attacks. However, it struggles against zero-day attacks. Al-Asiri et al. [25], Kwon et al. [26], and Yong et al. [27] explored signature-based methods tailored for SCADA networks, demonstrating the effectiveness of lightweight detection mechanisms but acknowledging limitations in adaptability.
  • Anomaly-Based IDS: Anomaly-based IDS detects deviations from normal behavior and is well suited for identifying unknown threats. Recent advances leveraging machine learning, such as generative adversarial networks (GANs), have proven effective in detecting sophisticated attacks in SCADA systems. Studies by Adiban et al. [28], Park et al. [29], and Gunnam et al. [30] illustrate the use of GANs to model normal SCADA operations, improving detection rates for complex threats.
  • Hybrid-Based IDS: Hybrid IDS combines multiple detection techniques to enhance accuracy and resilience. For instance, Araujo-Filho et al. [31] combined signature and anomaly detection, achieving higher detection rates and minimizing false positives. Similarly, Bennadi et al. [32] integrated Distributional Reinforcement Learning (DRL) with a GAN, offering a robust framework against sophisticated attacks. The proposed models outperformed standard DRL on the usual metrics of accuracy, precision, recall, and F1 score. The study [32] also showed that the best results were obtained when the GAN was introduced into DRL training specifically to improve detection of a particular class of data.
Intrusion Detection Systems (IDSs) have been widely categorized into signature-based, anomaly-based, and hybrid approaches, with extensive surveys available in the literature. While these taxonomies provide useful context, a detailed review falls outside the scope of this paper. Instead, our focus is on IDS approaches tailored for Supervisory Control and Data Acquisition (SCADA) systems, where unique challenges such as protocol vulnerabilities, limited labeled datasets, and real-time constraints demand specialized solutions.
Recent work has shown the effectiveness of AI-based IDSs for SCADA security. In particular, Yildiz and Aydin [33] evaluated multiple machine learning models on two SCADA datasets, achieving over 96.8% accuracy overall, with XGBoost reaching 99.99% on WUSTL-IIOT-2021, underscoring the potential of AI methods for industrial cyber defense.
In addition, anomaly detection using LSTM-autoencoders has shown promise in identifying time-series deviations, while CNN-based models have been adapted to handle SCADA network traffic features. More recently, Generative Adversarial Networks (GANs) have gained traction for addressing data scarcity and class imbalance by generating synthetic but realistic SCADA traffic, thereby improving detection accuracy. However, despite these advances, there remains a lack of approaches that both enhance detection performance and ensure robustness against evolving cyber threats in SCADA environments. This motivates our proposed GAN-based framework for intelligent SCADA intrusion detection.
In their study on IDS using GANs, Jin-Young Kim et al. [34] proposed an approach to automatically classify malicious software through a GAN model. They stabilized the GAN training process by incorporating an autoencoder for pre-training and transfer learning of the generative model’s weights. However, as the study focused on binary classification, only accuracy was evaluated, leaving important performance metrics for imbalanced data, such as Recall and F1-Score, unaddressed.
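Why accuracy alone can mislead on imbalanced traffic can be seen in a small worked example. The sketch below is illustrative only: the class counts (990 normal, 10 attack) and the helper name `binary_metrics` are assumptions chosen for the example, not values from any cited study, and undefined ratios are reported as 0.0 by convention.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Assumption: report 0.0 when a ratio is undefined (no positives predicted/present).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 990 normal flows (0), 10 attacks (1).
y_true = [0] * 990 + [1] * 10
# A degenerate detector that labels every flow as normal.
y_pred = [0] * 1000

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this hypothetical case the detector scores 99% accuracy while missing every attack, which is why recall and F1 are usually reported alongside accuracy for IDS evaluation.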
Seo et al. [35] developed the GIDS model using GAN-based techniques specifically for In-Vehicle Networks (IVNs), achieving impressive accuracy rates, with an average accuracy of 97.53% and detection rate of 98.65% across attacks like DoS, FUZZY, RPM, and GEAR. While the model demonstrated strong performance within IVNs, it lacks generalizability to broader networks or additional types of cyber-attacks, limiting its application in varied network environments. This limitation restricts the model’s relevance to modern and diverse attack landscapes beyond IVNs.
Tabassum et al. [36] introduced FEDGAN-IDS, designed for Internet of Things (IoT) networks, with models achieving high accuracy scores of up to 99% for binary and 98% for multiclass classifications on the KDDCUP-99, NSL-KDD, and UNSW-NB15 datasets. This study highlights the GAN model’s efficacy in IoT network security with remarkable performance. However, the use of older datasets (such as KDDCUP-99) is a notable limitation, as they may not fully capture the complexities and variety of contemporary cyber threats, potentially reducing the model’s effectiveness when applied to modern network environments.
Li et al. [37] proposed HDA-IDS, a GAN model tailored for IoT networks and trained on NSL-KDD, CICIDS2018, and Bot-IoT datasets. Achieving an average accuracy of 98.97%, HDA-IDS effectively detects IoT-related DoS and botnet attacks, showcasing strong accuracy and detection rates. However, this study’s scope was restricted to two types of attacks—DoS and botnet—limiting its applicability to other IoT threats, which highlights a narrow focus that could restrict the model’s potential in addressing diverse IoT security needs.
Yoo et al. [38] focused on intrusion detection within unmanned aerial vehicle (UAV) systems, generating a UAV-specific dataset and validating it through GAN-based intrusion detection models. This study demonstrates GANs’ utility in IDS for UAVs, suggesting that they can enhance security in UAV operations. However, while the study successfully explores the novel domain of UAV networks, the generalizability and comparative efficacy of this GAN model remain unexplored across different or non-UAV datasets, which could limit its adaptability.
Liu et al. [39] addressed the issues of class imbalance and high dimensionality in IDS data, proposing an innovative combination of GAN-based oversampling with feature selection. Their approach significantly improved attack detection accuracy, outperforming baseline models. However, the increased complexity due to feature selection and GAN integration may increase computational costs, potentially impacting the model’s feasibility in resource-constrained environments or real-time scenarios.
Kim et al. [40] proposed a hybrid GAN-LSTM-DNN model and evaluated it on ISCX2012, CICIDS2017, and CSE2018 datasets. Their approach achieved commendably high performance, showing GAN’s potential in conjunction with advanced neural network architectures for IDS. However, the model’s dependency on high computational resources may limit its usability in operational contexts with limited processing capabilities, which can restrict the model’s deployment in practical, real-world cybersecurity environments.

2.2. SCADA System Components

SCADA systems comprise multiple components responsible for monitoring, controlling, and managing industrial processes. These components include servers, human–machine interfaces (HMIs), communication equipment, control equipment, and data acquisition devices.
  • Servers and Human–Machine Interface (HMI): The HMI provides an interface for operators to interact with the SCADA system. Security vulnerabilities in HMIs can lead to unauthorized control. Research by Abu-Jassar et al. [41], Yadav and Paul [42], and Qian et al. [43] demonstrated how attackers target HMI interfaces through social engineering and network vulnerabilities, highlighting the need for robust security practices.
  • Communication Equipment: This equipment enables data exchange between different components of a SCADA system. Protocols such as DNP3 are often exploited, making communication channels a critical attack vector. Studies by Anwar et al. [44], Aboulsamh et al. [45], and Chih-Yuan and Simin [46] explored the use of secure communication protocols and anomaly detection to protect SCADA communication links.
  • Control Equipment (PLCs and RTUs): Control devices such as programmable logic controllers (PLCs) and remote terminal units (RTUs) are responsible for executing control commands. Attacks on these devices can disrupt operations. Alsabbagh and Langendörfer [47], Yang et al. [48], and Ling et al. [49] examined the security challenges and provided defense mechanisms for mitigating threats to control devices.
  • Data Acquisition Equipment: This equipment collects data from sensors and field devices. Research by Rosa et al. [50], Juma et al. [51], and Lupascu et al. [52] focused on spoofing attacks targeting data acquisition systems and proposed secure data integrity frameworks.

2.3. NIDS for SCADA Systems

Network-based Intrusion Detection Systems (NIDSs) are extensively used to protect SCADA systems by analyzing network traffic and identifying suspicious activities. Since SCADA protocols like DNP3 have inherent vulnerabilities, NIDSs must be capable of deep packet inspection and behavior analysis. Ali et al. [53] and Yalçın et al. [33] proposed AI-based detection algorithms that leverage deep learning and GANs for enhanced detection of DNP3 protocol anomalies. Abdullahi et al. [54] compared and investigated AI-based approaches for cyberattack detection in cyber-physical systems. Their methods were tested on a gas pipeline industrial control system dataset and on other benchmark datasets, such as NetML-2020 and IoT-23, which contain various cyberattacks. The two methods were found to perform acceptably against models such as support vector machines (SVMs) and artificial neural networks (ANNs) on several evaluation metrics.
Rakas et al. [21] reviewed recent research on network-based SCADA intrusion detection systems using a structured evaluation methodology that encompasses detection techniques, protected protocols, implementation tools, test environments, and IDS performance. Special attention was given to evaluating the implementation maturity and the applicability of each surveyed solution in the Future Internet environment. The study highlights that SCADA systems have a long and rich history, and that several successful attacks on industrial control systems worldwide have been reported in past decades. The authors noted that capturing and preprocessing SCADA network traffic is needed for better intrusion detection. The study also stated that, because real SCADA network data are confidential, researchers often use synthetic datasets or experimental datasets obtained from CPS testbeds. Simulation of attacks is the prevalent method in test scenarios. Typical simulated attacks on SCADA systems include malware attacks, network attacks, communication protocol attacks, DoS/MITM, false data injection, false sequential logic attacks, and data integrity attacks [21].

2.4. Imbalanced Data and Solving with SMOTE

The issue of class imbalance refers to a significant disparity in the number of instances among different classes within a dataset [55]. To address this challenge, previous studies have employed techniques such as oversampling, undersampling, and the Synthetic Minority Over-sampling Technique (SMOTE) [56], which integrates elements of both approaches. The core idea of SMOTE is to take a minority-class instance, identify its nearest minority-class neighbors, and generate new synthetic samples along the line segments joining the instance to those neighbors.
Intrusion detection datasets often suffer from imbalanced data, as attacks are typically rare compared to normal traffic. This poses challenges for machine learning models in effectively detecting minority-class events. SMOTE is widely used to address this imbalance. Works by Balla et al. [57], Banu et al. [58], and Abdelmoumin et al. [59] demonstrate the effect of dataset imbalance on the performance of SCADA IDSs. Studies by Louk et al. [60], Khan et al. [61], and Rajesh et al. [62] showed significant improvements in model accuracy when SMOTE was applied to SCADA datasets, particularly for minority-class attack detection.
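The interpolation idea behind SMOTE can be sketched in a few lines. This is a minimal, standard-library illustration under stated assumptions, not the reference implementation of [56]: the function name `smote_sketch`, the toy feature vectors, the neighbor count `k`, and the fixed seed are all hypothetical choices made for the example.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and neighbour.
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (e.g. rare attack flows).
minority_samples = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority_samples, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority samples, the method cannot leave the region those samples span; this is also the source of the class-overlap and noise sensitivity often attributed to SMOTE.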
Yan B. et al. [63] utilized the NSL-KDD dataset for sampling with SMOTE at the data level and subsequently combined the sampled data with various classification algorithms, including Support Vector Machine (SVM), Random Forest (RF), and Backpropagation Neural Network (BPNN), to evaluate and compare their performance. Sun Y. et al. [64] introduced an enhanced version of SMOTE, known as SMOTE-NCL, which calculates class ratios, their average ratio, standard deviation, and an imbalance scale (derived by dividing the standard deviation by the class ratio). This technique continues sampling the minority class until the imbalance scale surpasses a predefined threshold. Both studies demonstrate the application of SMOTE for data sampling but highlight the need to address inherent limitations, such as class overlap and noise susceptibility.
Ahmed et al. [65] conducted a comparative analysis of precision and reproducibility in performance across varying sizes (full, half, and quarter) of the NSL-KDD datasets, employing multiple algorithms, including Multi-layer Perceptron (MLP), Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM). Zarpelao et al. [66] explored a framework tailored for IDS research in the context of the Internet of Things (IoT), acknowledging the challenges posed by the application of traditional IDS techniques in this domain. Their study examined detection methods, IDS deployment strategies, security threats, and validation techniques to determine which combinations of detection methods and deployment strategies are most suitable for the unique characteristics of IoT environments. Although neither study [65,66] employed GANs, their findings are noteworthy.

2.5. Background to DNP3 Protocol

The Distributed Network Protocol version 3 (DNP3) was developed by General Electric and released to the public in 1993. Initially designed for Supervisory Control and Data Acquisition (SCADA) applications, it has since been widely adopted across various sectors, including electrical and water infrastructure, oil and gas, and security systems [67]. The protocol is recognized for its efficiency in real-time monitoring, control, and automation of industrial processes, making it a key component in critical infrastructure networks. The DNP3 protocol is structured across four primary layers: the physical, data link, transport, and application layers, enabling reliable and efficient communication between SCADA devices.
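To make the layering concrete, the sketch below reads the fixed header of a DNP3 data link frame. It assumes the commonly documented layout (two start bytes 0x05 0x64, a length byte, a control byte, then 16-bit little-endian destination and source addresses, followed by a 16-bit CRC) and should be checked against the standard before use. The names `LinkHeader` and `parse_link_header` are hypothetical, the example frame is hand-built, and the CRC bytes are placeholders that are deliberately not verified.

```python
import struct
from typing import NamedTuple


class LinkHeader(NamedTuple):
    length: int       # counts control, addresses and user data (CRCs excluded)
    control: int      # link-layer control byte
    destination: int  # 16-bit destination address
    source: int       # 16-bit source address


def parse_link_header(frame: bytes) -> LinkHeader:
    """Parse the 10-byte data link header of a DNP3 frame (CRC not checked)."""
    if len(frame) < 10:
        raise ValueError("frame shorter than the 10-byte link header")
    if frame[:2] != b"\x05\x64":
        raise ValueError("missing DNP3 start bytes 0x05 0x64")
    # Bytes 2..7: length, control, destination (LE), source (LE).
    length, control, destination, source = struct.unpack("<BBHH", frame[2:8])
    return LinkHeader(length, control, destination, source)


# Hand-built example: destination 1, source 10, no user data.
# The last two bytes stand in for the header CRC and are NOT a valid checksum.
example_frame = b"\x05\x64\x05\xc0\x01\x00\x0a\x00\x00\x00"
header = parse_link_header(example_frame)
print(header)
```

An IDS feature extractor would typically start from fields like these (addresses, lengths, control flags) before moving up to transport- and application-layer content.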
SCADA systems (Figure 1), which rely heavily on DNP3, are accessible to a diverse set of stakeholders, such as distribution system operators, electrical retailers, end-users, device manufacturers, and government entities [21]. The distribution system operator, for example, uses the SCADA network to monitor and control equipment, facilitating seamless data collection, processing, and real-time control.
A typical SCADA system comprises the following key components, each of which interacts with the DNP3 protocol to facilitate communication and control:
  • Servers and Human–Machine Interface (HMI): SCADA systems typically have multiple servers for redundancy and reliability. The Human–Machine Interface (HMI) allows users to interact with the SCADA system. HMIs can be local (located within the plant) or remote (connected via the Internet). Cybersecurity threats to HMI interfaces pose significant risks, as attackers may attempt to manipulate operator views or inject malicious commands. IDS solutions can be deployed to monitor user interactions and detect unauthorized access attempts, thereby mitigating risks to HMI security.
  • Communication Equipment: DNP3 communication involves real-time data collection from the field by SCADA servers. To ensure secure and fast communication, trusted and redundant networks are often employed, typically using ring topologies for quick recovery. The network may be wireless (e.g., GSM, GPRS, HSDPA, Wi-Fi) or wired (e.g., copper, fiber). High-capacity core switches near servers and edge switches near data acquisition equipment facilitate data flow. Since communication links are a critical attack vector, intrusion detection systems are often used to inspect network traffic for abnormal behaviors and possible breaches in DNP3 protocol communications.
  • Control Equipment: These devices, such as Programmable Logic Controllers (PLCs) and Remote Terminal Units (RTUs), execute control commands based on received data. For example, power analyzers, current transducers, and other sensors used in power plants fall under this category. Due to their critical role in process control, these devices are prime targets for attackers seeking to disrupt operations. Anomaly-based intrusion detection mechanisms can be utilized to detect unusual behaviors, which may indicate a breach or tampering attempt.
  • Data Acquisition Equipment: The primary devices used for data acquisition are PLCs, with RTUs serving as a more flexible alternative. When combined with remote input/output devices, RTUs enhance SCADA’s flexibility and ease of deployment. Attacks targeting data acquisition systems can lead to inaccurate or manipulated data being sent to control units. Deploying IDS solutions tailored to detect data tampering in real-time can help maintain data integrity and ensure operational continuity.
The DNP3 protocol facilitates communication between different components of a process automation system, serving as a backbone for SCADA systems. This protocol is primarily used in service industries, such as electricity and water utilities, but its adoption in other sectors is relatively limited [68]. It enables communication between SCADA master stations, RTUs, and Intelligent Electronic Devices (IEDs), providing critical connectivity for data acquisition and control. However, the widespread use of DNP3 also presents a significant attack surface for cyber threats, such as man-in-the-middle (MitM) attacks, message injection, and denial-of-service (DoS) attacks. To communicate between master stations, one uses the inter-control center communications protocol (ICCP), which is a component of IEC 60870-6 [69]. The more recent IEC 61850 protocol and more dated Modbus protocol are competing standards [70,71].
DNP3 communication is based on a request-response model. The master station sends read requests to outstations, requesting data such as commands, analog inputs, counter inputs, and configuration data. The master can also issue control commands to outstations, such as tripping circuit breakers or sending analog output values. Outstations not only respond to the master’s requests but can also initiate communication when specific conditions are met (unsolicited responses), such as when monitored data exceed a predetermined threshold. This feature, while improving efficiency, also introduces potential security risks, as unsolicited responses may be exploited to introduce malicious messages.
Given its critical role in industrial automation, the DNP3 protocol is a target for various cybersecurity threats. An attacker who gains unauthorized access to DNP3 communications can disrupt services, manipulate data, or even cause physical damage to equipment [72]. To protect against such risks, robust intrusion detection systems tailored to SCADA environments have been developed. These IDS solutions monitor DNP3 traffic patterns and detect anomalies or deviations from expected behavior, providing early warnings of potential attacks.
IDS for DNP3 and SCADA networks often employ advanced techniques such as machine learning, signature-based detection, and behavior analysis to identify threats. GANs have emerged as a promising approach for enhancing anomaly-based detection, providing the ability to simulate realistic attack scenarios and improve detection accuracy. Studies have demonstrated the effectiveness of GANs in identifying sophisticated and previously unknown attacks, reducing false positives, and ensuring the reliability of SCADA operations.

3. Proposed IDS Based on GAN Model

DNP3 serves as a critical communication standard within SCADA systems, enabling reliable data transmission between master stations and field devices, such as RTUs and IEDs. Its extensive use in monitoring and controlling critical infrastructure, such as electrical grids and water management systems, makes it a prime target for cyber-attacks. To mitigate these threats, identifying and extracting meaningful features from DNP3 traffic is essential.
Feature extraction transforms raw network traffic data into structured, informative features that represent the behavior of network communications. By capturing both normal and anomalous patterns, feature extraction enhances the accuracy and effectiveness of IDS in identifying attacks within DNP3-based SCADA environments.
The hybrid adoption of Information Gain (IG) and Correlation-based Feature Selection (CFS) is particularly suitable for processing time-series industrial protocol data. IG evaluates the contribution of each feature toward reducing classification uncertainty, enabling the selection of highly informative attributes from the high-dimensional protocol space. However, IG alone may still retain redundant or overlapping features. CFS complements this by ensuring that selected features are strongly correlated with the class variable while being minimally correlated with each other, thus eliminating redundancy. By combining IG and CFS, we obtain a compact yet discriminative feature subset, which is crucial for handling noisy, redundant, and high-volume industrial time-series data, ultimately improving detection efficiency and generalization in SCADA security tasks.
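The redundancy criterion described above can be illustrated with the widely used CFS merit heuristic, merit = k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a k-feature subset. The following is a minimal sketch, not the authors' implementation: the use of Pearson correlation, the function names, and the toy columns are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / math.sqrt(var_a * var_b)


def cfs_merit(features, labels):
    """CFS merit of a feature subset: k*r_cf / sqrt(k + k*(k-1)*r_ff)."""
    k = len(features)
    r_cf = sum(abs(pearson(f, labels)) for f in features) / k
    pairs = list(combinations(features, 2))
    r_ff = sum(abs(pearson(f, g)) for f, g in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Hypothetical toy data: a window is an attack (1) only when both flags are set.
labels = [0, 0, 0, 1]
flag_a = [0, 0, 1, 1]
flag_b = [0, 1, 0, 1]
flag_a_copy = list(flag_a)  # a fully redundant duplicate of flag_a

complementary = cfs_merit([flag_a, flag_b], labels)
redundant = cfs_merit([flag_a, flag_a_copy], labels)
print(f"complementary pair: {complementary:.4f}, redundant pair: {redundant:.4f}")
```

In this toy case the redundant pair scores no better than `flag_a` alone, while the complementary pair scores higher, which is the behavior that lets CFS prune overlapping attributes that IG alone would retain.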

3.1. DNP3 Protocol and Feature Extraction

3.1.1. DNP3 Protocol

DNP3 is widely used in SCADA systems for data acquisition and control. For our experiments, we employed a network software emulator, GNS3, to simulate a DNP3 network comprising one DNP3 master and one outstation. The network included two Linux hosts connected through a switch operating at a negotiated speed of 1000 Mbps. One host served as the DNP3 master, while the other acted as the outstation, both running OpenDNP3. Additionally, a Kali Linux host was introduced as an attacking node to perform penetration testing and simulate cyber-attacks. The primary attacks considered were as follows:
  • Denial-of-Service (DoS): Attack traffic was generated using hping3 to overwhelm port 20000 (DNP3 port) of the outstation node.
  • Packet Injection/Modification: This attack was executed using a man-in-the-middle (MitM) technique via ARP spoofing. The attacker manipulated communication by blocking unsolicited responses and executing a cold restart function code.
The attack scenarios aimed to emulate common DNP3-specific threats in SCADA environments, generating a labeled dataset consisting of 861 instances: 470 normal instances, 10 instances of disabled unsolicited message attacks, 11 instances of cold restart command attacks, and 370 DoS attack instances.
Figure 2 illustrates the DNP3 network configuration used in our experimental setup.

3.1.2. Feature Extraction

Feature extraction is a critical step in building an effective Intrusion Detection System (IDS). This phase transforms raw network traffic data into a structured form suitable for model training by generating and selecting representative features. The two primary stages of feature extraction are feature generation and feature selection.
Feature Generation: We utilized a window-based approach to capture temporal behaviors of the DNP3 traffic. This approach aggregates network packets within fixed windows, capturing time-series characteristics of the packet streams [73]. The generated dataset comprised 17 features commonly used in network intrusion detection [74], as detailed below:
  • Duration: Time taken for a connection to be established and terminated.
  • Source Bytes: Number of bytes sent from the source to the destination.
  • Destination Bytes: Number of bytes sent from the destination to the source.
  • Flag: Status of the connection (e.g., Normal or Error).
  • Count: Number of connections to the same destination within a two-second window.
  • Service Count: Number of connections to the same service within a two-second window.
  • Same Service Rate: Proportion of connections to a specific service.
  • Dst_host_count: Number of connections from hosts to the destination.
  • Dst_host_srv_count: Count of different services connecting to the destination.
  • Srv_Rate: Proportion of connections to a specific service.
  • Port Rate: Proportion of connections using the same source port.
  • Round Trip Time Delay (RTTD): Total time for a signal to travel and receive a response.
  • Contains DNP3 Packets: Indicates whether DNP3 packets are present.
  • DNP3 Payload Length: Length of DNP3 payload in a connection.
  • Min DNP3 Payload Length: Minimum payload length in the connection.
  • Cold Restart in DNP3 Packet: Boolean indicating the presence of a cold restart or disable unsolicited message command.
  • Function Code Not Supported Count: Boolean indicating changes in function codes.
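The window-based aggregation behind several of the features listed above can be sketched as follows. This is an illustrative outline only: the packet dictionary keys (`ts`, `src_bytes`, `is_dnp3`, `dnp3_len`), the function name, and the sample packets are hypothetical, and only a handful of the 17 features are computed.

```python
from collections import defaultdict

WINDOW_SECONDS = 2.0  # matches the two-second window used for Count / Service Count


def window_features(packets, window=WINDOW_SECONDS):
    """Group packets into fixed time windows and compute a few per-window features.

    Each packet is a dict with hypothetical keys:
    'ts' (seconds), 'src_bytes', 'is_dnp3' (bool), 'dnp3_len' (payload bytes).
    """
    buckets = defaultdict(list)
    for pkt in packets:
        buckets[int(pkt["ts"] // window)].append(pkt)

    rows = []
    for idx in sorted(buckets):
        pkts = buckets[idx]
        dnp3_lens = [p["dnp3_len"] for p in pkts if p["is_dnp3"]]
        rows.append({
            "window_start": idx * window,
            "count": len(pkts),
            "src_bytes": sum(p["src_bytes"] for p in pkts),
            "contains_dnp3": bool(dnp3_lens),
            "dnp3_payload_len": sum(dnp3_lens),
            "min_dnp3_payload_len": min(dnp3_lens) if dnp3_lens else 0,
        })
    return rows


# Hypothetical packet records, for illustration only.
packets = [
    {"ts": 0.1, "src_bytes": 60, "is_dnp3": True, "dnp3_len": 18},
    {"ts": 1.4, "src_bytes": 72, "is_dnp3": True, "dnp3_len": 30},
    {"ts": 2.2, "src_bytes": 40, "is_dnp3": False, "dnp3_len": 0},
    {"ts": 5.0, "src_bytes": 66, "is_dnp3": True, "dnp3_len": 24},
]
for row in window_features(packets):
    print(row)
```

Windows that contain no packets simply produce no row here; whether empty windows should instead yield an all-zero row is a design choice the text does not specify.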
Feature Selection: Feature selection further refines the extracted features by identifying the most informative subset for intrusion detection. This process minimizes dimensionality while preserving discriminative power. We employed statistical measures such as Information Gain (IG) and Correlation-based Feature Selection (CFS) to rank features based on their relevance. Additionally, domain-specific knowledge was incorporated to prioritize DNP3-related features (e.g., DNP3 payload length) due to their high relevance to SCADA security.
Formally, let $X = \{x_1, x_2, \ldots, x_n\}$ denote the feature set and $Y$ denote the target labels (Normal or Attack). Information Gain (IG) for feature $x_i$ is defined as follows:
$$IG(Y, x_i) = H(Y) - H(Y \mid x_i)$$
where $H(Y)$ is the entropy of the target variable and $H(Y \mid x_i)$ is the conditional entropy given feature $x_i$.
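As a small worked illustration of this definition, the sketch below computes IG for discrete feature values using base-2 entropy. The function names and the toy columns are assumptions; continuous features such as payload length would first need to be discretized, a step the text does not describe.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(Y) in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y, x_i) = H(Y) - H(Y | x_i) for a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: four normal and four attack windows.
labels = ["Normal"] * 4 + ["Attack"] * 4
cold_restart = [0, 0, 0, 0, 1, 1, 1, 1]  # separates the classes perfectly
flag = [0, 1, 0, 1, 0, 1, 0, 1]          # unrelated to the class

print(information_gain(cold_restart, labels))  # 1 bit: all uncertainty removed
print(information_gain(flag, labels))          # 0 bits: no uncertainty removed
```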
The final selected features, summarized in Table 1, were chosen based on their predictive power and relevance to network anomalies.

3.2. Proposed GAN Model for IDS

Our proposed IDS design leverages a GAN model to detect anomalies and attacks within SCADA networks. The system architecture is divided into three primary phases: Feature Extraction, Training, and Detection. The extracted features, as described earlier, serve as inputs to the GAN-based model.
  • GAN Architecture: The GAN model comprises two components: the Generator (G) and the Discriminator (D). The generator synthesizes realistic network traffic samples, while the discriminator distinguishes between real and synthetic samples. The adversarial training process aims to optimize the following objective function:
    $$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{real}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$$
    Here, $p_{real}$ is the distribution of real data, $p_z$ is the noise distribution from which the generator input $z$ is drawn, and $\mathbb{E}$ denotes the expectation (the average over the corresponding distribution). $D(x)$ is the discriminator’s output for a real sample $x$, and $D(G(z))$ is its output for a generated sample $G(z)$.
  • Training and Detection Phases: During training, the discriminator learns to classify real and generated network traffic, while the generator improves its ability to produce realistic samples. The final discriminator acts as a binary classifier for intrusion detection. Figure 3 illustrates the architecture of the proposed GAN model.
    The generator and discriminator models use convolutional layers, activation functions (LeakyReLU, ReLU), and dropout for regularization.
    The algorithm outlines the structure of the discriminator network used in the GAN model. Each step corresponds to a layer in the discriminator network, including details about kernel size, activation functions, output sizes, padding, and dropout rates. The architecture is designed for binary classification (e.g., normal vs. attack) in a SCADA network IDS using DNP3 protocol traffic.
    The GAN model outputs two classes: Normal (0) and Attack (1). This classification is achieved by training the discriminator using labeled instances of network traffic, enabling it to differentiate between normal and malicious behaviors within SCADA networks.
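To make the adversarial objective concrete, the following sketch estimates V(D, G) from a batch of discriminator outputs. It is a numerical illustration only, not the training code of the proposed model: the function name, the `eps` stabilizer, and the sample probabilities are assumptions.

```python
import math


def value_function(d_real, d_fake, eps=1e-12):
    """Batch estimate of V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs in (0, 1) for real samples.
    d_fake: discriminator outputs in (0, 1) for generated samples.
    eps guards against log(0) and is an implementation assumption.
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs.
confident = value_function(d_real=[0.90, 0.95], d_fake=[0.10, 0.05])
confused = value_function(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident, confused)
```

The discriminator is trained to increase this value (equivalently, to minimize its negative, which is the usual binary cross-entropy loss), whereas the generator is trained to decrease it. A discriminator that cannot tell the two sources apart outputs 0.5 everywhere, giving 2·log(0.5) ≈ −1.386.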
Algorithm 1 outlines the key steps of a GAN-based IDS for SCADA systems using the DNP3 protocol, with a focus on feature extraction, adversarial training, and detection.
The generator’s goal is to “fool” the discriminator by maximizing $D(\hat{x}_i)$ for each generated sample $\hat{x}_i$, i.e., by making the discriminator believe that generated data are real. Accordingly, training minimizes the generator loss, which improves the generator’s ability to produce data that are indistinguishable from real data.
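One common way to write this generator objective over a mini-batch of $m$ generated samples is the non-saturating loss below. This form is an assumption: the text does not state whether the proposed model uses it or the original minimax term, so the equation is given only as an illustration of how maximizing $D(\hat{x}_i)$ translates into a loss to minimize.

```latex
L_G = -\frac{1}{m} \sum_{i=1}^{m} \log D(\hat{x}_i),
\qquad \hat{x}_i = G(z_i), \quad z_i \sim p_z
```

Minimizing $L_G$ pushes each $D(\hat{x}_i)$ toward 1. The minimax counterpart, which follows directly from the objective $V(D, G)$, instead minimizes $\frac{1}{m} \sum_{i=1}^{m} \log(1 - D(\hat{x}_i))$.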
Algorithm 1: GAN-Based Intrusion Detection System for SCADA using DNP3 Protocol

4. Experiments and Results

4.1. Network Training

We built and implemented our model using TensorFlow with the Keras API. The generator is updated through the combined GAN model, so that it produces more realistic samples in the next batch. Generator G draws from a random noise distribution to generate fake samples, which are mixed with the original labeled two-class training samples to form a new training set.
To determine the optimal hyperparameters, including the number of epochs, batch size, and learning rate, we conducted extensive experiments, testing various parameter configurations to fine-tune our model effectively. The parameters of the model are established as shown in Table 2.
In our experiments, the data were split into a training set and a testing set containing 80% and 20% of the data, respectively. Within the training set, 10% of the samples were randomly selected for validation. This yields 688 training and 173 test samples. We used TensorFlow, an open-source library developed by Google for numerical computation with data flow graphs, to build and run the proposed GAN model. A Titan X GPU with 3584 cores running at 1.2 GHz was used to train and test the model.
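The split sizes quoted above follow from simple arithmetic on the 861 instances, as the sketch below shows. The rounding rule (floor), the seed, and the function name are assumptions; the size of the validation subset is not stated in the text and is only a consequence of those assumptions.

```python
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle sample indices and split them into train / validation / test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_train = int(n_samples * train_frac)  # floor: 861 * 0.8 = 688.8 -> 688
    n_val = int(n_train * val_frac)        # validation is carved out of the training part

    train_part, test = indices[:n_train], indices[n_train:]
    val, train = train_part[:n_val], train_part[n_val:]
    return train, val, test


train, val, test = split_indices(861)
print(len(train) + len(val), len(test))  # training part (incl. validation) and test part
print(len(val))                          # validation size under the floor assumption
```

With only 10 and 11 instances of the two packet-manipulation attacks, a purely random split can leave very few of them in the test set; a stratified split is one possible refinement, although the text does not say whether it was used.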

4.2. Performance Metrics

We evaluate the model using Accuracy, Precision, Recall, F1-score, and AUC (see Appendix A for formal definitions). Table 3 defines the components of these metrics: TP (True Positive), FP (False Positive), TN (True Negative), and FN (False Negative). Because accuracy can be misleading on imbalanced datasets such as ours, we supplement it with precision, recall, and the F1-score, with the F1-score serving as a more reliable summary of performance.
For classification problems, the AUC (Area Under the Curve) summarizes a model’s ability to distinguish between classes. Specifically, AUC is the area under the Receiver Operating Characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. The higher the AUC, the better the model separates the classes. As shown in Figure 4, the AUC of the proposed model is 0.994, which indicates that the model distinguishes reliably between normal and attack traffic.
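An equivalent reading of the AUC is the probability that a randomly chosen attack sample receives a higher score than a randomly chosen normal sample. The sketch below computes it that way; the function name and the scores are hypothetical and unrelated to the reported 0.994.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of attack (1) and normal (0) scores.

    Ties between an attack score and a normal score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical discriminator scores for two normal and two attack samples.
print(roc_auc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of samples, which is acceptable for a test set of this size; a sorting-based computation would be preferable for large datasets.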
As shown in Table 4, our approach achieves a classification accuracy of 99.136% and an F1-score of 99.37%, outperforming the comparative models (FNN: 98.75% accuracy, 98.12% F1-score; RNN: 98.68% accuracy, 98.96% F1-score; CNN: 98.68% accuracy, 97.69% F1-score; SVM: 97.7% accuracy, 97.6% F1-score). This quantitative comparison highlights that the proposed GAN-based IDS provides superior performance not only in accuracy but also in robustness when handling imbalanced datasets. These results strongly support the novelty and practical advantages of our method compared to existing techniques.
The trained GAN model is tested with a test set, which contains data that the proposed model has never seen before. This evaluation step provides a crucial assessment of how effectively our model can classify unseen data, a fundamental requirement for any robust intrusion detection system (IDS). To quantify the model’s performance, we utilize a confusion matrix. This matrix visually displays the percentage of accurate and incorrect predictions made by the model for each class. By analyzing the matrix, we can extract four key metrics to assess the IDS’s overall efficiency. As shown in Figure 5, the confusion matrix reveals that the model achieved exceptional classification accuracy, with over 99% of test samples correctly predicted. This outstanding result demonstrates the proposed model’s effectiveness in distinguishing between normal and abnormal network behaviors.
Compared to existing works, our proposed GAN-based IDS for SCADA achieves an accuracy of 99.136%, with precision, recall, and F1-score all at 0.99, making it highly effective in detecting cyber intrusions in SCADA environments. Unlike previous studies that rely primarily on generic cybersecurity datasets (e.g., NSL-KDD, CICIDS2017, or BoT-IoT), our approach is tailored specifically for SCADA security: it uses a DNP3 SCADA dataset, which enables domain-specific feature extraction and protocol-aware intrusion detection. Furthermore, our model addresses one of the major challenges in IDS research, namely computational efficiency. While prior models, such as CTGSM-DNN (2025) and WGAN (2024), required high training and inference times, our approach reduces inference time to 20 ms, making it suitable for real-time industrial applications. The proposed method also improves generalizability, as it outperforms existing models across different attack scenarios. However, while our model excels in accuracy, detection capability, and real-time inference, future improvements could focus on enhancing adaptability across SCADA protocols beyond DNP3 and on integrating additional low-latency adversarial learning techniques to further reduce computational overhead in real-world industrial control system deployments.

5. Conclusions

SCADA systems play a critical role in managing industrial processes, and their reliance on network separation as the primary security measure underscores the need for robust IDS. Enhancing security monitoring through advanced methodologies, such as leveraging network flows, application protocols, process-aware features, and deep learning techniques, is essential for protecting SCADA systems against evolving cyber threats. This study introduces a novel GAN-based IDS framework specifically designed to meet the unique requirements of SCADA environments. By utilizing GANs, our approach effectively addresses the challenges posed by limited and imbalanced datasets, which are common in SCADA intrusion detection scenarios.
The proposed GAN-based IDS demonstrates superior performance compared to existing methods, achieving higher accuracy and F1-scores in simulation experiments. These results highlight the potential of GANs in enhancing intrusion detection by generating realistic synthetic samples, thereby improving the robustness of the detection model. This study also explores the influence of various parameter configurations, identifying optimal settings that contribute to the overall effectiveness and flexibility of the proposed framework.
Despite these promising outcomes, certain limitations of the proposed approach warrant further attention. First, the scalability of the GAN-IDS framework to larger and more complex SCADA networks remains a significant challenge, as the computational overhead of GAN models increases with network complexity. This limitation could hinder the deployment of the system in large-scale industrial environments where real-time performance is critical. Additionally, the high computational cost associated with training and inference may pose difficulties in resource-constrained settings, such as remote or edge-based deployments.
Although our evaluation primarily focused on conventional network-based attacks (e.g., Denial of Service and ARP spoofing), it is important to recognize that GAN-based detection models, like most deep learning approaches, remain susceptible to adversarial perturbations. Such perturbations can be carefully crafted to manipulate input traffic in ways that appear legitimate to the model while bypassing detection, thereby posing a significant security risk in mission-critical SCADA environments. To address this limitation, future work will extend our study to include a broader spectrum of adversarial attack scenarios, such as evasion attacks, poisoning attacks, and timing-based manipulations, which more accurately reflect real-world adversarial behavior. In addition, we plan to investigate robustness-enhancing strategies, including adversarial training to expose the model to perturbed samples during learning, ensemble defenses that combine complementary detection mechanisms, and knowledge-guided defenses that embed protocol semantics or domain-specific constraints into the model. By pursuing these directions, we aim to significantly improve the resilience and reliability of AI-based intrusion detection systems in securing industrial control networks.

Author Contributions

H.N.N.: conceptualization, methodology, software, validation, formal analysis, investigation, data curation, writing—original draft, visualization. J.K.: investigation, methodology, project administration, resources, supervision, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Acknowledgments

The authors gratefully acknowledge our seniors for their insightful feedback on the manuscript structure, fresh perspective, and expert guidance on the GAN model. We also sincerely appreciate our colleagues for their patience, support, and valuable discussions, which were instrumental in completing this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACGAN   Auxiliary Classifier Generative Adversarial Network
AI      Artificial Intelligence
AIDS    Anomaly-based Intrusion Detection System
APT     Advanced Persistent Threat
ARP     Address Resolution Protocol
CNN     Convolutional Neural Network
DDoS    Distributed Denial-of-Service
DL      Deep Learning
DNP3    Distributed Network Protocol 3
DoS     Denial-of-Service
DRL     Distributional Reinforcement Learning
GAN     Generative Adversarial Network
GPRS    General Packet Radio Service
GSM     Global System for Mobile Communications
HIDS    Host-based Intrusion Detection System
HMI     Human–Machine Interface
HSDPA   High-Speed Downlink Packet Access
ICCP    Inter-Control Center Communications Protocol
IDS     Intrusion Detection System
IoT     Internet-of-Things
IVNs    In-Vehicle Networks
MitM    Man-in-the-Middle
NIDS    Network-based Intrusion Detection System
RTU     Remote Terminal Unit
SCADA   Supervisory Control and Data Acquisition
TCP     Transmission Control Protocol
UAV     Unmanned Aerial Vehicle
Wi-Fi   Wireless Fidelity
XAI     eXplainable AI

Appendix A. Performance Metric Definitions

To evaluate the classification performance of the proposed GAN-based IDS, we use widely accepted metrics: Accuracy, Precision, Recall, F1-score, and AUC. These definitions are presented here for completeness.
  • Accuracy
    $$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
    where $TP$ = True Positives, $TN$ = True Negatives, $FP$ = False Positives, and $FN$ = False Negatives.
  • Precision
    $$\mathrm{Precision} = \frac{TP}{TP + FP}$$
    Precision measures the proportion of positive identifications that are actually correct.
  • Recall (Sensitivity)
    $$\mathrm{Recall} = \frac{TP}{TP + FN}$$
    Recall measures the proportion of actual positives correctly identified.
  • F1-Score
    $$F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
    The F1-score balances Precision and Recall, and is particularly useful for imbalanced datasets.
  • Area Under the Curve (AUC)
    AUC refers to the area under the Receiver Operating Characteristic (ROC) curve, which plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings. Higher AUC values indicate better discrimination between normal and attack classes.
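The definitions above translate directly into code. The sketch below is a minimal illustration; the function name is an assumption, and the confusion-matrix counts are hypothetical values chosen for the example, not the counts behind the reported results.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The guards against zero denominators return 0.0 when a metric is undefined, for example when the model predicts no positives; this is a convention chosen here rather than part of the definitions.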

References

  1. van Boven, L.S.; Kusters, R.W.; Tin, D.; van Osch, F.H.; De Cauwer, H.; Ketelings, L.; Rao, M.; Dameff, C.; Barten, D.G. Hacking Acute Care: A Qualitative Study on the Health Care Impacts of Ransomware Attacks Against Hospitals. Ann. Emerg. Med. 2024, 83, 46–56.
  2. Nhung-Nguyen, H.; Girdhar, M.; Kim, Y.H.; Hong, J. Machine-Learning-Based Anomaly Detection for GOOSE in Digital Substations. Energies 2024, 17, 3745.
  3. Lee, J.M.; Hong, S. Keeping Host Sanity for Security of the SCADA Systems. IEEE Access 2020, 8, 62954–62968.
  4. Lee, J.M.; Hong, S. Host-Oriented Approach to Cyber Security for the SCADA Systems. In Proceedings of the 2020 6th IEEE Congress on Information Science and Technology (CiSt), Agadir-Essaouira, Morocco, 5–12 June 2021; pp. 151–155.
  5. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. arXiv 2014, arXiv:1406.2661.
  6. Nhung Nguyen, H.; Kim, Y.H. GAN-Based Driver’s Head Motion Using Millimeter-Wave Radar Sensor. IEEE Access 2025, 13, 108359–108367.
  7. Lee, J.; Park, K. GAN-based imbalanced data intrusion detection system. Pers. Ubiquitous Comput. 2021, 25, 121–128.
  8. Piplai, A.; Chukkapalli, S.S.L.; Joshi, A. NAttack! Adversarial Attacks to bypass a GAN based classifier trained to detect Network intrusion. In Proceedings of the 2020 IEEE 6th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS), Baltimore, MD, USA, 25–27 May 2020; pp. 49–54.
  9. Liao, D.; Huang, S.; Tan, Y.; Bai, G. Network Intrusion Detection Method Based on GAN Model. In Proceedings of the 2020 International Conference on Computer Communication and Network Security (CCNS), Xi’an, China, 21–23 August 2020; pp. 153–156.
  10. Anderson, J.P. Computer Security Threat Monitoring and Surveillance; Technical Report; James P. Anderson Company: Fort Washington, MD, USA, 1980.
  11. Liu, H.; Lang, B. Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey. Appl. Sci. 2019, 9, 4396.
  12. Chollet, F. Deep Learning with Python, 1st ed.; Manning Publications Co.: Shelter Island, NY, USA, 2017.
  13. LeCun, Y. Deep learning. Nature 2015, 521, 436–444.
  14. Nhung-Nguyen, H.; Youn, Y.W.; Kim, Y.H. A Deep Neural Network to Identify Vacuum Degrees in Vacuum Interrupter Based on Partial Discharge Diagnosis. IEEE Access 2022, 10, 95125–95131.
  15. Hong, J.; Kim, Y.H.; Nhung-Nguyen, H.; Kwon, J.; Lee, H. Deep-Learning Based Fault Events Analysis in Power Systems. Energies 2022, 15, 5539.
  16. Nguyen, H.N.; Lee, S.; Nguyen, T.T.; Kim, Y.H. One-shot learning-based driver’s head movement identification using a millimetre-wave radar sensor. IET Radar Sonar Navig. 2022, 16, 825–836.
  17. Wang, W.; Lu, Z. Cyber security in the Smart Grid: Survey and challenges. Comput. Netw. 2013, 57, 1344–1371.
  18. Vinayakumar, R.; Barathi Ganesh, H.B.; Poornachandran, P.; Anand Kumar, M.; Soman, K.P. Deep-Net: Deep Neural Network for Cyber Security Use Cases. arXiv 2018, arXiv:1812.03519.
  19. IEEE Std 1815-2012; IEEE Standard for Electric Power Systems Communications-Distributed Network Protocol (DNP3). IEEE Standards Association: Piscataway, NJ, USA, 2012; pp. 1–821.
  20. Dogaru, D.I.; Dumitrache, I. Cyber Security of Smart Grids in the Context of Big Data and Machine Learning. In Proceedings of the 2019 22nd International Conference on Control Systems and Computer Science (CSCS), Bucharest, Romania, 28–30 May 2019; pp. 61–67.
  21. Rakas, S.V.B.; Stojanović, M.D.; Marković-Petrović, J.D. A Review of Research Work on Network-Based SCADA Intrusion Detection Systems. IEEE Access 2020, 8, 93083–93108.
  22. Martins, I.; Resende, J.S.; Sousa, P.R.; Silva, S.; Antunes, L.; Gama, J. Host-based IDS: A review and open issues of an anomaly detection system in IoT. Future Gener. Comput. Syst. 2022, 133, 95–113.
  23. Bulle, B.B.; Santin, A.O.; Viegas, E.K.; dos Santos, R.R. A Host-based Intrusion Detection Model Based on OS Diversity for SCADA. In Proceedings of the IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 18–21 October 2020; pp. 691–696.
  24. da Conceição Aberto, H.; Dembele, J.M.; Diop, I.; Bah, A. Review of Intrusion Detection Systems for Supervisor Control and Data Acquisition: A Machine Learning Approach. In Communications in Computer and Information Science, Proceedings of the International Conference on Science, Engineering Management and Information Technology, Ankara, Turkey, 14–15 September 2023; Springer: Cham, Switzerland, 2023; pp. 28–51.
  25. Al-Asiri, M.; El-Alfy, E.S.M. On Using Physical Based Intrusion Detection in SCADA Systems. Procedia Comput. Sci. 2020, 170, 34–42.
  26. Kwon, H.Y.; Kim, T.; Lee, M.K. Advanced Intrusion Detection Combining Signature-Based and Behavior-Based Detection Methods. Electronics 2022, 11, 867.
  27. Yang, Y.; McLaughlin, K.; Littler, T.; Sezer, S.; Wang, H. Rule-based intrusion detection system for SCADA networks. In Proceedings of the 2nd IET Renewable Power Generation Conference (RPG 2013), Beijing, China, 9–11 September 2013; pp. 1–4.
  28. Adiban, M.; Siniscalchi, S.M.; Salvi, G. A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity. Neurocomputing 2023, 537, 296–308.
  29. Park, C.H.; Jo, J.Y.; Kim, Y. Detecting Cyber Threats with Limited Dataset Using Generative Adversarial Network on SCADA System. In Proceedings of the 2023 International Conference on Computational Science and Computational Intelligence (CSCI), Las Vegas, NV, USA, 13–15 December 2023; pp. 915–919.
  30. Gunnam, S.R.; Vepuri, S.K.; Nallarasan, V. Detection of Real Time Malicious Intrusions Using GAN (Generative Adversarial Networks) in Cyber Physical System. In Proceedings of the 2024 5th International Conference for Emerging Technology (INCET), Belgaum, India, 24–26 May 2024; pp. 1–7.
  31. Freitas de Araujo-Filho, P.; Kaddoum, G.; Campelo, D.R.; Gondim Santos, A.; Macêdo, D.; Zanchettin, C. Intrusion Detection for Cyber–Physical Systems Using Generative Adversarial Networks in Fog Environment. IEEE Internet Things J. 2021, 8, 6247–6256.
  32. Benaddi, H.; Jouhari, M.; Ibrahimi, K.; Ben Othman, J.; Amhoud, E.M. Anomaly Detection in Industrial IoT Using Distributional Reinforcement Learning and Generative Adversarial Networks. Sensors 2022, 22, 8085.
  33. Yalçın, N.; Çakır, S.; Ünaldı, S. Attack Detection Using Artificial Intelligence Methods for SCADA Security. IEEE Internet Things J. 2024, 11, 39550–39559.
  34. Kim, J.Y.; Bu, S.J.; Cho, S.B. Malware detection using deep transferred generative adversarial networks. In Lecture Notes in Computer Science, Proceedings of the Neural Information Processing: 24th International Conference, ICONIP 2017, Guangzhou, China, 14–18 November 2017; Proceedings, Part I 24; Springer: Cham, Switzerland, 2017; pp. 556–564.
  35. Seo, E.; Song, H.M.; Kim, H.K. GIDS: GAN based Intrusion Detection System for In-Vehicle Network. In Proceedings of the 2018 16th Annual Conference on Privacy, Security and Trust (PST), Belfast, Ireland, 28–30 August 2018; pp. 1–6.
  36. Tabassum, A.; Erbad, A.; Lebda, W.; Mohamed, A.; Guizani, M. FEDGAN-IDS: Privacy-preserving IDS using GAN and Federated Learning. Comput. Commun. 2022, 192, 299–310.
  37. Li, S.; Cao, Y.; Liu, S.; Lai, Y.; Zhu, Y.; Ahmad, N. HDA-IDS: A Hybrid DoS Attacks Intrusion Detection System for IoT by using semi-supervised CL-GAN. Expert Syst. Appl. 2024, 238, 122198.
  38. Yoo, J.D.; Kim, H.; Kim, H.K. GUIDE: GAN-based UAV IDS Enhancement. Comput. Secur. 2024, 147, 104073.
  39. Liu, X.; Li, T.; Zhang, R.; Wu, D.; Liu, Y.; Yang, Z. A GAN and Feature Selection-Based Oversampling Technique for Intrusion Detection. Secur. Commun. Netw. 2021, 2021, 9947059.
  40. Kim, T.; Pak, W. Early Detection of Network Intrusions Using a GAN-Based One-Class Classifier. IEEE Access 2022, 10, 119357–119367.
  41. Abu-Jassar, A.T.; Attar, H.; Yevsieiev, V.; Amer, A.; Demska, N.; Luhach, A.K.; Lyashenko, V. Electronic User Authentication Key for Access to HMI/SCADA via Unsecured Internet Networks. Comput. Intell. Neurosci. 2022, 2022, 5866922.
  42. Yadav, G.; Paul, K. Architecture and security of SCADA systems: A review. Int. J. Crit. Infrastruct. Prot. 2021, 34, 100433.
  43. Qian, J.; Du, X.; Chen, B.; Qu, B.; Zeng, K.; Liu, J. Cyber-Physical Integrated Intrusion Detection Scheme in SCADA System of Process Manufacturing Industry. IEEE Access 2020, 8, 147471–147481.
  44. Anwar, M.; Lundberg, L.; Borg, A. Improving anomaly detection in SCADA network communication with attribute extension. Energy Inform. 2022, 5, 69.
  45. Aboulsamh, R.M.; Albugaey, M.T.; Alghamdi, D.O.; Abujaid, F.H.; Alsubaie, S.N.; Saqib, N.A. Secure Communication Protocols for SCADA Systems: Analysis and Comparisons of Different Secure Communication Protocols. In Proceedings of the 2024 Seventh International Women in Data Science Conference at Prince Sultan University (WiDS PSU), Riyadh, Saudi Arabia, 3–4 March 2024; pp. 209–214.
  46. Lin, C.Y.; Nadjm-Tehrani, S. Protocol study and anomaly detection for server-driven traffic in SCADA networks. Int. J. Crit. Infrastruct. Prot. 2023, 42, 100612.
  47. Alsabbagh, W.; Langendörfer, P. Security of Programmable Logic Controllers and Related Systems: Today and Tomorrow. IEEE Open J. Ind. Electron. Soc. 2023, 4, 659–693.
  48. Yang, K.; Wang, H.; Wang, H.; Sun, L. An effective intrusion-resilient mechanism for programmable logic controllers against data tampering attacks. Comput. Ind. 2022, 138, 103613.
  49. Rencelj Ling, E.; Urrea Cabus, J.E.; Butun, I.; Lagerström, R.; Olegard, J. Securing Communication and Identifying Threats in RTUs: A Vulnerability Analysis. In Proceedings of the 17th International Conference on Availability, Reliability and Security, Vienna, Austria, 23–26 August 2022; ARES ’22, pp. 1–7.
  50. Cruz, T.; Rosa, L.; Proença, J.; Maglaras, L.; Aubigny, M.; Lev, L.; Jiang, J.; Simões, P. A Cybersecurity Detection Framework for Supervisory Control and Data Acquisition Systems. IEEE Trans. Ind. Inform. 2016, 12, 2236–2246.
  51. Juma, M.; Alattar, F.; Touqan, B. Securing Big Data Integrity for Industrial IoT in Smart Manufacturing Based on the Trusted Consortium Blockchain (TCB). IoT 2023, 4, 27–55. [Google Scholar] [CrossRef]
  52. Lupascu, C.; Lupascu, A.; Bica, I. DLT Based Authentication Framework for Industrial IoT Devices. Sensors 2020, 20, 2621. [Google Scholar] [CrossRef]
  53. Ali, B.S.; Ullah, I.; Al Shloul, T.; Khan, I.A.; Khan, I.; Ghadi, Y.Y.; Abdusalomov, A.; Nasimov, R.; Ouahada, K.; Hamam, H. ICS-IDS: Application of big data analysis in AI-based intrusion detection systems to identify cyberattacks in ICS networks. J. Supercomput. 2024, 80, 7876–7905. [Google Scholar] [CrossRef]
  54. Abdullahi, M.; Alhussian, H.; Aziz, N.; Abdulkadir, S.J.; Alwadain, A.; Muazu, A.A.; Bala, A. Comparison and Investigation of AI-Based Approaches for Cyberattack Detection in Cyber-Physical Systems. IEEE Access 2024, 12, 31988–32004. [Google Scholar] [CrossRef]
  55. Hu, J.; Yang, H.; Lyu, M.R.; King, I.; Man-Cho So, A. Online Nonlinear AUC Maximization for Imbalanced Data Sets. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 882–895. [Google Scholar] [CrossRef]
  56. Yan, Y.; Liu, R.; Ding, Z.; Du, X.; Chen, J.; Zhang, Y. A Parameter-Free Cleaning Method for SMOTE in Imbalanced Classification. IEEE Access 2019, 7, 23537–23548. [Google Scholar] [CrossRef]
  57. Balla, A.; Habaebi, M.H.; Elsheikh, E.A.A.; Islam, M.R.; Suliman, F.M. The Effect of Dataset Imbalance on the Performance of SCADA Intrusion Detection Systems. Sensors 2023, 23, 758. [Google Scholar] [CrossRef] [PubMed]
  58. Sams Aafiya Banu, S.; Gopika, B.; Esakki Rajan, E.; Ramkumar, M.; Mahalakshmi, M.; Emil Selvan, G. Smote variants for data balancing in intrusion detection system using machine learning. In Proceedings of the International Conference on Machine Intelligence and Signal Processing; Springer: Singapore, 2022; pp. 317–330. [Google Scholar] [CrossRef]
  59. Abdelmoumin, G.; Rawat, D.B.; Rahman, A. Studying Imbalanced Learning for Anomaly-Based Intelligent IDS for Mission-Critical Internet of Things. J. Cybersecur. Priv. 2023, 3, 706–743. [Google Scholar] [CrossRef]
  60. Louk, M.H.L.; Tama, B.A. Exploring Ensemble-Based Class Imbalance Learners for Intrusion Detection in Industrial Control Networks. Big Data Cogn. Comput. 2021, 5, 72. [Google Scholar] [CrossRef]
  61. Khan, I.A.; Pi, D.; Khan, Z.U.; Hussain, Y.; Nawaz, A. HML-IDS: A Hybrid-Multilevel Anomaly Prediction Approach for Intrusion Detection in SCADA Systems. IEEE Access 2019, 7, 89507–89521. [Google Scholar] [CrossRef]
  62. Rajesh, L.; Satyanarayana, P. Evaluation of machine learning algorithms for detection of malicious traffic in scada network. J. Electr. Eng. Technol. 2022, 17, 913–928. [Google Scholar] [CrossRef]
  63. Yan, B.; Han, G.; Sun, M.; Ye, S. A novel region adaptive SMOTE algorithm for intrusion detection on imbalanced problem. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 1281–1286. [Google Scholar] [CrossRef]
  64. Sun, Y.; Liu, F. SMOTE-NCL: A re-sampling method with filter for network intrusion detection. In Proceedings of the 2016 2nd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 14–17 October 2016; pp. 1157–1161. [Google Scholar] [CrossRef]
  65. Ahmad, I.; Basheri, M.; Iqbal, M.J.; Rahim, A. Performance Comparison of Support Vector Machine, Random Forest, and Extreme Learning Machine for Intrusion Detection. IEEE Access 2018, 6, 33789–33795. [Google Scholar] [CrossRef]
  66. Zarpelão, B.B.; Miani, R.S.; Kawakani, C.T.; de Alvarenga, S.C. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 2017, 84, 25–37. [Google Scholar] [CrossRef]
  67. Mohagheghi, S.; Stoupis, J.; Wang, Z. Communication protocols and networks for power systems-current status and future trends. In Proceedings of the 2009 IEEE/PES Power Systems Conference and Exposition, Seattle, WA, USA, 15–18 March 2009; pp. 1–9. [Google Scholar] [CrossRef]
  68. Mander, T.; Cheung, R.; Nabhani, F. Power System DNP3 Data Object Security Using Data Sets. Comput. Secur. 2010, 29, 487–500. [Google Scholar] [CrossRef]
  69. IEC 60870-6 TASE.2; Telecontrol Standard IEC 60870-6 TASE.2 Globally Adopted. Springer-Verlag Wien: Vienna, Austria, 1999.
  70. IEEE Std 1379-2000; IEEE Recommended Practice for Data Communications Between Remote Terminal Units and Intelligent Electronic Devices in a Substation. IEEE Standards Association: Piscataway, NJ, USA, 2001; pp. 1–72. [CrossRef]
  71. IEEE Std 1815-2010; IEEE Standard for Electric Power Systems Communications—Distributed Network Protocol (DNP3). IEEE Standards Association: Piscataway, NJ, USA, 2010; pp. 1–775. [CrossRef]
  72. Yin, X.C.; Liu, Z.G.; Nkenyereye, L.; Ndibanje, B. Toward an Applied Cyber Security Solution in IoT-Based Smart Grids: An Intrusion Detection System Approach. Sensors 2019, 19, 4952. [Google Scholar] [CrossRef]
  73. Linda, O.; Vollmer, T.; Manic, M. Neural Network based Intrusion Detection System for critical infrastructures. In Proceedings of the 2009 International Joint Conference on Neural Networks, Atlanta, GA, USA, 14–19 June 2009; pp. 1827–1834. [Google Scholar] [CrossRef]
  74. Altaha, M.; Lee, J.M.; Muhammad, A.; Hong, S. Network Intrusion Detection based on Deep Neural Networks for the SCADA system. J. Phys. Conf. Ser. 2020, 1585, 012038. [Google Scholar] [CrossRef]
Figure 1. The components of a SCADA system.
Figure 2. DNP3 experiment configuration.
Figure 3. Proposed GAN model for intrusion detection system.
Figure 4. The AUC of the proposed model.
Figure 5. Confusion matrix of GAN model.
Table 1. List of input features.

| No. | Feature |
|---|---|
| 1 | Source Bytes |
| 2 | Destination Bytes |
| 3 | Flag |
| 4 | Service Count |
| 5 | Contains DNP3 Packets |
| 6 | DNP3 Payload Length |
| 7 | Min DNP3 Payload Length |
| 8 | Cold Restart in DNP3 Packet |
| 9 | Same Service Rate |
| 10 | Round Trip Time Delay |
| 11 | Destination Host Identical Source Port Rate |
| 12 | Function Code Not Supported Count |
Table 2. Hyperparameter optimization.

| Hyperparameter | Value |
|---|---|
| Latent dim | 128 |
| Generator input | 100 |
| Batch size | 64 |
| Number of epochs | 300 |
| Learning rate | 0.001 |
| Optimizer | Adam |
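The settings in Table 2 can be collected into a single configuration object so that a training script reads them from one place. The following is a minimal sketch, not the authors' code: the class name `GANTrainingConfig`, the helper `steps_per_epoch`, and the sample count of 10,000 are hypothetical, and the values of "Latent dim" and "Generator input" are recorded exactly as listed in the table without interpreting how they relate.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GANTrainingConfig:
    """Hyperparameters copied from Table 2 (field names are illustrative)."""
    latent_dim: int = 128
    generator_input: int = 100
    batch_size: int = 64
    epochs: int = 300
    learning_rate: float = 0.001
    optimizer: str = "Adam"


def steps_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


config = GANTrainingConfig()

# Hypothetical training-set size, used only to illustrate the arithmetic.
n_samples = 10_000
batches = steps_per_epoch(n_samples, config.batch_size)
total_updates = batches * config.epochs
print(f"{batches} batches per epoch, {total_updates} updates in total")
```

Freezing the dataclass keeps the values from being changed by accident during a run, which makes it easier to report the exact settings used.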
Table 3. Confusion matrix indicator for machine learning.

| Indicator | True Label | Model’s Prediction |
|---|---|---|
| TP | Attack | Attack |
| FP | Normal | Attack |
| TN | Normal | Normal |
| FN | Attack | Normal |
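The four indicators in Table 3 are the inputs to the accuracy and F1-score values reported for the compared models. As an illustration, the sketch below counts them from label lists and applies the standard textbook definitions of accuracy, precision, recall, and F1. The function names and the six-sample example are assumptions made for this sketch and do not come from the paper's experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # true label Attack, predicted Attack
    fp: int  # true label Normal, predicted Attack
    tn: int  # true label Normal, predicted Normal
    fn: int  # true label Attack, predicted Normal


def count_outcomes(y_true, y_pred, positive="Attack"):
    """Tally TP, FP, TN, FN with 'Attack' as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def metrics(c):
    """Standard definitions; empty denominators yield 0.0."""
    total = c.tp + c.fp + c.tn + c.fn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = ["Attack", "Attack", "Normal", "Normal", "Attack", "Normal"]
y_pred = ["Attack", "Normal", "Normal", "Attack", "Attack", "Normal"]

counts = count_outcomes(y_true, y_pred)
scores = metrics(counts)
print(counts)
print({name: round(value, 4) for name, value in scores.items()})
```

Treating "Attack" as the positive class follows Table 3, where a true positive is an attack that the model also labels as an attack.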
Table 4. Performance comparison.

| Method | Average classification accuracy | F1-score |
|---|---|---|
| SVM | 97.7% | 97.6% |
| FNN | 98.75% [74] | 98.12% |
| RNN | 98.68% | 98.96% |
| CNN | 98.68% | 97.69% |
| LWGAN | 99.136% | 99.37% |
