Abstract
The increasing digital integration of Industrial Control Systems (ICS), including Supervisory Control and Data Acquisition (SCADA) and Distributed Control Systems (DCS), has improved operational efficiency while simultaneously increasing exposure to cyber threats. Traditional signature-based intrusion detection systems are limited in detecting novel and stealthy attacks in dynamic industrial environments. This study presents a deep learning–based anomaly detection framework for ICS cybersecurity using multivariate time-series data from sensors, actuators, and network traffic. Three architectures, Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformer models, are evaluated using the HAI Security Dataset. Experimental results show that the Transformer model achieves the highest accuracy (92%), followed by CNN (91%) and LSTM (90%), with all models attaining an F1-score of 91%. The Transformer demonstrates superior generalization by effectively modelling complex temporal dependencies. Key challenges, including class imbalance, overfitting, and limited interpretability, are discussed alongside mitigation strategies such as regularization, early stopping, cost-sensitive learning, hybrid modelling, federated learning, and digital twin integration. The findings demonstrate the effectiveness of deep learning for scalable, real-time cybersecurity threat detection in industrial control environments.
1. Introduction
Industrial Process Control and Monitoring Systems (PCMS), including Supervisory Control and Data Acquisition (SCADA), Distributed Control Systems (DCS), and Programmable Logic Controllers (PLCs) [1], constitute the operational backbone of critical infrastructure sectors such as energy production, water treatment, transportation, and manufacturing [2]. These systems are responsible for real-time monitoring and control of physical processes whose failure can result in severe economic loss, environmental damage, and threats to human safety [3]. In recent years, the integration of Operational Technology (OT) with Information Technology (IT), driven by Industry 4.0 initiatives and the rapid adoption of the Industrial Internet of Things (IIoT), has significantly enhanced operational efficiency and remote accessibility [4]. However, this convergence has also expanded the attack surface of industrial environments, exposing PCMS to increasingly sophisticated cyber threats [5].
Unlike conventional IT systems, PCMS operate under strict real-time constraints, rely heavily on legacy infrastructure, and employ specialized industrial communication protocols such as Modbus, DNP3, and OPC-UA [5]. These characteristics limit the effectiveness of traditional IT-centric cybersecurity solutions and introduce unique vulnerabilities. The consequences of successful cyberattacks on PCMS extend beyond data breaches to include physical equipment damage, operational shutdowns, and safety incidents [6]. High-profile attacks such as Stuxnet, Havex, Black Energy, and Industroyer have demonstrated the ability of adversaries to manipulate industrial processes and disrupt critical services, highlighting the urgent need for intelligent and adaptive cybersecurity mechanisms tailored to industrial environments. Conventional signature-based intrusion detection systems (IDS) remain widely deployed in PCMS but are inherently reactive and ineffective against zero-day attacks, polymorphic malware, and stealthy intrusions that mimic legitimate operational behavior [7]. Furthermore, modern PCMS generate large volumes of heterogeneous data, including sensor measurements, actuator signals, control commands, and network traffic logs [8,9]. The high dimensionality, temporal dependencies, and severe class imbalance inherent in such data pose significant challenges for traditional rule-based and shallow machine learning approaches, which often struggle to distinguish malicious anomalies from normal operational fluctuations [10,11].
Deep learning has emerged as a promising paradigm for addressing these challenges due to its ability to automatically learn hierarchical representations from high-dimensional and multivariate time-series data [12]. Architectures such as Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformer models have demonstrated strong potential in modeling spatial and temporal dependencies in industrial data streams [13]. By leveraging deep neural networks, it becomes possible to continuously monitor industrial processes and detect subtle deviations indicative of cyber intrusions, insider threats, or advanced persistent attacks without relying solely on predefined signatures [14,15]. Despite growing research interest, existing deep learning-based ICS security studies exhibit several limitations [16]. Many rely on single datasets, limiting generalizability across diverse industrial contexts, while others emphasize detection accuracy without adequately addressing class imbalance, computational constraints, interpretability [17], or deployment feasibility in real-time environments. In addition, insufficient methodological detail in some studies hinders reproducibility and practical adoption in operational PCMS [18,19].
To address these gaps, this study presents a systematic evaluation of deep learning architectures for cybersecurity threat detection in industrial process control and monitoring systems. Using the HAI Security Dataset, which reflects realistic industrial operational behaviors and attack scenarios, we assess and compare the performance of CNN, LSTM, and Transformer models on multivariate sensor, actuator, and network data. Beyond detection performance, the study also considers generalization capability and practical deployment considerations relevant to resource-constrained industrial environments [20,21]. In particular, it examines issues that prior work often reports only superficially: class imbalance, overfitting, minority-class detection performance, and the computational feasibility of real-time deployment.
To address the identified limitations in existing industrial cybersecurity research, this study makes several key contributions. First, it provides a unified and systematic experimental comparison of CNN, LSTM, and Transformer models for anomaly detection in Industrial Process Control and Monitoring Systems (PCMS) using realistic multivariate time-series data under identical preprocessing, training, and evaluation conditions [22,23]. Second, this study extends performance analysis beyond overall accuracy by explicitly examining class imbalance, overfitting behavior, and model generalization, with particular emphasis on the detection of minority (attack) classes [24]. Third, it discusses practical deployment considerations for deep learning–based intrusion detection in real-time and resource-constrained ICS environments, highlighting the trade-offs between detection performance and computational complexity [25]. Finally, the study identifies promising future research directions aimed at improving the robustness, scalability, and trustworthiness of AI-driven cybersecurity solutions for industrial process control and monitoring systems.
2. Materials and Methods
This section describes the experimental methodology adopted in this study: the dataset, the deep learning models employed for anomaly detection, and the overall algorithmic workflow used to evaluate cybersecurity threats in Industrial Control Systems (ICS). All models (CNN, LSTM, and Transformer) are trained and tested using identical data preprocessing steps, sliding-window segmentation, and evaluation metrics to ensure a fair and consistent comparison. The outputs of these models correspond directly to the experimental results presented in Section 3.
2.1. Dataset Description
This study utilizes the HAI Security Dataset, a publicly available dataset designed to emulate realistic Industrial Control System (ICS) operations and cyberattack scenarios [26]. The dataset consists of multivariate time-series data collected from an industrial process control environment and includes sensor measurements, actuator states, and network-related variables. The dataset comprises multiple features (over 50 process variables) representing physical and cyber components of the system, such as temperature, pressure, flow rates, valve states, control commands, and network activity indicators. These features capture the dynamic behavior of the industrial process under both normal and abnormal operating conditions. Several types of cyberattacks are represented in the dataset, including false data injection, command manipulation, and process disruption attacks, which reflect realistic threat scenarios targeting ICS environments. Each data instance is labeled as either normal or attack, resulting in a highly imbalanced class distribution, where attack samples constitute a small fraction of the total data. This imbalance reflects real-world industrial environments, where malicious events are rare but highly impactful. To ensure reliable evaluation, the dataset was divided into training, validation, and testing subsets, preserving temporal order to prevent data leakage and to assess model generalization.
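As a concrete illustration of the leakage-free split described above, a chronological 70/15/15 partition (the proportions given in Section 2.6) might be implemented as follows; the function and variable names are illustrative, not taken from the authors' code:

```python
import numpy as np

def temporal_split(X, y, train_frac=0.70, val_frac=0.15):
    """Chronological split: no shuffling, so every validation/test sample
    occurs strictly later in time than the training data (prevents leakage)."""
    n = len(X)
    i_train = round(n * train_frac)
    i_val = round(n * (train_frac + val_frac))
    return ((X[:i_train], y[:i_train]),
            (X[i_train:i_val], y[i_train:i_val]),
            (X[i_val:], y[i_val:]))
```

Because the split is positional rather than random, the attack ratio in each subset simply reflects where attack campaigns fall in the recording, which is why stratified metrics (Section 2.6) matter for evaluation.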
2.2. CNN-Based Anomaly Detection Model
The Convolutional Neural Network (CNN) model is designed to capture local temporal patterns in multivariate ICS time-series data. The input to the CNN consists of fixed-length sliding windows of normalized time-series data, where each window represents a sequence of observations across all features. Temporal dependencies are handled by applying one-dimensional convolutional filters along the time axis, enabling the model to learn short-term fluctuations, abrupt changes, and localized anomalies in sensor and control signals. Pooling layers are used to reduce dimensionality and improve robustness to noise. The extracted feature representations are passed to fully connected layers, which output a binary anomaly prediction indicating normal or attack behavior. The CNN model described in this subsection is evaluated in the Results Section using accuracy, confusion matrix, ROC–AUC, and classification metrics to assess its effectiveness in detecting cyber anomalies in ICS data.
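One possible Keras realization of this architecture (layer counts, filter sizes, and the 50-step, 59-feature input shape are illustrative assumptions, not the authors' reported configuration):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn(window_size=50, n_features=59):
    """1D-CNN over the time axis of a (window_size, n_features) input."""
    model = models.Sequential([
        layers.Input(shape=(window_size, n_features)),
        layers.Conv1D(64, kernel_size=3, activation="relu"),  # local temporal patterns
        layers.MaxPooling1D(pool_size=2),                     # downsample; noise robustness
        layers.Conv1D(32, kernel_size=3, activation="relu"),
        layers.GlobalMaxPooling1D(),                          # collapse time axis
        layers.Dense(32, activation="relu"),
        layers.Dense(1, activation="sigmoid"),                # binary normal/attack score
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```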
2.3. LSTM-Based Anomaly Detection Model
The Long Short-Term Memory (LSTM) model is employed to model long-term temporal dependencies inherent in industrial processes. Similar to the CNN, the input to the LSTM consists of sliding windows of multivariate time-series data. Unlike convolutional models, the LSTM processes input sequences sequentially and maintains internal memory states that allow it to learn the evolution of system behavior over time. This capability enables the detection of anomalies that manifest as deviations from expected operational sequences rather than instantaneous signal changes. The final hidden state of the LSTM is passed to a dense layer, producing a binary classification output that indicates whether the observed sequence corresponds to normal operation or an attack. The performance of the LSTM-based model is experimentally analyzed in Section 3, where its ability to capture long-term temporal dependencies is compared against the CNN and Transformer models.
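A minimal Keras sketch of this design (unit counts and dropout rate are illustrative assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_lstm(window_size=50, n_features=59):
    """LSTM consumes the window step by step; only the final hidden
    state feeds the dense classification head."""
    model = models.Sequential([
        layers.Input(shape=(window_size, n_features)),
        layers.LSTM(64),                        # returns final hidden state only
        layers.Dense(32, activation="relu"),
        layers.Dropout(0.3),                    # regularization against overfitting
        layers.Dense(1, activation="sigmoid"),  # normal vs. attack
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```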
2.4. Transformer-Based Anomaly Detection Model
The Transformer-based model leverages a self-attention mechanism to capture global temporal dependencies across the entire input sequence. The input consists of fixed-length time-series windows, augmented with positional encoding to preserve temporal order. Unlike recurrent architectures, the Transformer processes all time steps in parallel, allowing it to model complex, long-range interactions among features and time points. The self-attention layers compute contextualized representations of the input sequence, which are then aggregated and passed to a classification head. The model outputs a binary anomaly label, identifying whether the sequence represents normal behavior or a cyberattack. Experimental results for the Transformer model are presented in Section 3, highlighting its generalization capability and the trade-offs between detection performance and computational complexity.
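A single-block encoder along these lines could be sketched in Keras as follows; the sinusoidal positional encoding is the standard choice, while the model width, head count, and pooling strategy are assumptions, as the paper does not specify them:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def sinusoidal_encoding(length, depth):
    """Fixed positional encoding so the model retains temporal order."""
    pos = np.arange(length)[:, None]
    i = np.arange(depth)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / depth)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle)).astype("float32")

def build_transformer(window_size=50, n_features=59, d_model=64, n_heads=4):
    inp = layers.Input(shape=(window_size, n_features))
    x = layers.Dense(d_model)(inp)                    # project features to model width
    x = x + sinusoidal_encoding(window_size, d_model)
    attn = layers.MultiHeadAttention(num_heads=n_heads,
                                     key_dim=d_model // n_heads)(x, x)
    x = layers.LayerNormalization()(x + attn)         # residual + norm
    ff = layers.Dense(2 * d_model, activation="relu")(x)
    x = layers.LayerNormalization()(x + layers.Dense(d_model)(ff))
    x = layers.GlobalAveragePooling1D()(x)            # aggregate the sequence
    out = layers.Dense(1, activation="sigmoid")(x)    # binary anomaly score
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```

The quadratic cost of self-attention in `window_size` is the source of the computational overhead discussed in Section 3.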
2.5. Algorithmic Workflow
The overall anomaly detection process employed in this study is summarized in Algorithm 1.
| Algorithm 1. Deep Learning-Based ICS Anomaly Detection. |
| Input: Multivariate time-series data X |
| Output: Anomaly label y |
| 1. Normalize raw ICS time-series data. |
| 2. Segment data into fixed-length sliding windows. |
| 3. Train the selected deep learning model (CNN, LSTM, or Transformer) using labeled data. |
| 4. Predict anomaly labels for unseen test sequences. |
| 5. Evaluate detection performance using standard classification metrics. |
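Steps 1–2 of Algorithm 1 can be sketched as below. The window-labelling rule (a window counts as an attack if any step inside it is attacked) and the window/stride values are assumptions, since the paper leaves them symbolic; the min–max scaler is fitted on training data only, consistent with the leakage precautions in Section 2.6:

```python
import numpy as np

def minmax_fit_transform(train, other):
    """Step 1: min-max normalization fitted on the training portion only."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
    return (train - lo) / scale, (other - lo) / scale

def make_windows(data, labels, window=50, stride=10):
    """Step 2: segment a (T, F) series into overlapping (window, F) slices.
    Assumption: a window is labelled 1 if any step within it is an attack."""
    X, y = [], []
    for start in range(0, len(data) - window + 1, stride):
        X.append(data[start:start + window])
        y.append(int(labels[start:start + window].max()))
    return np.asarray(X), np.asarray(y)
```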
2.6. Experimental Setup and Implementation Details
To ensure reproducibility, all experiments were conducted using Python 3.11 with the TensorFlow/Keras deep learning framework. The HAI Security Dataset was normalized using min–max scaling to ensure uniform feature ranges. Time-series data were segmented into fixed-length sliding windows of W time steps with a stride of S to capture temporal dependencies. The dataset was split into training (70%), validation (15%), and testing (15%) sets while preserving temporal order to prevent data leakage. All models were trained for N epochs using the Adam optimizer with a learning rate of η and a batch size of B. Binary cross-entropy was used as the loss function. To mitigate overfitting, early stopping was applied based on validation loss. Model performance was evaluated using accuracy, precision, recall, F1-score, and AUC, which are particularly suitable for imbalanced industrial cybersecurity datasets.
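A training loop matching this setup might look like the following sketch. The synthetic data, the small LSTM stand-in, and the concrete values substituted for the unspecified W, N, η, and B are all placeholders, not the authors' settings:

```python
import numpy as np
import tensorflow as tf

# Placeholder data shaped (samples, W, features); real runs would use
# the windowed HAI subsets from the preprocessing step.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(64, 50, 8)).astype("float32")
y_train = rng.integers(0, 2, size=64).astype("float32")
X_val = rng.normal(size=(16, 50, 8)).astype("float32")
y_val = rng.integers(0, 2, size=16).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(50, 8)),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])

# Early stopping on validation loss, restoring the best weights,
# as the overfitting mitigation described above.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train, validation_data=(X_val, y_val),
                    epochs=3, batch_size=16, verbose=0,
                    callbacks=[early_stop])
```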
Hyperparameters were selected based on commonly adopted configurations in prior ICS anomaly detection studies and empirical validation on the training and validation sets, with early stopping and dropout applied to reduce overfitting.
3. Results
The results of the study show that all three deep learning models, CNN, LSTM, and Transformer, performed well in detecting cyber threats within Industrial Control Systems using the HAI Security Dataset. The Transformer model achieved the highest accuracy at 92%, followed by CNN at 91% and LSTM at 90%, with all models attaining an F1-score of 91%. The LSTM model recorded the highest precision (93%) and AUC (0.98), while the Transformer demonstrated the best overall balance between recall and generalization. However, all models exhibited signs of overfitting and struggled with detecting minority class (attack) instances due to class imbalance. Overall, the Transformer emerged as the most effective model for real-time anomaly detection in ICS environments.
In addition to detection accuracy, inference efficiency was qualitatively analyzed. Transformer-based models demonstrated higher computational overhead compared to CNN and LSTM models due to the self-attention mechanism, which may limit their deployment on resource-constrained edge devices. Nevertheless, their superior generalization performance highlights a trade-off between predictive accuracy and computational cost. It is important to note that this evaluation was conducted using a single dataset, and further validation on additional ICS datasets such as SWAT, WADI, and EPIC is necessary to confirm the generalizability of the models across diverse operational scenarios.
Figure 1 represents the accuracy and loss of a Convolutional Neural Network (CNN) model, clearly illustrating the problem of overfitting. While the training accuracy consistently increases and training loss consistently decreases, indicating the CNN is effectively learning the training data, the validation accuracy plateaus and even drops in later epochs, coupled with a fluctuating or increasing validation loss. This divergence signifies that the CNN is memorizing the training examples rather than learning generalizable features, leading to poor performance on unseen data despite excellent performance on the training set.
Figure 1.
Model Accuracy and Loss Performance Graph of CNN.
3.1. Convolutional Neural Network (CNN)
Figure 2 visually summarizes the performance of a binary classification model, showing the counts of correct and incorrect predictions for two classes, labeled 0 and 1. Out of a total of 166 instances, the model correctly identified 132 true negatives (class 0 predicted as 0) and 19 true positives (class 1 predicted as 1), resulting in 151 accurate predictions. However, it made 10 false positive errors (class 0 incorrectly predicted as 1) and 5 false negative errors (class 1 incorrectly predicted as 0), indicating a stronger performance in identifying class 0 than class 1.
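These counts are internally consistent with the per-class metrics reported later in Table 1, which can be recomputed directly from the confusion matrix:

```python
# CNN confusion-matrix counts reported in Figure 2
tn, fp, fn, tp = 132, 10, 5, 19

accuracy = (tp + tn) / (tp + tn + fp + fn)   # 151/166
precision_1 = tp / (tp + fp)                 # attack-class precision
recall_1 = tp / (tp + fn)                    # attack-class recall

print(round(accuracy, 2), round(precision_1, 2), round(recall_1, 2))
# → 0.91 0.66 0.79, matching the 91% accuracy and the class-1
#   precision/recall reported in Table 1
```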
Figure 2.
Confusion Matrix of CNN.
Figure 3 depicts the ROC curve, with an impressive Area Under the Curve (AUC) of 0.97, which signifies that the model is an excellent binary classifier. Its curve hugs the top-left corner of the plot, demonstrating a high true positive rate while maintaining a low false positive rate across various classification thresholds. This indicates a strong capability to distinguish between the two classes, performing significantly better than a random classifier.
Figure 3.
Receiver Operating Characteristic (ROC) curve of CNN.
Table 1 details the model’s performance across two classes, 0 and 1, highlighting a strong capability in identifying class 0 with 96% precision and 93% recall. Conversely, performance on class 1, which has significantly fewer instances (24 vs. 142 for class 0), is weaker, showing a precision of 66% and a recall of 79%. Although the overall accuracy is high at 91%, the disparity in per-class metrics, particularly the lower precision for class 1, suggests the model struggles more with false positives for this minority class.
Table 1.
Classification Report of CNN.
3.2. Long Short-Term Memory (LSTM)
Figure 4 illustrates the training progression of an LSTM model, exhibiting clear signs of overfitting. While the training accuracy consistently improves and training loss continuously decreases across epochs, the validation metrics show a significant divergence. In the initial training phase (first plot), the validation accuracy starts to plateau and fluctuate while validation loss remains higher than training loss, indicating early overfitting. This trend is exacerbated in the longer training run (second plot), where the validation accuracy becomes volatile, and validation loss rises in later epochs, demonstrating the LSTM’s increasing inability to generalize effectively to unseen data despite continued learning on the training set.
Figure 4.
Model Accuracy and Loss Performance Graph of LSTM.
Figure 5 indicates the model achieved an accuracy of approximately 90.36%, with 129 true negatives and 21 true positives, demonstrating good overall performance but a lower precision of 61.76% for class 1 compared to its recall of 87.5% for the same class. Concurrently, the accompanying accuracy and loss plots reveal a clear case of overfitting, as the training accuracy consistently improves and loss decreases, while the validation accuracy becomes volatile and validation loss begins to rise in later epochs, signifying a deterioration in the model’s ability to generalize to unseen data.
Figure 5.
Confusion Matrix of LSTM.
Figure 6 depicts the ROC curve, whose AUC of 0.98 signifies the model’s excellent ability to distinguish between classes. The accompanying accuracy and loss plots reveal a critical issue of overfitting. As training progresses, the model continues to improve on the training data (rising accuracy, falling loss), but its performance on unseen validation data deteriorates, marked by volatile validation accuracy and a clear increase in validation loss in later epochs. This indicates that despite its strong discriminative power, the model is memorizing the training set and failing to generalize effectively.
Figure 6.
Receiver Operating Characteristic (ROC) curve of LSTM.
Table 2 indicates that the model performs very well on the majority class (class 0, with 142 instances), achieving a high precision of 98% and recall of 91%. However, its performance on the minority class (class 1, with 24 instances) is less robust, demonstrated by a significantly lower precision of 62% despite a decent recall of 88%. While the overall accuracy is 90%, the notable disparity in precision between the two classes highlights that the model frequently makes false positive errors when predicting class 1, suggesting room for improvement in handling the imbalanced dataset.
Table 2.
Classification Report of LSTM.
3.3. Transformer
Figure 7 indicates that the Transformer model is significantly overfitting. While the model demonstrates robust learning on the training data, with consistently increasing accuracy and decreasing loss, its generalization to unseen validation data steadily degrades. This is evident from the widening gap between training and validation accuracy, the plateauing or increasing validation loss, and the overall instability of validation metrics, suggesting the model is memorizing the training set rather than extracting broadly applicable patterns.
Figure 7.
Model Accuracy and Loss Performance Graph of Transformer.
Figure 8 summarizes the performance of a binary classification model, showing that out of 166 total instances, the model correctly identified 136 true negatives (class 0 predicted as 0) and 16 true positives (class 1 predicted as 1). However, it also made 6 false positive errors (class 0 incorrectly predicted as 1) and 8 false negative errors (class 1 incorrectly predicted as 0). This indicates that while the model performs reasonably well in correctly identifying class 0, it has a notable challenge with precisely predicting class 1 due to the false positive and false negative errors, despite identifying a fair portion of actual class 1 instances.
Figure 8.
Confusion Matrix of Transformer.
Figure 9 displays a Receiver Operating Characteristic (ROC) curve, a standard visualization for evaluating binary classification model performance. The orange curve plots the True Positive Rate (sensitivity) against the False Positive Rate for various classification thresholds. The dashed blue line represents a random classifier, serving as a baseline for comparison. A key metric, the Area Under the Curve (AUC), is reported as 0.95. This high AUC value, close to the ideal of 1.0, indicates that the model exhibits excellent discriminatory power, effectively distinguishing between positive and negative classes with a high likelihood of correctly identifying true positives while keeping false positives low.
Figure 9.
Receiver Operating Characteristic (ROC) curve of Transformer.
Table 3 details a model’s performance across two classes, “0” and “1”, with a notable class imbalance evident from the support values (142 for class 0, 24 for class 1). The model demonstrates strong performance on class 0, achieving high precision (0.94), recall (0.96), and F1-score (0.95). However, its performance significantly drops for class 1, with lower precision (0.73) and recall (0.67), resulting in an F1-score of 0.70, indicating difficulty in accurately identifying this minority class. While the overall accuracy is high at 0.92, this figure is likely boosted by the model’s proficiency with the dominant class 0; therefore, metrics like the lower macro-averaged F1-score (0.82) and particularly the F1-score for class 1 (0.70) offer a more critical view of the model’s limitations on the minority class.
Table 3.
Classification Report of Transformer.
Table 4 compares the performance of three deep learning models, CNN, LSTM, and Transformer, across accuracy, precision, recall, and F1-score. All models demonstrate strong performance, with metrics consistently in the low 90s. The Transformer model shows a slight advantage in accuracy and recall (both 92%), while the LSTM achieves the highest precision (93%). Notably, all three models exhibit an identical F1-score of 91%, indicating a similar balance between precision and recall. Overall, the results suggest that all architectures are highly effective for the task, with only minor performance variations distinguishing them.
Table 4.
Comparison of Models’ Performances.
3.4. Summary of Experimental Results
All three deep learning models demonstrated strong anomaly detection performance on the HAI Security Dataset, with accuracy values exceeding 90%. The Transformer model achieved the highest overall accuracy (92%) and recall (92%), indicating superior generalization capability in detecting cyber threats. The LSTM model yielded the highest precision (93%), suggesting fewer false alarms, while the CNN model provided competitive performance with lower computational complexity. Despite high overall accuracy, all models exhibited reduced performance on the minority attack class due to severe data imbalance. This limitation is reflected in lower precision and F1-scores for class 1 across all models. ROC–AUC values above 0.95 confirm strong discriminatory power, even in the presence of overfitting. These results highlight a trade-off between detection accuracy, generalization, and computational efficiency in industrial cybersecurity applications.
4. Discussion
This study provides a comparative evaluation of three deep learning architectures: Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM) networks, and Transformer models for cybersecurity threat detection in Industrial Control Systems (ICS) [26]. Using the HAI Security Dataset, which reflects realistic ICS operational behavior and attack scenarios, the results demonstrate that deep learning approaches can effectively detect cyber threats in SCADA and DCS environments [27]. Among the evaluated models, the Transformer achieved the highest overall accuracy (92%) and recall (92%), indicating superior generalization capability, while the LSTM model produced the highest precision (93%). Despite this strong overall performance, all models exhibited reduced effectiveness in detecting minority attack classes, primarily due to data imbalance and overfitting. This limitation highlights the need for improved learning strategies, including advanced regularization techniques, cost-sensitive loss functions, and data balancing methods such as synthetic oversampling or adversarial data generation [28].
Beyond model performance, this research addresses practical considerations for deploying AI-based security solutions in industrial environments. Constraints such as limited computational resources at the edge, strict real-time requirements, and the lack of interpretability of deep learning models remain significant barriers to adoption. To mitigate these challenges, this study emphasizes the importance of model compression techniques, including pruning and quantization, as well as the integration of Explainable AI (XAI) methods such as SHAP, LIME, and attention mechanisms to enhance transparency and operator trust. In addition, hybrid approaches that combine deep learning with rule-based logic or domain-specific constraints are identified as promising solutions for improving robustness while maintaining operational reliability. Federated learning is also highlighted as a viable strategy for enabling collaborative model training across distributed industrial sites without exposing sensitive data, while digital twins offer a safe and realistic environment for validating AI models against cyber-physical attack scenarios [29].
Although deep learning has matured significantly in conventional IT security, its adoption in ICS cybersecurity remains limited due to the unique characteristics of industrial environments, including legacy systems, real-time determinism, and specialized communication protocols. The findings of this study indicate that, when appropriately adapted, deep learning models can outperform traditional intrusion detection systems that rely on static rules and known signatures. In particular, the Transformer’s ability to capture long-range temporal dependencies makes it well-suited for identifying subtle and low-frequency attacks that are difficult to detect using conventional methods. This work contributes to a clearer understanding of how deep learning can be effectively applied to ICS cybersecurity. It underscores the importance of designing solutions that are accurate, explainable, and aligned with operational constraints. It also establishes a foundation for future research exploring sensor fusion, reinforcement learning, and causal inference to further strengthen resilience against sophisticated industrial cyber threats.
5. Conclusions
This study demonstrates the effectiveness of deep learning techniques for cybersecurity threat detection in Industrial Control Systems (ICS), particularly within SCADA and DCS environments. Using the HAI Security Dataset, CNN, LSTM, and Transformer models were evaluated on multivariate time-series data, with the Transformer achieving the highest overall accuracy of 92%. While all models exhibited strong detection capability, performance on minority attack classes was constrained by severe class imbalance and overfitting, highlighting critical challenges in real-world ICS anomaly detection. Beyond performance evaluation, this work underscores key practical considerations for deploying AI-based security solutions in industrial settings, including data scarcity, limited computational resources, explainability requirements, and strict real-time operational constraints. The findings suggest that although Transformer models offer superior generalization, their computational overhead presents trade-offs when deployed in resource-constrained environments. This study is subject to several limitations. First, the evaluation was conducted using a single dataset, which may restrict generalizability across diverse industrial contexts. Second, class imbalance and overfitting were not fully mitigated, potentially affecting the reliable detection of rare attack events. These limitations constrain external validity and motivate future research directions, including multi-dataset validation, advanced data balancing strategies, and federated learning–based training to enhance robustness and privacy. In addition, hybrid modeling approaches and digital twin–based validation frameworks are identified as promising avenues for improving the scalability, trustworthiness, and real-world applicability of AI-driven cybersecurity systems. 
The proposed framework provides a scalable foundation for real-time, intelligent anomaly detection and contributes to the development of more resilient cybersecurity defenses for critical industrial infrastructure.
Author Contributions
Conceptualization, G.P.O. and S.A.O.; methodology, G.P.O.; software, G.P.O.; validation, G.P.O., J.A.O. and B.N.U.; formal analysis, G.P.O.; investigation, G.P.O.; resources, S.A.O.; data curation, G.P.O.; writing, original draft preparation, G.P.O.; writing, review and editing, J.A.O., B.N.U. and B.E.A.; visualization, G.P.O.; supervision, S.A.O.; project administration, S.A.O.; funding acquisition, S.A.O. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The original data presented in the study are openly available in the Kaggle online data repository at https://www.kaggle.com/datasets/icsdataset/hai-security-dataset (accessed on 24 May 2025).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Alladi, T.; Chamola, V.; Zeadally, S. Industrial Control Systems: Cyberattack trends and countermeasures. Comput. Commun. 2020, 155, 1–8. [Google Scholar] [CrossRef]
- Alimi, O.A.; Ouahada, K.; Abu-Mahfouz, A.M.; Rimer, S.; Alimi, K.O.A. A Review of Research Works on Supervised Learning Algorithms for SCADA Intrusion Detection and Classification. Sustainability 2021, 13, 9597. [Google Scholar] [CrossRef]
- Oise, G.P.; Onwuzo, C.J.; Fole, M.; Oyedotun, S.A.; Odimayomi, J.A.; Unuigbokhai, N.B.; Ejenarhome, P.O.; Akilo, B.E. Decentralized Deep Learning in Healthcare: Addressing Data Privacy with Federated Learning. FUDMA J. Sci. 2025, 9, 19–26. [Google Scholar] [CrossRef]
- Ateş, Ç.; Özdel, S.; Anarım, E. Graph–Based Anomaly Detection Using Fuzzy Clustering. In Proceedings of the International Conference on Intelligent and Fuzzy Systems, Istanbul, Turkey, 23–25 July 2019; pp. 338–345. [Google Scholar]
- Khan, I.A.; Keshk, M.; Pi, D.; Khan, N.; Hussain, Y.; Soliman, H. Enhancing IIoT networks protection: A robust security model for attack detection in Internet Industrial Control Systems. Ad Hoc Netw. 2022, 134. [Google Scholar] [CrossRef]
- Abdelaty, M.; Doriguzzi-Corin, R.; Siracusa, D. AADS: A Noise-Robust Anomaly Detection Framework for Industrial Control Systems. In Proceedings of the International Conference on Information and Communications Security, Beijing, China, 15–17 December 2019; pp. 53–70. [Google Scholar]
- Zhang, J.; Pan, L.; Han, Q.-L.; Chen, C.; Wen, S.; Xiang, Y. Deep Learning Based Attack Detection for Cyber-Physical System Cybersecurity: A Survey. IEEE/CAA J. Autom. Sin. 2021, 9, 377–391. [Google Scholar] [CrossRef]
- Oise, G.P.; Konyeha, S. Deep Learning System for E-Waste Management. Eng. Proc. 2024, 67, 66. [Google Scholar] [CrossRef]
- Ben Fredj, O.; Mihoub, A.; Krichen, M.; Cheikhrouhou, O.; Derhab, A. CyberSecurity Attack Prediction: A Deep Learning Approach. In Proceedings of the SIN 2020: 13th International Conference on Security of Information and Networks, Merkez, Turkey, 4–7 November 2020; pp. 1–6. [Google Scholar]
- Xiong, D.; Zhang, D.; Zhao, X.; Zhao, Y. Deep Learning for EMG-based Human-Machine Interaction: A Review. IEEE/CAA J. Autom. Sin. 2021, 8, 512–533. [Google Scholar] [CrossRef]
- Ganesh, P.; Lou, X.; Chen, Y.; Tan, R.; Yau, D.K.Y.; Chen, D.; Winslett, M. Learning-Based Simultaneous Detection and Characterization of Time Delay Attack in Cyber-Physical Systems. IEEE Trans. Smart Grid 2021, 12, 3581–3593. [Google Scholar] [CrossRef]
- Yang, K.; Li, Q.; Li, T.; Wang, H.; Sun, L. Detecting Time-Delay Attacks in Industrial Control Systems Through State-Aware Inference. IEEE Internet Things J. 2024, 12, 7195–7208. [Google Scholar] [CrossRef]
- Pan, K.; Wang, Z.; Dong, J.; Palensky, P.; Xu, W. Real-Time Estimation and Defense of PV Inverter Sensor Attacks With Hardware Implementation. IEEE Trans. Ind. Electron. 2024, 72, 3228–3232. [Google Scholar] [CrossRef]
- Berardehi, Z.R.; Yin, J.; Taheri, M. Stabilization of Phasor Measurement Sensor-Based Markovian Jump CPSs Through Soft Actor–Critic. IEEE Sensors J. 2024, 24, 37800–37808. [Google Scholar] [CrossRef]
- Xiahou, K.; Xu, X.; Huang, D.; Du, W.; Li, M. Sliding-Mode Perturbation Observer-Based Delay-Independent Active Mitigation for AGC Systems Against False Data Injection and Random Time-Delay Attacks. IEEE Trans. Ind. Cyber-Physical Syst. 2024, 2, 446–458. [Google Scholar] [CrossRef]
- Unuigbokhai, N.B.; Oise, G.P.; Akilo, B.E.; Nwabuokei, O.C.; Odimayomi, J.A.; Bakare, S.K.; Atake, O.M. Advancements in Federated Learning for Secure Data Sharing in Financial Services. FUDMA J. Sci. 2025, 9, 80–86. [Google Scholar] [CrossRef]
- Bindra, S.S.; Aggarwal, A. Deep Learning-based Enhanced Security in Cyber-Physical Systems: A Multi-Attack Perspective. In Proceedings of the 2024 International Conference on Computational Intelligence and Computing Applications (ICCICA), Panipat, India, 23–24 May 2024; pp. 347–352. [Google Scholar]
- Meydani, A.; Shahinzadeh, H.; Ramezani, A.; Nafisi, H.; Gharehpetian, G.B. A Review and Analysis of Attack and Countermeasure Approaches for Enhancing Smart Grid Cybersecurity. In Proceedings of the 2024 28th International Electrical Power Distribution Conference (EPDC), Zanjan, Iran, 23–25 April 2024; pp. 1–19. [Google Scholar]
- Abdullahi, M.; Alhussian, H.; Aziz, N.; Abdulkadir, S.J.; Alwadain, A.; Muazu, A.A.; Bala, A. Comparison and Investigation of AI-Based Approaches for Cyberattack Detection in Cyber-Physical Systems. IEEE Access 2024, 12, 31988–32004. [Google Scholar] [CrossRef]
- Abdi, N.; Albaseer, A.; Abdallah, M. The Role of Deep Learning in Advancing Proactive Cybersecurity Measures for Smart Grid Networks: A Survey. IEEE Internet Things J. 2024, 11, 16398–16421. [Google Scholar] [CrossRef]
- Bakker, C.; Vasisht, S.; Huang, S.; Vrabie, D.L. Sensor and Actuator Attacks on Hierarchical Control Systems with Domain-Aware Operator Theory. In Proceedings of the 2023 Resilience Week (RWS), National Harbor, MD, USA, 27–30 November 2023; pp. 1–8. [Google Scholar]
- Koay, A.M.Y.; Ko, R.K.L.; Hettema, H.; Radke, K. Machine learning in industrial control system (ICS) security: Current landscape, opportunities and challenges. J. Intell. Inf. Syst. 2022, 60, 377–405. [Google Scholar] [CrossRef]
- Oyedotun, S.A.; Oise, G.P.; Ozobialu, C.E. Towards Intelligent Cybersecurity in SCADA and DCS Environments: Anomaly Detection Using Multimodal Deep Learning and Explainable AI. J. Sci. Res. Rev. 2025, 2, 20–31. [Google Scholar] [CrossRef]
- Balla, A.; Habaebi, M.H.; Islam, R.; Mubarak, S. Applications of deep learning algorithms for Supervisory Control and Data Acquisition intrusion detection system. Clean. Eng. Technol. 2022, 9. [Google Scholar] [CrossRef]
- ICS Security Dataset, “HAI Security Dataset.” Kaggle Online Data Repository. Available online: https://www.kaggle.com/datasets/icsdataset/hai-security-dataset (accessed on 24 May 2025).
- Qi, Y.; Tang, Y.; Zhao, X.; Xing, N.; Qiu, J. Dual Event-Triggered Control for Asynchronous Scheduling Parameter Varying Networked Switched Systems Under DoS Attacks. IEEE Syst. J. 2023, 17, 5854–5865. [Google Scholar] [CrossRef]
- Simonthomas, S.; Subramanian, R. Detection of Cyber Attacks in Smart Grid Using Optimization and Deep Learning Techniques. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 6–8 July 2023; pp. 1–7. [Google Scholar]
- Khare, U.; Malviya, A.; Gawre, S.K.; Arya, A. Cyber Physical Security of a Smart Grid: A Review. In Proceedings of the 2023 IEEE International Students’ Conference on Electrical, Electronics and Computer Science (SCEECS), Bhopal, India, 18–19 February 2023; pp. 1–6. [Google Scholar]
- Simonthomas, S.; Subramanian, R. Detection and Prevention of Cyber-Attacks in Cyber-Physical Systems based on Nature Inspired Algorithm. In Proceedings of the 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 9–11 February 2023; pp. 483–487. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.








