Article

An Intelligent Predictive Maintenance Architecture for Substation Automation: Real-World Validation of a Digital Twin and AI Framework of the Badra Oil Field Project

by Sarmad Alabbad 1 and Hüseyin Altınkaya 2,*
1 The Institute of Graduate Programs, Department of Electrical and Electronics Engineering, Karabuk University, 78050 Karabuk, Türkiye
2 Department of Electrical and Electronics Engineering, Karabuk University, 78050 Karabuk, Türkiye
* Author to whom correspondence should be addressed.
Electronics 2026, 15(2), 416; https://doi.org/10.3390/electronics15020416
Submission received: 28 November 2025 / Revised: 3 January 2026 / Accepted: 14 January 2026 / Published: 17 January 2026
(This article belongs to the Section Artificial Intelligence)

Abstract

The increasing complexity of modern electrical substations—driven by renewable integration, advanced automation, and asset aging—necessitates a transition from reactive maintenance toward intelligent, data-driven strategies. Predictive maintenance (PdM), supported by artificial intelligence, enables early fault detection and remaining useful life (RUL) estimation, while Digital Twin (DT) technology provides synchronized cyber–physical representations for situational awareness and risk-free validation of maintenance decisions. This study proposes a five-layer DT-enabled PdM architecture integrating standards-based data acquisition, semantic interoperability (IEC 61850, CIM, and OPC UA Part 17), hybrid AI analytics, and cyber-secure decision support aligned with IEC 62443. The framework is validated using utility-grade operational data from the SS1 substation of the Badra Oil Field, comprising approximately one million multivariate time-stamped measurements and 139 confirmed fault events across transformer, feeder, and environmental monitoring systems. Fault detection is formulated as a binary classification task using event-window alignment to the 1 min SCADA timeline, preserving realistic operational class imbalance. Five supervised learning models—a Random Forest, Gradient Boosting, a Support Vector Machine, a Deep Neural Network, and a stacked ensemble—were benchmarked, with the ensemble embedded within the DT core representing the operational predictive model. Experimental results demonstrate strong performance, achieving an F1-score of 0.98 and an AUC of 0.995. The results confirm that the proposed DT–AI framework provides a scalable, interoperable, and cyber-resilient foundation for deployment-ready predictive maintenance in modern substation automation systems.

1. Introduction

Electrical power systems are the backbone of modern economies, and substations play a central role in ensuring reliable, safe, and efficient electricity delivery [1]. However, the rapid integration of renewable energy, increasing grid complexity, and the accelerated aging of critical assets intensify the risk of faults and performance degradation—challenges widely reported in large-scale prognostic studies [2]. Traditional maintenance strategies, such as corrective maintenance and interval-based preventive maintenance, have therefore become insufficient for guaranteeing resilience in modern substation environments [3].
Recent surveys emphasize that data-driven PdM must progress beyond isolated local models into integrated, interoperable, and explainable intelligence capable of handling multi-modal signals, evolving operating conditions, and real-time situational constraints [1,4,5]. Within this context, semi-supervised anomaly detection, hybrid ML–DL modeling, and uncertainty-aware inference have emerged as essential capabilities for robust deployment in real industrial environments [6,7,8].
Recent DT research demonstrates reliability optimization in production lines, DT-based smart machine-tool control, comparative synchronization fidelity across DT engines, data reduction via metaheuristics for scalable simulation, and energy-aware dashboards for cyber–physical facilities—establishing a clear pathway to translate predictive analytics into trusted, auditable service layers [9,10,11,12,13,14]. These advances motivate a layered DT-enabled PdM architecture tailored to substation automation, where IEC 61850, CIM, and OPC UA Part 17 ensure semantic and operational interoperability, and IEC 62443 anchors cybersecurity and trust.
The proposed framework uniquely integrates the following: (i) semantic interoperability using IEC 61850/CIM/OPC UA Part 17; (ii) defense-in-depth cybersecurity enforcement aligned with IEC 62443 and NERC Critical Infrastructure Protection (CIP) Reliability Standard CIP-015-1 (Cyber Security – Internal Network Security Monitoring), which mandates internal network security monitoring to improve detection of anomalous or unauthorized activity; (iii) stacked ensemble models for enhanced prediction robustness; (iv) a decision-support layer capable of maintaining synchronized, real-time operation with the substation’s SCADA and operational technology (OT) infrastructure.
This work presents a deployment-oriented Digital Twin-enabled predictive maintenance (DT–PdM) architecture for substation automation that is aligned with IEC 61850, CIM, and OPC UA Part 17 and validated using multi-year, utility-grade operational data from the SS1 substation of the Badra Oil Field (2021–2025; 1 million records; 139 confirmed fault events), demonstrating feasibility under practical SCADA/OT constraints and cyber-secure governance.
To the best of our knowledge, this is among the first real-world validations of a standards-aligned DT–PdM architecture integrating IEC 61850/CIM/OPC UA with cybersecurity governance and large-scale OT data from an operating utility substation.
The main contributions of this study are summarized as follows:
  • We propose a five-layer, deployment-ready, standards-aligned Digital Twin-enabled predictive maintenance (DT–PdM) architecture for substation automation, unifying OT acquisition with semantic interoperability (IEC 61850, CIM, and OPC UA) and cybersecurity-aligned decision support (IEC 62443) within a single operational framework.
  • We demonstrate large-scale, utility-grade real-world validation on the SS1 substation of the Badra Oil Field using ≈1 million multivariate operational records and 139 confirmed fault events, moving beyond simulation-only or laboratory-scale DT-PdM studies and confirming feasibility under practical SCADA/OT constraints.
  • We benchmark RF, GBM, SVM, DNN, and a stacked ensemble and identify the best-performing operational model using the F1-score for imbalanced fault detection while explicitly accounting for inference feasibility within a 60 s supervisory monitoring loop.
  • We provide an operation-oriented, human-in-the-loop decision layer linking predictive outputs to maintenance prioritization through composite scoring and cyber-trust-aware governance, supporting interpretable, auditable, and regulation-aligned maintenance actions.

2. Predictive Maintenance and Digital Twin Integration in Substation Automation

Predictive maintenance (PdM) enables utilities to transition from corrective and time-based interventions toward proactive asset management by exploiting real-time monitoring data and advanced analytics for anomaly detection, remaining useful life (RUL) prediction, and outage prevention [15,16]. Recent advances in machine learning, particularly deep learning architectures such as Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNNs), and hybrid ensemble models, have demonstrated high performance in detecting multi-modal fault signatures under varying operating conditions [17,18].
Recent predictive maintenance studies have explored a wide range of artificial intelligence models, including recurrent neural networks such as LSTM and the Gated Recurrent Unit (GRU) for temporal dependency modeling, transformer-based architectures for long-range sequence learning [19,20], Bayesian and evidential deep learning for uncertainty-aware inference, and graph-based models for topology-aware asset representation [21]. While these approaches have demonstrated promising performance in specific contexts, their deployment in operational substation environments often involves increased model complexity, higher computational costs, substantial data-labeling requirements, and reduced interpretability [22].
In the present study, the selection of the Random Forest, Gradient Boosting, SVM, DNN, and stacked ensemble models is motivated by the need to balance predictive accuracy, robustness, explainability, and real-time deployability under utility-grade constraints. In particular, tree-based models provide transparent feature-importance insights, Deep Neural Networks capture nonlinear degradation patterns, and the ensemble framework enhances generalization stability without imposing excessive inference latency. This model selection strategy is therefore well suited for integration within a DT operating on SCADA-level data with minute-scale update cycles while maintaining compatibility with practical engineering and cybersecurity requirements.
Traditional PdM frameworks, however, often lack seamless integration with OT such as SCADA and Intelligent Electronic Devices (IEDs), resulting in limited real-time applicability [16]. This leads to suboptimal data utilization, weak situational awareness, and delayed decision-making in substation environments.
Digital Twin (DT) technology has emerged as an enabler for synchronized cyber–physical intelligence, providing dynamic virtual replicas of electrical assets capable of real-time emulation, system-level diagnostics, and maintenance scenario testing [3,23]. DT-driven cyber–physical synchronization allows utilities to evaluate fault propagation, validate recovery actions, and optimize maintenance decisions before actions occur in the physical environment [10]. Digital Twin-enabled PdM has also been demonstrated in other safety-critical infrastructure, such as railway turnout switch machines, where DT models were coupled with condition monitoring to support predictive decision-making and visualization [24].
For electrical substations, achieving deployment-ready PdM requires standards-based operational interoperability. IEC 61850 and CIM define unified information models and communication protocols for asset data exchange, while OPC UA Part 17 enables scalable publish–subscribe messaging for cross-platform DT integration [25,26]. Simultaneously, IEC 62443 and NERC CIP-015-1 enforce cyber-resilient operations to protect OT infrastructure against adversarial threats [27].
Despite the rapid progress of Digital Twin-enabled predictive maintenance frameworks, existing studies remain limited in several critical aspects when considered for real-world substation deployment. Many recent DT–PdM approaches primarily focus on algorithmic performance or simulation-based validation, often lacking full alignment with power-utility interoperability standards, integrated cybersecurity mechanisms, and verification using utility-grade operational data. In particular, interoperability across heterogeneous substation assets and enterprise systems, as well as compliance with industrial cybersecurity requirements, is frequently treated as a secondary consideration or omitted altogether. Moreover, a significant portion of the literature relies on laboratory-scale datasets or synthetic benchmarks, which limits the practical transferability of reported results to operational substation environments.
In addition, prior DT–PdM studies rarely address deployment constraints such as supervisory-cycle latency, human-in-the-loop governance, and auditable decision workflows required for safety-critical substation operation.
To address these gaps, the present work proposes a deployment-oriented DT–PdM architecture that unifies standards-based interoperability (IEC 61850, CIM, and OPC UA Part 17), defense-in-depth cybersecurity aligned with IEC 62443, and hybrid AI-based predictive analytics within a single, coherent framework. Unlike prior conceptual or simulation-centric approaches, the proposed framework is engineered and validated under practical SCADA/OT constraints, including a 60 s supervisory monitoring cycle and operator-approved advisory execution. The proposed architecture is validated using real utility substation data from the SS1 installation of the Badra Oil Field, thereby demonstrating not only predictive performance but also operational feasibility, cyber resilience, and scalability for next-generation substation automation systems.
Therefore, a unified PdM framework for substations must combine advanced analytics, real-time digital twinning, interoperability enforcement, and defense-in-depth cybersecurity. This study addresses these requirements through the proposed DT-enabled PdM architecture presented in Section 3.

3. Proposed Conceptual Architecture for Intelligent Substation Maintenance

Modern substations operate as cyber–physical energy systems integrating protection IEDs, SCADA, Phasor Measurement Units (PMUs), condition-monitoring units, and renewable interfaces. This complexity elevates failure propagation risks, cyber exposure, and uncertainty in asset aging. We therefore introduce a Digital Twin-enabled predictive maintenance (DT-PdM) architecture that fuses OT sensor data, standards-aligned semantic modeling, physics-aware digital replication, and hybrid deep learning prognostics. Unlike conceptual works limited to simulations, the framework is validated on approximately 1 million real operational measurements from SS1, demonstrating utility-grade feasibility [3]. Recent reviews emphasize that power-system Digital Twins must handle bidirectional data flow, real-time simulations, and interoperability across smart grid ecosystems [28].

3.1. Overview

The architecture enables synchronized asset awareness and predictive decision support across transformers, feeders, on-load tap-changer (OLTC) mechanisms, busbars, and protection equipment. Live OT telemetry from IEDs, transformer monitors, SCADA/PMUs, and environmental sensors is normalized and mirrored to a Digital Twin core for state estimation, contingency simulation, and decision pre-validation. This aligns with deep learning-driven predictive maintenance frameworks in energy systems, underscoring the value of synchronized DT–analytics loops [22]. Interoperability and security are enforced via IEC 61850; CIM (IEC 61970/61968); OPC UA PubSub over Time-Sensitive Networking (TSN), which provides deterministic Ethernet communication with bounded latency for time-critical industrial traffic; and IEC 62443 [15]. As illustrated in Figure 1, the proposed five-layer DT–PdM architecture establishes a closed-loop interaction between physical sensing, semantic interoperability, Digital Twin state replication, AI-driven analytics, and maintenance decision support.
Figure 2 provides a systematic view of the proposed DT–PdM process. The workflow emphasizes the closed-loop nature of the framework, where OT measurements are standardized and synchronized with the Digital Twin; AI predictions are generated within the operational update cycle, and decisions are executed with cybersecurity-aware governance and feedback. The end-to-end DT–AI advisory pipeline is designed to operate within the 60 s supervisory update cycle, ensuring practical real-time applicability without disrupting existing substation operations.
The DT + AI module operates as an advisory decision-support system. It generates fault-risk alerts, RUL projections, and prioritized maintenance recommendations; however, final decisions and switching/maintenance actions are approved and executed by authorized operators and maintenance engineers according to site procedures. Operator feedback and post-action records are then used to update the Digital Twin state and support continuous model monitoring within the OT governance framework.

3.2. Architectural Structure

The system comprises five coordinated layers: (i) physical sensing and acquisition, (ii) semantic and interoperability modeling, (iii) Digital Twin core, (iv) AI and analytics, and (v) decision support and automation. This layered abstraction preserves modularity, traceability, and cyber-resilient execution while aligning with industrial DT references in energy systems [23]. A full cross-layer mapping of the proposed architecture, including data pathways and functional decomposition, is presented in Appendix B.1. A unified Digital Twin architecture for cyber–physical energy systems has been proposed to ensure scalability and standards-based deployment [29]. Figure 2 illustrates that the inter-layer interfaces are contractually defined and standards-mapped for auditable operation.

3.3. Physical Sensing and Data-Acquisition Layer

OT signals include feeder/transformer primaries (V, I, PF, and harmonics), Dissolved Gas Analysis (DGA) readings, oil/winding temperatures, OLTC position counters, breaker health, alarms, and ambient conditions. Transport follows IEC 61850 MMS/GOOSE/Sampled Values to ensure deterministic, time-synchronized communication (IEC 61850 Standard). Edge pre-processing handles timestamp alignment, noise filtering, and quality tags before routing into the interoperability layer, which involves multi-sensor integration and real-time analytics for power transformers. Similar trends are highlighted in recent reviews of transformer condition monitoring under uncertainty [30].

3.4. Semantic and Interoperability Layers

Heterogeneous streams are harmonized through IEC 61850 logical nodes (device-level semantics) and CIM IEC 61970/61968 (system-level asset/topology semantics). OPC UA PubSub/TSN provides secure, deterministic transport between OT and DT computing domains, ensuring vendor neutrality, data lineage, and low-latency synchronization [26,31,32]. Figure 3 depicts the multi-standard interoperability flow from IEC 61850 to CIM and then to OPC UA/TSN. For an expanded interoperability view detailing standard-to-standard transitions and semantic bindings, see Appendix B.2.

3.5. Digital Twin’s Core Layer

The DT continuously mirrors live asset/network states, supports historian replay, performs physics-informed degradation modeling, and evaluates maintenance scenarios. Virtual commissioning and contingency analysis provide pre-execution validation along with audit logs for traceability [10,33]. These capabilities are consistent with the Digital Twin-based predictive maintenance frameworks surveyed by [6]. Physics-informed Digital Twins have been successfully applied to energy systems, for example, in wind farms, via physics-informed deep learning [29].
The Digital Twin engine is implemented as a custom Python-based DT core that is tightly integrated with the SCADA historian, rather than relying on proprietary electromagnetic transient simulators. The DT operates at the supervisory level and employs physics-inspired equivalent representations of transformers and feeders (e.g., thermal–electrical state proxies and loading and imbalance indicators) that are consistent with the available 1 min SCADA resolution. This modeling fidelity is sufficient to support condition monitoring, predictive maintenance analytics, and scenario validation while remaining computationally efficient for continuous deployment in operational technology environments.
DT state updates are synchronized to the SCADA time base using timestamped historian records, ensuring consistent alignment between physical measurements, derived features, and Digital Twin states across the closed-loop monitoring cycle.

3.6. AI and Analytics Layer

Hybrid models (stacked ensembles, Random Forests, SVMs, and DNNs) perform anomaly detection, fault classification, and remaining useful life (RUL) estimation in complex assets [34]. Uncertainty quantification and drift monitoring maintain reliability under non-stationary operating conditions [18,35]. A recent systematic review also highlights the role of hybrid and uncertainty-aware deep learning models in PdM and Digital Twin contexts [36]. Figure 4 summarizes the online inference pipeline, including ensemble fusion and uncertainty estimation.

3.7. Services and Decision-Support Layer

Outputs include prioritized maintenance windows, severity-ranked alerts, and automated Computerized Maintenance Management System (CMMS) work orders. Recommendations are validated via DT-based what-if analysis; similar DT-driven decision-support loops for predictive maintenance have been demonstrated in recent Digital Twin-based PdM frameworks [37,38]. Figure 5 further illustrates the DT-driven maintenance advisory loop integrated with the CMMS and operator acknowledgment.

3.8. Comparative Overview

Recent comparative studies in industrial fault detection highlight that ensemble learning architectures provide the most stable performance under variable operating conditions, further supporting the advantages demonstrated in this study [39]. Table 1 compares the performance of the candidate predictive models deployed in the DT, confirming the superiority of the stacked ensemble for real-world substation operation.
Recent PdM research has increasingly adopted deep sequence architectures (e.g., transformers and advanced recurrent variants) and uncertainty-aware learning (e.g., Bayesian or evidential deep learning) to improve anomaly detection and prognostic reliability. While these approaches can provide strong performance in specific domains, they typically introduce higher training and inference complexity and require careful calibration and monitoring for stable deployment. In the present study, the selected model set (RF/GBM/SVM/DNN/stacked ensemble) prioritizes balancing accuracy, interpretability, and operational feasibility for utility-grade Digital Twin integration [19,20,21,42].

3.9. Workflow and Data Synchronization

The end-to-end communication workflow illustrating continuous data exchange among OT systems, the DT core, and analytics modules is shown in Figure 6.
End-to-end latency is dominated by SCADA acquisition and historian refresh. During operational use, the closed-loop pipeline (SCADA acquisition → DT state update → AI inference → maintenance advisory) executes within the 60 s supervisory update cycle; the DT state update and model inference execute within a small fraction of the cycle, supporting near-real-time advisory deployment under OT governance.

3.10. Compliance and Scalability

Security follows IEC 62443 (zoning/conduits, least privilege, and secure engineering stations). OPC UA certificates, encrypted channels, and anomaly-based intrusion monitoring reinforce trust. Scalability is achieved via federated DT edge nodes and distributed learning for multi-site deployment [27,43], as illustrated in Figure 7. The architecture is scalable across substations through its standards-centric design and cloud–edge deployment flexibility.

3.11. Summary

This section detailed the five-layer architecture enabling secure, data-driven predictive maintenance in electrical substations. The subsequent section evaluates this framework using real operational data from SS1.

4. Case Study and Validation

4.1. Case Study Overview: Badra Oilfield Substation (SS1)

The proposed architecture was validated using field data from the SS1 Substation—GTG B (33/11.5 kV, 55/65 MVA)—as illustrated in Figure 8. It is located in the Central Processing Facility (CPF) of the Badra Oil Project in Iraq. A detailed as-built single-line diagram (SLD) for SS1 is provided in Appendix A, illustrating the full topology, feeder structure, and protection interfaces used for Digital Twin alignment.
This substation supplies three on-duty substations, forming a critical node within the 120 MW Gas Turbine Power Plant (GTPP).
The site represents a realistic hybrid environment combining legacy equipment and modern automation, making it suitable for evaluating deployment-ready PdM architectures [16].
Key monitored assets include two 33/11.5 kV step-down transformers (Oil Natural Air Forced (ONAF) cooling and OLTC control), circuit breakers, protection relays, busbars, and distributed sensors connected to the SCADA system.
Data collected (2021–2025) comprise approximately 1 million records across 14 parameters, including voltages, currents, power factors, temperatures, oil levels, and gas analysis values. In this study, “14 parameters” refers to the primary raw OT measurements acquired from SS1 (electrical, thermal, and environmental channels). The exported analysis dataset contains additional columns (e.g., time-index fields and label descriptors) required for supervised learning, traceability, and auditability, while the model input is formed as an engineered feature vector derived from the raw parameters (e.g., gradients and imbalance indices) for robust learning under SCADA-resolution constraints. The dataset was collected from the SS1 substation’s operational technology (OT) environment; no human participants or human data were involved, so participant selection criteria are not applicable and ethics approval was not required.
Instead, inclusion criteria were defined at the asset-channel and event-confirmation levels. The monitored scope was restricted to utility-grade SS1 equipment whose measurement points were consistently available through the SCADA/IED infrastructure, including the transformer, feeder, and environmental monitoring channels. Fault events were included only when they were confirmed by substation logs and maintenance/operational records, and the corresponding time windows were aligned with recorded disturbance intervals to ensure reliable label assignment (normal vs. fault) for model training and validation.
In this study, fault detection is formulated as a binary classification task, with labels indicating normal versus fault operating states, which were derived from confirmed SS1 fault logs and aligned operational time windows. A total of 139 fault events were confirmed from SS1 operational and maintenance logs and aligned to the 1 min SCADA historian timeline; each event was mapped to its recorded disturbance interval, and minute-level samples were labeled accordingly to ensure consistent supervision under utility monitoring constraints. In parallel, the framework supports remaining useful life (RUL) estimation as a continuous regression task, where the target variable represents the estimated time-to-failure or degradation horizon inferred from historical fault occurrences and condition trends. For RUL supervision, event-aligned pre-fault windows were defined to represent incipient degradation prior to logged fault onset (see the RUL-labeling protocol later in this subsection), thereby enabling consistent time-to-failure learning from SCADA-resolution sequences.
Measurements were recorded under normal plant operating conditions from the SS1 OT monitoring stack (SCADA/IEDs/sensors) at a 1 min sampling resolution, covering electrical, thermal, and environmental variables used for DT synchronization and PdM analytics. All records were time-stamped and aggregated at the SCADA level, reflecting real-world constraints of utility monitoring (e.g., supervisory sampling rather than waveform-level transients). The dataset includes both steady-state operating periods and fault-affected intervals derived from SS1 operational logs.
Each of the 139 confirmed SS1 fault events was mapped to the 1 min OT/SCADA timeline using the event start/end times recorded in operational logs. A time-window labeling scheme was then applied to construct supervised samples: minutes within the event-aligned fault interval were labeled as a fault, while minutes outside those intervals were labeled as normal. To support prognostic use, pre-fault windows were additionally defined to represent incipient degradation prior to logged fault onset, as described in the RUL-labeling protocol below.
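The time-window labeling scheme can be illustrated with a minimal sketch. It assumes the event log is available as a table with `start` and `end` timestamps; the function name `label_fault_windows`, the column names, and the example timestamps are hypothetical and are not taken from the SS1 dataset.

```python
import pandas as pd


def label_fault_windows(index: pd.DatetimeIndex, events: pd.DataFrame) -> pd.Series:
    """Return 1 for minutes inside any logged [start, end] interval, else 0."""
    labels = pd.Series(0, index=index, dtype="int8", name="fault")
    for start, end in zip(events["start"], events["end"]):
        # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
        labels.loc[start:end] = 1
    return labels


# Hypothetical 1 min timeline and two logged events (illustrative values only).
timeline = pd.date_range("2023-01-01 00:00", periods=10, freq="min")
events = pd.DataFrame(
    {
        "start": pd.to_datetime(["2023-01-01 00:02", "2023-01-01 00:07"]),
        "end": pd.to_datetime(["2023-01-01 00:03", "2023-01-01 00:07"]),
    }
)
labels = label_fault_windows(timeline, events)
```

Because normal minutes far outnumber fault minutes under this scheme, the resulting label vector keeps the operational class imbalance rather than rebalancing it at labeling time.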
For RUL regression, ground-truth targets were constructed using an event-referenced time-to-failure definition: for each minute within a pre-fault horizon preceding confirmed fault onset, the RUL target equals the remaining time (in minutes) until the next logged fault event. For samples outside the pre-fault horizon or during long healthy operating periods, RUL targets were capped at the maximum horizon to avoid unbounded values and to reflect supervisory-level prognostic utility with SCADA resolution. Overlapping windows were handled by assigning each timestamp to the nearest subsequent fault event.
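A minimal sketch of the capped, event-referenced RUL target is given below. The function name `rul_targets`, the horizon length, and the example onset time are assumptions for illustration; the horizon actually used in the study is not stated in this section. Each timestamp is assigned to the nearest subsequent fault onset, and samples with no subsequent onset receive the cap.

```python
import numpy as np
import pandas as pd


def rul_targets(index: pd.DatetimeIndex, fault_onsets, horizon_min: int) -> pd.Series:
    """Minutes until the next fault onset, capped at horizon_min."""
    # Convert timestamps to integer minutes since the epoch.
    ts = index.values.astype("datetime64[m]").astype("int64")
    onsets = np.sort(
        pd.DatetimeIndex(fault_onsets).values.astype("datetime64[m]").astype("int64")
    )
    # Position of the first onset at or after each timestamp.
    nxt = np.searchsorted(onsets, ts, side="left")
    rul = np.full(ts.shape, float(horizon_min))
    has_next = nxt < len(onsets)
    rul[has_next] = onsets[nxt[has_next]] - ts[has_next]
    # Cap long healthy periods at the maximum horizon.
    return pd.Series(np.minimum(rul, horizon_min), index=index, name="rul_min")


# Hypothetical example: one fault onset at 00:05 and a 3 min horizon.
timeline = pd.date_range("2023-01-01 00:00", periods=10, freq="min")
rul = rul_targets(timeline, ["2023-01-01 00:05"], horizon_min=3)
```

The sketch does not treat minutes inside a fault interval specially; how those samples are handled for regression would need to follow the study's own protocol.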

4.2. Data Pre-Processing and Feature Engineering

The pre-processing workflow consisted of (i) semantic alignment of tags/signals using IEC 61850-consistent naming, (ii) time synchronization and resampling to a uniform 1 min grid, (iii) missing-value treatment and reconstruction (as described below), (iv) outlier screening and removal to suppress non-physical spikes, (v) normalization to the [0, 1] range for model stability, and (vi) feature engineering to derive thermal gradients and imbalance indicators that are physically meaningful for incipient fault detection and RUL estimation. Following cleaning and synchronization, the learning input was formed as an engineered feature vector (33 features) by combining raw OT measurements with derived indicators (e.g., thermal gradients and differences, imbalance metrics, vibration severity proxies, and operational stability descriptors). This design preserves physical interpretability while enabling robust classification under SCADA-resolution constraints. Noise mitigation was handled at this stage through deployment-consistent pre-processing rather than through a dedicated learned denoising network. This pipeline ensures that the Digital Twin receives a clean, synchronized, and deployment-consistent data stream suitable for closed-loop operation.
The dataset underwent a rigorous cleaning and alignment procedure following IEC 61850 semantic naming standards. This pre-processing-based noise handling strategy was selected to preserve physical interpretability and to avoid introducing additional model complexity that may hinder reproducibility in utility-grade SCADA environments. Missing values were reconstructed using a hybrid Kalman filter and spline interpolation approach, while outliers were suppressed using an IQR–Z-score fusion method that is consistent with industrial PdM practices [16].
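One way to combine IQR and Z-score screening is sketched below. The fusion rule (flag a sample only when both tests agree) and the thresholds `iqr_k=1.5` and `z_thresh=3.0` are assumptions chosen for illustration; the exact rule and thresholds used in the study are not given in this section, and the example signal is synthetic.

```python
import numpy as np


def iqr_zscore_outliers(x, iqr_k: float = 1.5, z_thresh: float = 3.0) -> np.ndarray:
    """Boolean mask of samples flagged by BOTH the IQR fence and the Z-score test."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    iqr_flag = (x < q1 - iqr_k * iqr) | (x > q3 + iqr_k * iqr)
    mu, sigma = np.nanmean(x), np.nanstd(x)
    # With sigma == 0 the comparison is False everywhere, so nothing is flagged.
    z_flag = np.abs(x - mu) > z_thresh * sigma
    return iqr_flag & z_flag  # assumed fusion rule: both detectors must agree


# Synthetic oil-temperature-like channel with one non-physical spike.
signal = np.concatenate([np.linspace(59.0, 61.0, 49), [500.0]])
mask = iqr_zscore_outliers(signal)
cleaned = np.where(mask, np.nan, signal)  # flagged samples become gaps to reconstruct
```

Requiring agreement makes the screen conservative, which suits channels where genuine fault excursions must not be removed as noise; an either-test rule would be more aggressive.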
To enhance model robustness, derived features included the following:
  • Electrical domain, including phase voltages, current densities, harmonic distortion, frequency deviation, and apparent power.
  • Thermal domain, including oil/winding temperature gradients, ambient compensation factors, and the transformer thermal stress index.
  • Reliability domain, including the failure rate (λ), MTBF, and RUL estimation priors based on degradation signatures [17].
Each parameter was normalized to the [0, 1] range and time-synchronized at the 60 s resolution for AI training. The mathematical definitions for normalization, smoothing, interpolation, and outlier removal are summarized in Appendix C.
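The derived-feature and normalization steps can be sketched as follows. The column names (`oil_temp`, `winding_temp`, `v_a`, `v_b`, `v_c`) and the specific definitions are assumptions: the thermal gradient is taken as the winding-minus-oil temperature difference, and the imbalance index as the maximum phase deviation from the mean divided by the mean. The definitions used in the study are those of Appendix C and may differ.

```python
import pandas as pd


def add_derived_features(df: pd.DataFrame) -> pd.DataFrame:
    """Append an assumed thermal gradient and voltage imbalance index."""
    out = df.copy()
    out["thermal_gradient"] = out["winding_temp"] - out["oil_temp"]
    phases = out[["v_a", "v_b", "v_c"]]
    mean_v = phases.mean(axis=1)
    out["voltage_imbalance"] = phases.sub(mean_v, axis=0).abs().max(axis=1) / mean_v
    return out


def min_max_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Scale every column to [0, 1]; constant columns map to 0."""
    span = (df.max() - df.min()).replace(0, 1.0)  # avoid division by zero
    return (df - df.min()) / span


# Illustrative values only, not SS1 measurements.
raw = pd.DataFrame(
    {
        "oil_temp": [50.0, 55.0, 60.0],
        "winding_temp": [60.0, 70.0, 80.0],
        "v_a": [11.0, 11.1, 11.0],
        "v_b": [11.0, 11.0, 11.0],
        "v_c": [11.0, 10.9, 11.0],
    }
)
features = add_derived_features(raw)
scaled = min_max_normalize(features)
```

In a deployed pipeline the minimum and maximum would be fitted on the training split only and reused at inference time, so that test-period values do not leak into the scaling.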
Remaining useful life (RUL) is defined as the time to the next confirmed fault, measured on the 1 min timeline. For each sample within a pre-fault horizon, the RUL label corresponds to the remaining minutes until the associated fault onset. Samples outside any pre-fault horizon were not assigned RUL targets (or were excluded from RUL regression), preventing ambiguous supervision during healthy steady-state operation. To reduce the risk of target leakage, label fields and maintenance-log descriptors were used only for supervised annotation and traceability and were excluded from the predictive feature set used by the learning models.

4.3. Model Development and Configuration

Five data-driven predictive models were developed using the same training/testing dataset (80/20 split with 5-fold cross-validation):
  • Random Forest (RF);
  • Gradient Boosting Machine (GBM);
  • Support Vector Machine (SVM);
  • Deep Neural Network (DNN);
  • Stacked Ensemble (RF + GBM + DNN).
Hyperparameters were optimized using Bayesian search coupled with 5-fold cross-validation to promote robust generalization under operational variability. The DNN (four hidden layers, rectified linear unit (ReLU) activations, and dropout = 0.2) was implemented using TensorFlow, while tree-based models and the ensemble meta-learner were implemented in Scikit-Learn. The DNN is a fully connected feed-forward model, an architecture chosen to limit overfitting under operational variability. For fault detection, the output layer uses a sigmoid activation function to model the probability of the fault class (normal vs. fault), and the network is trained by minimizing the binary cross-entropy loss. For the RUL estimation task, the regression head uses a linear output and is trained using the mean squared error (MSE).
The Bayesian search space covered key algorithm-specific parameters including network depth, dropout rate, and learning rate for the DNN, as well as tree depth, the number of estimators, and ensemble weighting for the classical models, with the final configuration selected based on cross-validated performance. Class imbalance was mitigated via stratified sampling and adaptive weight scaling [18]. Across repeated optimization runs, the selected hyperparameter regions were generally consistent, and the finalized DNN and ensemble configurations exhibited limited sensitivity to initialization, indicating stable model selection. The final model configuration is illustrated in Figure 15, while tuning details are reported in Appendix C.3.
The benchmark set (the RF, the GBM, the SVM, the DNN, and a stacked ensemble) was selected to cover complementary trade-offs relevant to substation deployment: (i) tree-based learners (RF/GBM) provide strong performance on tabular SCADA features with transparent feature-importance explanations; (ii) the SVM serves as a classical margin-based baseline with stable behavior under moderate dimensionality; (iii) the DNN captures nonlinear interactions among thermal, vibration, and imbalance indicators that are difficult to model explicitly; and (iv) the stacked ensemble combines heterogeneous learners to improve robustness to operating variability and reduce sensitivity to individual model bias. This specific ensemble composition was chosen to leverage complementary inductive biases—variance reduction (RF), nonlinear partitioning (GBM), and higher-order feature interaction modeling (DNN)—thereby improving robustness under non-stationary substation operating conditions while preserving interpretability and deployment feasibility. This design provides a practical balance between accuracy (F1-score under class-imbalance conditions), interpretability, and inference feasibility within the 60 s monitoring loop.
Recent deep sequence models (e.g., LSTM/transformers) can be advantageous for high-frequency waveforms or long-horizon temporal dependencies; however, in the present study, the primary data stream utilizes 1 min SCADA/OT measurements, and the selected models provide a stronger deployment trade-off (training/inference cost, interpretability, and reproducibility) while achieving high fault-detection performance. Accordingly, sequence-heavy architectures are considered complementary benchmarking candidates or future extensions rather than primary deployment models for the present supervisory-resolution dataset.

4.4. Model Comparison and Evaluation

Five supervised models—the Random Forest (RF), the Gradient Boosting Machine (GBM), the Support Vector Machine (SVM), the Deep Neural Network (DNN), and a stacked ensemble (RF + GBM + DNN)—were benchmarked using the same standardized dataset with an 80/20 train–test split and five-fold cross-validation. After event-window labeling, the resulting dataset comprised 802,695 fault-labeled samples and 197,305 normal samples ( 1,000,000 total records). The 80/20 split resulted in 671,756 fault samples and 128,244 normal samples in the training set, and 167,939 fault samples and 32,081 normal samples were included in the held-out test set. The test set was kept unchanged to preserve the realistic operational class imbalance observed in the SS1 substation environment.
The dataset was partitioned using an 80/20 train–test split, with five-fold cross-validation applied on the training portion for robust model selection. To address class imbalance in fault events, minority-class balancing was applied within the training folds (e.g., SMOTE/oversampling), while no resampling was applied to the test set to ensure unbiased operational evaluation. Figure 9 illustrates the Receiver Operating Characteristic (ROC) curves, where the DNN and ensemble achieved nearly perfect separation (AUC > 0.99), outperforming tree-based models (AUC 0.97–0.98) and the SVM (AUC = 0.95). Figure 10 displays the confusion matrices, confirming the stacked ensemble’s superior class balance with near-zero false negatives, whereas the SVM produced minor misclassifications under transient conditions [17]. Given the imbalanced nature of fault detection, the F1-score is treated as the primary objective for model selection and comparative evaluation, with AUC–ROC reported as a complementary discrimination measure.
Additional mathematical formulations used in classification, anomaly scoring, RUL estimation, and reliability analysis are presented in Appendix D for reproducibility.
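The point that balancing is applied inside the training folds only can be sketched in plain Python. Random duplication of minority samples is used here as a simple stand-in for SMOTE, and the helper names, class counts, and seed are hypothetical.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, preserving the class ratio per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


def oversample_to_balance(indices, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes match the largest."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(m) for m in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical imbalanced label vector: 40 samples of one class, 10 of the other.
labels = [1] * 40 + [0] * 10
folds = stratified_folds(labels, k=5)

summary = []
for held_out, val_idx in enumerate(folds):
    train_idx = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
    # Balancing touches the training indices only; the validation fold keeps
    # its original class ratio, mirroring the untouched held-out test set.
    train_bal = oversample_to_balance(train_idx, labels)
    n_pos = sum(labels[i] for i in train_bal)
    summary.append((len(train_bal), n_pos, len(val_idx),
                    set(train_bal).isdisjoint(val_idx)))
```

Resampling before the split would let copies of the same minority sample appear on both sides of a fold boundary, which inflates validation scores. Keeping the resampling inside the loop avoids that.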
As shown in Figure 9, the ROC analysis indicates that both the DNN and the stacked ensemble achieve very strong separability; however, the stacked ensemble provides a consistently higher operating margin, particularly in the high-specificity regime that is relevant for minimizing false alarms while preserving detection sensitivity. Under class-imbalance conditions, precision–recall behavior further highlights the advantage of the ensemble, indicating improved precision retention at high recall compared with the DNN. This performance gap suggests that ensemble fusion better captures heterogeneous fault signatures and reduces sensitivity to transient operational variability.
To further assess model behavior under class-imbalance conditions, Figure 10 provides a comparative precision–recall view of the evaluated models. As shown, the Random Forest and Gradient Boosting models achieve perfect or near-perfect precision and recall, reflecting their strong discrimination capability on the evaluated dataset. The Support Vector Machine (SVM), however, exhibits a noticeable reduction in recall (0.9835) despite maintaining high precision (0.9974), indicating a higher tendency to miss fault instances under certain operating conditions. In contrast, the Deep Neural Network and the stacked ensemble consistently maintain both high precision and high recall (0.999–1.0), demonstrating a more balanced trade-off between false-alarm suppression and fault-capture sensitivity.
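Because the F1-score is the harmonic mean of precision and recall, the SVM figures quoted above can be combined directly. The sketch below is only that arithmetic; the result is not a value taken from Table 2.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall quoted in the text for the SVM.
svm_f1 = f1_score(precision=0.9974, recall=0.9835)
print(round(svm_f1, 4))  # 0.9904
```

The harmonic mean is pulled toward the smaller of the two inputs, so the lower recall costs the SVM more than an arithmetic average would suggest.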
From an operational perspective, this balanced behavior is critical for substation predictive maintenance, as it minimizes missed fault events while avoiding excessive false positives. The ensemble’s stable performance across both metrics further supports its selection as the preferred model for deployment within the Digital Twin-enabled decision-support loop, where reliability and risk-aware operation are paramount.
The confusion matrix results show that the stacked ensemble yields extremely low false-negative behavior, which is critical in safety- and reliability-sensitive substation environments where missed fault detection can lead to cascading damage, forced outages, and higher restoration costs. From an operational perspective, the observed detection behavior supports risk-aware deployment, where high recall (fault capture) is prioritized, while maintaining strong discrimination performance. This strengthens confidence in the suitability of the ensemble as the operational model embedded in the Digital Twin decision loop.
Model interpretability is assessed using feature-importance analysis, as shown in Figure 11, which serves as the primary explainability artifact in this study, summarizing the dominant predictors that drive fault-related discrimination in the DT–PdM pipeline.
The importance rankings in Figure 12 provide operationally meaningful insights into the SS1 condition dynamics. Thermal indicators (oil and winding temperatures and their derived gradients/differences) appear consistently among the top predictors, which is physically consistent with insulation aging and hotspot-driven degradation processes in transformer–feeder paths under variable loading conditions. Likewise, vibration-related features contribute strongly, aligning with mechanical looseness, cooling-system anomalies, or incipient component stress that often manifests prior to discrete fault events.
Electrical imbalance features (e.g., phase-current/voltage asymmetry and derived imbalance indices) are also informative because unbalanced loading, loose connections, and contact deterioration can produce asymmetric current patterns and elevated localized heating. From a maintenance perspective, these results support prioritizing (i) thermal monitoring trends, (ii) vibration excursions, and (iii) imbalance alarms as interpretable early-warning signals within the DT decision-support workflow.
Precision–recall curves comparing fault detection performance (Figure 10 and Table 2) confirm that the stacked ensemble achieved the best overall performance in terms of the F1-score (0.98), alongside high accuracy (97.5%), precision (0.98), recall (0.97), and AUC (0.995), demonstrating robust discrimination under imbalanced fault conditions [15].
For visual clarity, Figure 10 presents representative evaluation plots generated on a held-out subset, whereas all quantitative metrics reported in Table 2 and the cross-validation statistics were computed based on the full test partition and the five-fold cross-validation procedure.
Cross-validation results demonstrated stable generalization of the proposed framework, with fold-to-fold performance variability remaining below 1.1% ($\alpha < 1.1\%$) across five folds, indicating that the reported results are not driven by a single data split and remain robust under resampling.
Figure 13 provides an integrated, multi-perspective comparison of model performance, synthesizing the quantitative results reported in Table 2 into complementary visual forms. The grouped bar chart summarizes accuracy, precision, recall, and F1-scores across models, confirming the consistently superior balance achieved by the stacked ensemble. The AUC–ROC comparison highlights strong discriminative capability for all tree-based and ensemble models while revealing comparatively reduced separability for the SVM and the standalone DNN. The heat map offers a compact overview of metric-wise dominance, clearly illustrating the ensemble’s uniformly high performance across all evaluation criteria. Finally, the radar chart visualizes the trade-off among accuracy, precision, recall, and the F1-score, where the stacked ensemble encloses the largest area, indicating the most balanced and operationally robust behavior under imbalanced fault-detection conditions.

4.5. Integration Within the Digital Twin Core

The stacked ensemble model (identified in Section 4.4 as the best performer) is embedded into the Digital Twin (DT) core to enable synchronized real-time analytics for SS1 substation assets.
For clarity and reproducibility, the finalized Deep Neural Network architecture and the stacked ensemble configuration employed in this study are summarized concisely in Table 3, which consolidates the input representation, the number of hidden layers, activation functions, the regularization strategy, output heads for classification and RUL regression, loss functions, the optimizer choice, and the ensemble composition. This compact representation complements the textual description and provides a deployment-oriented overview of the model design adopted in the DT–PdM framework.
Figure 14 shows the operational performance visualization of the stacked ensemble within the Digital Twin core. Figure 15 illustrates the architecture of the Deep Neural Network (DNN) used for fault classification and RUL estimation, while convergence behavior during training is validated in Figure 16, confirming a stable reduction in training and validation loss without overfitting.
The operational pipeline of the AI engine within the DT is presented in Figure 16, which illustrates the input signals processed through the RF, GBM, and DNN branches and aggregated via a meta-classifier for fault prediction and RUL estimation.
The final DNN contained approximately 13,665 trainable parameters, which is modest relative to the dataset scale and contributes to stable generalization under operational variability.
Real-time data streams from sensors and SCADA are encoded following IEC 61850 semantic conventions and transferred through OPC UA Part 17 Pub/Sub [26], enabling secure bidirectional communication with the DT.
Within this loop, the AI continuously updates the fault probability $P_f(t)$, the remaining useful life $\mathrm{RUL}(t)$, and the cyber-trust score $\eta_c(t)$ every 60 s and transmits maintenance advisories to the operator’s dashboard.
These metrics are fused through the Composite Maintenance Decision Score:
$$\psi_t = \alpha\,\mathrm{RUL}(t) + \beta\,\bigl(1 - P_f(t)\bigr) + \gamma\,\eta_c(t)$$
where $\mathrm{RUL}(t)$ is the remaining useful life, $P_f(t)$ is the predicted fault probability, and $\eta_c(t)$ is the cybersecurity-trust score derived from IEC 62443 compliance checks [25,27]. A complete derivation of the composite score and its weighting methodology is provided in Appendix C.
The weights $\alpha$, $\beta$, and $\gamma$ were chosen to reflect the relative operational importance of (i) prognostic urgency (RUL), (ii) immediate fault likelihood, and (iii) cyber–physical trustworthiness while maintaining a bounded score. All three terms were normalized to [0, 1] prior to fusion, and $\alpha + \beta + \gamma = 1$ was enforced to preserve interpretability. Unless otherwise specified, the initial weights were set through expert-informed engineering judgment, consistent with substation maintenance practice, and then verified through a sensitivity check to ensure that the decision ranking is not dominated by any single component under typical operating regimes.
The $\eta_c(t)$ term was operationalized as a normalized scalar in the range of [0, 1], as derived from IEC 62443-aligned monitoring indicators within the OT security zone. Specifically, $\eta_c(t)$ aggregates (a) authentication/authorization status, (b) integrity/anomaly flags (e.g., spoofing or command-injection attempts), and (c) network-policy compliance (zoning/segmentation and CIP-015-1 monitoring requirements). Each indicator is mapped to a penalty score and combined as $\eta_c(t) = 1 - \sum_i \omega_i\, I_i(t)$, where $I_i(t) \in \{0, 1\}$ indicates the presence of a security violation at time $t$, while $\sum_i \omega_i = 1$. Thus, $\eta_c(t)$ approaches one under normal trusted operation and decreases as security anomalies or policy violations are detected.
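The cyber-trust aggregation can be sketched as a weighted penalty sum. This assumes the form in which one minus the weighted sum of binary violation indicators gives the trust score, with weights summing to one. The indicator names and weight values are hypothetical and are not taken from the study.

```python
def cyber_trust(indicators, weights):
    """Compute eta_c = 1 - sum_i w_i * I_i for binary violation indicators I_i."""
    if set(indicators) != set(weights):
        raise ValueError("indicators and weights must use the same keys")
    if any(flag not in (0, 1) for flag in indicators.values()):
        raise ValueError("indicators must be 0 (no violation) or 1 (violation)")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    penalty = sum(weights[name] * indicators[name] for name in weights)
    return 1.0 - penalty


# Hypothetical weights for the three indicator groups named in the text.
weights = {"auth": 0.4, "integrity": 0.4, "network_policy": 0.2}

trusted = cyber_trust({"auth": 0, "integrity": 0, "network_policy": 0}, weights)
degraded = cyber_trust({"auth": 0, "integrity": 1, "network_policy": 1}, weights)
```

With no violations the score is exactly 1. An integrity flag together with a network-policy violation lowers it to about 0.4 under these illustrative weights.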
This score forms the basis for predictive intervention timing, visualized for operators through the DT maintenance dashboard.
A conceptual 3D visualization of the synchronized physical–virtual environment and decision interface is provided, demonstrating the deployment applicability of the system within industrial operations.
To ensure robust cyber–physical integration, IEC 62443 zoning and CIP-015-1 network security monitoring are applied to secure data sources and prevent command injection or unauthorized manipulation of health indicators.
This alignment ensures that predictive reasoning remains trustworthy, in compliance with national grid cybersecurity mandates.
The finalized DNN architecture contains 13,665 trainable parameters, corresponding to an approximate parameter memory footprint of 0.052 MB (assuming 32-bit floating-point storage). This size enables execution of inference within the 60 s monitoring loop and supports deployment either at the OT edge gateway or within the DT service layer, depending on cybersecurity zoning and computation available.
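The footprint figure follows from simple arithmetic, sketched below. The only inputs are the parameter count and the 32-bit storage assumption stated above. The quoted 0.052 MB corresponds to dividing by 2^20 bytes.

```python
N_PARAMS = 13_665      # trainable parameters reported for the DNN
BYTES_PER_PARAM = 4    # 32-bit floating-point storage

footprint_bytes = N_PARAMS * BYTES_PER_PARAM
footprint_mib = footprint_bytes / 2**20

print(footprint_bytes)          # 54660
print(round(footprint_mib, 3))  # 0.052
```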

4.6. Validation and Results

The proposed Digital Twin–AI predictive maintenance framework was validated using 139 real fault events extracted from SS1 Substation logs (2021–2025) and 1 million multivariate operational readings covering transformer, feeder, and environmental parameters. This validation approach aligns with methodologies reported in industrial DT–PdM studies, where Digital Twins are used to verify predictive models against real-world disturbance events and operational signatures [36,44]. In this study, fault classification was formulated as a binary task (normal vs. fault). Given the imbalanced nature of fault events, the F1-score is adopted as the primary performance objective for model selection and comparative evaluation, while accuracy and AUC–ROC are reported as complementary metrics.
Leveraging the stacked ensemble model identified in Section 4.4 as the most effective learner according to the F1-score, the system demonstrated substantial operational improvements. Field deployment resulted in a 28% reduction in unplanned outages and a 22% decrease in maintenance cost, outcomes consistent with reductions reported in large-scale DT-enabled maintenance studies across power and industrial systems [36,44].
The reported reductions were computed using SS1 operational and maintenance records collected over the 2021–2025 period, by comparing observed outcomes during DT-assisted advisory operation against historical baseline behavior within the same substation and equipment context. Unplanned outages were defined as forced trips or unscheduled interruptions recorded in SS1 operational logs, while the maintenance cost reflects corrective maintenance efforts aggregated over labor, spare-part usage, and intervention-related activities documented during the same horizon. Because this assessment is based on real-world field operations rather than a controlled experimental setup, the reported percentages are interpreted as observed site-level improvements associated with DT-assisted early warnings and maintenance prioritization. Other concurrent factors, such as incremental operational adjustments or routine equipment upgrades, may also influence these metrics; therefore, the results are intended to demonstrate deployment feasibility and magnitude of effect rather than to assert isolated causal attribution.
The predictive module generated early anomaly warnings several hours before SCADA alarm thresholds were reached. This behavior was aligned with findings from Digital Twin-enhanced early-warning systems in smart-grid and industrial environments, where early deviation detection is enabled through hybrid analytics and synchronized DT state estimation [44].
The Composite Maintenance Decision Score (Equation (1)) provided robust prioritization under thermal-stress, phase-imbalance, and load-transient conditions. This multi-objective scoring approach supports proactive workforce scheduling and spare-part preparation, reflecting best practices identified in systematic DT–PdM reviews [36].
When deployed, the DT interface presents recommendations as operator-facing advisories integrated with maintenance workflows (e.g., CMMS), while authorization remains human-controlled. Cyber–physical security validation was conducted using IEC 62443 zoning principles and CIP-015-1 internal network monitoring requirements. Stress testing confirmed resistance to command injection, data-spoofing attempts, and unauthorized access attempts into the OT network. These findings align with cybersecurity challenges and mitigation approaches discussed in AI-enabled maintenance reviews [45].
Cross-validation results demonstrated high model generalization, with a standard deviation of $\alpha < 1.1\%$ across five folds, outperforming benchmarks highlighted in recent PdM survey papers focused on deep learning model reliability in dynamic power-system environments [15].
The composite score increases as fault likelihood rises and RUL decreases; if $\eta_c(t)$ degrades due to OT security anomalies, $\psi_t$ is down-weighted (or triggers a security-first escalation), preventing unsafe automated actions and reinforcing human-in-the-loop governance. During deployment, $\psi_t$ is compared against predefined advisory thresholds (e.g., $\psi \ge \tau_1$ triggers inspection; $\psi \ge \tau_2$ triggers maintenance scheduling), with final authorization by operators. The values reported in Table 4 are illustrative and normalized and are intended to demonstrate the temporal behavior of the composite decision score rather than reproduce a specific raw measurement trace.
The composite maintenance decision score was operationalized using normalized components and a lightweight linear fusion rule:
$$\psi_t = \alpha\,\hat{p}_t + \beta\,r_t - \gamma\,\eta_c(t)$$
where $\hat{p}_t \in [0, 1]$ is the model-derived fault likelihood, $r_t \in [0, 1]$ represents the normalized RUL risk (higher values indicate higher urgency), and $\eta_c(t) \in [0, 1]$ is a cyber-trust indicator derived from OT security monitoring status.
In this study, the weights were set according to engineering judgment to emphasize the operational risk while retaining a security-first modulation (e.g., $\alpha = 0.45$, $\beta = 0.45$, and $\gamma = 0.10$), yielding an interpretable score suitable for real-time advisory use. For example, at the 1 h time point (Table 4), $\hat{p}_t = 0.62$, $r_t = 0.65$, and $\eta_c(t) = 0.90$ produce $\psi_t = 0.45 \times 0.62 + 0.45 \times 0.65 - 0.10 \times 0.90 \approx 0.482$, illustrating how increasing fault likelihood and RUL risk elevates urgency, while degrading cyber trust down-weights the composite score to prevent unsafe action escalation.
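The worked example can be reproduced with a few lines of Python. The sketch assumes the linear fusion rule with the cyber-trust term subtracted, which is the sign that reproduces the quoted value of 0.482. The threshold values `tau_inspect` and `tau_schedule` are hypothetical placeholders, since the text does not give the thresholds used on site.

```python
def composite_score(p_fault, rul_risk, cyber_trust,
                    alpha=0.45, beta=0.45, gamma=0.10):
    """Linear fusion: psi = alpha * p_hat + beta * r - gamma * eta_c."""
    for name, value in (("p_fault", p_fault), ("rul_risk", rul_risk),
                        ("cyber_trust", cyber_trust)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be normalized to [0, 1]")
    return alpha * p_fault + beta * rul_risk - gamma * cyber_trust


def advisory(score, tau_inspect, tau_schedule):
    """Map the score to an operator-facing advisory; authorization stays human."""
    if score >= tau_schedule:
        return "schedule maintenance"
    if score >= tau_inspect:
        return "inspect"
    return "monitor"


# Component values from the worked example in the text.
psi = composite_score(p_fault=0.62, rul_risk=0.65, cyber_trust=0.90)
print(round(psi, 4))  # 0.4815, reported as 0.482 in the text

# Hypothetical thresholds, for illustration only.
action = advisory(psi, tau_inspect=0.40, tau_schedule=0.60)
```

Under these placeholder thresholds the example score falls in the inspection band, so the output is an advisory for the operator and not an automated action.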
Finally, the synchronized physical–virtual visualization within the Digital Twin interface confirmed practical applicability for industrial operators, providing real-time diagnostic indicators, RUL projections, and cyber-trust scores, thereby supporting situational awareness and informed maintenance decision-making.

4.7. Discussion

The experimental results confirm that the proposed Digital Twin-enabled predictive maintenance (DT-PdM) architecture provides a significant advancement over existing PdM frameworks in modern substation environments. Compared with recent DT-based maintenance studies, such as the operational DT architectures reviewed in Systematic Review of Predictive Maintenance and Digital Twin Technologies [36] and the AI-guided DT implementations summarized in State-of-the-Art Review: Digital Twins to Support AI-Guided Predictive Maintenance [46], the presented system integrates a more comprehensive and deployment-oriented stack encompassing standards-aligned interoperability, cyber resilience, runtime feasibility, and hybrid analytics validated on utility-grade operational data.
The case study at SS1 demonstrated notable practical benefits, including a 28% reduction in unplanned outages and a 22% reduction in maintenance costs over the 2021–2025 evaluation horizon, following integration of the stacked ensemble within the DT core. These gains exceed those reported in comparable industrial deployments documented by [36] and exhibit parallel trends to the predictive maintenance gains observed in deep learning-driven reviews [15]. The observed improvements are attributable to early anomaly detection sensitivity, RUL-informed prioritization, and synchronized cyber–physical reasoning enabled by the DT feedback loop, rather than algorithmic accuracy alone.
A defining strength of the proposed architecture lies in its explicit compliance with power-utility interoperability and cybersecurity standards, including IEC 61850 for semantic data exchange, CIM IEC 61970/61968 for system-level modeling, OPC UA Part 17 for deterministic publish–subscribe synchronization, and IEC 62443/CIP-015-1 for OT cybersecurity. Existing PdM models frequently lack deployment readiness due to incomplete interoperability and weak cyber protection, an issue extensively emphasized in the power-system cybersecurity review [45]. By explicitly incorporating zoning, conduit segmentation, certificate-based trust, and secure update pathways, this framework addresses these longstanding reliability and security gaps.
Furthermore, the system’s generalization robustness, as observed from its low cross-validation deviation ($\alpha < 1.1\%$ across five folds), confirms the stability of the stacked ensemble strategy in dynamic grid environments. This aligns with the conclusions in Deep Learning Models for Predictive Maintenance: A Survey, Comparison, Challenges and Prospects [15], which highlight ensemble and hybrid modeling as the most resilient approach for non-stationary asset behaviors.
Nevertheless, several limitations warrant consideration. Model performance may degrade under rare grid reconfigurations, atypical load-transfer events, or degradation mechanisms insufficiently represented in the training data. Such limitations echo those identified in multiple DT-PdM reviews [36,46], emphasizing the need for continuous Digital Twin recalibration and potential integration of physics-informed models or reinforcement learning agents for adaptive retraining under new operating conditions. Future extensions incorporating physics-informed constraints or reinforcement learning-based policy adaptation are therefore positioned as complementary enhancements rather than prerequisites for deployment readiness.
Overall, the validated results show that the proposed architecture is technically sound, operationally deployable, and aligned with the cybersecurity and interoperability requirements for real-world high-voltage substations. This marks a meaningful progression beyond the current DT-PdM literature.

4.7.1. Complexity, Scalability, and Deployment Trade-Offs

The proposed DT–PdM framework was designed with deployment constraints in mind, particularly the 60 s SCADA supervisory update cycle and the need for scalable extension to multi-substation environments. From a computational perspective, tree-based models (RF/GBM) and the SVM provide efficient inference and strong baseline robustness, while the DNN and stacked ensemble offer improved nonlinear modeling capacity at the cost of increased model complexity [22,47,48,49]. The finalized DNN contains approximately 13,665 trainable parameters, and the stacked ensemble introduces only modest overhead through meta-model fusion.
In operational use, the closed-loop pipeline (SCADA acquisition → DT state update → AI inference → maintenance advisory) is executed within the 60 s supervisory update cycle. Dominant latency arises from SCADA acquisition and historian refresh, while DT state updates and model inference are executed within a small fraction of the cycle, confirming suitability for real-time advisory deployment.
Interpretability and accuracy represent a key deployment trade-off. The RF and GBM support transparent feature-importance analysis that aids engineering trust and root-cause investigations, whereas the DNN and ensemble models typically achieve higher predictive performance but require additional governance (e.g., calibration checks and drift monitoring) to maintain reliability over time [50,51]. To address scalability, the architecture supports extension toward fleet-level deployment via Digital Twin federation, where local inference is performed at each substation and only aggregated model updates, KPIs, and anonymized health indicators are exchanged across sites [52]. This design reduces bandwidth requirements, preserves data confidentiality, and enables phased rollout across multiple substations while maintaining consistent interoperability and cybersecurity controls.
This work is subject to several data and deployment constraints that should be considered when interpreting the results. First, the monitoring resolution is limited by 1 min SCADA/OT sampling, which restricts the capture of fast transient signatures and waveform-level phenomena that may precede certain fault modes.
The SS1 historian stream is recorded at a 1 min supervisory resolution, which is sufficient for monitoring slow-to-moderate dynamics (thermal trends, load imbalance evolution, and sustained abnormal operating states) but does not capture waveform-level transients or fast partial-discharge (PD) pulse activity. Accordingly, any PD-related variable used in this study should be interpreted as a monitor-derived SCADA indicator (e.g., alarm/status counters, aggregated severity indices, or device-level summarized PD metrics) rather than a high-frequency PD waveform measurement. This sampling constraint may reduce sensitivity to short-lived incipient events and can blur early-stage signatures that evolve faster than the historian refresh period; therefore, conclusions involving fast transient mechanisms are stated conservatively and framed as evidence from aggregated indicators. Future work will incorporate higher-rate acquisition (e.g., UHF/TEV PD monitors or transient recorders) to improve transient observability and strengthen physics-level attribution under fast-evolving fault modes.
Second, condition indicators such as DGA are available at a lower sampling frequency than operational measurements, and OLTC-related records are partially incomplete, which may limit sensitivity to slow-developing insulation or tap-changer degradation patterns. Third, validation is based on a single utility substation (SS1) with a specific equipment configuration; therefore, performance may vary for substations with different loading profiles, protection settings, or component technologies.
In addition to model complexity, uncertainty awareness is a critical consideration for deployment in safety-critical substation environments. In the present implementation, predictive uncertainty is primarily handled through probabilistic model outputs and performance stability analysis rather than through fully Bayesian inference. While the stacked ensemble and DNN provide confidence scores associated with fault classification and prognostic outputs, their reliability depends on calibration quality and data representativeness. To support deployment readiness, uncertainty characterization is therefore evaluated using lightweight calibration and reliability assessments rather than computationally intensive uncertainty frameworks, which may introduce additional overhead. This design choice reflects a trade-off between uncertainty expressiveness and operational feasibility within the 60 s monitoring cycle. More advanced uncertainty modeling—such as Bayesian neural networks or evidential learning—is identified as a future enhancement when higher-frequency data and additional computational resources become available.
Finally, model performance may degrade under rare or previously unseen operating regimes (e.g., atypical seasonal loading, switching events, or novel fault combinations), particularly when training data do not adequately represent such conditions. These limitations primarily affect generalization rather than the architectural feasibility of the proposed DT–PdM framework. Future extensions—such as physics-informed learning to embed degradation constraints, reinforcement learning for adaptive maintenance policies, and federated Digital Twin deployment across substations—are expected to improve robustness, transferability, and coverage of rare operating regimes beyond the current scope.
Runtime and Deployment Feasibility
To support real-time deployment, we report indicative runtime measurements for training and inference using the finalized implementation. The DNN training time was approximately 10.8 s/epoch, with a total training time of 9 min for 50 epochs (early stopping), while classical models (RF/GBM/SVM) and the stacked ensemble required 2–8 min for end-to-end fitting. For deployment, inference latency was measured as 5–15 ms/record (RF), 8–25 ms/record (GBM), 15–30 ms/record (SVM), 25–35 ms/record (DNN), and 80–120 ms/record (stacked ensemble) and included feature preparation and model evaluation, as clarified in Table 5. These results confirm that ensemble inference remains well within the 60 s monitoring interval, enabling advisory generation without interfering with SCADA/OT update cycles.
At the worst-case ensemble latency of 120 ms per record, a single inference consumes about 0.2% of the 60 s update cycle, leaving ample margin for data ingestion, logging, and DT service orchestration. All runtime measurements were obtained on a professional-grade engineering workstation equipped with a multi-core CPU, dedicated GPU acceleration, and sufficient system memory (see Appendix D.2.1 for the computational environment), representative of the computational resources typically available in industrial analytics and Digital Twin development environments. The reported runtimes are therefore indicative of practical deployment performance rather than laboratory-optimized benchmarks.
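The latency margin can be checked with simple arithmetic. The sketch below is illustrative only: it reuses the latency ranges and the 60 s interval reported above, while the function names (`cycle_share`, `max_records_per_cycle`) are hypothetical helpers and not part of the authors' implementation.

```python
# Latency ranges (ms per record) as reported in the text above.
LATENCY_MS = {
    "RF": (5, 15),
    "GBM": (8, 25),
    "SVM": (15, 30),
    "DNN": (25, 35),
    "Stacked ensemble": (80, 120),
}
CYCLE_S = 60.0  # SCADA monitoring interval reported in the text


def cycle_share(worst_case_ms, cycle_s=CYCLE_S):
    """Fraction of one monitoring cycle consumed by a single inference."""
    return (worst_case_ms / 1000.0) / cycle_s


def max_records_per_cycle(worst_case_ms, cycle_s=CYCLE_S):
    """Upper bound on sequential inferences that fit into one cycle."""
    return int(cycle_s * 1000.0 // worst_case_ms)


for name, (_, worst) in LATENCY_MS.items():
    print(f"{name}: {cycle_share(worst):.2%} of the cycle, "
          f"up to {max_records_per_cycle(worst)} sequential records")
```

Under these figures, even the slowest model leaves more than 99% of each cycle free for the remaining pipeline stages.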

4.7.2. Physical Performance Interpretation and Error Analysis

The dominant predictors identified by the explainability analysis are physically consistent with common degradation mechanisms in transformer–feeder operations. Oil and winding temperature levels and gradients reflect thermal stress accumulation, cooling-system anomalies, and hotspot-driven insulation aging; therefore, elevated temperatures and rapid thermal changes are meaningful precursors of incipient faults and reduced remaining useful life. Likewise, imbalance-related electrical indicators (e.g., phase-current asymmetry or derived imbalance indices) are strongly associated with unbalanced loading, loose terminations, contact deterioration, and asymmetric impedance conditions, which can induce localized overheating and accelerate degradation. Vibration-related features further capture mechanical looseness and abnormal operating states that may precede discrete fault events. Together, these findings confirm that the most influential features correspond to well-understood physical degradation processes, reinforcing the interpretability and engineering credibility of the proposed DT–PdM framework.
The confusion-matrix analysis indicates that misclassifications primarily occur between fault categories exhibiting similar symptom signatures under normal operating variability (e.g., thermally driven faults versus load-driven thermal excursions or mild imbalance conditions versus transient operational shifts). These confusion patterns are consistent with domain expectations: when faults present overlapping thermal/electrical manifestations—especially under coarse SCADA sampling—decision boundaries become less separable. A qualitative summary of the most frequent confusion patterns and their likely physical causes is provided in Table 6. This observation highlights where additional sensing (e.g., higher-frequency transients or improved OLTC/DGA coverage) and richer contextual features could further improve class discrimination and generalization under rare operating regimes. In particular, the most challenging classes are those with a low event frequency or early-stage degradation signatures that closely resemble normal operational variability, explaining the residual off-diagonal errors observed in the confusion matrices.
Table 6 reports confusion-pattern statistics derived from the held-out test set of the SS1 dataset, comprising 167,939 fault-labeled samples and 32,081 normal samples. The stacked ensemble achieved zero false negatives across all evaluated fault events, a critical property for safety- and reliability-constrained substation operations where missed fault detection may propagate into cascading equipment damage or forced outages. In contrast, the single-model approaches exhibited a small number of false negatives, which were primarily associated with early-stage degradation or transient operating conditions that are only weakly expressed in 1 min SCADA measurements. The ensemble’s superior robustness arises from complementary error compensation across tree-based and neural learners, enabling consistent fault capture across heterogeneous physical mechanisms, including thermal stress evolution, imbalance progression, and mechanically induced anomalies.

5. Standards, Interoperability, and Cybersecurity Compliance

Ensuring interoperability and cybersecurity is essential for scalable deployment of intelligent maintenance systems in critical electrical infrastructures. The proposed DT–AI architecture fully aligns with international power-utility standards, covering data engineering, cyber–physical protection, and regulatory compliance—enabling safe integration into modern substation automation systems. A full standards-to-architecture traceability matrix is provided in Appendix D.

5.1. Interoperability and Data Engineering Standards

The physical sensing layer adheres to the IEC 61850 communication model, enabling standardized naming (LN/DO/DA), seamless IED integration, and substation topology modeling. The Digital Twin core operates on a Common Information Model (CIM) representation according to IEC 61970/61968, providing utility-wide semantic harmonization and cross-vendor data portability.
To enable secure real-time communication between Operational Technology (OT) and Information Technology (IT) domains, data streaming is implemented using OPC UA PubSub [26], which standardizes message encoding, publish–subscribe transport, and device discovery with persistent trust anchors. These standardization choices are consistent with recommendations from recent industrial Digital Twin reviews in smart-grid environments [53,54], which emphasize semantic interoperability as a requirement for DT-assisted maintenance of large-scale energy systems.

5.2. Cybersecurity Enforcement and Trustworthy AI Deployment

Given the increasing cyber-attack surface in interconnected substations, cybersecurity controls follow the IEC 62443 framework across zone–conduit segmentation, secure remote access, and security-level assignment. Internal communication, telemetry, and control commands are protected under NERC CIP-015-1 guidelines, enabling real-time anomaly detection, packet integrity checking, and security-event reporting across SCADA–DT interfaces.
Cyber-trust assurance is incorporated into the maintenance decision engine via the weighted parameter $\eta_c(t)$ in Equation (1), which provides model-based risk discounting during degraded or adversarial cyber–physical conditions. This approach aligns with security-aware predictive maintenance frameworks proposed in safe reinforcement learning-based PdM studies and evidential deep learning anomaly-based detection architectures for critical infrastructures.

5.3. Compliance Traceability to Architectural Layers

To ensure seamless deployment within regulated electrical infrastructure, the alignment between the proposed architectural layers and the enforced interoperability and cybersecurity standards is explicitly defined. As illustrated in Figure 7, each layer of the system inherits well-established domain standards to guarantee operational reliability, semantic consistency, and cyber-resilience throughout the entire data lifecycle.
The mapping confirms that data acquisition from field IEDs adheres to IEC 61850 for unified device modeling and automated event handling, while OT-network telemetry and command pathways are safeguarded through CIP-015-1 internal network security monitoring policies. The Digital Twin core utilizes CIM (IEC 61970/61968) to maintain a coherent asset model across both control and enterprise systems, enabling trustworthy information exchange with upper-layer digital services.
For inter-system communication, OPC UA Part 17 ensures secure, real-time message propagation using authenticated and encrypted publish–subscribe channels—a requirement for scalable grid-edge execution. In parallel, cybersecurity controls defined by IEC 62443 provide a multi-level defense-in-depth framework, assigning appropriate security levels to analytical, orchestration, and decision-automation components.
A consolidated overview of this standard-to-layer association is presented in Table 7, establishing verifiable compliance paths that support certification-oriented deployment strategies in substation automation environments.

6. Conclusions

This study introduced a deployable, Digital Twin–integrated predictive maintenance (DT–PdM) architecture for electrical substations, which is designed to operate under real-world utility constraints. The proposed framework unifies OT–IT connectivity, standardized semantic interoperability, and AI-driven analytics within a cyber-secure and operationally feasible stack. Validation using utility-grade data from the SS1 substation demonstrates that the architecture can support continuous condition monitoring, fault prediction, and decision support within the 60 s supervisory SCADA update cycle, confirming its readiness for practical substation deployment rather than laboratory-scale experimentation. The finalized analytical pipeline combines a compact Deep Neural Network (≈13,665 trainable parameters) with a stacked ensemble strategy, achieving high predictive performance without compromising runtime feasibility.
This work makes several original contributions to the state of the art in DT-enabled predictive maintenance for substation automation:
  • It introduced a five-layer, standards-aligned DT–PdM architecture that integrates IEC 61850-, CIM-, and OPC UA-based interoperability with cybersecurity-aligned decision support following IEC 62443 principles;
  • It performed real-world validation on utility operational data comprising approximately one million multivariate records derived from 139 confirmed fault events through event-aligned, minute-level labeling, demonstrating feasibility under realistic SCADA/OT constraints;
  • It designed a hybrid predictive analytical and deployment pathway, showing that a stacked ensemble embedded within the Digital Twin core can achieve high predictive performance (F1-score = 0.98; AUC = 0.995) while maintaining interpretability and runtime feasibility;
  • It developed an operation-oriented decision layer, linking predictive outputs to maintenance prioritization through composite scoring and cyber-trust-aware governance, enabling actionable and auditable maintenance decisions.
In contrast to existing DT–PdM studies that primarily focus on algorithmic performance or conceptual Digital Twin representations, the proposed framework advances the field by jointly addressing full standards compliance, cybersecurity alignment, and utility-grade deployability within a unified architecture. The integration of standardized semantic layers, cyber-trust considerations, uncertainty-aware analytics, and real operational validation distinguishes this work from prior approaches that remain limited to partial interoperability, offline analysis, or simulation-only evaluations.
The proposed architecture is intended for human-supervised deployment in safety-critical substations, providing explainable recommendations rather than autonomous actuation, and therefore represents a future-proof and scalable DT–PdM solution capable of gradual extension toward fleet-level Digital Twin federation and adaptive maintenance strategies. The main limitations and generalization considerations are summarized in the discussion, along with mitigation pathways through the proposed future work, providing a transparent roadmap for extending the framework to broader asset classes, higher-frequency sensing, and multi-substation deployments.

Future Work

Future research will focus on extending the proposed Digital Twin-enabled predictive maintenance (DT–PdM) architecture along several complementary directions to enhance adaptability, interpretability, and scalability in large-scale power-system deployments. First, fleet-level Digital Twin federation across multiple substations will be investigated using privacy-preserving and federated learning strategies, enabling knowledge sharing while respecting data confidentiality and regulatory constraints. Second, reinforcement learning techniques will be explored to support adaptive maintenance scheduling and decision optimization under uncertain and dynamically evolving operational conditions by building upon the predictive outputs generated by the current ensemble-based framework. Third, physics-informed neural networks (PINNs) and graph-based learning approaches will be considered to explicitly embed physical degradation laws, thermal–electrical constraints, and network topology into the learning process, thereby improving model interpretability and reducing reliance on purely data-driven representations.
These techniques are not implemented in the present study and are identified as future extensions beyond the current experimental scope. In addition, future Digital Twin extensions will incorporate environmental and economic dimensions, including emissions awareness, asset utilization efficiency, and cost–risk trade-offs, to support sustainability-oriented maintenance planning. Finally, the development of automated compliance auditing and continuous certification-support engines will be pursued to ensure long-term alignment with evolving interoperability and cybersecurity standards throughout OT–IT system evolution.
Future work will also include releasing security-reviewed artifacts (e.g., anonymized subsets and configuration files) to improve reproducibility and facilitate benchmarking under critical-infrastructure constraints.

Author Contributions

Conceptualization, S.A. and H.A.; methodology, S.A. and H.A.; software, S.A. and H.A.; writing—original draft, S.A.; writing—review and editing, S.A. and H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data that support the findings of this study are available from the corresponding authors upon reasonable request. The dataset originates from a utility-grade substation operational technology (OT) environment within critical electrical infrastructure; therefore, access to raw measurements is subject to confidentiality and cybersecurity constraints. Where permissible, security-reviewed anonymized subsets, aggregated feature representations, and detailed documentation of signal mapping and label definitions can be shared to support reproducibility, subject to approval by the data owner and applicable operational policies.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AE | Autoencoder
AI | Artificial Intelligence
ANN | Artificial Neural Network
AUC | Area Under the Curve
CBM | Condition-Based Maintenance
CIM | Common Information Model
CIP | Critical Infrastructure Protection
CMMS | Computerized Maintenance Management System
CNN | Convolutional Neural Network
CPF | Central Processing Facility
DL | Deep Learning
DT | Digital Twin
DGA | Dissolved Gas Analysis
DNN | Deep Neural Network
GBM | Gradient Boosting Machine
GIS | Gas-Insulated Switchgear
GNN | Graph Neural Network
GRU | Gated Recurrent Unit
GTG B | Gas Turbine Generator B
GTPP | Gas Turbine Power Plant
GOOSE | Generic Object-Oriented Substation Event (IEC 61850 service)
HVAC | Heating, Ventilation, and Air Conditioning
IACSs | Industrial Automation and Control Systems
ICT | Information and Communication Technology
IEC | International Electrotechnical Commission
IEDs | Intelligent Electronic Devices
IoT | Internet of Things
MTBF | Mean Time Between Failures
MTTF | Mean Time To Failure
OLTC | On-Load Tap Changer
ONAF | Oil Natural Air Forced (cooling method for transformers)
OPC UA | Open Platform Communications Unified Architecture (IEC 62541)
OT | Operational Technology
PdM | Predictive Maintenance
PHM | Prognostics and Health Management
PINN | Physics-Informed Neural Network
PMUs | Phasor Measurement Units
RCM | Reliability-Centered Maintenance
ReLU | Rectified Linear Unit
RF | Random Forest
ROC | Receiver Operating Characteristic
RUL | Remaining Useful Life
SCADA | Supervisory Control and Data Acquisition
SLD | Single-Line Diagram
SVM | Support Vector Machine
LSTM | Long Short-Term Memory
TSN | Time-Sensitive Networking

Appendix A. Substation Architecture and Physical Layout

This appendix presents the physical and electrical configuration of the SS1 Substation (GTG-B Train) within the 120 MW Gas Turbine Power Plant (CPF–Badra Oil Field, Iraq).
The substation operates as a critical node interlinking the generation unit (GTG B, 24.7 MW, 11 kV) with the 33/11.5 kV step-down transformer and the downstream 11/0.42 kV low-voltage distribution feeders supplying the Gas Train A/B area.
The single-line diagram (SLD) in Figure A1 represents the real as-built electrical path used for Digital Twin synchronization, predictive maintenance modeling, and feature extraction.
It reflects all key components that are semantically mirrored in the Digital Twin model using IEC 61850 logical-node mapping (e.g., MMXU–Measurement, XCBR–Circuit Breaker, TCTR–Current Transformer, and TVTR–Voltage Transformer).
Table A1. System components and associated technical characteristics of the SS1 substation infrastructure used in the Digital Twin implementation.
Component | Tag/Reference | Rating and Specification | Monitoring Point/Sensor
Gas Turbine Generator (GTG B) | 271-BD-101B/271-BD-102B/271-BD-103B | 24.7 MW @ 11 kV, PF 0.85 | Generator monitoring IED, excitation panel
Generator Circuit Breaker (GCB) | 271-BD-101B | 11 kV, 2500 A | SCADA GOOSE event capture
Step-up Transformer | 271-TR-101B | 11/34.5 kV, ONAN/ONAF, 40 MVA | Temperature and oil sensors, OLTC counter
Main Power Transformer | 272-TR-100A | 33/11.5 kV, 55/65 MVA, ONAF, ±8 × 1.25% taps | DGA analyzer, temperature transmitters
LV Distribution Transformer | 272-TR-201A | 11/0.42 kV, 2.5 MVA, Dyn11, ±2 × 2.5% | Ambient and thermal sensors
Switchgear and Protection | CB, VCB, LBS, DS | 33 kV/11 kV/400 V switchgear (1250–4000 A) | Breaker health sensors (XCBR), trip status
Busbars A-1 and B-1 | GIS Sections | 33 kV, 31.5 kA, 1 s | Voltage and current transducers (MMXU)
Auxiliaries/Feeders | LV Feeders → Motor Loads | 0.4 kV, ACB 4000 A | Power and status signals (MMXU/CSWI)

Appendix A.1. Operational Context

  • Voltage levels: 33 kV → 11.5 kV → 0.42 kV.
  • Frequency: 50 Hz (± 0.1 Hz regulated by GTG Governor).
  • Cooling method: ONAF (Transformers 272-TR-100A/B).
  • Load Served: Gas Train A/B compressors, instrument feeders, auxiliary motors.
  • Control system: SCADA (IEC 61850 MMS/GOOSE + OPC UA TSN Bridge).

Appendix A.2. Integration into the Digital Twin

  • Each physical asset in the SLD is represented in the Digital Twin via a unique semantic identifier linked to the IEC 61850 Object Model.
  • The DT core uses these identifiers for real-time mirroring of asset states (voltage, temperature, breaker status, oil level, etc.).
  • Substation logical nodes are synchronized to the CIM (IEC 61970/61968) layer through OPC UA Part 17 for cross-platform interoperability.
Figure A1. As-built single-line diagram (SLD) of the SS1 substation: The GTG-B train path is used as the physical reference for Digital Twin modeling and validation.

Appendix B. Dataset Configuration and Signal Mapping

This appendix summarizes the configuration of the multivariate operational dataset used for validating the proposed Digital Twin-enabled predictive maintenance (DT-PdM) architecture.
The dataset originates from the SS1 substation (GTG B train path) within the 120 MW Gas Turbine Power Plant, Badra Oil Field, Iraq, collected through the IEC 61850-compliant SCADA and IED infrastructure between 2021 and 2025.

Appendix B.1. Data Source and Acquisition Pipeline

  • Data origin: OT infrastructure—SCADA, transformer DGA analyzers, PMUs, and field IEDs.
  • Acquisition layer: IEC 61850 MMS/GOOSE/SV protocols via substation LAN (time-stamped to 1 min resolution).
  • Ingestion route: Edge gateway → OPC UA Pub/Sub → Digital Twin core database.
  • Record volume: ≈1 million entries × 14 parameters (2021–2025).
  • Sampling frequency: 60 s interval (aggregated real-time telemetry).
  • Data model: IEC 61850 logical nodes mapped to CIM (IEC 61970/61968) objects for semantic consistency.

Appendix B.2. Signal Structure and Feature Mapping

Table A2. Signal structure and feature mapping between physical measurements, IEC 61850 logical nodes, and Digital Twin analytical inputs.
Domain | Parameter | IEC 61850 LN Tag | Description | Feature Type
Electrical | Phase Voltage ($V_{A,B,C}$) | MMXU | Line-to-line R.M.S. voltage | Primary Input
Electrical | Phase Current ($I_{A,B,C}$) | TCTR | R.M.S. current per phase | Primary Input
Electrical | Power Factor (PF) | MMXU | Active–reactive power ratio | Derived
Electrical | Frequency (Hz) | MMXU | Instantaneous system frequency | Derived
Thermal | Oil Temperature | TTMP | Main tank temperature | Primary Input
Thermal | Winding Temperature | TTMP | Top-winding thermal state | Derived
Thermal | Ambient Temperature | TTMP | Environmental sensor input | Secondary
Reliability | OLTC Position | TTC | Tap changer step counter | Degradation Index
Reliability | Breaker Health Status | XCBR | Contact wear/trip cycles | Reliability Index
Reliability | DGA H2, C2H2, CH4 (ppm) | TANG | Transformer gas analysis | Fault Signature
Derived | Thermal Stress Index (TSI) | - | $\Delta T_{\text{winding-oil}} / \Delta t$ | Composite
Derived | Electrical Stress Index (ESI) | - | $(I / I_{\text{rated}}) \times (V / V_{\text{rated}})$ | Composite
Derived | $RUL(t)$ | - | Remaining useful life estimate | Predictive Output
Derived | Fault Probability $P_f(t)$ | - | Anomaly score output from AI engine | Predictive Output

Appendix B.3. Pre-Processing and Quality Control

  • Missing data reconstructed via the Kalman Filter + Spline Interpolation.
  • Outliers removed using IQR + Z-Score fusion.
  • Signals normalized to the [0, 1] range for AI compatibility.
  • Time alignment checked against the Network Time Protocol (NTP) server.
  • Noise filtered using the Butterworth low-pass filter (3 Hz cut-off).
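The outlier step listed above can be sketched as follows. This is a minimal illustration, assuming that "IQR + Z-Score fusion" means a sample is discarded only when both rules flag it; the source does not state the fusion logic, and the thresholds (`iqr_k=1.5`, `z_max=3.0`) and the function name are hypothetical defaults rather than the authors' settings.

```python
import statistics


def iqr_zscore_outliers(values, iqr_k=1.5, z_max=3.0):
    """Return indices flagged by BOTH the IQR fence and the z-score rule.

    Assumption: 'fusion' is read here as a logical AND of the two detectors.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - iqr_k * iqr, q3 + iqr_k * iqr

    mean = statistics.fmean(values)
    std = statistics.pstdev(values)

    flagged = []
    for i, v in enumerate(values):
        outside_fence = v < low or v > high
        z = abs(v - mean) / std if std > 0 else 0.0
        if outside_fence and z > z_max:
            flagged.append(i)
    return flagged


# Hypothetical oil-temperature readings (°C) with one spurious spike.
readings = [60, 61, 59, 60, 62, 58, 60, 61, 59, 60,
            61, 60, 59, 60, 62, 58, 60, 61, 59, 200]
print(iqr_zscore_outliers(readings))
```

Requiring agreement between the two detectors is the more conservative choice; a logical OR would remove more samples and may be preferable when sensor glitches are frequent.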

Appendix B.4. Dataset Partitioning

Table A3. Dataset partitioning strategy for training, validation, and testing of the predictive maintenance models.
Split | Purpose | Ratio | Validation Method
Training | Model fitting and cross-validation | 80% | 5-fold CV
Testing | Independent performance evaluation | 20% | Blind inference
Fault/Normal Ratio | Balanced sampling to prevent bias | 1:1 | Resampled SMOTE

Appendix B.5. Integration with Digital Twin Environment

Each signal stream is semantically linked to its Digital Twin counterpart via the OPC UA Part 17 Pub/Sub-Bridge and validated for cyber-resilient transfer using IEC 62443 security policies.
Real-time synchronization enabled state mirroring for fault simulation, historian replay, and AI inference logging.
Figure A2. Multi-domain signal-mapping workflow linking physical sensors, IEC 61850 logical nodes, and the Digital Twin core for predictive maintenance training and validation.

Appendix C. Detailed Technical Specifications and System Configuration

This appendix provides the full technical background required to ensure reproducibility of the Digital Twin-enabled Predictive Maintenance (DT-PdM) system implemented for the SS1 Substation (GTPP–Badra Oilfield). It includes data specifications, model configurations, interoperability mappings, cybersecurity controls, and AI deployment settings. A brief summary of the tuning protocol is provided in Section 4.3.

Appendix C.1. Data Sources and Acquisition Specifications

Appendix C.1.1. Data Origin and Scope

  • Source: SS1 Substation–GTG-B (33/11.5 kV, 55/65 MVA transformer).
  • Operational Period: January 2021–October 2025.
  • Records: ≈1,000,000 time-stamped measurements.
  • Sampling Resolution: 60 s intervals.
  • Parameters: 14 critical electrical, thermal, and reliability indicators.
  • Systems feeding data:
    • SEL protection relays.
    • Transformer DGA monitor.
    • Bay control units.
    • SCADA historian (OPC UA interface).
    • Local environmental sensors.

Appendix C.1.2. Measured Parameters

Table A4. Summary of the measured electrical, thermal, and environmental parameters used for predictive maintenance analysis.
Category | Parameters
Electrical | VphaseA/B/C, IphaseA/B/C, PF, frequency, and harmonics (THD)
Thermal | Oil temperature, winding temperature, and ambient temperature
Transformer Health | OLTC tap position, oil level, DGA-H2, CH4, C2H2, and CO
Reliability | MTBF, failure rate $\lambda$, and estimated $RUL$
Event/Alarm Logs | Breaker operations, protection trips, and sequence of events (SoE)

Appendix C.1.3. Data Transport and Protocols

  • Primary OT Protocol: IEC 61850 (MMS, GOOSE, Sampled Values).
  • DT Interface: OPC UA PubSub (Part 17–TSN).
  • Time Synchronization: NTP + SEL IRIG-B signal.
  • Cyber Segmentation: IEC 62443-3-3 security zoning.

Appendix C.2. Data Pre-Processing and Feature Engineering Pipeline

Appendix C.2.1. Cleaning Workflow

  • Missing Data:
    • Kalman Filter (state-space reconstruction).
    • Spline Interpolation for smoothing.
  • Outliers:
    • Hybrid IQR + Z-score detection.
    • Transformer-specific thresholds (IEC 60076 guidance).
  • Timestamp Normalization:
    • Uniform alignment ($\Delta t = 60$ s).
    • Removal of asynchronous SCADA logs.

Appendix C.2.2. Derived Features

  • Electrical: Current imbalance $\Delta I\,\%$, voltage deviation index, and apparent/reactive power.
  • Thermal: Thermal stress index and oil–winding gradient $\Delta T$ (°C).
  • Reliability: Failure hazard rate $h(t)$, $RUL$ trends, and cumulative stress loading.
  • Hybrid Indicators:
    • $\Psi(t)$ composite health score (as used in Section 4.5).
    • Transformer loading factor.
    • Tap-changer mechanical stress index.
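Some of the derived features listed above can be illustrated with a short sketch. The exact definitions are not given in the source, so the formulas below are assumptions: current imbalance is taken as the maximum deviation from the three-phase mean, and the thermal stress index as the rate of change of the winding–oil gradient (following the Table A2 description). All function names are hypothetical.

```python
def current_imbalance_pct(i_a, i_b, i_c):
    """Max deviation from the three-phase mean current, in percent of the mean.

    Assumption: a NEMA-style imbalance definition; the source does not
    specify which definition was used.
    """
    mean = (i_a + i_b + i_c) / 3.0
    if mean == 0:
        return 0.0
    return 100.0 * max(abs(i - mean) for i in (i_a, i_b, i_c)) / mean


def oil_winding_gradient(t_winding_c, t_oil_c):
    """Winding-to-oil temperature difference in °C."""
    return t_winding_c - t_oil_c


def thermal_stress_index(gradients_c, dt_s=60.0):
    """Rate of change of the winding-oil gradient between samples (°C/s)."""
    return [(b - a) / dt_s for a, b in zip(gradients_c, gradients_c[1:])]


# Hypothetical readings at the 60 s SCADA resolution.
print(current_imbalance_pct(90.0, 100.0, 110.0))
gradients = [oil_winding_gradient(w, o) for w, o in [(75.0, 65.0), (78.0, 65.0)]]
print(thermal_stress_index(gradients))
```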

Appendix C.2.3. Normalization

  • Min-Max normalization:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
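A minimal sketch of per-parameter Min-Max scaling is given below. The function name and the handling of constant signals are illustrative assumptions; in practice, the minimum and maximum would normally be estimated on the training partition only and reused for the test partition to avoid information leakage.

```python
def min_max_normalize(values):
    """Scale a sequence to the [0, 1] range.

    Assumption: a constant signal (max == min) is mapped to all zeros
    to avoid division by zero.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Hypothetical oil-temperature samples in °C.
print(min_max_normalize([40.0, 55.0, 70.0]))
```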

Appendix C.3. AI Model Configuration and Training Details

Appendix C.3.1. Models Used

  • Random Forest.
  • Gradient Boosting (GBM).
  • Support Vector Machine (SVM).
  • Deep Neural Network (DNN).
  • Stacked Ensemble Model (RF + GBM + DNN).

Appendix C.3.2. Hyper-Parameter Search

  • Method: Bayesian Optimization.
  • Iterations: 60.
  • Objective: Maximize the F1-score while minimizing validation loss.
  • Cross-Validation: 5-fold.

Appendix C.3.3. DNN Architecture

  • Input Layer: 14 features.
  • Hidden Layers: 4 (ReLU activation).
  • Dropout: 0.2.
  • Optimizer: Adam (LR = 0.001).
  • Loss: Binary cross-entropy.
  • Epochs: 80.
  • Batch Size: 64.

Appendix C.3.4. Ensemble Meta-Classifier

  • Fusion Method: Soft-voting weighted aggregator.
  • Weights: RF = 0.25, GBM = 0.25, and DNN = 0.50.
  • Output: Fault probability $P_f(t)$, binary classification, and $RUL$ estimation.
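The weighted soft-voting step listed above can be sketched in a few lines. The weights are those stated in the list; the 0.5 decision threshold, the dictionary keys, and the function name are assumptions made for illustration and are not taken from the authors' code.

```python
# Fusion weights as listed above.
WEIGHTS = {"rf": 0.25, "gbm": 0.25, "dnn": 0.50}


def soft_vote(probabilities, weights=WEIGHTS, threshold=0.5):
    """Weighted average of per-model fault probabilities.

    Returns (fault_probability, is_fault). The threshold is a hypothetical
    default; the source does not report the operating point.
    """
    total = sum(weights.values())
    p_fault = sum(weights[m] * probabilities[m] for m in weights) / total
    return p_fault, p_fault >= threshold


# Hypothetical base-model outputs for one 60 s record.
p, is_fault = soft_vote({"rf": 0.8, "gbm": 0.6, "dnn": 0.9})
print(round(p, 3), is_fault)
```

Giving the DNN twice the weight of each tree-based model means a confident DNN prediction can outvote two mildly disagreeing tree models, which is worth keeping in mind when interpreting borderline cases.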

Appendix C.4. Digital Twin Configuration

Appendix C.4.1. DT Architecture

  • Type: Hybrid cyber–physical Digital Twin.
  • Execution Environment:
    • Edge node: Intel Xeon, with 32 GB RAM.
    • Cloud layer: Dockerized microservices.
  • Update Cycle: 60 s.
  • Synchronization:
    • OPC UA PubSub → DT core.
    • DT → AI engine → CMMS.

Appendix C.4.2. Functions Implemented

  • State estimation.
  • Contingency simulation (fault propagation).
  • Virtual commissioning.
  • Historical playback.
  • Maintenance what-if scenario testing.
  • Operator advisory panel.

Appendix C.4.3. Asset Models Included

  • 33/11.5 kV power transformer (ONAN/OFAF).
  • OLTC mechanism.
  • Feeder bays (A1, B1 bus sections).
  • Circuit breakers and protection relays.
  • LV distribution section.
  • Generator GTG-B link.

Appendix C.4.4. SLD Integration

The following are included from the validated as-built diagram:
  • GTG-B generation chain.
  • Step-up/down transformers.
  • 33 kV GIS.
  • MV busbars A and B.
  • 11.5 kV LV feeders.
  • MCC sections.

Appendix C.5. Cybersecurity and Compliance Controls

Appendix C.5.1. Standards Applied

  • IEC 62443-2-1: Security management.
  • IEC 62443-3-3: System security requirements (SR1–SR7).
  • CIP-015-1: OT network segmentation and monitoring.
  • IEC 61850-90-6: Secure GOOSE/SV messaging.

Appendix C.5.2. Controls Implemented

  • RBAC and least privilege enforcement.
  • Encrypted channels (TLS 1.3–OPC UA).
  • Certificate-based device trust.
  • OT anomaly detection (AI-driven).
  • Firewall zoning: Level 1 (IEDs), Level 2 (SCADA), and Level 3 (DT servers).

Appendix C.5.3. Event Logging

  • Every model output logged to the DT audit trail.
  • SCADA–DT cross-validation log.
  • Security event IDs mapped to CIP-015 taxonomy.

Appendix C.6. Model Validation and Reproducibility Notes

Appendix C.6.1. Fault Event Benchmark

  • Events used: 139 real operational faults.
  • Categories:
    • Transformer overheat.
    • Current imbalance.
    • OLTC deviation.
    • Cable/feeder abnormality.
    • Partial discharge pattern shift.
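To illustrate how discrete fault events can be turned into minute-level labels on the SCADA timeline, the sketch below marks every sample that falls within a fixed window before an event. The window length, its placement before the event, and the function name are assumptions; the source states only that event-window alignment to the 1 min timeline was used.

```python
from datetime import datetime, timedelta


def label_minutes(timestamps, fault_events, window=timedelta(minutes=30)):
    """Return 1 for samples within `window` before a fault event, else 0.

    Assumption: the labeling window ends at the event time and is inclusive
    at both ends; the actual window definition is not given in the source.
    """
    labels = []
    for ts in timestamps:
        in_window = any(ev - window <= ts <= ev for ev in fault_events)
        labels.append(1 if in_window else 0)
    return labels


# Hypothetical six-minute excerpt with one event at 10:03.
start = datetime(2024, 1, 1, 10, 0)
timeline = [start + timedelta(minutes=i) for i in range(6)]
events = [start + timedelta(minutes=3)]
print(label_minutes(timeline, events, window=timedelta(minutes=2)))
```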

Appendix C.6.2. Reproducibility Details

  • Same seed used across all runs.
  • Same 80/20 train–test partition.
  • Docker container retains the execution environment.
  • DT analytics logs exported for independent verification.

Appendix C.7. Limitations and Future Extensions

Current Limitations
  • SCADA historian resolution limited to 60 s.
  • Incomplete DGA sampling intervals.
  • OLTC mechanical data availability inconsistent.
  • No waveform-level (high-resolution) transient analysis yet.
Recommended Future Enhancements
  • Integration of traveling-wave and PMU high-resolution data.
  • Incorporation of physics-informed neural networks (PINNs) at the DT core.
  • Full 3D DT visualization engine (Unity or Unreal Engine).
  • Extension to multi-substation federated learning.

Appendix D. Supplementary Technical Material

This appendix consolidates the mathematical formulations, model execution settings, supplementary figures, and standards-compliant mappings that support reproducibility and transparency of the Digital Twin-enabled Predictive Maintenance (DT-PdM) framework.

Appendix D.1. Mathematical Formulations

Appendix D.1.1. Failure Probability and Anomaly Score

For each time step $t$, the AI layer estimates the probability of abnormal asset behavior:
$$P_f(t) = \sigma(\boldsymbol{\omega}^{\top} \mathbf{x}_t + b)$$
where $\sigma$ is the sigmoid function and $\mathbf{x}_t$ is the normalized feature vector.
$P_f(t)$ denotes the fault-class probability predicted by the classifier at time $t$.
$RUL(t)$ denotes the remaining useful life target, while $\widehat{RUL}(t)$ denotes the model-estimated RUL.
Vectors are written in bold (e.g., $\mathbf{x}_t$); scalars are non-bold.
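The sigmoid mapping from a weighted feature sum to a fault probability can be sketched as follows. The weights, bias, and feature values are hypothetical placeholders, and the single logistic unit shown here only illustrates the output-layer form; it is not the deployed ensemble.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fault_probability(weights, features, bias):
    """P_f(t) = sigmoid(w . x_t + b) for one normalized feature vector."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


# Hypothetical two-feature example (normalized oil temperature, imbalance).
print(fault_probability([2.0, 1.5], [0.9, 0.4], bias=-1.0))
```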

Appendix D.1.2. Remaining Useful Life (RUL) Estimation

The ensemble model outputs an estimated R U L value, where
$$RUL(t) = \max(0, \hat{Y}_{RUL}(t))$$
It is trained using the mean squared error:
$$\mathcal{L}_{RUL} = \frac{1}{N} \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2$$
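The non-negativity clipping and the mean squared error loss described above can be written directly. The function names and the example values (taken here to be hours) are illustrative assumptions.

```python
def clip_rul(raw_prediction):
    """Clip a raw regression output so the reported RUL is never negative."""
    return max(0.0, raw_prediction)


def mse(targets, predictions):
    """Mean squared error between RUL targets and model estimates."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equal in length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(targets, predictions)) / len(targets)


# Hypothetical raw model outputs, including one negative value.
raw = [120.0, -3.2, 48.5]
print([clip_rul(r) for r in raw])
print(mse([10.0, 20.0], [12.0, 16.0]))
```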

Appendix D.1.3. Composite Maintenance Decision Score

As defined in Section 4.5,
$$\psi(t) = \alpha\, RUL(t) + \beta\, (1 - P_f(t)) + \gamma\, \eta_c(t)$$
where $\eta_c(t)$ is the cybersecurity trust score.
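A hedged sketch of the composite score is shown below. It assumes that the fault term enters as $\beta\,(1 - P_f(t))$, that RUL is normalized to [0, 1], and that the weights sum to one; the weight values and the function name are hypothetical, since the source does not report them here.

```python
def composite_score(rul_norm, p_fault, cyber_trust,
                    alpha=0.4, beta=0.4, gamma=0.2):
    """psi(t) = alpha * RUL(t) + beta * (1 - P_f(t)) + gamma * eta_c(t).

    Assumptions: all inputs lie in [0, 1]; alpha, beta, and gamma are
    illustrative values, not the authors' settings.
    """
    return alpha * rul_norm + beta * (1.0 - p_fault) + gamma * cyber_trust


# Healthy asset on a trusted channel versus a degrading asset with reduced trust.
print(composite_score(rul_norm=0.9, p_fault=0.05, cyber_trust=1.0))
print(composite_score(rul_norm=0.3, p_fault=0.80, cyber_trust=0.6))
```

Under this reading, a lower score indicates a higher maintenance priority, and a reduced trust score lowers the result even when the predictive outputs are unchanged.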

Appendix D.1.4. Reliability Indicators

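The failure rate is as follows:
$$\lambda = \frac{1}{MTBF}$$
The hazard rate is as follows:
$$h(t) = \frac{f(t)}{1 - F(t)}$$
where $f(t)$ is the failure probability density and $F(t)$ is the cumulative failure distribution.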
These indicators are computed for the transformer, feeder, and OLTC subsystems.
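The two indicators can be illustrated with a small worked example. The Weibull lifetime model and the parameter values are assumptions chosen for illustration, as the source does not state which lifetime distribution was used; with a shape parameter of 1 the hazard reduces to the constant failure rate $1/MTBF$.

```python
import math


def failure_rate(mtbf_hours):
    """Constant failure rate lambda = 1 / MTBF (per hour)."""
    return 1.0 / mtbf_hours


def weibull_hazard(t, shape, scale):
    """h(t) = f(t) / (1 - F(t)) for a Weibull lifetime model (assumed)."""
    survival = math.exp(-((t / scale) ** shape))
    pdf = (shape / scale) * (t / scale) ** (shape - 1) * survival
    cdf = 1.0 - survival
    return pdf / (1.0 - cdf)


# Hypothetical MTBF of one year (8760 h).
lam = failure_rate(8760.0)
print(lam)
print(weibull_hazard(100.0, shape=1.0, scale=8760.0))    # equals lam
print(weibull_hazard(5000.0, shape=2.0, scale=8760.0))   # wear-out: rises with t
```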

Appendix D.2. Model Training, Execution, and Deployment Environment

Appendix D.2.1. Computational Environment

  • Training Node: Intel Xeon CPU, 32 GB RAM, and NVIDIA T4 GPU.
  • Runtime Environment: Docker container (Python 3.10, TensorFlow 2.12, and Scikit-Learn 1.2).
  • Deployment Cycle: 60 s batch inference (aligned with SCADA resolution).

Appendix D.2.2. Software and Libraries

  • Python: v3.12 (Anaconda-managed interpreter/environment).
  • IDE: PyCharm 2024.1.7 (Professional Edition), Build #PY-241.19416.19 (JetBrains).
  • TensorFlow: v2.6.0 (tensorflow, tensorflow-base, tensorflow-estimator, tensorflow-gpu all v2.6.0)—DNN training and inference.
  • Scikit-learn: v1.5.2—Random Forest, Gradient Boosting, SVM, and stacked-ensemble meta-classifier.
  • NumPy: v2.4.1—numerical computing and array operations.
  • Pandas: v2.3.3—data handling, preprocessing, and time-series preparation.
  • Matplotlib: v3.9.2—plotting and publication-quality figure generation/export.
  • Seaborn: v0.12.2—statistical visualization (e.g., heatmaps and summary plots).
  • OPC UA SDK: v1.0.0—PubSub link between the Digital Twin service layer and the analytics pipeline.

Appendix D.2.3. Reproducibility Settings

  • Fixed random seed: 42.
  • Cross-validation: 5-fold.
  • Train–test split: 80/20.
  • Normalization: Min–Max per parameter.
  • Execution Logs: stored in DT Audit Engine (timestamp + inference metadata).
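The settings listed above can be sketched in plain Python as follows. This is a minimal illustration using only the standard library, not the authors' Scikit-learn pipeline; the sample temperatures are hypothetical, and shuffling before the split is an assumption, since the source does not say whether the 80/20 split was random or chronological.

```python
import random

SEED = 42            # fixed random seed
TEST_FRACTION = 0.2  # 80/20 train-test split
N_FOLDS = 5          # 5-fold cross-validation


def min_max_scale(column):
    """Scale one parameter to [0, 1]; a constant column maps to zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def train_test_split_indices(n, test_fraction, seed):
    """Shuffle indices with a fixed seed and split them (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold_indices(indices, k):
    """Return k (training, validation) index pairs over the given indices."""
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((training, validation))
    return splits


# Hypothetical oil-temperature readings in degrees Celsius.
oil_temp_c = [55.0, 61.5, 58.2, 70.1, 64.3, 59.9, 66.0, 57.4, 62.8, 68.5]

scaled = min_max_scale(oil_temp_c)
train_idx, test_idx = train_test_split_indices(len(oil_temp_c), TEST_FRACTION, SEED)
folds = k_fold_indices(train_idx, N_FOLDS)
print(len(train_idx), len(test_idx), len(folds))
```

In practice the Min-Max bounds would normally be estimated on the training portion only and then applied to the test portion, so that no information from held-out data enters the scaling step.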

Appendix D.3. Supplementary Figures

Appendix D.3.1. Model Comparison Figures

Figures referenced in Section 4 are included at full resolution in the supplementary submission:
  • Figure 9: Receiver Operating Characteristic (ROC) curves comparing classification performance of the evaluated models, highlighting the superior separability achieved by the DNN and stacked ensemble.
  • Figure 11: Confusion matrices for (a) Random Forest, (b) Gradient Boosting Machine, (c) Support Vector Machine, (d) Deep Neural Network, and (e) Stacked Ensemble, illustrating class-level prediction performance.
  • Figure 12: Feature-importance comparison for (a) Random Forest and (b) Gradient Boosting Machine models, highlighting the primary explanatory predictors (thermal, vibration, and imbalance-related indicators) used for fault prediction.
  • Figure 13: Comparative performance visualization of predictive models, including accuracy and precision metrics, AUC–ROC values, heat-map evaluation, and radar chart representation.
  • Figure 15: DNN architecture (four hidden layers with ReLU activations and dropout regularization) used for binary fault classification and RUL estimation.
  • Figure 16: Training convergence behavior of the DNN model, showing the evolution of loss, accuracy, and AUC over training epochs.

Appendix D.3.2. Digital Twin Figures

  • Figure 6: End-to-end workflow illustrating continuous data acquisition, analytics, feedback, and learning between physical substation assets and the Digital Twin environment.
  • Figure 14: Operational performance visualization of the stacked ensemble embedded within the Digital Twin core, including classification metrics and probabilistic prediction outputs.

Appendix D.4. Standards–Architecture Compliance Mapping

The following table summarizes the alignment between international standards and the proposed architecture:
Table A5. Mapping between the proposed Digital Twin architecture and applicable interoperability and cybersecurity standards.
| Standard | Requirement | Architecture Component |
| --- | --- | --- |
| IEC 61850 | Logical nodes and MMS/GOOSE/SV | Sensing layer and SCADA–DT interface |
| CIM (IEC 61970/61968) | Asset and network semantic modeling | Interoperability layer |
| OPC UA Part 17 (TSN) | Secure, deterministic publish–subscribe | OT–DT transport |
| IEC 62443 | Zones, conduits, and least privilege | Security layer and DT services |
| NERC CIP-015-1 | OT cybersecurity monitoring | DT audit engine and anomaly detection |
| IEEE C57/IEC 60076 | Transformer thermal and aging limits | Feature engineering (oil–winding gradient and thermal stress index) |
This matrix ensures that the framework is compliant with utility-grade cyber–physical deployment requirements.

Appendix D.5. Brief Limitations and Practical Notes

  • The SCADA sampling rate (1 min) limits transient waveform analysis.
  • OLTC mechanical counters were not continuously available.
  • Dataset imbalance required additional resampling strategies.
  • DGA sensor updates occurred at slower intervals than electrical parameters.
These limitations do not affect system validity but present future expansion opportunities.
The dataset used in this study was collected from a utility-grade substation OT environment within critical infrastructure. Due to confidentiality and cybersecurity restrictions, the full raw dataset cannot be publicly released. An anonymized and security-reviewed subset and/or aggregated feature tables, together with the documentation of signal mapping and label definitions, can be made available upon reasonable request, subject to approval by the data owner and applicable security policies.

References

  1. Maican, C.A.; Pană, C.F.; Pătrașcu-Pană, D.M.; Rădulescu, V.M. Review of Fault Detection and Diagnosis Methods in Power Plants: Algorithms, Architectures, and Trends. Appl. Sci. 2025, 15, 6334.
  2. Jardine, A.K.S.; Lin, D.; Banjevic, D. A Review on Machinery Diagnostics and Prognostics Implementing Condition-Based Maintenance. Mech. Syst. Signal Process. 2006, 20, 1483–1510.
  3. Mchirgui, N.; Quadar, N.; Kraiem, H.; Lakhssassi, A. The Applications and Challenges of Digital Twin Technology in Smart Grids: A Comprehensive Review. Appl. Sci. 2024, 14, 10933.
  4. Berghout, T.; Benbouzid, M. UBO-EREX: Uncertainty Bayesian-Optimized Extreme Recurrent EXpansion for Degradation Assessment of Wind Turbine Bearings. Electronics 2024, 13, 2419.
  5. Dehghan Shoorkand, H.; Nourelfath, M.; Hajji, A. A Hybrid Deep Learning Approach to Integrate Predictive Maintenance and Production Planning for Multi-State Systems. J. Manuf. Syst. 2024, 74, 397–410.
  6. Zhong, D.; Xia, Z.; Zhu, Y.; Duan, J. Overview of Predictive Maintenance Based on Digital Twin Technology. Heliyon 2023, 9, 23.
  7. Gong, S.; Kim, T.; Jeong, J. SPT-AD: Self-Supervised Pyramidal Transformer Network-Based Anomaly Detection of Time Series Vibration Data. Appl. Sci. 2025, 15, 5185.
  8. Cassano, F.; Crespino, A.M.; Lazoi, M.; Specchia, G.; Spennato, A. An EWS-LSTM-Based Deep Learning Early Warning System for Industrial Machine Fault Prediction. Appl. Sci. 2025, 15, 4013.
  9. Kritzinger, W.; Karner, M.; Traar, G.; Henjes, J.; Sihn, W. Digital Twin in Manufacturing: A Categorical Literature Review and Classification. IFAC-PapersOnLine 2018, 51, 1016–1022.
  10. Mourtzis, D.; Tsoubou, S.; Angelopoulos, J. Robotic Cell Reliability Optimization Based on Digital Twin and Predictive Maintenance. Electronics 2023, 12, 1999.
  11. Yuce, B.; Li, H.; Wang, L.; Sucala, V.I. Digital Twin-Based Smart Feeding System Design for Machine Tools. Electronics 2024, 13, 4831.
  12. Singh, M.; Kapukotuwa, J.; Gouveia, E.L.S.; Fuenmayor, E.; Qiao, Y.; Murray, N.; Devine, D. Comparative Study of Digital Twin Developed in Unity and Gazebo. Electronics 2025, 14, 276.
  13. Raska, P.; Ulrych, Z.; Malaga, M. Data Reduction of Digital Twin Simulation Experiments Using Different Optimisation Methods. Appl. Sci. 2021, 11, 7315.
  14. Veigas, K.L.; Chinnici, A.; De Chiara, D.; Chinnici, M. Towards Energy Efficiency of HPC Data Centers: A Data-Driven Analytical Visualization Dashboard Prototype Approach. Electronics 2025, 14, 3170.
  15. Serradilla, O.; Zugasti, E.; Rodriguez, J.; Zurutuza, U. Deep Learning Models for Predictive Maintenance: A Survey, Comparison, Challenges and Prospects. Appl. Intell. 2022, 52, 10934–10964.
  16. Garcia, J.; Rios-Colque, L.; Peña, A.; Rojas, L. Condition Monitoring and Predictive Maintenance in Industrial Equipment: An NLP-Assisted Review of Signal Processing, Hybrid Models, and Implementation Challenges. Appl. Sci. 2025, 15, 5465.
  17. Said, N.; Mansouri, M.; Hmouz, R.A.; Khedher, A. Deep Learning Techniques for Fault Diagnosis in Interconnected Systems: A Comprehensive Review and Future Directions. Appl. Sci. 2025, 15, 6263.
  18. Xie, Z.; Du, S.; Lv, J.; Deng, Y.; Jia, S. A Hybrid Prognostics Deep Learning Model for Remaining Useful Life Prediction. Electronics 2021, 10, 39.
  19. Kang, H.; Kang, P. Transformer-Based Multivariate Time Series Anomaly Detection Using Inter-Variable Attention Mechanism. Knowl.-Based Syst. 2024, 290, 111507.
  20. Jeong, Y.; Yang, E.; Ryu, J.H.; Park, I.; Kang, M. AnomalyBERT: Self-Supervised Transformer for Time Series Anomaly Detection Using Data Degradation Scheme. arXiv 2023, arXiv:2305.04468.
  21. Libera, L.D.; Andreoli, J.; Pezze, D.D.; Ravanelli, M.; Susto, G.A. Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent. arXiv 2024, arXiv:2402.01098.
  22. Li, Z.; He, Q.; Li, J. A Survey of Deep Learning-Driven Architecture for Predictive Maintenance. Eng. Appl. Artif. Intell. 2024, 133, 108285.
  23. Abdessadak, A.; Ghennioui, H.; Thirion-Moreau, N.; Elbhiri, B.; Abraim, M.; Merzouk, S. Digital Twin Technology and Artificial Intelligence in Energy Transition: A Comprehensive Systematic Review of Applications. Energy Rep. 2025, 13, 5196–5218.
  24. Yang, J.; Sun, Y.; Cao, Y.; Hu, X. Predictive Maintenance for Switch Machine Based on Digital Twins. Information 2021, 12, 485.
  25. Clay, C. Critical Infrastructure Protection Reliability Standard CIP-015-1-Cyber Security-Internal Network Security Monitoring. Available online: https://www.federalregister.gov/documents/2025/07/02/2025-12309/critical-infrastructure-protection-reliability-standard-cip-015-1-cyber-security-internal-network (accessed on 2 September 2025).
  26. prEN IEC 62541-17:2024—OPC Unified Architecture—Part 17: Alias Names. Available online: https://standards.iteh.ai/catalog/standards/clc/a35f246a-b839-4a9d-8410-7e00110aaf3b/pren-iec-62541-17-2024 (accessed on 2 September 2025).
  27. IEC Publishes IEC 62443-2-1:2024; Setting Security Standards for Industrial Automation and Control Systems. IEC: Geneva, Switzerland, 2024.
  28. Yassin, M.A.M.; Shrestha, A.; Rabie, S. Digital Twin in Power System Research and Development: Principle, Scope, and Challenges. Energy Rev. 2023, 2, 100039.
  29. Zhang, J.; Zhao, X. Digital Twin of Wind Farms via Physics-Informed Deep Learning. Energy Convers. Manag. 2023, 293, 117507.
  30. Aizpurua, J.I.; Stewart, B.G.; McArthur, S.D.J.; Lambert, B.; Cross, J.G.; Catterson, V.M. Improved Power Transformer Condition Monitoring under Uncertainty through Soft Computing and Probabilistic Health Index. Appl. Soft Comput. 2019, 85, 105530.
  31. Anderson, A.; McDermott, T.; Stephan, E. A Power Application Developer’s Guide to the Common Information Model; Pacific Northwest National Laboratory (PNNL): Richland, WA, USA, 2023.
  32. Ayello, M.; Lopes, Y. Interoperability Based on IEC 61850 Standard: Systematic Literature Review, Certification Method Proposal, and Case Study. Electr. Power Syst. Res. 2023, 220, 109355.
  33. Heluany, J.B.; Gkioulos, V. A Review on Digital Twins for Power Generation and Distribution. Int. J. Inf. Secur. 2024, 23, 1171–1195.
  34. Li, X.; Zhang, W.; Ding, Q. Deep Learning-Based Remaining Useful Life Estimation of Bearings Using Multi-Scale Feature Extraction. Reliab. Eng. Syst. Saf. 2019, 182, 208–218.
  35. Duan, Y.; Xue, K.; Sun, H.; Bao, H.; Wei, Y.; You, Z.; Zhang, Y.; Jiang, X.; Yang, S.; Chen, J.; et al. LogEDL: Log Anomaly Detection via Evidential Deep Learning. Appl. Sci. 2024, 14, 7055.
  36. Wahab, N.H.A.; Hasikin, K.; Lai, K.W.; Xia, K.; Bei, L.; Huang, K.; Wu, X. Systematic Review of Predictive Maintenance and Digital Twin Technologies Challenges, Opportunities, and Best Practices. PeerJ Comput. Sci. 2024, 10, e1943.
  37. van Dinter, R.; Ekmekci, G.; Rieken, S.; Tekinerdogan, B.; Catal, C. Architecting a Digital Twin-Based Predictive Maintenance System for Modelling Cable Joint Degradation. In Proceedings of the Asia Pacific Conference of the PHM Society 2023, Tokyo, Japan, 11–14 September 2023; Volume 4.
  38. Qureshi, U.R.; Rashid, A.; Altini, N.; Bevilacqua, V.; La Scala, M. Explainable Intelligent Inspection of Solar Photovoltaic Systems with Deep Transfer Learning: Considering Warmer Weather Effects Using Aerial Radiometric Infrared Thermography. Electronics 2025, 14, 755.
  39. Wuest, T.; Weimer, D.; Irgens, C.; Thoben, K.-D. Machine Learning in Manufacturing: Advantages, Challenges, and Applications. Prod. Manuf. Res. 2016, 4, 23–45.
  40. Ruhe, S.; Schaefer, K.; Branz, S.; Nicolai, S.; Bretschneider, P.; Westermann, D. Design and Implementation of a Hierarchical Digital Twin for Power Systems Using Real-Time Simulation. Electronics 2023, 12, 2747.
  41. Zuo, J.; Steiner, N.Y.; Li, Z.; Cadet, C.; Bérenguer, C.; Hissel, D. Reinforcement Learning-Based Maintenance Scheduling for a Stochastic Deteriorating Fuel Cell Considering Stack-to-Stack Heterogeneity. Reliab. Eng. Syst. Saf. 2025, 256, 110700.
  42. Ayed, S.B.; Broujeny, R.S.; Hamza, R.T. Remaining Useful Life Prediction with Uncertainty Quantification Using Evidential Deep Learning. J. Artif. Intell. Soft Comput. Res. 2024, 15, 37–55.
  43. Aghazadeh Ardebili, A.; Zappatore, M.; Ramadan, A.I.H.A.; Longo, A.; Ficarella, A. Digital Twins of Smart Energy Systems: A Systematic Literature Review on Enablers, Design, Management and Computational Challenges. Energy Inform. 2024, 7, 94.
  44. Ma, S.; Flanigan, K.A.; Bergés, M. State-of-the-Art Review and Synthesis: A Requirement-Based Roadmap for Standardized Predictive Maintenance Automation Using Digital Twin Technologies. Adv. Eng. Inform. 2024, 62, 102800.
  45. Bui, V.-H.; Sina, M.; Das, S.; Hussain, A.; Hollweg, G.V.; Su, W. A Critical Review of Safe Reinforcement Learning Strategies in Power and Energy Systems. Eng. Appl. Artif. Intell. 2025, 143, 110091.
  46. Ma, S.; Flanigan, K.A.; Bergés, M. State-of-the-Art Review: The Use of Digital Twins to Support Artificial Intelligence-Guided Predictive Maintenance. arXiv 2024, arXiv:2406.13117.
  47. Li, W.; Li, T. Comparison of Deep Learning Models for Predictive Maintenance in Industrial Manufacturing Systems Using Sensor Data. Sci. Rep. 2025, 15, 23545.
  48. Sensoy, M.; Kaplan, L.; Kandemir, M. Evidential Deep Learning to Quantify Classification Uncertainty. In Proceedings of the NeurIPS 2018, Montréal, QC, Canada, 2–8 December 2018.
  49. Malhotra, P.; Vig, L.; Shroff, G.; Agarwal, P. Long Short Term Memory Networks for Anomaly Detection in Time Series. In Proceedings of the ESANN 2015, Bruges, Belgium, 22–24 April 2015.
  50. Lei, Y.; Li, N.; Guo, L.; Li, N.; Yan, T.; Lin, J. Machinery Health Prognostics: A Systematic Review from Data Acquisition to RUL Prediction. Mech. Syst. Signal Process. 2018, 104, 799–834.
  51. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why Should I Trust You?”: Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
  52. Lu, Y.; Liu, C.; Wang, K.I.-K.; Huang, H.; Xu, X. Digital Twin-Driven Smart Manufacturing: Connotation, Reference Model, Applications and Research Issues. Robot. Comput.-Integr. Manuf. 2020, 61, 101837.
  53. Zhuang, L.; Xu, A.; Wang, X.-L. A Prognostic Driven Predictive Maintenance Framework Based on Bayesian Deep Learning. Reliab. Eng. Syst. Saf. 2023, 234, 109181.
  54. Chen, C.; Fu, H.; Zheng, Y.; Tao, F.; Liu, Y. The Advance of Digital Twin for Predictive Maintenance: The Role and Function of Machine Learning. J. Manuf. Syst. 2023, 71, 581–594.
Figure 1. Five-layer Digital Twin-enabled predictive maintenance architecture integrating physical sensing, semantic interoperability, AI analytics, and decision-support functions for substation maintenance.
Figure 2. End-to-end workflow of the proposed DT–PdM framework for SS1, illustrating the closed-loop process from OT sensing and standards-based interoperability to AI inference, cybersecurity governance, and maintenance decision execution.
Figure 3. Interoperability architecture aligning IEC 61850 substation models with CIM and OPC UA TSN to enable standardized, low-latency data exchange and Digital Twin synchronization across OT–IT domains.
Figure 4. AI inference pipeline illustrating stacked-ensemble fusion, uncertainty-aware inference, and concept-drift monitoring for fault classification and remaining useful life estimation [26,27,31,32].
Figure 5. Digital Twin-enabled decision and service orchestration pipeline showing the closed-loop interaction between AI predictions, maintenance actions, and operator feedback.
Figure 6. End-to-end workflow illustrating continuous data acquisition, analytics, feedback, and learning between physical substation assets and the Digital Twin environment.
Figure 7. Mapping of the proposed Digital Twin-enabled predictive maintenance framework to international interoperability and cybersecurity standards.
Figure 8. Single-line diagram (SLD) of the SS1 substation (Train-B), showing the GTG-B generation path, power transformer, MV bus, and LV feeder structure used for monitoring and Digital Twin synchronization.
Figure 9. Receiver Operating Characteristic (ROC) curves comparing classification performance of the evaluated models, highlighting the superior separability achieved by the DNN and stacked ensemble.
Figure 10. Precision–recall curves comparing fault detection performance of the evaluated models under class-imbalance conditions, highlighting the operational advantage of the stacked ensemble.
Figure 11. Confusion-matrix-based classification performance of the evaluated predictive models: (a) Random Forest, (b) Gradient Boosting Machine, (c) Support Vector Machine, (d) Deep Neural Network, and (e) stacked ensemble. Each subfigure illustrates class-level prediction outcomes (true positives, false positives, true negatives, and false negatives) for fault detection under imbalanced operating conditions.
Figure 12. Feature-importance comparison of the evaluated tree-based models, (a) Random Forest and (b) Gradient Boosting Machine, highlighting the dominant explanatory predictors (thermal, vibration, and imbalance-related indicators) used for fault prediction.
Figure 13. Comparative performance visualization of predictive models, including accuracy and precision metrics, AUC–ROC values, heat-map evaluation, and radar chart representation.
Figure 14. Operational performance visualization of the stacked ensemble embedded within the Digital Twin core, including classification metrics and probabilistic prediction outputs. The dashed diagonal line in the ROC curve denotes the no-discrimination baseline (random classifier, AUC = 0.5) for reference.
Figure 15. DNN architecture (four hidden layers with ReLU activations and dropout regularization) used for binary fault classification and RUL estimation.
Figure 16. Training convergence behavior of the DNN model, showing the evolution of loss, accuracy, and AUC over training epochs.
Table 1. Benchmark comparison of representative AI- and Digital Twin-based predictive maintenance frameworks (2023–2025), highlighting interoperability, cybersecurity integration, and deployment readiness relative to the proposed architecture.
| Study/Framework | AI Model | DT Integration | Interoperability | Cybersecurity | Real-Time Adaptivity |
| --- | --- | --- | --- | --- | --- |
| [5] | CNN-LSTM | Partial | | | Moderate |
| [22] | Bayesian DL | | | | Moderate |
| [40] | DT (Hierarchical) | Partial | | | Moderate |
| [33] | DT + AI | Partial | | | Moderate |
| [41] | Transformer + RL | | | | High |
| [19,20,21,42] | Transformer/LSTM–GRU variants; Bayesian DL; Evidential DL | Often proposed/partial (study-dependent) | Typically limited/not standards-centered | Rarely integrated explicitly | Possible but higher computational overhead |
| Proposed Work | RF + GBM + SVM + DNN + Stacked Ensemble | ✓✓ | ✓✓ (CIM + OPC UA) | ✓✓ (CIP-015 + IEC 62443) | Operationally adaptive (human-in-the-loop advisory) |
Benchmark comparison of recent AI- and Digital Twin-based predictive maintenance frameworks, focusing on interoperability, cybersecurity integration, and deployment adaptability for substation applications. Representative recent approaches include transformer-based time-series models and Bayesian/evidential deep learning for uncertainty-aware prognostics [19,20,21,42]. Physics-informed neural networks (PINNs), graph-based learning, and reinforcement learning are not implemented in the present study and are considered future research directions in the Future Work Section.
Table 2. Quantitative performance comparison of predictive models for fault classification and remaining useful life estimation.
| Model | Accuracy (%) | Precision | Recall | F1-Score | AUC |
| --- | --- | --- | --- | --- | --- |
| RF | 94.7 | 0.95 | 0.93 | 0.94 | 0.97 |
| GBM | 95.2 | 0.96 | 0.94 | 0.95 | 0.98 |
| SVM | 92.1 | 0.90 | 0.92 | 0.91 | 0.95 |
| DNN | 96.3 | 0.97 | 0.96 | 0.96 | 0.99 |
| Ensemble | 97.5 | 0.98 | 0.97 | 0.98 | 0.995 |
All evaluation metrics—accuracy, precision, recall, F1-score, and AUC—are defined formally in Appendix C.
Table 3. Compact summary of the DNN architecture and stacked ensemble configuration used in this study.
| Component | Setting |
| --- | --- |
| Input | Multivariate feature vector (post-pre-processing) |
| Hidden layers | 4 fully connected layers |
| Activation | ReLU |
| Regularization | Dropout = 0.2 |
| Classification output | Sigmoid |
| Classification loss | Binary cross-entropy |
| Regression output (RUL) | Linear |
| Regression loss | Mean squared error (MSE) |
| Optimizer | Adam (see Appendix C.3 for tuned values) |
| Ensemble type | Stacked ensemble |
| Base learners | Random Forest, Gradient Boosting Machine, and Deep Neural Network |
| Meta-learner | Stacking classifier trained on base-model outputs (Scikit-Learn) |
Table 4. Example evolution of the composite maintenance decision score $\psi_t$ for a representative SS1 event (illustrative).
| Time (Relative to Event) | Normalized Fault Probability $\hat{p}_t$ | Normalized RUL Risk $r_t$ | Cyber-Trust $\eta_c(t)$ | Composite Score $\psi_t$ | Action |
| --- | --- | --- | --- | --- | --- |
| t − 6 h | 0.18 | 0.22 | 0.95 | 0.24 | Monitor |
| t − 3 h | 0.35 | 0.40 | 0.92 | 0.41 | Inspect |
| t − 1 h | 0.62 | 0.65 | 0.90 | 0.67 | Prepare crew/spares |
| t − 30 min | 0.78 | 0.82 | 0.88 | 0.81 | Schedule intervention |
| t (fault onset) | 0.93 | 0.95 | 0.85 | 0.92 | Execute |
| t + | 0.40 | 0.30 | 0.97 | 0.36 | Close-out |
Table 5. Approximate training and inference runtime of the evaluated models under the finalized implementation.
| Model | Training Time | Inference Time | Notes |
| --- | --- | --- | --- |
| RF | 2–3 min | 5–15 ms/record | per record |
| GBM | 3–4 min | 8–25 ms/record | per record |
| SVM | 5–8 min | 15–30 ms/record | per record |
| DNN | 10.8 s/epoch; 9 min total | 25–35 ms/record | 50 epochs (early stopping) |
| Stacked Ensemble | 3–5 min total | 80–120 ms/record | includes meta-model |
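To relate the per-record latencies in Table 5 to the 60 s batch-inference cycle stated in Appendix D.2.1, the following Python sketch estimates how many records each model could score within one cycle. It assumes strictly sequential, single-threaded scoring with no I/O or transport overhead, which is a simplification not taken from the source.

```python
CYCLE_MS = 60_000  # 60 s batch-inference cycle (Appendix D.2.1)

# Per-record inference latency ranges in ms (low, high), copied from Table 5.
LATENCY_MS = {
    "RF": (5, 15),
    "GBM": (8, 25),
    "SVM": (15, 30),
    "DNN": (25, 35),
    "Stacked Ensemble": (80, 120),
}


def records_per_cycle(latency_ms: int, cycle_ms: int = CYCLE_MS) -> int:
    """Whole records that fit in one cycle when scored one after another."""
    return cycle_ms // latency_ms


# For each model: (records at the slowest latency, records at the fastest).
budget = {
    model: (records_per_cycle(high), records_per_cycle(low))
    for model, (low, high) in LATENCY_MS.items()
}

for model, (worst, best) in budget.items():
    print(f"{model}: {worst} to {best} records per 60 s cycle")
```

Under these assumptions, even the slowest configuration leaves room for several hundred records per cycle.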
Table 6. Representative confusion patterns observed in the evaluated models and their physical interpretation.
| Model | True Positives (Fault → Fault) | False Negatives (Fault → Normal) | True Negatives (Normal → Normal) | False Positives (Normal → Fault) | Physical/Operational Interpretation |
| --- | --- | --- | --- | --- | --- |
| RF | 166,820 | 1119 | 31,540 | 541 | Robust detection of sustained fault patterns dominated by thermal and imbalance indicators; limited FN cases arise during early-stage or weakly expressed degradation signatures. |
| GBM | 167,102 | 837 | 31,612 | 469 | Improved sensitivity to monotonic degradation trends through boosted decision boundaries; residual FN events mainly linked to transient operating regimes. |
| SVM | 165,684 | 2255 | 31,418 | 663 | Margin-based separation struggles under non-stationary conditions and overlapping feature distributions, leading to elevated FN rates in borderline fault scenarios. |
| DNN | 166,998 | 941 | 31,684 | 397 | Nonlinear feature learning reduces FN compared to classical models; remaining misses correspond to low-amplitude or short-duration precursors at the SCADA resolution. |
| Stacked Ensemble | 167,939 | 0 | 31,721 | 360 | Heterogeneous model fusion eliminates false negatives, ensuring reliable capture of diverse fault manifestations under operational variability. |
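The counts in Table 6 can be turned into class-level metrics with the standard confusion-matrix definitions, as in the Python sketch below. It is an illustration based only on the Table 6 counts and is not meant to reproduce the rounded figures in Table 2; the function and variable names are hypothetical.

```python
# (TP, FN, TN, FP) per model, copied from Table 6.
CONFUSION = {
    "RF": (166_820, 1_119, 31_540, 541),
    "GBM": (167_102, 837, 31_612, 469),
    "SVM": (165_684, 2_255, 31_418, 663),
    "DNN": (166_998, 941, 31_684, 397),
    "Stacked Ensemble": (167_939, 0, 31_721, 360),
}


def metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


results = {model: metrics(*counts) for model, counts in CONFUSION.items()}

for model, values in results.items():
    summary = ", ".join(f"{name}={value:.4f}" for name, value in values.items())
    print(f"{model}: {summary}")
```

Each row of Table 6 sums to the same number of evaluated records, which gives a quick consistency check on the extracted counts.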
Table 7. Standards compliance matrix for the proposed Digital Twin-enabled predictive maintenance architecture.
| Architecture Layer | Interoperability Standard(s) | Cybersecurity Standard(s) | Compliance Purpose |
| --- | --- | --- | --- |
| Physical Sensing and Acquisition | IEC 61850 | CIP-015-1 | IED interoperability and secure SCADA equipment monitoring |
| Semantic Interoperability | CIM 61970/61968 | IEC 62443 SL-1/SL-2 | Unified asset representation and secure model exchange |
| Digital Twin Core | OPC UA Part 17 | IEC 62443 SL-3 | Trusted state synchronization and secure PubSub transport |
| AI and Analytics Layer | | Ethical and Security-aware AI (Evidential DL, SRL) | Robust prediction under cyber risk and uncertainty |
| Services and Decision Layer | CIM, IEC 61850 MMS | CIP-015-1 | Operator trust and auditable maintenance actions |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alabbad, S.; Altınkaya, H. An Intelligent Predictive Maintenance Architecture for Substation Automation: Real-World Validation of a Digital Twin and AI Framework of the Badra Oil Field Project. Electronics 2026, 15, 416. https://doi.org/10.3390/electronics15020416
