Article

Multi-Scale Temporal Learning with EEMD Reconstruction for Non-Stationary Error Forecasting in Current Transformers

1
Power Supply Service Management Centre of State Grid Jiangxi Electric Power Co., Ltd., Nanchang 330012, China
2
College of Electrical Engineering and New Energy, China Three Gorges University, Yichang 443002, China
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(2), 325; https://doi.org/10.3390/electronics15020325
Submission received: 29 October 2025 / Revised: 12 December 2025 / Accepted: 8 January 2026 / Published: 11 January 2026

Abstract

Current transformer measurement errors exhibit strong non-stationarity and multi-scale temporal dynamics, which make accurate prediction challenging for conventional deep learning models. This paper presents a hybrid signal processing and temporal learning framework that integrates ensemble empirical mode decomposition (EEMD) with a dual-scale temporal convolutional architecture. EEMD adaptively decomposes the error sequence into intrinsic mode functions, while a Pearson correlation-based selection step removes redundant and noise-dominated components. The refined signal is then processed by a dual-scale temporal convolutional network (TCN) designed with parallel dilated kernels to capture both high-frequency transients and long-range drift patterns. Experimental evaluations on 110 kV substation data confirm that the proposed decomposition-enhanced dual-scale temporal convolutional framework significantly improves generalization and robustness, reducing the root mean square error by 40.9% and the mean absolute error by 37.0% compared with benchmark models. The results demonstrate that combining decomposition-based preprocessing with multi-scale temporal learning effectively enhances the accuracy and stability of non-stationary current transformer error forecasting.

1. Introduction

Current transformers (CTs) are fundamental to current measurement in power systems: they convert primary current into a proportional secondary signal and deliver it to monitoring and control devices, thereby supporting grid dispatch and operational decision-making [1,2,3,4]. During long-term operation, however, CT performance is influenced by variations in temperature and humidity, electromagnetic interference, and equipment aging [5,6]. These factors degrade measurement accuracy and introduce ratio error drift that can compromise system monitoring and control. To ensure long-term reliability, periodic calibration remains necessary [7].
Conventional calibration is typically performed offline by disconnecting the transformer from the grid for comparative measurement [8]. While mature and widely adopted in practice, this process requires planned outages, specialized personnel, and dedicated instrumentation, leading to high operational costs and low efficiency. To mitigate these limitations, non-invasive online calibration schemes have been proposed to continuously track CT accuracy without interruption [9,10,11,12]. Nevertheless, practical challenges persist, including robust interference suppression under high-voltage conditions and the difficulty of maintaining long-horizon error traceability [13], which collectively hinder large-scale deployment.
In recent years, machine learning has been increasingly adopted in power-system applications. In particular, deep learning architectures such as convolutional neural networks (CNNs) [14] and long short-term memory (LSTM) networks [15,16] have achieved strong results in time-series forecasting owing to their ability to adaptively extract features and perform nonlinear mapping, and they have been progressively applied to power-system measurement. For example, Wang et al. [17] employed bidirectional LSTM (BiLSTM) to predict CT measurement errors, verifying the feasibility of deep-learning-based prediction; Zhang et al. [18] utilized gated recurrent unit (GRU) networks and multi-task learning to model and predict ratio differences.
However, despite these advances, several important gaps remain in the literature on CT ratio-error prediction. First, most existing studies are still centered on threshold-based assessment or coarse accuracy classes and do not explicitly track the multi-scale evolution of ratio errors under realistic operating constraints. Second, many deep-learning approaches operate directly on raw, non-stationary error series without a tailored decomposition stage, which makes them sensitive to multi-source noise and limits their robustness and interpretability under field interference. Third, there is still a lack of hybrid frameworks that are explicitly designed for online CT calibration scenarios and whose benefits are validated using rigorous statistical tests beyond pointwise error metrics. Related data-driven modeling efforts in other power-system domains further highlight the need for models that can exploit complex temporal structure while respecting operational constraints [19,20].
Measurement errors in CTs exhibit both short-term fluctuations (caused by transient loads or noise) and long-term drifts (caused by equipment aging or seasonal environmental variations). Standard CNNs, with fixed receptive fields, often struggle to capture both characteristics simultaneously. Although recurrent models such as LSTM can, in principle, capture long-term dependencies, they commonly suffer from vanishing gradients and training instability, which limit their ability to learn short-term variations concurrently. Furthermore, threshold-based evaluation cannot effectively describe the evolutionary trend of ratio errors.
Raw error data are typically non-stationary and contaminated by multi-source noise. Most existing models operate directly on raw signals and lack dedicated mechanisms to separate intrinsic error patterns from stochastic interference. This deficiency reduces their generalization capability and limits prediction accuracy. Prediction reliability is also highly sensitive to data quality. Under complex and noisy field conditions, multiple interference factors degrade data integrity, weakening the robustness and stability of existing predictive models.
To address these challenges, this work adopts a signal–learning paradigm: a decomposition-driven preprocessing stage improves feature separability, and a multi-scale temporal model exploits the cleaned structure. Specifically, we integrate Ensemble Empirical Mode Decomposition (EEMD) with a Dual-Scale Temporal Convolutional Network (DTCN). EEMD adaptively decomposes the ratio-error sequence into intrinsic mode functions (IMFs); a Pearson-correlation screening removes low-relevance, noise-dominated modes, yielding a reconstructed input with higher signal-to-noise ratio. The DTCN then employs heterogeneous receptive fields to jointly encode high-frequency transients and long-range drift in a causal convolutional framework.
The main contributions of this paper are summarized as follows:
  • Decomposition-driven learnability. We propose an EEMD-based preprocessing paradigm that decomposes non-stationary ratio-error sequences into intrinsic mode functions and reconstructs a surrogate target signal by correlation screening, thereby improving feature regularity and learnability for downstream prediction.
  • Correlation-guided reconstruction. We introduce a Pearson-correlation-based screening strategy that suppresses noise-dominated modes and retains informative components, functioning as a data-driven filter that enhances robustness under complex field interference.
  • Dual-scale temporal modeling. We design a parallel DTCN with heterogeneous dilation that jointly captures short-term transients and long-term drift within a unified causal forecaster, explicitly addressing the multi-scale temporal evolution of CT ratio errors.
According to IEC requirements, a class 0.2 current transformer must maintain ratio error within ±0.2% under rated conditions, so accurate prediction is critical for early warning and condition-based maintenance. Evaluations on 110 kV substation data show that the proposed EEMD–DTCN framework substantially improves generalization and robustness compared with strong baselines, yielding lower prediction errors and higher overall goodness of fit, thus providing a practical pathway toward non-intrusive online calibration.

2. Model Framework

2.1. Measurement Error of Current Transformers

Current transformers convert primary current signals from the power grid into secondary digital signals. During operation, various factors affect the transformer, causing discrepancies between the secondary measurement data and the actual primary signals. This can be mathematically represented as [21,22]:
$$\varepsilon_r\,[\%] = \frac{K_r I_2 - I_1}{I_1} \times 100\%$$
where $\varepsilon_r$ denotes the ratio error; $K_r$ is the rated transformation ratio of the current transformer; $I_1$ is the actual primary current, and $I_2$ is the actual secondary current. Ideally, $K_r I_2 = I_1$ (i.e., the ratio is constant). However, over time, especially after long-term operation, accumulated errors may degrade measurement accuracy, leading to a nonzero ratio error.
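As a quick numerical illustration of the ratio-error definition, the following minimal sketch evaluates the formula for a single reading. The function name and the 600/5 A example values are hypothetical and are not taken from the case study.

```python
def ratio_error_percent(k_r: float, i_primary: float, i_secondary: float) -> float:
    """Ratio error in percent: (K_r * I_2 - I_1) / I_1 * 100."""
    if i_primary == 0:
        raise ValueError("primary current must be non-zero")
    return (k_r * i_secondary - i_primary) / i_primary * 100.0


# Hypothetical 600/5 A transformer (K_r = 120) reading 4.995 A at 600 A primary.
err = ratio_error_percent(k_r=120.0, i_primary=600.0, i_secondary=4.995)

# Compare against the +/-0.2 % limit quoted for class 0.2 transformers.
within_class_02 = abs(err) <= 0.2
```

A negative value means the scaled secondary current under-reads the primary current.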
We consider the ratio error time series $y(t)$ and the $N$-dimensional feature vector $x_t \in \mathbb{R}^N$ at time $t$. The input matrix for $T$ historical moments is given by
$$X = [x_1, x_2, \ldots, x_T] \in \mathbb{R}^{N \times T}, \quad x_t = \left[x_t^{(1)}, x_t^{(2)}, \ldots, x_t^{(N)}\right]^{\top},$$
where $(X)_{i,t} = x_t^{(i)}$ for $i = 1, \ldots, N$ and $t = 1, \ldots, T$; representative components of $x_t$ include ambient temperature (°C), relative humidity (%), magnetic field strength (A/m), and electric field phase (°). We write $y_{1:T} = \{y(1), \ldots, y(T)\}$ and $x_{1:T} = \{x_1, \ldots, x_T\}$, so that one-step-ahead prediction is expressed as a mapping $F$ from the observed history to the next ratio-error value:
$$y_{T+1} = F\left(y_{1:T}, x_{1:T}\right).$$
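The one-step-ahead formulation can be made concrete by slicing the aligned series into supervised samples. The sketch below is illustrative only: the helper name `make_windows` and the toy values are assumptions, and the features are stored time-major (one row per time step) rather than as the $N \times T$ matrix used in the text.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], List[List[float]], float]


def make_windows(
    y: Sequence[float], x: Sequence[Sequence[float]], T: int
) -> List[Sample]:
    """Return (y_{1:T}, x_{1:T}, y_{T+1}) triples for every window position."""
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    samples: List[Sample] = []
    for start in range(len(y) - T):
        past_y = list(y[start:start + T])                   # ratio-error history
        past_x = [list(row) for row in x[start:start + T]]  # feature history
        target = y[start + T]                               # next value to predict
        samples.append((past_y, past_x, target))
    return samples


# Toy data: five time steps, two features per step (e.g. temperature, humidity).
y = [0.10, 0.12, 0.11, 0.13, 0.15]
x = [[20.0, 55.0], [21.0, 54.0], [22.0, 53.0], [23.0, 52.0], [24.0, 51.0]]
samples = make_windows(y, x, T=3)
```

Each target lies strictly after its input window, so no future value enters the inputs.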

2.2. Model Process

As shown in Figure 1, the proposed multi-scale forecasting framework comprises four modules: (i) signal decomposition and screening, (ii) dual-scale feature extraction, (iii) feature fusion and prediction, and (iv) training and hyperparameter tuning. The design targets non-linear, non-stationary ratio-error series and preserves strict no-leakage handling under walk-forward evaluation.
  • Decomposition and screening (preprocessing). The original ratio-error sequence is decomposed by Ensemble Empirical Mode Decomposition (EEMD) into intrinsic mode functions (IMFs) plus a residual (trend) component. A Pearson-correlation-based screening then reconstructs a learnable surrogate $\tilde{y}(t)$, regularizing non-stationarity and suppressing noise-dominated content.
  • Dual-scale temporal convolutional network (feature extraction). The Dual-Scale Temporal Convolutional Network (DTCN) processes $\tilde{y}(t)$ through two complementary branches. Each residual block applies causal, dilated convolutions: the local branch uses smaller kernels and lower dilation to capture short-lived transients, while the global branch uses larger kernels and higher dilation to encode long-range drifts. This heterogeneous receptive-field design enables synchronous modeling of multi-time-scale dependencies.
  • Feature fusion and one-step-ahead prediction. Branch outputs are fused (concatenation followed by a 1 × 1 convolution and normalization) to form a joint representation passed to fully connected layers for one-step-ahead ratio-error prediction. Training minimizes mean squared error (MSE) with $L_2$ regularization and dropout to enhance generalization and numerical stability.
  • Training protocol and hyperparameter tuning. All preprocessing statistics and screening thresholds are estimated on training folds only and applied unchanged to test folds (walk-forward evaluation). Hyperparameters—dilation schedule, kernel size, channel width, dropout rate, learning rate, and the Pearson-correlation threshold—are selected via grid search on the rolling origin to avoid overfitting and underfitting.
In summary, coupling EEMD-based reconstruction with a dual-scale temporal architecture yields cleaner inputs and scale-aware representations, thereby improving the accuracy and robustness of ratio-error forecasting for current transformers, and supporting real-time monitoring and fault diagnosis in power systems.
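The walk-forward (rolling-origin) handling mentioned in the module list can be sketched as an index generator. This is a minimal illustration under assumed settings: an expanding training window and a fixed test block size, neither of which is specified in the text, and the function name is hypothetical.

```python
from typing import Iterator, List, Tuple


def walk_forward_folds(
    n_samples: int, initial_train: int, test_size: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs with an expanding training window."""
    origin = initial_train
    while origin + test_size <= n_samples:
        train_idx = list(range(0, origin))                  # everything before the origin
        test_idx = list(range(origin, origin + test_size))  # the next block only
        yield train_idx, test_idx
        origin += test_size                                 # roll the origin forward


folds = list(walk_forward_folds(n_samples=10, initial_train=4, test_size=2))
```

Preprocessing statistics and screening thresholds would be fitted on `train_idx` only and then applied unchanged to `test_idx`.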

3. Data Processing

3.1. Decomposition Objective and Setup

The preprocessing front end aims to (i) regularize the non-stationary ratio error series into narrowband components that expose scale-specific regularities, and (ii) suppress noise-dominated content prior to learning. EEMD is adopted to obtain intrinsic mode functions (IMFs) and a residual trend component; the components are then screened by the Pearson correlation coefficient and recombined to form a learnable surrogate sequence $\tilde{y}(t)$ for downstream modeling.

3.2. Ensemble Empirical Mode Decomposition (EEMD)

EEMD [23,24] extends EMD by adding independent white-noise realizations and averaging the ensuing decompositions to mitigate mode mixing. For the original series $y(t)$ and $N_{\mathrm{ens}}$ noise realizations $\{\epsilon_i(t)\}$,
$$y_{\mathrm{noise}}^{(i)}(t) = y(t) + \epsilon_i(t), \quad i = 1, \ldots, N_{\mathrm{ens}},$$
$$\left\{\mathrm{IMF}_k^{(i)}(t)\right\}_{k=1}^{K_i} \leftarrow \mathrm{EMD}\left(y_{\mathrm{noise}}^{(i)}(t)\right),$$
$$\mathrm{IMF}_k(t) = \frac{1}{N_{\mathrm{ens}}} \sum_{i=1}^{N_{\mathrm{ens}}} \mathrm{IMF}_k^{(i)}(t),$$
$$\mathrm{Residue}(t) = y(t) - \sum_{k=1}^{K} \mathrm{IMF}_k(t),$$
where $K$ denotes the number of aligned IMFs after averaging. The noise amplitude is set proportional to the standard deviation of $y(t)$; $N_{\mathrm{ens}}$ and stopping criteria follow the experimental setup to ensure reproducibility. Visual diagnostics in Figure 2 reveal hierarchical time–frequency separation supportive of multiscale modeling; reconstruction and residual checks are shown in Figure 3 and Figure 4.
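The ensemble-averaging step can be sketched separately from the sifting procedure itself. The code below is a simplified illustration and not the implementation used in the study: the `emd` argument is a placeholder for any EMD routine that returns a fixed number of modes, `toy_emd` is a crude moving-average splitter included only so the example runs, and the values `n_ens=20` and `noise_scale=0.2` are assumptions.

```python
import math
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Decomposer = Callable[[Sequence[float]], List[List[float]]]


def eemd_average(
    y: Sequence[float], emd: Decomposer, n_ens: int = 50,
    noise_scale: float = 0.2, seed: int = 0,
) -> Tuple[List[List[float]], List[float]]:
    """Average the modes of n_ens noise-perturbed copies of y, then form the residue."""
    rng = random.Random(seed)
    sigma = noise_scale * statistics.pstdev(y)   # noise amplitude proportional to std(y)
    sums: List[List[float]] = []
    for _ in range(n_ens):
        noisy = [v + rng.gauss(0.0, sigma) for v in y]
        modes = emd(noisy)
        if not sums:
            sums = [[0.0] * len(y) for _ in modes]
        if len(modes) != len(sums):
            raise ValueError("this sketch assumes the same mode count in every trial")
        for k, mode in enumerate(modes):
            for t, v in enumerate(mode):
                sums[k][t] += v
    imfs = [[s / n_ens for s in row] for row in sums]
    residue = [y[t] - sum(m[t] for m in imfs) for t in range(len(y))]
    return imfs, residue


def toy_emd(signal: Sequence[float]) -> List[List[float]]:
    """Stand-in for EMD (NOT real sifting): one 'mode' = signal minus a 3-point mean."""
    n = len(signal)
    smooth = []
    for t in range(n):
        lo, hi = max(0, t - 1), min(n, t + 2)
        smooth.append(sum(signal[lo:hi]) / (hi - lo))
    return [[s - m for s, m in zip(signal, smooth)]]


y = [math.sin(0.3 * t) + 0.1 * t for t in range(40)]
imfs, residue = eemd_average(y, toy_emd, n_ens=20)
```

Because the residue is defined as the original series minus the averaged modes, the modes and the residue sum back to $y(t)$ by construction. A real EMD routine may return a different number of modes per trial, which is why the text refers to aligned IMFs.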

3.3. Correlation-Guided Component Screening and Reconstruction

To enhance learnability and suppress noise-dominated modes, the linear association between each IMF and the original signal is quantified via the Pearson correlation coefficient [25,26],
$$\rho_k = \frac{\sum_{t=1}^{T} \left(y(t) - \bar{y}\right)\left(\mathrm{IMF}_k(t) - \overline{\mathrm{IMF}_k}\right)}{\sqrt{\sum_{t=1}^{T} \left(y(t) - \bar{y}\right)^2}\,\sqrt{\sum_{t=1}^{T} \left(\mathrm{IMF}_k(t) - \overline{\mathrm{IMF}_k}\right)^2}},$$
and indices $S = \{k : |\rho_k| \ge \theta\}$ are retained under an empirically chosen threshold $\theta = 0.2$. In this study, $\theta$ is treated as an empirical hyperparameter: a value around 0.2 corresponds to a weak but non-negligible linear association and is selected as a practical trade-off that prioritizes the suppression of weakly correlated, noise-dominated modes while still preserving informative components. The surrogate sequence is reconstructed as
$$\tilde{y}(t) = \mathrm{Residue}(t) + \sum_{k \in S} \mathrm{IMF}_k(t).$$
The heatmap in Figure 5 provides a diagnostic view of component relevance and supports the selection used in the case study.
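The screening and reconstruction steps can be illustrated on synthetic components. In this sketch the component shapes (a slow sine, a small alternating term, and a linear trend) are invented for demonstration, the function names are hypothetical, and treating a constant component as uncorrelated is an added convention rather than something stated in the text.

```python
import math
from typing import List, Sequence, Tuple


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumption: a constant component is treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def screen_and_reconstruct(
    y: Sequence[float], imfs: Sequence[Sequence[float]],
    residue: Sequence[float], theta: float = 0.2,
) -> Tuple[List[int], List[float]]:
    """Keep IMFs with |rho_k| >= theta and add them back to the residue."""
    kept = [k for k, imf in enumerate(imfs) if abs(pearson(y, imf)) >= theta]
    surrogate = [residue[t] + sum(imfs[k][t] for k in kept) for t in range(len(y))]
    return kept, surrogate


# Synthetic components (illustrative only, not EEMD output).
n = 40
imf_slow = [math.sin(2 * math.pi * t / 20) for t in range(n)]
imf_noise = [0.01 * (-1) ** t for t in range(n)]
trend = [0.05 * t for t in range(n)]
y = [a + b + c for a, b, c in zip(imf_slow, imf_noise, trend)]

kept, surrogate = screen_and_reconstruct(y, [imf_slow, imf_noise], trend)
```

Here the small alternating component is only weakly correlated with the full signal and is dropped, so the surrogate equals the original series minus that component.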

3.4. Implementation Details and Reproducibility

This subsection summarizes key implementation details to facilitate reproducibility of the proposed preprocessing pipeline. To avoid data leakage, all normalization statistics (e.g., mean and standard deviation) are computed exclusively on the training folds and then applied unchanged to the validation and test segments. Ensemble empirical mode decomposition (EEMD), correlation screening, and reconstruction are performed independently within each fold of the walk-forward (rolling-origin) evaluation protocol so that the decomposition remains consistent with the temporal structure of the data.
The main hyperparameters of the decomposition stage, namely the correlation threshold $\theta$, the ensemble size $N_{\mathrm{ens}}$, and the noise scale, are treated as empirically chosen hyperparameters: they are tuned once based on exploratory analysis on the training data and subsequently kept fixed for all evaluations to match the case-study setup. Two practical guidelines are recommended when potential failure modes are observed. If mode mixing occurs, increasing $N_{\mathrm{ens}}$ or slightly adjusting the noise amplitude can mitigate this issue. If the set of relevant IMFs appears to drift across folds, $\theta$ should remain fixed and the index set $S$ in (9) should be updated per fold in a data-driven manner.

3.5. Takeaway

Decomposition regularizes the input and exposes scale-specific patterns, while correlation-guided reconstruction filters noise-dominated content; the resulting surrogate $\tilde{y}(t)$ provides a cleaner, more structured target for the dual-scale predictor.

4. Multi-Scale Bidirectional Temporal Convolutional Network Architecture

4.1. TCN Model

The core architecture of this study is based on a modified Temporal Convolutional Network (TCN) [27,28]. Compared with recurrent architectures, TCNs are particularly suitable for CT ratio-error forecasting for two reasons. First, they rely on convolutional operations that can be parallelized along the temporal dimension. Second, by employing dilated convolutions, TCNs can enlarge their receptive fields exponentially with depth, enabling the model to capture long-range temporal dependencies in CT ratio-error sequences while maintaining stable gradients. This property is crucial because CT errors exhibit both slow drifts over long horizons and short-term fluctuations.
The TCN consists of cascaded residual modules, each with three components: (1) causal convolution layers for causality; (2) dilated convolution layers to expand the receptive field; and (3) gated residual connections for efficient feature transfer. The architecture ensures each layer aligns with the input, capturing dependencies at different time steps with less complexity than recurrent networks. The dilated convolution process $c = F(\tilde{y})$ is expressed as
$$c = F(\tilde{y}) = (\tilde{y} *_d f)(s) = \sum_{i=0}^{k-1} f(i)\, \tilde{y}_{s - d \cdot i},$$
where $d$ is the dilation coefficient and $k$ is the convolution window size. By setting $d = 2^n$, the network can capture dependencies up to $2^n$ time steps; $\tilde{y}$ denotes the preprocessed ratio-error input after EEMD and Pearson-correlation-based reconstruction. TCN offers advantages over traditional recurrent networks by avoiding explicit recurrent dependencies across time steps through parallel convolution and by using residual gating to combat the vanishing gradient problem.
To prevent information from gradually fading in deep networks, residual connections pass both the layer input $x$ and the transformed information $F(x)$ to the next layer. The ReLU activation function is applied for nonlinear transformation, effectively preserving informative features. The resulting forward feature $h$ of the network is calculated as:
$$h = \sigma(W_1 c + b),$$
where $W_1$ is the weight matrix, $b$ is the bias value, and $\sigma(\cdot)$ is the ReLU activation function. The residual module combines input information with output from causal convolutions, facilitating cross-layer feature transfer. The dilated convolution modules in the TCN ensure a larger receptive field with fewer layers and capture sequential features effectively.
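The dilated causal convolution can be written directly from its summation form. The following is a minimal single-channel sketch with a hand-picked filter, assuming zero padding on the left; it illustrates the operation only and is not the network implementation used in the study.

```python
from typing import List, Sequence


def causal_dilated_conv(x: Sequence[float], f: Sequence[float], d: int) -> List[float]:
    """c[s] = sum_{i=0}^{k-1} f[i] * x[s - d*i], with zeros before the series start."""
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, w in enumerate(f):
            j = s - d * i          # only current and past samples are touched
            if j >= 0:
                acc += w * x[j]
        out.append(acc)
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Kernel size k = 2 with dilation d = 2: each output is x[s] - x[s - 2].
c = causal_dilated_conv(x, f=[1.0, -1.0], d=2)

# Causality check: altering future samples leaves earlier outputs unchanged.
x_future_changed = x[:3] + [100.0, 100.0, 100.0]
c_changed = causal_dilated_conv(x_future_changed, f=[1.0, -1.0], d=2)
```

With this filter the output is `[1.0, 2.0, 2.0, 2.0, 2.0, 2.0]`: the first two entries see only zero padding, and the rest are lag-2 differences.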

4.2. Multi-Scale Architecture

Although a standard (forward) TCN extracts temporal features through causal convolutions, a single-scale configuration may still be suboptimal for non-stationary CT ratio-error series. In practice, CT errors are driven by at least two distinct temporal behaviors: (i) short-lived transients and high-frequency fluctuations caused by abrupt load changes, switching events, or measurement noise; and (ii) low-frequency trends and long-term drifts induced by equipment aging and seasonal environmental variations. A single receptive field makes it difficult to represent both types of dynamics equally well within one convolutional stack, which can weaken cross-step dependency modeling, hinder the capture of rapidly evolving dynamics, and ultimately reduce predictive accuracy.
To explicitly address this multi-scale nature, we introduce a dual-scale temporal mechanism that pairs heterogeneous kernel sizes and dilation factors in two complementary branches. The local branch is designed to focus on fine-grained, short-range patterns, whereas the global branch emphasizes long-range dependencies and slow-varying trends. By integrating both local and global context, the model jointly attends to sequence details and long-horizon trends, thereby enhancing robustness and accuracy for non-stationary ratio-error forecasting.
Let the processed input be $X = (x_0, x_1, \ldots, x_T)$. Two complementary convolutional branches are trained in parallel:
$$c_{\mathrm{local}} = \mathrm{TCN}_{\mathrm{local}}(X),$$
$$c_{\mathrm{global}} = \mathrm{TCN}_{\mathrm{global}}(X),$$
where $\mathrm{TCN}_{\mathrm{local}}(\cdot)$ employs smaller kernels with lower dilation rates to emphasize high-resolution, short-term variations, and $\mathrm{TCN}_{\mathrm{global}}(\cdot)$ adopts larger kernels with higher dilation rates to encode long-range temporal dependencies. The branch outputs are fused to obtain the joint representation
$$c = \Phi\left(c_{\mathrm{local}}, c_{\mathrm{global}}\right),$$
where $\Phi(\cdot, \cdot)$ denotes a fusion operator. The fused features $c$ are then forwarded to the prediction head:
$$h = \sigma(W_2 c + b_2),$$
where $W_2$ and $b_2$ are the learnable parameters of the prediction head and $\sigma(\cdot)$ is the activation function.
The structure in Figure 6 applies local and global convolutions in parallel, followed by normalization and activation; a fusion operator forms the joint representation. This dual-scale design enables the DTCN to explicitly capture multiple temporal dependencies in CT ratio-error series and complements the EEMD-based decomposition by providing a learnable, multi-scale temporal aggregator.
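To make the difference between the two branches tangible, the sketch below compares their receptive fields and shows fusion as a per-step linear map over the concatenated branch outputs. The kernel sizes, dilation schedules, and fusion weights are hypothetical, since the text does not list them, and the receptive-field formula assumes one causal convolution per dilation level.

```python
from typing import List, Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stacked causal convolutions, one layer per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


def fuse(
    c_local: Sequence[float], c_global: Sequence[float],
    w_local: float, w_global: float, bias: float = 0.0,
) -> List[float]:
    """Concatenate the two single-channel branch outputs and apply a 1x1 convolution,
    which reduces to a weighted sum at each time step."""
    return [w_local * a + w_global * b + bias for a, b in zip(c_local, c_global)]


# Hypothetical branch configurations (not the values used in the paper).
local_rf = receptive_field(kernel_size=2, dilations=[1, 2])         # short context
global_rf = receptive_field(kernel_size=5, dilations=[1, 2, 4, 8])  # long context

fused = fuse([1.0, 2.0], [3.0, 4.0], w_local=0.5, w_global=0.25)
```

Under these assumed settings the local branch sees 4 past steps and the global branch sees 61, which shows how the two branches cover different time scales of the same input.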

5. Case Study Analysis

5.1. Data Source

Experimental data were collected from current transformers at a 110 kV substation in Henan Province. The dataset includes real-time records from the substation’s monitoring system, specifically the current ratio error and environmental parameters such as temperature, humidity, magnetic field strength, and electric field phase. Ratio error data were obtained by connecting a verification current transformer to the unit under test. Environmental data were collected by the corresponding sensing devices deployed within the substation. All channels were aligned using a common timestamp.
The monitoring period spans an extended operating interval during which the in-service current transformers experience naturally occurring variations in both load level and ambient temperature due to daily demand fluctuations and seasonal changes. Consequently, the ratio-error time series naturally cover a variety of typical operating conditions, including moderate load ramps and temperature variations within the normal operating envelope of the substation. All performance metrics reported in this paper are computed over the full dataset without selecting only specific “favorable” intervals, so the reported results reflect model behavior across these representative operating regimes.
To emulate the rolling-forecasting scenario in online monitoring and to mitigate data-leakage risk, this study employs a walk-forward validation (rolling-origin evaluation) framework for model training and evaluation. The dataset comprises $N = 4341$ time-stamped samples acquired at 10-min intervals; in chronological order, the earliest 80% form the training set and the most recent 20% constitute the out-of-sample test set. All preprocessing and standardization (z-score) parameters are estimated exclusively on the training set and then applied unchanged to the test set. Hyperparameters are determined via grid search without allocating a separate validation set. Training is capped at 100 epochs with early stopping disabled, and overfitting is mitigated using $L_2$ regularization ($\lambda = 10^{-4}$) and dropout ($p = 0.1$). The model with the selected hyperparameters is retrained on the entire training set and evaluated once on the held-out test set.
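The chronological split and training-only standardization can be sketched as follows. The series here is a synthetic placeholder of the same length as the dataset, the helper names are hypothetical, and truncating the 80% boundary to an integer is an assumed rounding convention, so the exact sample counts may differ from those used in the study.

```python
import statistics
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_fraction: float = 0.8
) -> Tuple[List[float], List[float]]:
    """Earliest samples form the training set; the most recent form the test set."""
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def fit_zscore(train: Sequence[float]) -> Tuple[float, float]:
    """Estimate mean and standard deviation on the training set only."""
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)
    if sigma == 0:
        raise ValueError("training data has zero variance")
    return mu, sigma


def apply_zscore(values: Sequence[float], mu: float, sigma: float) -> List[float]:
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Synthetic placeholder with the same length as the dataset (N = 4341).
series = [float(t) for t in range(4341)]
train, test = chronological_split(series)

mu, sigma = fit_zscore(train)           # statistics come from the training set only
train_z = apply_zscore(train, mu, sigma)
test_z = apply_zscore(test, mu, sigma)  # applied unchanged to the test set
```

The test segment is never used to estimate `mu` or `sigma`, which is what prevents leakage from the evaluation period into preprocessing.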

5.2. Evaluation Metrics

To evaluate the model’s performance, three metrics were used to analyze the deviation between predicted and actual values: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the Coefficient of Determination ($R^2$). The definitions are as follows:
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $\bar{y}$ is the mean of the actual values. A higher $R^2$ value indicates better model performance.
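The three metrics translate directly into code. The sketch below uses small made-up values to show the arithmetic; the numbers are not taken from the experiments.

```python
import math
from typing import Sequence


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Made-up values: errors are [-0.5, 0, 0.5, 0].
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

scores = (mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```

For these values the MAE is 0.25, the RMSE is the square root of 0.125, and $R^2$ is 0.9, since the residual sum of squares is 0.5 against a total sum of squares of 5.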

5.3. Experiment

To validate the proposed Dual-Scale Temporal Convolutional Network (DTCN), the model and three baselines (CNN, TCN, and BiTCN) were trained and tested on the original sequence. Table 1 reports the evaluation metrics (RMSE, MAE, and $R^2$) for the four models.

5.4. EEMD Decomposition

To further validate the effectiveness of the proposed Dual-Scale TCN model, a comprehensive comparison of model performance with EEMD decomposition as a preprocessing step was conducted. The results not only demonstrate the universal enhancement offered by EEMD but also precisely quantify the superior performance of our proposed architecture. The key findings are summarized as follows.

5.4.1. EEMD Decomposition Universally Enhances Model Performance

The results in Table 2 and Figure 7 show that applying EEMD decomposition as a preprocessing step leads to significant performance improvements across all baseline models. Specifically, the RMSE values for the CNN, TCN, and BiTCN models were reduced by 30.1%, 26.0%, and 30.6%, respectively. This universal improvement highlights the crucial role of EEMD in decoupling the complex, non-stationary ratio error signal into simpler, more regularized sub-components (Intrinsic Mode Functions, IMFs), thereby reducing the learning difficulty and mitigating overfitting risks for deep learning models.
Figure 8 further supports these findings by showing the prediction results of four models before and after EEMD decomposition, based on 60 randomly selected points. The EEMD-based models exhibit significantly improved alignment between predicted and actual values, whereas models trained without EEMD display larger fluctuations, particularly during periods of higher error. This confirms that EEMD not only improves accuracy but also reduces abnormal fluctuations by isolating high-frequency noise and low-frequency trends, thereby enhancing adaptability to complex signals. As model complexity increases, prediction accuracy improves, and the EEMD-DTCN achieves the smallest discrepancies and the best consistency across the time series.

5.4.2. The Proposed EEMD–DTCN Model Achieves State-of-the-Art Performance

The EEMD-Dual-Scale TCN model achieved the best results, with RMSE of $0.777 \times 10^{-3}$, MAE of $0.583 \times 10^{-3}$, and $R^2$ of 0.9936. This represents a substantial reduction in RMSE by 40.9% and in MAE by 37.0% compared to the same model trained on undecomposed data (Table 1). More importantly, the proposed model achieved a further 21.8% reduction in RMSE compared to the strongest EEMD-based baseline (EEMD-BiTCN). This confirms that the multi-scale collaborative modeling strategy effectively addresses the limitations of traditional single-scale models in analyzing complex frequency-domain features, enabling the simultaneous capture of short-term fluctuations and long-term trends.

5.4.3. Error Distribution and Goodness-of-Fit Analyses Provide Robust Visual Validation

The superior performance is further corroborated by visual evidence from the error analysis and goodness-of-fit plots. Figure 9 (Error Distribution Analysis) clearly shows that the prediction errors of the EEMD-DTCN model are more concentrated near zero, with a narrower interquartile range and fewer outliers, indicating higher prediction stability and precision compared to other models. Furthermore, Figure 10 (Predicted vs. True Value Scatter Plot) demonstrates that the predictions of the proposed model form a tighter cloud of points clustered closely around the ideal $y = x$ line. This reflects the higher $R^2$ value and confirms a stronger agreement between predictions and true values with minimal systematic bias, providing intuitive proof of the model’s enhanced accuracy.

5.4.4. Cross-Model Comparative Visualization

To further demonstrate the performance improvements achieved through EEMD preprocessing and the proposed EEMD–Dual-Scale Temporal Convolutional Network (EEMD–DTCN), Figure 11 presents a comparative visualization of two models—DTCN and EEMD–DTCN—trained respectively on the original and EEMD-decomposed phase-displacement series. As shown in Figure 11, the EEMD–DTCN model aligns more closely with the ground-truth phase displacement, illustrating that signal decomposition reshapes the temporal dynamics captured by the model and enhances both its robustness and predictive accuracy. Additional experimental procedures and results are presented in Appendix B (Figure A1, Figure A2 and Figure A3).
The EEMD–DTCN model consistently exhibits improved convergence behavior and stronger error suppression compared with the baseline models, indicating that the proposed architecture can be readily extended to other metrological indicators—including phase displacement—without requiring bespoke model redesign. These findings collectively confirm the robustness and transferability of the proposed approach across different transformer error-prediction tasks.
As shown in Figure 11, models trained on the original sequence display noticeable lag and amplitude distortion during abrupt fluctuations, whereas their EEMD-enhanced counterparts respond more smoothly and in closer synchrony with the ground truth. This visual evidence aligns with the statistical findings reported above: EEMD suppresses high-frequency noise while emphasizing structural regularities, enabling the Dual-Scale Temporal Convolutional Network (DTCN) to capture short-lived transients and long-range drifts more effectively. Additional case studies on phase displacement are provided in Appendix B.

5.5. Statistical Significance Analysis

To further validate the robustness of the proposed model, statistical significance tests were performed on the prediction errors. Specifically, the point-by-point errors of the proposed model and the baseline models were compared using the following two methods:
  • Paired t-test: appropriate when the error distribution approximates normality, used to test mean differences.
  • Wilcoxon signed-rank test: a non-parametric method that does not require distributional assumptions.
RMSE and MAE were selected as the primary evaluation metrics. Table 3 reports the improvements of the proposed model over the baseline models, together with the results of the statistical tests.
As shown in Table 3, the proposed model achieved consistent improvements in both RMSE and MAE. All reported p-values are below 0.05, indicating that the improvements are statistically significant. This multi-perspective statistical validation provides strong evidence that the performance gains are not due to random fluctuations, but rather reflect substantial improvements attributable to the proposed architectural design.

5.6. Diebold–Mariano Predictive Accuracy Test

In addition to the paired t-test and Wilcoxon signed-rank test reported in Section 5.5, we use the Diebold–Mariano (DM) test to determine whether the forecasting accuracy of our method differs significantly from that of the baselines. The DM test compares forecast losses while accounting for serial correlation via a heteroskedasticity and autocorrelation consistent (HAC) variance estimate.
Let e t ( A ) = y t y ^ t ( A ) and e t ( B ) = y t y ^ t ( B ) denote errors of baseline A and our model B. For a loss ( · ) , define the pairwise loss differential d t = ( e t ( A ) ) ( e t ( B ) ) . With this sign convention, a positive mean d ¯ = 1 T t = 1 T d t > 0 indicates a lower average loss for our method. We adopt squared error (SE) as the primary loss and report two-sided p-values; variances are estimated via the Newey–West HAC estimator with lag L = 0 , which is standard for one-step-ahead forecasts. Robustness checks with absolute error (AE) consistent conclusions. Table 4 reports the DM test for one-step-ahead forecasts.
Pairwise comparisons are made against EEMD-CNN, EEMD-TCN, and the strongest baseline EEMD-BiTCN.
Rejecting $E[d_t] = 0$ at the 5% level suggests a difference in predictive accuracy; with our sign convention, a positive and statistically significant DM statistic indicates a lower average loss for our model than for the baseline.
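The test described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: the function name `dm_test` is hypothetical, the data are synthetic, the p-value uses the asymptotic standard normal distribution, and with lag 0 the Newey–West estimate reduces to the lag-0 sample autocovariance of the loss differential.

```python
import math
import random


def dm_test(y_true, pred_a, pred_b, loss="se"):
    """Diebold-Mariano test for one-step-ahead forecasts (HAC lag 0).

    d_t = loss(e_t^A) - loss(e_t^B), so a positive statistic favours model B.
    Returns (DM statistic, two-sided p-value under N(0, 1)).
    """
    if loss == "se":
        def loss_fn(e):
            return e * e
    elif loss == "ae":
        loss_fn = abs
    else:
        raise ValueError("loss must be 'se' or 'ae'")

    d = [loss_fn(y - a) - loss_fn(y - b) for y, a, b in zip(y_true, pred_a, pred_b)]
    n = len(d)
    d_bar = sum(d) / n
    # With lag 0 the HAC variance is just the lag-0 sample autocovariance.
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n
    dm = d_bar / math.sqrt(gamma0 / n)
    p_value = math.erfc(abs(dm) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|dm|))
    return dm, p_value


# Synthetic example (assumption: stands in for the test-set forecasts).
random.seed(0)
y = [1e-3 * math.sin(0.04 * t) for t in range(500)]
pred_a = [v + random.gauss(0.0, 1.2e-3) for v in y]  # baseline A
pred_b = [v + random.gauss(0.0, 0.8e-3) for v in y]  # model B

dm_se, p_se = dm_test(y, pred_a, pred_b, loss="se")
dm_ae, p_ae = dm_test(y, pred_a, pred_b, loss="ae")
print(dm_se, p_se, dm_ae, p_ae)
```

Swapping the two forecast arguments flips the sign of the statistic and leaves the p-value unchanged, which is a quick check that the sign convention is applied as intended.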

6. Conclusions

This study demonstrates that signal decomposition provides a crucial bridge between non-stationary measurement data and deep temporal learning. By integrating EEMD with a DTCN, the proposed framework effectively enhances the prediction of current transformer ratio errors through scale-aware representation and adaptive feature learning. The empirical results verify the validity and robustness of the approach, confirming its advantages over conventional deep learning models.
  • The proposed EEMD–DTCN framework achieves significant performance gains compared with baseline architectures. Statistical analyses, including the Diebold–Mariano predictive-accuracy test, confirm that the improvement is both consistent and significant, indicating that the framework delivers superior one-step-ahead forecasting performance under realistic operating conditions.
  • The model’s improvement originates from two complementary design mechanisms. The EEMD-based reconstruction suppresses noise-dominated modes and enhances informative components, yielding cleaner and more stable input signals. Meanwhile, the dual-scale temporal structure—with heterogeneous receptive fields—enables simultaneous modeling of short-lived transients and long-range drifts, expanding the effective temporal context while avoiding the instability typical of recurrent models.
  • From an engineering standpoint, the combination of signal decomposition and multi-scale temporal modeling offers improved robustness against field interference and non-stationary fluctuations. It provides a practical path toward non-intrusive online calibration and early detection of measurement accuracy drift in high-voltage substations.
Despite these promising results, the present evaluation is conducted on historical data from an in-service 110 kV substation, which mainly reflect typical operating conditions with naturally occurring load and temperature variations. Explicit stress tests under deliberately constructed extreme scenarios—such as large-amplitude load steps or abrupt temperature shocks beyond normal operating limits—have not yet been systematically performed. In addition, the ensemble empirical mode decomposition step is primarily effective for ratio error signals with reasonably continuous structure, and its contribution may be limited for extremely sparse or heavily noise-contaminated data. Future work will therefore design scenario-based test sets to assess robustness under such stressed conditions and investigate more robust decomposition strategies, further validating the framework under a broader range of operating conditions and data quality levels. From a deployment perspective, we note that model training is performed offline on historical data, while the online prediction stage only requires lightweight convolutional inference; a more detailed evaluation of real-time performance and industrial deployment constraints will be carried out in future work.

Author Contributions

Conceptualization, J.L.; Methodology, Z.L. and C.H.; Software, J.L.; Validation, Z.L. and J.L.; Formal Analysis, Z.L.; Investigation, C.H.; Data Curation, J.C.; Writing—Original Draft Preparation, J.L.; Writing—Review & Editing, J.L. and C.H.; Funding Acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science and Technology Project of State Grid Jiangxi Electric Power Co., Ltd., grant number 52185224000B.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

Authors Jian Liu and Chen Hu were employed by the Power Supply Service Management Centre of State Grid Jiangxi Electric Power Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Supplementary Notes on Current Transformer Operation

This appendix provides a concise summary of the operating principle of current transformers (CTs) and the main quantities that influence their transformation ratio and ratio error. A current transformer is an electromagnetic instrument transformer that reproduces a scaled version of a primary current in its secondary winding. In the ideal case, the primary current $I_1$ and secondary current $I_2$ are related by the rated transformation ratio $K_r$ through
$$I_1 = K_r I_2,$$
so that the ratio remains constant. In a real CT, however, part of the primary current is used as magnetizing (excitation) current to establish the core flux. The magnitude of this magnetizing current depends on the core material and geometry, the excitation level, and the operating point on the magnetization curve. As a result, the actual transformation ratio deviates from the ideal turns ratio and gives rise to a nonzero ratio error. The effective transformation ratio is also influenced by the burden connected to the secondary winding. The rated burden (specified in VA and power factor) defines the range of secondary impedance over which the CT is designed to meet a given accuracy class under the relevant standards (e.g., IEC 61869-2 for metering CTs). Operating the CT at burdens significantly higher than the rated value increases the secondary voltage and the magnetizing current, which typically leads to larger ratio errors. Conversely, operating at very low burden changes the core operating point and may also affect the ratio error. In addition to the secondary burden, several other factors contribute to the time-varying behavior of the transformation ratio under field conditions [29,30,31]:
  • The magnitude, waveform, and frequency of the primary current (including harmonic distortion);
  • The thermal state of the windings and core, which affects resistance and losses;
  • Ambient and environmental conditions such as temperature, humidity, and external electromagnetic fields.
In the case study of this paper, the CTs under test are metering CTs designed to meet an accuracy class of 0.2 at their rated burden. Therefore, the observed ratio-error time series primarily reflects the combined effect of the primary-current profile, the CT internal characteristics, and the monitored environmental variables, rather than deliberate changes in the secondary burden.
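To make the notion of ratio error concrete, the sketch below evaluates the usual percentage definition, $100 \cdot (K_r I_2 - I_1) / I_1$, for a hypothetical CT. The 600 A / 5 A rating, the secondary current values, and the function name are illustrative assumptions and are not taken from the case study; the ±0.2% bound used in the final check is the class 0.2 limit at rated current as commonly quoted for metering CTs.

```python
def ratio_error_percent(i_primary, i_secondary, k_rated):
    """Ratio error in percent: 100 * (K_r * I_2 - I_1) / I_1."""
    if i_primary == 0:
        raise ValueError("primary current must be nonzero")
    return 100.0 * (k_rated * i_secondary - i_primary) / i_primary


# Hypothetical 600 A / 5 A metering CT, so K_r = 120 (illustrative values only).
k_r = 120.0
ideal = ratio_error_percent(600.0, 5.0, k_r)    # secondary reproduces the primary exactly
real = ratio_error_percent(600.0, 4.994, k_r)   # secondary slightly low, e.g. due to magnetizing current

# Assumed class 0.2 bound at rated primary current: |error| <= 0.2 %.
within_class_02 = abs(real) <= 0.2
print(f"ideal: {ideal:.3f} %, real: {real:.3f} %, within class 0.2: {within_class_02}")
```

A negative value indicates that the scaled secondary current underestimates the primary current, which is the typical direction of the error when part of the primary current is diverted to magnetize the core.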

Appendix B. Additional Experiments

The measurement system is configured as follows. The standard transformer is an electromagnetic instrument transformer with an accuracy class of 0.05. Its output is converted by a transducer into a 2 V signal and sent to the data acquisition unit, which samples at 10 kHz and transfers the data to the verification system. The verification software is written in LabVIEW 2023 Q1. The merging unit outputs a digital signal, which is transmitted directly to the verification system via optical fiber.
Figure A1. EEMD decomposition result of the phase displacement.
Figure A2. Comparative Experiment of Four Models Before and After EEMD.
Figure A3. Comparison of EEMD Decomposed and Original Sequences and EEMD Residuals.

References

  1. Impram, S.; Nese, S.V.; Oral, B. Challenges of renewable energy penetration on power system flexibility: A survey. Energy Strategy Rev. 2020, 31, 100539.
  2. Ameli, A.; Saleh, K.A.; El-Saadany, E.F.; Salama, M.M.; Zeineldin, H.H. Wide-band current transformers for traveling-waves-based protection applications. IEEE Trans. Smart Grid 2020, 12, 845–858.
  3. Li, Z.; Cui, J.; Lu, H.; Zhou, F.; Diao, Y.; Li, Z. Prediction model of measurement errors in current transformers based on deep learning. Rev. Sci. Instruments 2024, 95, 044704.
  4. Li, Z.; Cui, J.; Chen, H.; Lu, H.; Zhou, F.; Rocha, P.R.; Yang, C. Research progress of all-fiber optic current transformers in novel power systems: A review. Microw. Opt. Technol. Lett. 2025, 67, e70061.
  5. Saha, S.; Haque, M.E.; Tan, C.; Mahmud, M.A.; Arif, M.T.; Lyden, S.; Mendis, N. Diagnosis and mitigation of voltage and current sensors malfunctioning in a grid connected PV system. Int. J. Electr. Power Energy Syst. 2020, 115, 105381.
  6. Brandolini, A.; Faifer, M.; Ottoboni, R. A simple method for the calibration of traditional and electronic measurement current and voltage transformers. IEEE Trans. Instrum. Meas. 2009, 58, 1345–1353.
  7. Suomalainen, E.P.; Hallstrom, J.K. Onsite calibration of a current transformer using a Rogowski coil. IEEE Trans. Instrum. Meas. 2008, 58, 1054–1058.
  8. Li, Z.; Du, Y.; Abu-Siada, A.; Bao, G.; Yu, J.; Hu, T.; Zhang, T. An online calibration system for digital input electricity meters based on improved Nuttall window. IEEE Access 2018, 6, 71262–71270.
  9. Li, Z.; Li, H.; Zhang, Z. An accurate online calibration system based on combined clamp-shape coil for high voltage electronic current transformers. Rev. Sci. Instruments 2013, 84, 075113.
  10. Li, Z.; Yu, C.; Abu-Siada, A.; Li, H.; Li, Z.; Zhang, T.; Xu, Y. An online correction system for electronic voltage transformers. Int. J. Electr. Power Energy Syst. 2021, 126, 106611.
  11. Kim, D.E.; Lee, G.Y.; Kil, G.S.; Kim, S.W. Trends in Measuring Instrument Transformers for Gas-Insulated Switchgears: A Review. Energies 2024, 17, 1846.
  12. Li, Z.; Chen, X.; Wu, L.; Ahmed, A.S.; Wang, T.; Zhang, Y.; Li, H.; Li, Z.; Xu, Y.; Tong, Y. Error analysis of air-core coil current transformer based on stacking model fusion. Energies 2021, 14, 1912.
  13. Sun, K.; Qiu, W.; Yao, W.; You, S.; Yin, H.; Liu, Y. Frequency injection based HVDC attack-defense control via squeeze-excitation double CNN. IEEE Trans. Power Syst. 2021, 36, 5305–5316.
  14. Duan, J.; Chang, M.; Chen, X.; Wang, W.; Zuo, H.; Bai, Y.; Chen, B. A combined short-term wind speed forecasting model based on CNN–RNN and linear regression optimization considering error. Renew. Energy 2022, 200, 788–808.
  15. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  16. Yu, G.; Liu, C.; Tang, B.; Chen, R.; Lu, L.; Cui, C.; Hu, Y.; Shen, L.; Muyeen, S. Short term wind power prediction for regional wind farms based on spatial-temporal characteristic distribution. Renew. Energy 2022, 199, 599–612.
  17. Zhou, F.; Zhao, P.; Lei, M.; Yue, C.; Yu, J.; Liang, S. Capacitive voltage transformer measurement error prediction by improved long short-term memory neural network. Energy Rep. 2022, 8, 1011–1021.
  18. Zhang, W.; Shi, Y.; Yu, J.; Yang, B.; Lin, C. Online measurement of capacitor voltage transformer metering errors based on GRU and MTL. Electr. Power Syst. Res. 2023, 221, 109473.
  19. Jia, X.; Xia, Y.; Yan, Z.; Gao, H.; Qiu, D.; Guerrero, J.M.; Li, Z. Coordinated operation of multi-energy microgrids considering green hydrogen and congestion management via a safe policy learning approach. Appl. Energy 2025, 401, 126611.
  20. Jiang, Y.; Lee, N.; Deng, X.; Yang, Y. A Secure-Sustainable-Fast Charging Strategy for Lithium-Ion Batteries Based on a Random Forest-Enhanced Electro-Thermal-Degradation Model. IEEE Trans. Power Electron. 2025, 13, 21–30.
  21. Li, J.; Zou, K.; Xing, L. Coarse-to-fine evolutionary search for large-scale multi-objective optimization: An application to ratio error estimation of voltage transformers. Front. Energy Res. 2022, 10, 988772.
  22. Zhang, P.; Tian, Y.; Zhang, Y.; Zhang, X. A problem knowledge driven bi-population cooperative framework for time-varying ratio error estimation of voltage transformers. Swarm Evol. Comput. 2024, 89, 101628.
  23. Li, D.; Jiang, M.R.; Li, M.W.; Hong, W.C.; Xu, R.Z. A floating offshore platform motion forecasting approach based on EEMD hybrid ConvLSTM and chaotic quantum ALO. Appl. Soft Comput. 2023, 144, 110487.
  24. Gao, J.; Shang, P. Analysis of complex time series based on EMD energy entropy plane. Nonlinear Dyn. 2019, 96, 465–482.
  25. Sedgwick, P. Pearson’s correlation coefficient. BMJ 2012, 345, e4483.
  26. Jebli, I.; Belouadha, F.Z.; Kabbaj, M.I.; Tilioua, A. Prediction of solar energy guided by pearson correlation using machine learning. Energy 2021, 224, 120109.
  27. Zhu, J.; Su, L.; Li, Y. Wind power forecasting based on new hybrid model with TCN residual modification. Energy AI 2022, 10, 100199.
  28. Zou, Z.; Wang, J.; E, N.; Zhang, C.; Wang, Z.; Jiang, E. Short-term power load forecasting: An integrated approach utilizing variational mode decomposition and TCN–BiGRU. Energies 2023, 16, 6625.
  29. Kaczmarek, M. A practical approach to evaluation of accuracy of inductive current transformer for transformation of distorted current higher harmonics. Electr. Power Syst. Res. 2015, 121, 121–128.
  30. Tomczyk, K.; Sieja, M.; Ostrowska, K.; Owczarek, D. Review of accuracy assessment methods for current transformers: Errors, uncertainties and dynamic performance. Energies 2025, 18, 4995.
  31. Mingotti, A.; Peretto, L.; Bartolomei, L.; Cavaliere, D.; Tinarelli, R. Are inductive current transformers performance really affected by actual distorted network conditions? An experimental case study. Sensors 2020, 20, 927.
Figure 1. Model framework.
Figure 2. EEMD decomposition of the ratio error series $y(t)$.
Figure 3. Reconstructed series versus original data after EEMD.
Figure 4. Residual after EEMD reconstruction.
Figure 5. Pearson-correlation heatmap across EEMD components.
Figure 6. Dual-Scale Time-Series TCN Structure.
Figure 7. Comparison of Models with Actual Values.
Figure 8. Comparative Experiment of Four Models Before and After EEMD.
Figure 9. Boxplot of errors.
Figure 10. Fitting results.
Figure 11. Comparative experiment of models before and after EEMD. The EEMD-based models exhibit smaller deviations and better alignment with the ground-truth ratio-error sequence, confirming the enhanced feature separability and robustness achieved through signal decomposition.
Table 1. Model Performance on the Original Sequence (No Decomposition).
| Model | RMSE | MAE | $R^2$ |
| --- | --- | --- | --- |
| CNN | $1.711 \times 10^{-3}$ | $1.362 \times 10^{-3}$ | 0.9691 |
| TCN | $1.589 \times 10^{-3}$ | $1.223 \times 10^{-3}$ | 0.9733 |
| BiTCN | $1.433 \times 10^{-3}$ | $1.067 \times 10^{-3}$ | 0.9784 |
| Dual-Scale TCN | $1.314 \times 10^{-3}$ | $0.926 \times 10^{-3}$ | 0.9819 |
Table 2. Model Performance Comparison After EEMD Decomposition.
| Model | RMSE | MAE | $R^2$ |
| --- | --- | --- | --- |
| EEMD-CNN | $1.196 \times 10^{-3}$ | $0.906 \times 10^{-3}$ | 0.9849 |
| EEMD-TCN | $1.175 \times 10^{-3}$ | $0.891 \times 10^{-3}$ | 0.9854 |
| EEMD-BiTCN | $0.994 \times 10^{-3}$ | $0.764 \times 10^{-3}$ | 0.9896 |
| EEMD-Dual-Scale TCN | $0.777 \times 10^{-3}$ | $0.583 \times 10^{-3}$ | 0.9936 |
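As a quick arithmetic check, the reductions quoted in the abstract (40.9% in RMSE and 37.0% in MAE) can be reproduced from the tabulated values. The sketch below assumes that the comparison is between the Dual-Scale TCN without decomposition (Table 1) and the EEMD-Dual-Scale TCN (Table 2); this pairing is inferred from the numbers and is not stated explicitly in the tables. The function name is hypothetical, and the common scale factor cancels in the ratio.

```python
def relative_reduction_percent(before, after):
    """Percentage reduction of a metric going from `before` to `after`."""
    return 100.0 * (before - after) / before


# (RMSE, MAE) mantissas as tabulated; the shared power of ten cancels out.
dual_scale_tcn = (1.314, 0.926)        # Table 1, no decomposition
eemd_dual_scale_tcn = (0.777, 0.583)   # Table 2, after EEMD

rmse_reduction = relative_reduction_percent(dual_scale_tcn[0], eemd_dual_scale_tcn[0])
mae_reduction = relative_reduction_percent(dual_scale_tcn[1], eemd_dual_scale_tcn[1])
print(f"RMSE reduction: {rmse_reduction:.1f} %, MAE reduction: {mae_reduction:.1f} %")
```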
Table 3. Statistical Significance Test Results (Proposed vs. Baseline Models).
| Comparison | ΔRMSE | ΔMAE | t-Test p-Value | Wilcoxon p-Value |
| --- | --- | --- | --- | --- |
| Proposed vs. EEMD-CNN | $0.419 \times 10^{-3}$ | $0.323 \times 10^{-3}$ | $1.1 \times 10^{-4}$ | $1.4 \times 10^{-4}$ |
| Proposed vs. EEMD-TCN | $0.398 \times 10^{-3}$ | $0.308 \times 10^{-3}$ | $9.0 \times 10^{-5}$ | $1.2 \times 10^{-4}$ |
| Proposed vs. EEMD-BiTCN | $0.217 \times 10^{-3}$ | $0.181 \times 10^{-3}$ | $1.3 \times 10^{-4}$ | $2.8 \times 10^{-4}$ |
Table 4. DM test on one-step-ahead forecasts. Squared error (SE) is primary; absolute error (AE) is shown for robustness.
| Comparison | DM (SE) | p (SE) | DM (AE) | p (AE) |
| --- | --- | --- | --- | --- |
| Ours vs. EEMD-CNN | 8.20 | <0.001 | 5.80 | <0.001 |
| Ours vs. EEMD-TCN | 7.90 | <0.001 | 5.30 | <0.001 |
| Ours vs. EEMD-BiTCN | 5.10 | <0.001 | 3.80 | <0.001 |

Liu, J.; Hu, C.; Li, Z.; Cui, J. Multi-Scale Temporal Learning with EEMD Reconstruction for Non-Stationary Error Forecasting in Current Transformers. Electronics 2026, 15, 325. https://doi.org/10.3390/electronics15020325