1. Introduction
Serving as a critical function for the intelligent management of distribution networks, state estimation helps identify weak points in the network and propose remedial measures to improve system reliability [1]. It utilizes readings of power, voltage, and current gathered from conventional systems and devices, including supervisory control and data acquisition (SCADA) systems, phasor measurement units (PMUs), and smart metering devices, to estimate the true conditions of the electric power system [2]. With the widespread integration of controllable resources like electric vehicles (EVs) and rooftop photovoltaics (PV) into distribution networks, enhancing their control capabilities has become a key direction for the development of power systems [3,4]. Consequently, distribution network state estimation is receiving increasing attention [5].
In contrast to transmission networks, which are equipped with a large number of secondary data acquisition devices like PMUs, distribution networks have a relatively weak measurement infrastructure. Despite the gradual integration of SCADA systems and Micro-PMUs, their widespread, high-density deployment faces restrictions due to high economic costs, the vast quantity of network nodes, and inadequate communication infrastructure [6]. This leads to a common problem of low measurement coverage in distribution networks, accompanied by challenges such as insufficient measurement accuracy and low refresh rates. These factors result in data redundancy far below what is required for state estimation, making it difficult to support accurate and real-time state estimation for distribution networks [7]. To address the issue of measurement sparsity, pseudo-measurement data have been identified as an efficient means of enhancing the observability of distribution networks. Pseudo-measurement generation models fall into two major classes: statistical analysis models and machine learning models [8]. Statistical analysis models typically use historical measurement data to infer missing measurements through methods such as non-parametric least-squares density estimation [9] and kernel density estimation [10]. Conversely, machine learning models yield more precise pseudo-measurement data by exploiting the intricate, non-linear dependencies embedded in prior data. These approaches include attention-enhanced recurrent neural networks [11], support vector machines [12], and generative adversarial networks (GANs) [13]. However, most existing studies either focus solely on the temporal correlation of available measurement data or fail to balance the depth of spatiotemporal feature extraction with computational efficiency. The loads and distributed energy generation at different nodes inherently exhibit significant spatial correlation [14]. While GANs demonstrate strong capabilities in modeling complex data distributions, they often entail high computational complexity and training instability due to the adversarial min–max optimization process, which can be burdensome for real-time applications. Therefore, comprehensively considering and effectively exploiting the spatiotemporal correlation among measurement data while maintaining high computational efficiency is a potential key to further improving the accuracy of pseudo-measurement generation.
Due to the increasing randomness and volatility of distribution networks, traditional static state estimation, which relies solely on information from a single time snapshot, struggles to cope with rapid fluctuations in system states, leading to delayed or inaccurate estimation results. Built upon the framework of the Kalman filter (KF), dynamic state estimation continuously tracks the state transition trajectory across multiple time slots, allowing for a more accurate depiction of system dynamics. To tackle the pronounced non-linearity inherent in the distribution network’s load flow characteristics, researchers have put forth techniques including the extended Kalman filter (EKF) [15], the unscented Kalman filter (UKF) [16], and the cubature Kalman filter (CKF) [17]. The EKF linearizes the electrical system through a first-order Taylor expansion, a process that inevitably results in truncation errors [18]. These errors increase as the system’s non-linearity deepens, potentially leading to decreased estimation accuracy or even numerical instability. To avoid the EKF’s linearization errors, the UKF and CKF employ a deterministic sampling strategy to handle non-linear transformations, significantly improving estimation performance for non-linear systems. However, these methods still face two major challenges: (1) when system dynamics change drastically, the actual process noise may not have a fixed covariance matrix, which reduces tracking performance and robustness; (2) these methods rest on the assumption that measurement noise follows a Gaussian distribution, which is inconsistent with real-world physical systems [19].
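The difference between first-order linearization and deterministic sampling can be illustrated on a scalar toy problem. The sketch below is not the estimator developed in this paper; the function names, the quadratic test function, and the scaling parameter `lam` are assumptions chosen purely for illustration.

```python
import math

def ekf_mean(f, mean):
    """First-order (EKF-style) propagation: the mean is pushed through f directly."""
    return f(mean)

def ut_mean(f, mean, var, lam=2.0):
    """Scalar unscented transform (state dimension n = 1, assumed scaling lam)."""
    n = 1
    spread = math.sqrt((n + lam) * var)
    sigma_points = [mean, mean + spread, mean - spread]
    weights = [lam / (n + lam), 0.5 / (n + lam), 0.5 / (n + lam)]
    return sum(w * f(x) for w, x in zip(weights, sigma_points))

def square(x):
    """Toy non-linear function standing in for a power flow equation."""
    return x * x

m, p = 1.0, 0.04
exact = m * m + p                      # E[x^2] for x ~ N(m, p)
linearized = ekf_mean(square, m)       # misses the variance term p
sampled = ut_mean(square, m, p)        # recovers m^2 + p for a quadratic
```

For a quadratic function the sigma-point mean reproduces the exact expectation, whereas the linearized mean drops the variance contribution, which is the truncation error described above.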
To handle time-varying process noise, Ref. [20] proposed the Sage–Husa noise model, which calculates and adjusts the process noise covariance dynamically. Ref. [21] embedded a sub-optimal fading factor to enhance the response to dynamic changes in process noise, while Ref. [22] achieved adaptive estimation by adjusting a modulation factor. However, the values or decay strategies of the modulation factors in these methods are often fixed heuristically, so they cannot adjust autonomously to the time-varying intensity of process noise in real time. This limitation restricts their estimation accuracy during severe dynamic state mutations. Therefore, a crucial current research direction is how to dynamically calibrate the forgetting factor by analyzing real-time changes in the innovation sequence to achieve more precise state variable estimation.
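To make the role of a fixed forgetting factor concrete, the following minimal sketch uses the weighting commonly associated with Sage–Husa-type estimators, d_k = (1 − b) / (1 − b^(k+1)). The scalar setting, the value b = 0.96, and the `sample` input (assumed to be an instantaneous variance estimate already derived from the innovation) are illustrative assumptions, not the scheme of Refs. [20,21,22] or of this paper.

```python
def sage_husa_weight(k, b=0.96):
    """Weight d_k = (1 - b) / (1 - b**(k + 1)) for a forgetting factor 0 < b < 1."""
    return (1.0 - b) / (1.0 - b ** (k + 1))

def update_noise_variance(q_prev, sample, k, b=0.96):
    """Blend the previous variance estimate with the newest (assumed) sample."""
    d = sage_husa_weight(k, b)
    return (1.0 - d) * q_prev + d * sample

# Hypothetical variance samples: quiet operation followed by a sudden increase.
q = 1e-4
history = []
for k, sample in enumerate([1e-4, 1e-4, 9e-4, 9e-4, 9e-4]):
    q = update_noise_variance(q, sample, k)
    history.append(q)
```

Because b is fixed in advance, the weight d_k decays toward 1 − b regardless of how strongly the noise level changes, which illustrates the limited flexibility noted above.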
To handle the impact of non-Gaussian measurement noise, existing methods mainly fall into a few categories: filters based on robust statistics [23,24]; filters based on information-theoretic criteria [25]; and filters based on probabilistic models [26]. Although these methods have, to some extent, solved the non-Gaussian noise problem, they generally suffer from high computational complexity and sensitivity to parameter selection, thus failing to satisfy the demanding requirements for real-time operation and adaptation in distribution networks. Therefore, the ability to quickly and accurately identify non-Gaussian noise is key to improving the performance of state estimation in distribution networks.
A dynamic state estimation approach for distribution networks is presented in this paper, leveraging the spatiotemporal correlation of data and adaptiveness to process and measurement noise. The principal achievements of this work are listed below:
(1) Pseudo-measurement generation: A CNN-BiGRU-Attention model is presented to generate highly accurate pseudo-measurement data by effectively extracting both the spatial correlation within the network topology and the temporal correlation in the data.
(2) Dynamic state estimation: We propose an innovative unscented Kalman filter with adaptiveness to process and measurement noise (NA-UKF). This method includes a process noise adaptive estimation component based on an AMF and a measurement noise adaptive estimation component based on RMD. Compared to existing methods in the literature [27,28,29], the proposed algorithm more accurately tracks the network state and exhibits high resilience to both time-varying process and non-Gaussian measurement uncertainties.
The subsequent sections of this manuscript are structured as follows. The overall architecture is presented in Section 2. Section 3 is then dedicated to the pseudo-measurement generation model, which accounts for spatiotemporal correlation. Section 4 provides a detailed description of the NA-UKF algorithm. The algorithm’s effectiveness is assessed and confirmed in Section 5, and the final section, Section 6, summarizes the paper.
2. Overall Framework
In distribution networks, the main measurement information is provided by two different types of systems: SCADA systems and Micro-PMUs. SCADA systems are the most widely deployed in distribution networks due to their lower single-point cost. These terminals can be installed at nodes or on branches and primarily provide measurement data such as active and reactive power, but their accuracy and reporting rate are relatively low. In contrast, Micro-PMUs have a higher single-point cost and are deployed in smaller numbers, yet they can provide high-accuracy, very high-reporting-rate synchronous phasor measurement data. They are typically installed at critical nodes to measure nodal voltage phasors. In hybrid-measurement-based distribution network state estimation, the SCADA reporting rate is used as the base frequency for state estimation, matching the mainstream sampling capabilities of existing terminals. Then, the system error is corrected using the high-accuracy phasor data from Micro-PMUs, achieving a collaborative optimization of accuracy and economic efficiency.
Figure 1 presents the overall architecture of the study. Within the pseudo-measurement generation process, an offline-trained CNN-BiGRU-Attention model is used to fully leverage the spatiotemporal correlation of distribution network measurement data. This model generates pseudo-measurement data for nodes or branches without measurement devices. The input to the model includes historical and current measurement data; the output consists of the current pseudo-measurement data for the nodes and branches with missing measurements. In the dynamic state estimation stage, we employ the proposed NA-UKF algorithm. This model utilizes the temporal information of the measurement data and adaptively adjusts to its noise characteristics, allowing for continuous tracking and accurate estimation of the system state. Finally, the state variables, namely the magnitude and phase angle associated with the nodal voltages, are calculated.
3. A Pseudo-Measurement Generation Model Considering Spatiotemporal Correlation
Distribution network loads and distributed energy generation not only exhibit significant time-series characteristics but are also influenced by spatial coupling due to geographical location, neighboring node loads, similar load types and the complex radial topology. Existing pseudo-measurement generation methods based on neural networks often neglect the intrinsic spatiotemporal correlation within historical distribution network data, resulting in pseudo-measurement data that lacks sufficient precision. To address these deficiencies, this paper introduces a pseudo-measurement generation framework that leverages the CNN-BiGRU-Attention neural network.
3.1. Model Input and Output
The pseudo-measurement generation model is trained under a supervised learning paradigm. The construction of the dataset is a critical foundation for ensuring model performance. The input vector T_i is defined as
where P and Q denote active and reactive power, respectively; the subscripts inj and l refer to nodal injection and branch power flow. The subscripts Micro-PMUs and SCADA indicate historical measurements from Micro-PMUs and SCADA devices. Variables without device subscripts (i.e., P_inj, Q_inj, P_l, Q_l) represent historical data derived from power flow calculations based on historical load records. The output vector T_o is expressed as
where the subscript pseudo denotes the generated pseudo-measurements for unmeasured locations.
After completion of the network training, the model operates in an online spatial imputation mode. To ensure high accuracy in real-time deployment, a rolling prediction strategy is adopted. By inputting the measurement data sequence which includes both the historical data (from T − L to T − 1) and the actual real-time readings captured by SCADA/Micro-PMUs at the current time slot T, we can infer the pseudo-measurement data for all nodes and branches without measurement configuration at the same time slot T. This mechanism continuously slides the input window forward as new data arrives, minimizing error accumulation. This provides more accurate and comprehensive real-time pseudo-measurement information for subsequent dynamic state estimation.
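The rolling strategy can be sketched as a sliding window over the measurement sequence. The helper below is only an illustration: the function name, the toy scalar series, and the window length are assumptions, and the trained model that would consume each window is omitted.

```python
def rolling_windows(series, window_len):
    """Yield (T, inputs), where inputs holds the samples from T - L to T (L = window_len)."""
    for t in range(window_len, len(series)):
        # L historical samples plus the real-time reading at slot T.
        yield t, series[t - window_len : t + 1]

# Hypothetical per-slot readings from one metered location.
measurements = [0.50, 0.52, 0.55, 0.53, 0.58, 0.60]
windows = list(rolling_windows(measurements, window_len=3))

for t, inputs in windows:
    # A trained model would map `inputs` to pseudo-measurements at slot t here.
    print(t, inputs)
```

Each new window reuses measured values only, so a pseudo-measurement produced at one slot is never fed back as input to the next, which is how the sliding scheme limits error accumulation.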
3.2. CNN-BiGRU-Attention Model
To leverage the inherent spatiotemporal correlations within the data and enhance the precision of pseudo-measurement generation, this paper constructs a CNN-BiGRU-Attention model, with its network architecture presented in Figure 2.
To effectively encode the distribution network’s topology, the input measurement matrix is constructed based on a depth-first search traversal sequence. This serialization maps physically connected or electrically adjacent nodes to proximal positions in the input tensor, ensuring that the topological structure is preserved in the 1-D data format.
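A depth-first serialization of a radial feeder can be sketched as follows. The six-node feeder, the adjacency-list representation, and the tie-breaking rule (smallest node label first) are assumptions made for illustration and do not correspond to the test system used later.

```python
def dfs_order(adjacency, root):
    """Return the nodes in depth-first pre-order, starting from the root (iterative)."""
    order, seen, stack = [], set(), [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbours in reverse so that the smallest label is visited first.
        for nb in sorted(adjacency[node], reverse=True):
            if nb not in seen:
                stack.append(nb)
    return order

# Hypothetical feeder: node 2 feeds two laterals, 3-5 and 4-6.
feeder = {1: [2], 2: [1, 3, 4], 3: [2, 5], 4: [2, 6], 5: [3], 6: [4]}
order = dfs_order(feeder, root=1)
# Column index of each node in the serialized measurement matrix.
column = {node: idx for idx, node in enumerate(order)}
```

In this example nodes 3 and 5, which lie on the same lateral, end up in adjacent columns even though their labels are not consecutive, so a one-dimensional kernel sliding over the columns sees electrically neighbouring nodes together.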
The CNN is utilized to effectively capture the spatial correlation and topological dependencies between different nodes and branches [7]. It leverages local connectivity to learn data features across different ranges of dimensions. Specifically, the convolutional layers enhance feature extraction by applying convolution operators to the input measurements, utilizing learnable weights and biases to map raw data into high-level spatial feature maps. This process allows the model to automatically abstract intricate spatial dependencies between nodes. Subsequently, pooling layers are utilized to down-sample the features extracted by the convolutional layers, thereby reducing information redundancy while preserving the most critical spatial features.
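The convolution and pooling operations can be illustrated with a minimal single-channel sketch. The kernel values, the ReLU activation, and the pooling size are assumptions for demonstration; in the actual model the weights are learned and many filters are applied in parallel.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid 1-D cross-correlation followed by a ReLU (activation assumed)."""
    k = len(kernel)
    out = []
    for i in range(len(signal) - k + 1):
        acc = bias + sum(w * x for w, x in zip(kernel, signal[i : i + k]))
        out.append(max(0.0, acc))
    return out

def max_pool1d(features, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(features[i : i + size])
            for i in range(0, len(features) - size + 1, size)]

# Hypothetical injections at six nodes, ordered by the traversal sequence.
node_powers = [1.0, 2.0, 4.0, 3.0, 1.0, 0.0]
features = conv1d(node_powers, kernel=[-1.0, 0.0, 1.0])  # responds to local increases
pooled = max_pool1d(features)
```

The kernel only ever combines neighbouring entries, which is why the node ordering of the input matters, and pooling keeps the strongest local response while halving the feature length.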
To capture temporal features, the model incorporates a BiGRU layer. By combining the forward and backward hidden state information, the BiGRU layer can comprehensively mine the periodic patterns and bidirectional temporal correlation within historical data. This substantially improves the model’s capacity to detect dynamic variations within the pseudo-measurement data. The feature vector extracted by the CNN is used as the input for the BiGRU layer.
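The bidirectional mechanism can be sketched with a scalar GRU cell run once forward and once backward over the sequence. All weights below are hypothetical constants, and the hidden state is one-dimensional for readability; this is not the 256-unit layer used in the case study.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU step; p holds hypothetical weights (w*, u*, b*)."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])              # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])              # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])   # candidate state
    return (1.0 - z) * h + z * cand

def run_gru(seq, p):
    """Run the cell over a sequence from a zero initial state."""
    h, states = 0.0, []
    for x in seq:
        h = gru_step(x, h, p)
        states.append(h)
    return states

def bigru(seq, p_fwd, p_bwd):
    """Pair the forward and backward hidden states at every time step."""
    fwd = run_gru(seq, p_fwd)
    bwd = run_gru(seq[::-1], p_bwd)[::-1]
    return list(zip(fwd, bwd))

params = {"wz": 0.5, "uz": 0.1, "bz": 0.0, "wr": 0.4, "ur": 0.2, "br": 0.0,
          "wh": 0.9, "uh": 0.3, "bh": 0.0}
states = bigru([0.2, 0.4, 0.9, 0.3], params, params)
```

At each step the forward component summarizes the samples seen so far, while the backward component summarizes the remainder of the window, so every position has access to context from both directions.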
Next, an Attention network is introduced to dynamically assign weights to data from different types of measurement sources (e.g., SCADA, Micro-PMUs) and key nodes (e.g., PV power stations, EV charging stations). This enables the model to concentrate selectively on the features that matter most for generating accurate pseudo-measurements.
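The weighting idea can be sketched with a single-head, scaled dot-product attention over a few feature vectors. The feature values and the query vector are hypothetical, and a single head is used here as a simplification of a multi-head layer.

```python
import math

def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, query):
    """Dot-product scores, softmax weights, then a weighted sum of the feature vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * f for q, f in zip(query, vec)) / scale for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [sum(w * vec[d] for w, vec in zip(weights, features))
               for d in range(dim)]
    return weights, context

# Hypothetical feature vectors for three sources; the second carries the most signal.
features = [[0.1, 0.0], [0.9, 0.4], [0.2, 0.1]]
weights, context = attention_pool(features, query=[1.0, 1.0])
```

The weights sum to one, and the source whose features align best with the query receives the largest share of the output.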
The high-dimensional feature vector, processed and refined by the CNN, BiGRU and Attention layers, is then non-linearly mapped through a fully connected layer to output the final pseudo-measurement data.
During model training, the mean squared error (MSE) is adopted as the objective loss function.
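For reference, the MSE loss used in training is conventionally written as shown next, where N (the number of training samples), y_i (the true value) and \hat{y}_i (the corresponding model output) are notation assumed here for illustration:

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2
```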
5. Case Study
To establish the efficacy and superior performance of our proposed approach, simulation tests were performed in the MATLAB R2023a setting on the IEEE 33-bus three-phase unbalanced grid, the configuration of which is presented in Figure 3. In this case study, the system is powered by thermal power from the main grid and distributed PVs, with a renewable energy penetration rate of approximately 10.94%.
To replicate the actual operational conditions of the power system, three types of typical load are modeled: residential, industrial, and commercial. Each type of load has distinct daily fluctuation curves and characteristics. Based on these realistic load profiles, a dataset comprising 2000 temporal snapshots was constructed via time-series power flow analysis. To strictly evaluate the model’s generalization capability, the dataset was partitioned into a training set (80%) and a testing set (20%). The true power flow values are obtained by running the power flow solver. Measurement values are generated by adding zero-mean Gaussian measurement errors to these true values. Specifically, the standard deviation of measurement errors for Micro-PMUs is set to 0.3% for voltage magnitudes and 0.05° for voltage phase angles. For current magnitudes and phase angles, the standard deviation is set to 0.4% and 0.05°, respectively. The power measurement error standard deviation for the SCADA system is fixed uniformly at 1%. For pseudo-measurements generated by the proposed model, the error standard deviation is set to 10%. The time interval for dynamic state estimation is uniformly set at 5 s, which is consistent with the SCADA system’s reporting rate. Furthermore, all initial and empirical parameters used in the algorithm were determined through repeated simulations and tuning. The initial process and measurement noise covariance matrices, Q0 and R0, are each initialized as a diagonal matrix with diagonal elements set to 10⁻⁴.
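The measurement generation step can be sketched as follows. The sketch assumes that each percentage is a standard deviation relative to the true value, which is one possible reading of the settings above; the dictionary keys and function names are hypothetical, and the angle errors in degrees are left out.

```python
import random

# Relative standard deviations quoted in the text, as fractions of the true value
# (the relative interpretation is an assumption).
SIGMA_REL = {
    "pmu_voltage_mag": 0.003,
    "pmu_current_mag": 0.004,
    "scada_power": 0.01,
    "pseudo_power": 0.10,
}

def noisy_measurement(true_value, kind, rng):
    """Add a zero-mean Gaussian error whose std is a fraction of |true_value|."""
    sigma = SIGMA_REL[kind] * abs(true_value)
    return true_value + rng.gauss(0.0, sigma)

def variance(true_value, kind):
    """Diagonal entry of a measurement covariance matrix for this reading."""
    return (SIGMA_REL[kind] * abs(true_value)) ** 2

rng = random.Random(0)
samples = [noisy_measurement(1.0, "scada_power", rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
std = (sum((s - mean) ** 2 for s in samples) / len(samples)) ** 0.5
```

Under this reading a pseudo-measurement carries a variance 100 times larger than a SCADA reading of the same size, so the filter weights it correspondingly less.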
The architectural specifications for the proposed CNN-BiGRU-Attention model are detailed below. The CNN component comprises two one-dimensional convolutional layers and two corresponding max pooling layers. Specifically, the initial convolutional layer utilizes 32 filters of size 5, and the subsequent layer doubles the filter count to 64 with the same kernel size. The BiGRU layer is configured with 256 hidden units. Furthermore, the Attention component uses eight parallel attention heads with a projection dimension of 512. For training, the initial learning rate was fixed at 0.0003 (0.03%), the mini-batch size was 32, and training ran for 300 epochs.
5.1. Evaluation of Pseudo-Measurement Generation Method
To assess the efficacy and advantages of the proposed CNN-BiGRU-Attention pseudo-measurement generation method, we generated pseudo-measurement data for the nodes and branches lacking measurement devices. We compared our method with five other models: Model-1 (CNN), Model-2 (LSTM), Model-3 (GRU), Model-4 (BiGRU), and Model-5 (BiGRU-Attention). Our method is designated as Model-6. Pseudo-measurement data between 11:30 and 11:40 are generated and compared with the true values. The results of the injected power at node 9, phase B and the branch power on branch 16–17, phase B are shown in Figure 4. The performance of each model on the same test set was evaluated using the following metrics: mean absolute percentage error (MAPE), root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (R²). The results are summarized in Table 1.
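The four metrics follow their standard definitions and can be computed as in the sketch below. The sample values are invented for illustration and are unrelated to the results reported in Table 1.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical true values and generated pseudo-measurements.
truth = [2.0, 4.0, 5.0, 8.0]
pred = [2.2, 3.8, 5.0, 7.6]
scores = {"MAPE": mape(truth, pred), "RMSE": rmse(truth, pred),
          "MAE": mae(truth, pred), "R2": r2(truth, pred)}
```

Lower MAPE, RMSE and MAE indicate a closer fit, while R² approaches 1 as the generated values explain more of the variation in the true values.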
From Figure 4, it is evident that Model-2 fails to accurately predict data with significant fluctuations, resulting in a low goodness-of-fit. The MAPE of Model-4 is only 16.8753% of that of Model-3, demonstrating that a bidirectional GRU is a reasonable choice for extracting temporal dependencies. Both Figure 4 and Table 1 clearly show that Model-6, the proposed pseudo-measurement generation model, delivers superior predictive performance compared to all competing models, with results that closely align with the true values.
5.2. Basic Test of the NA-UKF Algorithm
To assess the efficacy of the proposed spatiotemporal pseudo-measurement generator and the NA-UKF algorithm, five comparative methods are designed. It is important to note that the proposed NA-UKF is built upon the architecture of the robust adaptive unscented Kalman filter (RAUKF) [27], with specific improvements made to the estimation mechanisms for process and measurement noise covariance to enhance robustness. To ensure a fair comparison, all common parameters among the UKF, adaptive extended Kalman filter (AEKF), RAUKF, and NA-UKF algorithms are set to identical values.
M1: Employ the proposed pseudo-measurement model with the standard UKF algorithm [27].
M2: Employ the proposed pseudo-measurement model with the AEKF algorithm [28].
M3: Employ the proposed pseudo-measurement model with the RAUKF algorithm [29].
M4: Employ a proportional method based on typical load curves for pseudo-measurement generation with the proposed NA-UKF algorithm.
M5: Employ the proposed pseudo-measurement model with the proposed NA-UKF algorithm.
Figure 5 and Figure 6 illustrate the state estimation results and MAE metrics for these five methods. Both figures show that the adaptive algorithms (AEKF, RAUKF and NA-UKF) generally perform well. The unscented transformation-based adaptive methods (M3 and M5) demonstrate superior overall tracking accuracy compared to both M1 and M2. To quantitatively analyze the estimation precision, Table 2 presents their evaluation metrics. The mean vector error (MVE) is introduced to comprehensively evaluate the estimation deviation of both voltage magnitude and phase angle for phase A. Most importantly, the proposed M5 yields the lowest errors across all metrics, confirming the effectiveness of the proposed noise adaptive mechanism.
Furthermore, M5 exhibits better tracking performance than M4. This is because the CNN-BiGRU-Attention model effectively leverages spatiotemporal correlation to generate pseudo-measurement data of superior precision.
5.3. Test of Process Noise Adaptive Capability
This subsection assesses the adaptive estimation capabilities of our developed algorithm when facing time-varying statistical profiles in process uncertainty. To simulate the sudden mutation of process noise, the system is initially set to operate in a quasi-steady state. However, during the time interval 11:48–12:12, additional Gaussian process noise is injected into the true state values to mimic severe system fluctuations. Specifically, the standard deviations of the injected noise are set to 0.005 p.u. for voltage magnitude and 0.0015 rad for voltage phase angle. Four different algorithms, i.e., UKF, AEKF, RAUKF, and NA-UKF, are used for state estimation. Figure 7 displays the state estimation results. Table 3 presents the overall evaluation metrics for the four state estimation algorithms.
From Figure 7 and Table 3, it is evident that under severe changes in process noise, the standard UKF fails to track the dynamic changes effectively due to its fixed process noise covariance, resulting in the highest estimation errors. While the AEKF and RAUKF algorithms provide limited improvements, the proposed NA-UKF algorithm yields the most accurate estimation results. Compared to the RAUKF algorithm, the NA-UKF significantly reduces the RMSE: by 54.55% for voltage magnitude and by 62.38% for phase angle.
5.4. Test of Measurement Noise Adaptive Capability
To evaluate the algorithm’s robustness against non-Gaussian measurement outliers, a Laplace distribution is selected as a representative of non-Gaussian noise. The noise under this distribution exhibits “peaky” and “heavy-tailed” characteristics, which are significantly different from the Gaussian distribution. To ensure a rigorous comparison, the standard deviations for each measurement type are set identically to those used in the Gaussian scenarios. Figure 8 displays the state estimation results. Table 4 presents the evaluation metrics for the four state estimation algorithms.
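Matching the standard deviation of a Laplace distribution to a Gaussian one requires a scale parameter b = σ/√2, since the Laplace variance is 2b². The following sketch draws such noise as the scaled difference of two unit exponential variates; the value of σ, the seed, and the sample size are arbitrary choices for illustration.

```python
import math
import random

def laplace_sample(sigma, rng):
    """Zero-mean Laplace draw with standard deviation sigma (scale b = sigma / sqrt(2))."""
    b = sigma / math.sqrt(2.0)
    # The difference of two independent Exp(1) variates is Laplace(0, 1).
    return b * (rng.expovariate(1.0) - rng.expovariate(1.0))

rng = random.Random(1)
sigma = 0.01
draws = [laplace_sample(sigma, rng) for _ in range(50000)]

# Empirical standard deviation (the mean is zero by construction).
std = math.sqrt(sum(d * d for d in draws) / len(draws))
# Share of draws beyond three standard deviations.
tail = sum(abs(d) > 3 * sigma for d in draws) / len(draws)
```

With the same standard deviation, the Laplace distribution places a probability of exp(−3√2) ≈ 1.4% beyond three standard deviations, against roughly 0.27% for a Gaussian, which is the heavy-tailed behaviour that stresses filters built on a Gaussian assumption.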
From Figure 8 and Table 4, it is clear that under Laplace measurement noise, the proposed NA-UKF algorithm achieves the highest estimation precision. Relative to the RAUKF method, the NA-UKF significantly reduces the RMSE: by 58.69% for voltage magnitude and by 14.28% for phase angle.
5.5. Test of Load and Renewable Source Mutation
To evaluate the algorithm’s estimation performance under load and renewable source mutation scenarios, a dedicated simulation case is designed. The system operates normally until 11:55. At this timestamp, the active and reactive loads at nodes 9 and 25, as well as the charging loads at EV charging stations at nodes 19 and 23, are instantly increased by 50%. Simultaneously, the output of PVs at nodes 18 and 33 is cut to zero to simulate a sudden generation loss. These mutation conditions persist until 12:00, after which all loads and generation outputs are restored to their normal operational levels. Figure 9 and Table 5 display the state estimation results and the comparative evaluation metrics for the four algorithms.
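The mutation scenario can be expressed as a simple time-dependent rule applied to the base profiles. The function and constant names below are hypothetical, and treating the disturbance window as half-open (restored exactly at 12:00) is an assumption.

```python
from datetime import time

LOAD_NODES = {9, 25, 19, 23}   # loads at nodes 9 and 25; EV charging at nodes 19 and 23
PV_NODES = {18, 33}
START, END = time(11, 55), time(12, 0)

def scenario_value(node, base_value, kind, clock):
    """Return the load or PV value of a node at a given time of day."""
    disturbed = START <= clock < END
    if not disturbed:
        return base_value
    if kind == "load" and node in LOAD_NODES:
        return 1.5 * base_value   # instantaneous 50% increase
    if kind == "pv" and node in PV_NODES:
        return 0.0                # sudden generation loss
    return base_value

# Example: a 2.0 kW base load at node 9 during and after the disturbance.
during = scenario_value(9, 2.0, "load", time(11, 56))
after = scenario_value(9, 2.0, "load", time(12, 0))
```

Applying this rule to every node before each time-series power flow run would produce the step changes that the estimators are then asked to track.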
From Figure 9 and Table 5, it is evident that under load and renewable source mutation, the proposed NA-UKF algorithm demonstrates the most robust tracking capability, yielding the lowest errors across all metrics. Relative to the RAUKF method, the NA-UKF significantly improves the estimation accuracy. Specifically, the RMSE for voltage magnitude is reduced by approximately 26.83%, and the MVE is reduced by 25.35%.
5.6. Test of Robustness Against Bad Data
To validate the proposed algorithm’s robustness against bad data, a test with random bad data injection is conducted. Bad data are randomly added to the voltage measurements, with deviation magnitudes set to exceed 3% for voltage magnitude and 3° for voltage phase angle, simulating severe measurement outliers. Figure 10 and Table 6 display the state estimation results and the comparative evaluation metrics for the four algorithms under bad data conditions.
From Figure 10 and Table 6, it is clearly observable that the proposed NA-UKF demonstrates strong robustness. Relative to the RAUKF method, the NA-UKF significantly improves estimation precision. Specifically, the RMSE for voltage magnitude is reduced by approximately 27.73%, the RMSE for phase angle is reduced by 52.75%, and the MVE is reduced by 19.60%.
6. Conclusions
To address the challenges of low measurement redundancy and complex noise environments in renewable-dominated distribution networks, this paper proposes a dynamic state estimation method incorporating spatiotemporal data correlation and noise adaptiveness. The results demonstrate that the inherent spatiotemporal correlations in distribution grids can be effectively utilized to reconstruct missing measurement data. The proposed CNN-BiGRU-Attention model achieves accurate mapping from sparse real-time measurements to unmonitored nodes, significantly enhancing network observability. Furthermore, this study validates that adaptive tuning of noise statistics is essential for maintaining estimation accuracy under complex operating conditions. By integrating the NA-UKF algorithm, the proposed method mitigates the performance degradation often observed in traditional filters constrained by static noise covariance assumptions. Specifically, the AMF mechanism enables the estimator to rapidly track severe system fluctuations by dynamically adjusting the process noise covariance, while the RMD-based strategy effectively suppresses the impact of non-Gaussian measurement outliers. Numerical simulations on the IEEE 33-bus system confirm that this approach exhibits superior robustness and tracking precision compared to existing methods.
Future work will focus on two key directions to further advance the scalability and precision of dynamic state estimation. First, to address the challenges inherent in large-scale distribution networks, we aim to explore the deployment of the NA-UKF algorithm onto edge computing nodes. As grid complexity grows, transmitting vast quantities of raw measurement data to the cloud for centralized processing imposes significant bandwidth pressure and introduces latency. By enabling distributed, localized data processing, we aim to alleviate the computational and communication burden on the core network, realizing a more responsive monitoring architecture. Second, while the proposed CNN-based method effectively balances efficiency and accuracy, we acknowledge the theoretical advantages of graph neural networks (GNNs) in explicit topology modeling. Future research will investigate advanced GNN architectures and develop lightweight variants to overcome their computational bottlenecks and sensitivity to data quality, thereby further enhancing the extraction of spatial correlations in complex mesh networks.