State Estimation in Partially Observable Power Systems via Graph Signal Processing Tools

This paper considers the problem of estimating the states in an unobservable power system, where the number of measurements is not sufficiently large for conventional state estimation. Existing methods are either based on pseudo-data that is inaccurate or depends on a large amount of data that is unavailable in current systems. This study proposes novel graph signal processing (GSP) methods to overcome the lack of information. To this end, first, the graph smoothness property of the states (i.e., voltages) is validated through empirical and theoretical analysis. Then, the regularized GSP weighted least squares (GSP-WLS) state estimator is developed by utilizing the state smoothness. In addition, a sensor placement strategy that aims to optimize the estimation performance of the GSP-WLS estimator is proposed. Simulation results on the IEEE 118-bus system show that the GSP methods reduce the estimation error magnitude by up to two orders of magnitude compared to existing methods, using only 70 sampled buses, and increase of up to 30% in the probability of bad data detection for the same probability of false alarms in unobservable systems The results conclude that the proposed methods enable an accurate state estimation, even when the system is unobservable, and significantly reduce the required measurement sensors.


Introduction
Power system state estimation (PSSE) is a critical component of modern energy management systems (EMSs) for multiple purposes, including monitoring, analysis, security, control, and management of the power delivery [1]. The PSSE is conducted using topological information, power measurements, and physical constraints to estimate the voltages (states) at the system buses. The performance and reliability of the PSSE largely depend on the availability and the quality of the measurements [2]. However, there are various realistic scenarios where the system is partially observable (also named in the literature as unobservable) ( [1] Chpter 4, [3][4][5]), that is, the number of sensors is not sufficiently large, or sensors are not well placed in the network. The observability of the system may be compromised due to communication errors, topology changes, sensor failures [1], malicious attacks [6][7][8], and electrical blackouts [9]. A direct implication of system unobservability is that conventional estimators that assume deterministic states, such as the commonly used weighted least squares (WLS) estimator, can no longer be used since they are inaccurate, inconsistent, and may have large estimation errors even in the absence of noise [3,10]. Therefore, developing new estimation methods that enable the full functionality of systems without requiring observability is crucial for reliable operation of the power grid.
State estimation in partially observable systems must incorporate additional properties or information beyond the power flow equations in order to obtain a meaningful estimation. Most existing approaches are two-step solutions that first produce additional pseudo-measurements, e.g., based on short-term load forecasting, to make the system observable, and then estimate the states using an existing technique [11][12][13][14]. However, pseudo-measurements do not contain real-time data, and thus result in an inaccurate estimation. Dynamic state estimation utilizes measurements at different time points [14,15], but needs fast scan rates to capture the dynamics, and is also based on restrictive stationary system assumptions [14]. PSSE that uses data from smart meters and phasor measurement units (PMUs) to overcome observability issues [10,[16][17][18] is usually inapplicable, due to the limited deployment of these sensors [19], and their initial installment cost [20,21]. Sparse signal recovery methods [5] use matrix completion to estimate the states under low observability conditions. However, to employ the matrix completion framework, the measurements matrix should be low-rank. Unfortunately, this assumption is system-dependent and does not always hold, due to, e.g., the spatial correlation between loads at neighboring buses. Deep learning techniques have recently been used for pseudo-measurement generation, and for the reconstruction of missing data for PSSE [19,22,23]. However, these techniques heavily depend on the availability of numerous high-quality event labels that are rarely available in practice [24,25]. In addition, some of these methods do not utilize the physical model [19], which may result in poor performance in practice.
Concepts from graph theory and graphical models have been used in power systems for sensor placement [26], topology identification [27][28][29], state estimation [6,30,31], analysis [32,33], and optimal power flow calculation [34,35]. However, the graphical model methods assume a specific statistical structure, which does not necessarily apply in power systems, and often do not use the physical models, which may result in poor performance in practice. GSP is an emerging field that extends concepts and techniques from traditional digital signal processing (DSP) to data on graphs [36][37][38][39]. Recent works have proposed the integration of GSP in power systems, such as the application of the GSP framework for the power grid with PMU data in [40], the spectral graph analysis of power flows in [35], and attack detection by GSP in [7,41,42]. For unobservable power systems, the authors of [43] investigate the use of GSP methods for dynamic state estimation in unobservable systems based on PMUs and advanced metering infrastructures (AMIs) under the assumption of bandlimited graph signals. However, state estimation without full observability that does not depend on PMUs and AMIs using GSP tools has not been demonstrated before. Therefore, using GSP tools has a great potential for overcoming the lack of information in state estimation without full observability.
This research develops new GSP methods for state estimation in power systems based on interpreting the voltage signals (phases and magnitudes) as graph signals. First, it is shown empirically and analytically that the states for static PSSE, i.e., voltages, are smooth graph signals with respect to the nodal admittance matrix, which is a Laplacian matrix in the graph representation of the network. Second, a GSP-WLS estimation method is developed for PSSE in the direct current power flow (DC-PF) model that uses the graph smoothness of the states and does not require the full observability of the network. Next, a new approach for sensor placement is introduced in order to optimize the estimation performance obtained by the GSP-WLS estimator. Finally, the proposed estimation method is extended to the more realistic alternating current power flow (AC-PF) model by developing a regularized Gauss-Newton method for PSSE that uses the smoothness of the voltage phases and magnitudes. The simulations show that the proposed methods can accurately estimate voltage phases and magnitudes, and detect bad data under conditions of low observability, where standard methods cannot (or display poor performance).
The aim of this paper is to establish a GSP framework for state estimation, sensor allocation, and bad data detection in partially observable systems. The key novelties are as follows: • It is demonstrated that the voltages in the power system can be represented as smooth graph signals, where the graph Laplacian is the admittance matrix. This result can serve as a foundation for developing new GSP tools for other power systems applications in future research.
• New state estimation methods for PSSE in both DC-PF and AC-PF models are developed. These methods use the graph smoothness of the states, and do not require the full observability of the network. While regularization using the Laplacian quadratic form has been applied in various applications, such as image processing [44,45], principal component analysis (PCA) [46], data classification [47,48], and semisupervised learning on graphs [49,50], it has not been conducted before in the context of unobservable power systems. Additionally, the nonlinear measurement equations in the AC-PF model present a new challenge from a GSP perspective, requiring the incorporation of graphical information in the form of Laplacian regularization into the iterative method. As such, these state estimation methods contribute to the expansion of the GSP toolbox for a wide range of applications. • A new approach for sensor placement is introduced to optimize the estimation performance. As the mean squared error (MSE) of the estimator depends on the unknown state vector, the minimization of the Cramér-Rao bound (CRB) is utilized instead. This results in a novel approach that can potentially be applied to other applications in the future. • Numerical simulations on the IEEE 118-bus system are used to validate the merit of the new estimators and the new sensor placement method under different setups, compared to existing pseudo-measurement and matrix completion techniques.
The rest of this paper is organized as follows. In Section 2, the GSP background is introduced, as well as the model, and the conventional estimation approach. In Section 3, the GSP properties of the states are studied. In Section 4, the GSP-WLS state estimator is derived for the DC-PF model and a sensor placement method. In Section 5, the proposed estimation method to the AC-PF model is extended by deriving the regularized Gauss-Newton method. A simulation study is presented in Section 6, and the discussion and conclusions are provided in Section 8.

Background and Model
A power system can be represented as an undirected weighted graph, where the nodes and the edges of the graph are the grid buses and transmission lines, respectively. This section begins with background on the theory of GSP in Section 2.1. Then, the considered power flow measurement model, as well as the state estimation and network observability for this model, are presented in Section 2.2.
In the rest of this paper, vectors and matrices are denoted by boldface lowercase letters and boldface uppercase letters, respectively. The notations (·) T , (·) −1 , (·) † , and Tr(·) denote the transpose, inverse, Moore-Penrose pseudo-inverse, and trace operators, respectively. The mth element of the vector a and the (m, q)th element of the matrix A are denoted by a m and A m,q , respectively. Similarly, A S 1 ,S 2 denotes the submatrix of A, the rows and columns of which are indexed by the sets S 1 and S 2 , where A S = A S,S , and a S is a subvector of a containing the elements indexed by S. The cardinality of the set S is denoted by |S|. The gradient of a vector function g(x) ∈ R K with respect to x ∈ R M , ∂g(x) ∂x , is a matrix in R K×M , with the (k, m)th element equal to ∂g k ∂x m . The matrices I and 0 denote the identity matrix and the zero matrix with appropriate dimensions, respectively, and || · || denotes the Euclidean l 2 -norm.

Background: GSP Framework
Let G(V, ξ) be a general undirected weighted graph, where V = {1, . . . , N} and ξ = {1, . . . , P} are the sets of nodes and edges, respectively. The matrix W ∈ R N×N is the weighted adjacency matrix of the graph G(V, ξ), where W k,n denotes the weight of the edge between node k and node n. It is assumed that W k,n ≥ 0 and that W k,n = 0 if no edge exists between k and n. The graph Laplacian matrix is defined as The Laplacian matrix is a positive semidefinite matrix with the eigenvalue decomposition where the columns of V are the eigenvectors of L, V T = V −1 , and Λ is a diagonal matrix consisting of the ordered eigenvalues of L: 0 = λ 1 < λ 2 ≤ . . . ≤ λ N . By analogy to signal frequency in DSP, the Laplacian eigenvalues can be interpreted as the graph frequencies that, together with the eigenvectors in V, define the spectrum of the graph G(V, ξ) [37]. A graph signal is a function that assigns a scalar value to each node, and thus, is an N-dimensional vector. The graph Fourier transform (GFT) of a graph signal a with respect to the graph G(V, ξ) is [37] Similarly, the inverse GFT is obtained by left multiplication of a vector by V. The Dirichlet energy of a graph signal, a, is defined as where the second equality is obtained by substituting (1), and the last equality is obtained by substituting (2) and (3). The Dirichlet energy is a smoothness measure, which is used to quantify the variability encoded by the graph weights [37]. A graph signal, a, is smooth if where ε is small in terms of the specific application [37]. It can be seen that the smoothness condition in (5) requires connected nodes to have similar values (according to (4)), and forces the graph spectrum of the graph signal to be concentrated in the small eigenvalues region (according to (4)). A graph filter applied on a graph signal is a linear operator that satisfies [51] a out = Vdiag(ψ(λ 1 ), . . . , ψ(λ N ))V T a in , where a out and a in are the output and input graph signals, diag(a) is a diagonal matrix in which the (n, n)th entry is a n , and ψ(λ n ) is the graph filter frequency response at the graph frequency λ n , n = 1, . . . , N. Low-pass graph filters of order K are defined as follows [52]. (6) is a low-pass graph filter of order K with a cutoff frequency at

Definition 1. The graph filter in
This definition implies that if η K < 1, then most of the energy of the graph filter is concentrated in the first K frequency bins of the graph filter [52]. Upon passing a graph signal through Vdiag(ψ(λ 1 ), . . . , ψ(λ N ))V T , the high-frequency components (related to graph frequencies greater than λ K ) are attenuated relative to the low-frequency components (related to graph frequencies lower than λ K ). Accordingly, as long as the input of the filter is a "well-behaved" excitation, and does not possess strong high-pass components, the output signal is a K-low-pass graph signal [52], and thus, a smooth graph signal for small K as defined in (5).

DC-PF Model: State Estimation and Observability
A power system is a network of buses (generators or loads) connected by transmission lines that can be represented as an undirected weighted graph, G(V, ξ), where the set of nodes, V, is the set of N buses, and the edge set, ξ, is the set of P transmission lines between these buses. The set of all sensor measurements is denoted by M, which includes M = 2P + N active power measurements at the N buses and at the bi-directional P transmission lines. According to the π-model [1], each transmission line, (k, n) ∈ ξ, which connects buses k and n, is characterized by an admittance value, Y k,n .
The active power and the voltages obey multivariate versions of Kirchhoff's and Ohm's laws that result in the nonlinear equations of the AC-PF model (see Section 5). In order to analyze the GSP properties and to simplify the presentation of the new methods, first these equations are approximated using the DC-PF model [1], in which the states are the voltage angles. Therefore, we consider first a DC-PF model with the following noisy measurements of the active power [1]: where • z = [z 1 , . . . , z M ] T ∈ R M is the active power vector. • θ = [θ 1 , . . . , θ N ] T ∈ R N is the system state vector, where θ n is the voltage angle at bus n. In low-observability systems, it is more convenient to delay the assignment of the reference angle (p. 165 in [2]). Thus, θ includes the angle of the reference bus. • e ∈ R M is zero-mean Gaussian noise with covariance R. • H ∈ R M×N is the measurements matrix, which is determined by the topology of the network, the susceptance of the transmission lines, and the meter locations [6]. In particular, the N rows of H associated with the meters on the buses that measure the total power flow of the transmission lines connected to these buses together create the nodal admittance matrix B (e.g., see p. 97 in [2]) with the following (k, l)-th element: where N k is the set of buses connected to bus k and b k,n < 0 is the susceptance of (k, n) ∈ ξ, i.e., b k,n equals the imaginary part of Y k,n .
The goal of DC-PF PSSE is to recover the state vector, θ, from the measurement vector, z, for various monitoring purposes [1,31]. Since θ also includes the reference bus, without loss of generality, the angle of bus 1 (the reference bus) is set to be θ 1 = 0. The PSSE in this case is implemented using the following WLS estimator [1]: The solution of (10) is where andV = V \ 1 is the set of all buses except the reference bus.
For state estimation to be feasible, one needs to have enough measurements such that the system state can be uniquely determined by the WLS estimation approach. This observability requirement, before the assignment of reference angles, can be defined by one of the following (p. 165 in [2]).

Definition 2.
Assume the DC-PF model from (8). The network is observable if any matrix that is obtained from H by deleting one of its columns has a full column rank of N − 1. Alternatively, the network is observable if the following holds: Hθ = 0 if, and only if, θ = α1, where α is an arbitrary scalar.
In particular, since according to Definition 2, H M,V has full column rank for an observable system, observability ensures that H T M,V R −1 H M,V is nonsingular, and the WLS estimator from (11) and (12) is well-defined for any observable network. Definition 2 implies the following corollary: The network is unobservable if the conditions in Definition 2 are not satisfied.
In practice, however, network observability is not always guaranteed. In such cases, the WLS estimator from (11) cannot be implemented. Even for observable systems, errors and outliers may have a disastrous effect on the state estimation. In the following, it is shown that incorporating graphical information using GSP tools improves the state estimation performance and enables estimation even in partially observable systems.

GSP Properties of the States
The power system can be represented as an undirected weighted graph, G(V, ξ), as described at the beginning of Section 2.2. In this context, the state vector, θ ∈ R N , and the subvector of z from (8) that contains the N active power injection measurements at the N buses, denoted as z bus , can be interpreted as graph signals. In this graph representation, the nodal admittance matrix from (9) is a Laplacian matrix: In this section, the graph low-pass nature and the smoothness of the state vector is established in power systems under normal operation conditions, and where the Laplacian matrix is set to be the nodal admittance matrix, as defined in (13). That is, using the smoothness defined in (4) and (5), we show that where ε is small relative to the other parameters in the system. These results are consistent with the low-pass graph nature of the complex voltages described in [40]. It can be seen from (14) that the smoothness property depends on the system states and the network topology. The lack of measurements does not change the physical behavior; therefore, the smoothness property is not affected by the system observability.

Theoretical Validation-Output of a Low-Pass Graph Filter
First, we show analytically that the state vector is a low-pass graph signal. By substituting (13) in the model in (8), after taking only the power injection measurements, one obtains where e bus contains the elements of the noise vector, e, that are related to the N power measurements at the N buses. Equation (15) implies that since L is a Laplacian matrix, it satisfies L1 = 0 [53], where 1 is a vector of ones with appropriate dimensions. Therefore, the states can be recovered from (15) up to a constant shift, which can be written as where c 1 is an arbitrary constant that represents the constant invariant property of the state vector [1], and the last equation is obtained by substituting (2). Without loss of generality, the value of c 1 is set to be where c 2 is an arbitrary constant. Using (17) and the definition of the pseudo-inverse operator, the model in (16) can be written as the following linear graph filter input-output model: where That is, θ is an output of a graph filter with the graph frequency response in (19). This representation holds under the assumption that the network is connected. Therefore, λ 1 is the only zero eigenvalue of L with the eigenvector 1 √ N 1 [53]. Since the eigenvalues of L are ordered, 0 = λ 1 < λ 2 ≤ λ 3 ≤ . . . ≤ λ N , it can be seen that the graph frequency response in (19) decreases as n increases, as long as c 2 > 1 Nλ 2 . By substituting (19) in (7), one obtains where (18) is a graph low-pass filter of any order K ≥ 1. Since z bus in (16) includes the generated powers and loads, it can be assumed to be random [54,55], and thus, the input signal, z bus − e bus , does not possess strong high-pass components. Hence, as explained after Definition 1, the state vector θ is a first-order low-pass graph signal, and a smooth graph signal, as defined in (14).

Experimental Validation in IEEE Systems
In the following, the smoothness of the state signal is demonstrated in the graph frequency domain for the IEEE test case systems [56]. We also demonstrate the smoothness of the voltage magnitude vector, v = [|v 1 |, . . . , |v N |] T , which can be interpreted as graph signals, which will be used in Section 5. Figure 1 compares the normalized state vector, θ ||θ|| , and its GFT (calculated using (3)),θ ||θ|| , versus bus or spectral indices, for the IEEE 118-bus system [56]. Similarly, Figure 2 presents the normalized voltage magnitude vector, v ||v|| , and its GFT,ṽ ||ṽ|| , and Figure 3 presents the normalized power vector, z bus ||z bus || , and its GFT,z bus ||z bus || . For the sake of clarity, the vectors in Figures 1-3 have been decimated by a factor of 3.
It can be seen that most of the energy of the state signal, i.e., the phases ( Figure 1) and the magnitudes (Figure 2) of the voltages, is concentrated in the low graph frequencies region. Accordingly, it can be concluded that the state vector and the voltage magnitude vector are smooth graph signals in the sense of (4). In contrast, the energy of the power injection measurement vector ( Figure 3) is uniformly distributed across all graph frequencies. Thus, the power signal cannot be considered to be smooth. Similar results were obtained for other IEEE systems. Next, we validate experimentally that the states, θ, and the magnitudes, v, are significantly smoother than the power vector, z bus , by comparing their normalized Dirichlet energy for typical IEEE systems, as shown in Table 1. The values of the nodal admittance matrix, B = L, the voltages, and the power data are taken from [56]. It can be seen that the phases and magnitudes are much smoother than the power injection vectors. This result is reasonable, since the phase differences between connected buses are small under normal conditions and the magnitudes are approximately constant [31], while the power may be very different, since each generator/load injects different amounts of power into the system.

GSP-WLS Estimator in DC-PF Model
The recovery of smooth graph signals by incorporating regularization terms has been well studied in the GSP literature [51,57] and in the context of Laplacian regularization [58,59]. In this section, we cast the state estimation problem as a regularized graph signal recovery problem. In particular, we exploit the smoothness of the state vector, established in Section 3, to develop the smoothness-based regularized GSP-WLS estimator of the states in Section 4.1. The properties of the proposed approach are discussed in Section 4.2, where the main advantage is that it does not require system observability. In Section 4.3, an estimator of the missing power data is introduced as a by-product of this approach. Finally, in Section 4.5, a sensor allocation policy that aims to optimize the performance of the GSP-WLS estimator is designed.

GSP-WLS Estimator for the Partial Measurement Model
In the following, the case where only partial observations of the signal z from (8) are available over a subset of sensors from M is considered, where this subset is denoted by S and S ⊆ M. A sensor at a particular location provides one row in the measurements matrix, H. Therefore, based on the model in (8), the partial measurement vector can be written as Since e S contains the elements of the noise vector, e, of the set of available measurements, S, it is a zero-mean Gaussian noise vector with a covariance matrix R S . If the columns of H S,V after deleting one column are linearly dependent, then, from Corollary 1, the new model in (21) with H S,V is not fully observable. In this partially observable case, the WLS estimator for the model in (21) cannot be developed via a similar strategy to that in (10) and (11) , since, according to Definition 2, the state, θ, cannot be uniquely (up to a constant) determined from (21).
As a result, we need to incorporate additional properties beyond the power flow equations in (21) to obtain a valid state estimation. Here, we propose to recover θ using the GSP-WLS estimator that incorporates the smoothness constraint from (14). Hence, the GSP-WLS estimator is defined bŷ UsingV = V \ 1, θV is the state vector without the reference bus state, θ 1 , and LV is the submatrix of L obtained by removing its first row and column. The smoothness constraint in (14), after substituting θ 1 = 0, can be rewritten as Using the smoothness constraint from (23) and substituting θ 1 = 0 in (22), the GSP-WLS estimator is given by andθ GSP-WLS 1 = 0. Then, using the Karush-Kuhn-Tucker (KKT) conditions [60], the minimization problem in (24) can be replaced by the following regularized optimization problem (e.g., see pp. 17-19 in [61]): andθ GSP-WLS 1 = 0. The term θ T V LV θV is a regularization term, which is based on the smoothness constraint from (23). The parameter µ ≥ 0 is a Lagrange multiplier, which is a tuning parameter that replaces ε and is discussed in Section 4.2. If the system is not fully observable based on the sensors at S, then µ should be larger than zero, i.e., µ > 0.
The GSP-WLS estimator from (25) is obtained by equating the derivative of (25) with respect to θV to zero, which results in [61] For a partially observable system, the matrix H T S,V R −1 S H S,V is a singular matrix and the additional term in (27), and µLV with µ > 0 enables the matrix inversion and improves the numerical stability of the proposed GSP-WLS estimator, since LV has full rank (see Lemma 1 in [62]).

Remarks
The main advantage of the proposed GSP-WLS estimator in (26) and (27) is that it does not require full observability of the system. This estimator is a function of the regularization parameter, µ ≥ 0. The determination of µ is discussed in Section 6. More strategies to choose µ are described in the literature (e.g., see Section 1.8 in [61]). The following are special cases of the proposed GSP-WLS estimator: 1.

2.
Large µ: At the other extreme, for µ → ∞, the coefficient matrix from (27) satisfies lim µ→∞K (S, µ) = 0. Thus, in this case, the GSP-WLS estimator from (26) satisfieŝ θ GSP-WLS → 0. This zero estimator can be interpreted as the a priori state estimator, which does not use the observations. Thus, taking too large a value of µ is unhelpful.

3.
Relation with the pseudo-measurement WLS (pm-WLS) estimator: The pm-WLS estimator for systems that are not fully observable is based on generating pseudomeasurements of typical power injection/consumption values from historical data [1,24]. In this case, the received measurements are processed together with a priori estimated (predicted) states (without the reference bus),θ prior ∈ R N−1 , which are assumed to have the error covariance matrix, R prior ∈ R (N−1)×(N−1) . The pm-WLS estimator is the maximum a posteriori state estimator [11]: where It can be seen that ifθ prior = 0 and R −1 prior = µLV , then K 1 =K(S, µ) and the pm-WLS estimator in (29) coincides with the GSP-WLS estimator in (26). Therefore, the proposed GSP-WLS estimator can be interpreted as a special case of the pm-WLS estimator, where the GSP theory provides a mathematical strategy to determine the pseudo-data information. Moreover, in general, the GSP-WLS estimator only requires setting a single scalar parameter, µ, compared with the pm-WLS estimator, which requires setting both R prior andθ prior .

Estimation of Missing Power Measurements
An important by-product of the GSP-WLS estimator is the following method for reconstructing the missing data of active power measurements. In the partially observable system, we have measurements obtained from the set of sensors, S, which is given by z S . Our goal in this subsection is to recover the other measurements that are included in the vector z M\S . Based on the model in (8) (similar to (21)), the measurement vector of the partially observable system can be written as where e M\S is a zero-mean noise vector with a covariance matrix R M\S . By substituting the GSP-WLS estimator from (26) in (32) and removing the noise term, the following WLS-type estimator of the missing power measurements is obtained: By recovering the lost power data, the EMS can also monitor the unobservable part of the system [63].

Detection of Bad Data in Unobservable Systems
State estimators can also be used for bad data detection by plugging it into any detector that is based on a state estimator. In particular, in this paper, the following bad data detection methods that are based on a general estimatorθ can be used with the new estimator: • Largest normalized residual test (LNR) [1]: where the infinity norm of a vector a is defined by : • The GFT-based detector from [7] that was developed for the detection of false data injection (FDI) attacks. The GFT-based detection scheme calculates the GFT of an estimated grid state,θ, and filters the graph's high-frequency components. By comparing the maximum norm of this outcome with a threshold, it can detect the presence of FDI attacks.
In both (34) and (35), τ denotes a chosen threshold, and H 0 and H 1 denote the hypotheses of good/bad data, respectively.

Optimization of the Sampling Policy
Sensor locations have a significant impact on the estimation performance of power systems [64]. Therefore, in this subsection, we design a sensor allocation policy for the model in (21) that aims to minimize the MSE of the GSP-WLS estimator, − θ)]. However, it can be shown that the MSE ofθ GSP-WLS is a function of the unknown state vector, θ, and thus, cannot be used as an objective function for the optimization of the sensor locations. Therefore, the MSE is replaced by the CRB [65], which is a lower bound on the MSE.
In this subsection, we treat the MSE, bias, and CRB of the vector θV (without the reference bus for the sake of simplicity). By substituting θ 1 = 0 in the model in (21), one obtains that the partial measurement vector obtained from a sensor subset S is a Gaussian vector with mean H S,V θV and covariance R S : The CRB for this Gaussian vector, which is a lower bound on the MSE, is given by (pp. 45-46 in [65]) where b(S ) = E[θV − θV ] is the bias of the estimator and ∂b(S ) ∂θV is its gradient. Using the model in (21) and the estimator in (26), it can be seen that the bias of the GSP-WLS estimator is whereK(S, µ) is defined in (27). Accordingly, the gradient of (38) with respect to θV is By substituting (39) in (37), we obtain that the CRB on the MSE of estimators with the GSP-WLS bias is given by By substituting (27) in (40) and using the pseudo-inverse property, A = AA † A, one obtains The CRB in (41) is not a function of the unknown state vector, θ, and thus can be used as an optimization criterion for choosing the sensor locations. We assume a constrained quantity of sensing resources, e.g., due to a limited energy and communication budget. Therefore, the problem of the selection of sensor locations with onlyq sensors can be written as follows: where the last equality is obtained by substituting (41).
It is assumed that in the measured buses, all the relevant power measurements are given. Thus, S is uniquely determined by the buses chosen for the measurements. For the sake of simplicity, in the optimization approach, we take H V,V = L V,V and replace the selection ofq sensors by the selection of q buses. Therefore, by substituting H V,V = L V,V , the problem in (42) is replaced by the problem of selecting the optimal buses in the CRB sense. However, finding the set of q locations among all the N buses with the smallest CRB is a combinatorial optimization with a computational complexity of ( N q ), which is practically infeasible. Therefore, we propose a greedy algorithm, Algorithm 1, for the practical implementation of the sampling scheme. The idea behind this algorithm is to iteratively add to the sampling set those buses that lead to the minimal CRB. In addition, Figure 4 illustrates the data flow diagram of this greedy algorithm for selecting the measured buses.

Algorithm 1 Greedy selection of the measured buses Input:
(1) Laplacian matrix, L, and noise covariance matrix, R (2) Number of buses with sensors, q (3) Regularization parameter, µ Output: Subset of q buses, S 1: Initialize the bus subset S (0) = ∅ and the iteration, i = 0 2: while i < q do 3: Update the set of available locations, L = V \ S (i)

4:
Find the optimal bus to add: whereK is defined in (27) with H V,V = L V,V

5:
Update the subset of buses, S (i+1) ← S (i) ∪ w opt , and the iteration, i ← i + 1 6: end while 7: Update the chosen subset of buses: S = S (i)

Extension to the AC-PF Model
Since the problem of low observability mainly occurs in distribution systems, which requires AC state estimation, in this section, the GSP-WLS estimator is extended to the AC-PF model, where voltage magnitudes are estimated as well. In particular, the conventional PSSE in the AC-PF model is described in Section 5.1. Then, in Section 5.2, the proposed iterative regularized Gauss-Newton method that exploits the smoothness property of the voltage phases and magnitudes in each iteration is presented. In Section 5.3, the properties of the proposed GSP Gauss-Newton algorithm are discussed.

Model, State Estimation, and Observability
In the following, we replace the DC-PF model from (8) by the following nonlinear AC-PF model equations: where • z ∈ R M is the measurement vector that includes the active and reactive branch power flows and power injections. The specific forms and parameters of (44) with different levels of modeling details can be found, e.g., in Chapter 2 of [1].
Similar to the WLS estimator for the DC-PF model in (10), the AC-PF state estimator is usually based on minimizing the following WLS objective function: with respect to x [1]. The first-order optimality condition for the unconstrained minimization problem in (45) is given by where H(x) = ∂h(x) ∂x is the Jacobian matrix of h(x) at x. Solving the nonlinear equation in (46) using the Gauss-Newton method [1,2] results in the following iterative system: where x (i) is the state estimator at the ith iteration and is the gain matrix. Iterating until convergence, i.e., until ||x (i+1) − x (i) || ≤ δ, one will obtain the solution of PSSE. The observability requirement for the AC-PF model can be defined as follows (see Chapter 4.6 in [1], Donti et al. [5]).

Definition 3.
Assume the AC-PF model from (44). The network is observable if G(x) is a nonsingular matrix for any x in the solution space.
By observing (48), it can be seen that if H(x) has a full column rank of 2N − 2, then the network is observable in the AC-PF sense. This observability condition should be satisfied in each iteration of the Gauss-Newton iterative algorithm.

GSP-Based Gauss-Newton Algorithm
Similar to Section 4, here we consider the case where only partial observations of z from (44) are available over a subset of sensors S ⊆ M. That is, based on the model in (44), the partial measurement AC-PF model can be written as where e S is a zero-mean Gaussian noise vector with a covariance matrix R S , as in (21). The Jacobian matrix of the model in (49) is H S,V (x) = ∂h S (x) ∂x , whereV indicates the set of all the columns in H(x). If the columns of H S,V (x) are linearly dependent, then G(x) is a singular matrix, and from Definition 3, the new system in (49) is not fully observable. In this case, the Gauss-Newton iterative procedure for the minimization of (45) cannot be implemented, since the update of the solution cannot be uniquely determined from (47).
In order to tackle this problem, we incorporate the smoothness constraints from (23) and from (5) with a = v. Hence, the GSP-WLS estimator for the AC-PF model is defined bŷ where the function J is defined in (45), and ε θ , ε v are the tuning parameters of the smoothness of θ and v.
Using the KKT conditions [60], the minimization in (50) can be replaced by the following regularized optimization: where and The right term in (52) is a regularization term, which is based on the smoothness property of the phases and magnitudes of the voltages, established in Section 3. The parameters µ θ , µ v ≥ 0 are Lagrange multipliers that replace ε θ , ε v as regularization parameters, and their tuning is discussed in Sections 5.3 and 6.3. The minimum of the quadratic objective function, J reg (z S , h S (x), R S ,L(µ θ , µ v )), with respect to x can be determined using the first order optimality conditions as follows: Then, similar to (47), the nonlinear equation, g reg (x) = 0, is solved using the following Gauss-Newton method iteration: (55) where is the new gain matrix. Solving this equation and iterating until the required accuracy, δ, is reached, i.e., ||x (i+1) − x (i) || ≤ δ, one will obtain the proposed GSP-WLS estimator for the AC-PF model. It can be seen that for a partially observable system, H T S,V (x)R −1 S H S,V (x) is a singular matrix, and the additional terms in (56),L(µ θ , µ v ) from (53) with µ θ > 0, and/or µ v > 0, can enable the matrix inversion of G reg (x) and improve the numerical stability of the GSP-WLS estimator for the AC-PF model. The iterative solution is summarized in Algorithm 2. For the initialization, we suggest that one use the "flat start", where all bus voltages are 1 per unit and have the same phase [1]. Figure 5 presents the flow of the iterative regularized Gauss-Newton algorithm through a data flow diagram. if ||x (i+1) − x (i) || < δ: break 6: end for 7: Update the state vector:x = x (i+1)

Remarks
In the following, we present special cases of the proposed GSP-WLS estimator for the AC-PF model implemented by the regularized Gauss-Newton method.
(1) Full observability: If the information from all sensors is available, i.e., if h S (x) = h(x), then, by substituting µ θ = µ v = 0 in (55) one obtains that G reg (x) = G(x), where G(x) and G reg (x) are defined in (48) and (56), respectively. Accordingly, in this case, the Gauss-Newton iteration of the GSP-WLS estimator in (55) coincides with the Gauss-Newton iteration in (47).
(2) Relation with the pm-WLS estimator: Similar to the case of the DC-PF model, the pm-WLS estimator for partially observable systems is calculated based on the measurements and the a priori estimated (predicted) states,x prior ∈ R 2N−2 , with the forecasting error covariance matrix, R prior . As a result, the following pm-WLS estimator is obtained [11]: The pm-WLS estimator for the AC-PF model can be calculated with the Gauss-Newton method [11]. It can be seen that if we substitutex prior = x 0 and R −1 prior =L(µ θ , µ v ), then the pm-WLS estimator from (57) coincides with the GSP-WLS estimator from (51) and (52). Hence, similar to the DC-PF model, the proposed GSP-WLS estimator can be interpreted as a special case of the pm-WLS estimator, where the GSP theory provides the mathematical justification for the determination of the pseudo-data information.

Results
In this section, the performance of the proposed methods is compared with that of existing methods. In Section 6.2, the performance of the GSP-WLS estimator from Section 4 is evaluated. In Section 6.3, the performance of the regularized Gauss-Newton method from Section 5 is investigated. The influence of the sampling policy from Section 4.5 is examined for both cases. In Section 6.4, the use of the new estimator for bad data detection is demonstrated.

Simulations Platform and Parameters
All the simulations were performed with Matlab R2020b. The measurements were generated according to the AC-PF model from (44) with h(·) |v n ||v k |(G n,k cos(θ n − θ k ) + B n,k sin(θ n − θ k )), n = 1, . . . , N, for the real and reactive power injection measurement at bus n and h 2N+2 f (n,k)−1 (x) = |v n | 2 G n,n − |v k ||v n |(G n,k cos(θ n − θ k ) + B n,k sin(θ n − θ k )), ∀n ∈ N k , k = 1, . . . , N, for the real and reactive power flow from bus n to bus k measurements ( f (n, k) is a one-toone mapping of all (n, k) ∈ ξ to [1, 2P]), where x = [θ 2 , . . . , θ N , |v 2 |, . . . , |v N |] T ∈ R 2N−2 is the state vector here, bus 1 is the reference bus, and thus θ 1 = 0 and v 1 is known. The values of the conductance and the susceptance matrices, B and G, as well as the values of the voltage angles and magnitudes, were taken from the IEEE 118-bus test case recorded in [56]. This IEEE 118-bus test case represents a simple approximation of the American electric power system (in the U.S. Midwest) as of December 1962. This IEEE 118-bus system, which has N = 118 buses and, at most, M = 952 power measurements, contains 19 generators, 35 synchronous condensers, 177 lines, 9 transformers, and 91 loads. Bus number 1 is set to be the slack bus. Gaussian white noise was added to generate several measurements from the recorded grid state. In particular, in the simulations described in the following subsections, the noise covariance matrix is set to R = σ 2 I, where, unless otherwise stated, σ 2 = 0.01. The performance is evaluated using 1000 Monte Carlo simulations. We compare the estimation performance of the different estimators implemented for the following bus selection policies: The basic assumption behind this method (which was suggested in [66] for power systems) is that the measured graph signal (here, the power signal) is an R-bandlimited signal in the graph frequency domain. That is, the GFT of z bus satisfies (z bus ) n = 0, n = R + 1, . . . , N, where R is the cutoff frequency. As can be seen in Figure 3, in practice, the R-bandlimitness assumption does not hold for the power signal. (iii) Minimum CRB (Algorithm 1)-the proposed bus selection policy from Algorithm 1.
It should be noted that the CRB from (37) is not presented in the following simulations, since it is not the main goal of this study, and in order to increase the interpretability of the figures. Figure 6 presents the estimated probability for the system to be observable, according to Definition 2, versus the number of measured buses. The estimated probability of observability is calculated as the percentage of scenarios with observable systems in 100, 000 Monte Carlo simulations for randomly selected buses in the system. It can be seen that this probability of observability increases as the number of measured buses increases, and that for fewer than 72 measured buses, the IEEE 118-bus system will not be observable with probability 1.  Figure 6. The estimated probability for the IEEE 118-bus system to be observable versus the number of measured buses.

State Estimation and Sampling under the DC-PF Model
In this subsection, we evaluate the performance of the following estimators: 1.
The pm-WLS estimator from [11], generated with R −1 prior = 0.5I, and whereθ prior is randomly chosen from a zero-mean Gaussian distribution with covariance 0.015I.

2.
The matrix compilation (mc) method [5], implemented in this subsection by substituting the DC model from (8) in the constraints of the method in Equations (8) and (12) in [5].

3.
The proposed GSP-WLS estimator (26) and (27) with µ = 0.1. In Figure 7a,b the MSE of the GSP-WLS, pm-WLS, and mc estimators is presented versus the number of measured buses, q, and versus 1 σ 2 , respectively, with the sampling policies (i)-(iii). Figure 7b is obtained for q = 48 measured buses, for which the system is not observable with probability 1. It can be seen that the MSE decreases as q increases and as σ 2 decreases. In both figures, the GSP-WLS estimator outperforms the pm-WLS and mc estimators for any tested sampling policy. In this case, the performance of the pm-WLS and the mc estimators is similar, since it can be shown that for the mc estimator, the rank regularization on θ (see Eq. (13a) in [5]) turns out to be an l 2 regularization on θ −θ prior , as in (29), whereθ prior approaches zero. In addition, it can be seen that the proposed sampling policy from Algorithm 1 results in a significantly lower MSE than that obtained for the random and the E-design sampling policies for all estimators. In Figure 7a, it can be seen that for each sampling policy, the MSE of the GSP-WLS, pm-WLS, and mc estimators coincides with that of the other methods where the system becomes observable (i.e., where q > 72 for the random sampling, q > 76 for the E-design sampling, and q > 79 for Algorithm 1). In particular, where q = 118 (i.e., full observability with probability 1), the MSEs of all estimators are identical. Figure 8 shows the MSE of the power estimatorẑ from (33) with the three estimators and the three sampling policies. It can be seen that the MSE decreases as the number of measured buses increases, as expected, since there are fewer parameters to estimate with the increase in the number of samples and the state estimation is more accurate, as presented in Figure 7a. It can be seen that the relationships between the sampling policies and the estimators are similar to Figure 7a, where the GSP-WLS estimator with the bus selection policy of Algorithm 1 achieves the lowest MSE. Moreover, for each sampling policy, the MSE of the power estimation based on the GSP-WLS, pm-WLS, and mc estimators coincides with that of the other methods where the system becomes observable (i.e., where q > 72 for the random sampling, q > 76 for the E-design sampling, and q > 79 for Algorithm 1).  (26) and (27), pm−WLS [11], and mc [5] estimators for random, E−design [38], and Algorithm 1 bus selection policies versus (a) the number of buses, q, with σ 2 = 0.01, and (b) 1 σ 2 with q = 48.  (26) and (27), pm−WLS [11], and mc [5] estimators for random, E−design [38], and Algorithm 1 bus selection policies versus the number of buses, q, where σ 2 = 0.01.
In Table 2 and in Figure 9, the influence of the tuning parameter, µ, is examined; we show that the proposed GSP-WLS estimator is robust to the choice of this tuning parameter by demonstrating that the estimator achieves the same performance for a range of values of µ. In Table 2, the average and the sample standard deviation of the MSE of the GSP-WLS estimator for different values of µ ∈ [0.01, 10] are presented, for a system with q = 48 and σ 2 = 0.01, for the three sampling policies. It can be seen that, for the tested scenario, the MSE is approximately constant for any µ in the range [0.01, 10]. Of course, further increasing µ will eventually increase the MSE, since the weight of the measurements in the estimation will be negligible. Similar results were obtained for other values of q and σ. Therefore, choosing any µ in this wide range results in good estimation performance for any sampling policy.  Figure 9 demonstrates the performance of the GSP-WLS estimator where the power system operating condition changes throughout the day, using time series data for different values of µ and correspondingly different sampling sets chosen by Algorithm 1. Simulation setup and parameters: For this case study, we use the modified IEEE 118-bus system in a similar manner to [67]. Figure 9a shows the hourly load profile of the system demand, as taken from http://motor.ece.iit.edu/Data/ (accessed on 1 August 2022). Given a single measure of the loads in all the buses and the total system demand along 24 h, the hourly load at bus n is the relative part of the load (among the other measured loads) from the total system demand. The true value of the state was calculated using the MATPOWER toolbox [68]. It can be seen in Figure 9b that for the MSE for µ = 0.01, 0.1, 1 is approximately the same. These results show that the proposed GSP-WLS estimator is robust to the tuning parameter in this range, even under changing operating conditions. More strategies for choosing µ are described in the literature (e.g., see [61]).

State Estimation and Sampling under the AC-PF Model
In the following, we discuss the estimation of both the phases and the magnitudes of the states using the estimators: 1.
The Gauss-Newton implementation of the pm-WLS estimator from [11] by Algorithm 2 with the appropriate replacements of the regularization terms, i.e., wherex prior = x 0 and R −1 prior = I are used instead ofL(µ θ , µ v ) (see the remarks after (57)).

2.
The mc method from [5], implemented using Equations (6), (8), (12a), and (12c) from [5], where the low-rank matrix used in this method, composed of the real and the imaginary parts of v. In addition, we added to this method the current measurements as inputs (instead of the power flow measurements) for a fair comparison. The implementation was conducted by the SDP solver of CVX [60].
In the Gauss-Newton-based methods, the maximal number of iterations is set to l = 20 and δ = 10 −8 in Algorithm 2.
In Figure 10a the MSE of phase estimation by the GSP-WLS and pm-WLS estimators (both implemented by the regularized Gauss-Newton method) are presented versus the number of measured buses, q, for the sampling policies (i)-(iii). Similarly, in Figure 10b, the MSE of the magnitude estimation is presented. It can be seen that the MSE decreases as the number of measured buses increases for both the magnitudes and the phases. Moreover, the GSP-WLS estimator outperforms the pm-WLS and the mc estimators for any sampling policy. Figure 10a shows that for each sampling policy, the MSEs of the GSP-WLS, and the pm-WLS estimators for the phases separate from each other when the system observability can no longer be guaranteed (i.e., where q < 72 for the random sampling, q < 76 for the E-design sampling, and q < 79 for Algorithm 1). In particular, where q = 118 (i.e., full observability with probability 1), the performances of the estimators coincide. Figure 10b demonstrates that for q = 118, the MSEs of the GSP-WLS and the pm-WLS estimators for magnitude are equal. For an observable system with q < 118, the MSEs of the estimators coincide for each sampling policy separately. Finally, when the system becomes unobservable (i.e., where q < 72 for the random sampling, q < 76 for the E-design sampling, and q < 79 for Algorithm 1), the MSEs of the estimators split, and the MSE of the GSP-WLS is lower.
It should be noted that the mc estimator implementation by CVX has higher computational complexity and a lot of tuning parameters in comparison to the other methods. Finally, it can be seen that the sampling policy from Algorithm 1 results in a significantly lower MSE than that obtained for the random and the E-design sampling policies for the pm-WLS and the GSP-WLS estimators.
In Figure 11, we examine the influence of the tuning parameters µ θ and µ v on the estimation performance of the phases and magnitudes, where the sampling set is chosen by Algorithm 1. It can be seen that for µ θ , µ v ∈ [0.01, 10], the MSE is approximately constant. Hence, choosing any µ θ and µ v in this range will obtain a good result.

Detection of Bad Data
In the following, the application of bad data detection based on the proposed estimator is demonstrated, based on the description in Section 4.4. We compare the performance of the following detectors: • The LNR test from (34), the J(θ) test from (35), and the GFT-based detector from [7], after substituting eitherθ =θ pm-WLS from (29) orθ =θ GSP-WLS from (26) and (27). In the GFT-based method, the cutoff frequency is chosen such that in normal states, 35% of the frequencies pass the filter. • The L ∞ detector from [69] with Σ x = I.

•
The Laplacian-Regularized detector (Lap.Reg.) from our previous work in [41], which also uses the GSP-WLS estimator,θ GSP-WLS . 50   of the pm−WLS [11], mc [5], and GSP−WLS (Algorithm 2) estimators for random, E−design [38], and Algorithm 1 bus selection policies versus the number of buses, q. Simulation setup and parameters: The detection performance is evaluated for partially observable systems with only power injection measurements, i.e., L ν,S = H ν,S , with q = 48 buses that are chosen by Algorithm 1. Constant noise (∆z = 10σ) has been added onto three random measurements, i.e., z i + ∆z at the ith bus, where z i is obtained by (21).
In Figure 12, the detection performance of the proposed detectors is demonstrated by the receiver operating characteristic (ROC) curves. This figure shows that the detectors that are based on the GSP-WLS estimator outperform the detectors that are based on the pm-WLS estimator, as well as the L ∞ detector. These results are aligned with the estimation results, since the MSE of the state estimation by the GSP-WLS estimator is significantly lower than the MSE obtained by the pm-WLS estimator, as shown in Figure 7.  (35), GFT−based detector [7] implemented with the GSP−WLS (26) and (27), and the pm−WLS [11] estimators, L ∞ [69], and the Laplacian−regularized [41] detectors with q = 48 buses.

Discussion
Utilization and Implications: The proposed GSP-WLS estimator has the potential to significantly improve state estimation in unobservable power systems, which can be caused by various reasons, such as communication issues, cyber attacks, and legal or economic constraints on measurement locations. By utilizing GSP techniques, the proposed estimator can achieve more accurate estimates with fewer measurements, reducing the cost and complexity of data collection. In addition, the proposed sensor placement strategy optimizes the performance of the estimator, further improving its accuracy. These GSP techniques have important implications for the reliability and stability of power systems, as well as for the development of advanced control and optimization algorithms. The research also demonstrates the capabilities of GSP in power system networks, highlighting the potential for its use in other tasks within the field. Overall, this research aims to increase the stability and reliability of electrical grids by developing monitoring techniques. Additionally, this study is closely related to the control and management of renewable energy sources, since using enhanced estimation, sensor allocation, and attack detection methods, the ability of the grid to deal with randomness in the system that stems from the sources increases.
Limitations: This research provides a new GSP framework for state estimation and bad data detection. However, it has some limitations that should be considered in future research. First, it only uses the most recent vector of measurements. While this is a good policy when the system may be under abrupt changes, the performance of the estimator can be improved by incorporating historical measurements from previous time steps. Second, the topology of the network is assumed to be known, which may not always be the case, particularly for distribution systems. In such cases, methods for estimating the topology [29,70] could be used. Third, the Gauss-Newton algorithm used in this study may be computationally expensive for large networks. To address this issue, future research could use low-complexity algorithms to calculate matrix inversion, as described in [1]. Finally, this study has not explored the incorporation of smart meter and PMU measurements into the proposed methods, and this should be considered in future research.
Future research: Future research directions include the incorporation of time series measurements with temporal dependencies to improve the estimation and bad data performance. Additionally, it would be useful to investigate the joint estimation of the system state and the identification of topology changes. Another potential future direction is the combination of smart meter measurements and PMU data in the proposed method, as mentioned above in the limitations. Finally, a promising area of future research is the integration of deep unfolding or other machine learning and optimization tools to accelerate the regularized Gauss-Newton method. One particularly important area of future research is the development of new graph neural network (GNN) approaches that take advantage of the graph properties of the states that were validated in this study. GNN methods could potentially offer significant improvements in the accuracy and efficiency of state estimation in power systems, depending on the availability of relevant data.

Conclusions
This paper proposes a GSP framework for state estimation, sensor allocation, and bad data detection in unobservable power systems. First, the graph smoothness of the phases and the magnitudes of the voltages with respect to the admittance matrix have been validated. Then, the GSP-WLS estimator of the system states has been derived under both the DC-PF and AC-PF models. The GSP-WLS estimator uses the graph smoothness of the state signals as a regularization term, and thus, does not require full observability of the system. It is analytically shown that when the system is observable, the proposed GSP-WLS estimator coincides with the WLS estimator. A greedy algorithm has been introduced to tackle the problem of selecting the sampling set that optimizes the state estimation performance. In addition, it is shown that the GSP-WLS estimator is useful for bad data detection by plugging the estimator into existing detectors. Simulation results demonstrate the potential of the GSP methods in power systems for cases where the system observability is not guaranteed. It is shown that the proposed methods can accurately estimate voltage phases and magnitudes, and detect bad data under partial observability conditions where standard methods cannot, or exhibit poor performance. These results show that advanced sensing and measurement technologies using GSP tools can transform data into useful information and enhance various aspects of power system management.

Data Availability Statement:
The IEEE test case systems data in the simulations has been taken from the "Power Systems Test Case Archive", ref. [56], http://www.ee.washington.edu/research/pstca/ (accessed on 1 June 2019).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript: