A Bayesian Approach to Bad Data Identification in Power System State Estimation

D’Antona, Gabriele

doi:10.3390/electronics15081732

by

Gabriele D’Antona

Department of Energy, Politecnico di Milano, 20156 Milan, Italy

Electronics 2026, 15(8), 1732; https://doi.org/10.3390/electronics15081732

Submission received: 19 March 2026 / Revised: 13 April 2026 / Accepted: 15 April 2026 / Published: 19 April 2026

(This article belongs to the Special Issue Advanced Fault and Error Detection Techniques Using Machine Learning and Artificial Intelligence)

Abstract

This paper addresses the problem of robust identification of gross errors affecting both measurements and network parameters in power system state estimation. The study is conducted within a steady-state framework and focuses on improving bad data identification in the presence of modeling and measurement uncertainties, explicitly accounting for the limited observability of gross errors. Building on an Extended Weighted Least Squares (EWLS) estimator and a theoretically refined eigenvalue-based clustering of dominant error components, a novel Bayesian identification framework is introduced. The proposed Bayesian approach assigns probabilities to competing gross error models, including scenarios involving multiple simultaneous errors, given the observed clusters of dominant errors. This probabilistic formulation enables a systematic and quantitative decision-making process for identifying the most likely sources of gross errors, extending existing deterministic or heuristic approaches. The methodology is evaluated through numerical simulations on the IEEE-14 bus test system, considering several gross error scenarios and significant parameter uncertainties. The results demonstrate that the proposed Bayesian framework enhances the interpretability and discriminative capability of gross error identification, highlighting its potential for robust bad data identification in power system state estimation.

Keywords:

bad data identification; Bayesian model selection; eigenvalue-based error analysis; multiple gross error identification; power system static state estimation; extended weighted least squares; measurement and parameter uncertainties; probabilistic decision making

1. Introduction

State estimation (SE) under steady-state conditions plays a central role in the monitoring, control, and secure operation of electrical power systems. SE relies on the availability of heterogeneous data sets, including measurements, pseudo-measurements, network parameters, and topological information, which are related to the system state variables through a mathematical network model. Due to the large scale and complexity of modern power systems, portions of these data sets may be affected by gross errors arising from sensor malfunctions, communication failures, modeling inaccuracies, or parameter uncertainties. The presence of such errors can severely compromise the reliability of state estimation results and, consequently, system operation.

Classical studies on bad data processing in power system SE have predominantly focused on measurement errors, typically assuming perfect knowledge of network parameters. Within this context, the Weighted Least Squares (WLS) estimator has become the most widely adopted technique for steady-state SE [1]. Despite its optimality under ideal assumptions, WLS is well known to be sensitive to modeling inaccuracies and parameter uncertainty, which are increasingly relevant in practical applications [2].

To address these limitations, several robust estimation methods have been proposed to explicitly account for network parameter errors [3,4,5,6]. However, most of these approaches treat parameter errors as isolated gross errors, neglecting the intrinsic uncertainty associated with both measurements and parameters. This simplification limits their ability to consistently process heterogeneous data sources and to distinguish between uncertainty-driven variability and actual gross errors.

A significant step toward overcoming these limitations was introduced in [7] through the extended Weighted Least Squares (EWLS) formulation. EWLS provides a unified framework for state estimation that explicitly incorporates uncertainty in both measurements and network parameters, ensuring consistent treatment of data derived from physical models and measurement processes. Building on this formulation, refs. [2,8] proposed an eigenvalue-based analysis of the EWLS-estimated data error covariance matrix, enabling the identification of dominant error components and clusters of gross errors while accounting for the limited data redundancy inherent in power systems. Alternative approaches based on data-driven and interval-analysis techniques have also been proposed to enhance bad data identification capabilities under limited redundancy conditions [9].

In this paper, the EWLS formulation is further revisited and interpreted as the unconstrained solution of an equivalent constrained estimation problem, providing additional theoretical insight while preserving the original estimation framework. This reinterpretation enables a more rigorous and transparent formulation of key concepts such as local observability and the number of determinable principal errors, which are directly related to the dimension of the residual subspace analyzed in Section 3.

While the eigenvalue-based clustering approach of [2,8] allows for the localization of groups of suspect data, it does not, by itself, resolve the fundamental non-uniqueness in identifying the actual sources of gross errors within a cluster. Multiple combinations of single or simultaneous errors may lead to similar observable effects, especially under conditions of limited observability.

Despite extensive research on bad data detection and identification, probabilistic formulations explicitly addressing the selection among multiple competing gross error hypotheses remain relatively limited. Bayesian approaches have been widely adopted in statistical inference for model selection and uncertainty quantification, but their application to bad data identification in power system state estimation is still relatively unexplored.

The main objective of this paper is to address this limitation by introducing a novel Bayesian bad data identification framework that extends the EWLS-based clustering methodology presented in [8]. The proposed approach formulates alternative bad data models based on hypotheses of single or multiple simultaneous gross errors. It then assigns posterior probabilities to each model using the observed error cluster. This probabilistic formulation enables a systematic and quantitative decision-making process for identifying the most likely sources of gross errors, moving beyond deterministic or heuristic selection criteria. The performance of the proposed EWLS-based Bayesian identification methodology is assessed through Monte Carlo simulations on the IEEE-14 bus test system, considering scenarios characterized by significant measurement and parameter uncertainties. The results demonstrate the effectiveness of the proposed approach in discriminating among competing gross error models and highlight its potential for robust bad data identification in power system state estimation.

In parallel with model-based approaches, recent works have explored machine learning techniques for anomaly detection in power systems, aiming to enhance the identification of inconsistent measurements under complex and uncertain operating conditions [10,11]. A recent overview of anomaly detection approaches in power system state estimation is provided in [12].

While these methods provide increased flexibility in handling large and heterogeneous data sets, they generally do not explicitly model alternative gross error configurations or quantify their associated probabilities. In this context, Bayesian approaches offer a complementary perspective, enabling a structured probabilistic assessment of competing bad data hypotheses and potentially supporting future integration with data-driven techniques.

The remainder of the paper is organized as follows. Section 2 summarizes the Extended Weighted Least Squares (EWLS) estimation framework. Section 3 introduces the concept of data principal errors. Section 4 revisits classical bad data detection methods within the context of principal error analysis. Section 5 reviews the formulation of bad data identification based on principal error clustering and discusses the inherent non-uniqueness of the associated solutions, leading to the definition of gross error clusters. Section 6 presents the proposed Bayesian formulation of gross error models and the computation of their posterior probabilities. Section 7 describes the test system used for the numerical assessment, while Section 8 presents the Monte Carlo simulation framework and discusses the results obtained from the selected case studies.

2. EWLS State Estimation Review

In [7], the Extended Weighted Least Squares (EWLS) method for static power system state estimation is introduced. The main difference between the classical Weighted Least Squares (WLS) estimator and EWLS is that the latter explicitly models uncertainty affecting both the measured data and the network parameters. Although a similar estimate could, in principle, be obtained within a standard WLS framework by redefining the measurement error covariance so as to account for network parameter uncertainty, such a reformulation does not allow gross errors affecting network parameters to be distinguished from those affecting the measured data. Moreover, since in practical power systems the uncertainty associated with network parameters is often significantly larger than that affecting measurements, this approach may reduce the estimator sensitivity to outliers in the measured data.

The key idea of EWLS is to treat both the measured (and pseudo-measured) quantities and the network parameters as data. Let the vector

y \in R^{M_{y} \times 1}

collect all measurements and pseudo-measurements (e.g., voltages, currents, power injections, and power flows), and let

π_{n} \in R^{M_{π} \times 1}

denote the vector of nominal network parameter values. The overall data vector is therefore defined as

d = [\begin{matrix} y \\ π_{n} \end{matrix}],

(1)

with size

M = M_{y} + M_{π}

.

Let

π

be the vector of true (unknown) network parameters and

Δ π

the corresponding parameter errors. Then,

π_{n} = π + Δ π .

(2)

Similarly, let

x

denote the vector of system state variables and let

h (x; π)

be the (generally nonlinear) measurement function describing the relationship among the system state, the network parameters, and the measured quantities. The measurement model is given by

y = h (x; π) + Δ y,

(3)

where

Δ y

represents the measurement errors. The error vectors

Δ π

and

Δ y

are assumed to be zero-mean random vectors.

Combining (1)–(3), the overall data model can be written as

d = f (x; π) + Δ d,

(4)

where

f (x; π) = [\begin{matrix} h (x; π) \\ π \end{matrix}],

(5)

and

Δ d = [\begin{matrix} Δ y \\ Δ π \end{matrix}] .

(6)

The EWLS state estimation problem proposed in [7] can be formulated as the following unconstrained nonlinear least-squares problem:

[\begin{matrix} \tilde{x} \\ \tilde{π} \end{matrix}] = \underset{\{x, π\}}{arg min} {[d - f (x; π)]}^{T} R_{Δ d}^{- 1} [d - f (x; π)],

(7)

where

R_{Δ d} \in R^{M \times M}

denotes the covariance matrix of the data error vector

Δ d

.

Defining the optimal data residual as

Δ \tilde{d} = d - f (\tilde{x}; \tilde{π}),

(8)

the unconstrained problem (7) can be equivalently rewritten in constrained EWLS form as

\begin{matrix} [\begin{matrix} \tilde{x} \\ Δ \tilde{d} \end{matrix}] & = \underset{\{x, Δ d\}}{arg min} Δ d^{T} R_{Δ d}^{- 1} Δ d \end{matrix}

(9a)

\begin{matrix} subject to Δ d & = d - f (x; π) . \end{matrix}

(9b)

Since, for any given pair

(x, π)

, the constraint (9b) uniquely specifies

Δ d = d - f (x; π)

, eliminating

Δ d

from (9a) yields the same cost function as in (7), thus proving the equivalence of the two formulations.

When all network parameters are assumed to be perfectly known (i.e.,

M_{π} = 0

), the EWLS estimator reduces to the classical WLS estimator.
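As an illustration of problem (7), the following minimal sketch applies Gauss-Newton iterations to a toy model with one state variable and one network parameter. The measurement function, the numerical values, and the names `f`, `jacobian`, and `ewls_gauss_newton` are hypothetical choices made for this example only; the solver adopted in [7] is not reproduced here.

```python
import numpy as np

def f(theta):
    """Toy data model f(x; pi) = [h(x; pi); pi] with h = [x, pi*x, x] (assumed)."""
    x, pi = theta
    return np.array([x, pi * x, x, pi])

def jacobian(theta):
    """Jacobian of f with respect to the unknowns (x, pi)."""
    x, pi = theta
    return np.array([[1.0, 0.0],
                     [pi,  x],
                     [1.0, 0.0],
                     [0.0, 1.0]])

def ewls_gauss_newton(d, R, theta0, iters=20, tol=1e-12):
    """Minimize [d - f]^T R^{-1} [d - f] over (x, pi) by Gauss-Newton steps."""
    theta = np.asarray(theta0, dtype=float)
    R_inv = np.linalg.inv(R)
    for _ in range(iters):
        J = jacobian(theta)
        r = d - f(theta)
        # Normal equations of the linearized weighted least-squares problem.
        step = np.linalg.solve(J.T @ R_inv @ J, J.T @ R_inv @ r)
        theta = theta + step
        if np.linalg.norm(step) < tol:
            break
    return theta, d - f(theta)

# Noise-free data generated with x = 2 and pi = 0.5 (illustrative values).
d = np.array([2.0, 1.0, 2.0, 0.5])          # d = [y; pi_n]
R = np.diag([1e-4, 1e-4, 1e-4, 2.5e-3])     # the parameter datum is less certain
theta_hat, residual = ewls_gauss_newton(d, R, theta0=[1.5, 0.5])
```

Because the data in this example are consistent with the model, the iterations recover the generating values and the residual vanishes; with noisy or corrupted data the residual returned by the function plays the role of the optimal data residual in (8).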

In addition to the estimates of the system state and of the data error vector, the EWLS formulation also provides the covariance matrix of the estimation errors, namely

P = [\begin{matrix} P_{\tilde{x}} & P_{\tilde{x}, Δ \tilde{d}} \\ P_{Δ \tilde{d}, \tilde{x}} & P_{Δ \tilde{d}} \end{matrix}] .

(10)

This is illustrated in Figure 1, which summarizes the estimator inputs and outputs.

The covariance matrices of the estimation errors associated with the system states and the data errors are given in [7] as

\begin{matrix} P_{\tilde{x}} & = {(\frac{\partial h^{T}}{\partial x} Q^{- 1} \frac{\partial h}{\partial x^{T}})}^{- 1}, \end{matrix}

(11a)

\begin{matrix} P_{Δ \tilde{d}} & = R_{Δ d} A^{T} (Q^{- 1} - W) A R_{Δ d}, \end{matrix}

(11b)

with

A = [\begin{matrix} I_{M_{y}} & - \frac{\partial h}{\partial π^{T}} \end{matrix}], Q = A R_{Δ d} A^{T}, W = Q^{- 1} \frac{\partial h}{\partial x^{T}} P_{\tilde{x}} \frac{\partial h^{T}}{\partial x} Q^{- 1} .
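The expressions in (11) can be evaluated directly once the Jacobians of the measurement function are available. The sketch below is a hedged illustration: the function name `ewls_covariances`, the argument names, and the numerical values are assumptions, and the data error covariance is taken to be block diagonal (uncorrelated measurement and parameter errors).

```python
import numpy as np

def ewls_covariances(H_x, H_pi, R_y, R_pi):
    """Evaluate (11a)-(11b) given the Jacobians of h and the error covariances.

    H_x  : dh/dx^T,  shape (M_y, N)
    H_pi : dh/dpi^T, shape (M_y, M_pi)
    """
    M_y, M_pi = H_pi.shape
    # Block-diagonal data error covariance (assumed uncorrelated errors).
    R_d = np.block([[R_y, np.zeros((M_y, M_pi))],
                    [np.zeros((M_pi, M_y)), R_pi]])
    A = np.hstack([np.eye(M_y), -H_pi])
    Q = A @ R_d @ A.T
    Q_inv = np.linalg.inv(Q)
    P_x = np.linalg.inv(H_x.T @ Q_inv @ H_x)
    W = Q_inv @ H_x @ P_x @ H_x.T @ Q_inv
    P_dd = R_d @ A.T @ (Q_inv - W) @ A @ R_d
    return P_x, P_dd

# Toy linearization with N = 1, M_y = 3, M_pi = 1 (illustrative numbers).
H_x = np.array([[1.0], [0.5], [1.0]])
H_pi = np.array([[0.0], [2.0], [0.0]])
R_y = 1e-4 * np.eye(3)
R_pi = np.array([[2.5e-3]])
P_x, P_dd = ewls_covariances(H_x, H_pi, R_y, R_pi)
```

Explicit matrix inverses are used here only to mirror the notation of (11); for larger systems a factorization-based solve would normally be preferred.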

Since the constrained EWLS problem (9) is equivalent to the unconstrained formulation (7), it can be interpreted as a nonlinear system of

M = M_{y} + M_{π}

(12)

equations in the

L = N + M_{π}

(13)

unknowns

\tilde{x}

and

\tilde{π}

, where N denotes the number of state variables.

The system becomes underdetermined when the number of unknowns exceeds the number of equations, i.e.,

L > M

, which reduces to the condition

N > M_{y}

.

Therefore, a necessary condition for the existence of a locally unique EWLS solution is

M_{y} \geq N,

(14)

meaning that the number of measured (and pseudo-measured) quantities must be at least equal to the number of state variables.

However, the condition

M_{y} \geq N

is only necessary, but not sufficient, for the local solvability of the EWLS estimation problem. A stronger necessary condition is obtained by examining the rank of the Jacobian matrix of the model

f (x; π)

with respect to the unknown variables. In particular, denoting by

ρ

the rank of the Jacobian

J = [\begin{matrix} \frac{\partial f}{\partial x^{T}} & \frac{\partial f}{\partial π^{T}} \end{matrix}]

(15)

evaluated at the solution point, local identifiability of the EWLS estimator requires

ρ \geq N + M_{π} .

(16)

Using the model definition in (5), the Jacobian submatrix with respect to the network parameters can be written explicitly as

\frac{\partial f}{\partial π^{T}} = [\begin{matrix} \frac{\partial h}{\partial π^{T}} \\ I_{M_{π}} \end{matrix}]

(17)

Due to the presence of the identity matrix

I_{M_{π}}

, the submatrix

\partial f / \partial π^{T}

has full column rank, that is

ρ_{1} = rank (\frac{\partial f}{\partial π^{T}}) = rank (I_{M_{π}}) = M_{π}

(18)

As a consequence, the network parameters are locally identifiable by construction, which is consistent with the relation (2), where nominal parameter values are treated as direct pseudo-measurements.

State observability, on the other hand, depends exclusively on the sensitivity of the measurement function with respect to the state variables. Using again the model definition in (5), the Jacobian submatrix with respect to the state variables can be written as

\frac{\partial f}{\partial x^{T}} = [\begin{matrix} \frac{\partial h}{\partial x^{T}} \\ 0_{M_{π} \times N} \end{matrix}]

(19)

Local observability of the system state therefore requires that the Jacobian of the measurement function with respect to

x

has full column rank, i.e.,

ρ_{2} = rank (\frac{\partial f}{\partial x^{T}}) = rank (\frac{\partial h}{\partial x^{T}}) = N

(20)

In general, the rank of the full Jacobian matrix satisfies

rank (J) \leq ρ_{1} + ρ_{2} .

(21)

However, in the EWLS formulation, rank additivity holds provided that the system is locally observable. Indeed, due to the block structure of the Jacobian, the columns associated with the network parameters contain an identity submatrix, whereas the columns associated with the state variables have zero entries in the same rows. As a consequence, no nontrivial linear combination of state-related columns can reproduce parameter-related columns, and vice versa. Therefore, if

rank (\frac{\partial h}{\partial x^{T}}) = N,

(22)

the Jacobian has full column rank

rank (J) = N + M_{π} .

(23)

In the following analysis, it is assumed that condition (22) holds and that the power system is therefore locally observable [13]. This assumption guarantees that both the system state and the network parameters are locally identifiable within the EWLS framework and constitutes a necessary condition for the existence of a locally unique EWLS estimate of the system states and network parameters.
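The rank condition (23) can be checked numerically at a given linearization point. The following sketch assembles the block Jacobian from the two Jacobians of the measurement function and compares its rank with the number of unknowns; the function names and the example matrices are hypothetical.

```python
import numpy as np

def ewls_jacobian(H_x, H_pi):
    """Assemble J = [df/dx^T, df/dpi^T] for f = [h; pi].

    The sign convention of the blocks does not affect the rank.
    """
    N = H_x.shape[1]
    M_pi = H_pi.shape[1]
    top = np.hstack([H_x, H_pi])
    bottom = np.hstack([np.zeros((M_pi, N)), np.eye(M_pi)])
    return np.vstack([top, bottom])

def is_locally_identifiable(H_x, H_pi):
    """True when rank(J) equals the number of unknowns N + M_pi."""
    J = ewls_jacobian(H_x, H_pi)
    return bool(np.linalg.matrix_rank(J) == J.shape[1])

# Observable case: dh/dx^T has full column rank (illustrative values).
H_x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H_pi = np.array([[0.5], [0.0], [1.0]])
observable = is_locally_identifiable(H_x, H_pi)

# Unobservable case: the two state columns are proportional.
H_x_bad = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
unobservable = is_locally_identifiable(H_x_bad, H_pi)
```

In the second case the rank deficiency comes entirely from the state-related columns, which is consistent with the observation that the parameter-related columns always have full column rank.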

3. Principal Errors

As discussed in the previous section, observability ensures the local identifiability of the system states and network parameters. Data redundancy, on the other hand, concerns the amount of independent information available for assessing data consistency.

Under the linearized EWLS model, the estimation process removes from the data error all components that can be explained by variations of the estimated variables through the model. The residual vector

Δ \tilde{d}

therefore contains only the portion of the data error that cannot be reproduced by the model.

This property directly follows from the optimality conditions of the unconstrained EWLS problem (7). In particular, the solution satisfies

J^{T} R_{Δ d}^{- 1} Δ \tilde{d} = 0,

(24)

which states that the weighted inner product between the residual and any direction spanned by the Jacobian is zero. This result corresponds to the normal equations of the weighted least squares problem and has a clear geometric interpretation in terms of orthogonal projection [14,15].

As a consequence, the residual is orthogonal (in the weighted sense induced by the covariance matrix) to the column space of the Jacobian matrix

J

with respect to the unknown variables. Equivalently,

Δ \tilde{d}

lies in the orthogonal complement of the Jacobian column space, often referred to as the left null space of

J

[16,17].

Since the residual can vary only along directions that are not represented by the model, the rank of its covariance matrix equals the dimension of the subspace in which the residual can vary. In other words, the residual retains only those stochastic directions of the data error that are not explained by the model. The number of such directions is therefore given by the difference between the total number of independent stochastic directions in the data error and the number of independent directions explained by the model. This yields the general identity

rank (P_{Δ \tilde{d}}) = rank (R_{Δ d}) - rank (J) .

(25)

In the EWLS formulation, the data error vector is defined as

Δ d = {[Δ y^{T} Δ π^{T}]}^{T} .

Assuming that measurement and parameter errors are statistically uncorrelated, the corresponding data error covariance matrix has a block-diagonal structure,

R_{Δ d} = [\begin{matrix} R_{Δ y} & 0 \\ 0 & R_{Δ π} \end{matrix}] .

(26)

For block-diagonal matrices, the rank equals the sum of the ranks of the diagonal blocks. Moreover, assuming

R_{Δ π}

to be full rank, one obtains

rank (R_{Δ d}) = ν + M_{π},

(27)

where

ν = rank (R_{Δ y}) \leq M_{y}

(28)

denotes the number of statistically independent stochastic directions in the measurement error vector

Δ y

. The quantity

ν

characterizes the intrinsic stochastic dimension of the measured data prior to state estimation. In the generic case,

R_{Δ y}

is full rank and

ν = M_{y}

, whereas perfectly correlated measurement errors or hard constraints result in

ν < M_{y}

.

Under the local observability assumption established in the previous section, the Jacobian matrix has full column rank,

rank (J) = N + M_{π} .

Substituting the above expressions into the rank identity yields

rank (P_{Δ \tilde{d}}) = ν - N .

(29)

The quantity

ν - N

represents the residual subspace dimension, i.e., the number of statistically independent components of the residual vector after state estimation. Under Gaussian assumptions, this value coincides with the degrees of freedom of the weighted residual norm employed in

χ^{2}

-based consistency tests.
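As a purely illustrative count (the figures are hypothetical and do not refer to the IEEE-14 test system), consider $M_{y} = 10$ measurements with full-rank $R_{Δ y}$, $N = 6$ state variables, and $M_{π} = 3$ network parameters. The rank identity then gives

rank (R_{Δ d}) = 10 + 3 = 13, rank (J) = 6 + 3 = 9, rank (P_{Δ \tilde{d}}) = 13 - 9 = 4 = ν - N,

so that, in this example, only four independent components can be extracted from a residual vector with 13 entries.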

Since the residual subspace dimension

ν - N

is smaller than the dimension of the residual vector

Δ \tilde{d}

, the latter cannot be uniquely determined in all its components. Consequently, the residual vector can be expressed as a linear combination of only

ν - N

independent quantities

\tilde{ξ}

[17],

Δ \tilde{d} = T \tilde{ξ},

(30)

where

T \in R^{M \times (ν - N)}

is a (non-unique) transformation matrix.

By interpreting

\tilde{ξ}

as a vector of

ν - N

mutually uncorrelated random variables with unit variance, referred to as the principal errors of the EWLS residual, the transformation (30) can be constructed via the eigenvalue decomposition of the residual covariance matrix [15,18],

P_{Δ \tilde{d}} = U Λ U^{T},

(31)

where the columns of

U

are the eigenvectors of

P_{Δ \tilde{d}}

, and

Λ

is a diagonal matrix containing the (non-negative) eigenvalues of

P_{Δ \tilde{d}}

. The matrices

U

and

Λ

in (31) are square and have size

M \times M

.

Since

P_{Δ \tilde{d}}

is rank-deficient, only

ν - N

eigenvalues are non-zero. Accordingly, the eigenvalue matrix in (31) can be written as

Λ = [\begin{matrix} Λ_{ν - N} & 0 \\ 0 & 0 \end{matrix}],

(32)

where

Λ_{ν - N}

is the diagonal matrix containing the non-zero eigenvalues of

P_{Δ \tilde{d}}

.

Letting

U_{ν - N}

denote the matrix collecting the eigenvectors associated with the non-zero eigenvalues, the transformation matrix can be written as

T = U_{ν - N} Λ_{ν - N}^{1 / 2} .

(33)

It is worth noting that the transformation

T

differs from a whitening transformation, since it is generally rectangular and performs both decorrelation and dimensionality reduction. In particular,

T

projects the residual vector onto the subspace spanned by the eigenvectors associated with the non-zero eigenvalues of the covariance matrix.

The Moore–Penrose pseudo-inverse of the transformation matrix

T

[19] is given by

T^{†} = Λ_{ν - N}^{- 1 / 2} U_{ν - N}^{T} = {[\begin{matrix} \frac{u_{1}^{T}}{\sqrt{λ_{1}}} & \dots & \frac{u_{ν - N}^{T}}{\sqrt{λ_{ν - N}}} \end{matrix}]}^{T} .

(34)

This pseudo-inverse provides the inverse mapping from the residual vector to the principal error space, namely

\tilde{ξ} = T^{†} Δ \tilde{d}, {\tilde{ξ}}_{i} = \frac{u_{i}^{T}}{\sqrt{λ_{i}}} Δ \tilde{d},

(35)

which allows the estimation of the

ν - N

principal errors associated with the EWLS residual.

The principal error decomposition (35) provides a minimal and statistically independent representation of the residual vector and constitutes the basis for the subsequent bad data detection and identification procedures.
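The construction in (31)-(35) can be sketched numerically as follows. The function name `principal_error_transform`, the relative tolerance used to decide which eigenvalues count as non-zero, and the synthetic covariance matrix are assumptions introduced for illustration.

```python
import numpy as np

def principal_error_transform(P, rel_tol=1e-10):
    """Return T and its pseudo-inverse from the eigen-decomposition of P."""
    lam, U = np.linalg.eigh(P)                # ascending eigenvalues, orthonormal columns
    keep = lam > rel_tol * lam.max()          # retain numerically non-zero eigenvalues
    lam_r, U_r = lam[keep], U[:, keep]
    T = U_r * np.sqrt(lam_r)                  # T = U_r Lambda_r^{1/2}, as in (33)
    T_pinv = (U_r / np.sqrt(lam_r)).T         # T^dagger = Lambda_r^{-1/2} U_r^T, as in (34)
    return T, T_pinv

# Synthetic rank-2 covariance of size 5 x 5 (illustrative data only).
rng = np.random.default_rng(0)
B = rng.standard_normal((5, 2))
P = B @ B.T
T, T_pinv = principal_error_transform(P)

residual = B @ np.array([1.0, -2.0])          # a residual lying in the range of P
xi = T_pinv @ residual                        # principal errors, as in (35)
```

The tolerance matters in practice: eigenvalues that are zero in exact arithmetic are returned as small positive or negative numbers, so a relative threshold is needed to recover the residual subspace dimension.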

The results derived in this section can be extended to the more general case in which measurement and parameter errors are statistically correlated. In this situation, the data error covariance matrix is no longer block diagonal and the rank additivity property does not hold in general. Nevertheless, the residual subspace dimension remains given by the general identity

rank (P_{Δ \tilde{d}}) = rank (R_{Δ d}) - rank (J),

(36)

and can therefore be computed directly from the joint error covariance structure. The simplified expression

ν - N

is recovered as a special case under the assumption of uncorrelated measurement and parameter errors.

4. Bad Data Detection

In this work, a bad datum (or outlier) is defined as a datum, either a measurement, a pseudo-measurement, or a network parameter, affected by a random error whose statistical properties deviate from the nominal error model. Such an error, commonly referred to as a gross error, is not necessarily zero-mean and typically exhibits a magnitude significantly larger than the standard deviation associated with the datum under normal operating conditions. In practice, a datum is considered bad when the root mean square value of the gross error exceeds a prescribed multiple of the nominal standard deviation. The specific multiplicative factor depends on the physical origin and unpredictability of the error source.

Bad data detection is formulated as a classical statistical hypothesis testing problem based on the estimated data errors [20]. Within the EWLS framework, the residual vector is transformed into the principal errors defined in (35), which provide a statistically independent and normalized representation of the residual.

Under nominal operating conditions, the principal errors can be reasonably approximated as independent standard normal random variables. This approximation follows from the central limit theorem [21], since each principal error results from a linear combination of multiple contributions from both measurement and parameter errors.

Consequently, the quadratic form

\tilde{j} = {\tilde{ξ}}^{T} \tilde{ξ}

(37)

is approximately chi-square distributed with

ν - N

degrees of freedom under the null hypothesis [8], where

ν - N

corresponds to the residual subspace dimension. The chi-square distribution of

\tilde{j}

holds under the assumptions of correct model specification and absence of gross errors.

The detection problem is therefore formulated in terms of the following statistical hypotheses:

$H_{0}$ (null hypothesis): no bad data are present;
$H_{1}$ (alternative hypothesis): at least one bad datum is present.

For a prescribed significance level (false alarm probability)

α

, the test statistic

\tilde{j}

is compared with the critical value

T_{α} = Q_{ν - N}^{- 1} (1 - α),

(38)

where

Q_{ν - N}

denotes the cumulative distribution function of the chi-square distribution with

ν - N

degrees of freedom. At significance level

α

, the decision rule is given by [22]

\{\begin{matrix} H_{0} is rejected & if \tilde{j} > T_{α}, \\ H_{0} is accepted & if \tilde{j} \leq T_{α} . \end{matrix}

(39)
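A minimal sketch of the detection rule (39) is given below. To keep the example free of external dependencies, the chi-square quantile in (38) is replaced by the Wilson-Hilferty approximation, which is an assumption of this sketch and not part of the method; an exact quantile can be obtained, for instance, from `scipy.stats.chi2.ppf`. The function names are hypothetical.

```python
from statistics import NormalDist

def chi2_critical_value(dof, alpha):
    """Approximate chi-square quantile at level 1 - alpha (Wilson-Hilferty)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

def detect_bad_data(xi, alpha=0.05):
    """Apply rule (39): reject H0 when the sum of squared principal errors exceeds T_alpha."""
    j_stat = sum(v * v for v in xi)
    threshold = chi2_critical_value(len(xi), alpha)
    return j_stat > threshold, j_stat, threshold

# Four principal errors under nominal conditions, then with one large component.
nominal = detect_bad_data([0.5, -1.0, 0.3, 1.2])
corrupted = detect_bad_data([0.5, -1.0, 6.0, 1.2])
```

The number of degrees of freedom is taken as the number of principal errors passed to the function, which corresponds to the residual subspace dimension.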

5. Bad Data Cluster Identification

Since the residual vector

Δ \tilde{d}

is not directly observable, because its dimension exceeds the residual subspace dimension

ν - N

, outlier identification cannot be performed directly in the data space. Instead, the identification of bad data is carried out in the space of the observable principal errors [2,8].

Under nominal operating conditions (i.e., in the absence of gross errors), each principal error is approximately normally distributed with zero mean and unit variance. Accordingly, the identification of outliers is formulated as a two-sided statistical hypothesis test applied independently to each principal error, for a prescribed significance level

α

.

For the i-th principal error, the following hypotheses are considered:

$h_{0}^{(i)}$ (null hypothesis): the i-th principal error is not affected by bad data;
$h_{1}^{(i)}$ (alternative hypothesis): the i-th principal error is affected by bad data.

A decision regarding the presence of bad data in the i-th principal error is made by comparing the test statistic

{\tilde{ξ}}_{i}

with the critical value corresponding to the significance level

α

[22],

t_{α} = Φ^{- 1} (1 - \frac{α}{2}) = \sqrt{2} {erf}^{- 1} (1 - α),

(40)

where

Φ (\cdot)

denotes the cumulative distribution function of the standard normal distribution and

erf (\cdot)

is the error function. At significance level

α

, the decision rule is

\{\begin{matrix} h_{0}^{(i)} is rejected & if | {\tilde{ξ}}_{i} | > t_{α}, \\ h_{0}^{(i)} is accepted & if | {\tilde{ξ}}_{i} | \leq t_{α} . \end{matrix}

(41)
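The component-wise test (40)-(41) can be sketched with the Python standard library alone. The function names and the sample principal errors are hypothetical; indices are zero-based.

```python
import math
from statistics import NormalDist

def principal_error_threshold(alpha):
    """Two-sided critical value t_alpha = Phi^{-1}(1 - alpha/2) from (40)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def flag_principal_errors(xi, alpha=0.05):
    """Indices i for which the null hypothesis is rejected according to (41)."""
    t_alpha = principal_error_threshold(alpha)
    return [i for i, v in enumerate(xi) if abs(v) > t_alpha]

t = principal_error_threshold(0.05)
# Consistency check of the erf form in (40): erf(t / sqrt(2)) should equal 1 - alpha.
erf_check = math.erf(t / math.sqrt(2.0))
flagged = flag_principal_errors([0.3, -2.5, 1.0, 4.1])
```

Since the test is applied independently to each principal error, the overall false alarm probability grows with the number of components tested; whether a correction is appropriate is a design choice not addressed in this sketch.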

Once the principal errors affected by bad data have been identified, their overall impact is quantified by computing the sum of the squares of the identified principal errors, as proposed in [8]. Using (35), this quantity can be expressed as

{\tilde{j}}_{Id} = \underbrace{{\tilde{ξ}}^{T} \tilde{ξ}}_{\text{restricted to identified principal errors}} = Δ {\tilde{d}}^{T} W Δ \tilde{d},

(42)

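where the weighting matrix

W

(not to be confused with the matrix W appearing in (11)) is obtained from (34) and (35) as

W = {\hat{U}}_{ν - N} {\hat{Λ}}_{ν - N}^{- 1} {\hat{U}}_{ν - N}^{T} .

(43)

Here,

{\hat{U}}_{ν - N}

is formed by selecting the columns of

U_{ν - N}

(equivalently, the rows of

T^{†}

) corresponding to the principal errors identified as affected by bad data, and

{\hat{Λ}}_{ν - N}

is the diagonal matrix of the associated eigenvalues.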

The contribution of each individual datum to the statistic

{\tilde{j}}_{Id}

is assessed by computing its sensitivity, defined as the gradient of

{\tilde{j}}_{Id}

with respect to the estimated data errors,

w^{T} = \frac{d {\tilde{j}}_{Id}}{d Δ {\tilde{d}}^{T}} = 2 Δ {\tilde{d}}^{T} W .

(44)

The j-th component of the vector

w

represents the sensitivity of

{\tilde{j}}_{Id}

to a gross error affecting the j-th datum. Values of

w (j)

that are zero or negligibly small indicate that the corresponding datum has little influence on the identified principal errors and is therefore unlikely to be an outlier.
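The statistic (42) and the sensitivity vector (44) can be sketched as follows, under the assumption that the restriction to the identified principal errors is implemented by keeping only the corresponding eigenvectors and eigenvalues. The function name, argument names, and numerical values are hypothetical.

```python
import numpy as np

def cluster_sensitivity(residual, U_r, lam_r, identified):
    """Restricted statistic j_Id and its gradient w with respect to the residual.

    U_r, lam_r : eigenvectors (columns) and non-zero eigenvalues of the residual covariance
    identified : zero-based indices of the principal errors flagged as affected by bad data
    """
    U_hat = U_r[:, identified]
    W = (U_hat / lam_r[identified]) @ U_hat.T     # sum over identified i of u_i u_i^T / lambda_i
    j_id = residual @ W @ residual
    w = 2.0 * W @ residual
    return j_id, w

# Illustrative case: three principal directions aligned with the first three data.
U_r = np.eye(4)[:, :3]
lam_r = np.array([4.0, 1.0, 0.25])
residual = np.array([2.0, 0.5, 1.0, 0.0])
j_id, w = cluster_sensitivity(residual, U_r, lam_r, identified=[0, 2])
```

In this example the second and fourth data receive zero sensitivity, so they would not be retained as candidates, while the third datum dominates the identified cluster.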

From the definition of the sensitivity vector in (44), and under the statistical model assumed for the data errors, the vector

w

has zero mean and covariance matrix

4 W P_{Δ \tilde{d}} W^{T}

. This property provides a statistical basis for testing whether the contribution of individual data to the identified principal-error cluster is statistically significant.
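For completeness, the stated moments follow in one line from (44). The short derivation below assumes that the residual is zero-mean with covariance $P_{Δ \tilde{d}}$, that $W$ is symmetric, and that $W$ is treated as a fixed matrix once the principal errors have been identified:

w = 2 W Δ \tilde{d}, E [w] = 2 W E [Δ \tilde{d}] = 0, Cov (w) = E [w w^{T}] = 4 W E [Δ \tilde{d} Δ {\tilde{d}}^{T}] W^{T} = 4 W P_{Δ \tilde{d}} W^{T} .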

This sensitivity-based analysis enables the identification of clusters of bad data by accounting for their combined effect on the principal error space.

6. Bayesian Bad Data Identification

The sensitivity-based cluster identification procedure described in the previous section provides a deterministic indication of which data are most influential in producing the observed principal errors. In particular, the magnitudes of the components of the Jacobian sensitivity vector

w^{T}

quantify how strongly individual data contribute to the statistic

{\tilde{j}}_{Id}

and, consequently, to the identified principal error cluster.

While this information is invaluable for screening and ranking potential sources of bad data, it does not, by itself, provide a probabilistic assessment of which specific subset of data is responsible for the observed principal errors [8]. In other words, the sensitivity analysis indicates where gross errors may be located, but it does not provide a probabilistic ranking of alternative explanations.

To bridge this gap, a Bayesian framework for bad data identification is introduced in this section. The proposed approach assigns probabilities to alternative hypotheses describing single or multiple gross error configurations that are consistent with the detected and identified principal errors. This probabilistic formulation enables principled decision-making for instrument diagnostics and data maintenance in complex infrastructures.

Crucially, the sensitivity information encoded in

w

is used to constrain the hypothesis space: only gross error configurations involving data that significantly influence the identified principal error cluster are considered. This strategy drastically reduces the number of admissible bad data combinations while ensuring that all practically relevant scenarios are retained.

6.1. Gross Error Models

A gross error model

H_{k}

is formally defined by specifying which data are affected by gross errors. Specifically,

H_{k}

is defined by a binary activation vector

z_{k} = {[\begin{matrix} z_{k} (1) & \dots & z_{k} (M) \end{matrix}]}^{T},

(45)

where each component indicates the presence or absence of a gross error in the corresponding datum:

z_{k} (j) = \{\begin{matrix} 1 & if datum j is affected by a gross error under model H_{k}, \\ 0 & otherwise . \end{matrix}

(46)

By construction,

z_{k}

is a binary vector of length M, i.e.,

z_{k} \in \{0, 1\}^{M}

. In practice, specific patterns of nonzero entries in

z_{k}

naturally correspond to distinct physical fault sources, such as malfunctioning instruments, erroneous line parameters, or failures of shared components (e.g., voltage transformers) that simultaneously affect multiple data.

The total number of gross errors associated with model

H_{k}

is therefore given by

G_{k} = \sum_{j = 1}^{M} {(z_{k} (j))}^{2} = z_{k}^{T} z_{k},

(47)

since

z_{k} (j) \in \{0, 1\}

.

Six examples of gross error models, numbered from 0 to 5, are listed below:

$H_{0}$ : Nominal operating conditions, with no data affected by gross errors,

$z_{0}(j) = 0 \;\; \forall j, \qquad G_{0} = z_{0}^{T} z_{0} = 0 .$
$H_{1}$ : All data are affected by gross errors,

$z_{1}(j) = 1 \;\; \forall j, \qquad G_{1} = z_{1}^{T} z_{1} = M .$
$H_{2}$ : A single gross error affects the measurement of the active power flow between buses a and b. Denoting by $j'$ the global index of the corresponding datum (i.e., $y(j') \leftrightarrow P_{a, b}$),

$\begin{cases} z_{2}(j') = 1, \\ z_{2}(j) = 0 & \forall j \neq j', \end{cases} \qquad G_{2} = z_{2}^{T} z_{2} = 1 .$
$H_{3}$ : A single gross error affects the datum corresponding to the line resistance parameter of the line connecting buses m and n. Denoting by $j'$ the global index of this datum (i.e., $y(j') \leftrightarrow r_{m, n}$),

$\begin{cases} z_{3}(j') = 1, \\ z_{3}(j) = 0 & \forall j \neq j', \end{cases} \qquad G_{3} = z_{3}^{T} z_{3} = 1 .$
$H_{4}$ : Two gross errors affect the measurement of the reactive power flow between buses c and d and the voltage magnitude measurement at bus m. Denoting by $j'$ and $j''$ the global indices of the corresponding data (i.e., $y(j') \leftrightarrow Q_{c, d}$ and $y(j'') \leftrightarrow V_{m}$),

$\begin{cases} z_{4}(j') = 1, \\ z_{4}(j'') = 1, \\ z_{4}(j) = 0 & \forall j \notin \{j', j''\}, \end{cases} \qquad G_{4} = z_{4}^{T} z_{4} = 2 .$
$H_{5}$ : Gross errors affect all data associated with the voltage transformer (VT) connected to bus m, representing a common-cause fault. In this case,

$z_{5}(j) = \begin{cases} 1 & \text{if datum } y(j) \text{ is associated with the VT at bus } m, \\ 0 & \text{otherwise}, \end{cases} \qquad G_{5} = z_{5}^{T} z_{5} .$
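The example models above can be sketched in a few lines of Python. This is only an illustration: the toy size M = 8, the index positions chosen for $P_{a,b}$, $Q_{c,d}$, and $V_{m}$, and the set of data assumed to be fed by the VT at bus m are hypothetical and are not taken from the paper.

```python
import numpy as np

M = 8  # toy number of data (assumption, not from the paper)

def activation(indices, size=M):
    """Binary activation vector z_k with ones at the given data indices."""
    z = np.zeros(size, dtype=int)
    z[np.asarray(list(indices), dtype=int)] = 1
    return z

def gross_error_count(z):
    """G_k = z_k^T z_k, which equals the number of active entries."""
    return int(z @ z)

# Hypothetical index assignments, for illustration only.
IDX_P_AB, IDX_Q_CD, IDX_V_M = 2, 5, 0
VT_GROUP_M = [0, 1, 2, 3]  # data assumed to depend on the VT at bus m

z0 = activation([])                    # H0: no gross errors
z1 = activation(range(M))              # H1: all data affected
z2 = activation([IDX_P_AB])            # H2: single measurement
z4 = activation([IDX_Q_CD, IDX_V_M])   # H4: two measurements
z5 = activation(VT_GROUP_M)            # H5: common-cause VT fault

counts = [gross_error_count(z) for z in (z0, z1, z2, z4, z5)]
```

In this toy setting the count for $H_{5}$ is simply the size of the assumed VT group, which mirrors the fact that $G_{5}$ is left unspecified in the text.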

Each gross error model is assigned an a priori probability,

π^{-} (H_{k})

, reflecting prior knowledge about the reliability of the measurement chain components and the confidence in the network parameter database. The practical assignment of these probabilities is discussed later in the paper.

For a network comprising M data, the number of possible gross error models grows combinatorially. In particular, the number of distinct models involving exactly G gross errors, without repetition, is given by the binomial coefficient

\binom{M}{G} = \frac{M!}{(M - G)! \, G!},

(48)

where

G = 0, 1, \dots, M

. Consequently, the total number of admissible gross error models is

\sum_{G = 0}^{M} (\binom{M}{G}) = 2^{M} .

(49)

As an illustration, for a network with

M = 100

data, the total number of admissible models becomes

\sum_{G = 0}^{100} (\binom{100}{G}) = 2^{100},

(50)

which exceeds 1 × 10³⁰.
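The counts in (48)–(50) can be checked with exact integer arithmetic. The short sketch below uses Python's standard `math.comb`; the restriction to at most two gross errors in the last line is added purely as an illustration of how quickly the count shrinks once the number of simultaneous errors is bounded.

```python
from math import comb

def n_models_exact(M, G):
    """Number of models with exactly G gross errors among M data, as in (48)."""
    return comb(M, G)

def n_models_total(M):
    """Total number of admissible models, as in (49)."""
    return sum(comb(M, G) for G in range(M + 1))

total_100 = n_models_total(100)                    # equals 2**100
up_to_two = sum(comb(100, G) for G in range(3))    # G = 0, 1, 2 only
```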

Given the sheer size of this hypothesis space, exhaustive evaluation of all possible gross error configurations is computationally infeasible. More importantly, such an exhaustive analysis is unnecessary for two fundamental reasons:

1.: Gross errors typically originate from malfunctioning components in the measurement chain or from erroneous parameter values. The likelihood of multiple independent gross error sources occurring simultaneously decreases rapidly as their number increases.
2.: As established in the previous sections, only data contributing significantly to the identified principal error cluster can plausibly explain the observed principal errors. Gross errors affecting data outside this cluster, or associated with negligible sensitivity coefficients, can therefore be safely excluded from consideration.

By restricting the analysis to the most probable and physically meaningful gross error models, the hypothesis space can be drastically reduced, thereby avoiding the computational burden associated with evaluating an infeasibly large number of combinations.

6.2. Gross Error Model a Priori Probability

Gross errors are modeled as random variables that are statistically independent of the data errors. Unlike data errors

Δ d

, which are assumed to be zero-mean, gross errors are not necessarily zero-mean and typically exhibit variances that are significantly larger than those associated with measurement and parameter uncertainties.

Let

δ_{k}

denote the M-dimensional random vector of gross errors associated with the gross error model

H_{k}

. Conditioned on

H_{k}

, the total data error vector is given by the superposition of data errors and gross errors:

Δ d ∣ H_{k} = Δ d + δ_{k} .

(51)

Under the assumption that data errors and gross errors are mutually independent, the mean vector and covariance matrix of

Δ d ∣ H_{k}

are

\{\begin{matrix} η_{k} = η_{Δ d} + η_{δ_{k}} = η_{δ_{k}}, \\ Σ_{k} = R_{Δ d} + Σ_{δ_{k}}, \end{matrix}

(52)

where

η_{δ_{k}}

and

Σ_{δ_{k}}

denote the mean vector and covariance matrix associated with the gross errors specified by model

H_{k}

, respectively. The data errors are assumed to be zero-mean, i.e.,

η_{Δ d} = 0

.

The vector

η_{k}

therefore represents the expected bias in the data errors induced by the gross errors hypothesized in model

H_{k}

, while the covariance matrix

Σ_{k}

characterizes the combined uncertainty due to both data errors and gross errors. The structure of

η_{δ_{k}}

and

Σ_{δ_{k}}

is determined by the activation vector

z_{k}

, with nonzero mean components and increased variances only in the entries corresponding to data activated by the model.

Assuming that the conditional distribution of

Δ d ∣ H_{k}

is Gaussian, the prior probability density function of the data errors (including gross errors), given that the gross error model

H_{k}

holds, is

\begin{matrix} π^{-} (Δ d ∣ H_{k}) & = & \frac{1}{\sqrt{{(2 π)}^{M} | Σ_{k} |}} exp (- \frac{1}{2} {(Δ d - η_{k})}^{T} Σ_{k}^{- 1} (Δ d - η_{k})) . \end{matrix}

(53)

The Gaussian prior assumption in (53) is justified by the observation that, in many practical scenarios, only the first- and second-order statistical moments of both data errors and gross errors are available or can be reasonably specified. Under these conditions, the Gaussian distribution represents the maximum entropy choice, providing the least informative model consistent with the available information.
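As a minimal sketch, the moments in (52) and the logarithm of the density in (53) could be evaluated as follows. The function names are hypothetical, and working with the log-density (via `numpy.linalg.slogdet`) is a numerical convenience chosen here, not something prescribed by the paper.

```python
import numpy as np

def conditional_moments(R, eta_delta, Sigma_delta):
    """Mean and covariance of the data errors given H_k, as in (52).

    The data errors are zero-mean, so the mean reduces to the gross error mean.
    """
    eta_k = np.asarray(eta_delta, dtype=float)
    Sigma_k = np.asarray(R, dtype=float) + np.asarray(Sigma_delta, dtype=float)
    return eta_k, Sigma_k

def log_prior_density(delta_d, eta_k, Sigma_k):
    """Logarithm of the Gaussian density (53)."""
    r = np.asarray(delta_d, dtype=float) - eta_k
    sign, logdet = np.linalg.slogdet(Sigma_k)
    if sign <= 0:
        raise ValueError("Sigma_k must be positive definite")
    quad = r @ np.linalg.solve(Sigma_k, r)
    return -0.5 * (r.size * np.log(2.0 * np.pi) + logdet + quad)
```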

6.3. Principal Error Evidence

Once the principal errors have been estimated and the corresponding error cluster has been identified, the relative plausibility of the competing gross error models can be assessed within a Bayesian framework.

Assuming that the gross error model

H_{k}

holds, the Bayesian evidence of the observed principal errors is defined as

p (ξ ∣ H_{k}) = \int_{R^{M}} L (Δ d; ξ) π^{-} (Δ d ∣ H_{k}) d^{M} Δ d,

(54)

where

p (ξ ∣ H_{k})

denotes the probability density of observing the principal errors

ξ

under the assumption that the gross error configuration specified by

H_{k}

is present. The Bayesian evidence therefore quantifies how well a given gross error model explains the observed principal errors.

In (54),

L (Δ d; ξ)

denotes the likelihood function of the principal errors,

L (Δ d; ξ) = p (ξ ∣ Δ d) .

(55)

Using the linear transformation defined in (35) and invoking the central limit theorem [21], the likelihood function is modeled as a multivariate Gaussian distribution with identity covariance matrix and mean vector

T^{†} Δ d

,

L (Δ d; ξ) = \frac{1}{\sqrt{{(2 π)}^{ν - N}}} exp (- \frac{1}{2} {(ξ - T^{†} Δ d)}^{T} (ξ - T^{†} Δ d)) .

(56)

Under the assumption of Gaussian distributions for both the likelihood function and the prior distribution of the data errors, the multidimensional integral in (54) admits a closed-form solution. The resulting Bayesian evidence is given by (see Appendix A)

p (ξ ∣ H_{k}) = \frac{exp (- \frac{1}{2} d_{k})}{\sqrt{{(2 π)}^{ν - N} | Σ_{k} | | A_{k} |}},

(57)

where the matrix

A_{k}

and the scalar quantity

d_{k}

are defined in Appendix A, Equations (A2) and (A3), respectively.
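Since Appendix A is not reproduced here, the following sketch does not use $A_{k}$ and $d_{k}$. It evaluates the integral (54) through the standard Gaussian marginalization identity instead: if $ξ ∣ Δ d$ is Gaussian with mean $T^{†} Δ d$ and identity covariance, and $Δ d ∣ H_{k}$ is Gaussian with mean $η_{k}$ and covariance $Σ_{k}$, then $ξ ∣ H_{k}$ is Gaussian with mean $T^{†} η_{k}$ and covariance $I + T^{†} Σ_{k} {T^{†}}^{T}$. The name `T_dag` for $T^{†}$ is an assumption of this example; the result should coincide with (57) insofar as (57) is the exact value of the same integral.

```python
import numpy as np

def log_evidence(xi, T_dag, eta_k, Sigma_k):
    """Log of p(xi | H_k) from the marginal Gaussian N(T_dag eta_k, I + T_dag Sigma_k T_dag^T)."""
    xi = np.asarray(xi, dtype=float)
    T_dag = np.asarray(T_dag, dtype=float)
    mean = T_dag @ np.asarray(eta_k, dtype=float)
    cov = np.eye(xi.size) + T_dag @ np.asarray(Sigma_k, dtype=float) @ T_dag.T
    r = xi - mean
    _, logdet = np.linalg.slogdet(cov)   # cov is positive definite by construction
    quad = r @ np.linalg.solve(cov, r)
    return -0.5 * (xi.size * np.log(2.0 * np.pi) + logdet + quad)
```

Working in the log domain avoids underflow when the evidences of several thousand candidate models are compared.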

6.4. Gross Error Model a Posteriori Probability

The posterior probability of each gross error model, conditioned on the observed principal errors, is finally obtained using Bayes’ theorem:

π^{+} (H_{k} ∣ ξ) = \frac{p (ξ ∣ H_{k}) π^{-} (H_{k})}{\sum_{j} p (ξ ∣ H_{j}) π^{-} (H_{j})} .

(58)

Here,

π^{-} (H_{k})

denotes the a priori probability assigned to the gross error model

H_{k}

. The summation in the denominator extends over all gross error models retained in the analysis, which are assumed to be mutually exclusive and collectively exhaustive.
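A minimal sketch of the normalization in (58) is given below. It assumes that the evidences are supplied as logarithms, which is a choice made here for numerical stability; the maximum is subtracted before exponentiation so that very small evidences do not underflow.

```python
import numpy as np

def posterior_probabilities(log_evidences, priors):
    """Posterior model probabilities (58) from log-evidences and prior probabilities."""
    log_w = np.asarray(log_evidences, dtype=float) + np.log(np.asarray(priors, dtype=float))
    log_w -= log_w.max()      # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

# Toy values for three hypothetical models (not from the paper).
post = posterior_probabilities([-10.0, -12.0, -9.0], [1e-4, 1e-4, 1e-8])
```

In this toy example the third model has the largest evidence but is outweighed by its much smaller prior, so the first model attains the highest posterior probability.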

6.5. Scalability Considerations and Hypothesis Space Reduction

The Bayesian formulation involves, in principle, the evaluation of posterior probabilities over a hypothesis space whose size grows combinatorially with the number of data, i.e.,

2^{M}

. However, the proposed framework is not intended to operate on the full set of admissible models. Its practical applicability relies on a structured reduction of the candidate hypothesis space based on physical and statistical considerations.

A first level of reduction is achieved through the sensitivity-based cluster identification described in Section 5. Only data that significantly contribute to the identified principal error cluster are retained as potential carriers of gross errors, drastically reducing the number of candidate hypotheses.

A second level of reduction is obtained by restricting the admissible hypotheses to physically meaningful configurations, corresponding to plausible failure mechanisms of the measurement chain (e.g., individual instrument failures or common-cause events such as VT-related faults). This replaces the exponential growth of arbitrary combinations with a finite and structured set of candidate models.

Furthermore, the Bayesian formulation incorporates prior probabilities that naturally promote sparsity, since the likelihood of multiple independent failure events decreases rapidly with their number. As a result, complex multi-error hypotheses are penalized unless strongly supported by the observed data.

Consequently, the computational complexity scales with the number of retained candidate models rather than with

2^{M}

. In realistic large-scale systems, with on the order of

10^{3}

measurements and

10^{2}

buses, the total number of admissible combinations would be intractable without pruning. However, after sensitivity-based and structure-based reduction, the number of candidate models typically reduces to a few hundred or at most a few thousand, which can be efficiently evaluated thanks to the closed-form expression of the Bayesian evidence.

In practical transmission-system control centers, state estimation and bad data processing are performed over time windows of the order of minutes. The proposed approach is compatible with these operational constraints, since it operates on a reduced and structured set of candidate models rather than on the full combinatorial space. For very large-scale systems or when the sensitivity-based reduction is less effective, additional strategies such as problem decomposition or heuristic selection of candidate models can be used to further reduce the number of evaluated hypotheses.

7. Test System

To validate the Bayesian bad data identification framework developed in the previous sections, the statistical behavior of the proposed detection and identification procedures was assessed through Monte Carlo simulations. All simulations were conducted on the IEEE 14-bus test system [23], shown in Figure 2.

Each Monte Carlo experiment consisted of an initialization phase followed by a sequence of independent trials. During initialization, the true values of the network parameters

π

were assigned, and the corresponding true system state

x

was obtained by performing a power flow analysis on the IEEE 14-bus system. The power flow solution was computed under a convergence criterion requiring the absolute rounding error to be below 0.5 × 10⁻⁵. The resulting true system state values are reported in Table 1.

The system consists of

n = 14

buses, corresponding to

N = 2 n - 1 = 27

state variables. The number of measurements

M_{y}

was chosen to ensure a redundancy factor

μ = M_{y} / N

in the range 1.5–2.5, which is representative of typical transmission systems. The measurement set comprised five types of measurements: voltage magnitudes at bus j (

V_{j}

), active and reactive power injections at bus j (

P_{j}

and

Q_{j}

), and active and reactive power flows between buses i and j (

P_{i, j}

and

Q_{i, j}

). Given this configuration, the true measurement vector was computed as

h (x, π)

.

The covariance matrices associated with measurement and parameter errors, namely

R_{Δ y}

,

R_{Δ π}

, and

R_{Δ y, Δ π}

, were specified accordingly. In each Monte Carlo trial, measurement and parameter errors

Δ y

and

Δ π

were generated and superimposed on the true measurements and network parameters, yielding the measured data vector

y

and the nominal parameter vector

π_{n}

. These quantities were stacked into the overall data vector

d = [\begin{matrix} h (x, π) + Δ y \\ π + Δ π \end{matrix}] .

Measurement and parameter errors were modeled as zero-mean Gaussian random variables.

Subsequently, gross errors were introduced according to the predefined gross error models, representing alternative hypotheses regarding the origin of inconsistencies in the data vector.

Finally, the sensitivity-based cluster identification and the Bayesian identification procedures described in the previous sections were applied to assess the ability of the proposed framework to correctly detect, identify, and probabilistically rank the underlying gross error models.

8. Case Studies

To exemplify and assess the performance of the proposed Bayesian framework for bad data detection and identification, a set of Monte Carlo case studies was designed.

In general, data errors (measurement and parameter errors) may exhibit statistical correlations and deviate from Gaussian distributions, and the proposed framework can accommodate such conditions. The simplified assumptions adopted here are intended to illustrate the effectiveness of the method, while a more comprehensive analysis is beyond the scope of this work.

The errors associated with the network parameters of the IEEE 14-bus system are assumed to be mutually uncorrelated, zero-mean, and normally distributed, with standard deviations given by

σ_{Δ π} (i) = 5 % \cdot π_{n} (i) + 0.001 pu .

(59)

Parameters with zero nominal value were treated as deterministic. Consequently, all nonzero resistances, reactances, and susceptances of the IEEE 14-bus system were considered uncertain. For simplicity, off-nominal transformer tap ratios were assumed to be exact and were therefore excluded from the set of uncertain parameters. With these assumptions, the total number of network parameters is

M_{π} = 41

.

All test cases were conducted with a redundancy factor of

μ \approx 2.5

, corresponding to

M_{y} = 69

measurements. The adopted measurement configuration is reported in Table 2, where each measurement, corresponding to an element of the vector function

h (x, π)

, is identified by a pair of indices

(n_{f}, n_{t})

. Here,

n_{f}

and

n_{t}

denote the from- and to-buses associated with the measurement, respectively. For voltage magnitude and power injection measurements,

n_{t} = 0

.

Measurement errors were modeled as mutually uncorrelated zero-mean Gaussian random variables with identical standard deviation for all measurements, set to

σ_{Δ y} = 0.001 pu .

(60)

As a consequence, the measurement error covariance matrix

R_{Δ y}

is diagonal and full rank. According to (28), the number of statistically independent stochastic directions is therefore

ν = M_{y} = 69

. Since the system state dimension is

N = 27

, the number of principal errors

ν - N

is equal to 42 for all test cases.
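The error model in (59)–(60) and the resulting dimension count can be reproduced with the short sketch below. The use of the absolute value in `parameter_sigma` is an assumption made here to keep standard deviations non-negative for negative parameter values; the function names are hypothetical.

```python
import numpy as np

M_Y, N_STATE = 69, 27            # values stated in the text
NU = M_Y                         # full-rank diagonal measurement covariance
N_PRINCIPAL = NU - N_STATE       # number of principal errors

def parameter_sigma(pi_n):
    """Standard deviations per (59); zero-valued parameters are treated as deterministic."""
    pi_n = np.asarray(pi_n, dtype=float)
    sigma = 0.05 * np.abs(pi_n) + 0.001   # abs() is an assumption of this sketch
    sigma[pi_n == 0.0] = 0.0
    return sigma

def measurement_covariance(m_y=M_Y, sigma_y=0.001):
    """Diagonal covariance for uncorrelated measurement errors, per (60)."""
    return (sigma_y ** 2) * np.eye(m_y)

R_y = measurement_covariance()
rank_R_y = int(np.linalg.matrix_rank(R_y))
```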

The detection and identification level of significance was fixed at

α = 5 \%

for all case studies. This choice is widely adopted in statistical hypothesis testing and provides a reasonable compromise between false alarm probability and detection sensitivity.

Given the size and complexity of the power system under consideration, a fully exhaustive exploration of all possible gross error configurations was computationally infeasible. In particular, with

M = M_{y} + M_{π} = 110

data potentially affected by single or multiple gross errors, the total number of admissible gross error models, as given by (49), is

\sum_{G = 0}^{M} (\binom{M}{G}) = 2^{M} \approx 1.3 \times 10^{33} .

Therefore, the case studies were selected with the objective of providing maximum insight into the behavior, robustness, and limitations of the proposed procedures, while keeping the overall computational burden manageable.

The first criterion adopted to reduce the number of admissible gross error models was to restrict gross errors exclusively to the measurement data, while assuming network parameters to be free of gross errors. This choice was made to isolate the effect of corrupted measurements and to evaluate the robustness of the identification algorithms specifically with respect to measurement gross errors, since measurement data constitute the most common source of bad data in practical applications.

In practical measurement infrastructures, gross errors may arise either from the malfunction of individual measurement instruments or from failures affecting common components of the measurement chain, such as current transformers (CTs) and voltage transformers (VTs). Multiple VTs may be installed at the same bus, and multiple CTs may be associated with each transmission line, supplying different groups of instruments.

For simplicity, the measurement chain is modeled here by assuming a single VT per bus (for buses where measurements are available), while the probability of CT failures is set to zero. This assumption is introduced solely to simplify the analysis and to provide a clearer interpretation of the results, and is not motivated by computational limitations.

More realistic configurations, including multiple VTs per bus and CT-related failures, can be accommodated within the proposed Bayesian framework by defining the corresponding gross error models, so this simplification does not limit the generality of the approach.

Failure events were assumed to occur independently and with small probability. Under this assumption, the probability of three or more simultaneous failure events was considered negligible. Importantly, this limitation applies to the number of physical failure events, rather than to the number of measurement data affected by gross errors. As a result, the retained gross error models may involve multiple measurement data, provided that they originate from at most two simultaneous failure events (e.g., the failure of one or two VTs, or combinations of VT and individual instrument failures).

Given the above considerations, the gross error models used in the case studies were grouped into five mutually exclusive classes, each corresponding to a specific failure mechanism. These classes are summarized below.

A:: A single gross error affects one measurement datum. The number of admissible models in this class is

$N_{A} = M_{y} = 69,$

(61)

which corresponds to the failure of a single measurement instrument.
Each model in this class is denoted by $H_{m}^{(A)}$ , where m identifies the m-th measurement.
B:: Multiple gross errors affect all measurement data (voltages and powers) associated with a single bus. The number of admissible models in this class is

$N_{B} = n = 14,$

(62)

corresponding to the failure of a single voltage transformer (VT) connected to that bus.
Each model in this class is denoted by $H_{b}^{(B)}$ , where b identifies the b-th bus.
C:: Two gross errors affect a pair of measurement data. The number of admissible models in this class is

$N_{C} = (\binom{M_{y}}{2}) = 2346,$

(63)

accounting for all possible combinations of two independent measurement instrument failures.
Each model in this class is denoted by $H_{m, n}^{(C)}$ , with $m \neq n$ , representing the simultaneous failure of instruments m and n.
D:: Multiple gross errors affect all measurement data associated with a pair of buses. The number of admissible models in this class is

$N_{D} = (\binom{n}{2}) = 91,$

(64)

corresponding to the simultaneous failure of the two voltage transformers connected to the selected bus pair.
Each model in this class is denoted by $H_{b, c}^{(D)}$ , with $b \neq c$ , representing faults affecting all instruments connected to buses b and c.
E:: Multiple gross errors affect all measurement data associated with one bus and a single additional measurement datum. The number of admissible models in this class is

$N_{E} = M_{y} \cdot n = 966,$

(65)

corresponding to the combined failure of one voltage transformer and one measurement instrument.
Each model in this class is denoted by $H_{m, b}^{(E)}$ , representing the simultaneous failure of instrument m and all instruments connected to bus b.
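The class sizes in (61)–(65) can be verified by explicit enumeration. In the sketch below each hypothesis is represented by a tuple whose first element is the class label; this labelling and the zero-based indices are conventions of the example only.

```python
from itertools import combinations

M_Y, N_BUS = 69, 14  # number of measurements and buses stated in the text

class_A = [("A", m) for m in range(M_Y)]
class_B = [("B", b) for b in range(N_BUS)]
class_C = [("C", m1, m2) for m1, m2 in combinations(range(M_Y), 2)]
class_D = [("D", b, c) for b, c in combinations(range(N_BUS), 2)]
class_E = [("E", m, b) for m in range(M_Y) for b in range(N_BUS)]

counts = {
    "A": len(class_A),
    "B": len(class_B),
    "C": len(class_C),
    "D": len(class_D),
    "E": len(class_E),
}
total_models = sum(counts.values())
```

The total of 3486 retained hypotheses is small enough for the closed-form evidence to be evaluated for every candidate.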

The assignment of prior probabilities to the gross-error models requires specifying the probability of the physical failure events that generate them. In SCADA-based power system state estimation, such events are relatively rare. Operational experience and the literature on bad data processing indicate that only a very small fraction of measurements is typically affected by gross errors during a measurement scan, often on the order of 1 × 10⁻⁴–1 × 10⁻³ [24,25].

Based on this empirical evidence, the elementary probability of a single measurement-chain failure event was set to

p = 1 \times 10^{- 4} .

(66)

This value was used to assign consistent a priori probabilities to all gross error models retained in the analysis. Accordingly, the a priori probability

π^{-} (H_{k})

associated with each gross error model depends on the number of underlying physical failure events giving rise to

H_{k}

. In particular,

π^{-} (H_{k}) = \begin{cases} p, & H_{k} \in \text{classes A, B}, \\ p^{2}, & H_{k} \in \text{classes C, D, E}. \end{cases}

(67)

In practical measurement infrastructures, different components of the measurement chain generally exhibit different failure probabilities. For example, the probability of failure of an instrument transformer differs from that of individual measurement instruments.

The proposed Bayesian framework can naturally accommodate such differences by assigning component-specific prior probabilities. In the present study, however, a single representative value of p was adopted for simplicity, since the test cases are intended primarily to illustrate the validity and capabilities of the proposed Bayesian identification approach.
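The prior assignment in (67) amounts to raising p to the number of underlying failure events. The sketch below encodes this rule and sums the prior mass over the retained models; the dictionary names are hypothetical. Because (58) normalizes over the retained models, only the ratios between these priors affect the posterior, and the sum is not expected to equal one.

```python
P_FAIL = 1e-4  # elementary failure probability, as in (66)

# Number of physical failure events behind each class, following (67).
EVENTS_PER_CLASS = {"A": 1, "B": 1, "C": 2, "D": 2, "E": 2}

# Class sizes from (61)-(65).
MODEL_COUNTS = {"A": 69, "B": 14, "C": 2346, "D": 91, "E": 966}

def prior(model_class, p=P_FAIL):
    """A priori probability of a single model of the given class."""
    return p ** EVENTS_PER_CLASS[model_class]

prior_mass = sum(MODEL_COUNTS[c] * prior(c) for c in MODEL_COUNTS)
```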

Although the above restrictions significantly reduce the hypothesis space, a fully exhaustive investigation would still involve several thousand distinct gross error models. A comprehensive reporting of all such scenarios is beyond the scope of this paper and would not provide additional methodological insight.

Therefore, a restricted set of representative test cases was defined for each of the five gross error model classes introduced above (class-A to class-E). The selected cases capture the most relevant and structurally distinct configurations within each class. The Monte Carlo analysis was conducted exclusively on these representative scenarios.

The five test cases considered in this section share a common simulation and evaluation protocol. For each representative gross error model considered in the case studies,

N = 20

independent Monte Carlo simulations were performed. In each run, a specific gross error model

H^{*}

was applied to the measurement data and the Bayesian framework was used to compute the posterior probability associated with every admissible hypothesis belonging to classes A–E.

The Bayesian evidence (57) for each of the five gross error classes is computed by assuming that the covariance matrix

Σ_{δ_{k}}

in (52) is zero except for the diagonal entries

(i, i)

such that

z_{k} (i) = 1

, where

z_{k}

is the gross error activation vector defined in (45). For these entries, the variance is set to

1 / 3

, consistent with the assumption that the gross error follows a uniform distribution over the interval from −1 pu to 1 pu, whose variance is (1 − (−1))² / 12 = 1/3. Accordingly, the mean vector

η_{δ_{k}}

in (52) is set to zero.
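Under these assumptions, the gross error moments follow directly from the activation vector. A minimal sketch, with hypothetical function names, is:

```python
import numpy as np

GROSS_VAR = 1.0 / 3.0  # variance of a uniform distribution on [-1, 1] pu

def gross_error_moments(z):
    """Zero mean and diagonal covariance with variance 1/3 on the activated entries."""
    z = np.asarray(z, dtype=float)
    eta_delta = np.zeros_like(z)
    Sigma_delta = np.diag(GROSS_VAR * z)
    return eta_delta, Sigma_delta

def model_covariance(R, z):
    """Covariance of the data errors under H_k, as in (52)."""
    _, Sigma_delta = gross_error_moments(z)
    return np.asarray(R, dtype=float) + Sigma_delta

# Toy example with three data and a gross error on the second one.
Sigma_k = model_covariance(1e-6 * np.eye(3), [0, 1, 0])
```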

For each Monte Carlo realization, the maximum a posteriori (MAP) probability was determined as

π_{MAP}^{+} = max_{H_{k}} π^{+} (H_{k} ∣ \tilde{ξ}) .

(68)

Since multiple hypotheses may attain nearly identical posterior probabilities, the identification outcome was defined through a relative ambiguity criterion. Specifically, all hypotheses satisfying

π^{+} (H_{k} ∣ \tilde{ξ}) \geq (1 - ε) π_{MAP}^{+}

(69)

were considered as jointly identified, where

ε = 0.05

.

Let

S

denote the resulting set of identified hypotheses for a given Monte Carlo realization. The cardinality of

S

reflects the degree of decisiveness of the identification process. In particular:

if $| S | = 1$ , the identification is unambiguous;
if $| S | > 1$ , the outcome is classified as ambiguous.

The identification outcome is evaluated by comparing the true gross error model

H^{*}

with the identified set

S

. For each Monte Carlo run, four situations may arise:

1.: Correct identification (unambiguous):

$S = {H^{*}} .$
2.: Correct but ambiguous identification:

$H^{*} \in S and | S | > 1 .$
3.: Misidentification within the same class:

$H^{*} \notin S,$

and all hypotheses in $S$ belong to the same gross error class as $H^{*}$ .
4.: Misidentification across classes:

$H^{*} \notin S,$

and at least one hypothesis in $S$ belongs to a class different from that of $H^{*}$ .
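The ambiguity criterion (69) and the four outcomes listed above can be expressed compactly. In this sketch the posteriors are passed as a dictionary keyed by hypothesis label, and `class_of` is a caller-supplied function returning the class of a label; both conventions, and the outcome names, are assumptions of the example.

```python
def identified_set(posteriors, eps=0.05):
    """Hypotheses whose posterior is at least (1 - eps) times the MAP value, as in (69)."""
    p_map = max(posteriors.values())
    return {h for h, p in posteriors.items() if p >= (1.0 - eps) * p_map}

def classify_outcome(posteriors, true_h, class_of, eps=0.05):
    """Return 'correct', 'ambiguous', 'intra', or 'inter' for one Monte Carlo run."""
    S = identified_set(posteriors, eps)
    if true_h in S:
        return "correct" if len(S) == 1 else "ambiguous"
    if all(class_of(h) == class_of(true_h) for h in S):
        return "intra"
    return "inter"

# Toy posteriors for illustration; labels are (class, index...) tuples.
toy = {("A", 27): 0.48, ("A", 65): 0.47, ("C", 27, 65): 0.05}
outcome = classify_outcome(toy, ("A", 27), class_of=lambda h: h[0])
```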

To quantitatively assess the performance of the proposed framework, the following empirical metrics are computed over the Monte Carlo runs:

P_{corr} = \frac{N_{corr}}{N}, P_{amb} = \frac{N_{amb}}{N}, P_{intra} = \frac{N_{intra}}{N}, P_{inter} = \frac{N_{inter}}{N} .

(70)

where

N_{corr}

,

N_{amb}

,

N_{intra}

, and

N_{inter}

denote, respectively, the number of runs classified in each of the four categories above.
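Given a list of per-run outcomes, the indices in (70) are simple relative frequencies. The outcome labels below are hypothetical names for the four categories, and the toy list is not taken from the paper's results.

```python
from collections import Counter

CATEGORIES = ("correct", "ambiguous", "intra", "inter")

def performance_indices(outcomes):
    """Empirical frequencies of the four identification outcomes, as in (70)."""
    n_runs = len(outcomes)
    tally = Counter(outcomes)
    return {name: tally[name] / n_runs for name in CATEGORIES}

# Toy outcome list for 20 runs (illustrative only).
toy_outcomes = ["correct"] * 15 + ["ambiguous"] * 2 + ["intra"] * 2 + ["inter"]
indices = performance_indices(toy_outcomes)
```

By construction the four indices sum to one, which provides a quick consistency check on reported values.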

In all Monte Carlo experiments used to compute the performance indices in (70), the gross errors affecting the corrupted measurements were generated independently and uniformly distributed in the interval between

0.02

pu

and

0.06

pu

. This range was adopted for all five test cases in order to provide a homogeneous statistical basis for the comparison of the identification performance across the different gross error classes. It is worth noting that this choice differs from the prior model adopted for gross errors, which is Gaussian and, in this section, is specified as zero-mean with variance equal to 1/3. Therefore, both the assumed distribution and the support of the gross errors differ from those used in the simulations, introducing a deliberate model mismatch. The results thus provide an indication of the robustness of the proposed Bayesian identification approach with respect to deviations from the assumed error model.
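The injection step described above can be sketched as follows, assuming that the gross errors are added to the activated measurements and that a NumPy random generator is used; the function name is hypothetical.

```python
import numpy as np

def inject_gross_errors(y, z, rng, low=0.02, high=0.06):
    """Add independent uniform gross errors (in pu) to the measurements flagged by z."""
    y = np.asarray(y, dtype=float)
    active = np.asarray(z, dtype=bool)
    corrupted = y.copy()
    corrupted[active] += rng.uniform(low, high, size=int(active.sum()))
    return corrupted

rng = np.random.default_rng(1)
y_clean = np.zeros(5)
y_bad = inject_gross_errors(y_clean, [0, 1, 0, 1, 0], rng)
```

The injected errors are strictly positive and far smaller than the ±1 pu range assumed in the prior, which is the model mismatch discussed in the text.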

For the detailed analyses of selected representative scenarios presented in the following subsections, a different Monte Carlo setup was adopted in order to investigate the sensitivity of the identification procedure to the gross error magnitude. In these cases, the injected gross errors were generated independently, with zero mean, and uniformly distributed within the intervals indicated on the horizontal axes of the corresponding figures. This second type of simulation allows one to analyze how the posterior probability associated with the true hypothesis evolves as the gross error magnitude varies.

The following subsections focus on the case-specific aspects and on a limited number of representative examples.

8.1. Test Case 1: Single-Instrument Gross Errors (Class-A)

Test case 1 considers class-A gross errors, corresponding to the failure of a single measurement instrument. In this case,

N_{A} = 69

admissible gross error models are considered.

The resulting performance indices are

P_{corr} = 0.7275, P_{amb} = 0.0906, P_{intra} = 0.0920, P_{inter} = 0.0899 .

(71)

Therefore, thanks to the additional probabilistic information introduced through the structured prior on failure grouping, 50 out of 69 class-A gross errors are correctly identified, despite the fact that only

ν - N = 42

statistically independent principal error directions are available. This result highlights the benefit of the Bayesian hypothesis modeling in compensating for the limited redundancy of the measurement system.

Figure 3a reports the identification frequency aggregated by gross error class, whereas Figure 3b reports the detailed identification frequency within class-A. Together, the two subfigures show that all four identification outcomes defined above occur in this case study.

Four representative gross-error scenarios are highlighted in Figure 3 by vertical dashed lines. These correspond to measurement indices illustrating the four identification outcomes defined above: measurement 13 (correct identification, unambiguous), measurement 27 (correct but ambiguous identification), measurement 23 (misidentification within the same class), and measurement 67 (misidentification across classes).

Measurement index 13 (active power injection

P_{10}

at bus 10) illustrates a case of correct and unambiguous identification. Figure 4 reports the posterior probability

π^{+} (H_{13}^{(A)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error. For bound values exceeding approximately

1.2 %

, the correct hypothesis is identified with high posterior probability and negligible dispersion across Monte Carlo runs.

Measurement index 27 (reactive power injection

Q_{14}

at bus 14) provides an example of correct but ambiguous identification. Figure 5 shows the posterior probabilities

π^{+} (H_{27}^{(A)} ∣ \tilde{ξ})

and

π^{+} (H_{65}^{(A)} ∣ \tilde{ξ})

as functions of the bound of the uniform distribution used to model the gross error. For bound values exceeding approximately 9%, two hypotheses are identified with comparable posterior probability: the true corrupted measurement 27 and measurement 65 (reactive power flow

Q_{13, 14}

between buses 13 and 14). It is worth noting that both measurements are physically associated with bus 14, which may contribute to the observed ambiguity.

Measurement index 23 (reactive power injection

Q_{4}

at bus 4) illustrates a case of misidentification within the same class, as highlighted in Figure 3. In this case the hypotheses most frequently selected correspond to measurements 57 (reactive power flow

Q_{5, 4}

between buses 5 and 4) and 66 (reactive power flow

Q_{4, 7}

between buses 4 and 7). Both measurements are connected to the same bus, which may contribute to the observed intra-class confusion.

A more detailed view of this behavior is provided in Figure 6, which shows the posterior probabilities

π^{+} (H_{23}^{(A)} ∣ \tilde{ξ})

,

π^{+} (H_{57}^{(A)} ∣ \tilde{ξ})

, and

π^{+} (H_{66}^{(A)} ∣ \tilde{ξ})

as functions of the uniform error bound across 20 Monte Carlo repetitions.

Finally, measurement index 67 (reactive power flow measurement

Q_{4, 9}

between buses 4 and 9) illustrates a case of misidentification across classes. In this scenario the most frequently selected hypothesis belongs to class-C and corresponds to the simultaneous gross errors affecting measurements 23 and 67. Although the true corrupted measurement is included in the identified hypothesis, the Bayesian framework attributes part of the evidence to an additional instrument failure. This outcome may be more appropriately interpreted as a correct but ambiguous identification between a single-instrument failure (class-A) and a two-instrument failure hypothesis (class-C).

8.2. Test Case 2: Single-Bus Gross Errors (Class-B)

Test case 2 considers class-B gross errors, corresponding to the simultaneous failure of all measurement instruments connected to a single bus. In this case,

N_{B} = 14

admissible gross error models are considered.

The resulting performance indices are

P_{corr} = 0.7321, P_{amb} = 0.0214, P_{intra} = 0.0107, P_{inter} = 0.2357 .

(72)

Thanks to the additional probabilistic information introduced through the structured prior on failure grouping, gross errors affecting 10 out of the 14 buses are correctly identified. This corresponds to the detection of 46 gross errors across 46 measuring instruments, despite the fact that only

ν - N = 42

statistically independent principal error directions are available.

Figure 7a reports the identification frequency aggregated by error class, whereas Figure 7b reports the detailed identification frequency within class-B.

Three representative gross-error scenarios are highlighted in Figure 7 by vertical dashed lines: bus 1 (correct and unambiguous identification), bus 14 (correct but ambiguous identification), and bus 6 (misidentification across classes). Intra-class misidentification is statistically negligible in this case and is therefore not considered in the following analysis.

Bus index 1 illustrates a case of correct and unambiguous identification. The injected gross error affects all measurements associated with bus 1, namely $V_{1}$, $P_{1}$, $Q_{1}$, $P_{1, 2}$, $P_{1, 5}$, $Q_{1, 2}$, and $Q_{1, 5}$.

Figure 8 reports the posterior probability

π^{+} (H_{1}^{(B)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error, across the 20 Monte Carlo repetitions. For bound values exceeding approximately 2.5%, the correct hypothesis is identified with high posterior probability and negligible dispersion.

Bus index 14 provides an example of correct but ambiguous identification. Figure 9 shows the posterior probabilities

π^{+} (H_{13}^{(B)} ∣ \tilde{ξ})

and

π^{+} (H_{14}^{(B)} ∣ \tilde{ξ})

as functions of the uniform error bound applied to all measurements connected to bus 14.

When the gross error bound exceeds approximately 9%, two hypotheses are simultaneously identified with comparable posterior probabilities. The corresponding models involve buses 14 and 13, whose measurements include $P_{13, 14}$ and $Q_{13, 14}$. Since the measurements associated with these buses are physically coupled through the branch connecting buses 13 and 14, this coupling may contribute to the observed ambiguous identification.

Finally, bus index 6 illustrates a case of misidentification across classes. The injected gross error affects all measurements connected to bus 6, namely $V_{6}$, $P_{6}$, $Q_{6}$, $P_{6, 12}$, $P_{6, 13}$, $Q_{6, 12}$, and $Q_{6, 13}$.

As shown in Figure 7a, the most frequently selected hypotheses belong to class-D (two-bus failures) and class-E (combined bus–instrument failures). In particular, the class-E hypothesis corresponds to simultaneous gross errors affecting measurement 68 ($Q_{5, 6}$) and bus 6, while the class-D hypothesis corresponds to simultaneous failures of buses 4 and 6.

Although formally classified as inter-class misidentifications, these cases include the true corrupted bus in the identified set. This outcome may therefore be interpreted as a correct but ambiguous identification between a single-bus failure (class-B) and more complex coupled-failure scenarios involving adjacent buses or instruments.

The first two test cases were analyzed in detail in order to illustrate the four possible identification outcomes introduced above, namely correct identification (unambiguous), correct but ambiguous identification, intra-class misidentification, and inter-class misidentification.
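The four outcomes can be made concrete with a small decision rule. The following Python sketch is illustrative only: the `(class, index)` labels, the function name `classify_outcome`, and in particular the relative tolerance `rel_tol` used to decide when two posteriors count as comparable are assumptions introduced here, not the criterion used to compute the performance indices reported in this paper.

```python
def classify_outcome(posteriors, true_hyp, rel_tol=0.5):
    """Classify one identification run into one of four outcomes.

    posteriors: dict mapping (class_label, index) -> posterior probability
    true_hyp:   (class_label, index) of the injected gross-error model
    rel_tol:    hypothetical threshold; hypotheses whose posterior is at
                least rel_tol times the largest posterior are "selected"
    """
    best = max(posteriors.values())
    selected = {h for h, p in posteriors.items() if p >= rel_tol * best}

    if true_hyp in selected:
        # The true model is among the selected ones.
        return "correct" if len(selected) == 1 else "ambiguous"

    # The true model was not selected: compare classes of truth and winner.
    top = max(posteriors, key=posteriors.get)
    return "intra-class" if top[0] == true_hyp[0] else "inter-class"


# Made-up posteriors, used only to exercise the four branches.
examples = {
    "correct": ({("A", 13): 0.95, ("A", 27): 0.05}, ("A", 13)),
    "ambiguous": ({("A", 27): 0.48, ("A", 65): 0.46, ("A", 1): 0.06}, ("A", 27)),
    "intra-class": ({("A", 23): 0.10, ("A", 57): 0.60, ("A", 66): 0.30}, ("A", 23)),
    "inter-class": ({("A", 67): 0.20, ("C", 1000): 0.80}, ("A", 67)),
}
for expected, (post, truth) in examples.items():
    print(expected, "->", classify_outcome(post, truth))
```

Under such a rule, a run in which the selected hypothesis contains the true measurement but belongs to another class is still counted as an inter-class misidentification, which matches the way the measurement 67 and bus 6 scenarios are classified above.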

For the remaining test cases (class-C to class-E), the analysis focuses on representative scenarios leading to correct identification. The purpose of these examples is primarily to illustrate the behavior of the proposed Bayesian identification framework under more complex failure mechanisms involving multiple simultaneous gross errors.

A systematic exploration of all possible identification outcomes for these cases would require a significantly larger number of simulations and would considerably increase the length of the paper without providing additional methodological insight.

A more exhaustive statistical analysis of these multiple-failure scenarios is left for future work.

8.3. Test Case 3: Double-Instrument Gross Errors (Class-C)

Test case 3 considers class-C gross errors, corresponding to the simultaneous failure of two measurement instruments. In this case,

N_{C} = 2346

admissible gross error models are considered.

The resulting performance indices are

P_{corr} = 0.4142, P_{amb} = 0.1096, P_{intra} = 0.1558, P_{inter} = 0.3204 .

(73)

Thus, class-C scenarios are more challenging than classes A and B, as expected for multiple simultaneous failures. Correct and unambiguous identification is obtained in about 41% of the simulations, while both intra-class and inter-class confusion become more significant.

Figure 10a reports the identification frequency aggregated by error class, whereas Figure 10b shows the detailed identification frequency within class-C.

Overall, 971 out of the 2346 admissible measurement pairs are correctly and unambiguously identified despite the limited number of statistically independent principal error directions. This result again demonstrates the ability of the Bayesian hypothesis modeling to mitigate the intrinsic redundancy limitations of the measurement configuration.

A representative example of correct and unambiguous identification corresponds to hypothesis index 237, associated with simultaneous gross errors affecting measurements 4 (phase voltage $V_{6}$ at bus 6) and 40 (active power flow $P_{9, 10}$ between buses 9 and 10).

Figure 11 reports the posterior probability

π^{+} (H_{237}^{(C)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error, across 20 Monte Carlo repetitions.

It is observed that, for bound values exceeding approximately 4.5%, the correct hypothesis is identified with high posterior probability and negligible dispersion across the Monte Carlo runs.

8.4. Test Case 4: Double-Bus Gross Errors (Class-D)

Test case 4 considers class-D gross errors, corresponding to the simultaneous failure of all measurement instruments connected to two distinct buses. In this case,

N_{D} = 91

admissible gross error models are considered.

The resulting performance indices are

P_{corr} = 0.6692, P_{amb} = 0.0258, P_{intra} = 0.1352, P_{inter} = 0.1698 .

(74)

Class-D scenarios show comparatively good performance: correct and unambiguous identification is obtained in about 67% of the simulations, while both ambiguity and inter-class confusion remain limited.

Figure 12a reports the identification frequency aggregated by error class, whereas Figure 12b shows the detailed identification frequency within class-D.

Overall, 61 out of the 91 admissible bus-pair hypotheses are correctly and unambiguously identified, even though only

ν - N = 42

statistically independent principal error directions are available.

A representative example of correct and unambiguous identification corresponds to hypothesis index 18, which represents gross errors simultaneously affecting all measurements associated with buses 2 and 7.

The measurements connected to bus 2 are $V_{2}$, $P_{2}$, $Q_{2}$, $P_{2, 3}$, $P_{2, 4}$, $P_{2, 5}$, $Q_{2, 3}$, $Q_{2, 4}$, and $Q_{2, 5}$, while those connected to bus 7 are $P_{7, 9}$ and $Q_{7, 9}$.

Figure 13 shows the posterior probability

π^{+} (H_{18}^{(D)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors across the 20 Monte Carlo repetitions.

The results show excellent identification performance. In particular, the median posterior probability associated with the correct hypothesis exceeds 90% for bound values above approximately 3%, with very limited dispersion across the Monte Carlo runs.

8.5. Test Case 5: Bus–Instrument Combined Gross Errors (Class-E)

Test case 5 considers class-E gross errors, corresponding to the simultaneous failure of all measurement instruments connected to one bus together with one additional measurement instrument. In this case,

N_{E} = 966

admissible gross error models are considered.

The resulting performance indices are

P_{corr} = 0.4313, P_{amb} = 0.0736, P_{intra} = 0.1661, P_{inter} = 0.3289 .

(75)

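Class-E scenarios confirm that the proposed Bayesian framework remains effective even for structured multiple-failure configurations, although the identification task is more demanding than in classes A, B, and D and comparable in difficulty to class C (correct and unambiguous identification in about 43% versus 41% of the simulations).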

Figure 14a reports the identification frequency aggregated by error class, whereas Figure 14b shows the detailed identification frequency within class-E.

Overall, 416 out of the 966 admissible measurement–bus pairs are correctly and unambiguously identified despite the limited number of statistically independent principal error directions.

A representative example of correct and unambiguous identification corresponds to hypothesis index 580, associated with gross errors affecting measurement 28 together with all instruments connected to bus 9.

Measurement 28 corresponds to the active power flow $P_{1, 2}$, while the measurements connected to bus 9 are $P_{9}$, $Q_{9}$, $P_{9, 10}$, $P_{9, 14}$, $Q_{9, 10}$, and $Q_{9, 14}$.

Figure 15 shows the posterior probability

π^{+} (H_{580}^{(E)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors across the 20 Monte Carlo repetitions.

The results again show excellent identification performance. In particular, the median posterior probability associated with the correct hypothesis exceeds 90% for gross-error bounds above approximately 2%, with very limited dispersion across the Monte Carlo runs.

Despite the very large number of admissible gross-error models considered in the five case studies,

N_{A} + N_{B} + N_{C} + N_{D} + N_{E} = 3486,

a substantial number of the true gross-error configurations are correctly identified without ambiguity.

In particular, the total number of correctly identified hypotheses can be estimated as

P_{corr}^{(A)} N_{A} + P_{corr}^{(B)} N_{B} + P_{corr}^{(C)} N_{C} + P_{corr}^{(D)} N_{D} + P_{corr}^{(E)} N_{E} \approx 1500 .

This result is particularly noteworthy given that only

ν - N = 42

statistically independent principal error directions are available.

In other words, the proposed Bayesian identification framework correctly and unambiguously identifies approximately 1500 of the 3486 admissible single and multiple gross-error hypotheses on the basis of only 42 observable principal-error components, highlighting its ability to exploit the information contained in the principal-error space even under severe limitations in measurement redundancy.
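The order of magnitude of this estimate can be checked from the indices in (72)–(75). In the Python sketch below, the class-A model count is obtained by subtracting the other four counts from the reported total of 3486 (an inference made here, not a value quoted in this section), and the class-A rate is deliberately left unknown between 0 and 1, so the total is only bracketed. The variable names are hypothetical.

```python
# Correct-identification rates from Eqs. (72)-(75) and model counts from the text.
p_corr = {"B": 0.7321, "C": 0.4142, "D": 0.6692, "E": 0.4313}
n_models = {"B": 14, "C": 2346, "D": 91, "E": 966}
n_total = 3486

# Class-A count inferred from the reported total (an inference, not a quoted value).
n_a = n_total - sum(n_models.values())

# Expected number of correctly identified hypotheses per class: P_corr * N.
expected = {k: p_corr[k] * n_models[k] for k in p_corr}
partial = sum(expected.values())

# The class-A rate is not used here, so the total can only be bracketed.
lower, upper = partial, partial + n_a

print(f"inferred N_A = {n_a}")
for k, v in expected.items():
    print(f"class {k}: about {v:.1f} of {n_models[k]}")
print(f"total between {lower:.1f} and {upper:.1f}")
```

Classes B to E alone contribute roughly 1459 hypotheses, and the class-A term can add at most 69, so the quoted value of about 1500 lies inside the resulting bracket.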

It is also worth noting that the identification performance proved to be only weakly sensitive to the assumed value of the elementary failure probability p. In particular, varying p by one order of magnitude did not produce appreciable changes in the posterior model ranking, indicating that the Bayesian identification process is largely driven by the information contained in the principal-error observations rather than by the specific choice of prior probabilities.
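The mechanism behind this weak sensitivity can be illustrated with a toy calculation. The Python sketch below assumes, purely for illustration, a product-form prior in which each of `n_units` elementary units fails independently with probability p; the structured prior actually used in the paper may differ. The log-evidence values are hypothetical numbers chosen to show the effect and are not results from the simulations.

```python
import math


def log_posterior(log_evidence, n_failures, p, n_units):
    """Normalised log posteriors from log evidences and a product-form prior.

    Assumed prior (illustrative): a hypothesis with m elementary failures
    has prior probability p**m * (1 - p)**(n_units - m).
    """
    unnorm = {
        h: n_failures[h] * math.log(p)
        + (n_units - n_failures[h]) * math.log1p(-p)
        + log_evidence[h]
        for h in log_evidence
    }
    # Log-sum-exp normalisation for numerical stability.
    peak = max(unnorm.values())
    log_z = peak + math.log(sum(math.exp(v - peak) for v in unnorm.values()))
    return {h: v - log_z for h, v in unnorm.items()}


def ranking(log_post):
    """Hypotheses sorted from most to least probable."""
    return sorted(log_post, key=log_post.get, reverse=True)


# Hypothetical log evidences: two single failures and one double failure.
log_evidence = {"H_i": -40.0, "H_j": -55.0, "H_ij": -38.5}
n_failures = {"H_i": 1, "H_j": 1, "H_ij": 2}

rank_small = ranking(log_posterior(log_evidence, n_failures, p=1e-3, n_units=69))
rank_large = ranking(log_posterior(log_evidence, n_failures, p=1e-2, n_units=69))
print(rank_small, rank_large)
```

Under this assumed prior, a tenfold change in p shifts the log prior odds between hypotheses that differ by one elementary failure by about ln 10 ≈ 2.3 and leaves the odds between hypotheses with the same number of failures unchanged. The ranking can therefore change only where the log-evidence gaps are of that order or smaller.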

These results confirm that the proposed Bayesian framework provides a practical and computationally feasible approach for gross-error identification even in large hypothesis spaces characterized by limited measurement redundancy.

9. Conclusions

This paper has presented a Bayesian framework for bad data identification in power system state estimation under measurement and network parameter uncertainty. The proposed approach builds upon the Extended Weighted Least Squares (EWLS) estimator and the eigenvalue-based decomposition of the residual covariance matrix, which provides a minimal and statistically independent representation of data inconsistencies through principal errors.

By revisiting the EWLS formulation as the unconstrained solution of an equivalent constrained estimation problem, the paper clarifies the relationship between system observability, residual subspace dimension, and the number of determinable principal errors. This analysis highlights the intrinsic limitations of bad data identification in systems with limited redundancy and motivates the use of probabilistic inference when multiple error configurations produce similar observable effects.

A key feature of the proposed framework is its ability to explicitly account for uncertainty affecting both measurements and network parameters through the EWLS formulation. This allows the identification procedure to remain robust even in the presence of significant parameter uncertainties, avoiding the loss of sensitivity to measurement outliers that may arise when parameter uncertainty is implicitly absorbed into measurement error models. Although the present study focused on gross errors affecting measurement data for simplicity, the proposed framework naturally extends to the identification of bad data affecting network parameters as well.

To address the non-uniqueness inherent in bad data identification, the Bayesian formulation evaluates and ranks alternative gross-error models corresponding to different physical failure mechanisms. By restricting the hypothesis space to physically meaningful and influential data combinations, the method provides a tractable and systematic decision-making procedure for identifying the most likely sources of gross errors.

Monte Carlo simulations conducted on the IEEE-14 bus test system demonstrate the effectiveness and robustness of the proposed methodology under significant measurement and parameter uncertainties. Despite the large number of admissible gross-error models considered in the study (

N_{A} + N_{B} + N_{C} + N_{D} + N_{E} = 3486

), the framework is able to correctly identify a substantial fraction of the true configurations on the basis of only

ν - N = 42

observable principal-error components. This result highlights the ability of the Bayesian identification approach to compensate for the limited redundancy of the measurement configuration by exploiting the probabilistic structure of the hypothesis space, even in complex multiple-failure scenarios. At the same time, the numerical results show that the identification performance is not uniform across all gross-error classes. In particular, classes C and E exhibit lower correct identification rates (about 41% and 43%) than classes B and D (about 73% and 67%), together with higher ambiguity or misidentification frequencies. A deeper understanding of the mechanisms underlying these differences remains an open issue and will be addressed in future work.

The results also show that the identification performance is only weakly sensitive to the assumed value of the elementary failure probability p. Varying p by one order of magnitude does not produce appreciable changes in the posterior ranking of the hypotheses, indicating that the Bayesian identification process is largely driven by the information contained in the principal-error observations rather than by the specific choice of prior probabilities.

The proposed framework is particularly suited to situations in which limited measurement redundancy and parameter uncertainty lead to intrinsic ambiguity in bad data localization. In such conditions, deterministic identification strategies may fail to uniquely attribute the observed inconsistencies, whereas the probabilistic ranking provided by the Bayesian formulation enables a systematic and interpretable assessment of the most plausible gross-error sources.

Future research will focus on extending the framework toward more detailed representations of measurement infrastructures, including the modeling of multiple instrument transformers and heterogeneous component reliability, as well as on the adoption of more general error models beyond the Gaussian assumption. In particular, the use of heavy-tailed or non-Gaussian distributions for both measurement and gross errors represents a promising direction to further improve the robustness of the approach under realistic operating conditions.

Furthermore, the explicit modeling and identification of gross errors affecting network parameters, including scenarios involving simultaneous measurement and parameter gross errors, will be investigated, taking into account the associated identifiability challenges.

Another promising direction concerns the integration of the proposed model-based Bayesian identification approach with data-driven techniques. In particular, the principal-error representation derived from the EWLS formulation provides a compact and physically meaningful feature space that could be exploited by learning-based methods while preserving the physical constraints imposed by the network model. Such hybrid physics-informed and data-driven approaches may further improve the scalability and adaptability of the methodology when applied to large-scale power systems with evolving measurement infrastructures and heterogeneous measurement technologies.

These results indicate that combining physically grounded estimation models with probabilistic inference provides a powerful paradigm for robust bad data identification in modern power system monitoring environments.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article. The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

ChatGPT (GPT-5.3) was used to improve the clarity and readability of the technical writing. The author reviewed and edited the output and takes full responsibility for the content.

Conflicts of Interest

The author Gabriele D’Antona declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Appendix A. Evidence with Gaussian Likelihood and Prior

The principal error evidence defined in (54) can be written as

p (ξ ∣ H_{k}) = \int_{R^{M}} L (Δ d; ξ) \cdot π^{-} (Δ d ∣ H_{k}) d^{M} Δ d

Assuming a Gaussian likelihood function

L (Δ d; ξ) = \frac{1}{\sqrt{{(2 π)}^{ν - N}}} e^{- 0.5 {(ξ - T^{†} Δ d)}^{T} (ξ - T^{†} Δ d)}

and a Gaussian prior distribution

π^{-} (Δ d ∣ H_{k}) = \frac{1}{\sqrt{{(2 π)}^{M} | Σ_{k} |}} e^{- 0.5 {(Δ d - η_{k})}^{T} Σ_{k}^{- 1} (Δ d - η_{k})}

the evidence integral can be equivalently expressed as

p (ξ ∣ H_{k}) = \frac{\int_{R^{M}} e^{- 0.5 (Δ d^{T} A_{k} Δ d + b_{k}^{T} Δ d + c_{k})} d^{M} Δ d}{\sqrt{{(2 π)}^{M + ν - N} | Σ_{k} |}}

(A1)

where

\left\{\begin{matrix} A_{k} = Σ_{k}^{- 1} + {(T^{†})}^{T} T^{†} \\ b_{k} = - 2 [Σ_{k}^{- 1} η_{k} + {(T^{†})}^{T} ξ] \\ c_{k} = η_{k}^{T} Σ_{k}^{- 1} η_{k} + ξ^{T} ξ \end{matrix}\right.

(A2)
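
As a brief check of (A2), the sum of the two quadratic forms appearing in the exponents of the likelihood and of the prior can be expanded term by term, using the symmetry of the inverse covariance matrix:

{(ξ - T^{†} Δ d)}^{T} (ξ - T^{†} Δ d) + {(Δ d - η_{k})}^{T} Σ_{k}^{- 1} (Δ d - η_{k}) = Δ d^{T} [Σ_{k}^{- 1} + {(T^{†})}^{T} T^{†}] Δ d - 2 {[Σ_{k}^{- 1} η_{k} + {(T^{†})}^{T} ξ]}^{T} Δ d + η_{k}^{T} Σ_{k}^{- 1} η_{k} + ξ^{T} ξ

Matching the quadratic, linear, and constant terms with those of the exponent in (A1) gives the three coefficients listed in (A2).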

The quadratic form in the exponent of (A1) can be rewritten by completing the square as

Δ d^{T} A_{k} Δ d + b_{k}^{T} Δ d + c_{k} = {(Δ d - μ_{k})}^{T} A_{k} (Δ d - μ_{k}) + d_{k}

with

\left\{\begin{matrix} μ_{k} = - 0.5 A_{k}^{- 1} b_{k} \\ d_{k} = c_{k} - 0.25 b_{k}^{T} A_{k}^{- 1} b_{k} \end{matrix}\right.

(A3)

Since the integral of a multivariate Gaussian probability density function over

R^{M}

equals unity, the integral in (A1) evaluates to

\int_{R^{M}} e^{- 0.5 [{(Δ d - μ_{k})}^{T} A_{k} (Δ d - μ_{k}) + d_{k}]} d^{M} Δ d = e^{- 0.5 d_{k}} \sqrt{{(2 π)}^{M} | A_{k}^{- 1} |}

Substituting this result yields the closed-form expression for the evidence:

p (ξ ∣ H_{k}) = \frac{e^{- 0.5 d_{k}}}{\sqrt{{(2 π)}^{ν - N} | Σ_{k} | \cdot | A_{k} |}}

(A4)
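
The closed form (A4) can be sanity-checked numerically in the simplest setting. The Python sketch below is a scalar illustration only (M = 1 and a single principal error, so that the matrix T^† reduces to a number `t` and the prior covariance to a variance `sigma2`); the function names and test values are hypothetical and chosen just for the check. It compares (A4) with a brute-force trapezoidal evaluation of the evidence integral and with the Gaussian marginal of ξ, which has mean t·η and variance 1 + t²·σ² by the standard result for a linear Gaussian model.

```python
import math


def evidence_closed_form(xi, t, eta, sigma2):
    """Scalar version of (A2)-(A4)."""
    a = 1.0 / sigma2 + t * t                  # A_k
    b = -2.0 * (eta / sigma2 + t * xi)        # b_k
    c = eta * eta / sigma2 + xi * xi          # c_k
    d = c - 0.25 * b * b / a                  # d_k from (A3)
    return math.exp(-0.5 * d) / math.sqrt(2.0 * math.pi * sigma2 * a)


def evidence_numerical(xi, t, eta, sigma2, half_width=12.0, n=20001):
    """Trapezoidal integration of likelihood times prior over the gross error."""
    sigma = math.sqrt(sigma2)
    lo = eta - half_width * sigma
    hi = eta + half_width * sigma
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        dd = lo + i * h
        like = math.exp(-0.5 * (xi - t * dd) ** 2) / math.sqrt(2.0 * math.pi)
        prior = math.exp(-0.5 * (dd - eta) ** 2 / sigma2) / math.sqrt(
            2.0 * math.pi * sigma2
        )
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end points
        total += weight * like * prior
    return total * h


def evidence_marginal(xi, t, eta, sigma2):
    """Same quantity from the marginal xi ~ N(t*eta, 1 + t^2*sigma2)."""
    var = 1.0 + t * t * sigma2
    return math.exp(-0.5 * (xi - t * eta) ** 2 / var) / math.sqrt(
        2.0 * math.pi * var
    )


xi, t, eta, sigma2 = 1.3, 0.8, 0.5, 2.0   # arbitrary test values
closed = evidence_closed_form(xi, t, eta, sigma2)
numeric = evidence_numerical(xi, t, eta, sigma2)
marginal = evidence_marginal(xi, t, eta, sigma2)
print(closed, numeric, marginal)
```

The three values should agree to within numerical integration error. In the scalar case the agreement between the first and third functions is exact, because σ²·A_k = 1 + t²·σ² and d_k reduces to (ξ − t·η)² / (1 + t²·σ²).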

Figure 1. EWLS estimator input and output quantities.

Figure 2. IEEE-14 bus test system. Bus indices, generators, and transmission network are shown. Arrows indicate the reference direction of power flows. A shunt capacitive compensation is connected at bus 9.

Figure 3. Identification frequency for class-A gross error models as a function of the injected gross error model index. Markers represent the identification frequency, with color intensity indicating its magnitude. Vertical dashed lines mark four representative scenarios: measurement 13 (correct, unambiguous), measurement 27 (correct but ambiguous), measurement 23 (misidentification within the same class), and measurement 67 (misidentification across classes): (a) identification frequency aggregated by error classes; (b) detailed identification frequency within class-A.

Figure 4. Boxplots of the posterior probability

π^{+} (H_{13}^{(A)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error applied to measurement 13.

Figure 5. Boxplots of the posterior probabilities

π^{+} (H_{27}^{(A)} ∣ \tilde{ξ})

and

π^{+} (H_{65}^{(A)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error applied to measurement 27.

Figure 6. Boxplots of the posterior probabilities

π^{+} (H_{23}^{(A)} ∣ \tilde{ξ})

,

π^{+} (H_{57}^{(A)} ∣ \tilde{ξ})

, and

π^{+} (H_{66}^{(A)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross error applied to measurement 23.

Figure 7. Identification frequency for class-B gross error models as a function of the injected gross error model index. Markers represent the identification frequency, with color intensity indicating its magnitude. Vertical dashed lines mark three representative scenarios: bus 1 (correct and unambiguous identification), bus 14 (correct but ambiguous identification), and bus 6 (misidentification across classes): (a) identification frequency aggregated by error classes; (b) detailed identification frequency within class-B.

Figure 8. Boxplots of the posterior probability

π^{+} (H_{1}^{(B)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors applied to all instruments connected to bus 1.

Figure 9. Boxplots of the posterior probabilities

π^{+} (H_{13}^{(B)} ∣ \tilde{ξ})

and

π^{+} (H_{14}^{(B)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors applied to all instruments connected to bus 14.

Figure 10. Identification frequency for class-C gross error models as a function of the injected gross error model index. Markers represent the identification frequency, with color intensity indicating its magnitude: (a) identification frequency aggregated by error classes; (b) detailed identification frequency within class-C.

Figure 11. Boxplots of the posterior probability

π^{+} (H_{237}^{(C)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors applied to the pair of measurements 4 (

V_{6}

) and 40 (

P_{9, 10}

).

Figure 12. Identification frequency for class-D gross error models as a function of the injected gross error model index. Markers represent the identification frequency, with color intensity indicating its magnitude: (a) identification frequency aggregated by error classes; (b) detailed identification frequency within class-D.

Figure 13. Boxplots of the posterior probability

π^{+} (H_{18}^{(D)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors applied to all instruments connected to the pair of buses 2 and 7.

Figure 14. Identification frequency for class-E gross error models as a function of the injected gross error model index. Markers represent the identification frequency, with color intensity indicating its magnitude: (a) identification frequency aggregated by error classes; (b) detailed identification frequency within class-E.

Figure 15. Boxplots of the posterior probability

π^{+} (H_{580}^{(E)} ∣ \tilde{ξ})

as a function of the bound of the uniform distribution used to model the gross errors applied to measurement 28 (

P_{1, 2}

) and all instruments connected to bus 9.

Table 1. Power Flow Analysis Results.

Bus	Type	Voltage (p.u.)	Angle (rad)
1	Vδ	1.06000	0
2	PV	1.04500	−0.08696
3	PV	1.01000	−0.22210
4	PQ	1.01767	−0.17999
5	PQ	1.01951	−0.15313
6	PV	1.07000	−0.24820
7	PQ	1.06152	−0.23317
8	PV	1.09000	−0.23317
9	PQ	1.05593	−0.26073
10	PQ	1.05099	−0.26350
11	PQ	1.05691	−0.25815
12	PQ	1.05519	−0.26312
13	PQ	1.05038	−0.26453
14	PQ	1.03553	−0.27984

The bus-type “Vδ” represents the slack node, “PV” represents a generator bus, and “PQ” represents a load bus.

Table 2. Measurement configuration.

Type	From/To Buses
1	{(1, 0), (2, 0), (3, 0), (6, 0), (8, 0)}
2 & 3	{(1, 0), (2, 0), (3, 0), (4, 0), (6, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0), (14, 0)}
4 & 5	{(1, 2), (1, 5), (2, 4), (2, 3), (2, 5), (3, 4), (4, 2), (4, 3), (4, 7), (4, 9), (5, 4), (5, 6), (6, 12), (6, 13), (7, 9), (9, 10), (9, 14), (10, 11), (11, 6), (12, 13), (13, 14)}

The measurement-type “1” represents a “voltage magnitude”, “2” a “real power injection”, “3” a “reactive power injection”, “4” a “real power flow”, and “5” a “reactive power flow”.
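
The sizes of the five hypothesis classes can be recovered from the instrument counts in Table 2. The short Python sketch below reproduces them under two assumptions that are consistent with the figures reported in the case studies but are stated here as assumptions: each class is enumerated combinatorially (all instruments, all buses, all instrument pairs, all bus pairs, all bus–instrument pairs), and the state vector has N = 2 × 14 − 1 = 27 components (14 voltage magnitudes and 13 angles relative to the slack bus). The variable names are hypothetical.

```python
from math import comb

# Number of instruments per measurement type, counted from Table 2.
n_voltage = 5        # type 1: voltage magnitudes
n_injection = 11     # types 2 and 3: 11 buses each
n_flow = 21          # types 4 and 5: 21 from/to pairs each
n_buses = 14

n_meas = n_voltage + 2 * n_injection + 2 * n_flow
n_states = 2 * n_buses - 1   # assumption: slack-bus angle taken as reference

# Assumed combinatorial enumeration of the five gross-error classes.
counts = {
    "A": n_meas,                 # single instrument
    "B": n_buses,                # single bus
    "C": comb(n_meas, 2),        # two instruments
    "D": comb(n_buses, 2),       # two buses
    "E": n_buses * n_meas,       # one bus plus one instrument
}
total = sum(counts.values())
redundancy = n_meas - n_states   # number of principal errors, nu - N

print(counts, total, redundancy)
```

With these counts, the enumeration gives 14, 2346, 91, and 966 models for classes B to E, a total of 3486 models, and 42 principal error directions, in agreement with the values used in the case studies.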
