A Data Driven Approach to Robust Event Detection in Smart Grids Based on Random Matrix Theory and Kalman Filtering †

: Increasing levels of complexity, due to growing volumes of renewable generation with an associated inﬂux of power electronics, are placing increased demands on the reliable operation of modern power systems. Consequently, phasor measurement units (PMUs) are being rapidly deployed in order to further enhance situational awareness for power system operators. This paper presents a novel data-driven event detection approach based on random matrix theory (RMT) and Kalman ﬁltering. A dynamic Kalman ﬁltering technique is proposed to condition PMU data. Both simulated and real PMU data from the transmission system of Great Britain (GB) are utilized in order to validate the proposed event detection approach and the results show that the proposed approach is much more robust with regard to event detection when applied in practical situations.


Introduction
The development and increasing complexity of modern power systems is leading to growing concerns over power system operational security, with network management and analysis becoming increasingly challenging. Traditional monitoring devices, employed in supervisory control and data acquisition (SCADA) systems, have become inefficient for this challenge [1]. As a result, phasor measurement units (PMUs) are being deployed to measure phasors of bus voltages and currents, enabling real-time state monitoring in power systems [2,3]. Due to PMUs' higher sampling rates and accurate time synchronization, a huge amount of synchronous data is now being collected from power systems globally [4,5], facilitating the detection of transient events.
At present, event detection is carried out following either a model-driven approach (supervised) or a data-driven approach (unsupervised). Model-driven approaches include neural networks [6,7], decision trees [8,9], and principal component analysis (PCA) [10][11][12][13][14]. However, for supervised learning, it is argued that if the training dataset is selected inappropriately or insufficiently, it will have a significantly negative effect on the performance of the approach when it is applied online. Therefore, much attention has been paid to the employment of unsupervised learning in this field [15][16][17][18], including wavelet-based approaches, energy function-based approaches, and frequency difference-based approaches.
More recently, as a data-driven approach, random matrix theory (RMT) was introduced and applied to early event detection in power systems [19]. However, PMU data are normally collected with significant noise levels and plenty of missing measurements, due to communication issues between PMUs and phasor data concentrators (PDCs) [20,21] and further issues in the measurement processes. When the above approach is faced with significantly noisy and missing data, it will be unable to accurately detect abnormal events.
A large amount of research has been presented in the field of event detection for power systems. In [7], Vasilic and Kezunovic presented an approach for transmission line fault classification building on an adaptive resonance theory (ART) neural network and fuzzy logic. This work applies a fuzzy decision rule to the output of the ART neural network to improve selectivity for a variety of real events. In [9], Adhikari et al. presented a method to effectively classify traditional power system contingencies and cyber-attacks in real time. It combines Hoeffding adaptive trees with a drift detection method and adaptive windowing to update the model continuously.
In recent years, PCA has been widely applied for early event detection in power systems. For example, Xie et al. [10] analyzed the dimensionality of PMU data under both normal and abnormal conditions and employed PCA to detect abnormal events based on the change of core subspaces of PMU data at the occurrence of an anomaly. Meanwhile, this model has an adaptive training mechanism that updates the model when an event happens. A similar work was carried out in [11] that uses a sliding window PCA to detect and classify multiple events, including islanding, loss of load, and loss of generation. However, different from the work presented in [10], this method uses a sliding window to update the PCA model in real time to adapt to the time varying behaviors of power systems. Another similar event detection method was developed by Guo et al. [12], which focuses on islanding detection for distributed generation systems. In this work, a recursive PCA algorithm is employed to analyze the time-varying behaviors of power systems for the purpose of reducing false alarms. Although a wide range of machine learning techniques have been applied to event detection and classification, these methods are mainly based on model-driven supervised learning. As mentioned above, when the sample space is chosen improperly or insufficiently, an ill-trained model may be obtained.
In order to address the limitations of supervised learning techniques, a few works have considered unsupervised learning techniques. For instance, Kim et al. designed a wavelet-based detection method using PMU data [15]. The key idea is to monitor the energy of coefficients within a sliding window in real time and to determine whether the energy value exceeds the threshold. Apart from this work, another data-driven analytics method, which relies on the rules created by PMU data, was proposed for power system fault detection in [18]. However, it should be noted that these unsupervised approaches need to pre-define an appropriate threshold in detection of events, which is a challenging issue. The RMT-based event detection approach presented in [19] does not need to predefine a threshold, but this approach does not take into account the quality aspects of PMU data in dealing with noisy and missing PMU measurements. This paper presents an event detection approach that improves the performance of the work presented in [19], making it much more robust when dealing with corrupted and significantly noisy samples of PMU data. To be specific, a robust event detection approach is developed by combining RMT and Kalman filtering. Furthermore, a dynamic Kalman filter is developed to condition PMU data. Finally, the proposed event detection approach is tested and validated on the standard IEEE 118-bus system and the transmission system of Great Britain (GB). The results show that the proposed approach is much more robust in practical situations that include significant levels of noisy or missing PMU measurements.
The remainder of this paper is organized as follows. Section 2 briefly introduces random matrix theory and formulates a dynamic quadratic prediction model based Kalman filtering technique. Section 3 combines RMT with Kalman filtering for event detection. Section 4 evaluates the performance of the proposed event detection approach on both simulated and real PMU data. Finally, Section 5 concludes the paper and points out some future work.

Random Matrix Theory and Kalman Filtering
Random matrix theory is first introduced briefly and then a dynamic quadratic prediction model based Kalman filtering technique is formulated.

Random Matrix Theory
RMT has proven to be effective for power system analysis, and a wide range of laws and theorems, such as Ring Law and Marchenko-Pastur Law, are included in RMT, which have been applied in power systems for different purposes [19,22,23]. In this paper, for the purpose of event detection, Ring Law, as the theoretical basis of data processing of event detection, and mean spectral radius (MSR), as an event indicator, will be briefly introduced.

Ring Law
Let X s ∈ C N×T be a standard non-Hermitian random matrix, whose entries are independent and identically distributed (i.i.d.) variables with where x s,i is the ith row vector of X s . For L standard non-Hermitian random matrices X s,i (i = 1, 2, . . . , L), a matrix product is defined as where X u,i is the singular value equivalent of X s,i . The matrix product Z can be transformed to the standard matrix product Z s , whose σ 2 (z s,i ) = 1/N in each row. Thus, the empirical spectral density (ESD) of Z s converges almost surely to the limit given by As N, T → ∞ with a constant ratio c = N/T ∈ (0, 1], N is the number of variables, and T is the time length.

MSR
Linear eigenvalue statistics indicate the statistical features of random matrices. A linear eigenvalue statistic of a random matrix X is defined as where λ i (i = 1, 2, . . . , n) are eigenvalues of X, and φ(·) is a test function. MSR is used to indicate the eigenvalue distribution of a random matrix. For the standard matrix product Z s (as mentioned above in Ring Law), MSR is formulated as where λ Z s ,i (i = 1, 2, . . . , N) are eigenvalues of Z s , and |λ Z s ,i | is the radius of λ Z s ,i on the complex plane.

Data Processing of Ring Law
Within a raw data source Ω, a raw random matrix X ∈ C N×T can be formed through a sliding window. Then, X can be transformed to a standard non-Hermitian matrix X s ∈ C N×T row by row as follows: where x i and x s,i are the ith row vectors of X and X s , respectively, and µ(x s,i ) = 0, σ 2 (x s,i ) = 1. Afterward, the matrix X u ∈ C N×N is introduced as the singular value equivalent of X s by where X s T is the transpose of X s , and U ∈ C N×N is a Haar unitary matrix, X u X u T ≡ X s X s T . However, for a series of arbitrary non-Hermitian random matrices X i (i = 1, 2, . . . , L) from the raw data source Ω, the matrix product Z = ∏ L i=1 X u,i ∈ C N×N is obtained. Then, it is converted to the standard matrix product Z s ∈ C N×N through the following formula: where z i and z s,i are the ith row vectors of Z and Z s , respectively.

Dynamic Kalman Filtering Technique
As a state estimation technique, Kalman filtering is a popular method for conditioning PMU data in power systems [24][25][26]. Jones et al. [27] proposed a quadratic prediction model-based Kalman filtering method to cleanse and recover PMU data easily. However, within this method, the measurement noise covariance matrix R is constant, thus leading to inaccurate PMU data conditioning in some situations. As a result, this paper makes an improvement by dynamically adapting the measurement noise covariance matrix R to the conditioning process.
Following [27], the classic model of Kalman filtering is defined as follows: For the purpose of optimization, the classic model of Kalman filtering can be expressed in the recursive way: According to [28], the quadratic prediction model is formulated bŷ In addition, each state vector is formed by a sliding window, which contains three successive snapshots of the system states and moves forward only one snapshot at a time. Thus, in order to estimate the next state, the state vectorsx(t|t) andx(t + 1|t) are respectively expressed as follows [27]: are all complex values, including magnitude and phase angle. As for Φ(t + 1, t) and H(t + 1), they can be expressed as constant matrices shown in (14) and (15): H(t + 1) = 1 0 0 (15) K(t + 1) is specified by the following formulas: where R and Q are constant scalar values in this method. Furthermore, it is assumed that the Kalman filtering process starts at t = 3,x(3|3) , and P(3|3) are the matrixes with 1 s and 0 s, respectively. This is to make the Kalman filter track the optimal estimate fast within a short period.
As the measurement noise covariance matrix R has a significant influence on the performance of Kalman filtering, this parameter will be adjusted dynamically. First of all, a residue between the a posteriori estimate and the actual measurement is calculated at each time t as follows [29]: where r(t) is the residue at time t, and z(t) = z(t). For a period, a residue vector is formed as: where r(t + 1) is the residue vector at time t + 1, and W is the time length. Then, the variance of the residue vector is calculated as the measurement noise covariance matrix shown below: where R(t + 1) is the measurement noise covariance matrix at time t + 1. Here, it is noted that Var(r(t + 1)) is the variance of a series of complex values. According to (18) and (19), it can be easily seen that R(t + 1) is obtained on the basis of W previous residues-namely, from time t − W + 1 to time t. However, when the noise is weak, R(t + 1) approaches 0. As a result, whenx(t + 1|t + 1) is calculated, z(t + 1) is almost trusted fully, whilex(t + 1|t) is hardly trusted, thus failing to reduce noise effectively. To solve this problem, R min is introduced as the lower bound for R(t + 1): R min=α (20) where α is a small value. When R(t + 1) is less than R min , R(t + 1) is set to R min by default. Additionally, it is supposed that R(t + 1) = R min within the initial period W.
Similarly, when missing data exist, which is processed as 0 in this paper, R max is defined as the upper bound for R(t + 1) to enhance the trust inx(t + 1|t) : where β is a large value. Specifically, in the presence of missing data at time t + 1, R(t + 1) is set to R max to recover the missing data. This practice is based on the fact that z(t + 1) is far from the real value at time t + 1 and can be hardly trusted.

Real-Time Event Detection
To perform real-time analysis, an N × T real-time sliding window is utilized to obtain the raw matrix X from the raw PMU data source Ω. Let N denote the number of voltage magnitudes and phase angles and T denote the time period. At time t, the raw matrix X(t) is formed as follows: where In order to add white Gaussian noise into the original data source O to obtain the raw data source Ω, the signal-to-noise ratio (SNR) is defined as where Tr(·) denotes the matrix trace, G is the white Gaussian noise matrix, whose entries follow the standard normal distribution, m is the magnitude of white Gaussian noise, and G T and O T are the transposes of G and O, respectively. During a period T max , the raw data source Ω is obtained through the following formulas: where Ω, O, and G are all N × T max matrixes. As the event indicator, MSR is calculated at each time interval and finally visualized to detect events. For simplicity, we set L = 1 when forming the matrix product Z.
To mitigate against false positives, the dynamic Kalman filter also serves as the data conditioner. Thus, after data conditioning, the conditioned raw matrixX(t) is formed as follows: wherex(t) = x 1 (t|t)x 2 (t|t) · · ·x N (t|t) T is the conditioned data at time t. As a result, the procedure of real-time event detection based on random matrix theory and Kalman filtering is shown in Table 1.  (26) 6: TransformX(t) to X s (t) using (6) 7: Transform X s (t) to X u (t) using (7) 8: Form z(t) using (2) 9: Transform z(t) to z s (t) using (8) 10: Calculate MSR using (5) 11: If t < T max , repeat 2-10 at the next time t + 1; otherwise, go to 12 12: Visualize MSR

Experimental Results and Discussion
In this section, the improved event detection approach is tested and compared with the work [19] using both simulated and real PMU data. The simulated data were generated based on the standard IEEE 118-bus system using MATPOWER [31], while the real PMU data were collected from the GB transmission system. Besides, voltage magnitude (in p.u.) and phase angle (in rad) were used, respectively, to validate the feasibility and effectiveness of the improved event detection approach. The SNRs of voltage magnitude and phase angle were kept the same in the tests. All the experiments were conducted on a common desktop using the matlab programming language.

Noise Reduction
A comparison was performed between the dynamic and original Kalman filters. During the whole period T max = 1500, there was a discrete event (an increase of active power demand at bus 60) happening at t = 601. In this case, the constant R of the original Kalman filter was 1 × 10 −2 . The results are shown in Table 2. In order to eliminate the effect of the error, root mean squared error (RMSE) is calculated during t = 501-1500. In Table 2, it is clearly shown that the RMSE value of voltage magnitude of the dynamic Kalman filter is almost the same as that of the original Kalman filter when SNR is greater than or equal to 25, while the RMSE value of voltage magnitude of the dynamic Kalman filter is less than that of the original Kalman filter when SNR is less than or equal to 15. The RMSE value of voltage phase angle follows similar to that of voltage magnitude.

Missing Data Recovery
A further comparison was performed between the dynamic and original Kalman filters in terms of missing data recovery, as shown in Table 3. In this case, the same discrete event happened to bus 60, and RMSE was still calculated during t = 501-1500, and the constant R of the original Kalman filter remained 1 × 10 −2 .
In Table 3, for voltage magnitude, it can be seen that the RMSE of the dynamic Kalman filter is significantly lower than that of the original Kalman filter. It is worth noting that as the percentage of missing data goes up, the RMSE of the dynamic Kalman filter only increases slightly from 6 × 10 −4 to 13 × 10 −4 , while the RMSE of the original Kalman filter increases dramatically from 3.08 to 16.03. However, for voltage phase angle, it is shown clearly in Table 3 that the RMSE of the dynamic Kalman filter is only slightly lower than that of the original Kalman filter. Moreover, when the percentage of missing data rises, both cases go up rather slowly.

Event Detection Using Voltage Magnitudes
Event detection was conducted in situations with heavy noise and missing data using voltage magnitudes. Different signals were used to test the performance of the improved and original event detection approaches. The defined signals are shown in Table 4.  Figure 1 shows the MSR-t curves of the improved and original event detectors. It is noted that event detection starts after a short period because the sliding window needs to be filled up with samples and the dynamic Kalman filter takes a short time to track the real values.
In Figure 1a, it can be observed from the MSR-t curve of the improved event detector that signals can be detected easily. Specifically, during t = 260-600, MSR remains stable around 0.77, which means that the system state is normal without any signals. Then, at t = 600, MSR starts to decrease gradually until t = 770 (from 0.7733 to 0.7462). After that, MSR begins to increase to the original stable level (0.7674) until t = 940, then MSR remains almost constant. During this period, a U-shaped curve is observed, which indicates that there are signals changing the system state during t = 601-940. In addition, the effect of a signal on MSR extends T extra instants, due to historical data in the sliding window. As a result, the actual duration of the signals can be calculated as 940 -601 + 1 -T = 100 instants, and the occurrence time of the signals is the starting time of the U-shaped curve, which is t = 601. That corresponds exactly to the fact that there is an increase of active power demand at bus 60 during t = 601-700. As a result, event detection is conducted in this way. Likewise, there is a second U-shaped curve from t = 1051 to t = 1390, which indicates that the signals occur at t = 1051, and the duration of the signals is 100 instants. However, for the original event detector, signals cannot be detected under the same situation. In other words, there is no clear U-shaped curve during the whole period because MSR changes slightly around 0.86 during t = 600-1390.  In Figure 1a, it can be observed from the MSR-t curve of the improved event detector that signals can be detected easily. Specifically, during t = 260-600, MSR remains stable around 0.77, which means that the system state is normal without any signals. Then, at t = 600, MSR starts to decrease gradually until t = 770 (from 0.7733 to 0.7462). After that, MSR begins to increase to the original stable level (0.7674) until t = 940, then MSR remains almost constant. During this period, a U-shaped curve is observed, which indicates that Apart from single signals, multiple signals can be also discovered by the improved event detector in a situation with heavy noise, as shown in Figure 1b. According to the MSR-t curve of the improved event detector, two types of signals can be detected. Firstly, two U-shaped curves are found during t = 601-940 and t = 1201-1540, and it indicates continuous signals during t = 601-700 and t = 1201-1300, respectively. Secondly, a third U-shaped curve is found during t = 601-1540, which indicates continuous signals during t = 601-1300. However, the same conclusion cannot be drawn from the MSR-t curve of the original event detector. Figure 2 depicts the MSR-t curves of the improved and original event detectors when the percentage of missing data is 30%. As the dynamic Kalman filter also needs to take a short time to track the real values in the situation with missing data, event detection starts after a short period. In Figure 2a, according to the MSR-t curve of the improved event detector, signals can be detected in the same way mentioned above. To be specific, two U-shaped curves are observed during t = 601-940 and during t = 1051-1390. However, the original event detector is unable to detect signals under the same situation because U-shaped curves cannot be found during the signal occurrence period. In addition, as shown in Figure 2b, multiple signals can still be discovered by the improved event detector in the situation with missing data. By contrast, the same signals cannot be discovered by the original event detector. In Figure 2a, according to the MSR-t curve of the improved event detector, signals can be detected in the same way mentioned above. To be specific, two U-shaped curves are observed during t = 601-940 and during t = 1051-1390. However, the original event detector is unable to detect signals under the same situation because U-shaped curves cannot be found during the signal occurrence period. In addition, as shown in Figure 2b, multiple signals can still be discovered by the improved event detector in the situation with missing data. By contrast, the same signals cannot be discovered by the original event detector.

Event Detection Using Voltage Phase Angles
Event detection was also conducted using voltage phase angles with respect to the slack bus in situations with heavy noise and missing data. However, as voltage magnitudebased event detection is quite similar to voltage phase angle-based event detection, only swell and sag signals (as shown in Table 4) were employed to test both approaches. Figure 3a describes the MSR-t curves of the improved and original event detectors when SNR is 20, while Figure 3b shows the MSR-t curves of the improved and original event detectors when SNR is 35 and the percentage of missing data is 30%. In Figure 3a,b, regardless of heavy noise or missing data, a U-shaped curve can be observed from t = 601 to t = 940 and from t = 1051 to t = 1390, respectively, for the improved event detector. By contrast, for the original event detector, there is no U-shaped curve found during the whole period in both situations.

Event Detection Using Voltage Phase Angles
Event detection was also conducted using voltage phase angles with respect to the slack bus in situations with heavy noise and missing data. However, as voltage magnitude-based event detection is quite similar to voltage phase angle-based event detection, only swell and sag signals (as shown in Table 4) were employed to test both approaches. Figure 3a describes the MSR-t curves of the improved and original event detectors when SNR is 20, while Figure 3b shows the MSR-t curves of the improved and original event detectors when SNR is 35 and the percentage of missing data is 30%. In Figure 3a,b, regardless of heavy noise or missing data, a U-shaped curve can be observed from t = 601 to t = 940 and from t = 1051 to t = 1390, respectively, for the improved event detector. By contrast, for the original event detector, there is no U-shaped curve found during the whole period in both situations.

Event Detection Using Real PMU Data
Real PMU data were also utilized to further validate the robustness of the proposed event detection approach. The PMU data were collected from 13 PMUs of the GB transmission system [32,33], and the sampling rate of the PMUs was 50 Hz. Besides, the total time length of the PMU data was 1 min, and thus there were 50 × 60 = 3000 samples in total for each PMU. What is more, the missing samples in the PMU data accounted for 12.5%. In this case, N was 13 and T was 500. The results are shown in Figures 4 and 5.

Event Detection Using Real PMU Data
Real PMU data were also utilized to further validate the robustness of the proposed event detection approach. The PMU data were collected from 13 PMUs of the GB transmission system [32,33], and the sampling rate of the PMUs was 50 Hz. Besides, the total time length of the PMU data was 1 min, and thus there were 50 × 60 = 3000 samples in total for each PMU. What is more, the missing samples in the PMU data accounted for 12.5%. In this case, N was 13 and T was 500. The results are shown in Figures 4 and 5.   In Figure 4a or Figure 5a, it can be observed that there is a U-shaped curve during t = 1777-2387 in the MSR-t curves of the improved event detector. This indicates that there are signals occurring at t = 1778 and lasting for 2387 − 1778 + 1 − T = 110, which is 110/50 = 2.2 s. According to Figure 4b or Figure 5b, the voltage magnitude and phase angle of a certain PMU change dramatically during t = 1778-1887, which confirms the above inference. In fact, there is indeed a tripping event from t = 1778 to t = 1887 across the GB transmission system. However, the original event detector fails to detect this event since no U-shaped curve is seen throughout the period. In Figure 4a or Figure 5a, it can be observed that there is a U-shaped curve during t = 1777-2387 in the MSR-t curves of the improved event detector. This indicates that there are signals occurring at t = 1778 and lasting for 2387 − 1778 + 1 − T = 110, which is 110/50 = 2.2 s. According to Figure 4b or Figure 5b, the voltage magnitude and phase angle of a certain PMU change dramatically during t = 1778-1887, which confirms the above inference. In fact, there is indeed a tripping event from t = 1778 to t = 1887 across the GB transmission system. However, the original event detector fails to detect this event since no Ushaped curve is seen throughout the period.

Conclusions
This paper presents a robust event detection approach based on random matrix theory and Kalman filtering. In addition, as the data conditioner, a dynamic Kalman filtering technique was developed through the adjustment of the measurement noise covariance

Conclusions
This paper presents a robust event detection approach based on random matrix theory and Kalman filtering. In addition, as the data conditioner, a dynamic Kalman filtering technique was developed through the adjustment of the measurement noise covariance matrix in order to reduce noise and recover missing samples in PMU data. The experimental results show that the dynamic Kalman filter outperforms the original one, in terms of both noise reduction and missing data recovery. Furthermore, the improved event detection approach is much more robust than the original one, especially in practical situations where PMU data exhibit significant noise levels and large numbers of missing samples. The proposed event detection approach is also capable of performing computationally efficient event detection in real time in practice, without pre-training a model.
In this paper, the proposed approach only aimed to identify the occurrence time and the duration period of an event, without locating and classifying the event. Hence, in future research, the authors will focus on the location and classification of events to further develop and demonstrate the proposed robust approach in terms of event detection.