1. Introduction
Many industries, including power systems, are being modernized as technology advances [1]. High-speed internet is now the primary means of communication between the various sectors of the power grid. Cyber-attacks pose a significant threat to these industries because their technologies increasingly rely on wireless communications, and most power system operations, such as energy management programs, state estimation, and optimal power flow, depend on safe and reliable communications [3]. Dynamic state estimation (DSE) is an important tool for monitoring and controlling the power network, especially when the system is operating in the transient mode [4]. It tracks the behavior of the power system during transients and usually employs Kalman filters to perform the estimation. Modernized power systems, known as smart grids, rely heavily on wireless communication, which makes them vulnerable to cybercriminals who can tamper with data derived from phasor measurement units (PMUs) [5]. DSE usually takes PMU data as inputs for estimating the dynamic behavior of the power system, so the communication link between the PMUs and the central control is both critically important and vulnerable to cyber-attacks.
Because traditional methods based on various types of Kalman filters are effective against only certain types of cyber-sabotage [5], a machine learning-enhanced method is needed to improve the detection of these attacks. In this article, clustering and regression techniques are used to tackle these threats, and the results are compared with previous methods of defending the system against such attacks. As the traditional power system transitions to the more intelligent smart grid, the threat posed by cybercriminals is unavoidable, and as attackers inject more sophisticated attacks, traditional linear and non-linear detection techniques, such as Kalman filters, appear to be rendered ineffective.
Kalman filters were first used in the 1970s, when the term “Dynamic State Estimation” was introduced [6]. As related techniques developed, more advanced filters were used for accurate and robust estimation of dynamic states. In recent years, some studies have emphasized that DSE plays an essential role in dealing with electromechanical transient models and unknown inputs collected from PMUs [8]. Numerous methods for detecting and eliminating cyber-sabotage have been implemented, ranging from linear techniques to AI-based methods [11]. Applying different types of Kalman filters, such as the unscented Kalman filter (UKF) and the extended Kalman filter (EKF), to eliminate malfunctions and unknown data injections has been proposed in recent years, and these filters have been examined in various cyber-attack scenarios [12]. As cyber threats increase, power systems need better preparation for such attacks. False data injection (FDI) and denial of service (DoS) are two typical kinds of attack. Both consider the measurement equipment as the main target, but the former usually changes the mean value of the measured quantity, while the latter denies the transmission of data. FDI poses a considerable threat to network security: by changing the value of the measured data, it misleads the operators through a fundamental change in the monitored states of the system. The situation worsens when it comes to transient electromechanical states. DoS attacks manipulate the power system observers during the transient period of the system and may lead to large power outages through the incorrect decisions that operators make on the basis of the corrupted data. In 2019, a confirmed cyber-attack occurred in the US power grid, in which the intruders tried to manipulate the operators by targeting some sensitive PMUs in the power network [13]. Clustering and classification techniques, on the other hand, can provide a more effective means of facing these cyber-attacks [3]. Numerous AI-enhanced techniques are now used to separate standard data from anomalous data, a task known as anomaly detection. Hierarchical clustering (HC) is a well-known clustering method used on continuous data and time series [14]. Many classification techniques have been proposed in the field of machine learning. Many of them, however, are not suitable for processing challenging time series data. Decision tree regression (DTR) and neighboring techniques seem to be more useful for the continuous data generated in power networks.
It is known that the Kalman filter performs well under specific conditions, such as Gaussian noise [15]. In more complex situations, however, Kalman filters struggle to detect outliers, especially when the measurement noise does not obey the Gaussian assumption [16]. In [17], robust filters were implemented to tackle this problem by using the RCKF and CKF during electromechanical transients in the power transmission network. A non-linear control loop-based method was proposed in [18] as a technique to detect and eliminate cyber-attacks along with risk mitigation.
Wang et al. [19] present a Luenberger-based method for both cyber-attack detection and power system isolation. In recent years, the use of AI techniques has boosted power engineering, as demonstrated in [20] by the use of sequential hypothesis testing based on machine learning methods. In [21], the authors propose a machine learning-aided dynamic state estimation method, and in [22], the authors employ a variety of deep learning-based methods to detect false data injection. Ref. [23] reviews how to control the power system under cascading failures. In [24], a new Markov-based approach is proposed to detect DoS attacks, while in [25], applications of the extended Kalman filter are illustrated. An on-line DSE method is proposed in [26]. Ref. [27] uses novel Kalman filters to perform DSE. Reference [28] introduces hierarchical clustering applications for anomaly detection. The basics of the random tree for classification and regression were first introduced in [29], and in [30], the authors used the random tree method to recover clusters under random noise. In Towards Data Science, Lorraine Li used decision trees for regression and classification problems [31], and in [32], the authors propose a power system toolbox for MATLAB. An EKF/UKF toolbox is proposed in [33]. In [34], a toolbox named PSAT is proposed for dynamic analysis of the power system. It is worth noting that all of the mentioned toolboxes are employed to produce the simulation results in this paper.
Reference [35] proposes a machine learning-based approach to detect FDI attacks in the power system. In [36], the authors propose a linear approach to detect cyber-attacks and outliers in PMU-based power system state estimation, and in [37], a supervised learning-based approach is proposed to detect DoS attacks in smart grids.
Table 1 shows a comparison of the methods studied in this article and others in the same field. As illustrated in Table 1, this article contributes to the field in at least two ways. Firstly, it employs two machine learning-based methods to detect and eliminate the cyber-threats. Secondly, it draws a comparison between Kalman filters and a proposed hybrid machine learning method for the same purpose.
The rest of this paper is organized as follows. Section 2 formulates models for DSE, the fourth-order generator, the cyber-attacks, and the two Kalman filters utilized for simulation. The machine learning methods are described in Section 3. In Section 4, the proposed approach is detailed, and in Section 5, the proposed method is examined through different case studies, together with the simulation results. The paper is concluded in Section 6.
4. Problem Formulation
Both machine learning methods mentioned in the last section are employed to spot and eliminate cyber-attacks. To tackle the cyber-sabotage problem in the power system, a modified version of HC is employed. The main idea behind this approach is to identify and eliminate the anomalous data, and to reduce the features used as input to the DTR, so that the algorithm predicts the correct states of the system. The challenge of the proposed method is to maintain the high accuracy of its predicted states (rotor angle and rotor mechanical speed), since reducing the dimension of the main features strongly influences the accuracy of the method. The main features chosen in this work are as follows.
In which
is the vector of features, and the clustering distortion is formulated as below [
While si is the ith sample, the cluster centroids are , m is the number of clusters, and c = [1, 2, 3, …, k].
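For orientation, a common form of the clustering distortion (the within-cluster sum of squared distances) that is consistent with the definitions above is given below. This is a sketch under assumed notation rather than the exact expression of the original: the centroid symbol $\mu_j$, the cluster set $C_j$, and the index ranges are assumptions introduced here.

```latex
J \;=\; \sum_{j=1}^{m} \; \sum_{s_i \in C_j} \left\lVert s_i - \mu_j \right\rVert_2^{2},
\qquad
\mu_j \;=\; \frac{1}{\lvert C_j \rvert} \sum_{s_i \in C_j} s_i
```

Under this form, a sample that lies far from the centroid of its cluster contributes a large squared distance, so a sudden rise in $J$ for one cluster indicates that anomalous data have entered it.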
HC is a vital tool for detecting anomalies: when a data point is far from the other tree roots or leaves, it is usually clustered as an outlier with respect to the threshold set for the method. Therefore, for each measurement type (voltage, speed, etc.), an HC algorithm is used to detect the attacked data. As the data derived from the PMUs arrive, the HC accepts the new data and starts clustering. If a data point is clustered as an outlier, the algorithm sends it to the DTR, which uses the other features to predict the real value of the attacked data and replaces it. The predicted value is then sent to the DSE, while the HC deletes the attacked data from its database. If the data point is not an outlier, it is sent directly to the DSE for state estimation. Using a decision tree for regression requires an impurity metric appropriate for continuous variables, so we define the impurity measure as the weighted mean squared error (MSE) of the child nodes [
Nt is the number of samples at the leaf t, while Dt is the training subset,
is the true target value and
is the estimated target value. It is worth noting that the mentioned equations are used for the training process of the DTR.
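For reference, the MSE impurity commonly used for regression trees can be written as below. This is a sketch under assumed notation: $N_t$, $t$, and $D_t$ follow the definitions above, whereas $y^{(i)}$ (true target), $\hat{y}_t$ (estimated target, taken as the node mean), and the parent/child subscripts $p$, $L$, $R$ are assumptions introduced here.

```latex
\mathrm{MSE}(t) \;=\; \frac{1}{N_t} \sum_{i \in D_t} \left( y^{(i)} - \hat{y}_t \right)^{2},
\qquad
\hat{y}_t \;=\; \frac{1}{N_t} \sum_{i \in D_t} y^{(i)}

% Weighted impurity of a candidate split of parent node p into children L and R
I_{\text{split}} \;=\; \frac{N_L}{N_p}\,\mathrm{MSE}(L) \;+\; \frac{N_R}{N_p}\,\mathrm{MSE}(R)
```

A split is preferred when it lowers the weighted impurity of the children relative to the parent node, which is equivalent to reducing the variance of the target within each child.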
Figure 6 illustrates the flowchart of the proposed method.
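The flow described in this section (cluster the incoming data, flag outliers, and let a regression tree replace the flagged values) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the single-feature tree, the `min_size` rule for calling a cluster an outlier, and training the tree on the clean samples of the same window are all hypothetical choices made here for brevity.

```python
from statistics import mean


def single_linkage_1d(values, threshold):
    """Cut a 1-D single-linkage dendrogram at `threshold`; return one label per value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    cluster = 0
    for prev, cur in zip(order, order[1:]):
        # In 1-D, single linkage joins sorted neighbours whose gap is within the threshold.
        if values[cur] - values[prev] > threshold:
            cluster += 1
        labels[cur] = cluster
    return labels


def flag_outliers(values, threshold, min_size=2):
    """Flag samples that end up in clusters smaller than `min_size` (assumed outlier rule)."""
    labels = single_linkage_1d(values, threshold)
    sizes = {}
    for lab in labels:
        sizes[lab] = sizes.get(lab, 0) + 1
    return [sizes[lab] < min_size for lab in labels]


def sse(ys):
    """Sum of squared errors around the mean (N times the MSE impurity)."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(xs, ys, depth=3, min_leaf=2):
    """Fit a one-feature regression tree; splits minimise the children's summed squared error."""
    if depth == 0 or len(ys) < 2 * min_leaf:
        return mean(ys)  # leaf: predict the mean target
    pairs = sorted(zip(xs, ys))
    best = None
    for k in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical feature values
        cost = sse([y for _, y in pairs[:k]]) + sse([y for _, y in pairs[k:]])
        if best is None or cost < best[0]:
            best = (cost, k)
    if best is None:
        return mean(ys)
    k = best[1]
    split = (pairs[k - 1][0] + pairs[k][0]) / 2
    lx, ly = zip(*pairs[:k])
    rx, ry = zip(*pairs[k:])
    return (split, fit_tree(lx, ly, depth - 1, min_leaf), fit_tree(rx, ry, depth - 1, min_leaf))


def predict(tree, x):
    """Walk the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        split, left, right = tree
        tree = left if x <= split else right
    return tree


def clean_stream(target, feature, threshold):
    """Replace flagged `target` samples with tree predictions made from `feature`."""
    flags = flag_outliers(target, threshold)
    clean_x = [x for x, f in zip(feature, flags) if not f]
    clean_y = [y for y, f in zip(target, flags) if not f]
    tree = fit_tree(clean_x, clean_y)
    cleaned = [predict(tree, x) if f else y for y, x, f in zip(target, feature, flags)]
    return cleaned, flags
```

Minimising the summed squared error of the two children is the same as minimising their weighted MSE, because the two differ only by the constant factor of the parent's sample count. In the setting described in the text, the tree would instead be trained offline on contingency data and would use several features at once.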
For comparing different methods, some indices are defined and used as follows [
N is the number of samples while
are estimated and true states, respectively.
Sa and
Sad are the number of the attacked data and detected attacked data, respectively.
is the number of Monte-Carlo replications, which in this article is set to be 100.
are the end and the starting time of the period in which the cyber-attack was launched, respectively. It is clear that the first index is able to evaluate the estimation results, while the second one is the attack classification ratio. The last index represents the least squared error measure.
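Two of the quantities named above can be illustrated with a short sketch: the attack classification ratio built from Sa and Sad, and a squared-error measure averaged over Monte-Carlo replications. The exact index definitions are not reproduced here, so the forms below are assumptions (a plain detected-over-attacked ratio and a plain mean squared error), and all function names are hypothetical.

```python
def detection_ratio(attacked, detected):
    """Share of attacked samples that were also flagged, i.e. Sad / Sa (assumed form)."""
    s_a = sum(1 for a in attacked if a)
    s_ad = sum(1 for a, d in zip(attacked, detected) if a and d)
    return s_ad / s_a if s_a else float("nan")


def mean_squared_error(estimated, true):
    """Average squared difference between estimated and true states over N samples."""
    return sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(true)


def monte_carlo_average(metric_per_run):
    """Average a metric over independent replications (100 in the article)."""
    return sum(metric_per_run) / len(metric_per_run)
```

For example, if three samples are attacked and two of them are flagged, `detection_ratio` returns 2/3.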
5. Simulation and Results
The proposed method was tested on the IEEE 3-machine 9-bus system and the IEEE 5-machine 14-bus system. The data of these test systems were generated with the MATLAB power system toolbox [32], and the EKF and UKF methods are taken from the EKF/UKF toolbox [33]. All tests were conducted with MATLAB 2020a and Python 3.8. A sudden load fluctuation occurred at t = 0.1 s and lasted for 1 s in both test systems. The PMU sample rate is 120 samples per second, and a PMU is installed at each generator bus.
Two case studies are presented in this article, and various cyber-attacks are employed in the simulations. Both FDI and DoS attacks are simulated with different attack vectors and probabilities, as illustrated in Table 2. It is worth noting that the base rotor speed is 376.8 rad/s for both case studies, and the base generator angle is 1 degree. The cyber-attacks were launched at t = 4.2 s and exerted a significant influence on the DSE. The HC clustered all the features simultaneously by taking the distortion level of the features into account, and the DTR was responsible for clearing the attack and correcting the states. It is worth noting that the DTR was trained on a large amount of data from different contingencies, ranging from three-phase faults to lightning strikes, all of which are available in a MATLAB power system toolbox named PSAT [
In the three scenarios of FDI cyber-attack, the “Normal Distribution” is employed with different standard deviations for simulating the attacks [35]. In the DoS cases, a “Packet Loss Ratio” is utilized for simulating the DoS attack process with four different intensities. Figure 7 shows the schematic of the IEEE 3-machine 9-bus and IEEE 5-machine 14-bus systems. The whole simulation time is about 10 s, while the distortion constant is set to 10 for the IEEE 9-bus and 30 for the IEEE 14-bus. Figure 8 illustrates the first generator’s states derived from the DSE, aided by EKF, UKF, and the proposed method under the three FDI cyber-attack scenarios. Figure 9 shows the dynamic states of the mentioned generator calculated by DSE under DoS cyber-sabotages for the IEEE 3-machine 9-bus test system.
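The two attack models described above, a normally distributed perturbation for FDI and a packet loss ratio for DoS, might be simulated on a PMU measurement stream as in the following sketch. The 120 samples per second rate and the roughly 10 s horizon come from the text; the function names, the optional mean shift, the use of `None` for a lost packet, and the attack window used in the usage note are assumptions made for illustration.

```python
import random


def inject_fdi(samples, start, end, shift, sigma, rng):
    """Add Gaussian noise N(shift, sigma^2) to the samples with index in [start, end)."""
    return [
        x + rng.gauss(shift, sigma) if start <= i < end else x
        for i, x in enumerate(samples)
    ]


def inject_dos(samples, start, end, loss_ratio, rng):
    """Drop each sample in [start, end) with probability `loss_ratio`; lost packets become None."""
    return [
        None if start <= i < end and rng.random() < loss_ratio else x
        for i, x in enumerate(samples)
    ]


def time_to_index(t_seconds, rate_hz=120):
    """Convert a time stamp to a sample index at the PMU reporting rate."""
    return round(t_seconds * rate_hz)
```

As a usage example, a 10 s stream holds 1200 samples, and an attack beginning at t = 4.2 s starts at index `time_to_index(4.2)`, which is 504. Calling `inject_fdi(stream, 504, 624, 0.0, 0.05, random.Random(0))` would then perturb a hypothetical one-second window, with larger `sigma` or `loss_ratio` values standing in for the more intense scenarios.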
From Figure 8a–c, it is clear that the proposed method boosted the accuracy of the DSE, especially during the FDI cyber-attacks, a task in which both EKF and UKF performed poorly. It is worth noting that before the cyber-attack, all three methods accurately estimated the dynamic states of the network. After the cyber-attack was launched, however, the Kalman filters failed to detect and eliminate the attacks. The situation deteriorates in the case of DoS attacks. From subplots (a) to (d) of Figure 9, it can be observed that the mentioned filters almost completely failed to eliminate the attacks, while the proposed DTR-based method properly detected and eliminated the attack vectors.
In Figure 10a, an example of an attacked dataset detected by the HC method is illustrated, while in Figure 10b, a feature that is not attacked is shown. Both figures are heatmaps plotted with the scatter function in Python with “cmap” set to cool. The former is the rotor speed of the second generator, and the latter is the voltage angle of bus three. Figure 11a,b shows the clustering inertia of both features. The accuracy of the proposed method depends significantly on the accurate functioning of the clustering method, which identifies the corrupted features.
The proposed indices are calculated and compared to another related study in Table 3 and Table 4 under different attack scenarios in the IEEE 3-machine 9-bus test system. It is worth noting that the method proposed in [12] is RCKF.
From Figure 11a,b, it is clear that as soon as the attacked measurement of rotor speed enters the HC, the distortion of only one cluster rises rapidly, and the injected data are eliminated, while all of the voltage angle data are correct and the distortion for only one cluster is smaller than d. The second index has the same value for both rotor speed and angle, as it measures the detection accuracy of the HC and does not depend on any individual feature. Judged by the first index, the proposed method works slightly better than that of [12], which shows the higher accuracy of the proposed method.
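The behavior described above, in which the distortion of a single cluster rises once attacked data enter it, can be checked with a per-cluster distortion computation compared against the threshold d. The sketch below is illustrative only: it assumes scalar measurements, takes distortion to be the within-cluster sum of squared distances to the centroid, and uses hypothetical function names.

```python
def cluster_distortion(values, labels):
    """Within-cluster sum of squared distances to the centroid, keyed by cluster label."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    distortion = {}
    for label, members in groups.items():
        centroid = sum(members) / len(members)
        distortion[label] = sum((v - centroid) ** 2 for v in members)
    return distortion


def clusters_over_threshold(values, labels, d):
    """Return the labels of clusters whose distortion exceeds the threshold d."""
    scores = cluster_distortion(values, labels)
    return sorted(label for label, score in scores.items() if score > d)
```

A feature whose clusters all stay below d would be passed on unchanged, whereas a cluster that exceeds d marks the samples to be handed to the regression step.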
From Table 3 and Table 4, it can be seen that the HC-DTR-based dynamic state estimation outperformed the RCKF technique. The HC model detected the attacked data better than the Kalman filter algorithm, and the DTR predicted the actual values more robustly than the method conducted in [
Figure 12 illustrates the generator’s states in the IEEE 5-machine 14-bus under three different FDI attack scenarios, while Figure 13 shows the generator’s states under DoS attack scenarios. In this test system, only UKF was employed as an alternative method due to the low accuracy of EKF.
As is clear from Figure 12, the accuracy of the proposed machine learning-based method is far better than that of the UKF, even in the more extensive FDI attack scenarios. Likewise, it is clear from Figure 13 that the DoS attack is well detected and eliminated by the proposed method, a task in which the UKF failed. The DTR method shows considerable potential in eliminating different cyber-attacks, as illustrated in the mentioned figures for both the IEEE 3-machine 9-bus test system and the IEEE 5-machine 14-bus test system.
Figure 14 illustrates the rotor speed data of the second generator and the voltage angle data of the third bus as the attacked and the true features, respectively. Both figures are heatmaps plotted with the scatter function in Python with “cmap” set to warm, while Figure 15 shows the cluster inertia of both features. It is worth noting that the HC method clusters the data simultaneously, which is essential for a rapid response against cyber-attacks.
The proposed indices are illustrated in Table 5 and Table 6 for rotor angle and rotor speed, respectively, and compared to results from two other related studies [37], for the IEEE 5-machine 14-bus test system. It is worth mentioning that [36] proposed a non-linear method based on a novel Kalman filter for detecting and eliminating the FDI attack, while [37] employed a support vector machine classification-based method for diagnosing the DoS attack.
It is clear from Table 5 and Table 6 that the proposed method achieves better detection accuracy in the case of DoS attacks than that of [36] and higher accuracy in estimating the dynamic states of the case study than that of [37]. The other indices show that the proposed method is fully capable of eliminating DoS and FDI cyber-attacks and outperforms the other techniques mentioned in [