A Power Grid Fault Diagnosis Method in the Case of Alarm Information Loss Based on the Top-k Skyline Query Algorithm

Abstract: After a power system failure, a large amount of alarm information floods into the dispatching terminal almost instantly. At the same time, problems such as abnormal operation of protections and circuit breakers and loss of alarm information are inevitable. This uncertainty poses a serious difficulty for fault diagnosis algorithms. As a data processing algorithm for uncertain information sets, the Top-k Skyline query algorithm can eliminate the data points that do not meet the requirements and then output the final k results in order. Against this background, this paper proposes a power grid fault diagnosis method based on the Top-k Skyline query algorithm that accounts for alarm information loss. First, the fault area is determined from the electrical quantities and switching values. Next, backward reasoning Petri nets are established for the nodes in the fault area to form the data set of fault hypotheses. The Top-k Skyline query algorithm is then used to rank the hypotheses and select those with higher reliability. Finally, an IEEE 39-bus system example is given to verify the reliability of the proposed method.


Introduction
With the continuous development of power grids, accurate and fast fault diagnosis plays a very important role in ensuring their safe operation. To date, many scholars have used different intelligent methods, such as the expert system [1], analytical model [2,3], and Bayesian network [4][5][6], to diagnose power system faults. Practice has shown that these methods can correctly judge the fault situation when the alarm information is complete and accurate. However, in actual operation of the power grid, relay protection devices often fail to operate or maloperate. In addition, alarm information is often lost in transmission because of the greatly increased volume of transmitted information, hardware failures of communication equipment, and other reasons. These problems are likely to cause more serious consequences, such as expansion of the fault area and misdiagnosis of the fault element, and eventually affect the timely handling of the accident. Against this background, research on fault diagnosis methods for incomplete alarm information has particularly important practical significance.
The authors of [7] presuppose the preliminary diagnosis corresponding to the fault output and use Bayes' theorem to handle the uncertainty of diagnosis. The uncertainty and temporal properties of alarm information are considered in [8]. The authors of [9] propose an optimization model of power grid fault diagnosis based on topology modeling. This model improves the objective function by integrating the action logic errors and information communication errors of the protection system, so that the optimization model can be constructed automatically and efficiently. The authors of [10] divide the missing information into two categories: missing breaker information and [...]
When the power system fails, if the fault diagnosis algorithm judges the elements of the whole system indiscriminately, it not only greatly increases the computational burden but also fails to meet the requirement of rapid fault diagnosis. In addition, it affects the further processing of the fault elements. As mentioned in the above section, in order to avoid the impact caused by missing information of the fault area boundary breaker, a fault area identification algorithm based on the switching value and electrical quantity is adopted in this paper.

Determination Method of the Fault Area Boundary Breaker
According to the previous definition, after the system fails, the fault area boundary breakers isolate the fault area from the normal area. The basic characteristic of a fault area boundary breaker is that it divides the system into a charged part and an uncharged part. This characteristic is both an effective way to identify fault area boundary breakers and the basis of the fault area identification algorithm proposed in this paper. Accordingly, a fault area boundary breaker is found by identifying the charging conditions on both sides of the circuit breaker. Once all the fault area boundary breakers in the system have been found, the fault boundary is delimited. First, criteria (1) and (2) for judging line outage and bus outage based on PMU electrical information are given respectively as:

$$I_L < I_{set} \qquad (1)$$

$$V_A < V_{set} \qquad (2)$$

In criterion (1), $I_L$ represents the line current and $I_{set}$ represents the set threshold, which is one tenth of the minimum load current of the line during normal operation of the system. After the circuit breakers on both sides of the line open, the line current should tend to 0. However, considering the measurement error and the line residual current, and to ensure that the criterion judges correctly both in normal operation and under fault, one tenth of the line minimum load current is chosen as $I_{set}$ in this paper on the basis of extensive simulation. Similarly, in criterion (2), $V_A$ represents the voltage of bus A, and one tenth of the bus voltage during normal operation is chosen as $V_{set}$.
According to the above criteria, after obtaining the voltage, current, and other electrical quantities on both sides of each circuit breaker of the system, the charging conditions on both sides of the circuit breaker can be judged. Then, according to the characteristic of the fault area boundary breaker, whether the breaker is a fault area boundary breaker or not can be judged.
In Figure 1, if a fault occurs on line L2, the main protection operates and trips circuit breakers CB3 and CB4 to isolate the fault. According to criterion (1), the line current $I_{L2}$ meets the criterion, so the line is out of service and is marked with 0. According to criterion (2), the bus voltages $V_B$ and $V_C$ do not satisfy the criterion, so the buses are charged and marked with 1. As shown in Figure 1, one side of CB3 is powered off and the other side is powered on, and the same holds for CB4. CB3 and CB4 are therefore fault area boundary breakers.
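The two outage criteria and the boundary breaker test can be sketched in a few lines of Python. This is only an illustration of the logic described above: the per-unit measurement values, the assumed minimum load current of 0.2 p.u., and the function names are hypothetical and do not come from the paper.

```python
def set_value(normal_value, fraction=0.1):
    """Threshold chosen as one tenth of the normal-operation quantity."""
    return fraction * normal_value


def charge_mark(measured, threshold):
    """Return 0 (outage) if the magnitude is below the threshold, else 1 (charged)."""
    return 0 if abs(measured) < threshold else 1


def is_boundary_breaker(mark_side_a, mark_side_b):
    """A fault area boundary breaker has exactly one charged side."""
    return mark_side_a != mark_side_b


# Hypothetical per-unit values for the Figure 1 situation (fault on line L2).
i_set = set_value(0.2)   # assumed minimum load current of L2: 0.2 p.u.
v_set = set_value(1.0)   # normal bus voltage: 1.0 p.u.

marks = {
    "L2": charge_mark(0.003, i_set),  # residual line current only
    "B": charge_mark(0.98, v_set),
    "C": charge_mark(1.01, v_set),
}

# Each breaker is described by the two elements it connects.
breakers = {"CB3": ("B", "L2"), "CB4": ("L2", "C")}
boundary = [cb for cb, (a, b) in breakers.items()
            if is_boundary_breaker(marks[a], marks[b])]
print(boundary)
```

With these assumed values the line is marked 0 and both buses are marked 1, so CB3 and CB4 are both reported as boundary breakers, in line with the discussion of Figure 1.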
Energies 2021, 14, x FOR PEER REVIEW

Fault Area Formation Method
After the failure of the system, the dispatching terminal will receive the alarm information of the breaker and form the action circuit breakers set C. In order to effectively avoid the impact of breaker information loss and distortion, after the identification of the fault area boundary breaker in Section 2, this section verifies the action circuit breakers set C obtained at the dispatching terminal through the comprehensive application of the electrical quantity and switching value. The final fault area is formed by fault area boundary breakers. The specific steps of fault area formation are shown in Figure 2.

The whole fault area formation process includes the following steps.

1. Obtain the action breaker information through the SCADA system. Sort the action breaker information by breaker serial number in ascending order to form an ordered set C;
2. In set C, judge the charging condition at both ends of the breakers through WAMS measurement, and select the fault area boundary breaker CBi with the lowest serial number as the starting point of the search;
3. Select the power failure direction of CBi and conduct a breadth-first search for the breakers in this direction. Mark the buses and lines passed in the search as 0 until another fault area boundary breaker is found;
4. Remove CBi and the circuit breakers searched in (3) from set C, and then determine whether set C is a null set. If set C is not a null set, set i to the minimum serial number of the circuit breakers in set C and perform step (3). If set C is a null set, the elements marked with 0 are collected to form the set Y, which is the set of elements in the fault area. The algorithm terminates.

A double-bus system is taken as an example to illustrate the fault area formation algorithm. As shown in Figure 3, if a fault occurs on bus B3, the bus protection will trip the circuit breakers CB3, CB6, and CB7 connected to the faulted bus. In the figure, the numbers with yellow shading represent the bus charging conditions, and the numbers with purple shading represent the line charging conditions.

1. In set C, CB3 is first selected to judge whether it is a boundary breaker. According to the WAMS measurement, both sides of CB3 are powered off, so CB3 is not a fault area boundary breaker. Move CB3 out of set C. The minimum serial number of the circuit breakers in set C is now CB6. According to the WAMS measurement, bus B3 is out of service and bus B4 is energized, so CB6 is determined to be a fault area boundary breaker and is selected as the starting point of the search.
2. Taking CB6 as the starting point, search for another fault area boundary breaker on the other side along the direction of power failure, that is, in the direction of $L_{B1B3}$. Search through B3 and mark B3 as 0. After searching CB3, it is determined that CB3 is not a fault area boundary breaker, so the search continues. Search through $L_{B1B3}$ and mark $L_{B1B3}$ as 0. After searching CB2, CB2 is determined to be a fault area boundary breaker, and the search in this direction is terminated.
3. Taking CB6 as the starting point, search for the fault area boundary breaker on the other side along the other power failure direction, that is, in the direction of $L_{B3B5}$. Search through B3 and mark B3 as 0. After searching CB7, it is determined that CB7 is not a fault area boundary breaker, so the search continues. Search through $L_{B3B5}$ and mark $L_{B3B5}$ as 0. After searching CB8, CB8 is determined to be a fault area boundary breaker, and the search in this direction is terminated.
4. Move CB6 and CB7 out of set C and judge that set C is a null set. Collect the buses and lines marked as 0 to form the power failure area set $Y = \{B3, L_{B1B3}, L_{B3B5}\}$. The algorithm terminates.
As shown in the above steps, CB6 is a fault area boundary breaker, while CB3 and CB7 are internal fault area breakers. If the CB6 action information is lost, this is called missing information of the fault area boundary breaker. If the protection information corresponding to CB6 is lost, this is called missing information of the fault area boundary protection. If the CB3 or CB7 action information is lost, this is called missing information of the internal fault area breaker. If the protection information corresponding to CB3 or CB7 is lost, this is called missing information of the internal fault area protection.
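The search procedure and the worked example can be sketched as follows. This is a minimal Python illustration, not the authors' implementation: the topology is reconstructed from the text of the example only (Figure 3 is not available here), the charging marks are assumed WAMS results, and the function name and the `CB<number>` naming convention used for sorting are hypothetical.

```python
from collections import deque


def form_fault_area(tripped, breakers, charged):
    """tripped: breaker ids reported by SCADA; breakers: id -> (elem_a, elem_b);
    charged: element -> 1 (charged) or 0 (outage). Returns the set Y."""
    def is_boundary(cb):
        a, b = breakers[cb]
        return charged[a] != charged[b]

    # Map every bus/line to the breakers attached to it.
    attached = {}
    for cb, (a, b) in breakers.items():
        attached.setdefault(a, []).append(cb)
        attached.setdefault(b, []).append(cb)

    remaining = sorted(tripped, key=lambda cb: int(cb[2:]))  # ordered set C
    fault_area = set()
    while remaining:
        # Lowest-numbered boundary breaker in C; non-boundary ones are skipped.
        start = next((cb for cb in remaining if is_boundary(cb)), None)
        if start is None:
            break
        searched = {start}
        queue = deque(e for e in breakers[start] if charged[e] == 0)
        while queue:  # breadth-first search on the de-energized side
            elem = queue.popleft()
            if elem in fault_area:
                continue
            fault_area.add(elem)  # mark bus/line as 0
            for cb in attached[elem]:
                if cb in searched:
                    continue
                searched.add(cb)
                if is_boundary(cb):
                    continue  # another boundary breaker: stop in this direction
                a, b = breakers[cb]
                queue.append(b if a == elem else a)
        remaining = [cb for cb in remaining if cb not in searched]
    return fault_area


# Assumed topology and marks for the bus B3 fault example.
breakers = {
    "CB2": ("B1", "L_B1B3"), "CB3": ("L_B1B3", "B3"), "CB6": ("B3", "B4"),
    "CB7": ("B3", "L_B3B5"), "CB8": ("L_B3B5", "B5"),
}
charged = {"B1": 1, "B3": 0, "B4": 1, "B5": 1, "L_B1B3": 0, "L_B3B5": 0}
Y = form_fault_area(["CB3", "CB6", "CB7"], breakers, charged)
print(sorted(Y))
```

Under these assumptions the search starts at CB6, passes CB3 and CB7 as internal breakers, stops at CB2 and CB8, and returns B3, L_B1B3, and L_B3B5, which matches the set Y of the example.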

Optimal Configuration of PMU
In order to meet the requirement of the fault area formation algorithm, the system must remain observable with the configured PMUs. However, given the high cost of a PMU, installing one on every bus does not meet the economic requirements. On the premise of ensuring system observability, PMUs can be installed only on selected buses to improve the economy. Considering that system faults are mainly single faults and double faults, and in order to ensure the observability of the system in the case of double faults, this paper proposes a line N-2 fault PMU configuration method.
Considering the working principle and configuration principle of the PMU, under normal working conditions a bus with a PMU can directly measure the corresponding bus voltage and branch currents, which makes it observable. For adjacent buses without a PMU, the relevant electrical quantities can be calculated through the I-V characteristics and Kirchhoff's laws. At present, the optimal configuration of PMUs usually adopts the 0-1 integer programming algorithm. Specifically, under certain constraint conditions, the objective function reaches its optimal value with each variable taken as 0 or 1. The variable $X_i$ represents whether bus i is configured with a PMU:

$$X_i = \begin{cases} 1, & \text{bus } i \text{ is configured with a PMU} \\ 0, & \text{bus } i \text{ is not configured with a PMU} \end{cases}$$

Considering the observability constraints and the economic objective function, the basic PMU configuration model can be expressed as:

$$\min \sum_{i \in I} c X_i \quad \text{s.t.} \quad f_i \ge 1, \; i \in I$$

In the model expression, I represents the set of system buses, and c represents the cost of the PMU equipment, which is a fixed value. $f_i$ is the observability function of bus i. When $f_i \ge 1$, bus i is observable.
However, after a system failure, some buses may no longer be observable through adjacent buses because the faulted line cannot work normally. In order to ensure the observability of the system with double faults, the constraints of the N-2 fault configuration method are modified as follows. Satisfying any one of them ensures that the bus is still observable.

1. The bus itself is configured with a PMU;
2. At least three buses in the bus set $I_1$ are configured with PMU;
3. At least two buses in the bus set $I_2$ are configured with PMU.

In the above constraints, $I_1$ represents a bus set consisting of buses connected to bus i by only one circuit, and $I_2$ represents a bus set consisting of buses connected to bus i by at least two circuits.
At the same time, considering the economic requirements, the generator buses with known power flow and rated power transmission are not required to install a PMU. The IEEE 39-bus system in Figure 4 is taken as an example for PMU configuration.
The objective function and the observability function $f_i$ are written for each bus. For buses 1 to 29 and bus 39, the relationship between bus i and bus j of the system is expressed as an inequality whose three left-hand terms correspond one by one to the three constraints of the N-2 fault configuration method. In the IEEE 39-bus system, each pair of connected buses is linked by a single line, so the bus set $I_2$ does not need to be considered.
The buses from 30 to 38 are the generator buses with known power flow and rated power transmission. These buses are not required to install PMU.
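To make the N-2 observability constraints concrete, the following Python sketch solves the 0-1 placement problem by exhaustive enumeration on a small network. It is an illustration under stated assumptions only: the six-bus topology, including its double-circuit connections, is hypothetical, every PMU is assumed to have the same cost so that minimizing cost equals minimizing the number of PMUs, and a real study would use an integer programming solver rather than brute force.

```python
from itertools import combinations


def observable(bus, pmu, single, double):
    """N-2 rule: own PMU, or >= 3 PMUs in I_1, or >= 2 PMUs in I_2."""
    if bus in pmu:
        return True
    if sum(n in pmu for n in single.get(bus, ())) >= 3:
        return True
    return sum(n in pmu for n in double.get(bus, ())) >= 2


def min_pmu_placement(buses, single, double):
    """Smallest PMU set (equal unit cost assumed) keeping every bus observable."""
    for k in range(len(buses) + 1):
        for combo in combinations(buses, k):
            pmu = set(combo)
            if all(observable(b, pmu, single, double) for b in buses):
                return pmu
    return None


# Hypothetical network: I_1 (single-circuit) and I_2 (double-circuit) neighbours.
buses = [1, 2, 3, 4, 5, 6]
single = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5], 5: [4]}
double = {4: [6], 5: [6], 6: [4, 5]}

placement = min_pmu_placement(buses, single, double)
print(sorted(placement))
```

In this toy network, buses 2, 3, 4, and 5 each have too few neighbours to satisfy the second or third constraint and therefore need their own PMU. Bus 1 is then covered by three single-circuit neighbours and bus 6 by two double-circuit neighbours, so four PMUs suffice.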

Establishment of the Backward Reasoning Petri Net
A Petri net is a graphical tool for modeling a system. In this section, the Petri net is used on the one hand to characterize the system topology, and on the other hand to summarize the protection and breaker information received by the dispatching terminals after a failure, which facilitates the design of the scheme of potential missing information of elements in the next section. The backward reasoning Petri net proposed in [11] characterizes the system topology and circuit breaker information well, but it lacks protection information and missing information. This paper makes some improvements in this respect.
According to the characteristics of staged protection, when the protection of a section of line fails to operate, the protection of adjacent lines will operate as backup protection to cut off the fault. A line fault may therefore involve the protection equipment of this line and of the adjacent next line, while protection farther away has little correlation with it. According to this characteristic, the range of fault information to be considered can be narrowed. This paper defines the information connection tree, which is used to record the topology information and protection equipment of lines, buses, and other elements in this section and adjacent sections. The information connection tree is represented by $T = (N, R)$, where N is a finite set of buses and R is a directed connection relationship. The relationship is formed by a breadth-first search of the system from a specific bus $N_{root}$ and ends after the adjacent lines have been searched. $N_{root} \in N$ has no predecessor in relation R, and $N_{leaf}$ has no successor in relation R.
When the system fails, each line and bus in the fault area can be selected as $N_{root}$ to conduct a breadth-first search in two directions. The circuit breakers, buses, and lines are transformed into places. The places are connected in a directed way according to the order of the search. After the search for adjacent lines is completed, the search stops. The tree relation formed by the connections is the information connection tree of the corresponding line or bus.
The fault case in Section 7.4 is selected. The fault area is shown in Figure 5, in which red marks the breakers that tripped, and green marks the breakers that tripped but whose information was lost. The information connection tree of L23-22 is formed as shown in Figure 6.
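The construction of an information connection tree can be sketched as a breadth-first search that stops expanding once an adjacent line is reached. The Python code below is one possible reading of that description, not the authors' procedure: the chain topology and all element names are hypothetical, and the text does not state whether the far-end breaker of an adjacent line belongs to the tree, so this sketch stops at the line itself.

```python
from collections import deque


def information_connection_tree(root, neighbours, is_line):
    """Breadth-first search from root; returns the relation R as child -> parent.
    Lines other than the root are treated as leaves (the search ends there)."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node != root and is_line(node):
            continue  # adjacent line reached: do not search beyond it
        for nxt in neighbours[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


# Hypothetical chain of elements: bus, breaker, line, breaker, bus, ...
chain = ["B0", "CB0", "La", "CB1", "B1", "CB2", "Lb",
         "CB3", "B2", "CB4", "Lc", "CB5", "B3"]
neighbours = {n: [] for n in chain}
for a, b in zip(chain, chain[1:]):
    neighbours[a].append(b)
    neighbours[b].append(a)

tree = information_connection_tree("Lb", neighbours, lambda n: n.startswith("L"))
leaves = sorted(n for n in tree if n not in tree.values())
print(leaves)
```

Starting from line `Lb`, the search runs in both directions through breakers and buses and ends at the adjacent lines `La` and `Lc`. Elements beyond them (`CB0`, `B0`, `CB5`, `B3`) are left out, which reflects the narrowing of the considered range described above.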
The system topology and the protection and circuit breaker information received by the dispatching terminal after the fault are the basis for inferring the fault elements in the next section. To characterize this information, the information connection tree is converted into the backward reasoning Petri net in this section. The breakers and buses are mapped to places of the Petri net, and the breaker tripping information is mapped to tokens loaded into the associated breaker places. The remaining nodes are mapped to the transition set of the Petri net. In addition to the basic breaker nodes, there are also breaker reverse nodes in the backward reasoning Petri net. For a breaker reverse node F, the path first passes through line L and then reaches the bus. For such a breaker F, the relevant direction discriminating element ensures that the protection does not operate, according to the definition of directional overcurrent protection, so breaker F needs to be treated specially. As a special node, place F blocks only the initial directional protection token, while the token of breaker failure protection and tokens transferred from the upper place pass through normally.
In order to carry out backward reasoning in the next section on whether an element is faulty from the protection and circuit breaker information, the branches of the information connection tree are reversed according to the flow direction of the fault current in the line. The direction of each Petri net arc is consistent with the branch direction of the reversed information connection tree. Finally, if there is a token in the final place of the backward reasoning Petri net of an element, this element is faulty. Otherwise, this element is normal.
In addition, the protection extension coefficient is added into the Petri net to characterize the protection information obtained after a fault. For an element of the system, the protection extension coefficient is the distance between the protection information obtained from the dispatching terminal and the circuit breaker activated by the protection. It can be considered as the number of transition layers of the token contained in the corresponding place of the breaker.
Correspondingly, the extension coefficient of the main protection and near backup protection can be recorded as 1; the extension coefficient of the remote backup protection and circuit breaker failure protection can be recorded as 2. However, in the backward reasoning Petri net, only the protection extension coefficient corresponding to the protection action caused by this element is recorded. For example, when a line fails, if the main protection of the line does not operate, the corresponding circuit breaker is tripped by the main protection of the bus. In the backward reasoning Petri net diagram of the line, the extension coefficient of the corresponding protection of the circuit breaker is still recorded as 0.
Based on the information connection tree shown in Figure 6, the backward reasoning Petri net shown in Figure 7 can be established. In this net, the remote backup protection of L23-24 operates and trips the breaker CB3, so the CB3 place has a token, and the protection extension coefficient 2 is marked at the upper right corner of the place. The breaker CB0 operates, but there is no related protection information. This is represented by a token in the corresponding place without a protection extension coefficient. The breakers CB1 and CB2 are circuit breaker reverse nodes, and the related places are drawn with dotted lines. Both circuit breakers operate, so both places contain a token. In this case, the breaker action information and related protection action information of CB6 and CB7 are not received, so these circuit breakers are represented only as places.

Formation of Fault Hypothesis
In Section 4, the backward reasoning Petri net expresses the topology information, protection state information, and circuit breaker state information of the system in the form of Petri net. The action information of the circuit breaker is mapped to the token, and the corresponding topology information is mapped to the place, transition set, etc. The protection extension coefficient represents the transition ability of the token between different levels of the place. After establishing a backward reasoning Petri net for each element in the fault area, this section can use the Petri net to supplement and reason the various alarm information that the element may generate. Then, the possible alarm information is summarized to form the corresponding element fault hypothesis.

The basic reasoning mode of the backward reasoning Petri net is the same as the operation mode of an ordinary Petri net: it relies on the transition of tokens between places to identify the fault element. An upper place can receive a token only when the tokens of all its lower places can transition to it. When the token of each branch reaches the final place, the backward reasoning Petri net infers that the element corresponding to the final place is the fault element.
As shown in Figure 8, after a fault occurs, the series of actions caused by the fault element mirrors the fault reasoning process of the backward reasoning Petri net. In actual operation, however, alarm information is often lost, so the dispatching terminal holds incomplete protection and breaker information and cannot identify the fault element directly and accurately. The absence of breaker action information is taken here as an example of how missing fault information affects traditional fault diagnosis methods. As shown in Figure 7, CB6 trips, but the information is not uploaded to the dispatching terminal for some reason, such as a communication fault. The fault of L23-22 then cannot be diagnosed by conventional fault diagnosis methods. Therefore, after establishing the backward reasoning Petri net model of each element, this paper compares the received protection and circuit breaker action information of each element with the complete alarm information that should be present when the element fails. The purpose of this process is to reason backward about the information that may have been lost during forward fault handling, and to establish the fault hypotheses of each element. By synthesizing the fault hypotheses of each element, the total set of fault area hypotheses can be determined. To ensure the completeness of the fault diagnosis process, the hypothesis set considers the various fault possibilities of each element comprehensively, which expands the scale of the uncertain data. In the next section, this paper processes the data with the Top-k Skyline query algorithm to determine the final fault element. The supplementary information is divided into supplementary breaker action information and supplementary protection action information.
If mapped to the backward reasoning Petri net, these two kinds of supplementary information supplement the token and the protection extension coefficient of the corresponding place. The basic steps are as follows:

1. Each subordinate place directly connected with the final place is divided into a separate branch. Each branch includes not only the subordinate place of the final place, but also the connected places at all levels after the subordinate place.
2. Design the alarm information supplement scheme of each branch. If the token and protection extension coefficient of the upper place are not missing and can be transferred to the final place, the information of the lower place will not be supplemented.
3. After the design of each branch is completed, different schemes of each branch are selected for the scheme combination to form complete fault hypotheses for this element.
Take the line L23-22 in Figure 7 as an example for the fault hypothesis design.

CB0-L23-22 direction branch: Scheme 1, supplement R (CB0) protection information (protection information related to CB0 is missing). Scheme 2, supplement CB7 breaker information and related protection information (CB0 refuses to operate; CB7 breaker information and related protection information are missing).

4. The fault hypotheses obtained by summarizing the fault schemes of each branch are as follows: Hypothesis 1: supplement CB6 breaker information and related protection information, and supplement R (CB0) protection information; Hypothesis 2: supplement CB6 breaker information and related protection information, and supplement CB7 breaker information and related protection information.
5. After testing, the two fault hypotheses can ensure that the token will finally transition to the final place.
The bus B24 in Figure 9 is taken as an example for the fault hypothesis design.

2. CB4-B24 direction branch: Scheme 1, supplement CB4 breaker information and related protection information (CB4 breaker information and related protection information are missing). Scheme 2, supplement R (CB5) protection information (CB4 refuses to operate; protection information related to CB5 is missing).
4. The fault hypotheses obtained by summarizing the fault schemes of each branch are as follows: Hypothesis 1: supplement CB4 breaker information and related protection information, and supplement R (CB3) protection information; Hypothesis 2: supplement CB4 breaker information and related protection information, and supplement R (CB2) protection information; Hypothesis 3: supplement R (CB5) and R (CB3) protection information; Hypothesis 4: supplement R (CB5) and R (CB2) protection information.
5. After testing, all fault hypotheses can ensure that the token will finally transition to the final place.
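The scheme combination in step 3 amounts to taking one scheme from every branch, which can be sketched as a Cartesian product. The snippet below is a minimal illustration for bus B24, not the authors' implementation. The names `branch_schemes` and `combine_schemes` are hypothetical, the key of the second branch is a placeholder because the text does not name it, and its two schemes are inferred from the four hypotheses listed above.

```python
from itertools import product

# Supplement schemes per branch for bus B24.
# The first branch is transcribed from the text; the second branch key is a
# placeholder and its schemes are inferred from Hypotheses 1 to 4.
branch_schemes = {
    "CB4-B24": [
        ["CB4 breaker information", "CB4 protection information"],  # Scheme 1
        ["R(CB5) protection information"],                          # Scheme 2
    ],
    "other branch (placeholder name)": [
        ["R(CB3) protection information"],
        ["R(CB2) protection information"],
    ],
}


def combine_schemes(schemes_by_branch):
    """Pick one scheme per branch and merge the picks into one hypothesis."""
    hypotheses = []
    for combo in product(*schemes_by_branch.values()):
        merged = [item for scheme in combo for item in scheme]
        hypotheses.append(merged)
    return hypotheses


hypotheses = combine_schemes(branch_schemes)
for number, supplements in enumerate(hypotheses, start=1):
    print(f"Hypothesis {number}: supplement " + "; ".join(supplements))
```

With two schemes on each of the two branches, the product yields four hypotheses in the same order as Hypotheses 1 to 4 above. The token test of step 5 is not modeled here.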

Hypothesis Evaluation Based on the Top-k Skyline Query Algorithm
After a fault occurs, fault hypotheses must be formed for every element in the fault area, so the total set of fault hypotheses is relatively large. In addition, the hypothesis set alone does not give substantial help in judging the specific missing information and fault elements. Therefore, in this section, the fault hypothesis set is reduced in scale and sorted. On the one hand, this reduces the interference caused by redundant and unreasonable information. On the other hand, it provides a basis for selecting reasonable fault hypotheses, so that the dispatching terminal can choose the more effective and more realistic ones. After reasonable and comprehensive dimensions are selected to describe each hypothesis numerically, the Top-k Skyline query method is used to compare the hypotheses.
The Top-k Skyline query algorithm is a query algorithm proposed by Papadias et al. in 2003. The algorithm uses the principle of skyline domination to establish a scoring function, and selects the first k skyline points to form the result set, which controls the scale of the result set. In this paper, the Top-k Skyline query algorithm is applied to fault diagnosis. Firstly, the traditional evaluation of individual circuit breaker and protection actions is transformed into a multi-dimensional numerical evaluation of the alarm information as a whole. The abstract equipment action information is transformed directly into concrete multi-dimensional numerical information, which simplifies information processing and improves its speed. Secondly, for uncertain problems such as missing information, the Top-k Skyline algorithm can reasonably reduce the uncertain information and present the results as intuitive data, from which the final fault element and missing information are obtained. Finally, as the power grid expands, the Top-k Skyline query algorithm is suitable for the rapid processing of large data sets: it can effectively eliminate irrelevant data and speed up processing.

Selection of Evaluation Dimensions
In the processing of each fault hypothesis, the following dimensions are selected to describe the hypothesis.

1. Total number of missing information: information loss is a small-probability event relative to the normal transmission of information, and the information losses of different circuit breakers are independent of each other. Therefore, the lower the total number of missing information, the higher the reliability of the fault hypothesis.
2. Number of interpretable circuit breakers: the number of breakers that can be explained in the fault hypothesis, including refuse-operated circuit breakers, breakers with missing information, etc. In hypothesis 2 of the line L23-22, for example, CB0 refuses to operate, CB7 and CB3 are tripped by backup protection, and CB6 is tripped by the line main protection, so the number of interpretable circuit breakers is four. The more breakers the hypothesis can explain, the clearer the description of each element in the fault area, and the higher the reliability of the hypothesis.
3. Number of missing protection information: because protection is configured in duplicate on the actual line, the protection information after a fault can be sent by multiple protections, and the protection information received by the dispatching end is redundant. Therefore, the lower the number of missing protection information, the higher the reliability of the fault hypothesis.

Skyline Query Algorithm
The skyline query algorithm is an algorithm that can select undominated data points from a large data set. The purpose of this algorithm is to reduce the size of a big data set quickly and eliminate the data points that do not meet the requirements. Its input is a set of data information described by multiple dimensions. Through comparison in the same dimension, it comprehensively outputs a set of data points that can dominate other data in multiple dimensions. The output set is called the skyline set. Assuming that the larger the data value, the more it meets the requirements, the definitions of domination and skyline set are given.
In the given n-dimensional finite set H, let $X_i, X_j \in H$, and let $X_{ia}$ denote the value of the a-th dimension of data point $X_i$. If $X_{ia} \ge X_{ja}$ for all $1 \le a \le n$, and there exists a dimension c such that $X_{ic} > X_{jc}$, then the data point $X_i$ dominates the data point $X_j$, which is written as $X_i \succ X_j$.
In the given n-dimensional finite set H, the skyline set is a subset of the set H, and none of the data points in the skyline set are dominated by the data points in set H.
Taking the four fault hypotheses of bus B24 as examples, the skyline query algorithm is illustrated by a simple case. After the dimensional description of the fault hypotheses, Table 1 is obtained. The analysis shows that the total number of missing information and the number of missing protection information in hypothesis 2 are the same as in hypothesis 1, while the number of interpretable circuit breakers in hypothesis 2 is larger, so hypothesis 2 dominates hypothesis 1. Similarly, hypothesis 4 dominates hypothesis 3. According to the definition of the skyline set, dominated data points do not appear in the skyline set, so hypothesis 1 and hypothesis 3 are excluded. Compared with hypothesis 4, hypothesis 2 has a higher total number of missing information and a lower number of interpretable circuit breakers, so hypothesis 4 also dominates hypothesis 2. The final output skyline set contains only hypothesis 4.
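The domination test and the skyline filter can be sketched in a few lines. The snippet below is a minimal illustration rather than the authors' code. The (x, y, z) values are illustrative assumptions chosen to agree with the comparisons described in the text; they are not copied from Table 1. Because the definition assumes that larger values are better, the two "lower is better" dimensions are negated first; the helper names `to_benefit`, `dominates`, and `skyline` are hypothetical.

```python
# Illustrative values only: (x, y, z) =
# (total missing information, interpretable breakers, missing protection information).
raw = {
    "H1": (3, 2, 2),
    "H2": (3, 3, 2),
    "H3": (2, 3, 2),
    "H4": (2, 4, 2),
}


def to_benefit(point):
    """Flip x and z so that a larger value is better in every dimension."""
    x, y, z = point
    return (-x, y, -z)


def dominates(a, b):
    """a dominates b: no worse in every dimension and strictly better in one."""
    return all(p >= q for p, q in zip(a, b)) and any(p > q for p, q in zip(a, b))


def skyline(points):
    """Return the names of the points that no other point dominates."""
    benefit = {name: to_benefit(p) for name, p in points.items()}
    return [
        name
        for name, p in benefit.items()
        if not any(dominates(q, p) for q in benefit.values())
    ]


print(skyline(raw))
```

Under these assumed values, the pairwise checks reproduce the relations stated above: hypothesis 2 dominates hypothesis 1, and hypothesis 4 dominates hypotheses 2 and 3. The quadratic pairwise scan is chosen for readability; it is not meant to reflect an optimized skyline algorithm.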
As shown in the above case, the skyline query algorithm can filter and sort out the data set. However, when the scale of the data set is too large, the skyline query algorithm is likely to return a result set with a large scale after removing invalid data points. This situation loses the significance of data filtering. Users are still unable to make direct selection.

Top-k Query Algorithm
The Top-k query algorithm scores each data point through a monotonic scoring function and returns the k data objects with the smallest (or largest) scores from the data space as the result. Its advantage is that it controls the number of returned results and therefore directly controls the scale of the returned data.
The Top-k query algorithm handles the case of B24 fault hypotheses as follows: The total number of missing information, the number of interpretable circuit breakers, and the number of missing protection information of the three evaluation dimensions are denoted as x, y, and z, respectively. The x and z are negatively correlated with the reliability of the fault hypothesis, while y is positively correlated. The scoring function F = −x + y − z can be selected to score the reliability of the fault description of the data points. The calculation results are shown in Table 2 below. For top-1 output, hypothesis 4 should be selected. For top-2 output, hypothesis 3 and hypothesis 4 should be selected. With direct evaluation criteria, the Top-k query algorithm can control the final output scale by sorting.
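The scoring and ranking step can be sketched as follows. This is a minimal illustration under assumptions: the (x, y, z) values are illustrative and chosen to be consistent with the selections described in the text, not taken from Table 1 or Table 2, and the function names `score` and `top_k` are hypothetical.

```python
# Illustrative values only: (x, y, z) =
# (total missing information, interpretable breakers, missing protection information).
hypotheses = {
    "H1": (3, 2, 2),
    "H2": (3, 3, 2),
    "H3": (2, 3, 2),
    "H4": (2, 4, 2),
}


def score(x, y, z):
    """Scoring function F = -x + y - z from the text; larger is more reliable."""
    return -x + y - z


def top_k(points, k):
    """Return the k hypothesis names with the largest score, best first."""
    ranked = sorted(points, key=lambda name: score(*points[name]), reverse=True)
    return ranked[:k]


print(top_k(hypotheses, 1))
print(top_k(hypotheses, 2))
```

Changing `score` to a different weighting of x, y, and z can reorder the output, which is the sensitivity to the scoring function that the Top-k approach suffers from.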
However, the Top-k algorithm also has problems. In the above example, selecting a different scoring function would directly change the scores of the data points and hence the selected result. In addition, when a single function is computed across different dimensions, the units of those dimensions must also be considered. Although this is not obvious in this case, when the data are described in other dimensions, such as a time dimension in seconds and a distance dimension in kilometers, the choice of units affects the actual meaning of the function score.

Top-k Skyline Query Algorithm
By comparing the characteristics of the skyline query algorithm and Top-k query algorithm, it can be found that the skyline query algorithm does not require a specific scoring function and its results are not affected by the different units between dimensions. However, it cannot control the scale of the returned set. Relatively speaking, the Top-k query algorithm is characterized by controlling the scale of the result set, but it is easily affected by the different units between dimensions. Combining the characteristics of the two query methods, the Top-k Skyline query algorithm can be used to compare the fault hypothesis data set and control the scale of the return set. After integrating the existing Top-k Skyline query algorithms and applying them to the fault diagnosis in the case of missing alarm information, this paper selects the Top-k Skyline query algorithm based on degree value filtering (DFTS algorithm) to process the total set of fault hypotheses.
The DFTS algorithm mainly includes three steps:

1. Calculate the value of the degree-score function for each data point, and preliminarily process the data points based on this value. Select some data points into the result set.
2. The skyline query algorithm is used to process the result set to ensure that the final result belongs to the skyline set of the original data set.
3. Select Top-k points from the set for output.
The definition of the degree-score is given here. Given an n-dimensional finite data set H and a data point $X_i \in H$, the degree-score of the j-th dimension of $X_i$ is $\tau_{i,j} = \frac{X_{i,j} - \mu_j}{\mu_j}$, where $X_{i,j}$ represents the value of the j-th dimension of the data point $X_i$, and $\mu_j$ represents the mean value of the j-th dimension, $\mu_j = \frac{1}{m}\sum_{i=1}^{m} X_{i,j}$, with m the number of data points in H. Taking Table 1 as an example, $\mu_1 = 2.500$ is the mean value of the first dimension. For fault hypothesis 1, $X_{1,1} = 3$, $X_{1,2} = 2$, $\tau_{1,1} = 0.200$, and $\tau_{1,2} = -0.333$, where $\tau_{1,1}$ and $\tau_{1,2}$ are the degree-scores of the first and second dimensions, respectively. These two values quantify the position of fault hypothesis 1 relative to the other hypotheses in the total number of missing information dimension and in the number of interpretable circuit breakers dimension.
The data set has multiple dimensions. The degree-scores of all dimensions of a data point are summed to give the final evaluation of the data point. The corresponding evaluation function is the degree-score function, defined as the sum of the degree-scores of the data point $X_i$ over all dimensions: $\varphi(X_i) = \sum_{j=1}^{n} \tau_{i,j}$, where n represents the number of dimensions. The degree-score represents the position of the data point in a dimension compared with the other data points, so it may be positive, negative, or zero. Because the degree-score removes the influence of the different units of different dimensions, the degree-score function can add the degree-scores of the dimensions directly, and the sum represents the overall level of the data point compared with the other data points.
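The degree-score and the degree-score function can be checked numerically with a short sketch. The data values below are illustrative assumptions chosen so that the first two dimensions reproduce the quoted figures for hypothesis 1 (a first-dimension mean of 2.500, and degree-scores of 0.200 and -0.333); they are not copied from Table 1. The function name `degree_scores` is hypothetical, and the sketch assumes every dimension has a nonzero mean.

```python
# Illustrative values only: one row per hypothesis, one column per dimension.
data = {
    "H1": (3, 2, 2),
    "H2": (3, 3, 2),
    "H3": (2, 3, 2),
    "H4": (2, 4, 2),
}


def degree_scores(points):
    """Return (means, tau, phi) for a dict of equally long value tuples.

    tau[name][j] = (X_ij - mu_j) / mu_j, and phi[name] is the sum over j.
    Assumes every dimension mean mu_j is nonzero.
    """
    names = list(points)
    dims = len(points[names[0]])
    means = [sum(points[name][j] for name in names) / len(names) for j in range(dims)]
    tau = {
        name: [(points[name][j] - means[j]) / means[j] for j in range(dims)]
        for name in names
    }
    phi = {name: sum(values) for name, values in tau.items()}
    return means, tau, phi


means, tau, phi = degree_scores(data)
print(f"mu_1 = {means[0]:.3f}")
print(f"tau_1,1 = {tau['H1'][0]:.3f}, tau_1,2 = {tau['H1'][1]:.3f}")
```

Because each term is divided by the mean of its own dimension, rescaling a whole dimension (for example, changing its unit) leaves the degree-scores unchanged, which is why they can be summed across dimensions.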

Application of the DFTS Algorithm in Fault Diagnosis
As the scale of the power grid gradually expands, the lack of alarm information has an increasing impact on fault diagnosis when a fault occurs. The DFTS algorithm is suitable for filtering and comparing large-scale data sets in uncertain problems, and for efficiently outputting the data points that meet the requirements. Applying the DFTS algorithm to fault diagnosis in the case of missing alarm information can therefore improve the efficiency of fault diagnosis. Combined with the three steps of the DFTS algorithm in Section 6.2.3, this section summarizes the fault diagnosis in the case of missing information into the following process, where steps 5, 6, and 7 are the embodiment of the DFTS algorithm in fault diagnosis.

1. When the dispatching terminal receives the alarm information, it starts the fault diagnosis algorithm in the case of missing alarm information proposed in this paper;
2. The fault area is determined by the combination of the switching value and electrical quantity;
3. Form the backward reasoning Petri net of each element;
4. Form the fault hypothesis of each element;
5. The fault hypotheses of each element are preliminarily processed through the degree-score function, and each element retains at most K fault hypotheses into the total set of fault hypotheses;
6. Reprocess the total set of fault hypotheses and rank the results;
7. Finally, the data points obtained by the DFTS algorithm are the final results.
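Steps 5 to 7 can be sketched as a small pipeline. This is a simplified illustration under stated assumptions, not the authors' DFTS implementation. The elements `E1` to `E3`, the labels, and all numbers are hypothetical. The sketch assumes that the degree-score of the interpretable-breaker dimension is negated so that a lower total means a more credible hypothesis, that a dimension with zero mean contributes nothing, and that the final ranking in step 6 uses the degree-score function recomputed over the total set.

```python
from collections import defaultdict

# Hypothetical hypotheses: dims = (total missing information,
# interpretable breakers, missing protection information).
hypotheses = [
    {"element": "E1", "label": "A", "dims": (1, 3, 0)},
    {"element": "E1", "label": "B", "dims": (2, 3, 1)},
    {"element": "E2", "label": "C", "dims": (3, 2, 2)},
    {"element": "E2", "label": "D", "dims": (2, 2, 1)},
    {"element": "E2", "label": "E", "dims": (4, 5, 2)},
    {"element": "E3", "label": "F", "dims": (2, 2, 2)},
]
SIGNS = (1.0, -1.0, 1.0)  # negate the "higher is better" dimension


def signed_phi(group):
    """Signed degree-score function per hypothesis; lower is more credible."""
    count = len(group)
    means = [sum(h["dims"][j] for h in group) / count for j in range(3)]
    scores = []
    for h in group:
        total = 0.0
        for j in range(3):
            if means[j] != 0:  # assumption: skip a dimension whose mean is zero
                total += SIGNS[j] * (h["dims"][j] - means[j]) / means[j]
        scores.append(total)
    return scores


def dominates(a, b):
    """a dominates b after flipping the two 'lower is better' dimensions."""
    ga, gb = (-a[0], a[1], -a[2]), (-b[0], b[1], -b[2])
    return all(p >= q for p, q in zip(ga, gb)) and any(p > q for p, q in zip(ga, gb))


def dfts(all_hypotheses, per_element_k, top_k):
    # Step 5: keep at most per_element_k hypotheses for each element.
    groups = defaultdict(list)
    for h in all_hypotheses:
        groups[h["element"]].append(h)
    total_set = []
    for group in groups.values():
        order = sorted(range(len(group)), key=signed_phi(group).__getitem__)
        total_set.extend(group[i] for i in order[:per_element_k])
    # Step 6: keep only skyline points of the total set, then rank them.
    phi = dict(zip((h["label"] for h in total_set), signed_phi(total_set)))
    sky = [h for h in total_set
           if not any(dominates(o["dims"], h["dims"]) for o in total_set)]
    sky.sort(key=lambda h: phi[h["label"]])
    # Step 7: output the first top_k points.
    return sky[:top_k]


result = dfts(hypotheses, per_element_k=2, top_k=2)
print([h["label"] for h in result])
```

Filtering per element before the skyline pass keeps the total set small, at the cost of possibly discarding a hypothesis that would have survived a global skyline query; whether that trade-off matches the original DFTS design is not confirmed by the text.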

Case Analysis
Considering that the system faults are mainly single faults and double faults, this section uses the IEEE 39-bus system to design the following fault cases.

Single Fault Accompanied by Maloperation of the Circuit Breaker
As shown in Figure 10, the fault at line L3-18 causes the main protection to operate and trip the circuit breakers CB1 and CB2. The maloperation of the main protection of the adjacent line L18-17 trips the circuit breaker CB3. If the information of the breaker CB1 were lost, the state of CB1 would be judged directly from the PMU measurement. If the information of the breaker CB3 were lost, the loss would offset the potential negative impact of the breaker maloperation on the results. To make the case more demanding, the alarm information of the breaker CB2 is therefore chosen as the missing alarm information in this case.

The fault area search then proceeds as follows:

1. The ordered set C = {CB1, CB3} is obtained by the SCADA system.
2. In set C, CB1 is first selected to judge whether it is a boundary breaker or not. According to the WAMS measurement, CB1 is determined as the fault area boundary breaker. Therefore, CB1 is selected as the starting point of the search.
3. Take CB1 as the starting point and search for another fault area boundary breaker on the other side along the direction of power failure, that is, in the direction of $L_{B3B18}$. Search through $L_{B3B18}$ and mark $L_{B3B18}$ as 0. After searching CB2, it is determined that it is not a fault area boundary breaker, and the search continues. Search through B18 and mark B18 as 0. After searching CB3, it is determined that it is not a fault area boundary breaker, and the search continues. Search through $L_{B18B17}$ and mark $L_{B18B17}$ as 0. After searching CB4, CB4 is determined to be a fault area boundary breaker, and the search in this direction is terminated.
4. Move CB1 and CB3 out of set C and judge that set C is a null set. Summarize the buses and lines marked as 0 to form the power failure area set $Y = \{L_{B3B18}, B18, L_{B18B17}\}$. The algorithm terminates.

Considering the electrical quantity and switching value, the final fault area is determined as shown in Figure 11.

Figure 11. Fault area and breaker state diagram.
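The search in steps 1 to 4 can be sketched for this particular case as a walk along a single path. This is a heavily simplified illustration, not the paper's fault area algorithm: it assumes one radial chain of breakers and elements, a fixed search direction, and a hypothetical `boundary` set standing in for the judgment made from the WAMS measurement. All identifiers are illustrative.

```python
# One path through the affected part of the network, as (kind, name) pairs.
chain = [
    ("breaker", "CB1"),
    ("element", "L_B3B18"),
    ("breaker", "CB2"),
    ("element", "B18"),
    ("breaker", "CB3"),
    ("element", "L_B18B17"),
    ("breaker", "CB4"),
]
reported = ["CB1", "CB3"]   # ordered set C from SCADA (CB2 information is lost)
boundary = {"CB1", "CB4"}   # hypothetical stand-in for the WAMS-based judgment


def search_fault_area(chain, reported, boundary):
    """Walk from a boundary breaker to the next one and collect the elements."""
    names = [name for _, name in chain]
    pending = list(reported)
    outage = []
    while pending:
        start = next((cb for cb in pending if cb in boundary), None)
        if start is None:
            break  # no boundary breaker left to start a search from
        visited = {start}
        for kind, name in chain[names.index(start) + 1:]:
            if kind == "element":
                if name not in outage:
                    outage.append(name)  # corresponds to marking the element as 0
            else:
                visited.add(name)
                if name in boundary:
                    break  # reached the boundary breaker on the other side
        pending = [cb for cb in pending if cb not in visited]
    return outage


print(search_fault_area(chain, reported, boundary))
```

The walk passes CB2 even though its alarm was never received, which is how the unreported breaker still ends up inside the identified outage area in this sketch.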

Establish the Backward Reasoning Petri Net of Each Element
After the fault area is determined, the element set includes L3-18, B18, and L18-17. The backward reasoning Petri net is established by integrating the protection state information and circuit breaker state information at the dispatching terminal, as shown in Figure 12 below.

Formation and Data Processing of the Element Fault Hypothesis
According to Section 4, the fault hypotheses of each element and the data of each dimension are shown in Table 3 below. Then, data processing is carried out for each fault hypothesis according to step 5 in Section 6.2.4. Firstly, the fault hypotheses are screened preliminarily, and each element retains at most K fault hypotheses in the total set of fault hypotheses. In this example, K is 2, so the hypotheses of B18 must be reduced. The processing results are shown in Table 4 below. Since the total number of missing information and the number of missing protection information are negatively correlated with the credibility of the hypothesis, while the number of interpretable circuit breakers is positively correlated with it, the degree-score calculated from the number of interpretable circuit breakers is negated and then added to the degree-scores of the other two dimensions. The final degree-score function is therefore negatively correlated with the credibility of the hypothesis. In this case, the algorithm selects hypotheses No. 5 and No. 6 from the fault hypotheses of bus B18 and puts them in the total set of fault hypotheses.
Finally, the DFTS query algorithm is applied to the total set of fault hypotheses, and the results are shown in Table 5 below. According to the above table, the possible fault element is L3-18, and the missing information supplemented by the corresponding hypothesis is CB2 action information. Compared with the preset fault situation, the fault element and missing information are diagnosed correctly.

Single Fault Accompanied by Refuse-Operation of the Circuit Breaker
The fault at line L3-18 causes the main protection to operate and issue trip commands to the breakers CB1 and CB2. However, CB2 fails to operate. The remote backup protection of the adjacent line L18-17 operates and trips the breaker CB4. If the CB1 and CB4 breaker information were lost, their states would be judged directly from the electrical quantity and switching value. To make the case more demanding, the remote backup protection information of line L18-17 is therefore chosen as the missing alarm information in this case. The intermediate process is similar to Section 7.1 and is not repeated here. The fault hypotheses and final sequencing are shown in Table 6 below. According to the above table, the possible fault element is L3-18, and the missing information supplemented by the corresponding hypothesis is the remote backup protection information of line L18-17. Compared with the preset fault situation, the fault element and missing information are diagnosed correctly.

Double Faults Accompanied by Refuse-Operation of the Circuit Breaker
As shown in the following Figure 13, suppose that double faults occur at the bus B13 and the line L13-14. The fault at bus B13 leads to operation of the main protection and trips the breakers CB1, CB2, and CB3. The fault at the line L13-14 leads to the operation of main protection and trips the circuit breakers CB1 and CB2, but CB2 fails to operate. The remote backup protection of the adjacent line L14-15 operates and trips the breaker CB4. The remote backup protection of the adjacent line L4-14 operates and trips the circuit breaker CB7. The alarm information of breaker CB1 is designed as the missing alarm information in this case. Considering the electrical quantity and switching value, the final fault area is determined as shown in Figure 14.

The fault hypotheses and final sequencing are shown in Table 7 below. As shown in the above table, the possible fault elements are L13-14, B13, B14, etc., in order. The missing information supplemented by the corresponding hypothesis is CB1 action information. Compared with the preset fault condition, the hypotheses that are most consistent with the design condition are ranked first and second, respectively.

Double Faults Accompanied by Maloperation of the Breaker
As shown in Figure 15, suppose that double faults occur at bus B23 and line L16-24. The fault at bus B23 causes the main protection to operate and trip breakers CB0, CB1, and CB2. The fault at line L16-24 causes the main protection to operate and trip circuit breakers CB4 and CB5. The remote backup protection of line L23-24 maloperates and trips breaker CB3. The alarm information of breaker CB4 is set as the missing alarm information in this case. Considering the electrical quantity and switching value, the final fault area is determined as shown in Figure 16.

The diagnosis results are shown in Table 8 below. As shown in the table, the possible fault elements, in ranked order, are B23, L16-24, and so on. The missing information supplemented by the corresponding hypothesis is the CB4 action information. The hypotheses most consistent with the preset fault condition are ranked first and second.
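
The bookkeeping behind "missing information supplemented by the hypothesis" can be sketched with a simple set comparison in Python. The breaker labels come from the scenario above; the function name `compare_alarms` and the restriction to breaker trip alarms (protection alarms are left out) are assumptions made for illustration, and the sketch does not reproduce the backward reasoning Petri net itself.

```python
from typing import Iterable, List, Tuple


def compare_alarms(
    expected: Iterable[str], received: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare the breaker trips a hypothesis expects with those received.

    Returns (missing, unexplained):
      missing     - expected by the hypothesis but absent from the alarms,
                    i.e. information the hypothesis would have to supplement;
      unexplained - received but not expected, i.e. candidate maloperations.
    """
    expected_set, received_set = set(expected), set(received)
    missing = sorted(expected_set - received_set)
    unexplained = sorted(received_set - expected_set)
    return missing, unexplained


# Hypothesis: faults at bus B23 and line L16-24 (breaker trips from the text).
expected_trips = ["CB0", "CB1", "CB2", "CB4", "CB5"]
# Received alarms: CB4 is lost in transmission, CB3 trips by maloperation.
received_trips = ["CB0", "CB1", "CB2", "CB3", "CB5"]

missing, unexplained = compare_alarms(expected_trips, received_trips)
print(missing)      # ['CB4']
print(unexplained)  # ['CB3']
```

Under the correct hypothesis, the only alarm to be supplemented is the CB4 action and the only unexplained trip is CB3, which matches the preset condition of this case.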