Fault Diagnosis System of Power Grid Based on Multi-Data Sources

Abstract: To diagnose power grid faults accurately, rapidly and comprehensively, a power grid fault diagnosis system based on multi-data sources is proposed. The integrated system uses accident-level information, warning-level information and fault recording documents, and outputs a complete diagnosis and tracking report. According to the timeliness of the three types of information transmission, the system is divided into three subsystems: a real-time processing system, a quasi-real-time processing system and a batch processing system. The complete work is realized through the cooperation between them. While the real-time processing system completes fault diagnosis of elements, it also screens out incorrectly operating protections and circuit breakers and judges the loss of accident-level information. The quasi-real-time processing system outputs the reasons for incorrect actions of protections and circuit breakers while allowing for partial loss of warning-level information. The batch processing system corrects the diagnosis results of the real-time processing system and outputs fault details, including fault phases, types, times and locations of faulty elements. The simulation results and tests show that the system can meet actual engineering requirements in terms of execution efficiency and fault diagnosis and tracking effect. It can be used as a reference for self-healing and maintenance of power grids and has good application value.


Introduction
The transmission network is the main component of the power system and is responsible for large-capacity power transmission tasks. Therefore, the safety of the transmission network is of vital importance to the entire power system. When a transmission network fails, quickly identifying faulty elements is the primary task of fault diagnosis, and it is also a prerequisite for the smart grid to be self-healing. In order to carry out follow-up maintenance and record fault history information, it is necessary to store fault details such as the locations, types, phases and times of the faults on the faulty elements. In addition, it is also indispensable to find the circuit breakers and relay protection devices that did not operate correctly at the time of the fault and to track the reasons for their incorrect operations. Therefore, a complete fault diagnosis system that organically combines the above tasks is urgently needed in practical engineering applications.
When it comes to transmission network operation, power flow [1], optimal power flow [2], optimal power flow [3], security analysis [4] and state estimation [5] are indispensable and important computational tools. Reference [1] proposed a robust and efficient LF solver based on the Bulirsch-Stoer algorithm, which solves the load-flow (LF) problem of super-large-scale systems and improves computational performance. Reference [2] used marine predator algorithm (MPA) to solve the multi-region optimal power flow (OPF) problem considering renewable energy sources and load variability. Reference [3] proposed a methodology to quickly obtain the saddle-node bifurcation points of power systems, which significantly reduces the computational cost. Reference [4] proposed an online line switching methodology to alleviate overloads with look-ahead capability, which provides a set of high-quality line switching solutions and improves speed (online application) and accuracy. Reference [5] proposed an improved probabilistic load and distributed energy resources (DERs) modeling as pseudo-measurements, which has the ability to estimate the states of the power grid with high accuracy and short computational time. These improved new technologies (especially the method proposed in reference [5]) enable the dispatching center to monitor and estimate the operating status of the entire power grid more accurately and quickly, providing feasibility for fault diagnosis.
Appl. Sci. 2021, 11, 7649
Traditional power grid fault diagnosis methods, including expert system [6], Petri net [7] and other diagnostic algorithms, are based on protection operation information and circuit breaker tripping information. These methods could accurately diagnose faulty elements under the premise of complete information. Reference [8] proposed a fault diagnosis method under the condition of partial circuit breaker information loss, which supplemented practical application. As the data sources available for fault diagnosis become more abundant, multi-source information fusion [9] has also been applied to the field of fault diagnosis. However, these methods can only be used to determine faulty elements and cannot provide fault details.
In order to obtain the fault details, the recording wave data were also applied to the field of fault diagnosis by scholars as an information source [10]. However, due to the large storage space required by the fault recording documents, the recorded wave data cannot be uploaded to the dispatching end in real-time. Therefore, the fault diagnosis method based on the recorded wave data as the data source has the problem of time efficiency.
Fault tracking [11] refers to finding reasons for the incorrect actions of relay protection devices or circuit breakers by investigating the relevant alarm data in substations. Reference [12] used the inference chain and Bayesian network to track the faults of relay protection devices and achieved favorable results. However, it did not fully consider the problem of partial alarm data information loss. There was a possibility that some fault reasons were lost or misjudged.
Fault diagnosis methods based on multiple information sources [13,14] can provide a complete fault diagnosis report, including various fault details. However, former related studies have some shortcomings. The fault diagnosis method proposed in reference [13] fails to consider the time efficiency. Reference [14] evaluated the action behavior of protections and circuit breakers based on fault recording data, which may lead to low maintenance efficiency of relay protection devices and circuit breakers in practical engineering applications.
Considering the shortcomings of existing fault diagnosis and tracking methods and applications, this paper proposes a fault diagnosis system based on multi-data sources in order to achieve rapid, accurate and comprehensive fault diagnosis and tracking. The main work is as follows: (1) Show the framework and the operation process of the fault diagnosis system; (2) Divide the whole system into three subsystems that are implemented separately, and design the cooperation scheme between them to complete the whole fault diagnosis and tracking work; (3) Take a fault scenario on the IEEE 30-bus system as an example to verify the system.
A transverse comparison between this system and related technologies is made to prove the advantages of this system in terms of computational efficiency, reliability, and processing the problem of partial information loss.

System Framework and Operation Process
In this article, the fault diagnosis system is divided into a real-time processing system, quasi-real-time processing system and batch processing system from the timeliness of information transmission. The system framework is shown in Figure 1.

Figure 1. System framework.

Brief Introduction of Fault Diagnosis Real-Time Processing System
The real-time processing system mainly deals with accident-level information, including protection action information and circuit breaker tripping information, which is reported by substations in real time. Therefore, this subsystem at the dispatching center responds to the information in real time and diagnoses grid faults rapidly. It identifies the faulty elements and the circuit breakers and protections that maloperated or refused to operate, and it analyzes the loss of accident-level information.

Brief Introduction of Fault Tracking Quasi-Real-Time Processing System
The quasi-real-time processing system mainly deals with warning-level information, including the online monitoring information of circuit breakers and warning information of relay protection devices. Warning-level information is sampled in the substation by polling and is not reported to the dispatching center. When maloperation and refuse-operation information of circuit breakers and protections is transmitted from the dispatching center to substations, the quasi-real-time processing systems at those substations start. According to the warning-level information of the related circuit breakers and relay protection devices, the reasons for incorrect actions are output, taking into account the possible loss of partial warning-level information.

Brief Introduction of Fault Diagnosis Batch Processing System
The batch processing system mainly deals with fault recording documents. Due to the large storage space and low real-time requirements of fault recording documents, this subsystem at dispatching center inputs fault recording documents in batches in the form of multiple sets of documents and schedules them in the way of workflow. Then, this subsystem verifies and corrects the diagnosis result of the real-time processing system and finally outputs fault details, including fault locations, phases, types and times.

System Operation Process
The complete operation process of fault diagnosis and tracking is completed by the cooperation of the three subsystems, described as follows: After the fault occurs, the dispatching center first obtains accident-level information from the scene. The real-time processing system builds the information representation of the power grid network topology and the accident-level information and then finds the suspicious faulty elements through the fault pre-diagnosis link. In the process of fault diagnosis and information loss analysis, faulty elements are determined, and the operation statuses of related protections and circuit breakers are analyzed. Then, through the classification of action protections, this subsystem judges whether information loss has occurred. When information loss occurs, a set of suspected faulty elements is given.
After the operation statuses of circuit breakers and protections have been analyzed, the incorrect action information of circuit breakers and protections is transmitted to the corresponding substations. Each corresponding substation immediately starts its quasi-real-time processing system. This subsystem extracts, from the warning-level information, the fault features of the relay protection devices and circuit breakers related to incorrect actions and compares them with the relation table between fault reasons and features. Through the fault reason tracking link, the possible fault reasons for incorrect actions are ranked in order to guide the maintenance work of the corresponding relay protection devices and circuit breakers.
After getting the diagnosis result, the dispatching center retrieves the related fault recording documents from the fault recording information net and starts the batch processing system. When partial accident-level information is lost, this subsystem first extracts the relevant fault recording wave data of the suspected faulty elements output by the real-time processing system and diagnoses them. The subsequent procedure is the same as when no information loss occurs: fault recording wave data of the faulty elements that have been diagnosed are extracted to verify whether a fault occurred, and the fault phases, types, times and locations of the faulty elements are output to form a complete diagnosis report.
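The hand-over between the three subsystems described above can be sketched as follows. This is only an illustrative outline: `RealTimeResult`, `run_operation_process` and the two callback parameters are hypothetical names that do not come from the paper, and the batch stage is simplified to a single call over the union of diagnosed and suspected elements.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class RealTimeResult:
    """Hypothetical container for the output of the real-time subsystem."""
    faulty: Set[str]                                   # elements diagnosed as faulty
    suspected: Set[str] = field(default_factory=set)   # filled only on information loss
    incorrect: Dict[str, List[str]] = field(default_factory=dict)  # substation -> devices


def run_operation_process(
    rt: RealTimeResult,
    track_reasons: Callable[[str, List[str]], Dict[str, List[str]]],
    analyse_recordings: Callable[[Set[str]], Dict[str, dict]],
) -> dict:
    # Quasi-real-time stage: every affected substation tracks its own devices.
    reasons: Dict[str, List[str]] = {}
    for substation, devices in rt.incorrect.items():
        reasons.update(track_reasons(substation, devices))

    # Batch stage (simplified): suspected elements are only present when
    # accident-level information was lost; diagnosed elements are always verified.
    to_check = set(rt.faulty) | set(rt.suspected)
    details = analyse_recordings(to_check)
    return {"reasons": reasons, "details": details}
```

The callbacks stand in for the quasi-real-time and batch subsystems, so the sketch shows only the order of the stages and which results are passed between them.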

Realization of Fault Diagnosis Real-Time Processing System
Many symbols are involved in this chapter. The main symbols are shown in Table 1.

Table 1. Main symbols (1).
Symbol | Meaning
V class | The class of all electrical elements in power grid
V_e | The class of partial electrical elements containing transformers, buses, lines and generators in power grid
B class | The class of circuit breakers in power grid
V_ot | The class of electrical elements other than V_e class elements and B class elements
V_T | The class of transformers
V_G | The class of generators
V_L | The class of lines
V_B | The class of buses
G class | The class of the power grid network
P class | The class of protections
MP | The class of main protections
TP | The class of transformer protections
RBP | The class of remote backup protections
NBP | The class of near backup protections
BP | The class of bus protections
P * | The set of the grid's action protections
V_e1 (2) | The initial set of suspicious faulty V_e class elements
V_e2 (2) | The set of suspicious faulty V_e class elements
V_e3 (2) | The initial set of faulty V_e class elements
V_e4 (3) | The set of faulty V_e class elements
V_e5 (4) | The set of suspected faulty V_e class elements
P_1 (3) | The set of maloperation protections
P_2 (3) | The set of refuse-operation protections
P_3 (4) | The set of unprocessed protections
CB_1 (3) | The set of maloperation circuit breakers
CB_2 (3) | The set of refuse-operation circuit breakers
(1) Only the main symbols are listed here. Other symbols (especially symbols used in calculations) are defined in the corresponding sections. All symbols in the main text of this article are expressed in italics. (2) V_e1, V_e2 and V_e3 are defined in detail in Section 3.2. (3) V_e3, P_1, P_2, CB_1 and CB_2 are defined in detail in Section 3.3.1. (4) V_e5 and P_3 are defined in detail in Section 3.3.2.

Object-Oriented Information Representation Method of Electrical Elements, Protection Information and Power Grid Network Topology
In this article, the object-oriented representation method is used to classify and design all electrical elements, protections and network topology in the power grid, as shown in Figure 2. In the object-oriented representation method, a subclass inherits all the attributes and methods of its parent class and has its own additional attributes, while a single object is an instantiation of the subclass. The inheritance relationship between parent class and child class aptly reflects the relationship between the power grid network and its various electrical elements.
Compared with other representation methods, object-oriented representation of electrical elements, protections and network topology in the power grid has many advantages:
1. The system is easier to maintain. When the operating status of certain elements or parts of the power grid changes, only one single object or partial module needs to be modified;
2. Code reuse and system development efficiency are improved. Electrical components, their connection relationships and protections in the power grid are abstracted into categories, and logical thinking methods closer to nature are adopted, which can reduce the amount of repetitive code and the amount of follow-up work development;
3. The system functions are easier to expand. Based on the characteristics of inheritance, encapsulation and polymorphism, a system structure with high cohesion and low coupling can be designed to reduce the complicated process of conversion and mapping from the actual grid fault to the diagnosis system model.
From the perspective of graph theory mentioned in reference [7], the topological structure of a single grid can be represented by an undirected graph G = (V*, E). The vertex set V* is composed of all electrical elements in the grid, and the physical connections between elements constitute the arc set E.
In this article, electrical elements are divided into three categories, namely V* = V*_e ∪ B* ∪ V*_ot. B* denotes the circuit breaker set. V*_e = V*_L ∪ V*_B ∪ V*_T ∪ V*_G, where V*_L is the line set, V*_B is the bus set, V*_T is the transformer set and V*_G is the generator set. V*_ot denotes other elements (including load nodes, etc.).
In Figure 2, V_e class is the abstraction of concrete objects in V*_e. V_L class is the abstraction of concrete objects in V*_L. V_B class is the abstraction of concrete objects in V*_B. V_T class is the abstraction of concrete objects in V*_T. V_G class is the abstraction of concrete objects in V*_G. V_ot class is the abstraction of concrete objects in V*_ot. B class is the abstraction of concrete objects in B*.
G class represents the power grid network, whose topology relationship is defined in a nested way: G stores the set of V_e class elements and the set of circuit breakers. The connection relationship between a V_e class element and other V_e class elements and the connected circuit breakers are stored in the attribute C (adjacency list) of the element. The connection relationship between a circuit breaker and V_e class elements is stored in the attribute TV of the circuit breaker.
The attribute n of a V_e class element represents the number of other V_e class elements directly connected to it. Regardless of whether a circuit breaker was opened or closed by a control command or passively, the attribute S of the circuit breaker represents its switching state (1 represents the open state and 0 represents the closed state). Attribute PBF stores all protections related to the circuit breaker. Attribute PB stores the action protections related to the circuit breaker.
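The nested topology representation described above can be illustrated with a minimal sketch. The attribute names C, TV, S, PBF, PB and n follow the text; the layout of C as a mapping from neighbouring element to connecting breaker, the `connect` helper and the wiring of the small instance are assumptions made for illustration and are not taken from Figure 2 or Figure 3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Breaker:
    """B class: one circuit breaker."""
    name: str
    TV: List[str] = field(default_factory=list)   # V_e class elements it connects
    S: int = 0                                    # 1 = open, 0 = closed
    PBF: List[str] = field(default_factory=list)  # all related protections
    PB: List[str] = field(default_factory=list)   # related protections that acted


@dataclass
class Element:
    """V_e class: parent of lines, buses, transformers and generators."""
    name: str
    # Assumed layout of C: neighbouring element name -> connecting breaker name.
    C: Dict[str, str] = field(default_factory=dict)

    @property
    def n(self) -> int:
        """Number of other V_e class elements directly connected."""
        return len(self.C)


@dataclass
class Line(Element):
    """V_L class."""


@dataclass
class Bus(Element):
    """V_B class."""


@dataclass
class Grid:
    """G class: stores the V_e class elements and the circuit breakers."""
    elements: Dict[str, Element] = field(default_factory=dict)
    breakers: Dict[str, Breaker] = field(default_factory=dict)

    def connect(self, a: str, b: str, breaker: str) -> None:
        """Record that elements a and b are joined through the given breaker."""
        self.elements[a].C[b] = breaker
        self.elements[b].C[a] = breaker
        self.breakers[breaker].TV = [a, b]


grid = Grid()
for el in (Bus("B1"), Line("L1"), Line("L2")):
    grid.elements[el.name] = el
for cb in ("CB1", "CB4"):
    grid.breakers[cb] = Breaker(cb)
grid.connect("B1", "L1", "CB1")  # hypothetical wiring
grid.connect("B1", "L2", "CB4")  # hypothetical wiring
grid.breakers["CB1"].S = 1       # CB1 tripped (open)
```

Deriving n from C rather than storing it separately keeps the two attributes consistent when the topology changes, which is one possible reading of the maintainability argument made earlier.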
The set of action protections P * is the main information source used in a real-time processing system for fault diagnosis. The protection class (P class) is divided into five subclasses, as shown in Figure 2 (other protections that do not relate to fault diagnosis are not considered in this article). The attribute p (namely correctness of the protection), which has a detailed calculation method in Section 3.3.1, indicates the probability of protection operating correctly according to its setting principle. The protection range is represented by attributes PV and N: PV represents the nearest V_e class element in the protection range, which determines the direction of the protection range. N denotes the number of V_e class elements in the shortest path between the farthest V_e class element in the protection range and PV of the protection (containing PV and the element itself). N determines the extension of the protection range.
The attribute t (namely protective action time) indicates the sum of the action time of the protection relay and delay of the related circuit breaker. What needs to be noted is that the initialization of the real-time processing system should wait for the information of all action protections to be uploaded to the dispatching center before starting. According to the general setting time of remote backup protections and the delays of circuit breakers, this paper sets 1.5 s as the time delay between receiving the first action protection and the initialization of the real-time processing system. Action protections that exceed this delay time value will not participate in fault diagnosis this time.
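The 1.5 s collection window can be illustrated with a short sketch. The function name `select_action_protections`, the report format (protection name, arrival time in seconds) and the sample timestamps are assumptions; only the 1.5 s value comes from the text.

```python
from typing import List, Tuple

COLLECTION_WINDOW_S = 1.5  # delay stated in the text


def select_action_protections(
    reports: List[Tuple[str, float]], window: float = COLLECTION_WINDOW_S
) -> List[str]:
    """Keep protections reported within `window` seconds of the first report."""
    if not reports:
        return []
    t0 = min(t for _, t in reports)
    ordered = sorted(reports, key=lambda r: r[1])
    return [name for name, t in ordered if t - t0 <= window]


# Hypothetical arrival times: the third report arrives 1.8 s after the first.
reports = [("BP_B1", 10.00), ("RBP_L2_B3", 10.95), ("LATE_P", 11.80)]
selected = select_action_protections(reports)
```

With these sample values, the late report is excluded from the current diagnosis run, matching the rule that protections exceeding the delay do not participate.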
All kinds of attributes and methods of classes include but are not limited to the content in Figure 2. Other attributes and methods will be defined in the corresponding chapters later.
For example, the method to obtain the protection range of a protection is shown in Algorithm 1.
Algorithm 1 Protection range search algorithm: Scope_Protection (P, G, P.V_e, P.tree)
Input: a protection P and the power grid network G.
Output: the set of all V_e class elements within the protection range of P (P.V_e) and the protection range tree of P (P.tree).
The algorithm procedure is as follows:
Step (1) Input P and G.
Step (2) Taking P.PV as the starting point, a depth-first search of G is carried out. A single branch search is cut off when there are P.N V_e class element nodes on the path of the branch or when a generator node or load is reached. The tree structure P.tree = (V, E) is formed from the search results, where V is the finite set of electrical components in the power grid, E is the directed connection relationship formed by the depth-first search of the power grid topology from P.PV as the root node, and P.PV ∈ V.
Step (3) Take the related circuit breaker set P.TB of P and cut off the branches of P.tree containing elements in P.TB to form a new P.tree.
Step (4) Traverse all V_e class element nodes in P.tree and place them in P.V_e.
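The protection range search of Algorithm 1 can be sketched as follows. This is a simplified illustration under several assumptions: the grid is given as a plain adjacency mapping rather than the G class, branches through breakers in P.TB are skipped during the search instead of being cut afterwards, a generator reached by the search is kept in the range while a load is not, and the small network used at the end is hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# element -> list of (breaker, neighbouring element)
Adjacency = Dict[str, List[Tuple[str, str]]]


def scope_protection(
    adj: Adjacency, kind: Dict[str, str], pv: str, n: int, tb: Set[str]
) -> Tuple[List[str], List[Tuple[str, str, str]]]:
    """Return (P.V_e, edges of P.tree) for a protection with attributes PV, N, TB."""
    elements: List[str] = [pv]
    tree: List[Tuple[str, str, str]] = []

    def dfs(node: str, count: int, path: FrozenSet[str]) -> None:
        # Cut-off: N elements already on this path, or a generator/load reached.
        if count >= n or kind.get(node) in ("generator", "load"):
            return
        for breaker, nxt in adj.get(node, []):
            if breaker in tb or nxt in path:
                continue  # branch through a related breaker, or a loop
            tree.append((node, breaker, nxt))
            if kind.get(nxt) != "load" and nxt not in elements:
                elements.append(nxt)
            dfs(nxt, count + 1, path | {nxt})

    dfs(pv, 1, frozenset([pv]))
    return elements, tree


# Hypothetical wiring: B3 -CB5- L2 -CB4- B1 -CB1- L1, and B1 -CB2- L3.
adj = {
    "L2": [("CB5", "B3"), ("CB4", "B1")],
    "B1": [("CB4", "L2"), ("CB1", "L1"), ("CB2", "L3")],
    "B3": [("CB5", "L2")],
    "L1": [("CB1", "B1")],
    "L3": [("CB2", "B1")],
}
kind = {"L1": "line", "L2": "line", "L3": "line", "B1": "bus", "B3": "bus"}

backup_range, backup_tree = scope_protection(adj, kind, "L2", 2, {"CB5"})
bus_range, _ = scope_protection(adj, kind, "B1", 1, {"CB1", "CB2", "CB4"})
```

In this sketch, a backup protection with PV = L2, N = 2 and related breaker CB5 covers L2 and B1, so the related breaker fixes the direction of the range and N its extent, as described for the attributes PV and N.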
An example of a simple power grid fault is shown in Figure 3. The fault description is as follows: Bus B1 fails. The B1 bus protection (BP_B1) operates. Circuit breakers CB1 and CB2 trip. Circuit breaker CB4 refuses to operate. The fault is removed by the remote backup protection (RBP_L2_B3) on the B3 side of line L2. Circuit breaker CB5 trips. The action protections received by the dispatching center are BP_B1 and RBP_L2_B3, and the tripping circuit breakers are CB1, CB2 and CB5.
According to the object-oriented representation method in Figure 2, B1, L1, L2, CB1, CB2, CB4, CB5, BP_B1 and RBP_L2_B3 are taken as examples for object instantiation, as shown in Tables 2 and 3.
When an element fails, the protections of the faulty element itself act first to remove the fault. When related protections or circuit breakers refuse to act, the fault is removed by the backup protections of the adjacent elements, which is the embodiment of the selectivity of stage relay protection. Therefore, whether an element is faulty can be judged by its own protection and circuit breaker information and its adjacent elements' backup protection and circuit breaker information. A tree structure is defined to store the topological relationship of the element waiting for diagnosis; it contains the relevant protection and circuit breaker information, the adjacent elements and their connection relationships.
Definition 1. Protection information connection tree: V_root.Tree = (V, E). The protection information connection tree is an attribute of V_e class. V is a finite set of electrical elements in the power grid. E is the directed connection relationship formed by the depth-first search of the power grid topology from V_root. A single branch search terminates when it reaches the farthest circuit breaker node from V_root within the protection range of the related protections of the circuit breaker, or when it reaches a V_G class element or load node. V_root ∈ V, and V_root is the root node of the tree structure.
Taking L1, L2 and B1 in Figure 3 as examples, their protection information connection trees are constructed as shown in Figure 4.
Figure 4. Protection information connection tree of L1, L2 and B1.

Fault Pre-Diagnosis
In this section, this article first uses an algorithm to process accident-level information received by the dispatching center and then uses the reverse reasoning Petri net proposed in reference [7] for fault diagnosis inference and calculation.
In practical engineering, the topology may be complex and the number of elements large. Traversing every V_e class element in the diagnosis process is time-consuming and labor-intensive, which cannot meet the needs of practical engineering. Therefore, the algorithm searches the topology of the power grid through the set of action protections (P*) and gets all elements within the protection range of each protection in P* to form the initial set of suspicious faulty V_e class elements.
The sufficient and necessary condition for the existence of a token in the final place of a Petri net is that all the second layer places of the net can obtain tokens after iterative calculations. According to the mapping relationship between the tree structure and the reverse reasoning Petri net, for a V_e class element, only when the number of tripping circuit breakers (for which the element is within the protection range of the action protection related to the circuit breaker) exceeds the number of adjacent list edges (n) of the element in its protection information connection tree can all the second layer places of the Petri net obtain tokens after iterative calculations. Through the screening of this condition, the range of suspicious faulty elements can be further reduced to form the set of suspicious faulty V_e class elements.
Definition 2. Eigenvalue: k. k is an attribute of V_e class. The k value of a V_e class element represents the number of tripping circuit breakers in the protection information connection tree of the element (for which the element is within the protection range of the action protection related to the circuit breaker).
The flow chart of Algorithm 2 is shown in Figure 5.
Algorithm 2. Preliminary search algorithm for suspicious faulty V_e class elements: Search_FaultV_e(G, P*, V_e1, V_e2)
Input: the power grid network G and the set of the grid's action protections P*.
Output: the initial set of suspicious faulty V_e class elements V_e1 and the set of suspicious faulty V_e class elements V_e2.
[Figure 5. Flow chart of Algorithm 2. Box labels: Input power grid network G and action protection set P*, and copy P* as P*m. Take an action protection in P*m as Pa; has Pa been taken? Do Algorithm 1 for Pa; add 1 to the k value of all elements in Pa.V_e and place the elements in V_e1. Is P*m empty? Copy V_e1 as V_e1m. Take a V_e class element in V_e1m as V_ea; has V_ea been taken? Place V_ea in V_e2. Is V_e1m empty? Finish, output V_e1 and V_e2.]
After V_e2 has been obtained, the protection information connection tree is constructed for each element in V_e2. The protection information connection trees are mapped to reverse reasoning Petri nets, and iterative calculations are performed. The V_e class elements with a token in the final place are then placed into the initial set of faulty V_e class elements (V_e3).
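The screening performed by Algorithm 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `search_fault_elements`, the dictionary `protection_range` (standing in for the result of Algorithm 1) and the toy data are assumptions made for illustration, and the strict comparison `k > n` simply follows the wording "exceeds" used above.

```python
from collections import Counter


def search_fault_elements(action_protections, protection_range, edge_count):
    """Return (V_e1, V_e2, k) following the screening idea of Algorithm 2."""
    k = Counter()
    for protection in action_protections:
        # Stand-in for Algorithm 1: elements inside this protection's range.
        for element in protection_range[protection]:
            k[element] += 1
    v_e1 = sorted(k)  # initial set of suspicious faulty elements
    # Keep only elements whose eigenvalue k exceeds their edge count n.
    v_e2 = [e for e in v_e1 if k[e] > edge_count[e]]
    return v_e1, v_e2, dict(k)


# Hypothetical toy data (not taken from the paper's figures).
protection_range = {"P_a": ["E1"], "P_b": ["E1", "E2"]}
edge_count = {"E1": 1, "E2": 1}
v_e1, v_e2, k = search_fault_elements(["P_a", "P_b"], protection_range, edge_count)
print(v_e1, v_e2, k)
```

Only the elements that survive this screening need a protection information connection tree and a Petri net calculation, which is where the time saving comes from.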
The process is demonstrated with the simple grid shown in Figure 4 of Section 3.1.
Initialize G and input P* = {BP_B1, RBP_L2_B3}, then run Algorithm 2 to obtain V_e1 and V_e2. In Step (3) of Algorithm 2, the process of running Algorithm 1 for RBP_L2_B3 is shown in Figure 6a.
Map the protection information connection tree of B1 to a reverse reasoning Petri net and then perform iterative calculations, as shown in Figure 6b, where ★ marks the location of the token.
The final place of the Petri net of B1 holds a token, so B1 is placed in V_e3.

Fault Diagnosis Confirmation and Operation Status Analysis of Related Circuit Breakers and Protections
In this section, all the elements in V_e3 are investigated one by one to judge whether a fault occurs and whether relevant protections and circuit breakers operate normally.
In order to simplify the model, the state space of a single relay protection device (or a circuit breaker) is divided into two categories: device fault and device normal. The state transition diagram is shown in Figure 7.

According to Markov's state-space theory, the state transition matrix A of the device is given by Equation (1). It is assumed that the fault probability of a device is p_0 and the normal probability of the device is p_1. The sojourn probability matrix is T = [p_0 p_1], with T·A = T and p_0 + p_1 = 1. The normal probability of a device is defined as the correctness of the device, as in Equation (2). Referring to the statistics of the China Electric Power Research Institute on the operation of Chinese grid relay protection devices and circuit breakers, the correctness of each protection and circuit breaker in this article is calculated from the corresponding numerical values in references [15,16].
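As an illustration of the conditions T·A = T and p_0 + p_1 = 1, the following Python sketch computes the sojourn probabilities of a two-state chain. The parameterization of A by a per-step failure probability and a per-step repair probability, the function names and the numerical values are all assumptions made for this example; they are not taken from the paper or from references [15,16].

```python
def stationary_distribution(fail_prob, repair_prob):
    """Solve T.A = T with p0 + p1 = 1 for a two-state chain.

    State 0 is 'device fault', state 1 is 'device normal'.
    Assumed form of the transition matrix:
        A = [[1 - repair_prob, repair_prob],
             [fail_prob,       1 - fail_prob]]
    """
    p1 = repair_prob / (fail_prob + repair_prob)
    p0 = 1.0 - p1
    return p0, p1


def left_multiply(t, a):
    """Multiply the row vector t by the matrix a."""
    return [sum(t[i] * a[i][j] for i in range(len(t))) for j in range(len(a[0]))]


fail_prob, repair_prob = 0.02, 0.5  # hypothetical values
A = [[1 - repair_prob, repair_prob], [fail_prob, 1 - fail_prob]]
p0, p1 = stationary_distribution(fail_prob, repair_prob)
check = left_multiply([p0, p1], A)  # should reproduce [p0, p1]
print(p0, p1, check)
```

Under this assumed form, the correctness p_1 depends only on the ratio of the two transition probabilities, which is why it can be estimated from long-run operation statistics.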
Note that in actual engineering applications, accurate statistics should be determined according to product, manufacturer, year of production, and other details. The correctness of the protections and circuit breakers currently in use has reached a high level (at least 95%). Thus, the error between the general statistical value used in this paper and the actual statistical value is extremely small. The calculation error that this small error causes in the subsequent calculation is also extremely small and does not change the final judgment on whether circuit breakers and protections are operating correctly. This conclusion was reached by comparing the calculation with the extreme value (95%) against the calculation with the general statistical values. Therefore, to simplify the description and calculation, this article adopts the general statistical values.
Many types of protection relate to only one circuit breaker. Such protections are involved in only one branch of the protection information connection tree of a V_e class element. Assume that Vy (a V_e class element) fails. According to the principle of relay protection stage coordination, the circuit breaker that should trip at the z-th position in a branch of the protection information connection tree of Vy is CBy_z. In CBy_z.PBF, the protection that should act at the x-th position in order of action time is Py_z_x, and its protection correctness is denoted as p_Py_z_x.
It is assumed that the circuit breakers that should trip before CBy_z do not trip and that CBy_z operates normally. The probability that the action of Py_z_x removes the fault is given by Equation (3), and the probability that CBy_z trips to remove the fault is given by Equation (4). In Equation (4), zm denotes the number of protections in CBy_z.PBF that can be used to remove the Vy fault.
The original assumptions are now removed, and the only assumption is that CBy_z operates normally. The recurrence formula for the probability that CBy_z trips to remove the fault is given by Equation (5), with the initial condition G(y_1) = p(y_1).
When a Vy fault occurs, the recurrence formula for the probability that the action of Py_z_x removes the fault is given by Equation (6). The refuse-operation information of related circuit breakers is usually known. When a circuit breaker refuses to trip and the fault is removed by the next-level protection, the circuit breaker node should be skipped when calculating the probability.
For example, when calculating G(B1_CB4_RBP_L2_B3) in the simple grid fault shown in Figure 3, CB5 should be considered as the first circuit breaker node on the B1-CB4-L2-CB5 path. G(B1_CB4_RBP_L2_B3) = CB.p × RBP.p = 0.9312.
A protection may be involved in multiple branches (such as longitudinal differential protection). Assume that the protection that should act at the t-th position is Py_t. In the branches in which Py_t is involved, n related circuit breakers trip under the control of Py_t and k related circuit breakers refuse to trip. The probability that the action of Py_t removes the fault is given by Equation (7). When Vy fails, the joint action of ym protections causes the related circuit breakers to trip and remove the fault. The probability of a Vy fault is calculated by the transfer function f(·), as in Equation (8), and is defined as the fault correctness of Vy (namely p_Vy).
The calculation method of Gi is determined by protection i. When i is involved in multiple branches, Gi is calculated according to Equation (7). When i is involved in only one branch, Gi is calculated by Equations (3)-(6).
According to the relevant models and data in references [17,18], this article sets 0.8 as the boundary value for determining whether an element is faulty. When the fault correctness of an element is greater than or equal to 0.8, the element is judged to be faulty. Note that 0.8 is only the reference value used in this article and should be adjusted according to the actual probabilities in engineering applications. Elements whose fault correctness is lower than 0.8 are not necessarily fault-free; the system outputs their fault correctness as the probability of failure.
The essence of the fault correctness p_Vy is the conditional probability of a Vy fault when the operating information of the protections and circuit breakers related to Vy is known. According to Bayes' theorem, when the fault diagnosis is completed, the posterior probability of the action of the protection Px (namely, the action expectation of Px, E_Px) can be calculated according to Equations (9)-(11).

$$S_{Ni-Px} = \begin{cases} 0, & \text{Ni fault removed by a protection with priority higher than Px} \\ p_{Px}\, p_{Ni}, & \text{protections with priority higher than Px failed to remove the Ni fault} \end{cases} \tag{9}$$
In Equations (9)-(11), Ni denotes the i-th V_e class node in the N-th branch of the protection range tree of Px (the smaller the value of i, the closer the node is to the root node), and the initial condition is AS_N1−Px = S_N1−Px. In Equation (11), the AS value of each V_e class node is calculated only once and is not repeated.
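The action information of protection Px is transformed into the actual action value T_Px: when Px acts, T_Px = 1, and when Px does not act, T_Px = 0. The operation status of Px can be analyzed according to the difference between the action expectation and the actual action value, as shown in Equation (12):
$$\text{status of } Px = \begin{cases} Px \text{ operates normally}, & -0.2 < E_{Px} - T_{Px} < 0.2 \\ Px \text{ maloperates}, & E_{Px} - T_{Px} \le -0.2 \\ Px \text{ refuses action}, & E_{Px} - T_{Px} \ge 0.2 \end{cases} \tag{12}$$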
According to the above method to determine faulty elements and the method to analyze operation statuses of protections, an algorithm defined on the power grid network G is given below. The fault diagnosis of each element in V_e3 is performed, and the operation statuses of related protections are analyzed.
The implementation flowchart of Algorithm 3 is shown in Figure 8.
[Figure 8. Implementation flowchart of Algorithm 3. Box labels, in reading order:]
Input G and V_e3, and copy V_e3 as V_e3m.
Take an element in V_e3m as V_eb. Has V_eb been taken?
Take the protection information connection tree of V_eb as V_eb.Tree. By traversing V_eb.Tree, the circuit breakers that have a related action protection but remain closed are inserted into CB_1, and the circuit breakers that have no related action protection but tripped are inserted into CB_2.
Depth-first traversal of circuit breaker nodes is performed on V_eb.Tree. The circuit breaker tripped by a related protection whose protection range includes V_eb is denoted as CB_b. The protection processing value (SP) of CB_b.PB is increased by 1. The portion of V_eb.Tree following CB_b is cut off.
According to the path from the root node V_eb to CB_b on V_eb.Tree, the protections are placed in the set of fault-related protections G_RP in the order in which they are assumed to act when V_eb fails.
Copy G_RP as G_RPm.
Take a protection in G_RPm as Pc and invoke Pc.tree. Has Pc been taken?
Traverse the V_e class elements on Pc.tree by depth-first search. According to Equations (9)-(11), the action expectation (Pc.E) of Pc is calculated, and the actual action value of Pc is quantified as Pc.T.
Pc.E − Pc.T ≥ 0.2? Pc.E − Pc.T ≤ −0.2?
Pc is maloperation, put Pc in P_1. Pc is refuse-operation, put Pc in P_2. Pc operates normally.
Remove Pc from G_RPm. Have all protections in G_RPm been taken out?
Remove V_eb from V_e3m.
In the following, the simple grid fault shown in Figure 3 is used as an example to show the algorithm process.
After initialization, in Step 8 and Step 9 the action expectations of the protections related to the B1 fault are calculated: BP_B1.E = 0.9098 and RBP_L2_B3.E = 0.8957. As BP_B1.T = 1 and RBP_L2_B3.T = 1, the difference between the action expectation and the actual action value of each of the two protections lies in the range (−0.2, 0.2), so BP_B1 and RBP_L2_B3 operate normally.
In summary, the faulty element in the power grid is B1. CB4 refuses to trip. There is no protection maloperation or refuse-operation phenomenon.
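The comparison between action expectation and actual action value can be sketched in a few lines of Python. The function name `protection_status` and the returned labels are assumptions made for illustration; the threshold 0.2 and the two expectation values come from the example above.

```python
def protection_status(expectation, acted, threshold=0.2):
    """Classify a protection from its action expectation E and action flag.

    T is 1 when the protection acted and 0 otherwise. A large positive
    E - T means the protection was expected to act but did not
    (refuse-operation); a large negative E - T means it acted
    although it was not expected to (maloperation).
    """
    actual = 1 if acted else 0
    diff = expectation - actual
    if diff >= threshold:
        return "refuse-operation"
    if diff <= -threshold:
        return "maloperation"
    return "normal"


# Values from the B1 example: both protections acted (T = 1).
status_bp_b1 = protection_status(0.9098, True)
status_rbp = protection_status(0.8957, True)
print(status_bp_b1, status_rbp)
```

Both differences are about −0.09 and −0.10, inside (−0.2, 0.2), which reproduces the conclusion that BP_B1 and RBP_L2_B3 operate normally.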

Information Missing Analysis Based on Classification of Action Protections
When a V_e class element in the power grid fails, protections on both sides (or multiple sides) of the element should act together to disconnect related circuit breakers to remove the faulty element. However, in practical engineering, the protection action information and circuit breaker tripping information are often lost. In this circumstance, the dispatching center may only receive accident-level information of one side (or more sides, but less than the number of adjacency list edges of the element).
The purpose of circuit breaker tripping controlled by protection action is to remove a faulty V_e class element. The reverse reasoning Petri net used in the fault pre-diagnosis process in Section 3.2 is logically calculated from the purpose of circuit breaker tripping and protection action. The mapped reverse reasoning Petri net cannot work when the protection action and circuit breaker tripping information on the protection information tree of suspicious faulty elements are incomplete.
Although it is impossible to determine the potential faulty elements related to the missing part of the information, some suspected faulty elements can be speculated by classifying the known protection action information.
The protections whose action information can be used to diagnose suspicious faulty elements (namely, the action protections whose purpose is to remove one or more elements in V_e3) are defined as processed protections, and all other action protections are regarded as unprocessed protections.
Definition 3. Protection processing value: SP. SP is an attribute of the protection class used to classify action protections. The calculation method of SP is given in Step 4 of Algorithm 3. When the protection processing value SP of a protection is n, the protection has participated in the fault diagnosis calculation of n V_e class elements in V_e3. Therefore, a protection whose SP is no less than 1 is a processed protection, and the SP of an unprocessed protection equals the initial value 0.
The information of the processed protections and that of the unprocessed protections do not interfere with each other, so the original power grid network can be decomposed into two parts: the network with only the processed protections and the network with only the unprocessed protections. The fault diagnosis result of the original power grid is the superposition of the two diagnosis results. The fault diagnosis of the network with only processed protections is completed in Sections 3.2 and 3.3.1, while the fault diagnosis of the network with only unprocessed protections needs to be completed by the batch processing system.
Algorithm 4 completes the logic described in this section. Its steps are as follows:
Step (1): Input G and P*.
Step (2): Traverse P* and select the protections whose protection processing value (SP) is 0 to form P_3. If P_3 is an empty set, no partial information loss has occurred and the following steps are not carried out.
Step (3): Reconstruct the power grid network as G1. Only the topological connection of elements in G is retained, and all V_e class elements in G1 are initialized. The breaking and closing states (S) of circuit breakers related to protections in P_3 are retained, and the other circuit breakers are initialized.
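Step (2) of Algorithm 4 amounts to a filter on the protection processing value. The following Python sketch shows the idea; the function name `unprocessed_protections`, the dictionary `sp_values` and the protection names are hypothetical and serve only as an illustration.

```python
def unprocessed_protections(action_protections, sp_values):
    """Return P_3: the action protections whose SP is still 0."""
    return [p for p in action_protections if sp_values.get(p, 0) == 0]


# Hypothetical SP values after the real-time diagnosis has finished.
sp_values = {"P_a": 1, "P_b": 2, "P_c": 0}
p_3 = unprocessed_protections(["P_a", "P_b", "P_c"], sp_values)

# An empty P_3 means no partial information loss was detected.
information_lost = len(p_3) > 0
print(p_3, information_lost)
```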

Implementation of Fault Tracking Quasi-Real-Time Processing System
When the real-time processing system transmits the sets (CB_1, CB_2) of maloperation and refuse-operation circuit breakers and the sets (P_1, P_2) of maloperation and refuse-operation protections to the substations, the substations can call the warning information of the corresponding relay protection devices and the online monitoring information of the corresponding circuit breakers to screen the fault features of these devices. Based on these fault features and the relation table between fault reasons and features, the fault reasons can be tracked immediately by the corresponding algorithm.

Relation Table between Fault Reasons and Fault Features of SF6 Circuit Breaker
Circuit breakers are various, including oil circuit breakers, vacuum circuit breakers, SF6 circuit breakers and so on. Due to limited space, this article only studies the most widely used SF6 circuit breaker.
According to the relevant content in references [19,20], this article gives a table of partial fault features and partial fault reasons (including the prior probability of each reason), as shown in Table 4. Combined with engineering practice, the probability of each fault reason leading to the corresponding fault features is calculated. The relation table between fault reasons and fault features of the SF6 circuit breaker is constructed from the calculation results.
Table 5 is stored in substations in the form of an adjacency list. For example, the fault feature adjacency list of fault reason r1 is {f1:0.45, f2:0.4}, and the fault reason adjacency list of fault feature f1 is {r1:0.45, r4:0.55}. The relation table of fault reasons and features of the relay protection device is constructed in the same way as that of the circuit breaker. Due to limited space, the detailed construction process and data, which can be found in reference [19], are not given here.
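The two adjacency lists quoted above are two views of the same relation. A minimal Python sketch, assuming a plain dictionary representation (the variable names and the function `invert` are illustrative, and only the entries quoted in the text are filled in), derives the feature-to-reason view from the reason-to-feature view:

```python
# Reason -> {feature: probability that the reason causes the feature}.
# Only the entries quoted in the text are included; the real table is larger.
reason_to_features = {
    "r1": {"f1": 0.45, "f2": 0.40},
    "r4": {"f1": 0.55},
}


def invert(adjacency):
    """Build the feature -> {reason: probability} adjacency list."""
    inverted = {}
    for reason, features in adjacency.items():
        for feature, prob in features.items():
            inverted.setdefault(feature, {})[reason] = prob
    return inverted


feature_to_reasons = invert(reason_to_features)
print(feature_to_reasons["f1"])
```

Keeping both views makes the lookups in the tracking process direct: features are looked up by reason when estimating missing information, and reasons are looked up by feature when building the candidate set.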

Fault Reason Tracking Considering Partial Information Loss
Reference [13] gives a Bayesian network algorithm for fault tracking of relay protection devices. This article improves the algorithm to account for the loss of partial warning-level information. Table 5 can be regarded as the adjacency matrix M(R,S) between the fault reasons and features. M(R,S) is a sparse matrix with many zero elements. After M(R,S) is partitioned, the zero-valued part can be separated from the part that contains data. This shows that some fault reasons cause only the fault features of the block in which they are located, and the fault features within the same block may be related. When partial warning-level information is lost, the probability of occurrence of the fault features related to the lost information can be calculated from the fault features obtained in the same block.
The fault reason tracking process is as follows:
Step (1): After receiving the incorrect action information of a circuit breaker or relay protection device from the dispatching center, the fault features of the device are extracted from the warning-level information. The initial set of fault features (F_1) is obtained.
Step (2): Call the fault reason adjacency list of each feature in F_1 and obtain the initial set of fault reasons R_1.
Step (3): Determine the set of fault reasons (R_) and the set of fault features (F_) used in the following steps.
Step (4): Calculate, one by one, the conditional probability of each reason in R_ given the occurrence of each feature in F_ according to Equation (13):
$$p(rx \mid fy) = \frac{p_{rx}\, p(fy \mid rx)}{\sum_{ri \in \text{R\_}} p_{ri}\, p(fy \mid ri)} \tag{13}$$
In Equation (13), p_rx denotes the prior probability of the fault reason rx, p(fy|rx) represents the probability of the fault feature fy being caused by rx, and p(rx|fy) represents the conditional probability of rx when fy occurs.
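Equation (13) is Bayes' rule restricted to the candidate reasons. The following Python sketch evaluates it for one fault feature; the function name `posterior` and the prior probabilities are hypothetical, while the likelihood values are the entries of Table 5 quoted in the previous section.

```python
def posterior(prior, likelihood, feature):
    """Return p(r | feature) for every reason r, following Equation (13)."""
    weights = {r: prior[r] * likelihood[r].get(feature, 0.0) for r in prior}
    total = sum(weights.values())
    if total == 0:
        # No candidate reason can explain this feature.
        return {r: 0.0 for r in prior}
    return {r: w / total for r, w in weights.items()}


prior = {"r1": 0.2, "r4": 0.3}  # hypothetical prior probabilities
likelihood = {"r1": {"f1": 0.45, "f2": 0.40}, "r4": {"f1": 0.55}}
post_f1 = posterior(prior, likelihood, "f1")
post_f2 = posterior(prior, likelihood, "f2")
print(post_f1, post_f2)
```

Because the denominator sums only over the candidate set, the posteriors for each feature add up to 1 within that set, so the result depends on which reasons were admitted in the earlier steps.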
Step (5): First, calculate the Bayesian suspected degree of each reason in R_ according to Equation (14), where B(rx) denotes the Bayesian suspected degree of the fault reason rx.
When only part of the warning-level data is known, the conditional probability H_fi of fault feature fi is introduced to account for the partial information loss, and the Bayesian suspected degree is corrected to avoid a large deviation in the fault tracking results. Equation (15), which defines H_fi, distinguishes the case in which the related data of fi are known and fi ∉ F_1 from the case in which the related data of fi are unknown; in the latter case, H_fi = Σ_{rt∈R_} B(rt) p(fi|rt). According to Equation (16), the Bayesian suspected degree of each reason in R_ is calculated again and regarded as the revised suspected degree of each fault reason.
Finally, maintenance can be carried out rapidly according to the order of the revised suspected degree of each fault reason. The following example, based on the simple grid fault shown in Figure 4, illustrates the fault tracking process.
According to Section 3.3.1, the set of refuse-operation circuit breakers is {CB4}. The abnormal data and unknown data of CB4 obtained by the substation are shown in Table 6. From the data analysis, the initial set (F_1) of CB4 fault features is {f3, f4}. According to Steps (2) and (3), the set of fault reasons (R_) is {r1, r2, r3, r4, r5}, the set of fault features (F_) is {f1, f2, f3, f4, f5}. According to Step (4), the conditional probability of each reason in R_ under the premise of occurrence of each feature in F_ is obtained as shown in Table 7. According to Equation (14), the Bayesian suspected degree of each fault reason is calculated. Then calculate the revised suspected degree of each fault reason according to Equations (15) and (16). The results are shown in Table 8. By comparing Bayesian suspected degree and revised suspected degree, it can be found that the fault reason tracking method based on Bayesian suspected degree does not consider r1. However, due to the loss of information, r1 may occur in practical engineering. The revised suspected degree can give r1, which plays a guiding role in the subsequent maintenance work.
Staff can rapidly repair CB4 according to the order of r4, r3, r5, r2 and r1.

Implementation of Fault Diagnosis Batch Processing System
At present, the networking work aimed at the remote transmission of fault recording data for the entire power grid has been completed in many provinces of China. The batch processing system at the dispatching center can receive multiple sets of fault recording data documents in batches through the fault recording information net and schedule them in the form of a workflow. Given the diagnosis results of the real-time processing system, the batch processing system diagnoses the elements in V_e5, verifies the set of faulty elements (V_e4), and outputs the fault phases, fault types, fault locations and fault times.
The fault recording data processing flow is shown in Figure 9.
[Figure 9. Fault recording data processing flow. Box labels: Input V_e4 and V_e5. According to the COMTRADE recording document format and configuration information, get the fault recording wave data of the three-phase voltages and currents on each side of all elements in V_e4 and V_e5. The Fourier algorithm is used to extract the fundamental component values of currents and voltages, and the phasors are obtained. The mutation monitoring algorithm is used to determine the fault time and circuit breaker trip time of each data extraction point.]
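The phasor extraction step can be illustrated with the standard full-cycle discrete Fourier transform. The sketch below is an assumption about the general technique, not the specific Fourier algorithm of references [10,13]; the function name `fundamental_phasor`, the sampling rate of 32 samples per cycle and the test waveform are chosen for illustration only.

```python
import cmath
import math


def fundamental_phasor(samples):
    """Full-cycle DFT: peak-value phasor of the fundamental component.

    `samples` must cover exactly one fundamental period.
    """
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * i / n) for i, x in enumerate(samples))
    return 2.0 * acc / n


# Synthetic test signal: 10 * cos(wt + 30 degrees), 32 samples per cycle.
N = 32
amplitude, phase = 10.0, math.radians(30.0)
wave = [amplitude * math.cos(2 * math.pi * i / N + phase) for i in range(N)]

phasor = fundamental_phasor(wave)
magnitude = abs(phasor)
angle_deg = math.degrees(cmath.phase(phasor))
print(magnitude, angle_deg)
```

With recorded data, the same function would be applied to a sliding one-cycle window of each COMTRADE channel, which yields one phasor per channel and window position.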

The COMTRADE recording document format, Fourier algorithm, mutation monitoring algorithm, symmetrical component method, fault direction algorithm, single-end fault location method and fault phase and type judgment rule have been introduced in detail in references [10,13]. Due to limited space, they are not repeated here.

Description of Example and Running Process of Fault Diagnosis System
The fault diagnosis system and related algorithms have been implemented in the PyCharm 2020 programming environment. To test the reliability and efficiency of the system on rather complex fault problems, the IEEE 30-bus system, whose protection is configured according to the requirements of the 330-500 kV power grid in a province of China, is simulated in PSCAD.
A partial wiring diagram of the IEEE 30-bus system is shown in Figure 10, because the overall wiring diagram is too large to be displayed clearly. Some electrical elements unrelated to fault diagnosis are hidden in Figure 10. All electrical elements have been numbered.
The fault example used in this paper considers the simultaneous faults of three V_e class elements. Although the simultaneous faults of three V_e class elements are very likely to lead to unstable operation of the power system, the purpose of this example is to simulate extreme grid fault conditions to demonstrate the process and performance of this fault diagnosis system.
The fault description is as follows: Line L3 fault occurs. L3 mainline protection acts. CB6 refuses to trip. Remote backup protection on the B1 side of line L2 acts to remove the fault. At the same time, line L6 fault occurs. L6 mainline protection acts. CB15 refuses to trip. The fault is removed by the remote backup protection on the B6 side of line L8 but causes maloperation of the near backup protection on the B7 side of line L8. At the same time, the bus B15 fault occurs. B5 bus protection acts. CB58 refuses to trip, and the fault is removed by the remote backup protection on the B18 side of line L28.
Table 9.
After obtaining P_1 and CB_2, the corresponding substations start their quasi-real-time systems. Data mining is carried out on the warning-level information related to the maloperation protection and the refuse-operation circuit breakers. The fault reasons are then tracked.
The tracking process for the refuse-operation circuit breaker CB15 is taken as an example. The relevant data for CB15 are shown in Table 10. Following the fault reason tracking process in Section 4.1.2, the suspected degree of each fault reason for the CB15 refusal to operate is obtained, as shown in Table 11. CB15 is therefore overhauled in the sequence r4, r3, r2 and r1.
The reasons for the incorrect actions of CB58 and NBP_L8_B7 can be tracked in the same way.
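The overhaul sequence for CB15 amounts to sorting the candidate fault reasons by their suspected degree, from highest to lowest. The following is a minimal sketch of that step only. The function name `overhaul_order` and the numeric degrees are hypothetical: the values are not those of Table 11 and were chosen solely so that the resulting order matches the sequence stated in the text.

```python
def overhaul_order(suspected_degree):
    """Return fault-reason labels sorted from most to least suspected."""
    return sorted(suspected_degree, key=suspected_degree.get, reverse=True)


# Hypothetical suspected degrees (NOT the values of Table 11).
cb15_degrees = {"r1": 0.10, "r2": 0.20, "r3": 0.30, "r4": 0.40}

print(overhaul_order(cb15_degrees))  # ['r4', 'r3', 'r2', 'r1']
```

The same helper could be applied to the suspected degrees obtained for CB58 and NBP_L8_B7.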
After getting the diagnosis result of the real-time processing system, the dispatching center starts the batch processing system. The fault recording documents of the circuit breakers related to the elements in V_e5 (CB4, CB5, CB6 and CB7) are retrieved through the fault recording information network. Their waveforms are shown in Figure 11.
Blue represents phase A, green represents phase B, and red represents phase C. The meaning and reference direction of each current and each voltage are shown in Figure 10. According to the mutation monitoring algorithm [10], the minimum fault time is 0.3002 s. The positive-sequence fault components of each current and voltage and the zero-sequence fault components of the currents are calculated by the symmetrical component method [10]. According to the fault direction discrimination method [10], the fault direction is negative for I1, positive for I2, positive for I3 and negative for I4, so L3 is the faulty element.
The zero-sequence fault components of I1 and I2 are large. In addition, the phase A current phasor of I1 differs greatly from the phase B and phase C current phasors, while the difference between the phase B and phase C current phasors of I1 is almost zero. Thus an A-phase grounding short-circuit fault has occurred on L3. According to the single-end fault location method [13], the fault on L3 is located 50.2 km from CB6 and 149.8 km from CB7.
Fault details of L6 and B15 can be obtained in the same way. Owing to space limitations, they are not repeated here.
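The sequence-component calculation and the phase-selection reasoning applied to L3 can be sketched as follows. The decomposition is the standard symmetrical component (Fortescue) transform. The phase-selection check is only a simplified illustration of the rule described above, not the judgment rule of references [10,13]: the function names, the three threshold ratios and the sample phasors are all assumptions made for this example.

```python
import cmath
import math

# Fortescue operator a = 1 at an angle of 120 degrees.
A = cmath.exp(2j * math.pi / 3)


def symmetrical_components(ia, ib, ic):
    """Return the (zero, positive, negative) sequence phasors referred to phase A."""
    zero = (ia + ib + ic) / 3
    positive = (ia + A * ib + A * A * ic) / 3
    negative = (ia + A * A * ib + A * ic) / 3
    return zero, positive, negative


def looks_like_a_phase_ground(dia, dib, dic,
                              zero_ratio=0.1, similar_ratio=0.1, differ_ratio=0.5):
    """Heuristic check for an A-phase grounding fault from fault-component phasors.

    The three ratios are hypothetical thresholds, not values from the paper.
    """
    zero, _, _ = symmetrical_components(dia, dib, dic)
    largest = max(abs(dia), abs(dib), abs(dic))
    if largest == 0:
        return False
    has_zero_sequence = abs(zero) > zero_ratio * largest
    b_c_similar = abs(dib - dic) < similar_ratio * largest
    a_differs = (abs(dia - dib) > differ_ratio * largest
                 and abs(dia - dic) > differ_ratio * largest)
    return has_zero_sequence and b_c_similar and a_differs


# Hypothetical fault-component phasors (NOT read from Figure 11).
single_phase = (10 + 0j, 0.5 + 0j, 0.5 + 0j)   # phase A dominant, B and C nearly equal
three_phase = (10 + 0j, 10 * A * A, 10 * A)    # balanced set, no zero sequence

print(looks_like_a_phase_ground(*single_phase))  # True
print(looks_like_a_phase_ground(*three_phase))   # False
```

In practice the inputs would be fault-component phasors extracted from the fault recording documents, and the thresholds would have to be tuned to the measurement noise of the recorders.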

Time-Consuming Test and Analysis
To verify the computational efficiency of the system, a time-consuming test is carried out, and the result is shown in Figure 12. It can be seen from Figure 12 that the work and data acquisition of the three subsystems run in parallel and do not interfere with each other. The initialization of the real-time processing system takes a long time. If the grid topology is stored in the system in advance, this time can be reduced, although pre-initialization may take up a large amount of storage space. Since the quasi-real-time processing system needs to wait for the diagnosis results of the real-time processing system, pre-initialization also helps to improve the efficiency of fault tracking. The batch processing system needs to wait for the transmission and scheduling of the fault recording data, and its reasoning process is also not short.
Nevertheless, under ideal conditions, the faulty elements can be diagnosed about 2.9 s after the fault, the fault tracking process can be completed in about 4.2 s, and the detailed fault diagnosis report can be completed in about 5.0 s. This meets the actual needs of the power system.
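The way these completion times arise from parallel data acquisition combined with waiting on the real-time result can be illustrated with a small dependency calculation. This is a sketch under stated assumptions, not a reproduction of Figure 12: the function name `completion_times` and every per-stage duration below are hypothetical, chosen only so that the totals equal the approximate 2.9 s, 4.2 s and 5.0 s quoted above.

```python
def completion_times(stages):
    """Compute when each subsystem finishes, in milliseconds after the fault.

    stages maps a name to (data_ready_ms, processing_ms, upstream_name_or_None)
    and must list an upstream stage before the stages that wait on it.
    """
    done = {}
    for name, (data_ready, processing, upstream) in stages.items():
        # A stage starts once its own data has arrived AND its upstream result exists.
        start = data_ready if upstream is None else max(data_ready, done[upstream])
        done[name] = start + processing
    return done


# Hypothetical timings in milliseconds (NOT measured values from Figure 12).
stages = {
    "real_time": (400, 2500, None),
    "quasi_real_time": (1500, 1300, "real_time"),
    "batch": (3800, 1200, "real_time"),
}

print(completion_times(stages))
# {'real_time': 2900, 'quasi_real_time': 4200, 'batch': 5000}
```

With these assumed numbers, the quasi-real-time stage is limited by the real-time result, whereas the batch stage is limited by the arrival of its own fault recording data, which mirrors the two kinds of waiting described in the text.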
Time-consuming tests are also carried out with different fault examples on different power grid systems, and the results are shown in Table 12.
Notes to Table 12:
(1) The n-1 scenario is the scenario where only one V_e class element of the power grid fails. The n-2 scenario is the scenario where two V_e class elements fail at the same time. The n-3 scenario is the scenario where three V_e class elements fail at the same time.
(2) The time in the table is the time from the initialization of the real-time processing subsystem to the completion of the entire fault diagnosis system.
It can be seen from Table 12 that the fault diagnosis system not only performs well on relatively simple power systems (the IEEE 14-bus and IEEE 30-bus systems) but also maintains high efficiency on more complex power systems with a large number of buses. The topological relationship of the entire power grid is involved only in the initialization of the real-time processing system. The subsequent fault diagnosis and tracking steps do not search the entire power grid, so their operating efficiency has little to do with the complexity of the power grid. Therefore, the time consumed by the fault diagnosis system increases only slightly as the complexity of the power system increases, and the system maintains high computational efficiency.
However, as the number of simultaneously faulty elements increases, the time consumed by the fault diagnosis system increases considerably. This is essentially because the number of loop iterations required in the fault diagnosis and tracking steps grows as a multiple of the number of simultaneously faulty elements. In actual engineering applications, however, there are usually only one or two faulty elements at the same time, so this problem does not greatly reduce computational efficiency.
This paper also makes a transverse comparison between the proposed fault diagnosis system and related technologies, and the results are shown in Table A1 in Appendix A.
It can be seen from Table A1 that, compared with the other technologies, the fault diagnosis system proposed in this paper achieves both reliability and efficiency in fault diagnosis and fault tracking. At the same time, the system can still perform diagnosis and output results when partial information is missing. The output results of the system are also more comprehensive, which is conducive to the self-healing and maintenance of the power grid after a fault. Therefore, the system has preferable application value in actual engineering.

Conclusions
In this article, the fault diagnosis system is divided into a real-time processing system, a quasi-real-time processing system and a batch processing system according to the real-time requirements of data processing.
The real-time processing system in the dispatching center establishes a fault diagnosis model based on the object-oriented information representation of the power grid topology and electrical elements. According to accident-level information, this subsystem finds faulty elements with Algorithms 2 and 3, identifies incorrect actions of protections and circuit breakers with Algorithm 3, and analyzes the loss of partial accident-level information with Algorithm 4.
According to warning-level information, the quasi-real-time processing systems in the substations output the reasons for incorrect actions of circuit breakers and relay protection devices through the fault reason tracking process in Section 4.1.2. The batch processing system in the dispatching center verifies the diagnosis results of the real-time processing system and provides a detailed fault diagnosis report according to the fault recording documents and the related technologies in Section 4.2.
Tests in different power systems show that the fault diagnosis and tracking system can accurately and rapidly complete the fault diagnosis and tracking work in the case of partial information missing and output a complete fault report including faulty elements, fault phases, fault types, fault times, fault locations, protections and circuit breakers that operate incorrectly and reasons for incorrect actions.
Compared with the existing related fault diagnosis and tracking methods and systems, this system fully considers the current status of data sources available in the power system and the processing needs of dispatching centers and substations after fault, which improves the efficiency and reliability of fault diagnosis. It helps to rapidly complete the maintenance and self-healing work of the power grid after a fault and has a preferable application value.
Funding: This work was supported by the National Natural Science Foundation of China (No. 5187070349).

Data Availability Statement:
The data presented in this study are available in the article.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript or in the decision to publish the results.

Appendix A
Table A1. Transverse comparison.

Columns: Method; Fault Diagnosis; Fault Tracking; Partial Information Loss; Output Results.

Expert system used in reference [6]
Fault Diagnosis: The reasoning process needs to search the rule base and calculate cyclically, which involves hard disk files, so the diagnosis speed is slow.
Partial Information Loss: A large number of rules can also be listed for fault diagnosis during partial information loss, so fault diagnosis can still be performed.
Output Results: Faulty elements.

Petri net used in reference [7]
Fault Diagnosis: The initialization time is a bit long, but the diagnosis speed is fast.
Partial Information Loss: Fault diagnosis cannot be performed during partial information loss.
Output Results: Faulty elements.

Fault diagnosis method during partial information loss used in reference [8]
Fault Diagnosis and Partial Information Loss: Only the elements that may fail according to probability can be output during partial information loss.
Output Results: Suspicious faulty elements and the probabilities of their failure.

Fault tracking method based on inference chain and Bayesian network used in reference [12]
Fault Tracking: High efficiency but somewhat low reliability, because information loss is not completely considered.
Partial Information Loss: The problem of partial information loss is not considered completely and systematically.
Output Results: Reasons for incorrect actions of protections and circuit breakers.

Fault diagnosis method based on multiple information sources used in reference [13]
Fault Diagnosis: Fault diagnosis is based on fault recording data and accident-level information together. Reliability is high, but efficiency is low.
Fault Tracking: Unable to provide the reasons for incorrect actions.
Partial Information Loss: The problem of partial information loss is considered and solved.
Output Results: Fault diagnosis report including faulty elements, fault details (1), and incorrectly acting protections and circuit breakers.

Fault diagnosis method based on multiple information sources used in reference [14]
Fault Diagnosis: A colored Petri net is first used for fault diagnosis, and fault details are then supplemented based on fault recording data. Both reliability and efficiency are high.
Fault Tracking: Unable to provide the reasons for incorrect actions.
Partial Information Loss: The problem of information loss is not considered, so there is a risk that fault diagnosis cannot be performed when partial information is lost.
Output Results: Fault diagnosis report including faulty elements, fault details, and incorrectly acting protections and circuit breakers.

Fault diagnosis system proposed in this article
Fault Diagnosis: Fault diagnosis is completed through the cooperation of the real-time processing system and the batch processing system. Both efficiency and reliability are high.
Fault Tracking: Efficiency is as high as that of the method used in reference [12], and reliability is higher.
Partial Information Loss: Through the cooperation of the three subsystems, the problem of partial information loss is solved.
Output Results: Fault diagnosis report including faulty elements and fault details; fault tracking report including incorrectly acting protections and circuit breakers and the reasons for incorrect actions.

(1) The fault details include fault locations, phases, types and times of faulty elements.