From the perspective of topological structure, the graph approach that considers causal relationships is, in essence, a logic diagram [137]. More precisely, its nodes represent facts in FD tasks (e.g., causes, symptoms, and locations of faults), and its edges describe probabilistic relationships among the nodes. It is worth noting that these probabilistic relationships can be quantified, because they are usually encoded in the conditional probability density functions among nodes. Thus, uncertainty can be described and reasoned about within a uniform theoretical framework [138]. Moreover, depending on whether the connections among nodes are undirected or directed edges, graph methods can be divided into undirected and directed graphs.

In addition, mixed graphs, which build on undirected and directed graphs, have been introduced to meet the requirements of graph models in high-speed trains. They mainly include the undirected edge (–), the directed edge (→), and the bidirectional edge (↔) [139]. In fact, a graph model is a family of probability distributions, each of which satisfies a set of independence conditions encoded by the graph. Suppose $\mathrm{R}$ is a finite, non-empty set of vertices (variables), and $U,V\in \mathrm{R}$ ($U\ne V$). The set of edge types is $\epsilon =\left\{--,\to ,\leftarrow ,\leftrightarrow \right\}$, and $B\left(\epsilon \right)$ denotes the power set of $\epsilon $. A mixed graph $\mathrm{G}=(\mathrm{R},\mathrm{E})$ is an ordered pair of the vertex set $\mathrm{R}$ and a mapping $\mathrm{E}:\mathrm{R}\times \mathrm{R}\to B\left(\epsilon \right)$, satisfying:

Here, the vertex set and edge set of a mixed graph can also be written as $\mathrm{R}\left(G\right)$ and $\mathrm{E}\left(G\right)$, respectively. The relationships among vertices in the graph $G$ can be described as follows: if the graph has the edge $U-V$, $U\leftrightarrow V$, $U\to V$, or $U\leftarrow V$, then $U$ is a neighbor, spouse, parent, or child of $V$, respectively, written as $U\in n{e}_{G}\left(V\right)$, $U\in s{p}_{G}\left(V\right)$, $U\in p{a}_{G}\left(V\right)$, or $U\in c{h}_{G}\left(V\right)$. When $G=(V,E)$ and ${G}^{\prime}=({V}^{\prime},{E}^{\prime})$ are mixed graphs with ${V}^{\prime}\subseteq V$ and ${E}^{\prime}\subseteq E$, ${G}^{\prime}$ is called a subgraph of $G$.
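The neighbor, spouse, parent, and child relations and the subgraph condition can be illustrated with a small data structure. The following is a minimal Python sketch, not an implementation taken from the cited works; the class `MixedGraph`, its method names, and the component names in the example are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Edge types of a mixed graph: undirected (--), directed (->), bidirectional (<->).
UNDIRECTED, DIRECTED, BIDIRECTED = "--", "->", "<->"


class MixedGraph:
    """Illustrative mixed graph: each ordered vertex pair maps to a set of edge types."""

    def __init__(self):
        self.vertices = set()
        self.edges = defaultdict(set)  # (u, v) -> set of edge types from u to v

    def add_edge(self, u, v, kind):
        if u == v:
            raise ValueError("the definition requires U != V")
        self.vertices.update((u, v))
        self.edges[(u, v)].add(kind)
        if kind in (UNDIRECTED, BIDIRECTED):
            # Symmetric edge types are recorded in both directions.
            self.edges[(v, u)].add(kind)

    def _linked(self, src, dst, kind):
        return kind in self.edges.get((src, dst), set())

    def neighbors(self, v):  # ne_G(v): all u with u -- v
        return {u for u in self.vertices if self._linked(u, v, UNDIRECTED)}

    def spouses(self, v):  # sp_G(v): all u with u <-> v
        return {u for u in self.vertices if self._linked(u, v, BIDIRECTED)}

    def parents(self, v):  # pa_G(v): all u with u -> v
        return {u for u in self.vertices if self._linked(u, v, DIRECTED)}

    def children(self, v):  # ch_G(v): all u with v -> u
        return {u for u in self.vertices if self._linked(v, u, DIRECTED)}

    def is_subgraph_of(self, other):
        # V' is a subset of V and every edge of E' also appears in E.
        return self.vertices <= other.vertices and all(
            kinds <= other.edges.get(pair, set())
            for pair, kinds in self.edges.items()
        )


# Hypothetical example with made-up component names.
g = MixedGraph()
g.add_edge("compressor", "cylinder", DIRECTED)
g.add_edge("cylinder", "valve", UNDIRECTED)
g.add_edge("valve", "sensor", BIDIRECTED)

sub = MixedGraph()
sub.add_edge("compressor", "cylinder", DIRECTED)

print(g.parents("cylinder"), g.neighbors("valve"), sub.is_subgraph_of(g))
```

Storing a set of edge types per ordered pair mirrors the mapping $\mathrm{E}:\mathrm{R}\times \mathrm{R}\to B\left(\epsilon \right)$, so two vertices may be joined by more than one kind of edge.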

#### 3.2.1. Directed Graph

Directed graph (DG) techniques, composed of node variables and directed edges, can describe qualitative relationships among variables to detect faults in high-speed trains through reasoning strategies of consistent path theory.

As the most basic DG model, the signed DG (SDG) is adopted to describe the causalities of systems [140]. Unlike other techniques, the SDG can be implemented without precise matrix descriptions or complete measurement data because it uses a comprehensive graphical form in FD procedures [141]. In addition, SDGs offer a simple way to analyze fault propagation principles for high-speed trains.

Denote SDG as $\chi =\left\{{G}^{{}^{\prime}},\delta ,\varphi \right\}$, where ${G}^{{}^{\prime}}$ is a DG, $\delta $ is the state function of nodes, and $\varphi $ is the branch symbol function. Symptoms or characteristics of faults in systems are represented by nodes, in which causalities among variables are described through directed edges from cause nodes to result nodes. When a system fails, the state of fault nodes deviates from normal values. Furthermore, an alarm is triggered due to the deviation. Thus, by analyzing the causes of node changes, the possible propagation paths of faults can be found. Based on these paths and known causes of faults, the evolution process of faults in systems can be discovered.
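The backward search for propagation paths can be sketched in a few lines of Python. This is an illustrative sketch only: the node names, the sign encoding (`+1`, `0`, `-1`), and the consistency rule (an edge is consistent when the cause state multiplied by the edge sign equals the effect state) are common SDG conventions assumed here, not details given in the text. The dictionary `state` plays the role of $\delta $ and the edge signs play the role of $\varphi $.

```python
def consistent_edges(edges, state):
    """Keep edges u -> v whose sign agrees with the observed deviations."""
    return [
        (u, v)
        for (u, v), sign in edges.items()
        if state.get(u, 0) != 0 and state.get(u, 0) * sign == state.get(v, 0)
    ]


def root_causes(edges, state, alarm):
    """Trace consistent edges backwards from the alarm node to candidate causes."""
    preds = {}
    for u, v in consistent_edges(edges, state):
        preds.setdefault(v, []).append(u)

    roots, stack, seen = set(), [alarm], {alarm}
    while stack:
        node = stack.pop()
        parents = preds.get(node, [])
        if not parents:
            # No consistent upstream cause: treat the node as a candidate root cause.
            roots.add(node)
        for p in parents:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return roots


# Hypothetical braking-system fragment: edge sign +1 means "same direction".
edges = {
    ("compressor_output", "cylinder_pressure"): +1,
    ("cylinder_pressure", "pipe_pressure"): +1,
    ("pipe_leak", "pipe_pressure"): -1,
}
# Observed node states: -1 = below normal, 0 = normal, +1 = above normal.
state = {
    "compressor_output": -1,
    "cylinder_pressure": -1,
    "pipe_pressure": -1,
    "pipe_leak": 0,
}

print(root_causes(edges, state, alarm="pipe_pressure"))
```

In this example the leak node is in its normal state, so the only consistent path leads back to the compressor output.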

Various SDG methods have been developed for different FD problems in high-speed trains. A fundamental challenge in high-speed trains is the optimal configuration of sensors in different systems. For this reason, Reference [142] designed an IFD scheme, based on SDG, to perform two tasks for braking systems in trains, i.e., FD and optimal sensor configuration. In addition, the extended work in Reference [143], also called a symptom-fault association-based method, reduces the complexity of SDGs. As high-speed train systems continue to expand, a single SDG struggles to handle dynamic changes of abnormal variables in systems. Therefore, Reference [144] first introduced multi-layer strategies into SDGs and then developed a three-layer SDG framework to detect faults in system state transitions triggered by a single cause.

In Reference [145], three salient advantages of the SDG are summarized: (1) it can deal with closed systems; (2) it can deal with uncertainties, incomplete information, and noise; and (3) it is easy to build. Although the SDG performs well in optimal sensor configuration and fault propagation path analysis, it is difficult to use for analyzing faults in several highly coupled units (or subsystems, or components) of the train structure.

Another important subset of DGs, the Bayesian network (BN), is built on the directed acyclic graph (DAG) and the conditional probability table (CPT) [146]. Because BNs can be used for IFD applications in highly coupled trains, they are gaining popularity.

Taking train components as an example, various characteristic signals (e.g., pressure, speed, and vibration) carry important information among components. Within a small area consisting of several highly coupled components, the components that generate and deliver these signals can be regarded as information sources, and information exchanges among components are accomplished through non-component carriers. Once the information exchange is abnormal, it can be inferred that the component regarded as the information source is faulty. For instance, to control braking systems, air compressors feed gas directly into cylinders and then adjust the pressure of train pipes through an automatic control valve. If the air compressors are abnormal, the pressure in the cylinders will be directly affected. Explicitly, the pressure is the information source for the coupling of the two components. Besides, there are no other valves or relays between them.

Similar to direct couplings, indirect couplings also exist among components, i.e., couplings in which there is no direct information exchange between two components. Suppose the component ${A}_{1}$ transfers information to the component ${A}_{3}$ through the component ${A}_{2}$; the information exchange is then achieved through two direct couplings, ${A}_{1}\to {A}_{2}$ and ${A}_{2}\to {A}_{3}$. This is consistent with the principle of a directed acyclic structure. Thus, a DAG and a set of random variables in $B=(G,\theta )$ are needed to describe the above coupling, also called the conditional dependency in BNs, in which $G$ represents a DAG.

It is clear that $B=(G,\theta )$, which is composed of nodes and directed edges from parent nodes to child nodes, gives a qualitative description of dependencies among features [147]. Furthermore, the nodes in $G$ are fault characteristics in systems, and the directed edges represent dependencies among fault characteristics. Note that, in the special case where nodes are not connected through directed edges, the corresponding faults are independent of each other. The work in Reference [148] demonstrates that BNs use graphs to encode the conditional dependencies among variables in the probability distribution, also known as the Markov property of graphs. Define $\theta $ as the set of conditional probabilities describing the network distribution, where $\theta $ quantifies the dependency between each characteristic and its parent nodes.

Denote a set of variables (mainly fault characteristics in the core systems of high-speed trains) as $U=\left\{{U}_{1},{U}_{2},\dots ,{U}_{n}\right\}$, where ${U}_{i}$ $(i=1,\dots ,n)$ represents a node in the network. To achieve FD tasks using BNs, the first step is to determine the relationships between fault modes and nodes in the network. Then, the DAG that meets the independence conditions is established in (6):

Moreover, the joint probability distribution of $U$ is given by:

in which $p{a}_{G}\left(U\right)$ is the set of parent nodes of $U$. Furthermore, the set of fault characteristics is extended to $U=\left\{{U}_{1},{U}_{2},\dots ,{U}_{n},V\right\}$, where $V$ is defined as a set of fault types.
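For reference, the factorization usually associated with a BN $B=(G,\theta )$ is the product of each node's conditional probability given its parents. The expression below is the textbook form, stated here as an assumption about the intended joint distribution; the per-node notation $p{a}_{G}\left({U}_{i}\right)$ is introduced for illustration.

```latex
P\left(U_1, U_2, \dots, U_n\right) = \prod_{i=1}^{n} P\left(U_i \mid pa_G\left(U_i\right)\right)
```

For the chain ${A}_{1}\to {A}_{2}\to {A}_{3}$ discussed earlier, this reduces to $P({A}_{1})\,P({A}_{2}\mid {A}_{1})\,P({A}_{3}\mid {A}_{2})$.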

Figure 11 shows a simple modeling procedure of BNs.

When a fault occurs, the joint probability distribution of the variables is described as the product of the conditional probabilities of all variables given their parent nodes. The vertex set containing faults can be uniquely identified as long as its fault probability exceeds the fault threshold “$t$”. This FD procedure is also called the $t$-probability diagnostic system [149]. The joint probability distribution in BNs has a clear physical meaning and is convenient for FD applications in high-speed trains.
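A threshold-based diagnosis of this kind can be sketched for the three-component chain ${A}_{1}\to {A}_{2}\to {A}_{3}$. The Python sketch below is illustrative only: all CPT values, the function names, and the default threshold `t=0.5` are hypothetical, and inference is done by brute-force enumeration rather than by any method from the cited works.

```python
from itertools import product

# Hypothetical CPTs for the chain A1 -> A2 -> A3 (True means "abnormal").
P_A1 = {True: 0.05, False: 0.95}
P_A2_GIVEN_A1 = {
    True: {True: 0.90, False: 0.10},
    False: {True: 0.02, False: 0.98},
}
P_A3_GIVEN_A2 = {
    True: {True: 0.80, False: 0.20},
    False: {True: 0.01, False: 0.99},
}


def joint(a1, a2, a3):
    """Joint probability as the product of each node given its parent."""
    return P_A1[a1] * P_A2_GIVEN_A1[a1][a2] * P_A3_GIVEN_A2[a2][a3]


def posterior_a1_abnormal(a3_observed):
    """P(A1 abnormal | A3 = a3_observed), obtained by summing out A2."""
    numerator = sum(joint(True, a2, a3_observed) for a2 in (True, False))
    evidence = sum(
        joint(a1, a2, a3_observed)
        for a1, a2 in product((True, False), repeat=2)
    )
    return numerator / evidence


def diagnose(a3_observed, t=0.5):
    """Flag A1 as faulty when its posterior fault probability exceeds t."""
    return posterior_a1_abnormal(a3_observed) > t


print(round(posterior_a1_abnormal(True), 4), diagnose(True), diagnose(False))
```

Enumeration is exponential in the number of nodes, so it is only suitable for small illustrative networks like this one.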

One of the earliest applications of BNs to high-speed trains is by Zhou et al. [150], who employ BNs and fuzzy reasoning to detect faults in control systems of high-speed trains. Because most complex systems in trains degrade dynamically, it is difficult to directly diagnose faults in a special case, the so-called multi-transition states. For this reason, Reference [151] proposes a new dynamic BN. With the help of Markov chains, the dynamic BN approach can detect transient and intermittent faults in control systems, providing researchers with fault symptoms of systems at a specific time. Similar to Reference [144], Reference [152] develops a multi-layer BN framework, combined with bond graphs, to analyze fault propagation in traction systems. Unlike other work, Reference [153] designs a mixed IFD approach that combines static and dynamic BNs to achieve real-time FD tasks for control systems in trains.

When BNs are applied to high-speed trains, static and dynamic BNs suit different challenges. Specifically, static BNs are useful for resolving uncertainty in fault location, whereas dynamic BNs are often used for train maintenance.

#### 3.2.3. Chain Graph

Chain graph (CG) approaches are a natural extension of UGs and DAGs. Compared with UGs and DAGs, CGs have a stronger ability to express Markov properties [157]. The form of CGs is special: when certain conditions are satisfied, CGs can be treated as DAGs or UGs.

Following the mathematical expressions of UGs and DAGs, a CG can be defined as $G=(V,E)$, where $G$ contains no bidirectional edges and no directed cycles. Specifically, if $G$ contains only undirected edges, the CG becomes a UG; if $G$ contains only directed edges, the CG becomes a DAG. Suppose that the vertices in $G$ can be divided into the ordered sequence of blocks $V={B}_{1}\cup \cdots \cup {B}_{k}$; the edges in the CG must then satisfy:

- (1) If $\left\{X,Y\right\}$ is an edge in $G$, and $X,Y\in {B}_{i}$, then the form of the edge will be $X-Y$.

- (2) If $\left\{X,Y\right\}$ is an edge in $G$, and $X\in {B}_{i},Y\in {B}_{j},i<j$, then the form of the edge will be $X\to Y$.

Here, ${B}_{i}$ represents a block for $i=1,\dots ,k$. When all blocks ${B}_{1},{B}_{2},\dots ,{B}_{k}\,(k\ge 1)$ are connected together, dependency chains are formed through this connection. In the actual FD procedure, vertex sets in CGs follow the principle of partially ordered sets. Let $\left\{X,{V}_{1},\dots ,{V}_{n},Y\right\}$ be vertices of a CG; the subgraph ${G}_{A}$ can then be described as:

in which the subgraph ${G}_{A}$ is a composite structure in $G$. Both $X$ and $Y$ are parent nodes of the composite structure, the set $\left\{{V}_{1},\dots ,{V}_{n}\right\}$ is the domain of the composite structure, and $n$ is the degree of the composite structure.
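The two block conditions on CG edges lend themselves to a direct check. The following Python sketch assumes a hypothetical encoding in which `blocks` is an ordered list of vertex sets and each edge is a tuple `(x, y, kind)` with `kind` equal to `"--"` or `"->"` (meaning $x\to y$); the function name and the example variables are illustrative and not taken from the cited works.

```python
def satisfies_block_conditions(blocks, edges):
    """Check the two CG edge conditions for a given ordered block sequence.

    (1) An edge inside one block must be undirected.
    (2) An edge between blocks B_i and B_j with i < j must be directed X -> Y.
    """
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    for x, y, kind in edges:
        i, j = block_of[x], block_of[y]
        if i == j:
            if kind != "--":
                return False  # directed edge inside a block
        elif kind != "->" or i > j:
            return False  # undirected or backward edge between blocks
    return True


# Hypothetical example: condition variables Y1, Y2, fault characteristic X, and Z1.
blocks = [{"Y1", "Y2"}, {"X"}, {"Z1"}]
valid_edges = [
    ("Y1", "Y2", "--"),
    ("Y1", "X", "->"),
    ("Y2", "X", "->"),
    ("X", "Z1", "->"),
]
backward_edges = valid_edges + [("Z1", "Y1", "->")]

print(satisfies_block_conditions(blocks, valid_edges))
print(satisfies_block_conditions(blocks, backward_edges))
```

Because every directed edge must point from an earlier block to a later one, any edge set that passes this check cannot contain a directed cycle.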

Based on the above description of CG structures, CGs can be used to model the coupling structure of systems in high-speed trains. More specifically, the monitoring of multiple train subsystems is easy to model in the form of CGs. In particular, a variable ${X}_{i}$ is defined as the fault characteristic. In addition, a set of variables ${Y}_{1},\dots ,{Y}_{n}$ is defined as the condition variables that can directly affect ${X}_{i}$. Besides, variables ${Z}_{1},\dots ,{Z}_{t}$, which have no direct influence on ${X}_{i}$, are defined to describe FD procedures. Let $G=(V,E)$ be a CG describing the system structure. Note that there is a one-to-one correspondence between the set of nodes and $\left\{{X}_{i}\right\}\cup \left\{{Y}_{1},\dots ,{Y}_{n}\right\}\cup \left\{{Z}_{1},\dots ,{Z}_{t}\right\}$.

In addition, the probability distribution in CGs should include the probabilities of the variables (i.e., ${x}_{i},{y}_{1},\dots ,{y}_{n}$), and the conditional probability used to judge whether there is a component failure is

When $p$ is close to 0, the component monitored by ${x}_{i}$ may be abnormal. When $p$ is close to 1, the component is normal.

Reference [158] proves the Markov property of CGs and shows that even the conditional independence of strictly positive probability distributions can be expressed by CGs. Evidently, this work promotes the development of CGs. To explain the causality of CG structures, Reference [159] proves the decomposition of the joint probability distributions of CGs. Because CG structures are relatively complex, Reference [160] developed a structure learning approach to learn the structures and parameters of CGs. Naturally, CGs are suitable for describing the behaviors of complex systems. For example, a CG framework was developed in Reference [161] to detect faults in electrical systems.

The direct separation criterion for the global Markov property in CGs has long attracted academic attention because a simple separation criterion is easy to apply in practice. Therefore, research on related separation criteria can extend the application of CGs in high-speed trains.

#### 3.2.4. Fault Tree Analysis

In addition to the above graphs, several other graph-based techniques thrived during the 1960s. Among them, fault tree analysis (FTA) approaches are causal models that combine qualitative analysis with a specific logic diagram [35]. Unlike CGs, FTA methods can conduct step-by-step reasoning when high-speed trains fail and then identify fault causes [147]. In addition, FTA techniques mainly address two kinds of faults: (1) faults that do not recover by themselves; and (2) faults caused by abnormal external conditions, which disappear automatically once those conditions return to normal.

Two core tasks in FTA modeling are: (1) to determine the top event; and (2) to find the boundary conditions. The top event in FTA indicates the FD event in high-speed trains, and its selection is particularly important for FTA modeling [162]. Besides, boundary conditions include system boundaries, initial conditions, disallowed events, and existing assumptions. These conditions mainly reflect the current detailed information of systems and are used to judge whether the system is in a normal state [135].

The FD implementation of FTA starts from the top FD event and proceeds from top to bottom until the bottom-level FD events are reached. By reasoning about the causes and influences of events at each level, FTA approaches can determine the cause of the top-level FD event. Several principles for using FTA are as follows:

- (1) Fault types ought to be as extensive as possible.

- (2) The analysis of fault events should be as detailed as possible.

- (3) The principle of layer-by-layer transmission should be followed (please pay attention to the function of the gates in FTA methods).

- (4) Only gates in FTA approaches can be connected to events.

For an “AND” gate, the output event occurs only if all input events occur. In contrast, for an “OR” gate, the output event occurs if any input event occurs. Both “OR” gates and “AND” gates have at least two input events [163]. Suppose there are $n$ input events in the FTA framework; the probability of the output event of an “AND” gate is

Similarly, the probability of the output event of an “OR” gate is
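Under the common assumption that the $n$ input events are statistically independent with probabilities ${P}_{1},\dots ,{P}_{n}$ (the independence assumption and this notation are introduced here for illustration), the standard results are $\prod_{i=1}^{n}{P}_{i}$ for an “AND” gate and $1-\prod_{i=1}^{n}(1-{P}_{i})$ for an “OR” gate. A minimal Python sketch with hypothetical function names and made-up event probabilities:

```python
from math import prod


def _check_inputs(probs):
    """Gates need at least two inputs, each a valid probability."""
    probs = list(probs)
    if len(probs) < 2:
        raise ValueError("a gate needs at least two input events")
    if any(not 0.0 <= p <= 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")
    return probs


def and_gate(probs):
    """Output occurs only if all independent inputs occur: product of P_i."""
    return prod(_check_inputs(probs))


def or_gate(probs):
    """Output occurs if any independent input occurs: 1 - product of (1 - P_i)."""
    return 1.0 - prod(1.0 - p for p in _check_inputs(probs))


# Hypothetical tree: the top event occurs if both basic events E1 and E2 occur,
# or if basic event E3 occurs on its own.
p_e1, p_e2, p_e3 = 0.1, 0.2, 0.05
p_top = or_gate([and_gate([p_e1, p_e2]), p_e3])
print(p_top)
```

If input events share common causes, the independence assumption fails and these products no longer give the exact gate probabilities.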

Since static FTA techniques struggle to satisfy the real-time requirements of high-speed trains, Reference [164] extends FTA into a dynamic FD scheme by incorporating additional gates to diagnose faults in trains. Generally speaking, standard FTA techniques cannot express time-related behaviors. For this reason, Reference [165] proposes a time-dependent FTA technique that uses timed statecharts to describe the time-variant system and solves the optimal control problem of a high-speed railway crossing. On this basis, an extended FTA combined with Petri nets is developed in Reference [166], which also considers time parameters, to address FD issues of train dynamic systems with repairable multi-state components. Recent work has also integrated other approaches into the standard FTA; e.g., Reference [167] designed an IFD scheme based on FTA that uses fuzzy set theory to overcome the negative influence of inaccurate expert experience and detects faults in the Chinese Train Control System Level 3 (CTCS-3).