A Novel Bottom-Up/Top-Down Hybrid Strategy-Based Fast Sequential Fault Diagnosis Method

Sequential fault diagnosis is an important fault diagnosis approach for large-scale complex systems, and generating an excellent diagnosis strategy is critical to the performance of sequential diagnosis. However, as system complexity increases, the complexity of the fault diagnosis tree grows sharply, which makes it extremely difficult to generate an optimal diagnosis strategy. In particular, because existing methods require massive redundant iteration and repeated calculation of node state parameters, the resulting diagnosis strategies are often inefficient. To address this issue, a novel fast sequential fault diagnosis method is proposed. We present a new bottom-up search idea based on the Karnaugh map, the support vector machine (SVM) and the simulated annealing algorithm. It combines failure sources to generate states, and a Karnaugh map is used to judge the logic of every state. The eigenvalues used by the SVM are obtained quickly through the simulated annealing algorithm, and the SVM is then used to eliminate less useful states. At the same time, the bottom-up method is combined with cost heuristic algorithms to generate the optimal decision tree. Experiments show that the calculation time of the method is shorter than that of previous algorithms, and a smaller test cost can be obtained when the number of samples is sufficient.


Introduction
Fault diagnosis is a crucial activity for systems with high safety and mission criticality requirements [1], such as spacecraft [2], military helicopters [3], and aircraft and satellites [4]. As system structures become more and more complex [5], isolating faults quickly and at low cost [6] becomes a difficult problem. To solve it, Somnath Deb and K. R. Pattipati proposed the multi-signal flow graph model based on the AND/OR graph in [7], which converts the test sequence optimization problem into the analysis and processing of the D-matrix. Because the model is close to the actual physical system [8], and the integration and verification of the model are relatively simple [9], researchers have built extensively on it [10].
Currently, there are two main kinds of sequential fault diagnosis algorithms based on the multi-signal flow graph. One is the top-down search strategy, in which the decision tree is constructed from top to bottom: the initial object is the set of all faults, and test points are selected for fault isolation. There are four kinds of algorithms. (1) The greedy algorithm, whose time complexity is very low but which easily falls into a local optimum because it does not consider the whole decision tree. (2) The heuristic search algorithm, also known as the AO* algorithm [11]. A heuristic search algorithm based on information theory was applied to generate the sub-tree and handle the imperfect tests that exist extensively in realistic systems [12]. To carry out iterative updating and construct a near-optimal diagnosis strategy, [13] proposed combining a rollout algorithm with information gain heuristics, and [14] made improvements with different search widths and depths. Several heuristic algorithms were proposed to find good strategies for large problem instances in [15]. These algorithms can reach the global optimal solution, but when the heuristic values of the test points are close, they need to traverse all cases, which takes a lot of time. In addition, as the number of faults increases, the time complexity grows exponentially and large storage is required. (3) Intelligent algorithms. In [16], the test sequencing problem was converted into a search for the minimal complete test sequence based on the ant algorithm. Based on cuckoo search, [17] proposed an algorithm to generate optimized test sequences, which obtains 100% software coverage. These methods avoid tedious logic design and seek rules from a large amount of permutation and combination data, but when the number of faults is large, many iterations are needed and the runtime is uncertain. (4) Hybrid algorithms, i.e., combinations of algorithms (1)-(3) or combinations of these with other algorithms. Reference [18] proposed a new test sequence optimization method based on AO* and dynamic programming to improve the efficiency of generating solutions. Reference [19] combined the greedy method with discrete binary particle swarm optimization (DPSO) to construct a test sequence tree. Reference [20] developed an iterative algorithm called GLPtauS that uses genetic algorithms, LPtau low-discrepancy sequences of points, and heuristic rules to find regions of attraction when searching for the global minimum of an objective function. The other kind of method is the bottom-up search strategy, in which the fault tree is constructed from bottom to top [21]: the initial states are single faults, and permutation and combination are carried out continuously until they are combined into the complete fault set [22]. This avoids repeated calculation of the same state and has advantages in small-scale fault diagnosis. However, in large-scale fault diagnosis, the efficiency of the algorithm is low because too many useless states are generated.
To sum up, the main problem is that when the numbers of faults and test points are large, the time complexity of the various algorithms is high [23] and the formation of the decision tree is slow [21]; this is an NP-hard problem [24]. Based on the above literature research, this paper proposes a new bottom-up/top-down hybrid strategy-based fast sequential fault diagnosis method. Firstly, a new bottom-up algorithm is established, which uses the Karnaugh map, SVM [25] and the simulated annealing algorithm [26] to reduce the generation of useless states. It is suitable for the situation in which the number of test points exceeds the number of faults. This algorithm is then combined with the cost heuristic search algorithm, which is suitable for the situation in which the number of test points is less than the number of faults. The method switches between the two algorithms according to the situation of each node during the generation of the decision tree, so as to improve computational efficiency. Experimental results show that, compared with heuristic search algorithms and intelligent algorithms, the decision tree of this strategy has a shorter generation time and guaranteed accuracy when dealing with a large-scale fault dependency matrix.
The remainder of this paper is organized as follows. Section 2 describes the problem, Section 3 introduces the new proposed approach, Section 4 shows the comparative experiment with other algorithms, and Section 5 provides a summary.

Description of the Multi-Signal Flow Graph Combination Problem
In general, the test sequencing problem is based on the information of failures and includes five components: a set of failure states, their prior failure probabilities, a set of available tests, the test costs, and the fault-test dependency matrix (D-matrix). The purpose of the sequential fault algorithm is to generate the optimal binary tree that incurs the least cost of fault isolation.
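As a concrete illustration, the five components can be collected into a single problem object. The sketch below is our own minimal representation (the name `TestSequencingProblem` and the field names are illustrative, not from the paper):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TestSequencingProblem:
    faults: list          # failure states s_1, ..., s_m
    p: np.ndarray         # prior failure probabilities, length m
    tests: list           # available tests t_1, ..., t_n
    c: np.ndarray         # test costs, length n
    D: np.ndarray         # m x n binary fault-test dependency matrix
```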
As shown in Table 1, the scale of D1 is m × n. The matrix is transformed into D2, which includes all 2^n binary digit combinations, as shown in Table 2.
Among them, S0*, S3*, S5*, and S7* are fictitious fault modules. By analyzing D2, the number of possible test sequence trees is calculated as shown in Equation (1).
The condition under which the faults in D1 can be completely isolated is shown in Equation (2).
As shown in Figures 1 and 2, any test sequence tree of D1 is a subset of a test sequence tree of D2.
Different D-matrices of the same scale generate optimal decision trees with different numbers of levels; in the meanwhile, the number of layers will not be larger than n + 1. For convenience of description and calculation, Figure 1 is changed to Figure 3, whose format is a standard tree with n + 1 layers, the same as Figure 2. As shown in Figure 3, for any node at level k of the binary tree: the node size refers to the number of faults contained in the node; if flag_j = 0, the node can be expanded by test point t_j, and if flag_j = 1, the node cannot be expanded by test point t_j; num_k is the number of combinations into which the node can be expanded, and e denotes the number of test points that can be used; num(k+1)_left and num(k+1)_right indicate the numbers of combinations that can be expanded by the two subnodes of the node under test t_j, and the subnodes are located in the k + 1 layer. To calculate the number of combinations of a node in layer k, we first obtain the numbers of combinations of its subnodes in layer k + 1; then, for each usable test point, we multiply the combination numbers of the left and right subnodes and sum the products over the test points. The whole calculation process runs from bottom to top.
The combination number N of test sequence trees generated by an m × n D-matrix (m < 2^n) is obtained by iterative calculation over the standard tree according to Equations (3) and (4), with the initial value condition num = 1 at the leaf layer. Additionally, the expected cost J can be calculated as J = ∑_{i=1}^{m} p_i ∑_{j=1}^{n} a_ij c_j, where A = [a_ij] is a binary matrix with dimensions of m × n; a_ij = 1 means that test t_j is selected for the identification of fault state s_i, and otherwise a_ij = 0. In summary, the problem of test sequence diagnosis based on a multi-signal flow graph aims to select the test sequence tree with the smallest J among the N types of test sequence trees. As m and n increase, the number of combinations grows enormously, and it takes a long time to compare the cost of each tree and select the lowest; this is an NP-hard problem.
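To make the cost criterion concrete, the following minimal sketch evaluates J for a given usage matrix A, probability vector p and cost vector c. It assumes the standard reading of the formula above; the toy numbers are ours, not the paper's.

```python
import numpy as np

def expected_cost(A: np.ndarray, p: np.ndarray, c: np.ndarray) -> float:
    """J = sum_i p_i * sum_j a_ij * c_j for a binary usage matrix A."""
    return float(p @ (A @ c))

# Toy example: 3 faults, 2 tests; fault s_1 is isolated by t_1 alone,
# while s_2 and s_3 each need both tests on their isolation paths.
A = np.array([[1, 0],
              [1, 1],
              [1, 1]])
p = np.array([0.5, 0.25, 0.25])
c = np.array([1.0, 1.0])
print(expected_cost(A, p, c))  # 0.5*1 + 0.25*2 + 0.25*2 = 1.5
```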

Proposed Sequential Fault Diagnosis Approach
The flow chart of the hybrid strategy is shown in Figure 4 and has three main parts: generation of the parameter set, modification of the parameter set, and acquisition of the switch condition boundary. The first two parts form the novel bottom-up algorithm, which represents a great improvement over traditional bottom-up algorithms. The third part defines the switching conditions between the two algorithms: according to the size of the current node, the method switches between them.
Firstly, a large number of samples are generated in accordance with the parameters of the test object. Both the top-down and bottom-up algorithms are used to generate the optimal decision tree with these samples. Then, the times of the two algorithms are compared to obtain the switching condition. Next, the test object is processed. The size of the initial node is m × n. The algorithm is selected according to the switching condition; after it is executed, successor nodes are generated, and the algorithm is selected again according to the situation of each successor node. This is iterated until no successor nodes are generated. In addition, this flow chart is applicable to any top-down algorithm.
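The control flow just described can be summarized in a few lines. The sketch below is a schematic of the dispatch loop only; `switch_condition`, `top_down_expand`, and `bottom_up_solve` are placeholder callables we supply for illustration, standing in for the components described in the rest of this section.

```python
def hybrid_diagnose(root, switch_condition, top_down_expand, bottom_up_solve):
    """Process nodes, choosing an algorithm per node by the switch condition."""
    pending = [root]
    while pending:
        node = pending.pop()
        if switch_condition(node):
            bottom_up_solve(node)                  # small node: solve bottom-up
        else:
            pending.extend(top_down_expand(node))  # large node: expand one level
```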

Generation of Parameter Set
The following definitions are applied:


- s_k: any fault subset of size k of the fault complete set s. It contains three parts: the k faults, the storage location information k1 and k2, and a series of Karnaugh map logic values of length n;
- A_k: the set of all s_k; A_k(i) is used to denote the ith element in A_k;
- cost_k: the minimum test cost required if only the k failures belonging to s_k exist in the system;
- p_k: the sum of the failure rates of the k failures in s_k.
The optimal test sequence tree is a full binary tree, and thus any s_k (k ≥ 2) can be composed of s_k1 and s_k2, where k1 + k2 = k (k1 ≥ 1, k2 ≥ 1). The calculation formula of cost_k is shown in Equation (7).
where t_j refers to a test point that can separate s_k1 and s_k2 and should satisfy the corresponding separation condition. The flow chart of the bottom-up algorithm is shown in Figure 4. There are two main parts: one is the production of A_k, and the other is the modification of A_k. By combining all the elements of A_k1 and A_k2 (k1 + k2 = k) in turn, the production part calculates cost_k and generates s_k, judges the logic of s_k, compares the various combinations that can generate the same s_k, and retains the combination with the least test cost, until all combinations have been generated and the calculations are complete. When k = m, the algorithm ends, and the decision tree can be obtained from s_m.
If all faults in the fault complete set s can be completely isolated, any subset of the fault complete set can also be completely isolated. |A_k| refers to the number of s_k obtained by permutation and combination. The computing times of A_k are shown in Equation (9).

Logic Judgement Module
The previous module outputs redundant states, which should be deleted. For example, consider a state s_k such that, regardless of the test sequence used, the fault set s can never be separated into the fault subset s_k. States such as this are non-logical items, and eliminating them decreases |A_k|, so the combination number and calculation time can be greatly reduced. To judge the logic, a Karnaugh map is introduced. Taking the data in Table 3 as an example, the Karnaugh map in Figure 5 is constructed with the test points as variables and the faults as logical values. A complete segmentation method of the Karnaugh map is defined as follows:


- Each circle conforms to the Karnaugh map circle rule, and its size is 2^y (0 ≤ y ≤ n);
- A large circle containing two smaller circles is treated as one circle;
- Each fault s_i (1 ≤ i ≤ m) has a circle of size one;
- There are only two disjoint circles in a large circle;
- The maximum circle, of size 2^n, must exist.

Figure 5 presents the complete segmentation method of the Karnaugh map. According to the rule of the Huffman code [27], a complete segmentation method of a Karnaugh map corresponds uniquely to a binary decision tree; Figure 6 shows the decision tree corresponding to Figure 5. In the Karnaugh map, a circle uniquely corresponds to a logical sequence composed of 0, 1 and d (d denotes a "don't care" term), where 0 and 1 denote the judgement results of the corresponding test points. Suppose s_k is composed of s_k1 and s_k2. The necessary and sufficient condition for s_k to be a logical item is described as follows: 1. the circles of the Karnaugh map corresponding to s_k1 and s_k2 have no overlap; 2. the circle of the Karnaugh map corresponding to s_k does not contain any fault s_i with s_i ∉ s_k1 and s_i ∉ s_k2.
As shown in Figure 5, one fault corresponds to the logical value 001 and another corresponds to the logical value 010 according to the minimum-term combination principle [28], and their two-fault subset corresponds to 0dd. However, because a third fault is also included in 0dd, this s_2 does not exist as a logical item. According to the above theory, judging the logic of each s_k after it is generated and deleting the non-logical ones can reduce |A_k|.
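The logic judgement can be made computational by representing each fault as its row of D and each subset by the minimal 0/1/d cube covering its rows. The sketch below is our own illustration of condition 2 above (a subset is rejected if an outside fault falls inside its cube); the 3-test example mirrors the 001/010/0dd discussion.

```python
import numpy as np

def cube_of(D, subset):
    """Minimal covering cube of the given fault rows: 'd' where they disagree."""
    rows = D[sorted(subset)]
    return ['d' if rows[:, j].min() != rows[:, j].max() else str(rows[0, j])
            for j in range(D.shape[1])]

def is_logical_item(D, subset):
    """Condition 2: no fault outside the subset may lie inside its cube."""
    cube = cube_of(D, subset)
    inside = lambda row: all(c == 'd' or str(b) == c for b, c in zip(row, cube))
    others = set(range(D.shape[0])) - set(subset)
    return not any(inside(D[i]) for i in others)

# Faults with signatures 001, 010 and 011: the first two combine to cube 0dd,
# which also covers 011, so the pair {0, 1} is a non-logical item.
D = np.array([[0, 0, 1],
              [0, 1, 0],
              [0, 1, 1]])
print(cube_of(D, [0, 1]))          # ['0', 'd', 'd']
print(is_logical_item(D, [0, 1]))  # False
```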

Entire Process of Generation of the Parameter Set
In summary, the entire process of generation of the parameter set is shown in Algorithm 1.

Algorithm 1 A_k production
1: A_k = Ø, k1 = 1, i = 1;
2: k2 = k − k1; take out one element in A_k1 and combine it with one element in A_k2 in turn;
3: Judge the logic of the combination result. If A_k(i) is a non-logical item, return to step 2; otherwise, go to step 4;
4: Judge whether the same A_k(i) already exists in A_k. If not, include it and set i = i + 1; otherwise, compare the cost_k of the two and retain the smaller one;
5: Return to step 2; once all elements in step 2 have been combined, set k1 = k1 + 1, and if k1 > [k/2], the calculation is complete; otherwise, return to step 2.
Furthermore, |A_k| can be estimated by Equations (10)-(12). x_j is defined as the jth column of the D-matrix, with a size of m × 1. The self-generated number is the number of s_k generated directly by x_j: x_j is equivalent to two fault subsets s_k1 and s_k2 (k1 + k2 = m), where a fault belongs to s_k1 if its entry in x_j is 0 and to s_k2 otherwise. For example, x_1 in Table 3 is equivalent to one s_1 and one s_2; similarly, the D-matrix presented in Table 3 generates two s_1, two s_2 and two s_3 in total. In addition, the combination number is defined as the number of s_k generated by combination. The combination of s_k1 and s_k2 can also produce s_k under the following conditions: if s_k1 ∈ A_k1 and s_k2 ∈ A_k2, then s_k ∈ A_k (k1 < k, k2 < k). A_k1 and A_k2 have |A_k1||A_k2| kinds of combination, of which only a part are logical items. Considering that different pairs k1 and k2 can sum to k, a double summation is carried out. The combination process may also produce repeated s_k (two different pairs may both produce the same one); the repetition rate of s_k is given by Equation (12).

Modification of Parameter Set
Although the last part eliminates all non-logical items and reduces |A_k|, |A_k| is still extremely large and generates a great number of combinations, which takes a long time. Actually, only a few s_k in A_k eventually become part of the optimal test sequence binary tree, and eliminating the other s_k does not influence the optimal decision tree. Three observations guide this elimination. Firstly, an s_k consists of a series of logical values and k faults; its features form an array rather than a single number, so each s_k should be analyzed independently. Secondly, the number of eigenvalues of an s_k is limited, so there is no need to perform principal component analysis to reduce the number of eigenvalues. Thirdly, the judgement function is generated before the actual test, and the samples are sufficient; therefore, the longer the training time, the better. An SVM (support vector machine) is used to distinguish which s_k should be retained and which should be eliminated. The definitions are presented as follows:


- If s_k1 and s_k2 can be combined into s_k, then s_k1 and s_k2 are a pair of couples in A_k (k > k1, k > k2);
- If s_k can be composed of many pairs of couples, then the least-cost pair of couples is called a pair of spouses.
Taking the matrix of Table 3 as an example, one s_3 can be composed of two different pairs of s_1 and s_2, which are therefore two couples in A_3. According to Formula (7), the test costs of the two combinations are 2.75 and 3.25, respectively, so the pair with cost 2.75 is a pair of spouses in A_3. Four parameters are given as follows: 1. number_k, whose size depends on the logic of s_k, i.e., the distribution of 0 and 1 in the D-matrix; the larger number_k, the stronger the ability of s_k to combine with other fault subsets. 2. ratio_k: on the premise that s_k has been combined with another fault subset to form a pair of couples, the pair becomes a pair of spouses if its cost according to Formula (7) is lower than that of the other couples; the larger the ratio, the greater the advantage represented by the cost. 3. p_k and 4. cost_k, as defined above. According to the analysis in the previous section, the smaller the circle of the Karnaugh map corresponding to s_k, the higher the probability of its combination with other fault subsets.
If s_k is a part of the optimal test sequence binary tree, then it has the advantage of combining with other fault subsets more often or more cheaply; besides, the proportion of its couples that become spouses is higher. As a result, its number or ratio is higher. However, the final number and ratio are unavailable until A_m is generated. This method therefore aims to predict the final number and ratio of s_k from the values number and ratio observed while generating A_{k+1}, …, A_j (j > k), together with other parameters. If the prediction results are low, then s_k is eliminated to decrease |A_k|. Here j = k + Δk, where Δk is a predetermined value; in the current experiment, j = k + 1. Each A_i (k < i < j + 1) generates a set of eigenvalues to classify s_k. To make the classification more accurate, s_k is eliminated only if all of the classification values according to the Δk sets of eigenvalues are 0. The prediction function is obtained from statistics of the sample performance.

Condition Judgement Module (DES)
Not every |A_k| needs to be reduced by the SVM: A_k is worth modifying only if the time used to compute the eigenvalues and determine the categories is no more than the time the eliminated s_k would otherwise spend participating in the generation of A_{k+1}, A_{k+2}, ⋯, A_m. The set of k values that need to be modified is denoted DES, and the algorithm enters the modification module only when k ∈ DES; otherwise, k = k + 1. Suppose that the 0s and 1s in D obey a uniform distribution; then the number of d terms in the logic sequence of an s_2 obeys a normal distribution with n/2 as the mean, and, similarly, that of an s_k obeys a normal distribution with n − n/2^(k−1) as the mean. From the analysis of the last section, if the number of d of an s_k is n and k ≠ m, the corresponding circle is the whole Karnaugh map, and thus this s_k must be a non-logical item; therefore, the number of d of a logically valid s_k is at most n − 1. Consequently, when the mean n − n/2^(k−1) exceeds n − 1, i.e., when 2^(k−1) > n, it is wasteful to modify A_k; the largest element in DES is therefore less than log2(n) + 1. Moreover, the smaller the value of k, the more benefit modification brings, because A_k participates in a large number of subsequent combinations. For example, in a 30 × 30 D-matrix, the values of |A_2| through |A_6| are much larger than the other |A_k|. Hence the smaller k are chosen as far as possible, that is, DES = {2,3,4}.
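Under the uniform-distribution assumption above, the admissible k can be computed directly. The helper below is our own illustration of the bound 2^(k−1) ≤ n; the name `des_candidates` and the cap of three values are ours.

```python
import math

def des_candidates(n, max_values=3):
    """k is worth modifying only while 2**(k-1) <= n; prefer the smallest k."""
    k_max = int(math.log2(n)) + 1
    return list(range(2, 2 + min(max_values, k_max - 1)))

print(des_candidates(30))  # [2, 3, 4], matching DES = {2, 3, 4} in the text
```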

Prediction Function
The definitions are given as follows: the samples and the test object should have the same density of 1s in D, an approximately matching P matrix, and an approximately matching C matrix. In the case of no prior probability distribution, the density of 1s is 0.5, each failure probability obeys a normal distribution with a mean value of 1/m, and each cost obeys a uniform distribution within a certain range. The prediction function is trained as shown in Algorithm 2.

Algorithm 2 Prediction function training
1: Generate a sample D-matrix, P and C as above;
2: Run the bottom-up algorithm on the sample and compute the eigenvalues of every s_k;
3: Take out one element k in DES and obtain the final number_k and ratio_k of each s_k in A_k;
4: If number_k and ratio_k are lower than the thresholds λ1·θ1 and λ2·θ2, respectively, the s_k is marked as 0; otherwise, it is marked as 1 (λ1 and λ2 are coefficient parameters, which are taken as 1 in this experiment);
5: Delete this element k in DES, and return to step 3 until DES = ∅;
6: The eigenvalues obtained in step 2 and the tag values obtained in step 4 are used as the training matrix. Return to step 2 until all samples are calculated, and use all of the training matrix to train the prediction function.
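The trained classifier itself can be any standard SVM. The sketch below shows the shape of the training step with scikit-learn; the feature matrix and tagging rule are random stand-ins for the eigenvalues and the step-4 tags, not the paper's data.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((500, 4))                   # eigenvalue rows (number, ratio, ...)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)  # stand-in for the step-4 tag values

clf = SVC(kernel="rbf")                    # the prediction function
clf.fit(X, y)
tags = clf.predict(rng.random((3, 4)))     # tag 0 marks an s_k for elimination
print(tags)
```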

Statistical Eigenvalue Module
Each s_k has Δk groups of eigenvalues, and s_k is deleted only if all of the classification values calculated from these groups are 0. However, after calculating A_{k+1}, some s_k already satisfy a first classification value of 1 and no longer need attention; the next focus is those with a classification value of 0. If we continued the direct calculation, the s_k that have already been determined to be 1 would still participate, causing useless computation. Therefore, the simulated annealing algorithm is used to obtain all the eigenvalues without calculating the complete parameter sets. The statistical eigenvalue module is displayed in Algorithm 3.

Algorithm 3 Statistical eigenvalue
1: Through A_{k+1}, the eigenvalues number and ratio of each s_k are obtained, and the A_k(i) with classification value 0 are included in the set U; j = 1;
2: If j = Δk, go to step 5. Create an array to record the eigenvalues. Randomly take subsets of A_1, A_2, ⋯, A_{k+j} as A_1*, A_2*, ⋯, A_{k+j}* (the size of each subset is 1/s_rate of the total set; in this paper, s_rate is 10), and use A_1*, A_2*, ⋯, A_{k+j}* to calculate A'_{k+j+1};
3: Record all the couples and spouses in each A'_{k+j+1}, and add the couples and spouses not yet included in the record into it;
4: Return to step 2; if the eigenvalues are not updated over multiple rounds, the statistics of the couples and spouses are complete and the eigenvalues number and ratio are obtained; the elements in U then take these eigenvalues to obtain the classification values. Delete the elements in U with a classification value of 1, set j = j + 1, empty the record, and return to step 2;
5: Finally, the remaining elements in U are the elements to be eliminated.

Entire Process of Modification of the Parameter Set
In summary, the entire process of parameter set modification is shown in Algorithm 4.

Algorithm 4 Modification
1: Judge whether k ∈ DES; if yes, go to step 2, and otherwise, go to step 4;
2: Calculate A_k in the same way as in the production part;
3: Enter the statistical eigenvalue module, use the prediction function to obtain the classification values of the s_k, and eliminate the s_k with a classification value of 0;
4: k = k + 1, k1 = 1.
The size of the modified |A_k| depends on four parameters: λ1, λ2, Δk and the number of samples. The larger λ1 and λ2, the smaller |A_k| becomes; however, the accuracy of the results may be reduced because components of the optimal test sequence tree may be removed. The larger the value of Δk, the more s_k are retained and the larger |A_k| is, so the accuracy is higher but the calculation is longer. When the number of samples is sufficiently large, the accuracy of the results is guaranteed even if λ1 and λ2 are large and Δk is small. The prediction function is calculated in advance and is not included in the test time.

Acquisition of the Switch Condition Boundary
Top-down and bottom-up algorithms have their own advantages. They have the following characteristics:


- In the top-down algorithm, |A_1|, |A_2|, ⋯, |A_m| are not affected by P and C when |DES| = 0; i.e., the calculation time is unchanged. After adding the prediction function, |A_1|, |A_2|, ⋯, |A_m| are less affected by P and C, and the calculation time is stable. When the cost of each test point and the probability of each fault are similar, the heuristic values are close and the time of the top-down search is long; when the costs and probabilities differ greatly, the time of the top-down search is short;
- The bottom-up algorithm has an advantage when the number of test points |t| is much larger than the number of faults |s|, because the bottom-up algorithm takes the fault set s as the core and calculates the combinations of its subsets, whose number of calculations is bounded above. The core of the top-down algorithm, in contrast, is comparing the characteristics of each test point, so when the number of test points |t| is large, its calculation time is long. Conversely, when the number of test points |t| is much smaller than the number of faults |s|, the top-down search has more advantages.
In large-scale tests such as |s| = |t| = 100, the top-down search method is adopted first. At the third level of the test sequence binary tree, as shown in Figure 7, the 100 × 100 fault matrix problem has been transformed into solving four 25 × 98 subproblems. The top-down algorithm has more advantages for the transformation, and the bottom-up algorithm has more advantages for the 25 × 98 subproblems. The boundary is the point at which the computation time of the bottom-up algorithm equals that of the top-down algorithm. This boundary is obtained by sample simulation: under the conditions of the same D, P, and C, we compare the computational efficiency of the two algorithms at different scales.
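A minimal version of this boundary-acquisition experiment is sketched below, assuming the sample distributions used later in the paper (costs uniform on (1,2), probabilities normalized uniforms); `bottom_up_solve` and `top_down_solve` are placeholders for the two solvers.

```python
import time
import numpy as np

def boundary_table(scales, bottom_up_solve, top_down_solve, trials=8):
    """For each (m, n) scale, record 1 if bottom-up is faster on average."""
    table = {}
    for m, n in scales:
        wins = 0
        for _ in range(trials):
            D = np.random.randint(0, 2, (m, n))
            c = np.random.uniform(1, 2, n)   # test costs
            p = np.random.uniform(0, 1, m)
            p /= p.sum()                     # normalized failure rates
            t0 = time.perf_counter(); bottom_up_solve(D, p, c)
            t1 = time.perf_counter(); top_down_solve(D, p, c)
            t2 = time.perf_counter()
            wins += (t1 - t0) < (t2 - t1)
        table[(m, n)] = int(wins > trials / 2)
    return table
```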

Experiment
The experiment is divided into two parts. One part compares the efficiency of the hybrid algorithm (the combination of the bottom-up and cost heuristic algorithms proposed in this paper) and the algorithm of [14] in dealing with a large-scale fault-test dependency matrix using simulation data. The other part compares the time spent and the expected cost of the hybrid algorithm and the algorithms of [14,29] on the superheterodyne receiver example [30].

Large-Scale Fault-Test Dependency Matrix
The experimental data are based on different scales of the D-matrix, different densities of 1s, and different distributions of P and C. According to the parameter settings, 8 groups of samples were randomly generated for each configuration, and the average value was calculated. This experiment runs in MATLAB 2018b on a 3.60 GHz, 16.0 GB RAM desktop.
D-matrix scale change: Δk = 1, each cost obeys the uniform distribution on (1,2), each p* obeys the uniform distribution on (0,1) with p = p*/∑p*, and the size of the D-matrix changes from 60 × 60 to 80 × 80 with an interval of 5. The result is shown in Figure 8. Table 4 reports which algorithm is more efficient: the rows represent the number of failures |s| (shown for |s| = 7 to 10), and the columns represent the number of test points |t| (from 65 to 80), where 1 means that the bottom-up algorithm is more efficient and 0 means that it is less efficient. It can be observed that when 65 < |t| < 80 and |s| < 10, the bottom-up algorithm has more advantages; i.e., when the heuristic search node size is less than 10, the bottom-up algorithm is used. The experimental results show that the smaller the gaps within C and within P, the longer the calculation time, and the combination algorithm is better than the rollout and information heuristic algorithm because the heuristic values of the test points are similar and all of them need to be expanded. The more often the extended nodes switch to the bottom-up algorithm, the higher the efficiency.

Superheterodyne Receiver Example
In this experiment, the example is the five-tube superheterodyne receiver originally designed for the U.S. Navy [30,31]. The system consists of 36 different tests and 22 failure states, and all the test costs are 1, as shown in Table 5. This experiment runs in MATLAB 2018b on a 3.60 GHz, 16.0 GB RAM desktop. As shown in Table 6, the search time of the rollout and information heuristic algorithm [14] is more than 15 min. Because every test cost is 1 and the failure probabilities are similar, the heuristic function values of the test points are similar; moreover, the number of test points is large and the number of failures is small, which leads to the long operation time of the heuristic search on this type of problem. The bottom-up algorithm with DES = ∅ takes 88.45 s, and the optimal test cost is 3.3473. The test sequence is shown in Table 7; for example, some failure states are finally isolated only after 7 test points. In combination with the SVM, DES = {2,3,4} takes 60.97 s, but the computational cost depends on the number of samples generated; when the number of samples exceeds 30, it is easy to obtain the minimum value.
The comparison of test costs is summarized in Table 8. According to reference [29], the optimal cost of the genetic algorithm is 3.9526 and that of the information gain algorithm is 6.9124. The cost of the hybrid algorithm is 3.3473, which is the minimum. Because of the uncertainty of the algorithms, the worst-case time complexity is compared. In the worst case, only one fault can be isolated by each test, so the decision tree has m layers and m − 1 test points, and the test points have n, n − 1, …, n − m + 2 choices, respectively. Thus ∏_{i=0}^{m−2}(n − i) calculations are required, and the worst-case time complexity of the information gain algorithm and of the rollout and information heuristic algorithm is O(n^(m−1)). The worst-case complexity of the genetic algorithm is O(iter_max · s_pop), where iter_max refers to the maximum number of iterations and s_pop refers to the population size. For the bottom-up algorithm, according to the analysis in Section 3.3, the calculation requires 2^m − 1 states and each state takes n loops; therefore, the worst-case time complexity is O(n · 2^m). The worst-case time complexity of the hybrid algorithm is O(n^(m−k) + n · 2^k), where k refers to the number of fault sources at the boundary.
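For completeness, the worst-case count quoted above can be written out; this is our reconstruction of the garbled expression, under the chain-shaped-tree assumption stated in the text.

```latex
\[
  N_{\mathrm{worst}} \;=\; \prod_{i=0}^{m-2}(n-i) \;=\; \frac{n!}{(n-m+1)!}
  \;\in\; O\!\left(n^{\,m-1}\right)
\]
```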
The above experiments show that the bottom-up algorithm has advantages in cost and computing time when dealing with this type of problem.

Conclusions
In this paper, the problem of test sequencing for large-scale systems is considered. A novel bottom-up algorithm based on the Karnaugh map, SVM, and the simulated annealing algorithm is proposed. This algorithm is suitable when the heuristic values of the test points are similar or the number of faults is far less than the number of test points. Furthermore, a decision tree processing strategy for large-scale D-matrices is proposed, which combines the advantages of the bottom-up and top-down algorithms by switching between them based on the relationship between the numbers of test points and faults at each node. Moreover, the strategy can be applied to the sequential fault diagnosis of large complex equipment and used in the testability optimization of equipment. Experiments under different parameter settings were carried out and demonstrate that this method has a short operation time and high efficiency when dealing with a large-scale D-matrix. More research should be carried out on determining prior knowledge of C, P, and the distribution of the D-matrix, which can significantly affect the efficiency.