A Hierarchical Decision Fusion Diagnosis Method for Rolling Bearings

In order to achieve accurate fault diagnosis of rolling bearings, a hierarchical decision fusion diagnosis method for rolling bearings is proposed. The hierarchical back propagation neural networks (BPNNs) architecture includes a fault detection layer, a fault isolation layer and a fault degree identification layer, which reduces the calculation cost and enhances the maintainability of the fault diagnosis algorithm. After wavelet packet decomposition and signal reconstruction of the raw vibration signal of a rolling bearing, the time-domain features of the reconstructed signals are extracted as the input of each BPNN, and the accuracy of fault detection, fault isolation and degree estimation is improved. By using the majority voting method, the diagnosis results of multiple BPNNs are fused, which avoids the missed diagnosis and misdiagnosis caused by the insensitivity of a vibration characteristic to a specific fault. Finally, the proposed method is verified experimentally. The results show that the proposed method can accurately detect the fault of rolling bearings, recognize the fault location and estimate the fault severity under different operating conditions.


Introduction
The rolling bearing is one of the most frequently used parts in all kinds of mechanical equipment. About 30% of the faults in rotating machinery are caused by bearing faults, and the operating state of the bearings directly affects the machine's performance [1,2]. Periodic load and impact are the main causes of rolling bearing wear, cracking, spalling and other failures [3]. Therefore, the importance of rolling bearing condition monitoring and fault diagnosis has been increasingly recognized and has drawn extensive attention in the past decades [4,5].
As the vibration signals contain abundant bearing state information, vibration-based methods have been widely used in the past few decades. Common vibration signal feature extraction methods include time-domain analysis [6], frequency-domain analysis [7], time-frequency analysis [8,9] and other nonlinear analysis methods [10][11][12]. Conventional time-domain or frequency-domain methods have difficulty extracting minor fault information from rolling bearing vibration signals with background noise [13]. Therefore, a range of vibration signal processing methods has been developed, such as sparse decomposition [14], empirical mode decomposition (EMD) [15], ensemble local mean decomposition (ELMD) [16], short time Fourier transform (STFT) [17], wavelet transform (WT) [18], wavelet packet transform (WPT) [19], spectral kurtosis (SK) [20], fast spectral kurtosis (FK) [21], etc. A number of cases have shown that wavelet packet decomposition combined with time-domain analysis is an effective feature extraction method [22]. However, time-domain features are diverse; thus, the selection of time-domain features has become the focus of research.
The traditional fault diagnosis method of rolling bearings is based on a classifier for pattern recognition of rolling bearings, while fault detection, isolation and degree estimation are realized [23][24][25]. However, this kind of method requires a large number of training samples and high representative features, which can easily lead to misdiagnosis. In order to reduce the risk of misdiagnosis caused by the complexity of the diagnosis system, Gan [26] proposed a hierarchical diagnosis network based on deep learning to identify the fault mode of rolling bearings using a two-layer diagnosis network. The fault location and fault degree were evaluated respectively. The results showed that the method has high reliability. However, most bearings are running in normal condition. The contingency of fault occurrence makes it very difficult to collect fault data and establish complete diagnosis knowledge, so the double-layer diagnosis network will face the problem of sample imbalance [27]. In addition, the key step of rolling bearing fault diagnosis is to extract the appropriate eigenvalue from the bearing signal as the diagnostic index and to realize the fault diagnosis of the rolling bearing by setting the threshold of the diagnostic index [28]. However, feature extraction and selection mainly depend on human experience, and fault diagnosis based on a single feature will be affected by external environmental factors or human factors and the result of classification will be unstable [29]. How to obtain fault diagnosis results with strong reliability and high stability and minimize the influence of sample imbalance on bearing fault diagnosis is one of the challenges of rolling bearing fault diagnosis.
Aiming at the above problems, a hierarchical back propagation neural networks (BPNNs) diagnosis method of rolling bearings based on decision fusion is proposed. The first layer is the fault detection decision fusion network, the second layer is the fault isolation decision fusion network and the third layer is the fault degree estimation decision fusion network. When the bearing is in normal condition, only the first layer is activated and the calculation cost is low, which overcomes the problems in the literature [26]. Hierarchical diagnosis architecture avoids the difficulty of neural networks training caused by incomplete raw data. In addition, when fault data is updated, only relevant neural networks need to be updated, which enhances the maintainability of the method. The wavelet packet decomposition and signal reconstruction of the raw vibration signals of the rolling bearing is carried out, and the time-domain characteristics of the reconstructed signals at each node are extracted as the input signal of the decision fusion networks, which improves the accuracy of the diagnostic results. The decision fusion networks model of each layer includes many BPNNs. A time-domain feature corresponds to a BPNN, and the diagnostic results of BPNNs are fused by the majority voting method. The highest diagnostic rate of the hierarchical diagnosis method can be obtained without artificial selection of time-domain features. Finally, the proposed method is verified experimentally. The results show that the proposed method can realize rolling bearing fault detection, fault location isolation and fault degree estimation under different operating conditions. The structure of the paper is organized as follows: Section 2 introduces wavelet packet decomposition, BPNN and decision fusion based on the voting method, and proposes the rolling bearing hierarchical diagnosis method. Section 3 verifies the proposed method by experimental data. Section 4 provides the conclusion.

Wavelet Packet Decomposition
Wavelet packet transform is an adaptive nonlinear analysis method, which can adaptively determine the resolution of different frequency bands and decompose the approximate part of the low frequency and the detailed part of the high frequency of the signal at the same time. Wavelet packet decomposition improves the time-frequency local analysis ability of the vibration signals, which can obtain higher time-domain resolution at high frequency and has higher frequency-domain resolution in the low-frequency part [30].
The scale function ϕ(x) and the wavelet function ψ(x) are both represented by µ(x). The scale function on scale 0 is µ_{0,0}(x), and the scale function and wavelet function on scale 1 are µ_{1,0}(x) and µ_{1,1}(x), respectively. For any scale j, the function system µ_{j,m}(x) is defined recursively, and µ_{j,m}(x) is the orthogonal wavelet packet of the wavelet function ψ(x).
Figure 1 presents a schematic diagram of the n-layer orthogonal wavelet packet decomposition of a signal. The raw signal S is decomposed by a wavelet packet with n layers, and the signal at each node of each layer is decomposed by the filter into a low-frequency signal A and a high-frequency signal D. The n-layer wavelet packet decomposition generates 2^n nodes.
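The node structure described above can be illustrated with a minimal sketch. It uses the Haar filter pair purely for brevity (an assumption made here; it is not the wavelet basis used in the experiments), and the function names `haar_split` and `wavelet_packet` are hypothetical. The point is only that splitting every node into A and D at each layer yields 2^n terminal nodes.

```python
import math

def haar_split(signal):
    """Split a signal into a low-frequency part A and a high-frequency part D
    using the orthonormal Haar filter pair, with downsampling by 2."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_packet(signal, n_layers):
    """Return the terminal nodes of an n-layer wavelet packet tree as
    {path: coefficients}. The signal length should be divisible by 2**n_layers."""
    nodes = {"": list(signal)}
    for _ in range(n_layers):
        next_nodes = {}
        for path, coeffs in nodes.items():
            # Unlike the plain wavelet transform, BOTH branches are split again.
            a, d = haar_split(coeffs)
            next_nodes[path + "A"] = a
            next_nodes[path + "D"] = d
        nodes = next_nodes
    return nodes
```

With `n_layers = 3` the tree has 8 terminal nodes, labelled `AAA` to `DDD` by the sequence of filters applied. Because the Haar pair is orthonormal, the total energy of the coefficients equals the energy of the input signal.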

Back Propagation Neural Networks
The raw vibration signals are decomposed by a wavelet packet with n layers, producing s reconstructed signals. From each reconstructed signal, m features are extracted, forming the feature set F = {F_1, F_2, ..., F_m}. F_j is normalized to X_j = {X_j1, X_j2, ..., X_js}^T, which is the input of the jth BPNN. The number of output layer nodes is set to h, and the corresponding output is D_j = {D_j1, D_j2, ..., D_jh}^T. The jth BPNN consists of an input layer, a hidden layer and an output layer, as shown in Figure 2.
In this paper, the training process of the jth BPNN consists of the following steps:
Step 1: Initialization.
The number of neurons in the hidden layer, l, is selected empirically. The weight ω_jgk connects the input layer with the hidden layer, and the weight ω_jki connects the hidden layer with the output layer. Then, the hidden layer threshold a_j and the output layer threshold b_j are initialized. The learning rate and the neuronal excitation function are given in advance.
Step 2: The output calculation of the hidden layer.
The output H_j of the hidden layer is calculated from the input vector X_j, the weight ω_jgk and the hidden layer threshold a_j.
Here, f is the hidden layer excitation function; many excitation functions are available for the hidden layer.
Step 3: The output calculation of the output layer.
The output D_j of the jth BPNN is calculated from the hidden layer output H_j, the weight ω_jki and the threshold b_j.
Step 4: Error calculation.
The network prediction error e_j is calculated from the network prediction output D_j and the expected output Y_j.
Step 5: Weight update.
The network connection weights ω_jgk and ω_jki are updated using the network prediction error e_j.
Step 6: Threshold update.
The network node thresholds a_j and b_j are updated using the network prediction error e_j.
Step 7: Termination check.
It is determined whether the iteration has ended; if not, the algorithm returns to Step 2. A desired accuracy, E_Expect, is set, and N is the number of samples.
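The training steps above can be sketched as a small single-hidden-layer network. This is an illustrative sketch under stated assumptions, not the authors' implementation: the sigmoid excitation function in both layers, the mean squared error stopping criterion, the learning rate and the class name `TinyBPNN` are all assumptions made here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyBPNN:
    def __init__(self, n_in, n_hidden, n_out, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.lr = lr
        # Step 1: initialise weights and thresholds with small random values.
        self.w_in = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
        self.a = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]

    def forward(self, x):
        # Step 2: hidden output H_k = f(sum_g w_gk * x_g - a_k).
        h = [sigmoid(sum(x[g] * self.w_in[g][k] for g in range(len(x))) - self.a[k])
             for k in range(len(self.a))]
        # Step 3: network output D_i = f(sum_k H_k * w_ki - b_i) (sigmoid assumed).
        d = [sigmoid(sum(h[k] * self.w_out[k][i] for k in range(len(h))) - self.b[i])
             for i in range(len(self.b))]
        return h, d

    def train_sample(self, x, y):
        h, d = self.forward(x)
        # Step 4: prediction error e = Y - D, then back-propagated deltas.
        e = [y[i] - d[i] for i in range(len(y))]
        delta_out = [e[i] * d[i] * (1.0 - d[i]) for i in range(len(e))]
        delta_hid = [h[k] * (1.0 - h[k]) * sum(self.w_out[k][i] * delta_out[i]
                                               for i in range(len(e)))
                     for k in range(len(h))]
        # Step 5: update the connection weights.
        for k in range(len(h)):
            for i in range(len(e)):
                self.w_out[k][i] += self.lr * delta_out[i] * h[k]
        for g in range(len(x)):
            for k in range(len(h)):
                self.w_in[g][k] += self.lr * delta_hid[k] * x[g]
        # Step 6: update the thresholds (they are subtracted, hence the minus sign).
        for i in range(len(e)):
            self.b[i] -= self.lr * delta_out[i]
        for k in range(len(h)):
            self.a[k] -= self.lr * delta_hid[k]

    def mean_squared_error(self, samples):
        total = 0.0
        for x, y in samples:
            _, d = self.forward(x)
            total += sum((y[i] - d[i]) ** 2 for i in range(len(y)))
        return total / len(samples)

    def fit(self, samples, e_expect=1e-3, max_epochs=2000):
        # Step 7: stop when the error falls below e_expect or epochs run out.
        for _ in range(max_epochs):
            for x, y in samples:
                self.train_sample(x, y)
            if self.mean_squared_error(samples) < e_expect:
                break
        return self.mean_squared_error(samples)
```

In the setting of this paper, one such network would be trained per time-domain feature, with `n_in = s` and `n_out = h`.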

Decision Fusion Based on the Voting Method
The idea of decision fusion comes from ensemble learning. Ensemble learning is not a single machine learning algorithm, but a method completing the task by building and combining multiple machine learning algorithms, which can be applied for classification problem integration, regression problem integration, feature selection integration, abnormal detection integration and so on [31,32]. For feature selection, Tripathi et al. [33] applied the data set with selected features to five base learners and aggregated the outputs obtained by the base classifier by a weighted voting method to predict the final output. A deep neural network algorithm, support vector machine (SVM) and decision tree c4.5 base classifiers were adopted by Asadi et al. [31], and the decision strategy based on the maximum number of votes was used to identify and classify botnets. Each ensemble learner produces a result, so it is necessary to integrate the results of each base learner to obtain the final decision result. A simple and effective way is to obtain the final decision result of multiple machine learning algorithms by the voting method. The voting method mainly includes the majority voting method [34] and the weighted voting method [33].
Inspired by the ensemble learning method, this paper uses the majority voting method to fuse the diagnosis results of multiple BPNN classifiers, each trained on a single feature set extracted after wavelet packet decomposition. The diagnosis results are then sorted by their number of votes, from most to least. Finally, following the principle that the minority obeys the majority, the diagnosis result with the largest number of votes is regarded as the decision fusion diagnosis result. The decision fusion algorithm based on the voting method is shown in Figure 3.
In Figure 3, m features F_1, F_2, ..., F_m are extracted from the vibration signals and normalized to X_1, X_2, ..., X_m as the inputs of m BPNN models Net_1, Net_2, ..., Net_m, which recognize h kinds of bearing fault modes. The output results D_1, D_2, ..., D_m are synthesized into a diagnostic matrix D_{h×m}. Each row of D_{h×m} is then summed to obtain the scoring matrix Q_{h×1} = {Q_1; Q_2; ...; Q_h}, where Q_i (i = 1, 2, ..., h) is the score of fault mode i. Finally, the output is the highest score Q_r, the maximum of Q_1, Q_2, ..., Q_h.
The fault discrimination matrix is set as C = {C_1; C_2; ...; C_h} = I_{h×h}, where C_i (i = 1, 2, ..., h) represents the ith fault mode and I_{h×h} is the h-dimensional identity matrix. h is the total number of bearing state modes, which determines the number of output layer nodes. The fault mode in the fault discrimination matrix C that corresponds to the subscript r of the highest score is the decision fusion diagnosis result of the current state.
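The fusion rule can be sketched in a few lines. The sketch assumes that each BPNN output has already been converted to a one-hot vote of length h, and that ties are resolved in favour of the lowest index; neither detail is specified in the text, and the function name `fuse_by_majority_vote` is hypothetical.

```python
def fuse_by_majority_vote(decisions, fault_modes):
    """decisions: list of m one-hot votes, each of length h (one per BPNN).
    fault_modes: list of h labels, playing the role of the rows of C.
    Returns the fused label and the scoring vector Q."""
    h = len(fault_modes)
    # Summing each row of the h x m diagnostic matrix gives the score Q_i.
    scores = [sum(vote[i] for vote in decisions) for i in range(h)]
    # r is the index of the highest score (first maximum on ties: an assumption).
    r = max(range(h), key=lambda i: scores[i])
    return fault_modes[r], scores
```

For example, five networks voting inner race, ball, ball, outer race and ball give Q = [1, 3, 1], so the fused diagnosis is a ball fault.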
It can be seen that decision fusion fuses multiple bearing fault diagnosis results, each based on a single time-domain feature. It is not necessary to screen the features, and the diagnostic result under each feature contributes to the final decision. The decision fusion method takes into account the diagnostic advantages of each single time-domain feature for rolling bearing signals, and the diagnostic results reasoned from each single time-domain feature are analyzed comprehensively. To some extent, this avoids the misdiagnosis and missed diagnosis of rolling bearing faults caused by sudden changes in environmental or human factors, which helps to reflect the health condition of rolling bearings more comprehensively and accurately.
In addition, this paper adopts the winner-take-all strategy, which is suitable for single faults. For multi-fault cases, a feasible method is to sort according to the score of Q 1 , Q 2 , . . . , Q h , from high to low, and output the order of the possibility of the fault mode, which can realize simultaneous fault diagnosis.
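The ranking idea for multi-fault cases can be sketched as follows. This is only an illustration of the sorting step described above; the function name `rank_fault_modes` is hypothetical, and how many of the top-ranked modes to report is left open here, as it is in the text.

```python
def rank_fault_modes(scores, fault_modes):
    """Order fault modes by their fused score Q_i, from high to low.
    Modes with equal scores keep their original relative order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(fault_modes[i], scores[i]) for i in order]
```

With scores Q = [1, 6, 4] for inner race, ball and outer race, the ranked output lists the ball fault first, then the outer race fault, then the inner race fault.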

Hierarchical Fault Diagnosis Strategy
The traditional fault diagnosis method based on machine learning is to consider the fault position and severity synthetically. However, most of the rolling bearings are in normal condition, the fault occurs by chance and the evolution of the fault degree is nonlinear. In addition, in the actual process, the complete bearing fault samples are difficult to obtain. In this case, the fault diagnosis network that can judge the fault at the same time is often difficult to train because of the complexity of its network structure. In this paper, the hierarchical fault diagnosis architecture of rolling bearings is proposed, and the hierarchical diagnosis of rolling bearings is carried out in the progressive way. Further, the fault diagnosis of rolling bearings is divided into three layers: the first layer is the fault detection layer, which is used to judge whether the current state of the bearing is normal or not. The second layer is the fault isolation layer, and its goal is to identify the fault position of the bearing diagnosed as having a faulty state in the first layer. The third layer is the fault degree estimation layer, which is to further estimate the severity of the fault position of the rolling bearing.
The advantages of the hierarchical diagnosis proposed in this paper are as follows. The function of each layer of neural networks is relatively simple, the fault diagnosis network at each level is trained using only the fault samples obtained for that level, and the network training time is reduced. It is also relatively easy to improve the accuracy of fault identification. In addition, updating the diagnosis algorithm is relatively easy, since only the affected neural networks need to be retrained; compared with the traditional diagnosis method, this gives better flexibility. The hierarchical decision fusion fault diagnosis method includes a training stage and a diagnosis stage.
(1) Training Stage
Step 1: Generate training samples.
According to the hierarchical fault diagnosis architecture, five kinds of training samples are designed, including a bearing state training sample, fault position training sample, inner race fault degree training sample, ball fault degree training sample and an outer race fault degree training sample.
Step 2: Wavelet packet decomposition.
After an appropriate wavelet basis and number of decomposition layers are selected, each kind of sample data is decomposed by the wavelet packet, and the raw signal is decomposed into a series of reconstructed signals with equal bandwidth.
Step 3: Feature extraction.
From the wavelet packet reconstructed signals, m features are extracted, and the feature set F = {F_1, F_2, ..., F_m} is obtained. The dimension of the feature subset F_j (j = 1, 2, ..., m) depends on the number of decomposition layers n of the wavelet packet. In this paper, all the training samples are decomposed by three-layer wavelet packets. Hence, s = 8, and the feature subset can be described as F_j = {F_j1, F_j2, ..., F_j8}.
Step 4: Normalization.
Because neurons are highly sensitive to numbers in the range of [−1, 1], the bearing feature data cannot be input into the BPNN directly, so the feature set must be normalized. The min-max normalization method is used in this paper, where u and ū are the original data and the normalized data of the feature set, respectively, u_max is the maximum value in the original feature set and u_min is the minimum value in the original feature set. Here, u represents the feature value F_js and ū represents the normalized eigenvalue X_js.
Step 5: Network training.
Using the training process detailed in Section 2.2, five kinds of training samples are learned, including the bearing state training sample, the fault position training sample, the inner race fault degree training sample, the ball fault degree training sample and the outer race fault degree training sample. A hierarchical decision fusion network of rolling bearings capable of fault detection, fault isolation and fault degree estimation is thereby established.
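The normalization in Step 4 can be sketched as below. The target interval is exposed as a parameter because the text does not make clear whether the features are mapped to [0, 1] or to [−1, 1]; the default of [0, 1], the handling of a constant feature and the function name `min_max_normalize` are assumptions made here.

```python
def min_max_normalize(values, lower=0.0, upper=1.0):
    """Linearly map values so that u_min -> lower and u_max -> upper."""
    u_min, u_max = min(values), max(values)
    if u_max == u_min:
        # A constant feature carries no information; map it to the lower bound.
        return [lower for _ in values]
    scale = (upper - lower) / (u_max - u_min)
    return [lower + (u - u_min) * scale for u in values]
```

For instance, the values 2, 4 and 6 map to 0, 0.5 and 1 with the default interval, and to −1, 0 and 1 when `lower=-1.0` and `upper=1.0` are passed. In practice, u_min and u_max would be taken from the training set and reused at diagnosis time.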
(2) Diagnosis Stage
In the application stage, the flow of the hierarchical decision fusion fault diagnosis method is shown in Figure 5.
Step 1: Database collection and feature extraction.
The vibration signals of the rolling bearing are collected in real time. N samples of the same length are randomly sampled from the raw vibration signal. Wavelet packet decomposition and signal reconstruction of the data samples are carried out to extract m kinds of features of the reconstructed signals and form the feature vectors [F_1, F_2, ..., F_m]. Finally, the feature vectors are normalized.
Step 2: Fault detection.
The m normalized feature vectors are input into m fault detection neural networks (BPNN_FD), and the current state of the bearing is judged by decision fusion. Step 3 and Step 4 are entered only when the bearing is in a faulty state. Because most bearings are in the normal state during use, this processing reduces the computational burden of the algorithm.
Step 3: Fault isolation.
When a bearing is detected as being in a faulty state, the m normalized feature vectors are input into m fault isolation neural networks (BPNN_FI). After decision fusion, it can be determined whether the bearing fault has occurred in the inner race, the ball or the outer race.
Step 4: Fault severity estimation.
After the location of the bearing fault has been identified, the m normalized feature vectors are input into the m fault severity estimation neural networks corresponding to the fault location. After decision fusion, the severity of the fault at that location (degree I, degree II or degree III) is determined. In this paper, there are 11 inner race fault severity estimation neural networks (BPNN_IR), 11 ball fault severity estimation neural networks (BPNN_BA) and 11 outer race fault severity estimation neural networks (BPNN_OR).
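The control flow of the diagnosis stage can be sketched as follows. The three arguments `detect`, `isolate` and `severity_by_location` are hypothetical stand-ins for the fused decisions of the BPNN_FD, BPNN_FI and BPNN_IR/BPNN_BA/BPNN_OR groups, and the string labels are chosen here for illustration only.

```python
def hierarchical_diagnosis(features, detect, isolate, severity_by_location):
    """features: the m normalized feature vectors of one sample.
    detect(features) -> "normal" or "faulty"            (layer 1)
    isolate(features) -> fault location label           (layer 2)
    severity_by_location[location](features) -> degree  (layer 3)
    """
    if detect(features) == "normal":
        # Layers 2 and 3 are never evaluated for a healthy bearing,
        # which is where the saving in computation comes from.
        return {"state": "normal"}
    location = isolate(features)
    # Only the severity networks of the isolated location are evaluated.
    degree = severity_by_location[location](features)
    return {"state": "faulty", "location": location, "degree": degree}
```

Because each layer is passed in separately, retraining one group of networks (for example, after new outer race fault data becomes available) leaves the other groups untouched.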

Experimental Platform
Experimental data was provided by the bearing data center of Case Western Reserve University [35]. The experiment platform was mainly composed of a motor, torque sensor, power meter and electronic control equipment, as shown in Figure 6. The type of rolling bearing was SKF6205. Single-point faults were introduced into the inner raceway, the outer raceway and the ball of the test bearings using electro-discharge machining, with fault diameters of 0.007 inches, 0.014 inches and 0.021 inches. Vibration data was collected using accelerometers installed at the 12 o'clock position at the fan end of the motor housing.
According to the health state, fault location and fault degree, 100 data segments of length 2048 were randomly selected for each condition, generating 10 rolling bearing fault diagnosis sample data sets, as shown in Table 1. Each data set contained 30 training samples and 70 test samples. The sampling frequency was 12 kHz, the motor speed was 1797 rpm and the load was 0.
In this paper, the hierarchical decision fusion diagnosis method includes a fault detection layer, a fault isolation layer and a fault severity estimation layer. Each diagnostic layer contains multiple BPNNs. In order to realize the function of these neural networks, they must be trained with the corresponding samples, as shown in Tables 2-4. Taking the fault detection layer as an example, the output result of the BPNN_FD is healthy or faulty. The total number of training samples was set to 60, where 30 normal samples came from data set 1 in Table 1, and the other 30 fault samples were randomly selected from data sets 2-10.
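The construction of one sample data set can be sketched as below. The text states only that 100 segments of length 2048 were randomly selected and split into 30 training and 70 test samples; drawing random start positions with possible overlap, the fixed seed and the function name `sample_segments` are assumptions made for illustration.

```python
import random

def sample_segments(signal, n_segments=100, length=2048, n_train=30, seed=0):
    """Cut n_segments windows of the given length at random start positions
    and split them into training and test samples."""
    rng = random.Random(seed)
    last_start = len(signal) - length  # latest start that still fits a full window
    starts = [rng.randrange(0, last_start + 1) for _ in range(n_segments)]
    segments = [signal[s:s + length] for s in starts]
    return segments[:n_train], segments[n_train:]
```

Applied to each of the 10 recorded conditions in turn, this yields 10 data sets of 30 training and 70 test samples each.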

Feature Extraction
According to the hierarchical decision fusion diagnosis process of rolling bearings shown in Figure 5, wavelet packet analysis was used to reconstruct the vibration signals. The db4 wavelet basis was chosen for a three-layer wavelet packet decomposition of the vibration signals. The decomposition produced 8 sets of node coefficients, corresponding to 8 frequency band signals that together represent all the information of the rolling bearing fault characteristics, as shown in Figure 7. The time-domain features (features 1-11) of each node's reconstructed signal after wavelet packet decomposition were extracted. As shown in Table 5, there are 11 features of wavelet packet reconstructed signals.
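Table 5. Time-domain features of wavelet packet reconstructed signals.
No.   Symbol   Feature
4     Av       Average rectified value
5              Standard deviation
6     ku       Kurtosis
7              Root mean square
8              Waveform factor
9     C        Peak factor, C = Pk/rms
10    I        Pulse factor
11    L        Margin factor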
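Features 4-11 can be computed as in the sketch below. The formulas are the definitions commonly used in vibration analysis and are assumed here, since the definitions in Table 5 are not fully legible; in particular, the peak value is taken as the maximum absolute amplitude, and the standard deviation and kurtosis use the population (1/N) form. The function name `time_domain_features` is hypothetical.

```python
import math

def time_domain_features(x):
    """Return features 4-11 for one reconstructed signal x.
    The signal must not be constant or all zero (division by zero)."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n                  # average rectified value
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)   # population standard deviation
    rms = math.sqrt(sum(v * v for v in x) / n)             # root mean square
    peak = max(abs(v) for v in x)                          # peak value (assumed max |x|)
    kurtosis = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2  # square root amplitude
    return {
        "average_rectified_value": abs_mean,
        "standard_deviation": std,
        "kurtosis": kurtosis,
        "root_mean_square": rms,
        "waveform_factor": rms / abs_mean,
        "peak_factor": peak / rms,
        "pulse_factor": peak / abs_mean,
        "margin_factor": peak / root_amp,
    }
```

Evaluating this function on each of the 8 reconstructed signals of a sample gives, for every feature, the 8-dimensional vector that feeds the corresponding BPNN.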

Fault Detection Results of Rolling Bearings
Each time-domain feature of each sample of the wavelet packet reconstructed signals of rolling bearings is fed to the input layer of a BPNN, and the fault diagnosis of the rolling bearing is carried out by the hierarchical diagnosis method.
In the rolling bearing fault detection layer, it is determined whether there is a fault in the rolling bearing. Using the sample set of Table 2, the diagnostic accuracy of the fault detection layer based on each single time-domain feature (serial numbers 1-11) and on decision fusion of the time-domain features (serial number 12) is tested and compared, as shown in Figure 8. It can be seen from the figure that, when fault detection is based on a single feature, features 4, 5 and 7 diagnose the wavelet packet reconstructed signals well, with diagnosis rates above 98%, whereas the diagnosis effect based on the other features is relatively poor. When the decision fusion method (number 12) is used in the fault detection layer, the diagnosis rate increases to 100%, which is higher than the single-feature-based diagnosis results.

Fault Isolation Results of Rolling Bearings
Once a fault has been found in the fault detection layer, the fault of the rolling bearing must be isolated. Using the testing sample set of Table 3, the location of the fault is diagnosed. The result is shown in Figure 9. From the figure, when the bearing is diagnosed based on the average rectified value (feature 4), standard deviation (feature 5) or root mean square (feature 7), the fault diagnosis rate is more than 99.9%. However, for the cases based on kurtosis (feature 6), waveform factor (feature 8), peak factor (feature 9), pulse factor (feature 10) and margin factor (feature 11), the diagnostic accuracy is lower than 70%. When the decision fusion method (number 12) is used for fault isolation layer diagnosis, the diagnostic accuracy is 98.57%. Although this is slightly lower than the highest diagnostic accuracy under a single feature, the fused result combines multiple features and thus avoids the possible error caused by the artificial selection of features. Therefore, the small reduction in diagnostic accuracy caused by decision fusion is acceptable.

Fault Severity Estimation of Rolling Bearings
After the fault location diagnosis of the rolling bearing is obtained in the fault isolation layer, the degree of fault is investigated.
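The order in which the three layers are consulted can be sketched as follows. This is an illustrative outline only: the classifiers are passed in as plain callables, and the toy stand-ins, sample keys, thresholds and severity labels are hypothetical rather than the trained BPNNs of the paper.

```python
def hierarchical_diagnosis(sample, detect, isolate, severity_by_location):
    """Run detection, isolation and severity estimation in order.

    A sample judged normal stops at the detection layer, so the later
    layers are evaluated only for faulty bearings.
    """
    if not detect(sample):
        return {"state": "normal"}
    location = isolate(sample)
    severity = severity_by_location[location](sample)
    return {"state": "fault", "location": location, "severity": severity}


# Hypothetical stand-ins for the trained networks (made-up thresholds).
detect = lambda s: s["rms"] > 0.1
isolate = lambda s: "inner race" if s["kurtosis"] > 4.0 else "outer race"
severity_by_location = {
    "inner race": lambda s: "severe" if s["rms"] > 0.5 else "slight",
    "outer race": lambda s: "severe" if s["rms"] > 0.8 else "slight",
}

healthy = hierarchical_diagnosis({"rms": 0.05, "kurtosis": 3.0},
                                 detect, isolate, severity_by_location)
faulty = hierarchical_diagnosis({"rms": 0.6, "kurtosis": 5.2},
                                detect, isolate, severity_by_location)
```

Keeping one severity estimator per fault location means that retraining for one location leaves the other estimators untouched.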
(1) Fault severity estimation results based on single time-domain features
For the same rolling bearing vibration signal sample, hierarchical diagnosis based on different time-domain features will produce different diagnosis results. This is because the experimental data cannot be perfectly ideal: they are easily disturbed by environmental or human factors during the experiment, and the selection of data samples also involves a certain randomness. According to the test sample set in Table 4, the fault of a bearing may lie between two fault degrees, which leads hierarchical diagnosis based on different time-domain features to give different diagnosis results, as shown in Figures 10-12.

According to the diagnosis results in Figures 10 and 13, the diagnosis results of inner race fault severity under each single feature and under decision fusion are obtained, as shown in Figure 16. The diagnostic accuracy for the wavelet packet reconstructed signal exceeds 90% for every single time-domain feature except kurtosis (feature 6). In particular, features 1-5, 7 and 8 are highly effective for estimating inner race fault severity, with a fault diagnosis rate of more than 99%. Fusing the diagnosis results of all time-domain features (number 12) gives a fault diagnosis rate of 100% for the wavelet packet reconstructed signal, which is not lower than that obtained under any single feature. This shows that the decision fusion diagnosis method improves the reliability of inner race fault diagnosis to some extent.
According to the diagnosis results in Figures 11 and 14, the diagnosis results of ball fault degree under each single feature and under decision fusion are obtained, as shown in Figure 17. The diagnostic accuracy of ball fault degree varies widely across the single features. Features 4, 5 and 7 of the wavelet packet reconstructed signal are effective for ball fault degree, with a fault diagnosis accuracy of more than 90%, whereas the accuracy based on kurtosis (feature 6) and waveform factor (feature 8) is clearly lower than that based on the other single features. When the diagnosis results of all single features are fused (number 12), the fault diagnosis accuracy for the wavelet packet reconstructed signal is 98.57%.
According to the diagnosis results in Figures 12 and 15, the diagnosis results of outer race fault degree under each single feature and under decision fusion are obtained, as shown in Figure 18. When the fault degree of the outer race is identified, the diagnostic accuracy of every single time-domain feature is over 98%. When the diagnosis results of all time-domain features are fused (number 12), the fault diagnosis rate for the wavelet packet reconstructed signal is 100%, which shows that the decision fusion diagnosis method is more comprehensive and reliable than single-feature diagnosis when diagnosing the fault degree of the outer race.

(4) Comparison with other methods
Based on the bearing data of Case Western Reserve University, typical published diagnosis results are shown in Table 6. It can be seen that the proposed method not only detects bearing faults accurately, but also accurately locates the fault and estimates the degree of damage.
In this paper, the time-domain features in Table 5 are extracted from the wavelet packet reconstructed signal of the rolling bearing, and a hierarchical diagnosis of the rolling bearing is carried out based on the decision fusion method. To demonstrate the positive effect of wavelet packet decomposition and signal reconstruction on fault diagnosis, the time-domain features in Table 5 are also extracted from the raw vibration signals without wavelet packet decomposition, and the results of the hierarchical diagnosis are compared in Figure 19. After wavelet packet signal reconstruction, the diagnostic accuracy of the rolling bearing in all layers improves to varying degrees. Therefore, wavelet packet signal reconstruction is an effective data preprocessing method for rolling bearing fault diagnosis and has a positive effect on the diagnosis results.
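To make the preprocessing step concrete, the sketch below decomposes a signal into wavelet packet nodes and reconstructs one full-length signal per node by zeroing all other nodes before inverting the transform. The Haar wavelet, the decomposition level and all function names are assumptions chosen to keep the example self-contained; the wavelet basis and level actually used in the paper are not restated in this section. Nodes are returned in natural (not frequency) order, and the signal length must be divisible by 2 to the power of the level.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_split(x):
    """One Haar analysis step: return (approximation, detail) coefficients."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """Inverse of haar_split: rebuild the parent coefficients."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x


def wp_decompose(x, level):
    """Full wavelet packet tree: split every node at every level."""
    if len(x) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_split(node))
        nodes = next_nodes
    return nodes


def wp_reconstruct_node(nodes, index):
    """Reconstruct a full-length signal from one leaf node only."""
    kept = [n if i == index else [0.0] * len(n) for i, n in enumerate(nodes)]
    while len(kept) > 1:
        kept = [haar_merge(kept[i], kept[i + 1]) for i in range(0, len(kept), 2)]
    return kept[0]


# Toy signal (made up): two sinusoids, 16 samples, 3-level decomposition.
signal = [math.sin(0.3 * i) + 0.5 * math.sin(2.1 * i) for i in range(16)]
nodes = wp_decompose(signal, level=3)
bands = [wp_reconstruct_node(nodes, i) for i in range(len(nodes))]
```

Because the transform is linear and invertible, the reconstructed band signals add up to the original signal, and time-domain features can then be extracted from each band separately.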

Fault Diagnosis Results under Variable Load
All the diagnosis results above are based on zero-load (HP0) vibration data. To verify the adaptability of the proposed method to load change, the fault data under three constant load conditions, namely 1 horsepower (HP1), 2 horsepower (HP2) and 3 horsepower (HP3), were input into the BPNNs trained on zero-load data, and decision fusion diagnosis was used to diagnose the rolling bearings.
From Table 7, it can be seen that when the diagnosis was extended to other loads, each layer still reached a high diagnostic rate, except for the fault degree diagnosis of the ball. In particular, for fault detection, inner race fault degree estimation and outer race fault degree estimation, the diagnosis rate changed little after the working condition was changed to a nonzero load. The proposed method therefore adapts well to load change. In addition, the diagnostic rate of the wavelet packet reconstructed signal is higher than that of the raw vibration signal under the same working condition, which indicates the necessity of wavelet packet reconstruction of the original vibration signal.
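The evaluation protocol described here, in which a classifier trained at one load is scored on data from other loads, can be outlined as below. The classifier, the samples and the labels are made-up placeholders, not values from Table 7, and the function names are hypothetical.

```python
def diagnosis_rate(predicted, actual):
    """Percentage of samples whose predicted label matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def rate_per_load(classify, test_sets):
    """Score one fixed classifier on the test set of each load condition."""
    return {
        load: diagnosis_rate([classify(x) for x in samples], labels)
        for load, (samples, labels) in test_sets.items()
    }


# Placeholder classifier standing in for a network trained on HP0 data.
classify = lambda x: "fault" if x > 0.5 else "normal"

# Made-up test sets: (samples, true labels) per load condition.
test_sets = {
    "HP1": ([0.2, 0.9, 0.7, 0.1], ["normal", "fault", "fault", "normal"]),
    "HP2": ([0.4, 0.6, 0.3, 0.8], ["normal", "fault", "fault", "fault"]),
}
rates = rate_per_load(classify, test_sets)
```

The classifier is never refitted inside `rate_per_load`, which is what makes the per-load rates a measure of adaptability rather than of training quality.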

Conclusions
In this paper, a hierarchical diagnosis method for rolling bearings based on decision fusion is proposed, and the bearing data of Case Western Reserve University are used to verify the method. The results show that the proposed method can accurately detect the bearing running state, isolate the fault position and estimate the fault severity of the bearing. Because most bearings are in a normal state and are thus resolved at the fault detection layer, the calculation cost of the proposed method is low. At the same time, the hierarchical diagnosis method only requires the related neural networks to be updated, which enhances the maintainability of the algorithm. In addition, the proposed method adapts well to load change. Compared with diagnosis based on the raw vibration signal, the fault detection accuracy with the wavelet packet reconstructed signal is equivalent, but the accuracy of fault isolation and fault severity estimation is clearly improved. The decision fusion is realized by the voting method, which reflects the comprehensive diagnosis results of all features and avoids the missed diagnosis and misdiagnosis caused by the insensitivity of certain features to specific faults. However, whether this method is suitable for bearing multi-fault diagnosis needs further study. In addition, decision fusion that considers frequency-domain feature diagnosis results, and the fusion of different machine learning algorithms, also need further study.