Rolling Bearings Fault Diagnosis Based on Tree Heuristic Feature Selection and the Dependent Feature Vector Combined with Rough Sets

Rolling element bearings (REB) are widely used in all walks of life, and they play an important role in the healthy operation of all kinds of rotating machinery. Therefore, the fault diagnosis of REB has attracted substantial attention. Fault diagnosis methods based on time-frequency signal analysis and intelligent classification are widely used for REB because of their effectiveness. However, there still exist two shortcomings in these fault diagnosis methods: (1) a large amount of redundant information is difficult to identify and delete; (2) aliasing patterns decrease the methods' classification accuracy. To overcome these problems, this paper puts forward an improved fault diagnosis method based on tree heuristic feature selection (THFS) and the dependent feature vector combined with rough sets (RS-DFV). In the RS-DFV method, the feature set is optimized through the dependent feature vector (DFV). Furthermore, the DFV reveals the essential differences among different REB faults and improves the accuracy of fault description. Moreover, the rough set is utilized to reasonably describe the aliasing patterns and overcome the problem of abnormal termination in DFV extraction. In addition, a tree heuristic feature selection method (THFS) is devised to delete the redundant information and construct the structure of the RS-DFV. Finally, a simulation, four other feature vectors, three other feature selection methods and four other fault diagnosis methods are utilized for REB fault diagnosis to demonstrate the effectiveness of the RS-DFV method. RS-DFV obtained an effective subset of five features out of 100 and acquired very good diagnostic accuracy (100%, 100%, 99.51%, 100%, 99.47%, 100%), much higher than in all comparative tests. The results indicate that the RS-DFV method can select an appropriate feature set, fully exploit the effectiveness of the features and more exactly describe the aliasing patterns.
Consequently, this method performs better in REB fault diagnosis than the original intelligent methods.


Introduction
Rolling bearings are key components of the various rotating machines that are widely used in all walks of life [1,2]. The failure of a bearing can lead to serious consequences such as grave safety accidents, long production stoppages and great pecuniary losses. Thus, accurate status monitoring and timely fault identification of REB are significant for the safe and regular operation of large machinery [3,4]. Accordingly, research and development in REB fault diagnosis has great social significance and economic value [5][6][7].
In recent years, research on rolling bearing fault diagnosis has mainly focused on two kinds of methods [7]: (1) fault diagnosis based on feature extraction and fault recognition; (2) self-learning and self-evolving fault diagnosis based on deep learning. The first kind of method first obtains fault symptoms from fault signals by signal analysis and then identifies fault types through fault recognition. Feature extraction [8][9][10] and fault recognition are the two determinant steps that affect the accuracy of fault diagnosis; therefore, the related research has mainly focused on feature extraction methods [11,12] and fault recognition methods. The second kind of method [13] does not need advance feature extraction; it can learn complex fault parameters from the original fault data and realize fault identification autonomously. The related research has mainly focused on deep neural networks and the various deep learning methods based on them.
Signal processing is the most effective and widely used approach for fault feature extraction and is a significant basis for machinery fault diagnosis. Recently, many kinds of signal analysis methods have been widely used in fault diagnosis, such as the wavelet transform [14,15], empirical mode decomposition (EMD) [16,17], variational mode decomposition (VMD) [18,19] and singular value decomposition (SVD) [20,21]. They are the most important signal analysis methods in REB fault diagnosis because of their prominent effectiveness; after long-term in-depth research, all of them can extract sufficiently good fault features of rolling bearings. In addition, intelligent algorithms are another key technology for fault diagnosis, and a variety of such algorithms [9,22-24], such as support vector machines [9,22], clustering algorithms [23] and neural networks [24], are used to design and improve intelligent fault diagnosis systems. Signal processing and intelligent classification algorithms cooperate efficiently in REB fault diagnosis.
With the capacity of automatically learning complex features of input data, deep learning architectures have great potential to overcome the drawbacks of traditional intelligent fault diagnosis. Accordingly, deep learning algorithms have recently been applied widely in machine health monitoring [2]. Duy-Tang Hoang [2] proposed a method for diagnosing bearing faults based on a deep convolutional neural network structure, and this method has high accuracy and robustness in noisy environments. With a deep convolutional neural network as the main structure, Xiang Li [13] proposed a novel domain adaptation method for rolling bearing fault diagnosis; it minimizes the maximum mean discrepancy between the source domain and the target domain in a multi-kernel structure and significantly improves the performance of cross-domain testing. An integrated deep learning diagnosis method based on multi-objective optimization was proposed by Sai Ma [5]; this method weights and integrates a convolutional residual network (CRN), a deep belief network (DBN) and a deep autoencoder (DAE) to realize effective diagnosis of bearing faults. Zhiyu Zhu [6] presented a capsule network with an inception block and a regression branch, which improved the generalization ability of deep neural networks. Wentao Mao [25] designed a deep output kernel learning method to conduct collaborative diagnosis of multiple bearing fault types, which improved the generalization ability and robustness of the diagnosis model. Shao Haidong [26] proposed an ensemble deep auto-encoding method to overcome the dependence of deep learning models on traditional artificial feature extraction and the limitations of some deep learning models.
However, there still exist two shortcomings in traditional REB fault diagnosis methods: (1) a large amount of redundant information is difficult to identify and delete [27]; (2) aliasing patterns decrease the classification accuracy [27] and increase the complexity of fault diagnosis. Although deep learning techniques can extract more representative features from bearing fault data adaptively, they usually have a high computational cost, slow convergence and unavoidable randomness [25].
To overcome these two shortcomings of traditional REB fault diagnosis methods, this paper proposes an improved intelligent fault diagnosis method for REB. The major contributions of this work are as follows: (1) an adaptive fault description model is established based on the RS-DFV; (2) a rough set is used to enhance the description accuracy and the self-healing nature of the RS-DFV; and (3) a tree heuristic feature selection method is devised to delete the redundant information and construct the structure of the RS-DFV. Therefore, the fault diagnosis method presented in this paper improves the algorithm performance in terms of accuracy, complexity and timeliness and greatly advances the efficiency and practicability of REB fault diagnosis.
To improve the validity and timeliness of REB fault diagnosis, a fault diagnosis method based on the RS-DFV is designed for REB. The rest of this paper is organized as follows: Section 2 recounts the concept of the RS-DFV, the tree heuristic feature selection method (THFS) and the RS-DFV extraction method. The evaluation of the feature vector and the classifier are introduced in Section 3. In Section 4, the REB fault diagnosis tests are described in detail, and the experimental results are analyzed and discussed. Finally, the conclusions of this study are presented in Section 5.

The Framework of REB Fault Diagnosis Based on THFS and RS-DFV
The fault diagnosis method based on THFS and RS-DFV proposed in this paper includes two crucial components: (1) a unique feature selection method (THFS), which must not only remove redundant features to the maximum extent but also determine the composition of the DFV and establish its logical structure; (2) a feature organization method (RS-DFV), which can not only overcome the interference of overlapping samples and dig out the complex relationships between faults and features but also improve the accuracy of fault representation. The framework of REB fault diagnosis based on THFS and RS-DFV is shown in Figure 1.
Appl. Sci. 2019, 9, x FOR PEER REVIEW

The Basic Concept of DFV
To simulate the object description method of the human brain, one study [22] designed a special feature vector, called the dependent feature vector (DFV), for REB fault description. The topology and logical structure of the DFV are displayed in Figure 2a. In a DFV, there must be at most one leading feature (LF), while a DFV may include many dependent features (DFs) or none. The DFV can mine the essential differences among all kinds of faults through its unique nested structure. Moreover, the difference between different faults is magnified by means of the adaptive invalid features of the DFV, and the difference among faults of the same type is reduced at the same time, as illustrated in Figure 2b and Equations (1)-(7). Therefore, the DFV greatly improves the accuracy of fault description and fault diagnosis.
In a DFV, the valid features must be acquired through the analysis of sample data. However, invalid features do not have to be calculated; it is only necessary to subjectively assign an effective value to them. The unified evaluation mechanism of the invalid items in a DFV greatly improves its fault discrimination.
For the examples in Figure 2b, F1 and F2 are two different fault types, and xij is the jth feature item of fault Fi. |fv1 − fv2| is the Euclidean distance between F1 and F2 based on the traditional feature vector, and |DFV1 − DFV2| is the Euclidean distance between F1 and F2 based on the DFV. As long as the value (X) of the invalid features is suitable, Equations (3) and (4) hold. It is obvious that the DFV can magnify the difference between different faults through the value assignment of the invalid features. |fv11 − fv12| is the Euclidean distance between two faults of type F1 based on the traditional feature vector, and |DFV11 − DFV12| is the Euclidean distance between two faults of type F1 based on the DFV. No matter what the value (X) of the invalid features is, Equation (7) holds. Obviously, the DFV is able to significantly lessen the discrepancies among faults of the same type.
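As a numerical illustration of this magnification effect, the sketch below compares Euclidean distances with and without the DFV's invalid-feature assignment. The feature values and the invalid-item value X are made up for illustration and are not taken from the paper:

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical features of two different fault types F1 and F2. Suppose only
# the first feature is a valid item for F1 and only the second for F2;
# the remaining items are invalid.
fv1 = [0.90, 0.42, 0.40]          # traditional feature vector of an F1 sample
fv2 = [0.88, 0.45, 0.41]          # traditional feature vector of an F2 sample

X = 5.0                           # common value assigned to all invalid items
dfv1 = [0.90, X, X]               # DFV of the F1 sample
dfv2 = [X, 0.45, X]               # DFV of the F2 sample

# The invalid-item assignment magnifies the between-class distance ...
assert euclid(dfv1, dfv2) > euclid(fv1, fv2)

# ... while two samples of the same type share the same invalid items, so
# their distance depends only on the valid items and shrinks.
fv1b  = [0.91, 0.60, 0.20]        # another F1 sample with noisy invalid items
dfv1b = [0.91, X, X]
assert euclid(dfv1, dfv1b) < euclid(fv1, fv1b)
print(euclid(fv1, fv2), euclid(dfv1, dfv2))
```

This mirrors the inequalities the paper attributes to Equations (3), (4) and (7): a suitable X enlarges between-class distances and suppresses within-class ones.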

The Tree Heuristic Feature Selection
• Feature evaluation based on information granulation and neighboring clustering

Different feature selection methods require different feature evaluation criteria, but even excellent traditional feature evaluation methods cannot provide the best heuristic knowledge for the feature selection method proposed in this paper. Therefore, a feature evaluation method based on information granulation and neighboring clustering is put forward for the tree heuristic feature selection (THFS). In this method, information granulation and neighboring clustering are first used to delete the features that are obviously ineffective in fault distinguishing; then, the remaining features are evaluated through Equations (8)-(11). The specific steps are introduced below.
Step 1: The value range of one feature is divided evenly into many small granules in accordance with the same criterion, as illustrated in Figure 3①-②. The granules that include some feature values of samples are black granules, and the others are white granules.
Step 2: The adjoining granules of the same type are merged into a larger one through the neighboring clustering method, as illustrated in Figure 3②-③. The black granules that include very few samples and the white granules with a tiny length are outliers (Figure 3④).
Step 3: Each black outlier is first changed to white and amalgamated with the white granules adjacent to it; then, each white outlier is changed to black and amalgamated with the black granules adjacent to it, as illustrated in Figure 3④-⑤.
Step 4: If there is only one black granule, or there are two black granules that both contain samples of the same fault type, this feature is considered ineffective in fault distinguishing and is deleted.
Step 5: The remaining features are evaluated through Equations (8)-(11), where Li is the length of the No. i black granule, xi+ is its upper bound, and xi− is its lower bound; Li(i+1) is the length of the white granule between the No. i and No. (i+1) black granules; Vi(i+1) is the separability between the No. i and No. (i+1) black granules; N is the number of black granules; and P(x) is the score of this feature, a large P(x) indicating high separability among the black granules.
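The granulation and scoring steps above can be sketched as follows. Since the exact forms of Equations (8)-(11) are not reproduced in this excerpt, the separability and the score P(x) below are illustrative stand-ins (a white-gap ratio and its average), and the outlier correction of Step 3 is omitted for brevity:

```python
def granulate(values, labels, n_granules=20):
    """Step 1: divide the value range of one feature evenly into granules and
    record, for each granule, the fault types whose samples fall inside it
    (an empty set marks a white granule)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_granules or 1.0
    grains = [set() for _ in range(n_granules)]
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), n_granules - 1)
        grains[idx].add(y)
    return lo, width, grains

def merge_runs(grains):
    """Step 2: merge adjoining granules of the same colour (black = non-empty)
    into runs of the form [is_black, length_in_granules, fault_types]."""
    runs = []
    for g in grains:
        black = bool(g)
        if runs and runs[-1][0] == black:
            runs[-1][1] += 1
            runs[-1][2] |= g
        else:
            runs.append([black, 1, set(g)])
    return runs

def score(runs):
    """Steps 4-5 (illustrative stand-in for Equations (8)-(11)): a feature
    with fewer than two black runs is useless; otherwise the separability of
    two neighbouring black runs is the white-gap length relative to their
    total span, and P(x) is the average separability."""
    if sum(1 for r in runs if r[0]) < 2:
        return 0.0
    seps = [runs[i + 1][1] / (runs[i][1] + runs[i + 1][1] + runs[i + 2][1])
            for i in range(len(runs) - 2)
            if runs[i][0] and not runs[i + 1][0] and runs[i + 2][0]]
    return sum(seps) / max(len(seps), 1)

# Two fault types whose feature values sit in well-separated clusters:
lo, w, grains = granulate([0.10, 0.12, 0.15, 0.80, 0.82, 0.85],
                          ["F1"] * 3 + ["F2"] * 3, n_granules=10)
runs = merge_runs(grains)
print(score(runs))  # a wide white gap between two black runs scores highly
```

A feature whose black runs are interleaved or separated by only narrow white gaps would score lower, matching the intent of P(x).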

• Feature selection based on the tree heuristic search strategy

The structural establishment of the feature vector is a very important basis for the DFV. To adapt to the uniqueness of the DFV, this study proposes a feature selection method based on the tree heuristic search strategy to establish the structure of the DFV. The workflow of THFS is illustrated in Figure 4. The most efficient feature (xk1) for the sample space is first chosen through the feature evaluation method based on information granulation and neighboring clustering introduced above, and the black granules are obtained. Each black granule includes some fault types, and the samples of these types make up a fault subspace. Then, the sample space is made the root node, the subspaces are made the leaf nodes, and the xk1 value range of each leaf node is marked. Thus, the local structure connected to the root of the heuristic tree is established, as illustrated in Figure 4a. If a leaf node includes faults of more than one type, its subtree is built using the same method. When all leaf nodes contain faults of only one type, the heuristic tree has completed its growth and the complete tree is obtained, as illustrated in Figure 4b.
The optimized feature subset is obtained by traversing the heuristic tree. Because the position of each feature in a DFV is fixed, traversing the heuristic tree in a different order yields a different DFV. For example, in Figure 4b, depth-first traversal of the heuristic tree yields the DFV (xk1, xk2, xk3, xk4), whereas breadth-first traversal yields another DFV (xk1, xk2, xk4, xk3). Therefore, THFS not only completes the feature selection and the optimization of the feature subset but also establishes the structure of the DFV.
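The two traversal orders can be sketched on a small hypothetical heuristic tree. The node names and layout below are assumptions, chosen so that the traversals reproduce the two example DFVs quoted from Figure 4b:

```python
from collections import deque

# A hypothetical heuristic tree in the spirit of Figure 4b: each node stores
# the feature chosen to split it and its child subspaces.
tree = {
    "root": {"feature": "xk1", "children": ["n1", "n2"]},
    "n1":   {"feature": "xk2", "children": ["n3", "n4"]},
    "n2":   {"feature": "xk4", "children": []},
    "n3":   {"feature": "xk3", "children": []},
    "n4":   {"feature": None,  "children": []},   # pure leaf, no further split
}

def depth_first_features(tree, node="root"):
    """Collect split features in depth-first order."""
    feats = []
    if tree[node]["feature"]:
        feats.append(tree[node]["feature"])
    for child in tree[node]["children"]:
        feats += depth_first_features(tree, child)
    return feats

def breadth_first_features(tree, root="root"):
    """Collect split features in breadth-first order."""
    feats, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if tree[node]["feature"]:
            feats.append(tree[node]["feature"])
        queue.extend(tree[node]["children"])
    return feats

print(depth_first_features(tree))    # ['xk1', 'xk2', 'xk3', 'xk4']
print(breadth_first_features(tree))  # ['xk1', 'xk2', 'xk4', 'xk3']
```

Both orders use the same selected features; only the fixed positions of the items in the resulting DFV differ, which is exactly why the traversal strategy determines the DFV structure.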

DFV Extraction Tree
Extracting the DFV of fault samples is another key step of the fault diagnosis method presented in this paper. The DFV differs from the traditional feature vector because of its unique structure. In a DFV, the importance of each feature item is not equal: some of them are valid feature items, and the others are invalid. More importantly, faults of different types have different valid and invalid feature items, i.e., the same feature is of different importance for different faults. In addition, only the valid feature items of a DFV need to be calculated from the sample data; the values of the invalid items are set subjectively, based on the requirement of the fault classification. All these differences considerably increase the difficulty of DFV extraction. Moreover, for a fault sample that has to be diagnosed, the features that are valid items in the sample's DFV are unknown, which makes the DFV extraction more difficult.
To overcome those problems in the DFV extraction mentioned above, a tree heuristic feature extraction method is put forward to extract the DFV in this paper, as illustrated in Figure 5.

First, a DFV extraction tree (DET) that inherits the heuristic rules of the feature selection tree (FST) is put forward in this paper, as illustrated in Figure 5③. The DET is constructed based on the FST: (1) The FST is traversed; when a non-leaf node is passed, the traversal continues; when a feature selection segment is passed, a feature calculation node of the DET is built; when a leaf node is passed, an end node of the DET is built. (2) The connection relationships among the DET nodes are established, and consistency with the FST is ensured. (3) According to the value ranges of the child nodes that share the same parent node in the FST, heuristic rules that guide the subsequent feature calculation of the DET are built.
Then, for a new fault sample, the values of all valid feature items are calculated in turn according to the guidance of the DET; hence, the valid feature items of the DFV are obtained. As illustrated in Figure 6, different faults have different valid features, but all their valid features can be accurately calculated through the DET. Finally, the other feature items in the DFV are the invalid feature items of this fault sample, and all invalid features are subjectively assigned the same specific value. Therefore, the DFV of every fault sample can be acquired in this manner.
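A minimal sketch of a DET-guided extraction is given below. The tree, its value-range rules and the invalid-item value X are hypothetical, and a plain dictionary lookup stands in for the real feature calculation from signal data:

```python
# Hypothetical DET: each calculation node names the feature to compute and
# heuristic rules mapping a value interval to the next node; end nodes name
# the diagnosed fault type.
DET = {
    "n0": {"feature": "xk1",
           "rules": [((0.0, 0.5), "n1"), ((0.5, 1.0), "end_F3")]},
    "n1": {"feature": "xk2",
           "rules": [((0.0, 0.3), "end_F1"), ((0.3, 1.0), "end_F2")]},
}
X_INVALID = 5.0                      # common value for all invalid items
ALL_FEATURES = ["xk1", "xk2"]

def extract_dfv(sample, det, root="n0"):
    """Walk the DET: compute only the valid features along the path, then
    fill every remaining (invalid) item with the common value X."""
    dfv, node = {}, root
    while not node.startswith("end_"):
        feat = det[node]["feature"]
        value = sample[feat]         # stands in for real feature calculation
        dfv[feat] = value
        for (lo, hi), nxt in det[node]["rules"]:
            if lo <= value < hi:
                node = nxt
                break
        else:
            raise RuntimeError("abnormal termination: no rule matched")
    for feat in ALL_FEATURES:
        dfv.setdefault(feat, X_INVALID)
    return dfv, node.removeprefix("end_")

dfv, fault = extract_dfv({"xk1": 0.7, "xk2": 0.9}, DET)
print(dfv, fault)   # xk2 is invalid for this sample, so it gets the value X
```

The `RuntimeError` branch marks exactly the abnormal termination discussed in the next subsection, where a feature value satisfies none of the heuristic rules.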

RS-DFV Extraction Tree
The tree heuristic feature extraction method overcomes the difficulties in DFV extraction: it can efficiently and accurately start the calculation for an unknown fault sample and successfully computes the DFV of most fault samples. However, there is still another weakness: the feature value obtained in a parent feature calculation node of the DFV extraction tree may not meet the demand of any heuristic rule connected to this feature, leaving no heuristic information for the next step, as illustrated in Figure 7a,b. This can lead to an abnormal termination of the DFV extraction and an inaccurate calculation of the sample DFV.
To overcome this problem, a rough set is introduced to solve the boundary problem of the heuristic rules in the DFV extraction tree, and a DFV extraction method based on the rough set and the DFV extraction tree is presented in this paper. As illustrated in Figure 7c, a new DFV extraction tree combining the rough set and the DET (RS-DET) is designed. The RS-DET adds an abnormal termination correction segment (ATCS) to each node that has more than one child node. When an abnormal termination of the DFV extraction occurs, the corresponding ATCS responds quickly and reasonably amends the value of the current heuristic feature to make it instructive. Obviously, the RS-DET can effectively overcome the abnormal suspension of the DET and complete the DFV extraction of every sample. The procedure of the ATCS is displayed below.
Step 1: Extract the DFV in accordance with the instructions of the RS-DET until an abnormal termination appears; record the current fault sample as an AT fault sample, the current node as an AT node (ATN), and the current heuristic feature as an AT feature (xAT).
Step 2: Find all the child nodes of the ATN in the RS-DET and put them into the node set CN-ATN.
Step 3: For each node in CN-ATN, find the fault types included in it, construct a rough set on xAT for each fault type, and calculate the roughness of this node based on Figure 8 and Equations (12)-(14).
Step 4: Amend the value of xAT according to the roughness of the child nodes, as illustrated in Figure 9 and Equations (12) and (13).
Step 5: Go back to Step 1.

Feature Correction based on the Rough Set
The ATCS could overcome the problem of abnormal termination in DFV extraction; hence, it is the most important part of the RS-DET. In this paper, a rough set is applied to amend the current heuristic feature.
First, the fault sample that caused the abnormal termination is treated as a sample of fault type F, and the rough set of F is obtained through the method illustrated in Figure 8. F−(x) is the value range of the current heuristic feature over all the training samples of fault type F, and RN(F, xAT) is the roughness of fault type F based on the current heuristic feature. Through Equation (12), the roughness of every fault type based on the current heuristic feature can be acquired. Then, for the child nodes of the ATN, RN(CNh, xAT) is the roughness of the hth child node; as illustrated in Equation (13), it is the minimum roughness over all fault types in the hth child node. Finally, the child node containing the fault type with the minimum roughness is found, the value range X of the current heuristic feature over all the training samples in this child node is obtained, and the value of xAT is amended toward X based on Equation (14). In this way, the ATCS solves the problem of abnormal termination in DFV extraction and makes it possible for the DFV extraction to be completed accurately.
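The correction described above can be sketched as follows. Since the exact forms of Equations (12)-(14) are not reproduced in the text, the roughness measure (normalized distance of xAT from a fault type's training value range) and the amendment rule (clipping xAT into the best child node's range) used below are illustrative assumptions, not the paper's definitive formulas.

```python
# Hypothetical sketch of the ATCS feature correction based on a rough set.
# The roughness and amendment rules are assumed forms of Equations (12)-(14).

def roughness(x_at, values):
    """Assumed Eq. (12): 0 if x_at lies inside the fault type's training
    value range [lo, hi]; otherwise the distance to the range, normalized
    by the range width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) or 1e-12            # guard against degenerate ranges
    if x_at < lo:
        return (lo - x_at) / width
    if x_at > hi:
        return (x_at - hi) / width
    return 0.0

def correct_at_feature(x_at, child_nodes):
    """child_nodes: {node_id: {fault_type: [training feature values]}}.
    Eq. (13): a child node's roughness is the minimum over its fault types.
    Assumed Eq. (14): clip x_at into the value range X of the best node."""
    node_rn = {
        node: min(roughness(x_at, vals) for vals in faults.values())
        for node, faults in child_nodes.items()
    }
    best = min(node_rn, key=node_rn.get)  # child node with minimum roughness
    pooled = [v for vals in child_nodes[best].values() for v in vals]
    return min(max(x_at, min(pooled)), max(pooled)), best
```

For example, if the AT feature value falls just outside one child node's range but far from another's, the sketch amends it toward the nearer node, restoring an instructive value for the next extraction step.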

The Evaluation of the Feature Vector
The Euclidean distance is widely used to evaluate feature sets in fault diagnosis. The average Euclidean distance presented in Equation (15) is put forward in this paper to confirm the superiority of the THFS and the RS-DFV. The average Euclidean distance between different fault types reflects the dispersion among different faults, and the average Euclidean distance among the faults of the same type measures the compactness of that fault.
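A minimal sketch of this evaluation, under one assumed reading of Equation (15): the average Euclidean distance between two fault types is the mean of all pairwise sample distances between them (applied to a single set, it gives the intra-class average used for compactness).

```python
import numpy as np

def avg_euclidean_distance(Fi, Fj):
    """Assumed form of Equation (15): the mean pairwise Euclidean distance
    between the h samples of fault type i and the k samples of fault type j.
    Each row of Fi / Fj is one fault sample's feature vector."""
    Fi, Fj = np.asarray(Fi, float), np.asarray(Fj, float)
    diffs = Fi[:, None, :] - Fj[None, :, :]   # shape (h, k, dim)
    return float(np.linalg.norm(diffs, axis=2).mean())
```

Note that calling it with Fi == Fj includes zero self-distances in the average; whether the paper excludes those pairs is not stated, so this is one possible convention.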

The average Euclidean distance between the ith fault type and the jth fault type is dij; the Euclidean distance between fault Fi and fault Fj is MSE(Fi, Fj); i(m) is the mth fault sample of the ith fault type, and j(n) is the nth fault sample of the jth fault type; h is the number of fault samples of the ith fault type, and k is the number of fault samples of the jth fault type.
Then, another parameter, DA, is presented to measure the fault distinguishing ability of feature vectors, as depicted by Equations (16) and (17). DAi(min) is the worst distinguishing ability of the feature vector for the ith fault, and DAi(ave) is the average distinguishing ability of the feature vector for the ith fault; dij is the average Euclidean distance between the ith fault type and the jth fault type; dii is the average Euclidean distance among all samples of the ith fault type; min(dij) is the minimum average Euclidean distance between the ith fault type and the others; and cn is the number of fault types. DAi(min) and DAi(ave) not only consider the dispersion among different faults but also account for the compactness of each fault; therefore, they reflect the fault distinguishing ability of a feature vector well. The bigger the values of DAi(min) and DAi(ave), the better the fault distinguishing ability of the feature vector.
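The DA parameters can be sketched from the quantities just defined. The exact forms of Equations (16) and (17) are not reproduced in the text, so the ratios below (inter-class distance over intra-class distance) are an assumed reading consistent with the stated definitions.

```python
import numpy as np

def distinguishing_ability(d):
    """d: cn x cn matrix of average Euclidean distances, with d[i][i] the
    average intra-class distance dii of the ith fault type.
    Assumed Eqs. (16)-(17):
      DA_i(min) = min over j != i of d[i][j] / d[i][i]
      DA_i(ave) = (mean over j != i of d[i][j]) / d[i][i]"""
    d = np.asarray(d, float)
    da_min, da_ave = [], []
    for i in range(d.shape[0]):
        inter = np.delete(d[i], i)        # distances to the other faults
        da_min.append(float(inter.min() / d[i, i]))
        da_ave.append(float(inter.mean() / d[i, i]))
    return da_min, da_ave
```

Larger values mean the ith fault is both compact (small dii) and far from the others (large dij), matching the interpretation in the text.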

The Intelligent Fault Classification Method
The fault classification method is another significant determinant of the timeliness and accuracy of fault diagnosis. Thus, it is crucial to design a suitable fault classification method according to the specificity of the faults and the feature vector.
For the REB fault diagnosis discussed in this study, the unique feature vector DFV not only had a very simple structure but also made the correspondence between the feature vectors and the REB faults absolutely clear. In this case, general classifiers such as BP, RBF, PNN and SVM could all achieve a satisfactory fault classification accuracy. By comparison, the PNN is easy to train, has better real-time performance and a simpler structure, and its calculation is simpler and faster [23]. Most importantly, the PNN has been proved to be very suitable for classification problems of this type [23]. Consequently, this paper chose the PNN to design the intelligent fault classification method.
The probabilistic neural network (PNN), based on the Parzen probability density function and the Bayesian classification rule, was proposed by Specht [24]. The PNN has four layers, namely the input layer, the pattern layer, the summation layer and the output layer, and its basic structure is illustrated in Figure 10. As found in many studies [22,23], the PNN is not only easy to train but also excellent in timeliness. Furthermore, given sufficient training samples, the PNN can obtain the optimal classification results. Only a brief introduction of the PNN theory is given here; the detailed explanation was given by Specht (1990) [24].
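The four-layer structure described above can be condensed into a few lines: the pattern layer evaluates one Gaussian Parzen kernel per training sample, the summation layer averages the kernels per class, and the output layer picks the class with the largest sum. This is a minimal sketch of Specht's PNN, not the paper's implementation; the smoothing parameter sigma and its value are assumptions.

```python
import numpy as np

class PNN:
    """Minimal probabilistic neural network (after Specht, 1990).
    Pattern layer: Gaussian kernel centred on each training sample.
    Summation layer: per-class average of the kernel activations.
    Output layer: argmax over the class sums (Bayesian decision rule
    with equal priors)."""

    def __init__(self, sigma=0.1):
        self.sigma = sigma                      # Parzen smoothing parameter

    def fit(self, X, y):
        # "Training" only stores the samples: one pattern unit per sample.
        self.X, self.y = np.asarray(X, float), np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, X):
        out = []
        for x in np.asarray(X, float):
            k = np.exp(-np.sum((self.X - x) ** 2, axis=1)
                       / (2 * self.sigma ** 2))  # pattern layer
            scores = [k[self.y == c].mean() for c in self.classes]
            out.append(self.classes[int(np.argmax(scores))])
        return np.array(out)
```

The one-pass "training" (just storing samples) is what makes the PNN fast to train, as the text notes, at the cost of evaluating every stored sample at prediction time.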

The Fault Data of Rolling Element Bearings
In this section, the proposed method (RS-DFV) is applied to REB fault diagnosis. The experimental fault data of this paper come from the Bearings Vibration Dataset of Case Western Reserve University [25-27]. As shown in Figure 11, the test stand consists of a 2 hp motor (left), a torque transducer/encoder (center), a dynamometer (right), and control electronics (not shown). The test bearings support the motor shaft. Single-point faults were introduced to the test bearings using electro-discharge machining. Vibration data were collected using accelerometers, which were attached to the housing with magnetic bases and placed at the 12 o'clock position at the drive end of the motor housing. The signals of this dataset were collected at 48,000 samples/s under a 2 hp load, the defect sizes are 0.007 or 0.014 in, and the dataset does not contain fault-free data. In this paper, each original fault signal is truncated into 118 fault samples, as presented in Table 1.
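The truncation of each record into 118 fault samples amounts to slicing one long vibration signal into fixed-length segments. A sketch is given below; the 2048-point segment length is an assumption for illustration, since the paper states only the number of samples per record.

```python
import numpy as np

def truncate_signal(signal, n_samples=118, length=2048):
    """Split one long vibration record into fixed-length, non-overlapping
    fault samples. The per-sample length of 2048 points is an assumed
    value; the paper specifies only that each record yields 118 samples."""
    signal = np.asarray(signal, float)
    if signal.size < n_samples * length:
        raise ValueError("record too short for the requested segmentation")
    return signal[: n_samples * length].reshape(n_samples, length)
```

Each row of the result is then treated as one independent fault sample for feature extraction and classification.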


The Flow of the Fault Diagnosis Experiment
The application experiments were conducted according to Figure 1. First, each fault sample was decomposed through EMD, and the 20 feature parameters presented in Table 2 were extracted from the sample's first five intrinsic mode functions to constitute a feature set of 100 features. All fault samples were described by the feature vector composed of this feature set.

The feature parameters in Table 2 are divided into time-domain and frequency-domain parameters, where x(n) is a signal series for n = 1, 2, . . ., N, with N the number of data points; s(k) is a spectrum for k = 1, 2, . . ., K, with K the number of spectrum lines; and fk is the frequency value of the kth spectrum line.
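A sketch of this per-IMF feature computation is given below. Table 2's full list of 20 parameters is not reproduced in the text, so the subset here (RMS and kurtosis in the time domain, spectral centre in the frequency domain) is illustrative only, using the x(n), s(k) and fk notation just defined.

```python
import numpy as np

def imf_features(imf, fs=48000):
    """A few representative time- and frequency-domain parameters for one
    IMF (Table 2 lists 20; this subset is illustrative). x(n) is the
    series, s(k) the spectrum magnitudes, fk the spectrum-line
    frequencies; fs matches the dataset's 48,000 samples/s."""
    x = np.asarray(imf, float)
    rms = np.sqrt(np.mean(x ** 2))                       # time domain
    kurtosis = np.mean((x - x.mean()) ** 4) / np.var(x) ** 2
    s = np.abs(np.fft.rfft(x))                           # spectrum s(k)
    fk = np.fft.rfftfreq(x.size, d=1.0 / fs)             # frequencies fk
    centre = np.sum(fk * s) / np.sum(s)                  # spectral centre
    return {"rms": rms, "kurtosis": kurtosis, "freq_centre": centre}

def sample_features(imfs):
    """Concatenate the parameters of the first five IMFs of one fault
    sample into a single feature vector (20 x 5 = 100 in the paper;
    3 x 5 = 15 in this sketch)."""
    vec = []
    for imf in imfs[:5]:
        f = imf_features(imf)
        vec.extend([f["rms"], f["kurtosis"], f["freq_centre"]])
    return np.array(vec)
```

The EMD decomposition itself is omitted here; any EMD implementation producing the IMFs can feed `sample_features`.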
After that, the DFV of the six REB faults presented in Table 1 was established through the THFS proposed in this paper, and the FST was obtained at the same time. In accordance with the FST, the RS-DFV extraction tree (RS-DET) was set up, and the RS-DFVs of all the REB fault samples in Table 1 were calculated by means of the RS-DET.
Finally, the PNN was used to construct a fault classifier, and the values of the related parameters in this fault diagnosis experiment are presented in Table 3. For each test, 70 samples of each fault type described by the RS-DFV were randomly selected to train the PNN, and the remaining fault samples were used to verify the diagnostic accuracy and effectiveness. The diagnostic accuracy, training time and test time of each experiment were recorded. This fault diagnosis test was repeated 100 times with the same data under the same conditions, and the average recognition accuracy, training time and test time were calculated.
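The repeated random-split protocol just described can be sketched generically. The classifier is passed in as a factory so any fit/predict model (e.g. a PNN) can be plugged in; the seed parameter is an addition for reproducibility, not something the paper specifies.

```python
import numpy as np

def repeated_evaluation(X, y, classifier_factory,
                        n_train=70, repeats=100, seed=0):
    """Sketch of the paper's protocol: per fault type, randomly select
    n_train samples for training, test on the rest, and average the
    accuracy over `repeats` runs. classifier_factory() must return an
    object with fit(X, y) and predict(X)."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, float), np.asarray(y)
    accs = []
    for _ in range(repeats):
        train_idx = []
        for c in np.unique(y):                     # stratified selection
            idx = np.flatnonzero(y == c)
            train_idx.extend(rng.choice(idx, n_train, replace=False))
        mask = np.zeros(y.size, dtype=bool)
        mask[train_idx] = True
        clf = classifier_factory().fit(X[mask], y[mask])
        accs.append(np.mean(clf.predict(X[~mask]) == y[~mask]))
    return float(np.mean(accs))
```

With 118 samples per fault type and n_train=70, each run tests on the remaining 48 samples per type, matching the split described in the text.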

The Effectiveness of Tree Heuristic Feature Selection
Through the THFS, the FST of the six REB faults (RF1, RF2, RF3, RF4, RF5 and RF6) was built as illustrated in Figure 12. Through a depth-first traverse of this FST, the DFV of the six REB faults was obtained as (f32, f6, f95, f19, f2). The valid items and invalid items of each fault type are listed in Table 4. The Euclidean distances between the six faults based on the optimized feature set (f2, f6, f19, f32, f95) are recorded in Table 5, the minimum DA and average DA of each fault based on this feature set are recorded in Table 6, and the spatial distribution map based on this feature set is provided in Figure 13a.
Table 5 makes it clear that the average Euclidean distances between samples of the same type were much smaller than the average Euclidean distances between different fault types, and in Table 6 the smallest DA of the six faults is 1.8296, which indicates the excellent fault distinguishing ability of the feature vector (f2, f6, f19, f32, f95). In Figure 13a, there is a very clear distance between each fault and the others, and fault samples of the same type are relatively concentrated. All this means that the THFS put forward in this paper could find an excellent optimized feature subset for the six REB faults in an acceptable time.


The Effectiveness of RS-DFV
Based on the rough set and the DFV extraction tree, the RS-DFV extraction tree (RS-DET) of the six REB faults (RF1, RF2, RF3, RF4, RF5 and RF6) was built as illustrated in Figure 14. Through a depth-first traverse of this RS-DET, the RS-DFV of the six REB faults was obtained as (f32, f6, f95, f19, f2). The RS-DFV had the same structure, the same valid items and the same invalid items as the DFV. The Euclidean distances among the six faults based on the RS-DFV are recorded in Table 7, the spatial distribution map based on the RS-DFV is illustrated in Figure 13b, and the statistics of abnormal terminations in the DET and the RS-DET are presented in Table 8. In Table 7, the average Euclidean distances among samples of the same type were much smaller than the corresponding records in Table 5, and the average Euclidean distances between different fault types were much larger than the corresponding records in the same table. In Table 9, the minimum DA and average DA of each fault based on the RS-DFV were much larger than the corresponding records based on (f2, f6, f19, f32, f95) in Table 6. Compared with Figure 13a, the fault samples of the same type are much more concentrated in Figure 13b, and the different faults are much more scattered. All this reveals that the RS-DFV is much more remarkable in fault differentiation than the traditional feature vector. In Table 8, there were always a few abnormal interruptions in the DFV extraction based on the DFV extraction tree (DET), but these interruptions were eliminated in the RS-DFV extraction based on the RS-DET. This indicates that a rough set can effectively overcome the problem of abnormal interruptions in DFV extraction and that the RS-DFV is a more efficient fault description method.

The Results of Fault Diagnosis
According to the RS-DET, (f32, f6, f95, f19, f2) was selected to constitute the RS-DFV. The comparison between the effects of the RS-DFV and the DFV in REB fault diagnosis is presented in Table 10. To verify the effectiveness of the RS-DFV in REB fault diagnosis, three other comparison experiments of REB fault diagnosis were designed and carried out under the same conditions with the same data and classification methods. These comparison experiments are based on three other feature vectors: SFI, FSR and AF. Comparisons among the five feature vectors are listed in Table 11. We also made a comparative experiment with three other feature selection methods: the filter, the wrapper, and the combination of filter and wrapper. These experiments used feature vectors with the same dimension, the same feature evaluation methods and the same classifiers, and the experimental results are shown in Table 12. In addition, the experimental results based on the RS-DFV and four different classifiers are shown in Table 13.
In Table 10, it is very obvious that REB fault diagnosis based on the RS-DFV achieved a diagnostic accuracy close to 100%, and its time consumption for feature extraction, classifier training and diagnosis testing is acceptable. Compared with the DFV, the RS-DFV significantly increased the diagnostic accuracy without taking much longer. This indicates that the RS-DFV is more suitable for REB fault diagnosis than the DFV and that the rough set improved the performance of the DFV in fault characterization. In Table 11, the average diagnostic accuracy of the RS-DFV fault diagnosis method is much higher than that obtained with the other feature vectors; therefore, the distribution optimization technique of the RS-DFV significantly improved the accuracy of REB fault diagnosis. It is clear that the RS-DFV method yielded the best testing accuracy and took the shortest time. The fault diagnosis contrast tests proved that the RS-DFV realized the easiest and most effective fault sample representation and that the RS-DFV fault diagnosis method is very efficient for REB fault diagnosis.
In Table 12, the average diagnostic accuracy of the RS-DFV fault diagnosis method is much higher than those based on the other feature selection methods, and the corresponding feature selection time is much shorter. Obviously, the RS-DFV found the best feature subset in the shortest time and achieved the best diagnostic accuracy with the same size of feature subset and the same classifier.
In Table 13, the four different classification methods all achieved very satisfactory diagnostic accuracy. This further confirms the effectiveness of the RS-DFV for rolling bearing fault diagnosis and shows that the PNN reduces the classification complexity without affecting the classification accuracy.

Conclusions
This paper has proposed and discussed an original REB fault diagnosis method based on the rough set and the dependent feature vector (RS-DFV). First, this method employed the dependent feature vector (DFV) to describe rolling element bearing faults; this fault characterization greatly improved the accuracy of fault description and laid a reliable data foundation for the subsequent fault diagnosis. Afterwards, a tree heuristic feature selection method (THFS) was proposed for selecting the effective features and building the dependent feature vector structure. The THFS overcame the difficult problem of dependent feature vector building and realized feature reduction as well as feature optimization. Above all, a feature extraction method based on the rough set and the DFV extraction tree was designed for RS-DFV extraction; it ensured that the RS-DFV had the same structural advantage as the dependent feature vector and solved the abnormal termination problem in DFV extraction. Therefore, the RS-DFV not only inherits the advantages of the DFV but also cleverly overcomes its defects. The results of the contrast tests showed that the diagnostic accuracies of the RS-DFV are (100%, 100%, 99.51%, 100%, 99.47%, 100%), while those of the feature vector constituted by the same feature items and of the DFV are only (93.97%, 100%, 99.33%, 100%, 88.09%, 93.81%) and (100%, 100%, 99.51%, 100%, 99.47%, 100%), respectively, and there was no significant difference in training time and test time. In general, the fault diagnosis method presented in this paper ameliorated the algorithm performance in terms of accuracy, complexity and timeliness and greatly advanced the efficiency as well as the practicability of REB fault diagnosis. Therefore, the RS-DFV is a very useful fault description method for REB fault diagnosis.

Figure 1. The framework of REB fault diagnosis based on tree heuristic feature selection (THFS) and the dependent feature vector combined with rough sets (RS-DFV).

Figure 2. (a) The structure of DFV; (b) the valid and invalid features of DFV.

Figure 4. (a) Build the feature selection tree; (b) the heuristic tree and the composition and structure of DFV.

Figure 5. DFV extraction based on the DFV extraction tree.

Figure 7. The RS-DFV extraction tree combining the rough set and DET (RS-DET).

Figure 8. The rough set of fault F on the AT feature (xAT).

Figure 9. RS-DFV extraction based on the rough set.

Figure 10. Architecture of the probability neural network.

Figure 11. The test stand.

Figure 12. The feature selection tree of the REB faults.

Figure 14. The RS-DET of the REB faults.
RS-DFV-PNN: fault diagnosis based on the RS-DFV and the PNN; RS-DFV-RBF: fault diagnosis based on the RS-DFV and radial basis function neural networks; RS-DFV-BP: fault diagnosis based on the RS-DFV and back-propagation neural networks; RS-DFV-SVM: fault diagnosis based on the RS-DFV and the SVM.

Table 1. The data set of the rolling element bearings.

Table 2. The feature parameters.

Table 8. The abnormal interruptions in the DET and the RS-DET.

Table 9. The DAi(min) and DAi(ave) based on the RS-DFV.

Table 10. The experimental results of RS-DFV and DFV.

Table 11. Comparison with other feature vectors. SFI: the feature vector constituted by the same feature items as the RS-DFV; FSR: the feature vector constituted by the same number of features selected randomly; AF: the feature vector constituted by all the features.

Table 12. Comparison with other feature selection methods.

Table 13. Comparison with different classification methods.