5.1. Experiments on the Bearing Dataset
In this experiment, bearing data from the Case Western Reserve University (CWRU) rolling bearing dataset were used. As shown in
Figure 4, the data acquisition platform consists of three main parts: from left to right, the motor, the torque sensor, and the dynamometer. The drive-end bearing vibration data, sampled at 12 kHz and 1772 rpm, were used in our diagnosis experiment and contain four different types of data, namely normal data, inner race defect data, outer race defect data and ball defect data.
Figure 5 shows the one-dimensional signals of the four states. With a sampling length of 1024, 100 samples were obtained for each type of signal data. Then, the high-dimensional feature set is obtained by extracting the time–domain, frequency–domain and time–frequency features of the original fault samples according to the process described in
Section 4. The training set is composed of 40 samples randomly selected from each class, and the rest are used as test samples. To reduce the influence of chance on the experimental results, the random experiment was repeated ten times, and the final fault recognition accuracy was taken as the average of the ten results.
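The repeated random-split evaluation described above can be sketched as follows; `classify` is a hypothetical stand-in for any train-and-predict routine, and the 40-samples-per-class split matches the one used here:

```python
import numpy as np

def repeated_random_split_accuracy(X, y, n_train_per_class, classify, n_runs=10, seed=0):
    """Average accuracy over repeated random train/test splits (illustrative helper)."""
    rng = np.random.default_rng(seed)
    accs = []
    for _ in range(n_runs):
        train_idx, test_idx = [], []
        for c in np.unique(y):
            # shuffle this class's samples, take the first n for training
            idx = rng.permutation(np.flatnonzero(y == c))
            train_idx.extend(idx[:n_train_per_class])
            test_idx.extend(idx[n_train_per_class:])
        y_pred = classify(X[train_idx], y[train_idx], X[test_idx])
        accs.append(np.mean(y_pred == y[test_idx]))
    # report the mean accuracy and its standard deviation over the runs
    return float(np.mean(accs)), float(np.std(accs))
```

The standard deviation returned alongside the mean is what Table 3 later uses to judge stability.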
To obtain appropriate parameters for the ESBDP algorithm, we conducted several experiments using the grid search method and, through comparison, finally set the neighborhood parameter, the adjustment parameter, and the kernel parameter. In addition, for comparison with our algorithm, five related algorithms, KPCA, LPP, LLE, OLGPP, and LSPD [45], were applied in the same experiments, with the parameters of each algorithm likewise selected by grid search: the kernel parameter of the global nonlinear algorithm KPCA, the neighborhood parameters of LPP, LLE, OLGPP, and LSPD, and the kernel parameters of LPP, KPCA and OLGPP were all tuned in this way.
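Grid search itself is straightforward to sketch; the parameter names and candidate values below are placeholders for illustration, not the settings used in the paper:

```python
import itertools
import numpy as np

def grid_search(param_grid, evaluate):
    """Exhaustively score every parameter combination and return the best one.

    `param_grid` maps a parameter name to its candidate values; `evaluate`
    maps a parameter dict to a scalar score (higher is better).
    """
    best_params, best_score = None, -np.inf
    keys = list(param_grid)
    for values in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In practice `evaluate` would run the dimensionality-reduction-plus-classification pipeline and return the cross-validated recognition rate.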
To visualize the classification performance of ESBDP, we show the distribution of the sample features in the three-dimensional space after projection in
Figure 6 and compare it with the LLE, LPP, KPCA, OLGPP and LSPD algorithms, where the three axes denote the first, second and third dimensions of the projected low-dimensional features. It can be seen that the visualization results of LLE, LPP and KPCA are relatively poor: features of the same class are scattered and there is serious overlap between different types of data, without a clear demarcation. The reason for this situation is that all three algorithms consider only the neighborhood information or the global structure of the fault data and therefore extract incomplete information; in addition, these algorithms do not use the supervision information. Compared with the above three algorithms, the visualization results of OLGPP and LSPD are relatively good, but some data within a class remain scattered and heterogeneous samples lie relatively close to each other. The three-dimensional feature distribution based on the ESBDP method is the best, with high aggregation within each class, high separation between heterogeneous samples, and obvious demarcation between classes, which provides a favorable basis for the subsequent fault classification.
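A minimal sketch of this kind of 3D feature visualization, assuming the projected features `Z` and labels `y` are already available (matplotlib's Agg backend is used so the script runs headless):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend: render without a display
import matplotlib.pyplot as plt

def plot_3d_features(Z, y, class_names, ax=None):
    """Scatter the first three projected dimensions, one colour per fault class."""
    if ax is None:
        ax = plt.figure().add_subplot(projection="3d")
    for c, name in zip(np.unique(y), class_names):
        pts = Z[y == c]
        ax.scatter(pts[:, 0], pts[:, 1], pts[:, 2], label=name, s=12)
    ax.set_xlabel("dim 1")
    ax.set_ylabel("dim 2")
    ax.set_zlabel("dim 3")
    ax.legend()
    return ax
```

Calling this once per algorithm's output reproduces the side-by-side comparison style of Figure 6.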
To further examine the separability of the low-dimensional features produced by ESBDP, we adopted the ratio of the inter-class distance to the intra-class distance as the separability index. The inter-class distance reflects the separability between classes, while the intra-class distance reflects the aggregation of samples within each class; both can be calculated by Equations (32) and (33), which involve the number of sample classes, the class centers, and the individual samples of each class. For low-dimensional features, the larger this separability index, the better the relative separability.
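A simple version of such a separability index can be sketched as follows; the exact weighting here is an assumption for illustration and is not necessarily identical to the paper's Equations (32) and (33):

```python
import numpy as np

def separability_index(Z, y):
    """Ratio of mean inter-class distance to mean intra-class distance.

    Larger values indicate tighter classes that sit further apart. The
    weighting below (unweighted means) is an illustrative choice.
    """
    classes = np.unique(y)
    centers = np.array([Z[y == c].mean(axis=0) for c in classes])
    global_center = Z.mean(axis=0)
    # inter-class: average distance from each class centre to the global centre
    inter = np.mean(np.linalg.norm(centers - global_center, axis=1))
    # intra-class: average distance from each sample to its own class centre
    intra = np.mean([np.linalg.norm(Z[y == c] - m, axis=1).mean()
                     for c, m in zip(classes, centers)])
    return inter / intra
```

For two well-separated clusters the ratio is large; as classes overlap, the intra-class term grows and the index shrinks.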
Figure 7 provides the separability metrics for the six algorithms, and it can be seen that the separability parameters of ESBDP are higher than those of the other algorithms. Combining
Figure 6 and
Figure 7, we can conclude that ESBDP has better dimensionality reduction performance and can provide a favorable basis for subsequent fault classification.
Table 3 provides the average fault recognition accuracy, its standard deviation, and the processing time for ten random experiments based on the six methods LLE, LPP, KPCA, OLGPP, LSPD and ESBDP. The standard deviation reflects the fluctuation of the recognition rate: a smaller standard deviation indicates a smoother, more stable performance of the algorithm. From
Table 3, it can be observed that the recognition results of the three algorithms LLE, LPP and KPCA, which consider only the global or the local structure, are lower for nonlinear, unstable machinery fault data. Among them, LPP achieves a recognition rate of 91.79% by discovering the local discriminative features on the sample manifold, and its large standard deviation indicates poor stability. LLE is a nonlinear manifold algorithm that approximates the global structure of the samples through local linearity, so its low-dimensional samples keep the original topology; the fault diagnosis accuracy of this algorithm is 93.29%. With the kernel function, KPCA can capture the nonlinear global structure information of the fault data, reaching a recognition rate of 94.63%. Since OLGPP captures both global and local information of the data, its diagnosis accuracy is higher than that of the above three methods, but OLGPP does not utilize the supervised information of the fault data, so it is difficult for it to classify all the data correctly. LSPD does use supervised information: it measures and preserves the similarity between fault samples by constructing a similarity function, which improves the accuracy to some extent, but the algorithm focuses more on the local structure of the samples and does not use the global information effectively.
ESBDP achieves the highest recognition accuracy because the algorithm maps the fault data into the Euler space through the cosine metric and integrates the structural relationships of local intra-class, local inter-class, global intra-class and global inter-class samples in this space. On the basis of expanding the differences between heterogeneous fault samples, the inter-class separation and intra-class aggregation between samples are effectively improved. In addition, ESBDP achieves an adaptive balance between global and local features, which further enhances the discriminative power of the extracted fault features. Remarkably, our method has the smallest standard deviation and the processing time is only 0.34 s, which means that the ESBDP algorithm has a low computational burden and is more stable.
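The idea of mapping features into a complex Euler space, where Euclidean distance corresponds to a cosine-based dissimilarity on the original features, can be sketched as follows; the frequency parameter `ALPHA` and the normalization are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

ALPHA = 1.0  # frequency parameter of the Euler mapping (value assumed here)

def euler_map(X):
    """Map real features into the complex Euler space z = e^{i*alpha*pi*x} / sqrt(2)."""
    return np.exp(1j * ALPHA * np.pi * np.asarray(X, dtype=float)) / np.sqrt(2)

def euler_distance(x1, x2):
    """Squared Euclidean distance in Euler space between two raw feature vectors.

    Expands to sum_k (1 - cos(ALPHA * pi * (x1_k - x2_k))), a bounded,
    cosine-based dissimilarity that is robust to outlying feature values.
    """
    z1, z2 = euler_map(x1), euler_map(x2)
    return float(np.linalg.norm(z1 - z2) ** 2)
```

Because each mapped coordinate lies on a circle of radius 1/√2, a single wildly deviating feature contributes at most 2 to the distance, which is one intuition for why the Euler representation dampens outliers.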
The specific classification results are provided in
Figure 8. It can be observed that classification errors exist in all algorithms except ours. Among them, in the prediction results of the LPP and KPCA algorithms, the classification errors are mainly concentrated on the second and third fault types. For LLE and OLGPP, classification errors are detected in the three fault types other than the normal data. The classification results of the LSPD algorithm are better, but one normal sample is still misclassified. Our method accurately identifies the samples of each fault type, which shows the superiority of ESBDP in the clustering and classification of all types of fault data.
To further verify the ability of the ESBDP algorithm to capture fault information, we used different numbers of samples to train each algorithm and then observed the change in fault diagnosis results.
Figure 9 shows the corresponding experimental results. Overall, as the number of training samples increases, the recognition rate of all six algorithms increases, because the training samples contain the discriminative information required for fault classification, and more training samples allow more discriminative features to be learned. Among them, the LLE, LPP and KPCA algorithms have the lowest recognition rates, and their accuracy can hardly be improved even when the number of training samples increases, owing to the inherent limitations of these algorithms. ESBDP outperforms the other five algorithms in terms of both recognition rate and stability. It is worth noting that ESBDP achieves a 100% fault recognition rate with only 20 training samples per class, which means that our algorithm has a strong ability to capture fault-discriminative information and can better perform fault diagnosis tasks.
To further investigate the performance of the ESBDP algorithm, we experimented with bearing fault data for four operating conditions at 1797 r/min, 1772 r/min, 1750 r/min, and 1730 r/min. Following the process in Section 3, the identification results of the six algorithms are shown in
Figure 10. It can be concluded that, for the bearing data under different working conditions, the recognition rates of the first three algorithms are generally low and unstable; the accuracies of OLGPP and LSPD are relatively good, but they cannot classify all four datasets correctly at the same time. The ESBDP algorithm is the most stable and achieves the highest accuracy in the diagnosis of all four datasets. The experiments show that, compared with the other methods, ESBDP has stronger adaptability and can mine effective fault-discriminative features from data under different working conditions to achieve accurate classification.
5.2. Experiments on the Gear Dataset
The experimental data were obtained from a gear fault dataset published by the University of Connecticut, and the data collection platform is shown in
Figure 11. It is a two-stage gearbox containing a motor for controlling the gear speed, a tachometer for measuring the speed, an electromagnetic brake for providing torque, and input shafts equipped with gears. The gear operation data are obtained from the pinion of the first stage via an accelerometer with a sampling frequency of 20 kHz and contain nine types of gear states: spalling, root crack, missing tooth, five different severity levels of chipping tip, and the healthy state. To facilitate the description, we denote these nine types of data by {F1, F2, F3, F4, F5, F6, F7, F8, F9}. The vibration signals of the nine gear states are provided in
Figure 12. A total of 104 samples were collected for each type of data, totaling 936 samples with a dimension of 3600. Following the procedure in
Section 4, the statistical features in the time–frequency domain were computed for each sample to obtain the high-dimensional feature set. In this experiment, we randomly chose 50 samples from each of the nine data types as training samples and the rest of the samples as prediction samples. In addition, the parameters of each algorithm in the gear experiment are the same as those of the bearing diagnosis experiment.
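The statistical feature extraction step can be illustrated with a small subset of common time-domain features; this is a sketch only, not the paper's full feature set:

```python
import numpy as np

def statistical_features(sample):
    """A few common time-domain statistical features of one vibration sample."""
    x = np.asarray(sample, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    return np.array([
        x.mean(),                                      # mean value
        x.std(),                                       # standard deviation
        rms,                                           # root mean square
        np.max(np.abs(x)) / rms,                       # crest factor
        ((x - x.mean()) ** 3).mean() / x.std() ** 3,   # skewness
        ((x - x.mean()) ** 4).mean() / x.std() ** 4,   # kurtosis
    ])
```

Stacking such a vector for every sample (together with frequency-domain and time–frequency features) yields the high-dimensional feature set that the dimensionality-reduction algorithms then project.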
Figure 13 and
Figure 14 show the three-dimensional distribution and the separability indexes of the low-dimensional features after projection based on the six algorithms, respectively. Due to the increase in categories, the distribution of the features is more complex. Among them, the separability indexes of LLE, LPP and KPCA are relatively low and their visualization results are relatively poor: there is serious overlap between the multi-category features, without clear demarcation, which makes effective identification difficult. The main reason is that these three methods consider only the neighborhood structure or the global information of the fault data. OLGPP also shows considerable confusion between classes because it does not utilize the label information of the samples. The comparison shows that, for the multi-class gear fault data, the ESBDP method has the highest separability index and its three-dimensional feature distribution is also optimal: homogeneous features are aggregated with each other and heterogeneous features have relatively obvious boundaries, which shows that our algorithm can effectively improve the aggregation between samples of the same class and the separability between heterogeneous samples.
The gear diagnosis accuracy is provided in
Table 4, and its corresponding classification details are provided in
Figure 15. For the gear data with multiple classes, the recognition accuracy of the LLE, LPP and KPCA algorithms, which fail to combine the local and global discriminative features of the fault data, decreases considerably. The accuracy of the OLGPP algorithm also decreases because it does not utilize supervised information, which puts it at a disadvantage in the diagnosis of multi-category fault data. In addition, the OLGPP algorithm has more parameters, which makes it hard to diagnose different fault data effectively without changing the parameters. It is worth noting that our algorithm achieves accurate classification for all types of gear data. On the basis of expanding the differences between heterogeneous samples, ESBDP combines label information to fully consider the geometric structure of the fault data, achieving the acquisition and balance of local and global features so that effective fault-discriminative features can be extracted. The gear diagnosis results further demonstrate the effectiveness of our algorithm.
To further verify the ability of ESBDP to capture fault features, the numbers of samples used for training and prediction per class in the gear diagnosis experiments were set to 2/102, 5/99, 10/94, 20/84, 30/74, 40/64, 50/54, 60/44, and 70/34, respectively, to observe the variation of the accuracy of each algorithm. The results are provided in
Figure 16. For different numbers of training samples, the fault diagnosis accuracy of ESBDP is consistently higher than that of the other five algorithms. Notably, the recognition accuracy of ESBDP reaches 99.53% at 10 training samples per class, which indicates that the algorithm has a great ability to capture the discriminative features hidden in the fault data.