1. Introduction
The working environment of high-speed rotating machinery is usually harsh and complicated, so it is necessary to monitor the health of rotating parts and diagnose their faults under complex working conditions [1,2]. However, the manually extracted features used by traditional fault diagnosis methods are usually shallow. Research on intelligent fault diagnosis based on deep learning has therefore received increasing attention. As an important branch of deep learning, supervised learning models with powerful feature extraction and data analysis capabilities have been applied to fault diagnosis [3,4]. Lin H et al. [5] used the strong nonlinear mapping and self-learning ability of neural networks to build a BP neural network model for condition monitoring and fault diagnosis of rolling bearings and achieved satisfactory fault classification results. Tch CK et al. [6] used the ELM algorithm to classify bearing faults, and their experiments showed that the ELM algorithm outperforms the BP algorithm. However, many studies have shown that simple supervised models are strongly affected by parameter tuning and by the input data. To mine deeper patterns in the data, unsupervised learning models have begun to be widely studied.
Common unsupervised learning networks transform the original data to characterize the deep features of the samples and are used for feature extraction and data analysis [7]. To obtain good classification performance, an unsupervised model is often combined with a supervised model to form a semi-supervised model, in which label information is used to fine-tune the unsupervised model and yield a reliable classifier. Wang FT et al. [8] proposed an enhanced deep feature extraction method based on a Gaussian radial basis kernel function and an autoencoder. Its advantage is that it reaches higher test accuracy in fewer iterations; its disadvantage is that the network parameters must be tuned manually based on experience. Zhu J et al. [9] proposed an intelligent fault diagnosis method based on principal component analysis and a deep belief network, which achieves reliable fault analysis from raw vibration data but is prone to overfitting.
As a typical unsupervised learning network, the autoencoder has received extensive attention [10,11]. Zhao ZH et al. [12] proposed a frequency-domain feature extraction autoencoder network, which uses an asymmetric autoencoder to learn the mapping between time-domain and frequency-domain signals and obtains a better clustering effect on bearing data. Li K et al. [13] proposed a rolling bearing fault diagnosis method based on a deep extreme learning machine with sparsity and nearest-neighbor preservation, which mines the deep structure of the data in an unsupervised manner and then uses supervised learning to solve a least-squares classification problem for diagnosis. Their results show that sparsity can improve neural network performance. Many other studies have also shown that increasing the sparsity of an autoencoder can enhance its performance [14,15]. Moreover, introducing category information into the feature extractor can further improve autoencoder feature extraction [16,17], but most existing research still does not consider this approach [18,19,20,21,22].
Because of the characteristics of deep neural networks, the selection of model hyperparameters remains very important [23,24,25]. To this end, the genetic algorithm [26], the cuckoo algorithm [27], the gray wolf algorithm [28] and other methods have been used for hyperparameter optimization to improve neural network performance. Zhou J et al. [29] proposed a GA-SVM intelligent evaluation method for rolling bearings based on feature optimization. The optimal model parameters were obtained through a genetic algorithm, and high diagnostic accuracy was achieved. Chen J et al. [30] introduced the gray wolf optimization algorithm to optimize the smoothing factor, a key parameter of their model, and obtained an ideal classification model. Their results show that this method can effectively diagnose bearing faults under different working conditions with a small training set. These studies show that selecting an appropriate hyperparameter optimization algorithm for a given network structure can effectively improve the performance of the network model.
This study jointly considers the sparsity, classification ability and hyperparameter selection of autoencoders and proposes a hierarchical sparse discriminant autoencoder (HSDAE) method. The HSDAE method improves the feature extraction performance of autoencoders in terms of both network sparsity and classification ability and is used for fault diagnosis of rotating components under complex working conditions. A novel hierarchical sparse strategy is used to enhance the sparse connections between deep network layers and is combined with particle swarm optimization to obtain the optimal sparse hyperparameters and thus improve network sparsity. Class aggregation and class separability strategies are used as a discriminative distance to enhance the classification ability of the network, thereby enhancing the feature extraction ability of the improved autoencoder. The proposed method is compared with a variety of existing similar methods, and the results show its superiority.
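As a rough illustration of what a layer-wise sparsity penalty can look like, the following Python sketch applies the standard KL-divergence sparsity term of sparse autoencoders to each hidden layer, with a separate target activation and a separate weight per layer. The function names, the per-layer values `rhos` and `betas`, and the assumption that activations lie in (0, 1) are introduced here for illustration only; the exact HSDAE formulation is the one given in Section 3 and may differ.

```python
import numpy as np

def kl_sparsity(activations, rho):
    """Standard sparse-autoencoder KL penalty for one hidden layer.

    activations: array of shape (n_samples, n_units), values in (0, 1).
    rho: target mean activation (sparsity level) for this layer.
    """
    # Mean activation of every hidden unit, clipped to keep the logs finite.
    rho_hat = np.clip(activations.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = (rho * np.log(rho / rho_hat)
          + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return float(np.sum(kl))

def hierarchical_sparsity_penalty(layer_activations, rhos, betas):
    """Sum of per-layer KL penalties, each with its own target and weight.

    rhos and betas are hypothetical per-layer hyperparameters.
    """
    return sum(beta * kl_sparsity(act, rho)
               for act, rho, beta in zip(layer_activations, rhos, betas))

# Illustrative use with random sigmoid activations for two hidden layers.
rng = np.random.default_rng(0)
layers = [1 / (1 + np.exp(-rng.normal(size=(32, 16)))),
          1 / (1 + np.exp(-rng.normal(size=(32, 8))))]
penalty = hierarchical_sparsity_penalty(layers, rhos=[0.05, 0.10], betas=[3.0, 1.0])
```

Giving each layer its own pair of values is what makes hand tuning impractical as the network deepens, which is the motivation for searching these values automatically.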
The main innovations and contributions are as follows:
- (1) A novel semi-supervised autoencoder, the hierarchical sparse discriminant autoencoder, is proposed to extract features for fault diagnosis;
- (2) A novel hierarchical sparsity strategy is proposed to enhance the sparsity of autoencoder networks and is combined with class aggregation and class separability strategies to improve feature extraction performance;
- (3) Experimental comparative analysis verifies that the proposed method can achieve reliable fault diagnosis for rotating parts under complex working conditions.
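To make the idea of class aggregation and class separability more concrete, the sketch below computes two scatter measures on a batch of extracted features: the mean squared distance of samples to their own class center (smaller means tighter classes) and the sample-weighted mean squared distance of class centers to the overall center (larger means better separated classes). This scatter-based form and the name `class_scatter` are assumptions chosen for illustration; it is not necessarily the discriminative distance defined in Section 3.

```python
import numpy as np

def class_scatter(features, labels):
    """Return (within_class, between_class) scatter of labelled features.

    features: array-like of shape (n_samples, n_features).
    labels: array-like of shape (n_samples,) with class identifiers.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    overall_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Aggregation: spread of samples around their own class center.
        within += np.sum((class_feats - class_mean) ** 2)
        # Separability: distance of the class center from the overall center.
        between += len(class_feats) * np.sum((class_mean - overall_mean) ** 2)
    n = len(features)
    return within / n, between / n

# Illustrative use on four 2-D feature vectors from two classes.
within, between = class_scatter([[0, 0], [0, 2], [4, 0], [4, 2]], [0, 0, 1, 1])
```

One possible way to use such a pair as a penalty is to minimize the within-class term while maximizing the between-class term, for example through a weighted difference added to the reconstruction loss.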
The rest of this article is organized as follows. Section 2 introduces the basic principles of stacked sparse autoencoders and the particle swarm optimization algorithm. Section 3 describes the improvements made to the original method in this study and the specific procedure of the proposed method. Section 4 presents the experimental results and verifies the effectiveness of the proposed method on the CWRU bearing data set and a private bearing data set. Section 5 concludes the paper.
5. Conclusions
This paper proposes a hierarchical sparse discriminant autoencoder (HSDAE) method for the intelligent diagnosis of faults in rotating mechanical components. From the experimental verification and analysis on the CWRU data set and the private data set, the following conclusions can be drawn:
- (1) The proposed hierarchical sparse strategy optimizes the stacked sparse autoencoder (SSAE) by assigning different sparse activation and sparse regularization weights to the neurons in each layer, which enhances the randomness of the network sparsity and achieves a better diagnostic effect. Using particle swarm optimization (PSO) to obtain the best sparse parameters adaptively from the data improves network sparsity and avoids the complexity of manual parameter selection;
- (2) The class aggregation and class separability strategies effectively enhance the classification ability of the autoencoder network. This indirectly shows that the proposed method can improve the feature extraction performance of the autoencoder;
- (3) Compared with other methods, the HSDAE method proposed in this paper has a small standard deviation and a high fault diagnosis accuracy, which reflects its effectiveness, reliability and stability in fault diagnosis.
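The data-driven parameter search mentioned in conclusion (1) can be illustrated with a minimal global-best particle swarm optimizer. The sketch below is a generic textbook variant, not the configuration used in the experiments: the inertia and acceleration coefficients, the search bounds, and the objective `toy_validation_error` (a stand-in for training the network and returning its validation error for a given pair of sparse parameters) are all assumptions.

```python
import numpy as np

def pso_minimize(objective, lower, upper, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(x) over a box using global-best PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest_pos = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]
    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity update: inertia + pull to personal best + pull to global best.
        vel = (inertia * vel
               + c1 * r1 * (pbest_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)  # keep particles inside the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), pbest_val[g]
    return gbest_pos, float(gbest_val)

def toy_validation_error(params):
    """Hypothetical stand-in for the validation error of a trained network."""
    rho, beta = params
    return (rho - 0.05) ** 2 + (beta - 3.0) ** 2

# Search a sparsity target rho and a sparsity weight beta within assumed bounds.
best_params, best_error = pso_minimize(toy_validation_error,
                                       lower=[0.01, 0.1], upper=[0.5, 10.0])
```

In practice each objective evaluation would require training or fine-tuning the network, so the swarm size and iteration count trade search quality against computation time.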
Regarding future work, this study combines the hierarchical sparse strategy and the discriminative distance as penalty terms in the loss function, so designing a better combination strategy is an important direction for subsequent research. In addition, the proposed method can only perform accurate fault diagnosis for data sets with the same distribution. In the future, domain adaptation methods will be studied on the basis of this method to enable fault diagnosis across data sets with different distributions.