1. Introduction
With the rapid development of industrial production, modern industry increasingly demands machinery that is high in quality, intelligence, and reliability. Rotating machinery is widely used in the chemical industry, mining, electric power, aviation, and other fields. The rolling bearing is a key component of rotating machinery, and its failure can lead to serious economic losses and even casualties [1]. Providing effective intelligent diagnosis methods for the health status of rotor-bearing systems, so as to ensure a safe and reliable production cycle and improve production efficiency, has therefore become a hot research topic worldwide in recent years. Deep learning (DL) has been adopted for fault diagnosis because traditional machine learning cannot meet the needs of contemporary industrial production. With its ability to learn from big data, model nonlinear relationships, and generalize, DL-based fault diagnosis further improves the intelligence of rotating machinery fault diagnosis technology [2]. In recent years, DL-based fault diagnosis models have been widely studied and have achieved excellent results [3,4,5].
Among DL-based fault diagnosis methods, models such as the deep belief network (DBN), convolutional neural network (CNN), stacked auto-encoder (SAE), and generative adversarial network (GAN) are the most representative. For example, Li et al. [6] used Gaussian elements to construct a Gaussian convolutional DBN for the fault diagnosis of rotor-bearing systems under time-varying speeds. To address low fault diagnosis efficiency under noisy conditions, Xue et al. [7] improved the CNN and proposed a new anti-noise CNN for fault diagnosis against a noisy background. Liu et al. [8] constructed two deep SAEs to extract features from the source and target domains of the training data in order to address fault diagnosis in the partial domain adaptation setting. Tong et al. [9] proposed an auxiliary classifier GAN with spectral normalization for fault diagnosis with small and imbalanced sample sets. Huang et al. [10] decomposed the discrete vibration signals of gearboxes via wavelet packet decomposition, fed the decomposed signal components into hierarchical CNNs, and adaptively extracted multi-scale features to classify faults effectively.
Although DL-based fault diagnosis methods have achieved good results in recent years [11], several problems still seriously limit their application in practical production [12]. DL is mainly divided into supervised and unsupervised learning [13]. For supervised learning, obtaining a sufficient number of labeled fault samples is challenging. Data imbalance in fault data, including uneven distribution across sample types, limited sample sizes, and missing sample labels, poses additional challenges. With unlabeled data in particular, existing supervised intelligent fault diagnosis methods struggle to address these problems, so model performance deteriorates significantly and high-precision intelligent fault diagnosis and monitoring of mechanical systems becomes difficult to achieve. Unsupervised learning, in contrast, can train the network to satisfactory accuracy with unlabeled samples [14]. Self-supervised learning based on unlabeled data has therefore received increasing attention from researchers.
Wang et al. [15] proposed a one-stage self-supervised momentum contrastive learning model (OSSMCL) for open-set cross-domain fault diagnosis. The method uses the momentum encoders of self-supervised contrastive learning to capture distinguishable features between sample pairs and incorporates a one-stage framework from the meta-learning paradigm, through which OSSMCL can learn to identify new faults with a small number of labeled samples in the target domain. The validity of the method was verified on an open-set fault diagnosis dataset.
An et al. [16] proposed a domain adaptation network based on contrastive learning (DACL) for bearing fault diagnosis across different working conditions, which reduces the probability of samples being classified near or on class boundaries and improves diagnosis accuracy. The method consists of a feature mining module and an adversarial domain adaptation module. In the feature mining module, a one-dimensional convolutional neural network (1D CNN) extracts features from the original vibration signal. The adversarial domain adaptation module learns domain-shared discriminative features to align the marginal distributions. In addition, a contrastive estimation term is designed to quantify the similarity of the data distributions, which increases the distance between samples of different health states and reduces the probability of samples lying close to the boundary. Finally, an adaptive factor is introduced to weigh the relative importance of the method's transfer ability and discrimination ability. The effectiveness of the method was demonstrated in various fault diagnosis scenarios with domain differences between the source and target domains, using experimental data from two bearing systems.
Wang et al. [17] proposed a self-supervised contrastive learning framework based on nearest neighbor matching (SCLNNM) to address the problem of limited labeled samples, which seriously affects fault diagnosis performance. Their method learns discriminative feature representations from large-scale unlabeled datasets and then performs fault diagnosis. Because the collected mechanical 1D signals differ from 2D images, the scheme not only designs reasonable data augmentation combinations to generate realistic similar instances of 1D sequences but also takes the nearest neighbors in the support set as positive instances of the input signals to increase the diversity of representations. In this framework, a 1D CNN combined with contrastive learning learns a robust generic representation from differently augmented signals. Limited labeled data is then used to investigate which kind of feature representation is appropriate and to train a simple classifier for fault diagnosis. Results on an engine dataset collected from an operating ship show that the framework can effectively extract valuable feature information and improve classification accuracy with limited labeled data.
To address the serious performance degradation of DL fault diagnosis models caused by imbalanced datasets, Zhang et al. [18] proposed a feature-learning-based method named class-aware supervised contrastive learning (CA-SupCon). Supervised contrastive learning (SupCon), which uses class information to optimize the feature differences between any two classes, was applied to imbalanced fault diagnosis for the first time. In addition, a class-aware sampler (CA) was designed to rebalance the data distribution within each mini-batch during training, which improves SupCon's ability to enlarge the feature distance between any two minority fault states. By integrating SupCon and CA, the CA-SupCon framework obtains a more discriminative feature space with better intra-class compactness and inter-class separability and performs well in the class imbalance scenario. Extensive experiments on two open-source datasets demonstrated the effectiveness of the method.
For rolling bearings, obtaining complete labeled data for single and compound faults is difficult and costly, and how to diagnose faults from the unlabeled data obtained in experiments or production has become a new topic in rotor-bearing system fault diagnosis [19]. Graph contrastive learning (GCL) is a self-supervised learning algorithm for graph data that trains a graph encoder on a large amount of unlabeled graph data to obtain feature representation vectors of the graphs [20]. Its general process is similar to that of traditional contrastive learning (CL), with the added advantages of data augmentation on graph signals and an enhanced contrast hierarchy [21]. These advantages have been verified in the literature [22,23,24,25]. The main work of this paper is as follows:
- (1)
We explored the distribution of wavelet energy in different frequency bands at the last level of wavelet packet decomposition for the original signal. Based on this, the Pearson correlation coefficient was introduced to calculate the correlation between wavelet energy in different frequency bands. Subsequently, a node graph construction method was proposed, where each frequency band served as a node, and the wavelet energy in the frequency band served as the node feature. The Pearson correlation coefficient was used as the edge weight between nodes, resulting in the construction of an undirected node graph to represent the information of the original signal.
- (2)
Considering the graph structure attributes of the node graph, the impact of deleting or adding nodes and edges on the graph structure and its information was analyzed. On this basis, a data augmentation method based on node and edge addition was proposed. Two augmentations were used: one takes the mean of the existing node features as the feature of the newly added node, and the other takes their variance. The Pearson correlation coefficient between the newly added node and each existing node serves as the weight of the corresponding newly added edge.
- (3)
During the encoding process with graph convolutional neural networks, the weights of edges were utilized as the adjacency matrix, providing a more accurate representation of the relationships between the central node and its neighboring nodes.
- (4)
We analyzed the comprehensive performance of the proposed method using the drive-end bearing vibration signal dataset from Case Western Reserve University. Experimental results demonstrate that WPDPCC-DGCL exhibits superior data processing capability and achieves better fault diagnosis of rolling bearings than contrastive learning (CL).
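The graph construction, node-addition augmentation, and weighted graph convolution summarized in items (1) to (3) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: a Haar wavelet packet written directly in NumPy stands in for whichever wavelet basis was actually used, the node feature is assumed to be the sequence of squared coefficients in each band, and absolute correlations are assumed in the propagation step so that node degrees stay positive. All function names (`haar_wavelet_packet`, `pearson_edges`, `build_node_graph`, `add_node`, `gcn_propagate`) are hypothetical.

```python
import numpy as np


def haar_wavelet_packet(signal, level):
    """Full Haar wavelet packet tree; returns the 2**level last-level bands.

    Bands come out in natural (filter-bank) order, not sorted by frequency.
    """
    bands = [np.asarray(signal, dtype=float)]
    for _ in range(level):
        split = []
        for band in bands:
            even, odd = band[0::2], band[1::2]
            split.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            split.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        bands = split
    return np.stack(bands)


def pearson_edges(features):
    """Pearson correlation between node feature vectors, without self-loops."""
    weights = np.corrcoef(features)
    np.fill_diagonal(weights, 0.0)
    return weights


def build_node_graph(signal, level=3):
    """One node per band; feature = squared coefficients (assumed energy sequence)."""
    features = haar_wavelet_packet(signal, level) ** 2
    return features, pearson_edges(features)


def add_node(features, statistic):
    """Augment by appending a node whose feature is a statistic across nodes."""
    new_feature = statistic(features, axis=0, keepdims=True)
    augmented = np.vstack([features, new_feature])
    return augmented, pearson_edges(augmented)


def gcn_propagate(features, weights, projection):
    """One GCN-style step using the edge weights as the adjacency matrix.

    Assumption: absolute correlations are used so node degrees stay positive.
    """
    adjacency = np.abs(weights) + np.eye(len(weights))  # add self-loops
    scale = adjacency.sum(axis=1) ** -0.5
    normalized = scale[:, None] * adjacency * scale[None, :]
    return np.maximum(normalized @ features @ projection, 0.0)  # ReLU


# Synthetic stand-in for a vibration segment (length divisible by 2**level).
rng = np.random.default_rng(0)
segment = rng.standard_normal(1024)
features, weights = build_node_graph(segment, level=3)
view_mean, weights_mean = add_node(features, np.mean)  # mean-feature node
view_var, weights_var = add_node(features, np.var)     # variance-feature node
embedding = gcn_propagate(view_mean, weights_mean, rng.standard_normal((128, 16)))
```

Because Pearson correlation is computed pairwise, adding a node leaves the weights among the existing nodes unchanged and only contributes one new row and column, so the two augmented views differ from the original graph only in the appended node and its edges.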
The remainder of this article is arranged as follows: the second section introduces the proposed WPDPCC-DGCL method, the third section verifies the feasibility of the proposed method using the bearing data of Case Western Reserve University, and the fourth section summarizes the main achievements of this paper.
4. Conclusions
In this paper, we propose a fault diagnosis method for rolling bearings based on WPDPCC-DGCL, which focuses on extracting signal component information in different frequency bands from unlabeled rolling bearing time series. The main contributions of the method are threefold: it uses the WPDPCC method of constructing node graphs to build the dataset, which is then used to pre-train the DGCL model; it combines the advantages of node graphs with data augmentation by randomly removing nodes and edges, which provides a more complete spatial representation of the information than one-dimensional time-domain data; and it explores the application of the DGCL method to downstream tasks in the fault diagnosis domain. The results show that the self-supervised pre-training model is effective compared with traditional methods when large amounts of unlabeled data are available. However, several issues remain to be addressed:
- (1)
High requirements for pre-processing of the original signal and the need for comprehensive analysis in conjunction with the characteristics of the original signal in the construction of a high-quality node graph;
- (2)
The long training time of the DGCL method due to the large amount of data and the repetition of positive and negative samples during the training process;
- (3)
The generalization capability of the model needs to be improved, and the mode of data set processing needs to be modified in the future.
To better address these problems, future research on fault diagnosis based on DGCL can be improved in terms of data acquisition, node graph construction, and data augmentation, as follows:
- (1)
Considering the spatial layout of sensors in the initial stage, data preprocessing is used to decompose the 1D time series data at different spatial locations, and the results of 1D signal decomposition are concatenated on the spatial layout according to the location of sensors to achieve the multi-dimensional representation of the signal;
- (2)
Continuing to explore signal decomposition methods, such as wavelet packet decomposition and empirical mode decomposition, for 1D signal decomposition; extracting more accurate and complete feature vectors as node features; and determining the weight relationship between nodes by measuring the distribution similarity and distance of node features in space, so as to provide a feasible theoretical method for constructing high-quality node graphs;
- (3)
Improving the experimental verification of data augmentation methods that delete or add nodes, to ensure the interpretability and feasibility of data augmentation.