1. Introduction
Three-phase motors are widely used in industrial production. Because their operating environment is uncertain, structural or performance faults such as bearing damage and phase loss are inevitable during operation. Motor faults may not only shut down a production line and interrupt production, but may also trigger a local power outage, with serious consequences for production, daily life, and the national economy [1,2]. Quickly locating a fault and responding accordingly saves considerable manpower and material resources and improves the reliability and stability of power grid operation. Fault diagnosis is therefore attracting growing attention from engineers, and in-depth study of motor fault monitoring and diagnosis has considerable practical engineering significance.
Vibration [3], current [4], electromagnetic [5], temperature [6], and acoustic signals [7] are widely used for fault monitoring. Vibration signals are often used to diagnose mechanical faults, and magnetic field signals are often used to monitor misalignment between the stator and rotor. Different faults leave different signatures in these signals, so vibration, current, or electromagnetic signals can each capture fault characteristics, and analyzing those characteristics allows motor fault states to be classified effectively. As motor environments grow more complex and systems grow larger, motor fault analysis is no longer limited to a single signal. Data-driven fault diagnosis techniques are frequently employed in production, and multi-sensor data fusion is an active research topic. Fusing features from signals captured by multiple sensors extracts fault features more fully, improving the robustness, accuracy, and generality of fault detection. Xia [8] fused equipment operating-condition data from multiple sensors and automatically extracted representative features from the raw signals. Li [9] proposed a multi-layer deep fusion network with an attention mechanism (AMMFN) and evaluated it on the Paderborn University bearing fault data and their own data, reaching an average accuracy of 98.72%. Gong [10] diagnosed mechanical faults using an improved CNN combined with multi-channel data fusion.
Traditional signal-processing methods such as the wavelet transform (WT) [11], spectral analysis (SA) [12], the fast Fourier transform (FFT) [13], empirical mode decomposition (EMD) [14], and other time-frequency analysis methods are in common use. Although these methods can extract features carrying representative fault information and achieve fault diagnosis to some extent, the motor operating environment is complex: the signals received by the sensors under different working conditions are easily affected by environmental noise, so weak feature signals are easily masked by interference [15]. Moreover, these methods rely on manual feature extraction, so detection accuracy depends heavily on the operator’s experience. In recent years, applying image processing to fault diagnosis has been widely discussed. Zhu [16] converted time-frequency domain vibration signals into two-dimensional symmetric dot pattern (SDP) images and used a CNN for fault identification; the SDP images intuitively show the different vibration states of the motor, and the diagnostic accuracy was 96.50%. Xu [17] gathered vibration signals from various operating states, created SDP templates for those states, and diagnosed faults by image similarity matching. The SDP method transforms the characteristic signal into a snowflake-like image whose arms differ clearly in radius and angle between fault types, which makes nonlinear, non-stationary signal characteristics visible and lets each image be matched to a fault type.
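The SDP mapping described above can be sketched in a few lines. The version below follows the commonly used SDP formulation, in which each sample sets a normalized radius and the lagged sample sets an angular offset that is mirrored about several symmetry axes. The function names and the parameter values (`lag`, `gain_deg`, `n_mirrors`) are illustrative assumptions, not settings taken from the works cited here.

```python
import math

def sdp_points(signal, lag=1, gain_deg=30.0, n_mirrors=6):
    """Map a 1-D signal to SDP points as (radius, angle_in_degrees) pairs.

    lag, gain_deg and n_mirrors are assumed values for illustration only.
    """
    lo, hi = min(signal), max(signal)
    span = hi - lo
    if span == 0:
        raise ValueError("signal must not be constant")
    points = []
    for i in range(len(signal) - lag):
        radius = (signal[i] - lo) / span                   # normalized to [0, 1]
        offset = (signal[i + lag] - lo) / span * gain_deg  # angular deflection
        for m in range(n_mirrors):
            axis = 360.0 * m / n_mirrors                   # mirror-symmetry axis
            points.append((radius, axis + offset))         # counter-clockwise arm
            points.append((radius, axis - offset))         # clockwise arm
    return points

def to_cartesian(points):
    """Convert (radius, degrees) pairs to (x, y) for rasterizing or plotting."""
    return [(r * math.cos(math.radians(a)), r * math.sin(math.radians(a)))
            for r, a in points]
```

Each input sample pair produces two points per symmetry axis, so changes in amplitude and waveform shape alter the radius and the opening angle of every arm, which is what gives different fault states visibly different snowflake shapes.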
As intelligent fault diagnosis techniques mature, algorithms such as convolutional neural networks (CNN) [18], extreme learning machines (ELM) [19], support vector machines (SVM) [20], and deep belief networks (DBN) [21] have gradually been adopted for fault classification. Jegadeeshwaran [22] classified hydraulic brake defects using SVM and decision tree algorithms, enabling brake condition monitoring and, to some extent, protecting passenger safety. A CNN extracts local image features through convolution and learns mainly from shallow features, but it lacks a representation of the positional relationships between structures, so it often fails to recognize an image that has been rotated or tilted. To address this limitation, Hinton [23] proposed the capsule network (CapsNet), together with an inter-capsule dynamic routing algorithm for training it. CapsNet retains the coupling relationships between the internal elements of the data and can therefore identify and distinguish positional relationships in an image effectively. It is better suited to small sample sizes, overlapping objects, and other complex scenarios. The capsule network has attracted considerable research interest and has been applied to fault diagnosis. Zhu [24] put forward an inception capsule network with inception blocks and regression branches for bearing fault diagnosis. Experiments on vibration signals from the CWRU bearing fault data under a 1 HP (horsepower) load gave an average diagnostic accuracy of 97.15%. To diagnose faults in rotating equipment, Li [25] presented a capsule network with two convolutional layers and two pooling layers, which performed exceptionally well in tests involving various rotating machines and defective components.
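To make the inter-capsule dynamic routing idea concrete, the following is a minimal sketch of routing by agreement in plain Python. It assumes the prediction vectors `u_hat` have already been computed by the lower capsule layer; the function names and the choice of three iterations are illustrative assumptions, and a practical implementation would use a tensor library and learned transformation matrices.

```python
import math

def squash(v):
    """Shrink a vector's length into [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, n_iters=3):
    """Route predictions u_hat[i][j] (input capsule i -> output capsule j).

    Returns the output capsule vectors and the coupling coefficients
    used in the final iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    logits = [[0.0] * n_out for _ in range(n_in)]  # routing logits b_ij
    for _ in range(n_iters):
        # Each input capsule distributes its output over all output capsules.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            s = [sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_in))
                 for d in range(dim)]
            outputs.append(squash(s))
        # Raise the logit where a prediction agrees with the output vector.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(u_hat[i][j][d] * outputs[j][d]
                                    for d in range(dim))
    return outputs, coupling
```

When several input capsules predict similar vectors for the same output capsule, their coupling coefficients to that capsule grow across iterations and its output vector lengthens, whereas conflicting predictions cancel out. This agreement mechanism is how the network encodes relationships between parts rather than only their presence.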
Because a CNN applies the same weight to all extracted features and demands substantial computing power and resources, combining an attention mechanism with a deep learning network and multi-channel data allows the model to learn data features adaptively, focus dynamically on the input data, and adjust its weights, improving its generalization ability. Wang [26] proposed a deep subdomain adaptive sub-attention network that obtains time-frequency feature maps through the continuous wavelet transform and combines multi-channel data with a channel attention mechanism for feature extraction. Wang’s [27] Efficient Channel Attention (ECA) module adds only a few parameters yet significantly boosts CNN performance: it implements local cross-channel interaction without dimensionality reduction through a one-dimensional convolution. Similarly, adding the ECA mechanism to CapsNet can strengthen the model’s capacity to learn and integrate features and improve feature representation.
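The ECA operation described above (global average pooling, a one-dimensional convolution across neighbouring channels, and a sigmoid gate) can be illustrated with a small plain-Python sketch. The function name `eca`, the nested-list data layout, and the fixed kernel passed in by the caller are assumptions made for illustration; in a trained network the kernel weights are learned.

```python
import math

def eca(feature_maps, kernel):
    """Apply an ECA-style channel gate.

    feature_maps: list of C channels, each an H x W nested list.
    kernel: odd-length list of 1-D convolution weights shared by all channels.
    Returns the rescaled feature maps and the per-channel attention weights.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = k // 2
    # 1) Global average pooling: one scalar descriptor per channel.
    desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # 2) 1-D convolution over the channel axis with zero padding, so each
    #    channel interacts only with its k nearest neighbours and the
    #    channel dimension is never reduced.
    padded = [0.0] * pad + desc + [0.0] * pad
    logits = [sum(kernel[t] * padded[c + t] for t in range(k))
              for c in range(len(desc))]
    # 3) Sigmoid gate turns each logit into an attention weight in (0, 1).
    weights = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    # 4) Rescale every channel by its attention weight.
    out = [[[w * x for x in row] for row in ch]
           for w, ch in zip(weights, feature_maps)]
    return out, weights
```

Because the only trainable parameters are the `k` kernel weights, the module adds very little to the model size, which is consistent with the small parameter overhead noted above.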
In the capsule network, multi-channel sensor signals processed by the SDP method characterize fault signals more fully, and the ECA mechanism allows key features in the input data to be identified and highlighted more accurately. On this basis, a capsule network approach that combines multi-channel signals with the ECA attention mechanism is proposed. The experiments target early fault diagnosis, in which the motor continues to run despite the presence of a fault. By learning the abnormal conditions that arise during motor operation, the method assists maintenance personnel and helps avoid power outages caused by faults. The main contributions are as follows:
(1) Adopt the SDP method to visualize the sampled time-frequency domain vibration signals as 2-D snowflake images, which show the different fault types intuitively and allow them to be classified and matched.
(2) Propose a multi-channel sensor signal fusion method to fully learn the characteristics of the original signal.
(3) Propose a method combining the ECA attention mechanism and a capsule network, which can effectively improve fault diagnosis accuracy.
5. Conclusions
A capsule network model based on the ECA channel attention mechanism and multi-sensor data fusion is proposed for identifying eight types of motor faults, such as broken bars. The model learns SDP images, filters image features through a dynamic routing algorithm, dynamically attends to the input data, and adjusts its weights. Fault classification is achieved by matching the input signals with the fault templates learned by the model. The proposed method can therefore be extended to the fault diagnosis of other three-phase motors.
The experimental results show that:
(1) The SDP method converts the 1-D vibration signal into a 2-D image, making nonlinear, non-stationary signal characteristics visible and reducing the impact of ambient noise on signal extraction.
(2) The proposed feature extraction method, which is based on the ECA attention mechanism and the multi-sensor data fusion strategy, improves the model’s feature extraction capability and thereby raises the accuracy of fault classification.
(3) The capsule network outperforms CNN, AlexNet, LeNet, LSTM, and other networks in fault identification accuracy, reaching 99.21%. In addition, the capsule network needs less data, giving higher fault diagnostic accuracy and efficiency.