1. Introduction
Mechanical rotating components are critical parts of a transmission system. However, failure is almost inevitable because the components run under dynamic conditions for long periods [1, 2, 3, 4]. Once a failure occurs, the entire transmission system must be shut down. Therefore, effective early fault diagnosis of mechanical rotating components can significantly improve the transmission system’s reliability and reduce downtime. Many studies have addressed early fault diagnosis algorithms for mechanical rotating components [5, 6, 7, 8, 9, 10]. However, a trade-off remains: algorithms with high recognition accuracy tend to have complex architectures, while algorithms with simple logic tend to have low recognition accuracy, which makes practical application difficult.
Traditional maintenance methods can be divided into two types: repair maintenance and preventive maintenance. Repair maintenance refers to repairing equipment after it fails. Its biggest drawback is that it disrupts the production plan. In addition, the spare parts and labor needed for emergency repairs lead to high repair costs for the professional maintenance team. Preventive maintenance schedules regular equipment maintenance and replacement of parts and components, and usually includes servicing, regular inspections, regular functional testing, regular disassembly and repair, regular replacement, and other methods. Regular maintenance requires the equipment to be shut down for overall assessment and maintenance. Its disadvantages are that it takes a long time, is inefficient, and introduces new failure risks.
These two maintenance methods have gradually become outdated in the mature era of IoT and big data, and predictive maintenance has emerged in response. This approach predicts the time of failure so that the equipment can be maintained efficiently and in real time. However, some practical problems have emerged as predictive maintenance has been applied. If all the massive data generated daily are collected and uploaded to the cloud for analysis and processing, they inevitably place a huge load on the network, and it becomes difficult to meet the real-time requirements of critical services.
Some new solutions have emerged in response to the large network load, poor real-time performance, and poor accuracy encountered in traditional predictive maintenance. For example, some early predictive maintenance algorithms for integrated transmission systems achieve high fault diagnosis accuracy. Shalalfeh [11] proposed a method to analyze multi-dimensional sensor data and used the resulting characteristics for health prediction. Detrended fluctuation analysis was used to evaluate the long-term correlation in rolling bearing vibration data. The test results showed that Kendall’s tau coefficient could serve as an early warning signal of bearing failure, but the real-time performance was poor because of the long correlation in the time domain. Jia [12, 13] proposed an intelligent diagnosis method based on a neural network. A multi-layer perceptron deep learning network was used to diagnose bearing faults with an accuracy of 99.74%, but the model architecture is very complicated. Ince [14] proposed a fast and accurate early fault diagnosis algorithm that monitors the motor operating status using an adaptive one-dimensional convolutional neural network, in which the feature extraction and classification stages of motor fault detection were fused into a single learning body. The experimental results verified the effectiveness of the method. However, it cannot be applied in practice because of poor real-time performance. He [15] then proposed a bearing fault diagnosis method based on deep learning. This method used a short-time Fourier transform (STFT) to preprocess the sensor signal, and a large memory storage and retrieval (LAMSTAR) neural network was established. The experiments showed that bearing faults could be diagnosed effectively, but an early data-driven model is required to complete the fault diagnosis.
With the rapid development of machine learning, many deep learning models have been applied to fault diagnosis, including the deep inception net with atrous convolution (ACDIN) [16], convolutional neural networks [17, 18, 19, 20], a long short-term memory hybrid neural network [21], deep belief nets [22], combined vibration images [23], and a deep encoder [24]. Both public data sets and actual test data were used to verify their effectiveness. However, these methods are computationally expensive and are not suitable for real-time applications.
Sabour [25] proposed a deep network with activation vectors, building on classic deep learning and convolutional neural networks. The network expanded the ordinary neurons of a deep network into multi-dimensional neurons and encapsulated multi-dimensional data into capsules. The multi-dimensional neurons can capture both the magnitude and the direction of the data and have stronger learning capabilities than ordinary neurons. However, the iterative algorithm based on dynamic routing and unsupervised clustering introduces an extra iteration loop whenever the network is used, resulting in greater hardware resource consumption and a longer calculation time.
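The extra iteration loop can be made concrete with a minimal pure-Python sketch of routing-by-agreement as it is commonly described for the capsule network of [25]. The function names, the list-based data layout `u_hat[i][j]`, and the default of three iterations are assumptions made for illustration; this is not the implementation used in this paper.

```python
import math


def squash(s):
    """Shrink vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """u_hat[i][j] is the prediction vector of input capsule i for output capsule j."""
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v = []
    for _ in range(num_iters):  # the extra loop that costs time at inference
        c = [softmax(row) for row in b]  # coupling coefficients per input capsule
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):  # raise logits where prediction and output agree
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

Every forward pass repeats the weighted sum, the squash, and the agreement update `num_iters` times, which is the source of the additional computation discussed above.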
Wang [26] combined the capsule network with the Xception module (XCN) to achieve intelligent fault diagnosis. First, a wavelet time-frequency analysis was performed to obtain the fault time-frequency diagram. Second, the diagram was fed into the XCN for training, and a cost function penalty was applied to the parameters that changed considerably. Finally, the fault types were classified according to the capsule length obtained from the dynamic routing operation. Zhu [27] built a fault recognition network with a depth of 12 layers and 7.9M network parameters by combining the inception network with the capsule network. The time-frequency diagram of the vibration signal was obtained by preprocessing and used to diagnose faults of the rotating part. Because of the large number of parameters in the network architecture, it is hard to reduce the computation time. Wang [28] proposed a capsule network based on wide convolution and multi-scale convolution (WMSCCN) for fault diagnosis. The WMSCCN algorithm used a one-dimensional vibration signal as the input, and the adaptive batch normalization (AdaBN) algorithm was introduced into the model. The effectiveness of the algorithm was verified through experiments, but the fault recognition accuracy is low. Kao [29] proposed a fault diagnosis algorithm based on current signature analysis. A complete motor fault diagnosis system conventionally performs feature extraction with existing methods and then applies a separate classification method. The classification method described uses a wavelet packet transform and a deep one-dimensional convolutional neural network containing a softmax layer. Experimental results using real-time motor stator current data demonstrate the effectiveness of this method for monitoring motor status. However, training may require a high-specification PC because the neural network contains a large number of neurons, and the real-time performance is poor. Zhang [30] proposed an enhanced CNN model that uses time-frequency images as input for bearing fault diagnosis. Seven data sets provided by CWRU and YSU were used to verify the effectiveness of the proposed method. The training time of this method is relatively short, and the accuracy is as high as 96%, but the model has poor robustness. Zhao [31] proposed an improved DCGAN (deep CNN-based GAN) for vibration-based fault diagnosis with unbalanced data. An auxiliary classifier is introduced to facilitate the training process, and an AE-based method is introduced to estimate the similarity of the generated samples. In addition, an online sample filter is designed and embedded in the GAN for automatic sample selection, where the selected samples should meet the requirements of accuracy and diversity. This method has good diagnostic performance, but parameter tuning is time-consuming, and the reliability is low.
In summary, early fault diagnosis algorithms based on traditional methods suffer from complex data preprocessing, poor applicability, and low recognition accuracy. Deep learning abstracts and encodes the original features by constructing a multi-layer perceptual structure and then classifies or recognizes samples. The more layers and neurons a deep learning model has, the more features and details it can capture and store. Deep learning has achieved outstanding results in image processing and natural language processing. However, each neuron in a conventional deep network represents only one feature, so its ability to obtain and mine information is limited. When neurons are expanded into multi-dimensional vectors, each dimension can learn a different feature, and the network can capture more information. Hinton [25, 32] proposed two deep networks with vector structures, the vector CapsuleNet and the matrix CapsuleNet. These two networks expand neurons into multidimensional vectors, and the expanded neurons are called Capsules. A Capsule replaces a single neuron of the original neural network. Introducing such deep learning algorithms into the early fault diagnosis process of the transmission system can therefore effectively improve early fault diagnosis capability. However, because the CapsuleNet dynamic routing algorithm operates on high-dimensional vectors and a large number of training samples, its calculation time is much longer than that of other deep learning networks of the same scale. The algorithm has high complexity, a large amount of calculation, a long calculation time, and high hardware requirements. Because of this limitation, the routing algorithm cannot be applied to real-time processing of the vibration signals of mechanical rotating parts in the integrated transmission system of actual vehicles. Where real-time requirements are high and the computing capability of the equipment terminal is limited, the existing complex deep learning architectures need to be improved into a deep network algorithm with a simple architecture, a small amount of calculation, high real-time performance, and robust data mining capabilities, so that early real-time fault diagnosis of rotating components can be realized.
Each Capsule contains two pieces of information: the magnitude of the vector and the direction of the vector. Capsule information is transmitted dynamically between two layers through an iterative update. CapsuleNet makes two main contributions to deep networks: first, it expands the dimension of neurons and enhances the ability of the network to obtain information; second, the transmission between two Capsule layers uses the dynamic routing algorithm and the Expectation–Maximization algorithm to realize unsupervised clustering of features.
Because of the high complexity, large amount of calculation, and long running time of the capsule network algorithm [25], the routing update algorithm needs to be improved for actual vehicle deployment so that the iterative update process is avoided, the amount of calculation is reduced, and the real-time performance is improved. This paper uses optimal transport and generative adversarial networks to replace the routing update algorithm and further modifies the capsule network architecture to process one-dimensional raw vibration data. The resulting OT-Caps fault diagnosis model has more robust fault feature mining capabilities and better recognition accuracy. It addresses the poor real-time performance, large calculation volume, and high hardware platform requirements of the capsule network and provides a basis for actual deployment.
The main contribution of this paper is a novel fault diagnosis model named OT-Caps. Based on the characteristics of the capsule network, the model expands the one-dimensional neurons of the traditional convolutional neural network into multi-dimensional neurons, which enhances the data mining ability and fault feature storage ability of the deep network. Multiple failure modes in rotating parts such as bearings, gears, and shafts can be identified with high precision from raw vibration signals. In addition, the model introduces generative adversarial networks and optimal transport theory to construct the objective loss function, which addresses the large calculation volume and long calculation time of the multi-dimensional neuron network. The fault identification transferability, fault identification ability, and real-time computing ability of the model are verified on public data sets and actual vibration data.
Section 2 introduces the proposed OT-Caps algorithm architecture based on generative adversarial networks and optimal transport theory. Section 3 presents the test results of the OT-Caps algorithm on different public data sets and actual test data. Section 4 concludes the paper.
3. Experiment Method
To verify the fault diagnosis performance of the OT-Caps model designed in this paper, gearbox fault data, bearing failure data, and actual test fault data from the transmission system are used for verification.
The computer used in this work has an Intel Core i7-6700 CPU, 16 GB of RAM, and an NVIDIA GTX 980 graphics card with 4 GB of video memory. GPU-based PyTorch 1.0 is used for model training and testing.
3.1. OT-Caps Fault Diagnosis Algorithm Real-Time Comparison Verification
The gearbox failure data in this test are taken from IEEE PHM (Prognostics and Health Management) 2009. The structure of the test gearbox is shown in Figure 4a. The input shaft carries one gear, the intermediate shaft carries two gears, and the output shaft carries one gear. The shafts are connected to the box by bearings, and vibration sensors are installed at the input shaft end and the output shaft end. The gears have 32, 96, 48, and 80 teeth, respectively. The input speeds during the test are 30, 35, 40, 45, and 50 rpm. The load is either high or low. The data sampling rate is 66.7 kHz. Each recording lasts 4 s and contains approximately 256,000 data points. The installation positions of the vibration sensors, which collect vibration signals at the input and output ends, respectively, are shown in Figure 4b,c.
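The tooth counts above determine the shaft speeds and gear mesh frequencies that a diagnosis method implicitly has to separate. The following sketch assumes that the 32-tooth input gear meshes with the 96-tooth intermediate gear and that the 48-tooth intermediate gear meshes with the 80-tooth output gear; the function name and the convention of passing the input speed in revolutions per second are illustrative assumptions.

```python
from fractions import Fraction


def shaft_and_mesh_frequencies(input_rev_per_s):
    """Return shaft and gear mesh frequencies (Hz) for the two-stage gearbox.

    Assumed pairing: 32T (input) drives 96T (intermediate), and
    48T (intermediate) drives 80T (output).
    """
    f_in = Fraction(input_rev_per_s)
    f_mid = f_in * Fraction(32, 96)   # first stage slows the shaft by 3
    f_out = f_mid * Fraction(48, 80)  # second stage slows it by a further 5/3
    return {
        "intermediate_shaft_hz": f_mid,
        "output_shaft_hz": f_out,
        "mesh_stage1_hz": f_in * 32,   # teeth passing per second, stage 1
        "mesh_stage2_hz": f_mid * 48,  # teeth passing per second, stage 2
    }


# A speed given in rpm is converted to revolutions per second first.
freqs = shaft_and_mesh_frequencies(Fraction(1800, 60))
```

Under this pairing the overall speed ratio is 5:1, and the second-stage mesh frequency is half of the first-stage mesh frequency.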
Since the OT-Caps model performs fault diagnosis on one-dimensional time series vibration signals and extracts fault features directly from the raw data, frequency domain analysis is not required. To increase the amount and diversity of the training data, this paper uses a sliding window to slice the collected one-dimensional data into overlapping segments, as shown in Figure 5. The sliding window length is 2048, and the sliding step length is 100, so a window of 2048 data points is taken every 100 points. A total of 8500 samples are generated, of which 7000 are used for training and 1500 for testing.
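The overlapping slicing can be sketched in a few lines. The window length of 2048 and step of 100 come from the text; the function names and the toy signal are assumptions for illustration, and how the resulting windows are assigned to the training and test sets is not shown.

```python
def sliding_windows(signal, window=2048, step=100):
    """Cut a 1-D sequence into overlapping windows of fixed length."""
    last_start = len(signal) - window
    return [signal[s:s + window] for s in range(0, last_start + 1, step)]


def count_windows(n_points, window=2048, step=100):
    """Number of full windows that fit into n_points samples."""
    if n_points < window:
        return 0
    return (n_points - window) // step + 1


# Toy signal: 2348 points give 4 windows starting at 0, 100, 200, and 300.
toy_signal = list(range(2348))
segments = sliding_windows(toy_signal)
```

Because consecutive windows share 1948 of their 2048 points, neighboring samples are highly correlated, which is worth keeping in mind when the samples are split.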
3.1.1. OT-Caps Network Training Optimization Algorithm
In this part, the OT-Caps fault diagnosis model is compared with the original CapsuleNet model in terms of training time, test running time, and recognition accuracy. The original CapsuleNet is described in [25] and takes a two-dimensional input. Here, the raw vibration data are converted directly into a two-dimensional array to meet the input requirements of CapsuleNet. The comparison results of the two network models are shown in Table 2. The table shows that the calculation time of the improved OT-Caps model is much lower. During training, the OT-Caps model is 13.5 times as fast as the original CapsuleNet model. During testing, because the OT-Caps fault diagnosis model uses the OT loss solution process, it is 130 times as fast as CapsuleNet, which is a great advantage. According to the timing test, its data processing rate is 7.692 kHz, which can meet the real-time requirements for fault diagnosis of mechanical rotating parts of the transmission system.
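A processing rate of this kind can be estimated with a simple wall-clock measurement. The helper names below are hypothetical, and the callable passed in stands for one inference step of whichever model is being timed; the measurement procedure actually used for Table 2 is not described in this section.

```python
import time


def measure_rate_hz(fn, n_calls=1000):
    """Call fn repeatedly and return the achieved calls per second."""
    start = time.perf_counter()
    for _ in range(n_calls):
        fn()
    elapsed = time.perf_counter() - start
    return float("inf") if elapsed == 0.0 else n_calls / elapsed


def period_ms(rate_hz):
    """Time budget per processed item, in milliseconds, for a given rate."""
    return 1000.0 / rate_hz


# A rate of 7.692 kHz corresponds to roughly 0.13 ms per processed item.
budget_ms = period_ms(7692.0)
```

For GPU models the device would also have to be synchronized before each clock reading, which this CPU-only sketch leaves out.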
3.1.2. Comparison of Fault Recognition Accuracy
The test accuracy of various deep fault diagnosis models is then compared, and the diagnosis results are shown in Table 3. Among the tested models, the dislocated time series CNN (DTS-CNN) is described in Reference [34], the one-dimensional convolutional neural network (1-DCNN) in Reference [35], and the deep adversarial convolutional neural network (DACNN) in Reference [36]. The comparison shows that the OT-Caps model proposed in this paper also has good recognition accuracy.
The fault recognition confusion matrix shows how often each failure mode is misrecognized as another. Figure 5 shows the gearbox fault recognition confusion matrix. Figure 6 and Table 3 show that the OT-Caps fault diagnosis model has high recognition accuracy for eight types of faults.
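As a reminder of how such a matrix is built, the sketch below counts predictions per true class and derives the overall accuracy from the diagonal. The labels are made-up toy values and the function names are hypothetical; they do not reproduce the results reported in the figures or tables.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Entry [t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly recognized)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


# Toy example with three failure modes and six samples.
toy_true = [0, 0, 1, 1, 2, 2]
toy_pred = [0, 1, 1, 1, 2, 0]
toy_matrix = confusion_matrix(toy_true, toy_pred, n_classes=3)
```

Off-diagonal entries show which pairs of failure modes are confused with each other, which a single accuracy figure hides.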
3.2. OT-Caps Transfer Capability Comparison Verification
The bearing failure test data set used in this paper is an open standard bearing data set from the Data Center of Case Western Reserve University (CWRU). Because of its openness and representativeness, many scholars worldwide have used this data set for research such as fault characteristic signal extraction and fault pattern recognition. Because different scholars have worked on the same data, the data set is useful for comparing the fault recognition capabilities of different algorithms.
The bearing failure test equipment of Case Western Reserve University is shown in Figure 7 [27]. The test bench is composed of two motors. The bearing is installed in the bearing box and can work under different working conditions by adjusting the motor speed. The data sets under different working conditions are classified according to the working status.
During the test, all the bearing faults were introduced artificially: the rolling element, inner ring, and outer ring were damaged by the electric spark fault injection method. Faults of several sizes were used to simulate different fault levels. Three bearing loads, 1, 2, and 3 hp, were tested, with speeds of 1772, 1750, and 1730 rpm, respectively. There are nine different failure modes under the different loads and speeds. Therefore, this data set contains ten working states in total, including the healthy state.
3.2.1. Data Preprocessing
The vibration data analyzed in this paper were collected at the drive end. The sampling frequency was 12 kHz, the sampling time was 10 s, and each data set contains 120,000 data points. To increase the number of samples, the sliding window length is set to 2048 and the sliding step length to 100, so a window of 2048 data points is taken every 100 points, which generates a total of 6000 samples. Of these, 5000 samples are used for training and 1000 for testing. As shown in Table 4, the different speeds and loads give three working conditions, which constitute data sets A, B, and C, respectively. One data set is used for training, and the other two data sets are used for testing to verify the fault identification transferability of the model.
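The train-on-one, test-on-the-others protocol can be enumerated as follows. The helper names and the `fit` and `score` callables are hypothetical placeholders for whatever training and evaluation routines are used; only the data set names A, B, and C come from the text.

```python
from itertools import permutations


def transfer_tasks(names=("A", "B", "C")):
    """All ordered (source, target) pairs with different working conditions."""
    return list(permutations(names, 2))


def evaluate_transfer(datasets, fit, score):
    """Train on each source data set and score on every other data set."""
    results = {}
    for src in datasets:
        model = fit(datasets[src])  # one model per source condition
        for tgt in datasets:
            if tgt != src:
                results[(src, tgt)] = score(model, datasets[tgt])
    return results


tasks = transfer_tasks()
```

With three working conditions this yields six transfer tasks, such as training on A and testing on B.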
3.2.2. Comparison of Experimental Results
Several algorithms are compared in this paper, including support vector machine (SVM), k-nearest neighbor (kNN), support vector classification (SVC), classic architectures such as AlexNet and ResNet, the bearing diagnosis architecture ACDIN described in [16], and the deep convolutional network with wide first-layer kernels (WDCNN) for bearing fault diagnosis proposed in [37]. Among them, SVC and kNN use the frequency spectrum as the input features. AlexNet, ResNet, and the inception capsule network (ICN) use the time domain spectrogram as the input features. ACDIN, WDCNN, and OT-Caps use raw data as the input features. The fault recognition accuracy of each algorithm is shown in Figure 8 and Table 5. Figure 8 and Table 5 show that the prediction accuracy of the deep learning architectures is significantly higher than that of the two shallow architectures SVC and kNN, indicating that deep learning architectures can better extract fault features. Among the deep learning architectures, the methods that use the time domain or frequency spectrum as the input feature generally achieve higher prediction accuracy than the methods that use raw data, indicating that it is more difficult for a deep model to extract features directly from raw data. OT-Caps nevertheless extracts features from the raw data well, and its prediction accuracy is higher than that of the other deep models, which shows that OT-Caps has a stronger feature extraction ability.
The output features of each layer are visualized using t-SNE (t-distributed stochastic neighbor embedding), and the result is shown in Figure 9. Deeper layers extract more discriminative features. Layer1 and Layer2 are ordinary convolutional layers, and the features they extract are relatively shallow. Layer3 is the first capsule layer. After this layer, the features are better separated, but certain types of features are still mixed together. Layer4 is the second capsule layer. After it, the features can be distinguished well, and the learned features are then output through the fully connected layer.
3.3. Gearbox Failure Test
To verify the effectiveness of the OT-Caps algorithm for fault diagnosis of the transmission system, a real failure test was carried out on the gearbox in the transmission system, and the early fault diagnosis ability of the OT-Caps algorithm was verified with the gearbox test data.
3.3.1. Test Equipment
During the test, a spur gear transmission box with a transmission ratio of 1:4 was used as the test object. The test transmission box, shown in Figure 10, includes a pair of meshing spur gears and two fixed shafts. Support bearings are installed at both ends of each shaft. The two bearing sizes are 6015 and 6210, respectively. The test focused mainly on the support bearings.
3.3.2. Test Result
Insufficient bearing lubrication is a common failure mode. This test mainly simulates failure caused by insufficient lubricating oil and is conducted on the transmission system by applying torque loads of different magnitudes at different speeds. During the test, the speeds are 400, 800, and 1200 rpm, and the loads are 50, 150, and 200 N·m. To avoid seizure of the gearbox bearings, which could damage the motor, the test is stopped immediately when the gearbox vibrates severely. Two failure tests were carried out. In the first test, the original bearing in the transmission box was damaged during operation. After the bearing was replaced with a new one, the second failure test was carried out. It was stopped when strong vibration occurred and the bearing was damaged again. Test data were captured during both tests.
The data sampling rate is 25.6 kHz, and each data set includes about 2560 points. The first bearing ran for 3.12 h and the second bearing ran for 2.97 h. The original data in the Y direction are shown in Figure 11. The vibration gradually increases as degradation progresses. Figure 11a shows the degradation process of the original bearing of the transmission box. The degradation process is not stable because the box had been running for a long time. Figure 11b shows the vibration curve of the degradation process of the new bearing; the vibration increases significantly as wear develops.
These test data are used to verify the ability of OT-Caps to identify early faults. Early fault identification is mainly based on the increase in the vibration signal and the appearance of periodic shock vibration when the transmission system fails. The vibration data of the transmission box failure process are shown in Figure 12. The red dotted lines divide the failure process into three stages according to the degradation process: normal, early failure, and failure. A certain interval is left between adjacent stages so that samples near the stage boundaries, which are similar to each other, do not reduce sample discrimination. The data preprocessing is the same as the method used before. A total of 3000 samples are generated, 1000 for each of the three stages. Of these, 2400 samples are randomly selected for training, and the other 600 are used for testing.
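The random 2400/600 split of the three-stage samples can be sketched as below. The stage names follow the text; the function name, the fixed seed, and the placeholder sample tuples are assumptions for illustration, and the sketch does not reproduce how the windows themselves were extracted.

```python
import random

STAGES = ("normal", "early_failure", "failure")


def split_samples(samples, n_train=2400, seed=0):
    """Shuffle the samples reproducibly and split them into train and test."""
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder samples: (stage label, window index), 1000 per stage.
all_samples = [(stage, k) for stage in STAGES for k in range(1000)]
train_set, test_set = split_samples(all_samples)
```

A purely random split does not guarantee exactly 800 training samples per stage; a stratified split would be needed if equal class proportions matter.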
After testing, the fault recognition accuracy of the OT-Caps model is 97.17%, which shows that it can effectively identify different damage levels. The fault recognition confusion matrix is shown in Figure 13. The normal state is never confused with the early fault or fault state. The test results indicate that OT-Caps can effectively identify early failures when the transmission box fails.
3.4. Bench Test Verification of Integrated Transmission System
3.4.1. Test Equipment
The fault data come from two integrated transmission systems of a certain model undergoing maintenance. A bench test was performed on the two units. A vibration sensor was installed on the input shaft and on the output shaft on each side, and three vibration sensors were installed on the upper end cover of the integrated transmission system close to the fan drive link. In total, six three-directional acceleration vibration sensors were installed. A Siemens LMS acquisition instrument acquired the vibration signals.
During the test, the integrated transmission box was operated in reverse gears 1 and 2, neutral, and forward gears 1 to 6, and each gear was tested at four input speeds of 800, 1200, 1600, and 2200 rpm. Three load conditions were applied: no load, low load, and full load. The sampling rate is 10 kHz, and each operating condition runs for about 5 min.
3.4.2. Test Result
After testing, the classification accuracy of the OT-Caps fault diagnosis model on the test set reaches 100%. The algorithm can effectively identify different failure modes of mechanical rotating parts of the integrated transmission system. The operational efficiency and accuracy of CapsuleNet and OT-Caps are also compared. As shown in Table 6, OT-Caps has a great advantage in real-time performance. The CapsuleNet architecture provided by Li [38] is used here, and it has 10.58M network parameters. By simplifying the network architecture and reducing the number of network parameters, the OT-Caps fault diagnosis network proposed in this paper effectively improves the real-time performance of the fault diagnosis process while maintaining high-precision fault diagnosis. It provides technical support for actual vehicle deployment.