Article

Fault Diagnosis of Planetary Gearbox Based on Dynamic Simulation and Partial Transfer Learning

1 College of Information, Mechanical and Electrical Engineering, Ningde Normal University, Ningde 352000, China
2 College of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350202, China
3 Dongguan Xinghuo Gear Co., Ltd., Dongguan 523000, China
* Authors to whom correspondence should be addressed.
Biomimetics 2023, 8(4), 361; https://doi.org/10.3390/biomimetics8040361
Submission received: 17 July 2023 / Revised: 6 August 2023 / Accepted: 9 August 2023 / Published: 12 August 2023
(This article belongs to the Special Issue Bionic Artificial Neural Networks and Artificial Intelligence)

Abstract

Real-world fault data on planetary gearboxes are scarce, which makes it difficult to diagnose faults using deep learning methods. One remedy is to generate sufficient simulated fault data with a dynamic simulation model and then use transfer learning to reduce the difference between simulated and real data, so that diagnostic knowledge learned from simulation can be applied to real planetary gearboxes. However, the label space of real data may be a subset of the label space of simulation data. In this case, existing transfer learning methods are susceptible to interference from the outlier label space in simulation data, resulting in mismatching. To address this issue, this paper introduces multiple domain classifiers and a weighted learning scheme on the basis of existing domain adversarial transfer learning methods. The scheme evaluates the transferability of simulation data, adaptively measures their contribution to the label predictor and domain classifiers, filters out the interference of unrelated categories of simulation data, and achieves accurate matching of real data. Finally, partial transfer experiments are conducted to verify the effectiveness of the proposed method, and the experimental results show that its diagnostic accuracy is higher than that of existing transfer learning methods.

1. Introduction

A planetary gearbox (PG) is a key component of rotating machinery. Owing to its large bearing capacity, small volume, and high transmission efficiency, it is widely used in mechanical transmission systems in industries such as wind power, aviation, lifting, and transportation. However, PGs often work in harsh environments, which makes them susceptible to malfunctions. If a PG fails, the entire transmission system may degrade and fail, possibly causing catastrophic damage and huge economic losses [1]. Therefore, research on fault diagnosis methods for PGs has important practical significance for ensuring stable operation and prolonging the service life of mechanical equipment [2]. With reliable fault diagnosis models, problems can be detected in a timely manner and corresponding measures can be taken, which both reduces the maintenance costs of mechanical equipment and avoids larger faults and losses.
When machine learning and deep learning methods are used for PG fault diagnosis, large quantities of labeled fault data are usually necessary. In actual industrial production, however, most data are collected while the machine is running normally [3], so extensive and detailed fault data are hard to obtain. To address this issue, abundant simulated fault data can be generated through dynamic simulation analysis, and transfer learning (TL) methods can then be used to narrow the disparity between simulated and real data. The diagnostic information in the simulated data can thus be applied to the fault diagnosis of real PGs. Li et al. [4] simulated vibration signals using a lumped parameter dynamic model and then used a CNN-based TL network to obtain domain-invariant features from various domains in order to classify faults. Dong et al. [5] generated abundant simulated data using dynamic models and then used CNN and parameter transfer methods to apply the learned fault diagnosis knowledge to practical scenarios, solving the problem of small samples. Li et al. [6] trained a deep neural network model using computer-simulated data to cope with the insufficient amount of labeled fault data and used TL to narrow the discrepancy between the simulated and actual domains. Zhu et al. [7] introduced a defect vibration model for simulating fault vibration signals and used simulated and real signals as the source domain and target domain of a TL fault diagnosis method, demonstrating the method's effectiveness and superiority through experiments. Liu et al. [8] generated simulated vibration signals using a phenomenological model and then used domain adversarial neural networks for adversarial training between the source and target domains. According to their experimental findings, this method can achieve excellent classification accuracy with relatively little real data.
Diagnostic methods based on dynamic simulation and TL usually presume that the label spaces of simulation data and real data are identical. However, when such a method is applied to real planetary gear fault diagnosis, the simulation data can contain all possible fault categories while the real planetary gearbox may exhibit only one or a few faults, so the label space of real data is a subset of the label space of simulation data. This can lead to interference from the outlier label space in simulation data, causing mismatching. Partial transfer learning (PTL) methods can help reduce mismatching. Wang et al. [9] proposed a balanced adversarial domain adaptive network for fault diagnosis tasks in partial transfer scenarios, which alleviated the mismatching problem by introducing balancing strategies and class-level weights. Li et al. [10] proposed a class-weighted adversarial neural network that encourages positive transfer of shared classes and ignores source class outliers through class-weighting strategies. Sun et al. [11] proposed a game theory-enhanced domain adaptation network to solve partial domain adaptation problems. The network constructs three attention matrices using maximum mean discrepancy, Jensen-Shannon divergence, and Wasserstein distance and generates the best probability weight through a combination of game theory weights, thereby filtering out irrelevant source domain samples and improving mechanical fault diagnosis performance. Li et al. [12] proposed a weighted adversarial transfer network that filters out irrelevant source domain samples and improves the performance of the target task through weighted learning. Jiao et al. [13] proposed a domain adaptive network based on classifier inconsistency. The network uses two discriminative 1D-CNNs as the basic architecture and promotes network training by identifying and emphasizing source domain samples that share classes with the target domain. At the same time, a classifier inconsistency term is added to direct the model towards learning discriminative and domain-invariant representations for precise classification of unlabeled target data. Kuang et al. [14] proposed a two-stage double-weight consistency-induced partial domain adaptive network. This network obtains double-level composite weights from class-level and sample-level weights through double-weight consistency-induced weighting strategies, enabling selective mapping of source diagnosis knowledge to the target domain.
To deal with the scarcity of labeled fault data in real PG fault diagnosis, this paper establishes a dynamic simulation model of the PG to obtain abundant fault simulation data. However, the simulation data and the actual data follow different distributions, and their label spaces also differ. To solve these problems, this paper introduces multiple domain classifiers and a weighted learning scheme on the basis of existing domain adversarial TL methods. The scheme evaluates the transferability of simulation data, adaptively measures their contributions to the label predictor and domain classifiers, filters out the interference of irrelevant categories of simulation data, achieves accurate matching of real data, and thus improves the diagnostic accuracy of transfer tasks.
The main contributions of this paper are as follows:
  • A rigid-flexible coupling dynamic model of the PG is used to generate abundant fault simulation data, which addresses the scarcity of labeled fault data in real-world scenarios.
  • Multiple domain discriminators and a weighted learning scheme are introduced to filter out the interference from simulation data of irrelevant categories, thereby improving the diagnostic accuracy of partial transfer tasks.
The remainder of this paper is arranged as follows: Section 2 introduces the relevant theories. Section 3 describes the proposed method. Section 4 studies a practical case. Section 5 summarizes this paper.

2. Theoretical Background

2.1. Partial Transfer Learning

In TL, it is often assumed that the label space Cs of the source domain (Ds) samples is the same as the label space Ct of the target domain (Dt) samples. However, in real-world applications, Ct is more likely to be a subset of Cs. In this case, all labels in Dt are shared by both Ds and Dt, and Ct is also known as the shared label space. Some labels are unique to Ds; they form the outlier label space (Cs−Ct), which can lead to mismatches between Ds and Dt samples and affect the accuracy of the transfer task. When Ct is a subset of Cs, PTL aims to promote the positive transfer of samples in the shared label space while suppressing the negative transfer of samples in the outlier label space, thereby improving the accuracy of the transfer task [15]. Figure 1 illustrates the concept behind traditional TL and PTL. In the figure, Ds samples have three types of labels: △, ○, and □, while Dt samples have only two types of labels: △ and □. In this case, ○ in Ds is an outlier label, which may lead to mismatching with Dt samples during TL. PTL methods aim to recognize and filter out outlier labels in Ds, reducing the risk of mismatching and improving model performance.
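The relationship between the shared and outlier label spaces can be written as plain set operations. The following is only an illustrative sketch of the Figure 1 example; the function name `split_label_space` and the shape names standing in for △, ○, and □ are assumptions made here, not part of the original method.

```python
# Label spaces from the Figure 1 example (shape names stand in for the symbols).
source_labels = {"triangle", "circle", "square"}   # Cs
target_labels = {"triangle", "square"}             # Ct

def split_label_space(cs, ct):
    """Return (shared, outlier) label sets for a partial transfer task."""
    if not ct <= cs:
        raise ValueError("partial transfer assumes Ct is a subset of Cs")
    return ct, cs - ct

shared, outlier = split_label_space(source_labels, target_labels)
print(sorted(shared))   # ['square', 'triangle']
print(sorted(outlier))  # ['circle']
```

In a real partial transfer task Ct is not known during training, so this split cannot be computed directly; it has to be estimated from the data, which is what the weighting schemes of PTL methods are for.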

2.2. Residual Neural Network

For deep neural networks, the number of layers is crucial. The deeper the network, the richer its ability to extract hierarchical features and the stronger its recognition and classification capabilities. However, for a traditional CNN, too many layers can cause vanishing or exploding gradients, making the network difficult to train. To address this issue, He et al. [16] proposed Residual Neural Networks (ResNet) in 2015. ResNet usually includes convolutional layers, pooling layers, residual blocks, and fully connected layers. Figure 2 shows a schematic diagram of a residual block, where x is the input, H(x) = F(x) + x is the output, and F(x) = H(x) − x is the residual. The residual block has two branches: the residual branch and the identity mapping branch. The residual branch consists of two convolutional layers, which are used to fit the residual F(x), while the identity mapping branch keeps the input x unchanged. The output H(x) of the residual block is obtained by element-wise addition of the two branches and then passed through the ReLU activation function. The identity mapping helps ensure that a deeper network performs no worse than a shallower one, and it adds no extra parameters or computational complexity.
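The structure of a residual block can be sketched in a few lines. This is a simplified illustration, not the ResNet-18 implementation: the two convolutional layers of the residual branch are replaced by small matrix-vector products (an assumption made to keep the example self-contained), and the names `linear` and `residual_block` are hypothetical.

```python
def relu(v):
    return [max(0.0, a) for a in v]

def linear(weights, v):
    """Matrix-vector product; stands in for a convolutional layer."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """ReLU(F(x) + x), with the residual branch F(x) = W2 * ReLU(W1 * x)."""
    f = linear(w2, relu(linear(w1, x)))             # residual branch F(x)
    return relu([fi + xi for fi, xi in zip(f, x)])  # add identity branch, then ReLU

x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
# With zero weights F(x) = 0, so the block reduces to ReLU(x).
print(residual_block(x, zeros, zeros))  # [1.0, 0.0, 3.0]
```

The zero-weight case shows why the identity branch matters: a block that has learned nothing still passes its input through (up to the final ReLU) instead of destroying it.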

2.3. Domain Adversarial Neural Network

Domain Adversarial Neural Networks (DANNs) have been widely used in TL. Through an adversarial learning process, the network extracts domain-invariant features from both Ds and Dt. The adversarial learning process can be seen as a two-player game: the first player is a domain classifier Gd trained to distinguish Ds features from Dt features, and the second player is a feature extractor Gf trained to confuse Gd. The framework of DANN is shown in Figure 3.
To obtain domain-invariant features, during the training of DANN the parameters θf of Gf are learned by maximizing the loss of Gd, while the parameters θd of Gd are learned by minimizing that loss. In addition, minimizing the loss of the label predictor Gy ensures a low Ds classification error. The overall loss function of DANN is shown in Equation (1) [17]:
$$L(\theta_f, \theta_y, \theta_d) = \frac{1}{n_s} \sum_{x_i \in D_s} L_y\big(G_y(G_f(x_i)), y_i\big) - \frac{\lambda}{n_s + n_t} \sum_{x_i \in D_s \cup D_t} L_d\big(G_d(G_f(x_i)), d_i\big) \tag{1}$$
In the formula, Ly is the loss function of Gy, Ld is the loss function of Gd, ns and nt are the numbers of Ds and Dt samples, yi and di are the class label and domain label of the i-th sample, and λ is the hyperparameter that balances Ly and Ld. The parameter optimization for DANN is:
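$$(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} L(\theta_f, \theta_y, \theta_d) \tag{2}$$
$$\hat{\theta}_d = \arg\max_{\theta_d} L(\theta_f, \theta_y, \theta_d) \tag{3}$$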
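To make the sign convention of the DANN objective concrete, the following sketch evaluates it for given network outputs. It assumes cross-entropy for Ly and binary cross-entropy for Ld, which is a common choice but is not stated in the text; the function names and the toy numbers are hypothetical.

```python
import math

def cross_entropy(probs, label):
    """Label loss Ly for one sample, given predicted class probabilities."""
    return -math.log(probs[label])

def binary_cross_entropy(p_source, domain):
    """Domain loss Ld; domain = 1 for a source sample, 0 for a target sample."""
    return -math.log(p_source if domain == 1 else 1.0 - p_source)

def dann_objective(src_probs, src_labels, dom_probs, dom_labels, lam):
    """Label term over Ds minus lam times the domain term over Ds and Dt."""
    n_s = len(src_labels)
    n_all = len(dom_labels)  # n_s + n_t
    label_term = sum(cross_entropy(p, y) for p, y in zip(src_probs, src_labels)) / n_s
    domain_term = sum(binary_cross_entropy(p, d) for p, d in zip(dom_probs, dom_labels)) / n_all
    return label_term - lam * domain_term

# One perfectly classified source sample; the domain classifier is fully confused (p = 0.5).
loss = dann_objective([[1.0, 0.0]], [0], [0.5, 0.5], [1, 0], lam=1.0)
print(round(loss, 4))  # -0.6931
```

Gf and Gy lower this value, while Gd raises it by lowering its own loss Ld. In practice the two opposing updates are often implemented in a single backward pass with a gradient reversal layer, as described in [17].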

3. Proposed Method

Weighted Domain Adversarial Neural Network Diagnostic Model

When the Dt label space Ct is a subset of the Ds label space Cs, existing TL fault diagnosis models may incorrectly match Dt samples with Ds samples belonging to the outlier label space Cs−Ct, resulting in reduced diagnostic accuracy. To deal with this problem, this paper proposes a domain adversarial neural network with a weighted learning strategy that promotes the positive transfer of the shared label space Ct and suppresses the negative transfer of the outlier label space Cs−Ct in Ds. The network framework is shown in Figure 4.
This network includes a feature extractor Gf, a label predictor Gy, and |Cs| domain classifiers Gd. Gf and Gy are built from ResNet-18. Gf is the feature extraction part of ResNet-18, including convolutional layers, pooling layers, and residual blocks. To extract more effective features, CBAM [18] is added to each residual block. Gy corresponds to the output part of ResNet-18 and includes a fully connected layer and a softmax classification layer. The |Cs| domain classifiers Gd share the same structure, which consists of three fully connected layers. The c-th domain classifier $G_d^c$ (c = 1, 2, …, |Cs|) is responsible for matching the Ds samples with label c and the Dt samples with label c. Therefore, in $G_d^c$, samples with label c should be assigned larger weights, while samples with other labels should be assigned smaller weights. In addition, only the domain classifiers responsible for matching the shared label space Ct can promote positive transfer, while the domain classifiers responsible for matching the outlier label space Cs−Ct introduce noise. Therefore, it is necessary to reduce the weight of the domain classifiers responsible for matching the outlier label space Cs−Ct.
As the labels of Dt samples are unknown during model training, the weights cannot be determined from labels. Joint distribution adaptation (JDA) [19] is often used to calculate differences between samples, and this paper calculates the weights using JDA values. The weight matrix Wd assigned to the domain classifiers is calculated as shown in Equations (4)–(6):
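$$j_i^c = \begin{cases} j_{is}^c: \operatorname{mean}_{x_j \in D_s^c} \mathrm{JDA}\big(G_f(x_j), G_f(x_i)\big), & x_i \in D_s \\ j_{it}^c: \operatorname{mean}_{x_j \in D_s^c} \mathrm{JDA}\big(G_f(x_j), G_f(x_i)\big), & x_i \in D_t \end{cases} \tag{4}$$
$$y^c = \frac{1 / \operatorname{mean}_{i=1}^{n_t} j_{it}^c}{\sum_{c'=1}^{|C_s|} 1 / \operatorname{mean}_{i=1}^{n_t} j_{it}^{c'}} \tag{5}$$
$$W_d = \big[y^1, y^2, \ldots, y^{|C_s|}\big] \tag{6}$$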
In the formulas, mean(·) calculates the average value, Gf(·) denotes the features extracted by the feature extractor, and JDA(·) denotes the JDA value used to measure the difference between two samples. A small JDA value indicates that the two samples differ little and are likely to share the same label. $j_{is}^c$ is the discrepancy between the i-th Ds sample and the Ds samples with label c, and $j_{it}^c$ is the discrepancy between the i-th Dt sample and the Ds samples with label c. Averaging the discrepancy between every Dt sample and the Ds samples with label c, and normalizing its reciprocal over all labels, gives the probability $y^c$ that label c belongs to the shared label space Ct. Because the labels in the outlier label space Cs−Ct do not belong to Ct, their probabilities $y^c$, c ∈ Cs−Ct, are expected to be small, which reduces the weight of the domain classifiers responsible for Cs−Ct.
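The computation of the domain classifier weights can be sketched as follows. This is a minimal illustration under a loud assumption: the JDA value between two samples is replaced by the Euclidean distance between their feature vectors, because the text does not specify how JDA is evaluated for a pair of samples. The function names and the toy two-dimensional features are hypothetical.

```python
import math

def discrepancy(a, b):
    """Placeholder for JDA(.): Euclidean distance between two feature vectors (assumption)."""
    return math.dist(a, b)

def class_discrepancies(sample, source_by_class):
    """Mean discrepancy between one sample and the source samples of each class c."""
    return [sum(discrepancy(f, sample) for f in feats) / len(feats)
            for feats in source_by_class]

def classifier_weights(target_feats, source_by_class):
    """y^c: reciprocal of the mean target-to-class discrepancy, normalised over classes."""
    n_t = len(target_feats)
    per_sample = [class_discrepancies(x, source_by_class) for x in target_feats]
    inv = [1.0 / (sum(row[c] for row in per_sample) / n_t)
           for c in range(len(source_by_class))]
    total = sum(inv)
    return [v / total for v in inv]

# Three source classes; the target samples lie near classes 0 and 1 only.
source_by_class = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[5.0, 5.0], [5.0, 6.0]],
    [[10.0, 0.0], [10.0, 1.0]],   # outlier class: no target sample nearby
]
target_feats = [[0.2, 0.5], [5.1, 5.5]]
w_d = classifier_weights(target_feats, source_by_class)
print([round(w, 2) for w in w_d])
```

In this toy case the third class receives the smallest weight because no target sample lies close to it. A zero discrepancy would cause a division by zero, so a practical version would likely add a small constant to each denominator.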
The calculation formula for the weight matrix Ws assigned to the samples is shown in Equations (7) and (8):
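$$s_i^c = \begin{cases} s_{is}^c: \dfrac{1 / j_{is}^c}{\sum_{c'=1}^{|C_s|} 1 / j_{is}^{c'}}, & x_i \in D_s \\ s_{it}^c: \dfrac{1 / j_{it}^c}{\sum_{c'=1}^{|C_s|} 1 / j_{it}^{c'}}, & x_i \in D_t \end{cases} \tag{7}$$
$$W_s = \begin{bmatrix} s_1^1 & s_1^2 & \cdots & s_1^{|C_s|} \\ s_2^1 & s_2^2 & \cdots & s_2^{|C_s|} \\ \vdots & \vdots & \ddots & \vdots \\ s_{n_s+n_t}^1 & s_{n_s+n_t}^2 & \cdots & s_{n_s+n_t}^{|C_s|} \end{bmatrix} \tag{8}$$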
In the formula, $s_{is}^c$ is the probability that the label of the i-th Ds sample is c, while $s_{it}^c$ is the probability that the label of the i-th Dt sample is c.
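The sample weights are a per-sample normalization of reciprocal discrepancies, which the following sketch makes explicit. The discrepancy values are made-up numbers chosen for illustration, and the function name `sample_weights` is hypothetical.

```python
def sample_weights(discrepancies):
    """Turn one sample's per-class discrepancies into weights that sum to 1."""
    inv = [1.0 / j for j in discrepancies]
    total = sum(inv)
    return [v / total for v in inv]

# One row per sample, one column per source class.
j_values = [
    [0.5, 2.0, 4.0],   # hypothetical sample close to class 0
    [4.0, 1.0, 4.0],   # hypothetical sample close to class 1
]
w_s = [sample_weights(row) for row in j_values]
for row in w_s:
    print([round(s, 3) for s in row])
# [0.727, 0.182, 0.091]
# [0.167, 0.667, 0.167]
```

Each row sums to one, so a row can be read as a soft label assignment: the smaller the discrepancy to a class, the larger the share of that sample routed to the corresponding domain classifier.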
After incorporating Wd and Ws, the total loss function of the model is outlined below:
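$$L\big(\theta_f, \theta_y, \theta_d^c\big|_{c=1}^{|C_s|}\big) = \frac{1}{n_s} \sum_{x_i \in D_s} L_y\big(G_y(G_f(x_i)), y_i\big) - \frac{\lambda}{n_s} \sum_{c=1}^{|C_s|} \sum_{x_i \in D_s} y^c L_d^c\big(G_d^c(s_{is}^c G_f(x_i)), d_i\big) - \frac{\lambda}{n_t} \sum_{c=1}^{|C_s|} \sum_{x_i \in D_t} y^c L_d^c\big(G_d^c(s_{it}^c G_f(x_i)), d_i\big) \tag{9}$$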
In the equation, θf represents the parameters of Gf, θy represents the parameters of Gy, $\theta_d^c$ represents the parameters of $G_d^c$, Ly represents the loss function of Gy, $L_d^c$ represents the loss function of $G_d^c$, di represents the domain label of the i-th sample, and λ is a hyperparameter that balances Ly and $L_d^c$.
The optimization of the model parameters is as follows:
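$$(\hat{\theta}_f, \hat{\theta}_y) = \arg\min_{\theta_f, \theta_y} L\big(\theta_f, \theta_y, \theta_d^c\big|_{c=1}^{|C_s|}\big) \tag{10}$$
$$\big(\hat{\theta}_d^1, \ldots, \hat{\theta}_d^{|C_s|}\big) = \arg\max_{\theta_d^1, \ldots, \theta_d^{|C_s|}} L\big(\theta_f, \theta_y, \theta_d^c\big|_{c=1}^{|C_s|}\big) \tag{11}$$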
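The effect of the classifier weights on the total loss can be illustrated numerically. The sketch below assumes binary cross-entropy for each domain classifier loss and takes the classifier outputs as given numbers (in the model they would come from the c-th domain classifier applied to the sample-weighted features); all names and values are hypothetical.

```python
import math

def bce(p_source, domain):
    """Domain loss; domain = 1 for a source sample, 0 for a target sample."""
    return -math.log(p_source if domain == 1 else 1.0 - p_source)

def weighted_domain_term(probs, domains, y):
    """(1/n) * sum over samples i and classes c of y[c] * loss of classifier c on sample i."""
    n = len(domains)
    return sum(y[c] * bce(p[c], d)
               for p, d in zip(probs, domains)
               for c in range(len(y))) / n

def total_loss(label_term, src_probs, tgt_probs, y, lam):
    """Label term minus the weighted source and target domain terms."""
    src = weighted_domain_term(src_probs, [1] * len(src_probs), y)
    tgt = weighted_domain_term(tgt_probs, [0] * len(tgt_probs), y)
    return label_term - lam * src - lam * tgt

y = [0.5, 0.5, 0.0]            # class 2 is treated as an outlier class
src_probs = [[0.5, 0.5, 0.9]]  # outputs of the three domain classifiers, one source sample
tgt_probs = [[0.5, 0.5, 0.1]]  # outputs of the three domain classifiers, one target sample
loss = total_loss(0.2, src_probs, tgt_probs, y, lam=1.0)
print(round(loss, 4))  # -1.1863
```

Because the weight of the third classifier is zero here, its outputs have no influence on the loss, which is the mechanism by which a classifier responsible for an outlier label is suppressed.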
Compared with a single domain classifier, the multiple domain classifiers used in this paper have two advantages: (1) by using the weight matrix Wd of the domain classifiers, the model can emphasize the domain classifiers responsible for the shared label space and suppress the ones responsible for the outlier label spaces, thereby reducing the negative impact of outlier label spaces. (2) The sample weight matrix Ws allows Dt samples to only align with Ds samples of one or multiple most relevant labels, thus reducing mismatching.

4. Experiment and Analysis

4.1. Dataset Comparison and Analysis

The real data for the PG comes from the Drivetrain Diagnostics Simulator (DDS), which is a comprehensive experimental platform for diagnosing power transmission faults. Figure 5 shows the physical model of the DDS experiment platform. During data collection, the variable-speed drive motor has three speeds: 20 Hz, 30 Hz, and 40 Hz, and the magnetic brake has three currents: 0 A, 0.4 A, and 0.8 A (by adjusting the current of the magnetic brake, various loads can be transferred to the output shaft). The sampling frequency is 12,800 Hz.
The simulated data for the PG comes from our previous article, where a rigid-flexible coupled model was established [20]. We used this model to obtain simulation data for four different health conditions of the PG: sun gear broken tooth fault (BR), sun gear crack fault (CR), sun gear tooth missing fault (MI), and normal sun gear (NO). During data acquisition, the input shaft speed was 30 Hz with no load, and the simulation time was 10 s with 128,000 simulation steps. For the simulation data, this is equivalent to a working condition of 30 Hz 0 A with a sampling frequency of 12,800 Hz. In our previous article, we also analyzed the effect of the simulation step size on the simulation data, and the results are shown in Figure 6 [20], where Δt represents the time interval between two impacts, fm represents the meshing frequency, and fg represents the fault frequency. Figure 6 shows that the smaller the simulation step size, the more obvious the periodic shock in the time domain diagram and the sideband in the frequency domain diagram. Both are consistent with the values calculated from the theoretical formulas, which supports the plausibility of the simulation model and the simulation data.
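The equivalent sampling frequency of the simulation follows directly from the simulation time and the number of steps. The short check below only restates that arithmetic; the variable names are chosen here for illustration.

```python
simulation_time_s = 10.0      # simulated duration from the text
simulation_steps = 128_000    # number of simulation steps from the text

sampling_frequency_hz = simulation_steps / simulation_time_s
step_size_s = simulation_time_s / simulation_steps

print(sampling_frequency_hz)  # 12800.0
print(step_size_s)            # 7.8125e-05
```

The result matches the 12,800 Hz sampling frequency of the real DDS data, so simulated and real signals can be segmented with the same window length.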
The rigid-flexible coupling model has simplified the PG of the DDS experimental platform to a large extent, and the parameter settings in the model are difficult to completely match with the actual PG. This results in a discrepancy between the simulation and real data, even though the simulated data agrees with the theory. In order to more intuitively demonstrate this difference, this section will analyze and compare the simulation and real data from the perspectives of time-domain, frequency-domain, and probability distribution.
(1) Comparison and analysis of time domain diagrams
Figure 7 shows the time-domain plots of simulated data and real data, both of which were obtained under a 30 Hz 0 A operating condition. It can be observed that there is no clear periodic impulse in either the simulated data or the real data due to a lack of sufficiently high sampling frequency. However, the real data exhibits amplitude modulation, while the simulated data does not. This is because the real data was collected using a fixed-position vibration sensor, while in PG, the planetary gear rotates, causing the distance between the planetary gear and sensor to change. When the planetary gear is closer to the sensor, the measured meshing vibration is larger, and when it is farther away, it is smaller, resulting in amplitude modulation in the time-domain waveform. In contrast, the simulated data measures the angular acceleration of the planetary carrier, which is less affected by the position of the planetary gear and hence does not exhibit amplitude modulation.
(2) Comparison and analysis of frequency domain diagrams
Figure 8 shows the frequency domain comparison of the simulated and real data, both of which were obtained under the same working conditions of 30 Hz 0 A. The frequency domain spectra of both simulated and real data exhibit obvious meshing frequencies and their low harmonics, indicating similar vibration characteristics between the two groups of data. However, the real data does not show higher harmonics of the meshing frequency, which may be due to environmental noise and other factors affecting the high-frequency components of the frequency domain spectra during data acquisition.
(3) Comparison and analysis of probability distribution
From the simulated data at 30 Hz 0 A and the real data under different conditions (30 Hz 0 A, 20 Hz 0.4 A, 40 Hz 0.8 A), 12,800 data points were selected and normalized to [−1, 1], resulting in the probability distribution curves shown in Figure 9. Figure 9 illustrates that the probability distribution curve of the simulated data is more concentrated and has a higher peak than that of the real data. This indicates that the probability distributions of the simulated data and the real data differ significantly. In addition, the probability distribution curves of the real data under different conditions are relatively similar, indicating that transfer between simulated and real data is more difficult than transfer between real data from different working conditions.
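A probability distribution comparison of this kind can be reproduced with a normalization step followed by a histogram. The sketch below assumes min-max scaling for the normalization to [−1, 1] and equal-width bins for the density estimate; neither choice is specified in the text, and the sine wave is only a stand-in for a vibration signal.

```python
import math

def normalize(signal):
    """Min-max scale a sequence to the interval [-1, 1] (assumed normalization)."""
    lo, hi = min(signal), max(signal)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in signal]

def histogram_density(values, bins=50):
    """Empirical probability density on [-1, 1] using equal-width bins."""
    width = 2.0 / bins
    counts = [0] * bins
    for v in values:
        idx = min(int((v + 1.0) / width), bins - 1)  # put v == 1 in the last bin
        counts[idx] += 1
    return [c / (len(values) * width) for c in counts]

signal = [math.sin(0.01 * k) for k in range(12_800)]  # placeholder for 12,800 data points
scaled = normalize(signal)
density = histogram_density(scaled)
print(min(scaled), max(scaled))  # -1.0 1.0
```

Plotting `density` for each dataset on the same axes would give curves comparable to those described for Figure 9; a more concentrated signal shows up as a taller, narrower peak.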

4.2. Dataset Description

By using overlapping sampling, 512-length data segments were extracted with a resampling step of 50 from both simulated and real data. Then, Short-Time Fourier Transform (STFT) was applied to convert these segments into time-frequency images with a size of 96 pixels in both width and height. There are 2400 time-frequency images for each health condition of simulated and real data, of which 2000 are utilized for training, and the remaining 400 are utilized for testing. Figure 10 shows the time-frequency images of simulated data. After obtaining the time-frequency images, some partial transfer tasks were designed, as shown in Table 1 and Table 2. In Table 1, the real data of 30 Hz 0 A is used as Ds, while the real data of other working conditions are used as Dt. In Table 2, the simulated data of 30 Hz 0 A is used as Ds, while the real data are used as Dt. Ds in Table 1 and Table 2 both contain four health conditions: BR, CR, MI, and NO.
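The overlapping sampling step can be sketched as a sliding window. The example assumes a record of 128,000 points (10 s at 12,800 Hz) and a sequential train/test split; the split strategy is not described in the text, the function name `segment` is hypothetical, and the STFT conversion to 96 × 96 images is omitted.

```python
def segment(signal, length=512, step=50):
    """Overlapping windows: consecutive segments start `step` samples apart."""
    return [signal[s:s + length] for s in range(0, len(signal) - length + 1, step)]

signal = list(range(128_000))          # placeholder for 10 s sampled at 12,800 Hz
segments = segment(signal)
print(len(segments))                   # 2550

# Keep 2400 segments per health condition: 2000 for training, 400 for testing.
train, test = segments[:2000], segments[2000:2400]
print(len(train), len(test))           # 2000 400
```

With a window of 512 points and a step of 50, a 128,000-point record yields 2550 segments, which is enough to supply the 2400 samples per health condition used in the experiments.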

4.3. Result Comparison

To demonstrate the effectiveness of the proposed method, we compared it with ResNet [16], DeepCoral [21], DDC [22], and DANN [23]. To make an impartial comparison, all methods used the same ResNet network structure and parameters as the proposed method.
Figure 11 and Figure 12 illustrate the diagnostic accuracies of the various methods. When both Ds and Dt are real data, the proposed method obtains an average diagnostic accuracy of 98.02%. When Ds is simulated data and Dt is real data, as analyzed in Section 4.1, the transfer difficulty is significantly increased. Nevertheless, the proposed method still achieves an average diagnostic accuracy of 83.83%, indicating its practical value. Furthermore, the proposed method outperforms the other TL methods in all transfer tasks. This is because the proposed method introduces multiple domain classifiers and a weighted learning strategy, which enables the model to measure the transferability of the Ds samples of each label to Dt, increasing the contribution of shared label Ds samples and decreasing the contribution of outlier label Ds samples during training. This reduces the mismatching between Dt samples and outlier label Ds samples, thereby improving the diagnostic accuracy of transfer tasks.

4.4. Feature Visualization Analysis

To clearly demonstrate the feature distribution when simulation data is used as Ds and real data as Dt, the features extracted by the various methods were visualized using t-SNE. Figure 13 depicts the feature visualization of different methods in task C5. In this task, there are four labels in Ds samples but only three labels in Dt samples, and MI in Ds is the outlier label. Figure 13 shows that ResNet can accurately distinguish Ds features of different labels, but the distribution of the Dt features it extracts is quite different from that of the Ds features, which leads to lower accuracy in classifying Dt samples. In comparison, the Ds and Dt features extracted by DeepCoral, DDC, and DANN have more similar distributions, but there is a large overlap between features of different labels, and some Dt features are incorrectly aligned with Ds MI features. In the proposed method, Dt features are correctly aligned with the Ds features of the corresponding label, and the discriminability between features of different labels is higher, which further validates the effectiveness of the proposed method.

5. Conclusions

This paper puts forward a weighted domain adversarial neural network diagnostic model aimed at improving the fault diagnosis performance of PG in partial transfer tasks. Unlike traditional domain adaptation diagnostic methods that directly adapt all Ds and Dt class samples, this method considers the influence of outlier label Ds samples. Specifically, this method uses multiple domain classifiers, each of which is responsible for matching samples of a certain label. In addition, a weighting scheme assigns smaller weights to outlier label Ds samples and to the domain classifiers responsible for matching them, reducing the negative impact of outlier label Ds samples and promoting correct matching between shared label Ds samples and Dt samples. When both Ds and Dt are real data, this method achieved an average diagnostic accuracy of 98.02%; when Ds is simulated data and Dt is real data, this method achieved an average diagnostic accuracy of 83.83%, both of which are better than other TL methods. In addition, this method relaxes the requirement that Ds and Dt have the same label space, which is more in line with practical application scenarios.

Author Contributions

Conceptualization, M.S. and S.X.; software, Z.X.; validation, J.Z., J.R.; writing—original draft preparation, Z.X.; writing—review and editing, S.X. and J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This paper was funded by the following research projects: The young and middle-aged science and technology project of Ningde Normal University (Grant No. 2022ZQ102) and the Collaborative innovation center project of Ningde Normal University (Grant No. 2023ZX01).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Feng, Z.; Zhu, W.; Zhang, D. Time-Frequency demodulation analysis via Vold-Kalman filter for wind turbine planetary gearbox fault diagnosis under nonstationary speeds. Mech. Syst. Signal Process. 2019, 128, 93–109.
  2. He, Z.; Shao, H.; Cheng, J.; Zhao, X.; Yang, Y. Support tensor machine with dynamic penalty factors and its application to the fault diagnosis of rotating machinery with unbalanced data. Mech. Syst. Signal Process. 2020, 141, 106441.
  3. Kwak, J.; Lee, T.; Kim, C.O. An incremental clustering-based fault detection algorithm for class-imbalanced process data. IEEE Trans. Semicond. Manuf. 2015, 28, 318–328.
  4. Li, D.; Zhao, Y.; Zhao, Y. A dynamic-model-based fault diagnosis method for a wind turbine planetary gearbox using a deep learning network. Prot. Control Mod. Power Syst. 2022, 7, 22.
  5. Dong, Y.; Li, Y.; Zheng, H.; Wang, R.; Xu, M. A new dynamic model and transfer learning based intelligent fault diagnosis framework for rolling element bearings race faults: Solving the small sample problem. ISA Trans. 2022, 121, 327–348.
  6. Li, W.; Gu, S.; Zhang, X.; Chen, T. Transfer learning for process fault diagnosis: Knowledge transfer from simulation to physical processes. Comput. Chem. Eng. 2020, 139, 106904.
  7. Zhu, P.; Dong, S.; Pan, X.; Hu, X.; Zhu, S. A simulation-data-driven subdomain adaptation adversarial transfer learning network for rolling element bearing fault diagnosis. Meas. Sci. Technol. 2022, 33, 075101.
  8. Liu, C.; Gryllias, K. Simulation-driven domain adaptation for rolling element bearing fault diagnosis. IEEE Trans. Ind. Inform. 2021, 18, 5760–5770.
  9. Wang, Y.; Liu, Y.; Chow, T.W.; Gu, J.; Zhang, M. A Balanced Adversarial Domain Adaptation Method for Partial Transfer Intelligent Fault Diagnosis. IEEE Trans. Instrum. Meas. 2022, 71, 3526711.
  10. Li, X.; Zhang, W.; Ma, H.; Luo, Z.; Li, X. Partial transfer learning in machinery cross-domain fault diagnostics using class-weighted adversarial networks. Neural Netw. 2020, 129, 313–322.
  11. Sun, R.; Liu, X.; Liu, S.; Xiang, J. A game theory enhanced domain adaptation network for mechanical fault diagnosis. Meas. Sci. Technol. 2022, 33, 115501.
  12. Li, W.; Chen, Z.; He, G. A novel weighted adversarial transfer network for partial domain fault diagnosis of machinery. IEEE Trans. Ind. Inform. 2020, 17, 1753–1762.
  13. Jiao, J.; Zhao, M.; Lin, J.; Ding, C. Classifier inconsistency-based domain adaptation network for partial transfer intelligent diagnosis. IEEE Trans. Ind. Inform. 2019, 16, 5965–5974.
  14. Kuang, J.; Xu, G.; Tao, T.; Wu, Q.; Han, C.; Wei, F. Dual-weight Consistency-induced Partial Domain Adaptation Network for Intelligent Fault Diagnosis of Machinery. IEEE Trans. Instrum. Meas. 2022, 71, 3519612.
  15. Cao, Z.; Long, M.; Wang, J.; Jordan, M.I. Partial transfer learning with selective adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2724–2732.
  16. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  17. Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 2016, 17, 2096–2130.
  18. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
  19. Long, M.; Wang, J.; Ding, G.; Sun, J.; Yu, P.S. Transfer feature learning with joint distribution adaptation. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2200–2207.
  20. Song, M.-M.; Xiong, Z.-C.; Zhong, J.-H.; Xiao, S.-G.; Tang, Y.-H. Research on fault diagnosis method of planetary gearbox based on dynamic simulation and deep transfer learning. Sci. Rep. 2022, 12, 17023.
  21. Sun, B.; Saenko, K. Deep CORAL: Correlation alignment for deep domain adaptation. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 443–450.
  22. Tzeng, E.; Hoffman, J.; Zhang, N.; Saenko, K.; Darrell, T. Deep domain confusion: Maximizing for domain invariance. arXiv 2014, arXiv:1412.3474.
  23. Ganin, Y.; Lempitsky, V. Unsupervised domain adaptation by backpropagation. In Proceedings of the International conference on machine learning, Lille, France, 6–11 July 2015; pp. 1180–1189. [Google Scholar]
Figure 1. Schematic diagram of TL and PTL.
Figure 2. Schematic diagram of residual block.
Figure 3. DANN framework.
Figure 4. Weighted domain adversarial neural network framework.
Figure 5. DDS experimental platform.
Figure 6. Time domain diagrams and frequency domain diagrams of different simulation step sizes of rigid-flexible coupling model.
Figure 7. Comparison of simulation data and real data time-domain diagram.
Figure 8. Comparison of simulation data and real data frequency domain diagram.
Figure 9. Probability distribution curves of simulated data and real data.
Figure 10. Time-frequency diagram of simulation data for different health conditions.
Figure 11. Partial transfer diagnostic accuracy with real data in both Ds and Dt.
Figure 12. Partial transfer diagnostic accuracy with Ds being simulated data and Dt being real data.
Figure 13. Feature visualization of different methods in partial transfer task C1.
Table 1. Partial transfer tasks where both Ds and Dt are real data.
Task Name | Dt Conditions | Dt Health Conditions
C1 | 30 Hz, 0.8 A | BR, CR, MI
C2 | 20 Hz, 0 A | BR, CR
C3 | 20 Hz, 0.4 A | CR, NO
C4 | 40 Hz, 0.8 A | CR
Table 2. Partial transfer tasks where Ds is simulation data and Dt is real data.
Task Name | Dt Conditions | Dt Health Conditions
C5 | 30 Hz, 0 A | BR, CR, NO
C6 | 30 Hz, 0.8 A | BR, CR, MI
C7 | 20 Hz, 0 A | BR, CR
C8 | 20 Hz, 0.4 A | CR, NO
C9 | 40 Hz, 0.8 A | CR
Song, M.; Xiong, Z.; Zhong, J.; Xiao, S.; Ren, J. Fault Diagnosis of Planetary Gearbox Based on Dynamic Simulation and Partial Transfer Learning. Biomimetics 2023, 8, 361. https://doi.org/10.3390/biomimetics8040361