A Multi-Task Joint Learning Model Based on Transformer and Customized Gate Control for Predicting Remaining Useful Life and Health Status of Tools †
Abstract
1. Introduction
2. The Proposed Method
2.1. Feature Extraction and Selection
2.2. The Proposed TECGC Model
2.2.1. The Transformer Encoder
- (1) Input embedding layer: This layer transforms the input data into a high-dimensional vector representation, implemented as a learned projection of the input. Working in this high-dimensional representation space allows relationships and patterns within the data to be identified more readily.
- (2) Position encoding layer: Because the transformer encoder has no inherent mechanism for learning sequence order, a position encoding layer is introduced so that the model can distinguish the positional relationships among the elements of the input sequence. This layer adds a series of fixed position vectors to the embedded input.
- (3) Multi-head self-attention layer: This layer applies multiple scaled dot-product self-attention heads in parallel to capture the intricate correlations between features across different representation subspaces, which significantly enhances the feature representation ability of the model. Each head transforms the input data into a query matrix Qi, a key matrix Ki, and a value matrix Vi, then computes the similarity between Qi and Ki and uses it to weight Vi, generating an output representation of the sensor data.
- (4) Feed-forward neural network (FNN): The FNN is composed of two linear layers with a ReLU activation between them. Its primary purpose is to improve the nonlinear representation ability of the model, thereby enriching the overall feature space.
- (5) Residual connection and layer normalization: Residual connections play a crucial role in the transformer encoder. They allow the model to adopt a deep network structure and accelerate training, and they mitigate vanishing gradients by providing shortcuts for gradient flow during backpropagation. Layer normalization, another key technique in the transformer encoder, normalizes each sublayer's output, which enhances the robustness of the model and makes it more stable to variations in the input data.
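For reference, the standard formulations of these layers, as given by Vaswani et al., can be written as follows. The notation here is generic and may differ from the paper's own symbols:

```latex
% Standard transformer-encoder equations (Vaswani et al., 2017).
% Notation is generic and may differ from the paper's own symbols.
\begin{align}
PE_{(pos,\,2i)}   &= \sin\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)\\
PE_{(pos,\,2i+1)} &= \cos\!\left(pos / 10000^{2i/d_{\mathrm{model}}}\right)\\
\mathrm{Attention}(Q_i, K_i, V_i) &= \mathrm{softmax}\!\left(\frac{Q_i K_i^{\top}}{\sqrt{d_k}}\right) V_i\\
\mathrm{FFN}(x) &= \max(0,\; xW_1 + b_1)\,W_2 + b_2\\
y &= \mathrm{LayerNorm}\!\left(x + \mathrm{Sublayer}(x)\right)
\end{align}
```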
2.2.2. Customized Gate Control (CGC)
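CGC, introduced by Tang et al. in the PLE model, routes each task through its own group of experts plus a group of shared experts, fused per task by a softmax gate (the hyperparameter table in Section 2.4 lists four experts per group). A minimal NumPy sketch of this routing, with illustrative names and shapes rather than the paper's implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cgc_layer(x, task_experts, shared_experts, gate_weights):
    """One customized-gate-control layer (illustrative sketch).

    x: (batch, d_in) input features.
    task_experts: one list of expert weight matrices (d_in, d_out) per task.
    shared_experts: expert weight matrices visible to every task.
    gate_weights: per-task gating matrices (d_in, n_task_experts + n_shared).
    Returns one fused representation per task.
    """
    outputs = []
    for t, experts in enumerate(task_experts):
        # Each task sees its own experts plus the shared experts.
        expert_outs = [np.maximum(0.0, x @ W) for W in experts + shared_experts]
        stacked = np.stack(expert_outs, axis=1)      # (batch, n_experts, d_out)
        gate = softmax(x @ gate_weights[t])          # (batch, n_experts)
        # Gate-weighted sum over experts yields the task-specific output.
        outputs.append(np.einsum("be,bed->bd", gate, stacked))
    return outputs
```

Each task-specific tower (Tower-reg, Tower-cla) would then consume the corresponding fused representation.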
2.3. Multi-Task Dynamic Adaptive Loss Function
Algorithm 1: Model training process using the multi-task dynamic adaptive loss function
Preparation:
    Input: training samples x
    Initial learning rate: γ
Training:
    for j = 1, 2, …, N do:
        if j = 1:
            Input x into the model
            Output: the predicted RUL and health status
            Calculate the regression and classification losses by Equations (16) and (17)
            continue
        else:
            Input x into the model
            Output: the predicted RUL and health status
            Calculate αj and βj by Equations (19) and (20)
            Calculate the regression and classification losses by Equations (16) and (17)
            Calculate the total loss by Equation (18)
            Update the model parameters W, b using the Adam optimizer
    end for
Return: the trained model parameters W, b
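The exact forms of Equations (18)–(20) are not reproduced above. A common scheme with the same structure, dynamic weight averaging (Liu et al., 2019), derives each task weight from the ratio of its current loss to its previous loss, so that the task whose loss is decreasing more slowly receives a larger weight. The sketch below is in that spirit and is a hypothetical stand-in, not the paper's equations:

```python
import numpy as np

def adaptive_task_weights(prev_reg_loss, prev_cla_loss,
                          reg_loss, cla_loss, temperature=2.0):
    """Hypothetical dynamic task weighting in the spirit of Equations
    (19)-(20): the task whose loss decreases more slowly gets more weight
    (dynamic weight averaging; not the paper's exact formulation)."""
    # Relative descent rate of each task's loss over one iteration.
    r = np.array([reg_loss / prev_reg_loss, cla_loss / prev_cla_loss])
    # Softmax over descent rates, scaled so the weights sum to the task count.
    w = 2.0 * np.exp(r / temperature) / np.exp(r / temperature).sum()
    alpha, beta = w
    return alpha, beta

def total_loss(alpha, beta, reg_loss, cla_loss):
    """Weighted sum of the two task losses, as in a total loss like Eq. (18)."""
    return alpha * reg_loss + beta * cla_loss
```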
2.4. The Hyperparameter Information of the Proposed TECGC Model
3. Data Description
4. Experimental Study and Discussion
4.1. Evaluation Index
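The result tables below report RMSE for the RUL regression task, a Score index, and accuracy for the health-status classification task. The paper's Score definition is not reproduced here; RMSE and accuracy follow their standard definitions, sketched below with illustrative helper names:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error for the RUL regression task."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def accuracy(y_true, y_pred):
    """Classification accuracy for the health-status task (labels 0/1/2)."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```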
4.2. Prediction Performance Analysis
4.2.1. Prediction Performance Evaluation of Multi-Task Learning Model
4.2.2. Effectiveness Analysis of Model Structure
4.2.3. Effectiveness Analysis of Dynamic Adaptive Loss Function
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhou, J.; Zhao, X.; Gao, J. Tool remaining useful life prediction method based on LSTM under variable working conditions. Int. J. Adv. Manuf. Technol. 2019, 104, 4715–4726. [Google Scholar] [CrossRef]
- Gao, J.; Heng, F.; Yuan, Y.; Liu, Y. A novel machine learning method for multiaxial fatigue life prediction: Improved adaptive neuro-fuzzy inference system. Int. J. Fatigue 2024, 178, 108007. [Google Scholar] [CrossRef]
- Li, X.; Zhang, W.; Ding, Q. Deep learning-based remaining useful life estimation of bearings using multi-scale feature extraction. Reliab. Eng. Syst. Saf. 2019, 182, 208–218. [Google Scholar] [CrossRef]
- Xue, B.; Xu, H.; Huang, X.; Zhu, K.; Xu, Z.; Pei, H. Similarity-based prediction method for machinery remaining useful life: A review. Int. J. Adv. Manuf. Technol. 2022, 121, 1501–1531. [Google Scholar] [CrossRef]
- Wang, C.; Lu, N.; Cheng, Y.; Jiang, B. A data-driven aero-engine degradation prognostic strategy. IEEE Trans. Cybern. 2021, 51, 1531–1541. [Google Scholar] [CrossRef]
- Sun, H.; Cao, D.; Zhao, Z.; Kang, X. A hybrid approach to cutting tool remaining useful life prediction based on the wiener process. IEEE Trans. Reliab. 2018, 67, 1294–1303. [Google Scholar] [CrossRef]
- Liu, T.; Zhu, K.; Zeng, L. Diagnosis and prognosis of degradation process via hidden semi-markov model. IEEE-ASME Trans. Mechatron. 2018, 23, 1456–1466. [Google Scholar] [CrossRef]
- Zhu, K.; Liu, T. Online tool wear monitoring via hidden semi-markov model with dependent durations. IEEE Trans. Ind. Inform. 2018, 14, 69–78. [Google Scholar] [CrossRef]
- Sun, H.; Pan, J.; Zhang, J.; Cao, D. Non-linear Wiener process-based cutting tool remaining useful life prediction considering measurement variability. Int. J. Adv. Manuf. Technol. 2020, 107, 4493–4502. [Google Scholar] [CrossRef]
- Gokulachandran, J.; Mohandas, K. Comparative study of two soft computing techniques for the prediction of remaining useful life of cutting tools. J. Intell. Manuf. 2015, 26, 255–268. [Google Scholar] [CrossRef]
- Siddique, M.F.; Ahmad, Z.; Kim, J.M. Pipeline leak diagnosis based on leak-augmented scalograms and deep learning. IEEE Trans. Ind. Inform. 2023, 17, 2225577. [Google Scholar] [CrossRef]
- Zhao, C.; Huang, X.; Liu, H.; Gao, T.; Shi, J. A novel bootstrap ensemble learning convolutional simple recurrent unit method for remaining useful life interval prediction of turbofan engines. Meas. Sci. Technol. 2022, 33, 125004. [Google Scholar] [CrossRef]
- Peng, K.; Jiao, R.; Dong, J.; Pi, Y. A deep belief network based health indicator construction and remaining useful life prediction using improved particle filter. Neurocomputing 2019, 361, 19–28. [Google Scholar] [CrossRef]
- Zhang, C.; Lim, P.; Qin, A.K.; Tan, K.C. Multiobjective Deep Belief Networks Ensemble for Remaining Useful Life Estimation in Prognostics. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2306–2318. [Google Scholar] [CrossRef] [PubMed]
- Chang, T.-J.; Cheng, S.J.; Hsu, C.-H.; Miao, J.M.; Chen, S.-F. Prognostics for remaining useful life estimation in proton exchange membrane fuel cell by dynamic recurrent neural networks. Energy Rep. 2022, 8, 9441–9452. [Google Scholar] [CrossRef]
- Liu, C.; Zhang, Y.; Sun, J.; Cui, Z.; Wang, K. Stacked bidirectional LSTM RNN to evaluate the remaining useful life of supercapacitor. Int. J. Energy Res. 2022, 46, 3034–3043. [Google Scholar] [CrossRef]
- Elsheikh, A.; Yacout, S.; Ouali, M.-S. Bidirectional handshaking LSTM for remaining useful life prediction. Neurocomputing 2019, 323, 148–156. [Google Scholar] [CrossRef]
- Li, X.; Li, J.; Zuo, L.; Zhu, L.; Shen, H.T. Domain adaptive remaining useful life prediction with Transformer. IEEE Trans. Instrum. Meas. 2022, 71, 3521213. [Google Scholar] [CrossRef]
- Wu, J.-Y.; Wu, M.; Chen, Z.; Li, X.-L.; Yan, R. Degradation-aware remaining useful life prediction with LSTM autoencoder. IEEE Trans. Instrum. Meas. 2021, 70, 3511810. [Google Scholar] [CrossRef]
- Xu, X.; Li, X.; Ming, W.; Chen, M. A novel multi-scale CNN and attention mechanism method with multi-sensor signal for remaining useful life prediction. Comput. Ind. Eng. 2022, 169, 108204. [Google Scholar] [CrossRef]
- Yao, J.; Lu, B.; Zhang, J. Tool remaining useful life prediction using deep transfer reinforcement learning based on long short-term memory networks. Int. J. Adv. Manuf. Technol. 2022, 118, 1077–1086. [Google Scholar] [CrossRef]
- Cheng, Y.; Hu, K.; Wu, J.; Zhu, H.; Shao, X. Autoencoder quasi-recurrent neural networks for remaining useful life prediction of engineering systems. IEEE-ASME Trans. Mechatron. 2022, 27, 1081–1092. [Google Scholar] [CrossRef]
- Cao, Y.; Ding, Y.; Jia, M.; Tian, R. A novel temporal convolutional network with residual self-attention mechanism for remaining useful life prediction of rolling bearings. Reliab. Eng. Syst. Saf. 2021, 215, 107813. [Google Scholar] [CrossRef]
- Fang, X.; Gong, G.; Li, G.; Chun, L.; Li, W.; Peng, P. A hybrid deep transfer learning strategy for short term cross-building energy prediction. Energy 2021, 215, 119208. [Google Scholar] [CrossRef]
- Gao, K.; Xu, X.; Jiao, S. Measurement and prediction of wear volume of the tool in nonlinear degradation process based on multi-sensor information fusion. Eng. Fail. Anal. 2022, 136, 106164. [Google Scholar] [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Ma, J.; Zhao, Z.; Yi, X.; Chen, J.; Hong, L.; Chi, E.H. Modeling task relationships in multi-task learning with multi-gate mixture-of-experts. In Proceedings of the 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, London, UK, 19–23 August 2018; pp. 1930–1939. [Google Scholar]
- Tang, H.; Liu, J.; Zhao, M.; Gong, X. Progressive layered extraction (PLE): A novel multi-task learning (MTL) model for personalized recommendations. In Proceedings of the 14th ACM Conference on Recommender Systems, Virtual Event, 22–26 September 2020; pp. 269–278. [Google Scholar]
- Li, X.; Lim, B.S.; Zhou, J.H.; Huang, S.; Er, M.J. Fuzzy neural network modelling for tool wear estimation in dry milling operation. In Proceedings of the Annual Conference of the PHM Society, San Diego, CA, USA, 27 September 2009. [Google Scholar]
- Zhang, Y.; Zhu, K.; Duan, X.; Li, S. Tool wear estimation and life prognostics in milling: Model extension and generalization. Mech. Syst. Signal Process. 2021, 155, 107617. [Google Scholar] [CrossRef]
- Sun, M.; Guo, K.; Zhang, D.; Yang, B.; Sun, J.; Li, D.; Huang, T. A novel exponential model for tool remaining useful life prediction. J. Manuf. Syst. 2024, 73, 223–240. [Google Scholar] [CrossRef]
- Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Hierarchical attention graph convolutional network to fuse multi-sensor signals for remaining useful life prediction. Reliab. Eng. Syst. Saf. 2021, 215, 107878. [Google Scholar] [CrossRef]
- Hou, G.; Xu, S.; Zhou, N.; Yang, L.; Fu, Q. Remaining useful life estimation using deep convolutional generative adversarial networks based on an autoencoder scheme. Comput. Intell. Neurosci. 2020, 2020, 9601389. [Google Scholar] [CrossRef] [PubMed]
- Zhao, C.; Huang, X.; Li, Y.; Li, S. A novel remaining useful life prediction method based on gated attention mechanism capsule neural network. Measurement 2022, 189, 110637. [Google Scholar] [CrossRef]
| Domain | Features |
|---|---|
| Time domain | Root mean square, variance, maximum, peak-to-peak, skewness, kurtosis |
| Frequency domain | Spectral skewness, spectral kurtosis, spectral power |
| Time–frequency domain | Wavelet energy |
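The defining equations for these features are not reproduced in the table. The time-domain features have standard statistical definitions, which can be sketched as follows (helper name is illustrative):

```python
import numpy as np

def time_domain_features(x):
    """Standard definitions of the time-domain features listed in the table
    (the paper's exact equations are not reproduced here)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return {
        "rms": float(np.sqrt(np.mean(x ** 2))),
        "variance": float(x.var()),
        "maximum": float(x.max()),
        "peak_to_peak": float(x.max() - x.min()),
        "skewness": float(np.mean(((x - mu) / sigma) ** 3)),
        "kurtosis": float(np.mean(((x - mu) / sigma) ** 4)),
    }
```

The frequency-domain features would be computed analogously from the signal's power spectrum, and the wavelet energy from a wavelet decomposition of the signal.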
| Module | Parameter | Value |
|---|---|---|
| Transformer encoder | Number of encoders | 1 |
| Transformer encoder | Number of heads | 4 |
| Transformer encoder | da | 32 |
| CGC | Units | 64 |
| CGC | Number of experts in expert-task 1 | 4 |
| CGC | Number of experts in expert-task 2 | 4 |
| CGC | Number of experts in expert-shared | 4 |
| Tower-reg | Number of neurons | 1 |
| Tower-cla | Number of neurons | 3 |
| Parameter | Value |
|---|---|
| Spindle speed | 10,400 rpm |
| Tool feed rate | 1555 mm/min |
| Workpiece material | HRC52 stainless steel |
| Tool material | High-speed steel |
| Sampling frequency | 50 kHz |
| Cutting depth of the tool in the y-direction | 0.125 mm |
| Cutting depth of the tool in the z-direction | 0.2 mm |
| Tool | Initial Wear Stage | Steady-State Region | Accelerated Wear Zone |
|---|---|---|---|
| C1 | [1, 79] | [80, 247] | [248, 315] |
| C4 | [1, 37] | [38, 218] | [219, 315] |
| C6 | [1, 23] | [24, 192] | [193, 315] |
| Label | 0 | 1 | 2 |
| Task | Training Dataset | Test Dataset |
|---|---|---|
| Task 1 | C1 + C4 | C6 |
| Task 2 | C1 + C6 | C4 |
| Task 3 | C4 + C6 | C1 |
| Model | C1 RMSE | C1 Score | C4 RMSE | C4 Score | C6 RMSE | C6 Score |
|---|---|---|---|---|---|---|
| Paris model [31] | 16.37 | - | 27.84 | - | 25.02 | - |
| Original exponential model [31] | 14.46 | - | 29.73 | - | 24.1 | - |
| HAGCN [32] | 23.2 | 2278.3 | 11.2 | 589.6 | 15.6 | 1162.8 |
| CNN-LSTM [33] | 44.7 | 46,929 | 15.1 | 829 | 31.7 | 19,966 |
| GAM-CapsNet [34] | 4.59 | 148 | 7.44 | 221 | 5.99 | 217 |
| TECGC | 4.09 | 126 | 4.31 | 113 | 3.20 | 80 |
| Model | C1 RMSE | C1 Score | C1 Accuracy | C4 RMSE | C4 Score | C4 Accuracy | C6 RMSE | C6 Score | C6 Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| LSTMCGC | 6.84 | 225 | 0.872 | 4.70 | 154 | 0.950 | 4.66 | 105.10 | 0.908 |
| GRUCGC | 7.897 | 258.5 | 0.908 | 4.49 | 136 | 0.964 | 4.318 | 102.23 | 0.913 |
| CNNCGC | 8.25 | 303 | 0.890 | 4.42 | 123 | 0.944 | 4.10 | 101.56 | 0.890 |
| TECGC | 4.09 | 126 | 0.953 | 4.31 | 113 | 0.967 | 3.20 | 80 | 0.979 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Hou, C.; Zheng, L. A Multi-Task Joint Learning Model Based on Transformer and Customized Gate Control for Predicting Remaining Useful Life and Health Status of Tools. Sensors 2024, 24, 4117. https://doi.org/10.3390/s24134117