Fault Early Warning Model for High-Speed Railway Train Based on Feature Contribution and Causal Inference
Abstract
1. Introduction
- We combine causal inference methods from the field of economics with machine learning methods from the AI field and use this combination to solve the problem of early warning for key equipment failures in high-speed railway trains.
- By introducing the time series data of features with significant contributions and causal relationships, the prediction accuracy of classic machine learning models is improved and the computational complexity is reduced.
- We propose a set of process methods for applying causal inference so that product experts can analyze the causes of product failures.
- Based on the real scenario of high-speed railway field operation, we propose a set of methods in which product experts and data experts contribute their respective skills and cooperate to solve practical problems.
2. Methods
2.1. Data Preparation
2.1.1. Sensor Data Collection
2.1.2. Positive and Negative Sample Balance Processing
2.1.3. Missing Value Handling
2.1.4. Abnormal Sample Removal
2.1.5. Normalization Processing
2.1.6. Selection by Prior Knowledge
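The preparation steps above (Sections 2.1.2 and 2.1.5 in particular) can be sketched as follows. This is an illustrative, stdlib-only sketch rather than the authors' exact pipeline; the helper names and the random-oversampling strategy are my own assumptions.

```python
import random

def min_max_normalize(values):
    """Scale one numeric feature to [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def oversample_minority(samples, labels, seed=0):
    """Randomly oversample the minority class until positives and negatives balance."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    balanced = samples + extra
    balanced_labels = labels + [1 if minority is pos else 0] * len(extra)
    return balanced, balanced_labels
```

In a fault early-warning setting the fault (positive) samples are typically the scarce class, so oversampling them keeps the classifier from collapsing to the majority "no fault" prediction.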
2.2. Feature Contribution Calculation
Algorithm 1: SHAP Tree Model Shapley Value Calculation Pseudo-Code
Inputs: x, S, tree = {v, a, b, t, r, d}
Process: evaluate G(1, 1), where G(j, w) is defined as:
1: if node j does not belong to the internal nodes (i.e., j is a leaf), then
2:     return w · v_j
3: else
4:     if d_j ∈ S, then
5:         if x_{d_j} ≤ t_j, then
6:             return G(a_j, w)
7:         else return G(b_j, w)
8:         endif
9:     else return G(a_j, w · r_{a_j} / r_j) + G(b_j, w · r_{b_j} / r_j)
10:    endif
11: endif
Output: G(1, 1), the conditional expectation of the tree output given the feature subset S
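A minimal Python rendering of this EXPVALUE-style recursion (following Lundberg et al.'s tree SHAP conditional expectation; the node layout and variable names below are my own illustration, not the paper's implementation):

```python
def expvalue(x, S, tree):
    """E[f(x) | x_S]: expected tree output when only the features in S are known.

    tree: v = leaf values (None marks internal nodes), a/b = left/right children,
          t = split thresholds, r = node cover (training-sample counts),
          d = index of the feature each internal node splits on.
    """
    v, a, b, t, r, d = (tree[k] for k in "vabtrd")

    def g(j, w):
        if v[j] is not None:                       # leaf: return weighted value
            return w * v[j]
        if d[j] in S:                              # known feature: follow the split
            return g(a[j] if x[d[j]] <= t[j] else b[j], w)
        # unknown feature: average both branches, weighted by cover proportion
        return g(a[j], w * r[a[j]] / r[j]) + g(b[j], w * r[b[j]] / r[j])

    return g(0, 1.0)

# One split on feature 0 at threshold 5; leaves worth 10 and 20.
tree = {"v": [None, 10.0, 20.0], "a": [1, -1, -1], "b": [2, -1, -1],
        "t": [5.0, 0, 0], "r": [100, 75, 25], "d": [0, -1, -1]}
x = [3.0]
full = expvalue(x, {0}, tree)    # feature known: follows the left branch -> 10.0
none = expvalue(x, set(), tree)  # feature hidden: 0.75*10 + 0.25*20 = 12.5
```

With a single feature, its Shapley value is just the difference of the two conditional expectations (here 10.0 − 12.5 = −2.5); with many features, the SHAP library averages such differences over feature subsets.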
2.3. Causal Inference
2.3.1. Causal Graph Construction
- Chain structure: A → B → C (A influences C only through the mediator B).
- Fork structure: A ← B → C (B is a common cause of A and C).
- Collider structure: A → B ← C (B is a common effect of A and C; conditioning on B induces a spurious dependence between A and C).
2.3.2. Identification
- (1) In some studies, even if some variables in the DAG are unobservable, we can still estimate some causal effects from the observed data.
- (2) These two criteria help us to identify “confounding variables” and design observational studies.
- Backdoor adjustment formula: P(Y | do(T = t)) = Σ_z P(Y | T = t, Z = z) P(Z = z)
- Front-door criterion: a set of variables Z satisfies the front-door criterion relative to (T, Y) if:
- Z intercepts all directed paths from T to Y;
- there is no unblocked backdoor path from T to Z;
- all backdoor paths from Z to Y are blocked by T.
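The backdoor adjustment can be computed directly from stratified frequencies. A minimal sketch with made-up (z, t, y) records, where Z confounds T and Y (the data and names are purely illustrative):

```python
def backdoor_adjusted(records, t_val):
    """P(Y=1 | do(T=t_val)) = sum_z P(Y=1 | T=t_val, Z=z) * P(Z=z)."""
    n = len(records)
    result = 0.0
    for z in {rec[0] for rec in records}:
        in_z = [rec for rec in records if rec[0] == z]          # stratum Z = z
        treated = [y for zz, t, y in in_z if t == t_val]
        if not treated:
            continue  # stratum has no samples at this treatment level
        result += (sum(treated) / len(treated)) * (len(in_z) / n)
    return result

# Hypothetical observational data as (z, t, y) tuples.
records = (
    [(0, 1, 1)] * 3 + [(0, 1, 0)] * 1 + [(0, 0, 1)] * 2 + [(0, 0, 0)] * 2 +
    [(1, 1, 1)] * 1 + [(1, 1, 0)] * 3 + [(1, 0, 1)] * 1 + [(1, 0, 0)] * 3
)
effect = backdoor_adjusted(records, 1) - backdoor_adjusted(records, 0)
```

Here `effect` is the average treatment effect (ATE) after adjusting for the confounder Z, which is what the estimation step in Section 2.3.3 produces at scale.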
2.3.3. Estimate
2.3.4. Refute
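The estimate-then-refute loop can be sketched with the placebo-treatment idea: replace the real treatment with a shuffled copy and check that the estimated effect collapses toward zero (DoWhy's placebo treatment refuter works on this principle; the simulation below is a stdlib-only illustration with made-up data):

```python
import random

def ate(treatment, outcome):
    """Naive ATE estimate: E[Y | T=1] - E[Y | T=0]."""
    y1 = [y for t, y in zip(treatment, outcome) if t == 1]
    y0 = [y for t, y in zip(treatment, outcome) if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

rng = random.Random(7)
n = 4000
T = [rng.randint(0, 1) for _ in range(n)]
Y = [0.5 * t + rng.gauss(0, 1) for t in T]   # true causal effect = 0.5

estimate = ate(T, Y)                         # close to 0.5

# Refute: a placebo treatment (shuffled T) should show no effect.
T_placebo = T[:]
rng.shuffle(T_placebo)
placebo = ate(T_placebo, Y)                  # close to 0
```

This mirrors the refutation columns of Table 3: adding a random common cause leaves the ATE essentially unchanged, while the placebo treatment drives it to roughly zero, supporting the estimated causal relationships.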
2.4. Model Training and Prediction
3. Results
3.1. Dataset Description
3.2. Feature Contribution Results
3.3. Causal Inference Results
3.4. Model Prediction Results
- (1) Compared with the classical models on the original inputs, the causal inference method proposed in this paper improves the failure prediction of each model by an average of 10% in accuracy, 21% in precision, 23% in recall, and 24% in F1 score.
- (2) Comparing the “Input time series” method with the “Causal inference” method proposed in this paper, introducing full-feature time series data raises the feature dimensionality to roughly 270% of that of the proposed method, and its computation time is 35% longer.
- (3) Applying the full time series data increases the computational complexity, yet its failure prediction performance is equal or inferior to that of the method proposed in this paper.
- (4) The causal inference method proposed in this paper achieves these results across various machine learning models and shows good generalization ability and robustness.
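The accuracy, precision, recall, and F1 figures reported above follow the standard confusion-matrix definitions, which can be stated compactly (a generic sketch, not the authors' evaluation code):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary fault labels (1 = fault)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1
```

For imbalanced fault data, precision, recall, and F1 are the more informative columns of Table 4: accuracy alone can be high even for a model that rarely flags faults.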
4. Discussion
4.1. Discussion of the Model’s Validity
- As described in Section 2.1, during the data preparation, product experts conduct preliminary feature screening based on prior knowledge, which greatly reduces the data dimension and the complexity of the subsequent model calculations.
- As described in Section 2.2, during the calculation of the feature contribution degree, the data experts calculate the correlation between each feature and the fault using the SHAP model. The feature dimension is further reduced to 26, and the visual SHAP value contribution graph of the correlation ranking of each feature is provided.
- As described in Section 2.3, in the causal inference stage, the product experts draw causal inference diagrams based on prior knowledge. Through the steps of identification, estimation, and refutation, the feature causal relationship decision recommendation is obtained according to the data. The product experts then re-examine and modify the causal inference graph based on prior knowledge and, finally, obtain eight important features with causal relationships.
- In the model training and prediction stage, as described in Section 2.4, the diagnostic and prediction capabilities of the model are improved by introducing characteristic time series data with high contributions and causal relationships. At the same time, feature data with low contributions and no causal relationships are removed to avoid overfitting and a loss of model robustness.
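Introducing the time series of a high-contribution feature amounts to adding lagged copies of it as extra input columns. A hypothetical sketch (the window size and function name are my own assumptions, not the paper's configuration):

```python
def add_lag_features(series, lags=(1, 2, 3)):
    """Build lagged copies of one feature's time series as extra model inputs.

    Returns rows [x_t, x_{t-1}, ..., x_{t-k}], starting at t = max(lags)
    so that every row has all of its lags available.
    """
    start = max(lags)
    return [[series[t]] + [series[t - k] for k in lags]
            for t in range(start, len(series))]
```

As noted in Section 4.3, a limitation of this framing is that the model treats each lagged column as an independent feature; the temporal relationship between lags of the same signal is not modeled explicitly.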
4.2. Application Scenarios
- Diagnostic warning based on the model’s robustness and accuracy:
- Model Visualization-Based Product Optimization:
4.3. Disadvantages and Prospects
- Disadvantages
- (1) In field industrial applications, failures usually have both superficial causes and root causes. The causal inference model proposed in this paper does not provide feature causal chain decision support; product experts may identify the superficial cause of a failure but fail to identify its root cause.
- (2) Selecting features with high contributions and introducing their time series data improves the model’s performance. However, the newly introduced data are treated by the training algorithm as just another feature; data from different time steps of the same feature are not related within the model.
- Prospects
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Jun, W. Development and Practice of PHM Oriented High-Speed Train Pedigree Product Technology Platform. China Railw. Sci. 2021, 42, 80–86.
- McCall, B. COVID-19 and artificial intelligence: Protecting health-care workers and curbing the spread. Lancet Digit. Health 2020, 2, e166–e167.
- Ge, W.; Patino, J.; Todisco, M. Explaining deep learning models for spoofing and deepfake detection with SHapley Additive exPlanations. In Proceedings of the ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 6387–6391.
- Mahjoub, S.; Chrifi-Alaoui, L.; Marhic, B.; Delahoche, L. Predicting Energy Consumption Using LSTM, Multi-Layer GRU and Drop-GRU Neural Networks. Sensors 2022, 22, 4062.
- Zhou, H.; Zhang, S.; Peng, J. Informer: Beyond efficient transformer for long sequence time-series forecasting. In Proceedings of the AAAI, Vancouver, BC, Canada, 28 February–1 March 2022; pp. 11106–11115.
- Suawa, P.; Meisel, T.; Jongmanns, M.; Huebner, M.; Reichenbach, M. Modeling and Fault Detection of Brushless Direct Current Motor by Deep Learning Sensor Data Fusion. Sensors 2022, 22, 3516.
- Beard, R.V. Failure Accommodation in Linear Systems Through Self-Reorganization. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1971.
- Lin, J.; Zuo, M.J. Gearbox fault diagnosis using adaptive wavelet filter. Mech. Syst. Signal Process. 2003, 17, 1259–1269.
- Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
- Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Dunwen, G. Recognition Method and Real-Time Online Monitoring Device for Lateral Stability of High Speed Train Based on Bayesian Clustering. China Railw. Sci. 2016, 37, 139–144.
- Ge, X.; Qitian, Z.; Zhe, L. Incipient Fault Autonomous Identification Method of Train Axle Box Bearing Based on Ginigram and CHMR. China Railw. Sci. 2022, 43, 104–114.
- Dai, P.; Wang, S.; Du, X. Image Recognition Method for the Fastener Defect of Ballastless Track Based on Semi-Supervised Deep Learning. China Railw. Sci. 2018, 39, 43–49.
- Lulu, X. Fault Diagnosis Method of Vehicle Vertical Damper Based on IMM. China Railw. Sci. 2018, 39, 119–125.
- Cisuo, S.; Jun, L.; Yong, Q. Intelligent Detection Method for Rail Flaw Based on Deep Learning. China Railw. Sci. 2018, 39, 51–57.
- Pearl, J. Causality; Cambridge University Press: Cambridge, UK, 2009; pp. 1–61.
- Sharma, A.; Kiciman, E. DoWhy: An end-to-end library for causal inference. arXiv 2020, arXiv:2011.04216. Available online: https://arxiv.org/abs/2011.04216 (accessed on 9 November 2020).
- Blöbaum, P.; Götz, P.; Budhathoki, K. DoWhy-GCM: An extension of DoWhy for causal inference in graphical causal models. arXiv 2022, arXiv:2206.06821. Available online: https://arxiv.org/abs/2206.06821 (accessed on 14 June 2022).
- Shouling, J.; Jinfeng, L.; Tianyu, D.; Bo, L. Survey on Techniques, Applications and Security of Machine Learning Interpretability. J. Comput. Res. Dev. 2019, 56, 2071–2096.
- Devlin, J.; Chang, M.-W.; Lee, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. Available online: https://arxiv.org/abs/1810.04805 (accessed on 24 May 2019).
- Brown, T.; Mann, B.; Ryder, N. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901.
- Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. Adv. Neural Inf. Process. Syst. 2017, 30, 4765–4774.
- Shapley, L.S. A Value for N-Person Games. Available online: https://www.rand.org/pubs/papers/P295.html (accessed on 12 November 2022).
Data Type | Data Category | Units/Format | Data Example
---|---|---|---
Sensor data | Speed | km/h | Train speed, shaft speed, tug shaft speed, etc.
 | Temperature | °C | Outdoor temperature, traction motor stator temperature, traction converter coolant temperature, etc.
 | Pressure | Pa | Coolant inlet/outlet pressure, air spring pressure, etc.
 | Voltage | V | Intermediate voltage value, input voltage, catenary voltage, etc.
 | Current | A | Auxiliary converter output current, transformer primary current peak value.
 | Power | W | APS input power, traction converter inverter AC side power, etc.
 | Weight | kg | Actual mass of the train, etc.
Status data | Time | YYYY-MM-DD hh:mm:ss | Sample sampling time.
 | Operating state | 1/0 | Train maneuver command status.
 | Diagnostic status | # | Train life signal, fault code, etc.
Operational data | Train information | # | Line, train number, car number, etc.
Early warning data | Judgment results | 1/0 | Fault occurrence, fault removal, etc.
Type | Characteristic | Mathematical Meaning | Average Representation | Example | Preprocessing Method
---|---|---|---|---|---
Nominal (classification) | Mutually exclusive, separable classes | =, ≠ | Mode | Car model | One-hot encoding
Ordinal (ranking) | Rank order | >, < | Median | Level 1–Level 3 | One-hot encoding
Interval (distance) | Continuous values; no true zero, so ratios are not meaningful | +, − | Arithmetic mean | Degrees Celsius | Normalization
Ratio (fixed ratio) | True zero point; ratios are meaningful | +, −, ×, ÷ | Geometric mean | Voltage and current | Normalization
Feature | Local Mutation | Global Variance | Max SHAP Value | ATE | Random Common Causes ATE | Placebo Treatment ATE |
---|---|---|---|---|---|---
Intermediate voltage | 3637 | 1015 | 0.115 | 0.46 | 0.46 | 0.02 |
APS input power | 98 | 32 | 0.096 | 0.74 | 0.74 | 0 |
Input voltage | 384 | 154 | 0.069 | 0.67 | 0.68 | 0.01 |
Input voltage of auxiliary converter | 3246 | 1254 | 0.078 | 0.36 | 0.37 | 0 |
Output current of auxiliary converter | 98.5 | 50 | 0.082 | 0.08 | 0.08 | 0.01 |
Output voltage of auxiliary converter | 380.6 | 145 | 0.076 | 0.93 | 0.92 | 0 |
Transformer ground current peak | 188 | 73 | 0.054 | 0.52 | 0.52 | 0.01 |
Transformer primary current peak | 189 | 73 | 0.071 | 0.52 | 0.52 | 0.01 |
Method | Model | Accuracy | Precision | Recall | F1 | Dimension | Computing Time
---|---|---|---|---|---|---|---
Original | KNN | 0.78 | 0.61 | 0.40 | 0.44 | 26 | 5.171
 | XGBoost | 0.73 | 0.58 | 0.51 | 0.43 | |
 | Random Forest | 0.82 | 0.57 | 0.40 | 0.43 | |
 | GBDT | 0.73 | 0.53 | 0.57 | 0.48 | |
 | Naïve Bayes | 0.84 | 0.72 | 0.39 | 0.49 | |
 | Logistic Regression | 0.84 | 0.68 | 0.61 | 0.63 | |
Input time series | KNN | 0.86 | 0.70 | 0.69 | 0.67 | 52 | 6.008
 | XGBoost | 0.92 | 0.88 | 0.74 | 0.76 | |
 | Random Forest | 0.87 | 0.85 | 0.58 | 0.63 | |
 | GBDT | 0.81 | 0.69 | 0.57 | 0.59 | |
 | Naïve Bayes | 0.83 | 0.64 | 0.42 | 0.48 | |
 | Logistic Regression | 0.87 | 0.73 | 0.71 | 0.71 | |
Causal inference | KNN | 0.91 | 0.85 | 0.76 | 0.79 | 19 | 4.444
 | XGBoost | 0.91 | 0.88 | 0.75 | 0.76 | |
 | Random Forest | 0.89 | 0.86 | 0.68 | 0.70 | |
 | GBDT | 0.87 | 0.77 | 0.58 | 0.62 | |
 | Naïve Bayes | 0.87 | 0.71 | 0.72 | 0.70 | |
 | Logistic Regression | 0.91 | 0.86 | 0.76 | 0.79 | |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Liu, D.; Qin, Y.; Zhao, Y.; Yang, W.; Hu, H.; Yang, N.; Liu, B. Fault Early Warning Model for High-Speed Railway Train Based on Feature Contribution and Causal Inference. Sensors 2022, 22, 9184. https://doi.org/10.3390/s22239184