Multi-Feature Fusion YOLO Approach for Fault Detection and Location of Train Running Section
Abstract
1. Introduction
- Marker-based Fastener Monitoring: Applying standardized markings to critical components (e.g., fasteners) according to assembly specifications, enabling systematic visual tracking of displacement anomalies.
- Multi-scale Feature Fusion: Utilizing YOLO-v3’s hierarchical feature pyramid network (FPN) [9] to detect defects across varying scales, from micro-cracks to macro-looseness, ensuring robustness against lighting variations and partial occlusions.
- Non-contact Image Analysis: Eliminating physical contact with inspected objects, thereby reducing damage risks and operational costs while maintaining high detection precision.
2. Related Work
2.1. Object Detection
2.2. YOLOs
3. Method
3.1. The Overall Architecture of YOLO-v3
- Image input: The input image is resized to the network resolution and augmented (random cropping, rotation, flipping, etc.) to improve the model's robustness.
- Feature extraction: The Darknet-53 backbone extracts image features and produces multi-scale feature maps that capture information at different scales and semantic levels.
- Feature fusion: The FPN fuses feature maps from different scales into a feature pyramid with rich semantic information, which helps the model detect targets of different sizes.
- Object detection: Convolutional prediction layers are applied to the feature pyramid to predict bounding boxes and class probabilities for each grid cell. YOLO-v3 adopts a multi-scale prediction strategy, outputting feature maps of different sizes at different network layers to handle objects of different sizes.
- Post-processing: Non-Maximum Suppression (NMS) removes redundant bounding boxes to produce the final detections. NMS compares the confidence scores of overlapping boxes, keeps the highest-scoring box, and discards the others whose overlap with it exceeds a threshold.
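The greedy NMS step described above can be sketched as follows. This is a minimal illustrative implementation, not the exact code used in the paper; box format `(x1, y1, x2, y2)` and the IoU threshold value are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop neighbours above the IoU threshold."""
    order = np.argsort(scores)[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        overlaps = iou(boxes[i], boxes[rest])
        order = rest[overlaps <= iou_thresh]  # suppress heavily overlapping boxes
    return keep
```

For example, two heavily overlapping boxes collapse to the higher-scoring one, while a distant box survives.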
3.1.1. Backbone
3.1.2. Neck
3.1.3. Head
3.2. Loss and Optimization
3.2.1. Objectivity Loss
3.2.2. Bounding Box Loss
3.2.3. Classification Loss
4. Experimental Results
4.1. Experiment Setting and Dataset Introduction
4.2. Evaluation Criteria
- Accuracy: The percentage of correctly predicted samples among all samples: Accuracy = (TP + TN) / (TP + TN + FP + FN), where TP denotes true positives, TN true negatives, FP false positives, and FN false negatives.
- mAP (Mean Average Precision): The AP averaged across all categories, used to evaluate overall performance in multi-category object detection: mAP = (1/N) Σᵢ APᵢ, where APᵢ is the average precision of the i-th category and N is the total number of categories. The larger the mAP, the better the model's overall performance in multi-class detection.
- Recall: The proportion of correctly predicted positive samples among all true positive samples: Recall = TP / (TP + FN). Recall reflects the model's ability to cover positive samples.
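The three criteria above reduce to simple ratios of the confusion-matrix counts. A minimal sketch (the counts here are placeholder values, not the paper's results):

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Fraction of true positive samples that the model recovers."""
    return tp / (tp + fn)

def mean_ap(ap_per_class):
    """Mean of the per-class average-precision values."""
    return sum(ap_per_class) / len(ap_per_class)
```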
4.3. Experimental Analysis
4.3.1. Comparative Experiments of Different Methods
4.3.2. Ablation Experiment
- Batch size = 64: Determined based on the NVIDIA GTX 1080 Ti GPU (11 GB VRAM). Testing showed that a larger batch size (e.g., 128) caused VRAM overflow, while a smaller batch size (e.g., 32) led to unstable convergence with larger loss fluctuations.
- Initial learning rate = 0.001: Adopted based on YOLO series conventions and pre-experiments, which demonstrated this rate balances convergence speed and stability (avoiding early loss oscillations compared to 0.01, and accelerating convergence compared to 0.0001).
- Anchor box settings: Generated via K-Means clustering on the custom dataset, resulting in three anchor boxes: (13 × 17), (24 × 38), and (49 × 67). These sizes match the scale distribution of the three defect classes (welding pores, surface scratches, and fastener loosening, respectively), reducing localization error by 15% compared to default anchor boxes.
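The anchor-generation step can be sketched as K-Means over the dataset's box dimensions. The sketch below uses the 1 − IoU distance popularized by the YOLO family; the synthetic `wh` data and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def wh_iou(wh, centers):
    """IoU between (w, h) boxes and cluster centers, both anchored at the origin."""
    inter = np.minimum(wh[:, None, 0], centers[None, :, 0]) * \
            np.minimum(wh[:, None, 1], centers[None, :, 1])
    area_wh = (wh[:, 0] * wh[:, 1])[:, None]
    area_c = (centers[:, 0] * centers[:, 1])[None, :]
    return inter / (area_wh + area_c - inter)

def kmeans_anchors(wh, k=3, iters=50, seed=0):
    """Cluster box (w, h) pairs with distance = 1 - IoU; return anchors sorted by area."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        assign = (1.0 - wh_iou(wh, centers)).argmin(axis=1)
        new = np.array([wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers[np.argsort(centers[:, 0] * centers[:, 1])]
```

Run on box dimensions clustered around the three defect scales, this returns k anchors ordered from smallest to largest, matching the small-to-large assignment described above.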
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
YOLO | You Only Look Once |
FPN | Feature Pyramid Network |
SSD | Single Shot MultiBox Detector |
RPN | Region Proposal Network |
CNN | Convolutional Neural Network |
NMS | Non-Maximum Suppression |
References
- Lin, Y.W.; Hsieh, C.C.; Huang, W.H.; Hsieh, S.L.; Hung, W.H. Railway Track Fasteners Fault Detection using Deep Learning. In Proceedings of the 2019 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE), Taipei, Taiwan, 3–5 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 231–235. [Google Scholar]
- Du, C.; Dutta, S.; Kurup, P.; Yu, T.; Wang, X. A review of railway infrastructure monitoring using fiber optic sensors. Sens. Actuators A Phys. 2020, 303, 111728. [Google Scholar] [CrossRef]
- Liang, Z.; Zhang, H.; Liu, L.; He, Z.; Zheng, K. Defect Detection of Rail Surface with Deep Convolutional Neural Networks. J. Vis. Commun. Image Represent. 2018, 55, 892–901. [Google Scholar] [CrossRef]
- Izumi, S.; Yokoyama, T.; Iwasaki, A.; Sakai, S. Three-dimensional finite element analysis of tightening and loosening mechanism of threaded fastener. Eng. Fail. Anal. 2005, 12, 604–615. [Google Scholar] [CrossRef]
- Guagliano, M.; Vergani, L. Experimental and numerical analysis of sub-surface cracks in railway wheels. Eng. Fract. Mech. 2005, 72, 255–269. [Google Scholar] [CrossRef]
- Liu, X.; Zhou, Y.; Tang, Y.; Qian, J.; Zhou, Y. Human-in-the-loop online just-in-time software defect prediction. J. Syst. Softw. 2023, 198, 111567. [Google Scholar] [CrossRef]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar] [CrossRef]
- Lin, T.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944. [Google Scholar]
- Dey, A.; Pal, A.; Mukherjee, A.; Bhattacharjee, K.G. An Approach for Identification Using Knuckle and Fingerprint Biometrics Employing Wavelet Based Image Fusion and SIFT Feature Detection. In Advances in Signal Processing and Intelligent Recognition Systems; Springer: Berlin/Heidelberg, Germany, 2015; pp. 399–410. [Google Scholar] [CrossRef]
- Rampriya, R.S.; Suganya, R.; Nathan, S.; Perumal, P.S. A Comparative Assessment of Deep Neural Network Models for Detecting Obstacles in the Real Time Aerial Railway Track Images. Appl. Artif. Intell. 2022, 36, 34. [Google Scholar] [CrossRef]
- Wang, D.; Hongsheng, S.U.; Chen, D.; Zhao, X. A method of railway fastener defect detection based on ResNet-SSD. J. Meas. Sci. Instrum. 2023, 14, 360. [Google Scholar] [CrossRef]
- Yu, T.; Luo, X.; Li, Q.; Li, L. CRGF-YOLO: An Optimized Multi-Scale Feature Fusion Model Based on YOLOv5 for Detection of Steel Surface Defects. Int. J. Comput. Intell. Syst. 2024, 17, 154. [Google Scholar] [CrossRef]
- Liu, H.-H.; Sun, C.; He, H.-Q.; Hui, K.-H. Metal surface defect detection based on improved YOLOv3. Comput. Eng. Sci./Jisuanji Gongcheng yu Kexue 2023, 45, 257. [Google Scholar]
- Połap, D.; Kęsik, K.; Książek, K.; Woźniak, M. Obstacle Detection as a Safety Alert in Augmented Reality Models by the Use of Deep Learning Techniques. Sensors 2017, 17, 2803. [Google Scholar] [CrossRef] [PubMed]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934. [Google Scholar] [CrossRef]
- Jocher, G. YOLOv5 Documentation. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 15 June 2023).
- Cobb, A.C.; Michaels, J.E.; Michaels, T.E. An automated time–frequency approach for ultrasonic monitoring of fastener hole cracks. Ndt E Int. 2007, 40, 525–536. [Google Scholar] [CrossRef]
- Li, Q.; Ren, S. A Real-Time Visual Inspection System for Discrete Surface Defects of Rail Heads. IEEE Trans. Instrum. Meas. 2012, 61, 2189–2199. [Google Scholar] [CrossRef]
- Hattori, T.; Yamashita, M.; Mizuno, H.; Naruse, T. Loosening and Sliding Behaviour of Bolt-Nut Fastener under Transverse Loading. Eur. Phys. J. Conf. 2010, 6, 08002. [Google Scholar] [CrossRef]
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-Sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Canny, J. A Computational Approach to Edge Detection. IEEE Trans. Pattern Anal. Mach. Intell. 1986, 8, 679–698. [Google Scholar] [CrossRef]
- Bishop, S.S.; Isaacs, J.C.; Besaw, L.E. Detecting buried explosive hazards with handheld GPR and deep learning. In Proceedings of the Detection & Sensing of Mines, Explosive Objects, & Obscured Targets XXI, Baltimore, MD, USA, 18–21 April 2016; p. 98230N. [Google Scholar]
- Kong, W.; Hong, J.; Jia, M.; Yao, J.; Zhang, H. YOLOv3-DPFIN: A Dual-Path Feature Fusion Neural Network for Robust Real-time Sonar Target Detection. IEEE Sens. J. 2019, 20, 3745–3756. [Google Scholar] [CrossRef]
- Wang, S.; Liu, T.; Tan, L. Automatically learning semantic features for defect prediction. In Proceedings of the 38th International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016. [Google Scholar]
- Li, L.; Ota, K.; Dong, M. Deep Learning for Smart Industry: Efficient Manufacture Inspection System With Fog Computing. IEEE Trans. Ind. Inform. 2018, 14, 4665–4673. [Google Scholar] [CrossRef]
- Sappa, A.D.; Dornaika, F.; Ponsa, D.; Geronimo, D.; Lopez, A. An Efficient Approach to Onboard Stereo Vision System Pose Estimation. IEEE Trans. Intell. Transp. Syst. 2008, 9, 476–490. [Google Scholar] [CrossRef]
Model | mAP (%) | Recall (%) | Accuracy (%) | FPS |
---|---|---|---|---|
YOLO-v3 | 87.5 | 92 | 89 | 22 |
Model | mAP (%) | Recall (%) | FPS |
---|---|---|---|
YOLO-v3 | 87.5 | 92 | 22 |
Faster R-CNN | 84.6 | 88 | 5 |
SSD | 83.1 | 85 | 16 |
YOLO-v5 | 89.0 | 93 | 15 |
Experimental Setup | mAP (%) | Recall (%) | Accuracy (%) |
---|---|---|---|
Original model (with all components) | 87.5 | 92 | 89 |
Without data augmentation | 82.3 | 86 | 84 |
Without FPN (feature fusion) | 79.1 | 83 | 81 |
Zhang, B.; Shu, D.; Fu, P.; Yao, S.; Chong, C.; Zhao, X.; Yang, H. Multi-Feature Fusion YOLO Approach for Fault Detection and Location of Train Running Section. Electronics 2025, 14, 3430. https://doi.org/10.3390/electronics14173430