Camera-Adaptive Foreign Object Detection for Coal Conveyor Belts
Abstract
1. Introduction
- We introduce a camera-adaptive foreign object detection approach that generalizes across varying camera perspectives without extensive retraining, combining viewpoint data augmentation, background feature enhancement, and a conveyor-belt area constraint.
- We incorporate Context Feature Perception (CFP) and Conveyor Belt Area Loss (CBAL) to enhance scene understanding. CFP helps the model focus on the coal background surrounding foreign objects, reducing false detections in non-coal areas, while CBAL explicitly guides attention to the conveyor belt region, minimizing background interference.
- Extensive experiments on real-world coal mine datasets demonstrate that CAFOD outperforms State-of-the-Art detection models in terms of accuracy and cross-camera adaptability, highlighting its robustness across diverse viewpoints and challenging real-world conditions.
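The conveyor-belt area constraint (CBAL) presumes a known belt region in the image. As a hedged illustration only — the paper's actual loss is defined in Section 3.5 — the sketch below builds a binary belt mask with the classic ray-casting point-in-polygon test (the strategy surveyed by Haines in the references); the polygon vertices and mask resolution are hypothetical.

```python
import numpy as np

def point_in_polygon(x, y, poly):
    """Ray-casting test: True if (x, y) lies inside `poly`,
    a list of (px, py) vertices given in order."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Does a horizontal ray from (x, y) cross edge (x1,y1)-(x2,y2)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def belt_mask(height, width, belt_polygon):
    """Binary mask of the belt area; activations falling outside the
    mask could then be penalized by an area-constraint term."""
    mask = np.zeros((height, width), dtype=np.float32)
    for y in range(height):
        for x in range(width):
            if point_in_polygon(x + 0.5, y + 0.5, belt_polygon):
                mask[y, x] = 1.0
    return mask

# Hypothetical quadrilateral belt region on an 8x8 feature map.
poly = [(1.0, 1.0), (7.0, 1.0), (7.0, 7.0), (1.0, 7.0)]
m = belt_mask(8, 8, poly)
```

For real feature-map sizes the per-pixel loop would be vectorized (e.g., with OpenCV's `fillPoly`), but the geometry is the same.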
2. Related Work
2.1. Object Detection Models
2.2. Detection Transformer
2.3. Semantic Feature Fusion
3. Methodology
3.1. Problem Definition
3.2. Overview of the Method
3.3. Multi-View Data Augmentation
3.4. Context Feature Perception
3.5. Conveyor Belt Area Loss
3.6. Varifocal Loss
3.7. The Overall Loss
4. Experiments
4.1. Datasets
4.2. Baselines
- AutoAssign (2020) [18] adopted a differentiable label assignment mechanism, dynamically adjusting the label assignment strategy during training to improve the effectiveness and robustness of object detection.
- Sparse-RCNN (2021) [22] introduced learnable proposals, reducing the number of candidate boxes while maintaining high efficiency in object detection, enabling end-to-end training and inference.
- Deformable DETR (2021) [27] incorporated a deformable attention mechanism, greatly reducing the computational complexity of the Transformer and improving detection of objects at different scales.
- VarifocalNet (2021) [10] introduced an IoU-aware classification loss, improving the model's confidence prediction for candidate boxes and thus the accuracy and stability of object detection.
- CentripetalNet (2020) [21] pursued high-quality corner keypoint pairs, matching corners via a centripetal shift to achieve more accurate object localization.
- O2F (2023) [17] proposed a one-to-few label assignment strategy that dynamically adjusts the number of positive labels assigned during training, optimizing label matching and improving the accuracy and robustness of end-to-end dense detection.
- CEASC (2023) [19] introduced an Adaptive Sparse Convolutional Network combined with a global context enhancement module, improving both the speed and accuracy of object detection on drone images.
- AFPN (2023) [20] presented the Asymptotic Feature Pyramid Network, which fuses features across pyramid levels progressively to enhance representation quality, with a notable boost for small-object detection.
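Section 3.6 adopts the Varifocal Loss introduced by VarifocalNet [10]. A minimal NumPy sketch of the loss as defined in that paper follows, with p the predicted IoU-aware classification score, q the target (the IoU with the matched ground truth for positives, 0 for negatives), and α, γ the usual down-weighting hyperparameters; the default values here are assumptions, not the paper's tuned settings.

```python
import numpy as np

def varifocal_loss(p, q, alpha=0.75, gamma=2.0):
    """Varifocal Loss (Zhang et al., 2021).

    p: predicted IoU-aware classification score in (0, 1)
    q: target score -- IoU with ground truth for positives, 0 for negatives
    Positives are weighted by the target q itself; negatives are
    down-weighted by alpha * p**gamma.
    """
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pos = q > 0
    return np.where(
        pos,
        -q * (q * np.log(p) + (1 - q) * np.log(1 - p)),  # positives: BCE weighted by q
        -alpha * p ** gamma * np.log(1 - p),             # negatives: focal down-weighting
    )

p = np.array([0.9, 0.2, 0.8])
q = np.array([0.8, 0.0, 0.0])  # one positive (IoU 0.8), two negatives
loss = varifocal_loss(p, q)
```

Note how the high-scoring negative (p = 0.8) incurs a much larger penalty than the low-scoring one (p = 0.2), which is the asymmetric treatment the loss is designed for.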
4.3. Experimental Settings
4.4. Camera Adaptability Evaluation
4.5. General Detection Performance Evaluation
4.6. Ablation Study
4.6.1. Detection Performance
4.6.2. Analysis of Error Type
- Classification error (Cls): the box is localized correctly, but the classification is wrong;
- Localization error (Loc): the classification is right, but the box is localized incorrectly;
- Both Cls and Loc error (Cls + Loc): the box is both mislocalized and misclassified;
- Missed GT error (Missed): all undetected ground truths except those covered by Cls and Loc errors;
- Background error (Bkgd): the detection has low overlap with all ground truths (background detected as foreground).
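These categories follow the TIDE error-analysis toolbox (Bolya et al., cited in the references). As a simplified sketch, a single false-positive detection can be typed from its best IoU with same-class and other-class ground truths using TIDE's default 0.5 foreground and 0.1 background thresholds; Missed errors are counted over ground truths rather than detections, so they do not appear here.

```python
def tide_error_type(iou_same_cls, iou_other_cls, t_fg=0.5, t_bg=0.1):
    """Simplified TIDE-style error typing for one false-positive box.

    iou_same_cls:  best IoU with a ground truth of the predicted class
    iou_other_cls: best IoU with a ground truth of any other class
    t_fg / t_bg:   foreground / background IoU thresholds (TIDE defaults)
    """
    if iou_other_cls >= t_fg:
        return "Cls"      # well localized, but wrong class
    if t_bg <= iou_same_cls < t_fg:
        return "Loc"      # right class, poorly localized
    if t_bg <= iou_other_cls < t_fg:
        return "Cls+Loc"  # wrong class and poorly localized
    if iou_same_cls < t_bg and iou_other_cls < t_bg:
        return "Bkgd"     # fired on background
    return "Dupe"         # overlaps an already-matched GT (not listed above)
```

For example, a box of the right class that overlaps its ground truth with IoU 0.3 is a Loc error, while one with IoU below 0.1 against every ground truth is a Bkgd error.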
4.6.3. Feature Map Visualization
4.7. Model Complexity Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Luo, B.; Kou, Z.; Han, C.; Wu, J. A “hardware-friendly” foreign object identification method for belt conveyors based on improved YOLOv8. Appl. Sci. 2023, 13, 11464. [Google Scholar] [CrossRef]
- Shang, D.; Wang, Y.; Yang, Z.; Wang, J.; Liu, Y. Study on comprehensive calibration and image sieving for coal-gangue separation parallel robot. Appl. Sci. 2020, 10, 7059. [Google Scholar] [CrossRef]
- Yang, J.; Peng, J.; Li, Y.; Xie, Q.; Wu, Q.; Wang, J. Gangue Localization and Volume Measurement Based on Adaptive Deep Feature Fusion and Surface Curvature Filter. IEEE Trans. Instrum. Meas. 2021, 70, 1–13. [Google Scholar] [CrossRef]
- Zhang, M.; Shi, H.; Zhang, Y.; Yu, Y.; Zhou, M. Deep learning-based damage detection of mining conveyor belt. Measurement 2021, 175, 109130. [Google Scholar] [CrossRef]
- Liu, Y.; Miao, C.; Li, X.; Ji, J.; Meng, D.; Wang, Y. A Dynamic Self-Attention-Based Fault Diagnosis Method for Belt Conveyor Idlers. Machines 2023, 11, 216. [Google Scholar] [CrossRef]
- Zhang, R.; Lei, Y. AFGN: Attention Feature Guided Network for object detection in optical remote sensing image. Neurocomputing 2024, 610, 128527. [Google Scholar] [CrossRef]
- Li, Z.; Dong, Y.; Shen, L.; Liu, Y.; Pei, Y.; Yang, H.; Zheng, L.; Ma, J. Development and challenges of object detection: A survey. Neurocomputing 2024, 598, 128102. [Google Scholar] [CrossRef]
- Liang, T.; Xie, H.; Yu, K.; Xia, Z.; Lin, Z.; Wang, Y.; Tang, T.; Wang, B.; Tang, Z. Bevfusion: A simple and robust lidar-camera fusion framework. Adv. Neural Inf. Process. Syst. 2022, 35, 10421–10434. [Google Scholar]
- Jiang, W.; Luan, Y.; Tang, K.; Wang, L.; Zhang, N.; Chen, H.; Qi, H. Adaptive feature alignment network with noise suppression for cross-domain object detection. Neurocomputing 2025, 614, 128789. [Google Scholar] [CrossRef]
- Zhang, H.; Wang, Y.; Dayoub, F.; Sunderhauf, N. VarifocalNet: An IoU-Aware Dense Object Detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8514–8523. [Google Scholar]
- Chen, K.; Du, B.; Wang, Y.; Wang, G.; He, J. The real-time detection method for coal gangue based on YOLOv8s-GSC. J. Real-Time Image Process. 2024, 21, 37. [Google Scholar] [CrossRef]
- Pu, Y.; Apel, D.B.; Szmigiel, A.; Chen, J. Image recognition of coal and coal gangue using a convolutional neural network and transfer learning. Energies 2019, 12, 1735. [Google Scholar] [CrossRef]
- Gao, R.; Sun, Z.; Li, W.; Pei, L.; Hu, Y.; Xiao, L. Automatic Coal and Gangue Segmentation Using U-Net Based Fully Convolutional Networks. Energies 2020, 13, 829. [Google Scholar] [CrossRef]
- Hu, T.; Zhuang, D.; Qiu, J. An EfficientNetv2-based method for coal conveyor belt foreign object detection. Front. Energy Res. 2025, 12, 1444877. [Google Scholar] [CrossRef]
- Li, X.; Li, W.; Qiu, K.; Wang, S.; Zhao, S. Coal mine belt conveyor foreign object detection based on improved yolov8. In Proceedings of the 2023 IEEE 11th Joint International Information Technology and Artificial Intelligence Conference (ITAIC), Chongqing, China, 8–10 December 2023; Volume 11, pp. 209–215. [Google Scholar]
- Ling, J.; Fu, Z.; Yuan, X. Lightweight coal mine conveyor belt foreign object detection based on improved Yolov8n. Sci. Rep. 2025, 15, 10361. [Google Scholar] [CrossRef] [PubMed]
- Li, S.; Li, M.; Li, R.; He, C.; Zhang, L. One-to-Few Label Assignment for End-to-End Dense Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7350–7359. [Google Scholar]
- Zhu, B.; Wang, J.; Jiang, Z.; Zong, F.; Liu, S.; Li, Z.; Sun, J. AutoAssign: Differentiable Label Assignment for Dense Object Detection. arXiv 2020. [Google Scholar] [CrossRef]
- Du, B.; Huang, Y.; Chen, J.; Huang, D. Adaptive Sparse Convolutional Networks with Global Context Enhancement for Faster Object Detection on Drone Images. arXiv 2023. [Google Scholar] [CrossRef]
- Yang, G.; Lei, J.; Zhu, Z.; Cheng, S.; Feng, Z.; Liang, R. AFPN: Asymptotic Feature Pyramid Network for Object Detection. In Proceedings of the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Honolulu, HI, USA, 1–4 October 2023; pp. 2184–2189. [Google Scholar] [CrossRef]
- Dong, Z.; Li, G.; Liao, Y.; Wang, F.; Ren, P.; Qian, C. CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Sun, P.; Zhang, R.; Jiang, Y.; Kong, T.; Xu, C.; Zhan, W.; Tomizuka, M.; Li, L.; Yuan, Z.; Wang, C.; et al. Sparse R-CNN: End-to-End Object Detection with Learnable Proposals. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 14454–14463. [Google Scholar]
- Xie, E.; Ding, J.; Wang, W.; Zhan, X.; Xu, H.; Sun, P.; Li, Z.; Luo, P. Detco: Unsupervised contrastive learning for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 8392–8401. [Google Scholar]
- Hong, Y.; Wang, L.; Su, J.; Li, Y.; Zhu, B.; Wang, H. Enhanced foreign body detection on coal mine conveyor belts using improved DLEA and lightweight SARC-DETR model. Signal Image Video Process. 2025, 19, 349. [Google Scholar] [CrossRef]
- Chen, H.; Wang, Z.; Qin, H.; Mu, X. Self-supervised domain feature mining for underwater domain generalization object detection. Expert Syst. Appl. 2025, 265, 126023. [Google Scholar] [CrossRef]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 213–229. [Google Scholar]
- Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 4 May 2021. [Google Scholar]
- Ding, J.; Ye, C.; Wang, H.; Huyan, J.; Yang, M.; Li, W. Foreign bodies detector based on detr for high-resolution X-ray images of textiles. IEEE Trans. Instrum. Meas. 2023, 72, 1–10. [Google Scholar] [CrossRef]
- Ge, Y.; Jiang, D.; Sun, L. Wood Veneer Defect Detection Based on Multiscale DETR with Position Encoder Net. Sensors 2023, 23, 4837. [Google Scholar] [CrossRef]
- Ji, Y.; Zhang, H.; Gao, F.; Sun, H.; Wei, H.; Wang, N.; Yang, B. LGCNet: A local-to-global context-aware feature augmentation network for salient object detection. Inf. Sci. 2022, 584, 399–416. [Google Scholar] [CrossRef]
- Cheng, G.; Si, Y.; Hong, H.; Yao, X.; Guo, L. Cross-Scale Feature Fusion for Object Detection in Optical Remote Sensing Images. IEEE Geosci. Remote Sens. Lett. 2021, 18, 431–435. [Google Scholar] [CrossRef]
- Yang, Z.; Wang, Y.; Chen, X.; Liu, J.; Qiao, Y. Context-transformer: Tackling object confusion for few-shot detection. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12653–12660. [Google Scholar]
- Weisstein, E.W. Affine Transformation. 2004. Available online: https://mathworld.wolfram.com/ (accessed on 22 April 2025).
- Haines, E. Point in Polygon Strategies. Graph. Gems 1994, 4, 24–46. [Google Scholar]
- Mao, A.; Mohri, M.; Zhong, Y. Cross-entropy loss functions: Theoretical analysis and applications. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 23803–23828. [Google Scholar]
- Janocha, K.; Czarnecki, W.M. On loss functions for deep neural networks in classification. arXiv 2017, arXiv:1702.05659. [Google Scholar] [CrossRef]
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12993–13000. [Google Scholar]
- Li, G.; Peng, F.; Wu, Z.; Wang, S.; Xu, R.Y.D. ODCL: An Object Disentanglement and Contrastive Learning Model for Few-Shot Industrial Defect Detection. IEEE Sens. J. 2024, 24, 18568–18577. [Google Scholar] [CrossRef]
- Robertson, S. A new interpretation of average precision. In Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’08, Singapore, 20–24 July 2008; pp. 689–690. [Google Scholar] [CrossRef]
- Yue, Y.; Finley, T.; Radlinski, F.; Joachims, T. A support vector method for optimizing average precision. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’07, Amsterdam, The Netherlands, 23–27 July 2007; pp. 271–278. [Google Scholar] [CrossRef]
- Bolya, D.; Foley, S.; Hays, J.; Hoffman, J. Tide: A general toolbox for identifying object detection errors. In Proceedings of the Computer Vision—ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part III 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 558–573. [Google Scholar]
Category | Camera 1 Training | Camera 1 Test | Camera 2 Training | Camera 2 Test | Camera 3 Training | Camera 3 Test | All Training | All Test
---|---|---|---|---|---|---|---|---
Big coal | 330 | 125 | 594 | 334 | 678 | 280 | 1602 | 739 |
Wood | 81 | 32 | 70 | 39 | 60 | 24 | 211 | 95 |
Iron bars | 103 | 46 | 330 | 148 | 456 | 192 | 889 | 194 |
Iron mesh | 87 | 31 | 227 | 74 | 206 | 94 | 520 | 199 |
Total | 601 | 234 | 1221 | 595 | 1400 | 590 | 3222 | 1227 |
Method | Camera 1 | Camera 2 | Camera 3 | All |
---|---|---|---|---|
AutoAssign (2020) [18] | 80.09 | 50.40 | 36.27 | 53.42
Sparse-RCNN (2021) [22] | 70.01 | 20.17 | 26.68 | 34.37 |
Deformable DETR (2021) [27] | 65.97 | 24.66 | 23.89 | 34.03 |
VarifocalNet (2021) [10] | 78.41 | 47.43 | 28.37 | 40.03 |
CentripetalNet (2020) [21] | 77.90 | 33.70 | 26.32 | 41.93
O2F (2023) [17] | 70.02 | 33.53 | 19.04 | 35.55 |
CEASC (2023) [19] | 76.40 | 36.25 | 25.13 | 42.14 |
AFPN (2023) [20] | 73.21 | 30.38 | 27.35 | 39.28 |
CAFOD (ours) | 82.92 | 52.53 | 41.62 | 56.45 |
Impr. | 20.44% | 53.06% | 42.60% | 39.72% |
Mean Impr. | 10.15% | 34.20% | 36.01% | 27.20% |
Methods | Big Coal | Wood | Iron Bars | Iron Mesh | All |
---|---|---|---|---|---|
AutoAssign (2020) [18] | 56.32 | 87.81 | 84.15 | 73.52 | 75.45
Sparse-RCNN (2021) [22] | 49.81 | 82.97 | 81.80 | 77.83 | 73.10 |
Deformable DETR (2021) [27] | 57.11 | 83.92 | 81.17 | 76.85 | 74.76 |
VarifocalNet (2021) [10] | 58.72 | 85.10 | 83.33 | 73.92 | 75.26 |
CentripetalNet (2020) [21] | 60.53 | 87.91 | 83.92 | 74.15 | 76.61
O2F (2023) [17] | 55.38 | 88.72 | 81.50 | 73.71 | 74.82 |
CEASC (2023) [19] | 55.53 | 87.15 | 84.23 | 70.12 | 74.25 |
AFPN (2023) [20] | 60.61 | 83.64 | 83.91 | 81.55 | 77.42 |
CAFOD (ours) | 65.41 | 88.95 | 83.86 | 83.37 | 80.39 |
Method | MVDA | CFP | CBAL | VFL | Big Coal | Wood | Iron Bars | Iron Mesh | All |
---|---|---|---|---|---|---|---|---|---|
A | 57.11 | 83.92 | 81.17 | 76.85 | 74.76 | ||||
B | ✓ | 58.85 | 83.97 | 81.36 | 78.63 | 75.70 | |||
C | ✓ | ✓ | 63.52 | 85.61 | 83.58 | 78.74 | 77.86 | ||
D | ✓ | ✓ | ✓ | 64.31 | 86.77 | 83.65 | 81.27 | 79.01 | |
CAFOD | ✓ | ✓ | ✓ | ✓ | 65.41 | 88.95 | 83.86 | 83.37 | 80.39 |
Method | MVDA | CFP | CBAL | VFL | Cls ↓ | Loc ↓ | Cls + Loc ↓ | Missed ↓ | Bkgd ↓ |
---|---|---|---|---|---|---|---|---|---|
A | 5.38 | 9.41 | 0.95 | 2.04 | 2.93 | ||||
B | ✓ | 5.18 | 8.93 | 0.69 | 1.99 | 2.81 | |||
C | ✓ | ✓ | 4.78 | 8.23 | 0.40 | 1.26 | 1.17 | ||
D | ✓ | ✓ | ✓ | 2.95 | 6.87 | 0.19 | 0.68 | 0.21 | |
CAFOD | ✓ | ✓ | ✓ | ✓ | 2.81 | 6.22 | 0.15 | 0.66 | 0.21 |
Stage | Method | Parameters | GPU Memory | Main Memory | Time
---|---|---|---|---|---|
Training Phase | Autoassign | 285.78 MB | 8346 MB | 5214 MB | 50,435 s |
Sparse-RCNN | 1219.75 MB | 9566 MB | 6991 MB | 64,082 s | |
Deformable DETR | 471.52 MB | 17,998 MB | 13,790 MB | 57,600 s | |
VarifocalNet | 249.19 MB | 8860 MB | 5720 MB | 89,287 s | |
CentripetalNet | 2365.44 MB | 11,546 MB | 10,665 MB | 79,235 s | |
O2F | 246.31 MB | 5742 MB | 3374 MB | 61,261 s | |
CEASC | 324.92 MB | 8531 MB | 2521 MB | 69,488 s | |
AFPN | 31.4 MB | 6531 MB | 2109 MB | 58,804 s | |
CAFOD (ours) | 479.48 MB | 22,392 MB | 19,255 MB | 61,200 s | |
Inference Phase | Autoassign | 285.78 MB | 6966 MB | 735 MB | 0.1450 s |
Sparse-RCNN | 1219.75 MB | 9566 MB | 1947 MB | 0.1759 s | |
Deformable DETR | 471.52 MB | 13,868 MB | 1127 MB | 0.2186 s | |
VarifocalNet | 249.19 MB | 15,272 MB | 806 MB | 0.6561 s | |
CentripetalNet | 2365.44 MB | 8748 MB | 5342 MB | 0.7804 s | |
O2F | 246.31 MB | 5742 MB | 635 MB | 0.2015 s | |
CEASC | 324.92 MB | 8531 MB | 985 MB | 0.2041 s | |
AFPN | 31.4 MB | 6531 MB | 531 MB | 0.0510 s | |
CAFOD (ours) | 479.48 MB | 1242 MB | 965 MB | 0.2078 s |
Share and Cite
Peng, F.; Hao, K.; Lu, X. Camera-Adaptive Foreign Object Detection for Coal Conveyor Belts. Appl. Sci. 2025, 15, 4769. https://doi.org/10.3390/app15094769