GOG-RT-DETR: An Improved RT-DETR-Based Method for Graphite Ore Grade Detection
Abstract
1. Introduction
- To address the scarcity of annotated datasets and the environmental interference that affect graphite ore grade detection, we developed a systematic workflow covering the entire process: mine sampling, precise preparation of ore grades through chemical methods, image data collection, and fine-grained annotation. This workflow produced a proprietary graphite ore image dataset that includes both native and oxidized ore forms at high, medium, and low grade levels.
- To address the inefficient feature extraction caused by computational redundancy and insufficient use of global context in traditional residual networks, this paper introduces the lightweight Faster-Rep-EMA module to improve the ResNet18 backbone network. The module dynamically allocates attention weights, which strengthens the model’s ability to capture the key features needed to detect graphite ore. As a result, the model uses fewer parameters (19.88 M to 16.90 M in the ablation study) while producing a richer feature representation.
- To optimize feature fusion in the Neck network, we design the BiFPN-GLSA module and use it to replace the CCFM module. The module combines a bidirectional multi-scale feature pyramid with a global–local spatial attention mechanism to improve the model’s ability to locate and recognize targets accurately under complex background interference.
- The GIoU loss function in the baseline model is replaced with the Wise-Inner-Shape-IoU loss function. By integrating the advantages of Wise-IoU [18], Shape-IoU [19], and Inner-IoU [20], the new loss function makes the model more robust to variations in target shape and scale. It both refines target localization and accelerates convergence.
2. Methods
2.1. The Original RT-DETR Model
2.2. The Improved GOG-RT-DETR Model
2.3. Faster-Rep-EMA
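As a rough illustration of why partial convolution reduces computation, the sketch below counts multiply-accumulate operations (MACs) for a dense k × k convolution and for a partial convolution that convolves only a fraction of the channels and passes the rest through unchanged, following the FasterNet paper by Chen et al. listed in the references. The function names, the feature-map size, the channel count, and the 1/4 channel ratio are illustrative assumptions, not values taken from this paper, and the sketch does not reproduce the Faster-Rep-EMA module itself.

```python
def conv_macs(h, w, k, c_in, c_out):
    """MACs of a dense k x k convolution (stride 1, 'same' padding)."""
    return h * w * k * k * c_in * c_out


def pconv_macs(h, w, k, channels, ratio=0.25):
    """MACs of a partial convolution: only `ratio` of the channels are convolved."""
    c_p = int(channels * ratio)  # channels that actually pass through the conv
    return conv_macs(h, w, k, c_p, c_p)


if __name__ == "__main__":
    h = w = 80   # hypothetical feature-map size
    c = 128      # hypothetical channel count
    full = conv_macs(h, w, 3, c, c)
    part = pconv_macs(h, w, 3, c, ratio=0.25)
    print(f"regular conv: {full:,} MACs")
    print(f"partial conv: {part:,} MACs ({part / full:.4f} of regular)")
```

With a channel ratio of 1/4, the convolved part costs 1/16 of the dense convolution, because the cost scales with the product of input and output channels.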
2.4. BiFPN-GLSA
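The bidirectional feature pyramid in this module builds on BiFPN from EfficientDet (Tan et al., listed in the references), which fuses inputs with learnable non-negative weights normalized by their sum. The sketch below shows only that fast normalized fusion step on plain Python lists. The function name and the sample values are hypothetical, and the GLSA attention branch and the authors' exact wiring are not reproduced here.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with non-negative, normalized weights."""
    if len(features) != len(weights):
        raise ValueError("one weight per input feature is required")
    w = [max(0.0, wi) for wi in weights]   # ReLU keeps every weight >= 0
    total = sum(w) + eps                   # eps avoids division by zero
    length = len(features[0])
    return [
        sum(wi * f[i] for wi, f in zip(w, features)) / total
        for i in range(length)
    ]


if __name__ == "__main__":
    p_top_down = [1.0, 2.0, 3.0]  # hypothetical flattened top-down feature
    p_lateral = [3.0, 2.0, 1.0]   # hypothetical feature from the same level
    fused = fast_normalized_fusion([p_top_down, p_lateral], weights=[1.0, 1.0])
    print(fused)
```

In a trained network the weights are learned per fusion node, so the model can favor whichever input scale is more informative at that node.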
2.5. Wise-Inner-Shape-IoU
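To make the building blocks of this loss concrete, the sketch below computes plain IoU, the baseline GIoU, and an Inner-IoU-style overlap in which both boxes are rescaled about their centers by a ratio before the overlap is measured. Boxes are assumed to be in (cx, cy, w, h) form, and the function names and sample boxes are hypothetical. The dynamic focusing of Wise-IoU, the shape and scale terms of Shape-IoU, and the way the authors combine the three are not reproduced here.

```python
def to_corners(box, ratio=1.0):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2), scaling w and h by `ratio`."""
    cx, cy, w, h = box
    half_w, half_h = w * ratio / 2.0, h * ratio / 2.0
    return cx - half_w, cy - half_h, cx + half_w, cy + half_h


def _inter_union(a, b):
    """Intersection and union areas of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter, area_a + area_b - inter


def iou(box_a, box_b, ratio=1.0, eps=1e-9):
    """IoU of two boxes; ratio != 1 measures overlap of the scaled auxiliary boxes."""
    inter, union = _inter_union(to_corners(box_a, ratio), to_corners(box_b, ratio))
    return inter / (union + eps)


def giou(box_a, box_b, eps=1e-9):
    """Generalized IoU: IoU minus the empty fraction of the smallest enclosing box."""
    a, b = to_corners(box_a), to_corners(box_b)
    inter, union = _inter_union(a, b)
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / (union + eps) - (enclose - union) / (enclose + eps)


if __name__ == "__main__":
    pred = (5.0, 5.0, 4.0, 4.0)   # hypothetical predicted box
    gt = (6.0, 5.0, 4.0, 4.0)     # hypothetical ground-truth box
    print("IoU      :", round(iou(pred, gt), 4))
    print("GIoU     :", round(giou(pred, gt), 4))
    print("Inner-IoU:", round(iou(pred, gt, ratio=0.75), 4))
```

A regression loss is typically formed as one minus the chosen overlap measure, so a stricter auxiliary box (ratio below 1) yields a larger loss for the same pair of boxes.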
3. Data and Experimental Preparation
3.1. Data Collection
3.2. Data Processing
- Random Horizontal Flipping: By simulating random orientations of ore samples, this technique improves the model’s adaptability to directional variations, ensuring accurate recognition regardless of left–right orientation.
- Center Cropping: The central region of each image is cropped while maintaining its original aspect ratio. This effectively simulates variations in target scale caused by different shooting distances, thereby improving the model’s performance in detecting ores of varying sizes.
- Median Filtering Denoising: A median filter is applied to smooth images, effectively reducing random noise introduced by uneven illumination or sensor artifacts while preserving ore edge information. This enables the model to learn cleaner and more representative ore features.
- Contrast Adjustment: The image contrast is randomly adjusted within a limited range to simulate subtle variations in lighting intensity within mining environments. This enhances the model’s generalization capability under different illumination conditions and ensures stable grade recognition across light variations.
- Proportional Scaling and Padding: Each original image is rescaled proportionally to maintain its aspect ratio and then placed at the center of a standardized gray canvas. This process unifies input dimensions while avoiding geometric distortion from forced stretching, preserving the ore’s true morphological and textural characteristics—critical for accurate grade identification.
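Three of the steps above (horizontal flipping, median filtering, and proportional scaling with padding) can be sketched in dependency-free Python on a grayscale image stored as a list of rows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the 3 × 3 window, the nearest-neighbor resampling, and the gray fill value of 114 are all assumptions, since the source does not specify them.

```python
def hflip(image):
    """Mirror a 2-D list-of-rows image left to right."""
    return [row[::-1] for row in image]


def hflip_box(box):
    """Mirror a normalized (cx, cy, w, h) box to match a flipped image."""
    cx, cy, w, h = box
    return 1.0 - cx, cy, w, h


def median_filter3(image):
    """3 x 3 median filter with edge replication."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(sorted(window)[4])  # middle of the 9 sorted values
        out.append(row)
    return out


def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    return new_w, new_h, (dst_w - new_w) // 2, (dst_h - new_h) // 2


def letterbox(image, dst_w, dst_h, fill=114):
    """Resize with nearest-neighbor sampling and center on a gray canvas."""
    src_h, src_w = len(image), len(image[0])
    new_w, new_h, pad_left, pad_top = letterbox_geometry(src_w, src_h, dst_w, dst_h)
    canvas = [[fill] * dst_w for _ in range(dst_h)]
    for y in range(new_h):
        src_y = min(src_h - 1, y * src_h // new_h)
        for x in range(new_w):
            src_x = min(src_w - 1, x * src_w // new_w)
            canvas[pad_top + y][pad_left + x] = image[src_y][src_x]
    return canvas


if __name__ == "__main__":
    # A hypothetical 1920 x 1080 photo placed on a 640 x 640 canvas.
    print(letterbox_geometry(1920, 1080, 640, 640))
```

When an image is flipped or letterboxed, its bounding-box annotations must be transformed with the same geometry, which is why `hflip_box` accompanies `hflip`.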
3.3. Experimental Environment
3.4. Evaluation Metrics
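The results tables report precision (P), recall (R), and mAP50. The sketch below gives the standard definitions of these quantities in plain Python, with average precision computed by all-point interpolation over a ranked list of detections. It is a generic illustration: the function names and the sample detections are hypothetical, and the authors' exact evaluation protocol (matching rule, confidence threshold) is not given in the source.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(scored_hits, num_gt):
    """Area under the precision-recall curve for one class.

    scored_hits: list of (confidence, is_true_positive), where a detection is
    assumed to count as a true positive if it matches a not-yet-matched
    ground-truth box with IoU >= 0.5 (the "50" in mAP50).
    """
    if num_gt == 0:
        return 0.0
    ordered = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ordered:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Replace each precision by the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    detections = [(0.9, True), (0.8, False), (0.7, True)]  # hypothetical
    print(precision_recall(tp=8, fp=2, fn=2))
    print(average_precision(detections, num_gt=2))
```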
4. Results and Discussion
4.1. Confusion Matrix Visualization and Accuracy Comparison
4.2. Ablation Experiments
- A: Replace the original backbone network with the lightweight and efficient Faster-Rep-EMA module as the new feature extractor.
- B: In the neck network, replace the original CCFM feature fusion module with the improved BiFPN-GLSA module.
- C: In the loss function, adopt the Wise-Inner-Shape-IoU loss to optimize bounding box regression.
4.3. Comparative Experiments
4.4. Visualization Experiments
4.5. Limitations
- Although the dataset in this study was meticulously constructed and enhanced through simulated data augmentation, its scale and diversity remain relatively limited. The images were primarily collected from a specific industrial environment, the mining site of China Minmetals Corporation (Heilongjiang) Graphite Industry Co., Ltd. (Heilongjiang, China). This may restrict the model’s generalization ability to other mining areas, different geological backgrounds, or unseen oxidized graphite ore morphologies.
- The core contribution of this research is a lightweight architecture (e.g., Faster-Rep-EMA) with improved efficiency: FLOPs fall by 23.37% and parameters by 26.0% relative to the baseline RT-DETR-r18. These reductions demonstrate theoretical efficiency only. The model has not yet been fully validated on real edge computing hardware, such as NVIDIA Jetson devices or industrial PLCs, to measure its actual inference speed (FPS) and deployment performance under real-world conditions.
- The current GOG-RT-DETR framework relies entirely on visible-light (RGB) visual data for grade classification. However, graphite ores with similar grades may exhibit highly similar visual appearances. In future work, integrating multimodal sensing technologies—such as hyperspectral imaging, X-ray fluorescence (XRF), or near-infrared (NIR) imaging—could provide richer discriminative information reflecting mineralogical and chemical composition. This would be particularly valuable in ambiguous or visually indistinct boundary cases.
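The efficiency figures quoted in the limitations above can be checked against the ablation table, using the RT-DETR-r18 and GOG-RT-DETR rows. The short sketch below does that arithmetic; the helper name `relative_change` is hypothetical, while the numbers are copied from the table.

```python
def relative_change(baseline, improved):
    """Signed percentage change from `baseline` to `improved`."""
    return (improved - baseline) / baseline * 100.0


# Values from the ablation table (RT-DETR-r18 vs. GOG-RT-DETR).
baseline = {"flops_g": 56.9, "params_m": 19.88}
improved = {"flops_g": 43.6, "params_m": 14.71}

if __name__ == "__main__":
    for key in baseline:
        change = relative_change(baseline[key], improved[key])
        print(f"{key}: {change:+.2f}%")
```

Rounded, the changes are −23.37% for FLOPs and −26.0% for parameters, matching the figures stated in the text.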
5. Conclusions
- The Faster-Rep-EMA module is introduced as the backbone network. By employing partial convolution and re-parameterization techniques, it effectively reduces computational redundancy and memory access costs, while the multi-scale attention mechanism strengthens the extraction of key graphite ore features.
- A new BiFPN-GLSA module replaces the original neck structure. This module fuses a bidirectional feature pyramid network with a global–local spatial attention mechanism, substantially improving multi-scale feature fusion quality and enhancing the model’s ability to capture both global contextual information and fine-grained local details.
- The Wise-Inner-Shape-IoU loss function is applied for bounding box regression optimization. By integrating dynamic focus mechanisms, auxiliary bounding box alignment, and shape-awareness, it accelerates convergence and significantly improves localization accuracy.
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Jara, A.D.; Betemariam, A.; Woldetinsae, G.; Kim, J.Y. Purification, application and current market trend of natural graphite: A review. Int. J. Min. Sci. Technol. 2019, 29, 671–689.
- Sun, L.; Xu, C.-P.; Xiao, K.-Y.; Zhu, Y.-S.; Yan, L.-Y. Geological characteristics, metallogenic regularities and the exploration of graphite deposits in China. China Geol. 2018, 1, 425–434.
- Liu, C.; Zhao, T.; Liu, S.; Ma, Z.; Ji, M. Demand Prediction of Natural Graphite Resources in China from 2025 to 2035. China Min. Mag. 2024, 33, 78–88.
- Park, J.; Cho, S.-J.; Shin, S.; Kim, R.; Shin, D.; Shin, Y. Overview of graphite supply chain and its challenges. Geosci. J. 2025, 29, 329–341.
- Wang, H.; Liu, Z.; Qu, F.; Wang, L.; Yue, X.; Zhang, X.; Shao, A. Development of Online Detection Technologies for Ore Grade. Strateg. Study Chin. Acad. Eng. 2024, 26, 152–163.
- Monecke, T.; Köhler, S.; Kleeberg, R.; Herzig, P.M.; Gemmell, J.B. Quantitative phase-analysis by the Rietveld method using X-ray powder-diffraction data: Application to the study of alteration halos associated with volcanic-rock-hosted massive sulfide deposits. Can. Mineral. 2001, 39, 1617–1633.
- Ribeiro, T.M.G.; Brandão, P.R.G. Development and validation of graphitic carbon analysis of graphite ore samples. Tecnol. Metal. Mater. Mineração 2018, 14, 183–189.
- Cui, B.; Pan, W.; Yu, Y.; Chen, H.; Tang, Y.; Liao, M.; Wei, K.; Pan, S.; Fu, L. Experimental Study on the Washability of a Flake Graphite Ore. J. Non-Met. Miner. Ind. Des. Res. Inst. 2025, 4, 74–77.
- Saders, J.A.; Gravel, J.; Janke, L.; Hall, L. In-depth study on carbon speciation focussed on graphite. In Symposium on Critical and Strategic Materials; British Columbia Geological Survey Paper: Victoria, BC, Canada, 2015; Volume 3, pp. 187–191.
- Zhang, Y.; Li, M.; Han, S.; Ren, Q.; Shi, J. Intelligent identification for rock-mineral microscopic images using ensemble machine learning algorithms. Sensors 2019, 19, 3914.
- Cevik, I.S.; Leuangthong, O.; Caté, A.; Ortiz, J.M. On the use of machine learning for mineral resource classification. Min. Metall. Explor. 2021, 38, 2055–2073.
- Pereira Borges, H.; de Aguiar, M.S. Mineral classification using machine learning and images of microscopic rock thin section. In Mexican International Conference on Artificial Intelligence; Springer International Publishing: Cham, Switzerland, 2019; pp. 63–76.
- Qiu, Z.; Huang, X.; Li, S.; Wang, J. Stellar-YOLO: A Graphite Ore Grade Detection Method Based on Improved YOLO11. Symmetry 2025, 17, 966.
- Xiang, J.; Shi, H.; Huang, X.; Chen, D. Improving graphite ore grade identification with a novel FRCNN-PGR method based on deep learning. Appl. Sci. 2023, 13, 5179.
- Izadi, H.; Sadri, J.; Bayati, M. An intelligent system for mineral identification in thin sections based on a cascade approach. Comput. Geosci. 2017, 99, 37–49.
- Vasumathi, N.; Sarjekar, A.; Chandrayan, H.; Chennakesavulu, K.; Reddy, G.R.; Kumar, T.V.V.; El-Gendy, N.S.; Gopalkrishna, S.J. A mini review on flotation techniques and reagents used in graphite beneficiation. Int. J. Chem. Eng. 2023, 2023, 1007689.
- Wang, Q.; Zhang, K.; Zou, G.; Yang, J.; Wang, X.; Liu, Y.; Song, Y. Advances in Deep Learning-Based Ore Particle Size Detection: A Review of Methods, Challenges, and Trends. MetaResource 2025, 2, 83–104.
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-iou: Bounding box regression loss with dynamic focusing mechanism. arXiv 2023, arXiv:2301.10051.
- Zhang, H.; Zhang, S. Shape-iou: More accurate metric considering bounding box shape and scale. arXiv 2023, arXiv:2312.17663.
- Zhang, H.; Xu, C.; Zhang, S. Inner-iou: More effective inter-section over union loss with auxiliary bounding box. arXiv 2023, arXiv:2311.02877.
- Han, T.; Hou, S.; Gao, C.; Xu, S.; Pang, J.; Gu, H.; Huang, Y. EF-RT-DETR: A efficient focused real-time DETR model for pavement distress detection. J. Real-Time Image Process. 2025, 22, 63.
- Dun, J.; Yang, H.; Yuan, S.; Tang, Y. EER-DETR: An Improved Method for Detecting Defects on the Surface of Solar Panels Based on RT-DETR. Appl. Sci. 2025, 15, 6217.
- Zhang, M.; Wei, X.; Liu, G.; Chen, M.; Zhao, C.; Liu, Y.; Bao, Z.; Guo, Y.; An, R.; Zhao, P. Balancing complexity and accuracy for defect detection on filters with an improved RT-DETR. Sci. Rep. 2025, 15, 29720.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. Detrs beat yolos on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 16965–16974.
- Mao, H.; Gong, Y. Steel surface defect detection based on the lightweight improved RT-DETR algorithm. J. Real-Time Image Process. 2025, 22, 28.
- Wu, M.; Qiu, Y.; Wang, W.; Su, X.; Cao, Y.; Bai, Y. Improved RT-DETR and its application to fruit ripeness detection. Front. Plant Sci. 2025, 16, 1423682.
- Yu, C.; Chen, X. Railway rutting defects detection based on improved RT-DETR. J. Real-Time Image Process. 2024, 21, 146.
- Chen, J.; Kao, S.-H.; He, H.; Zhuo, W.; Wen, S.; Lee, C.-H.; Chan, S.-H.G. Run, don’t walk: Chasing higher FLOPS for faster neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 12021–12031.
- Ding, X.; Zhang, X.; Ma, N.; Han, J.; Ding, G.; Sun, J. Repvgg: Making vgg-style convnets great again. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13733–13742.
- Han, T.; Bao, M.; He, T.; Zhang, R.; Feng, X.; Huang, Y. LW-PV DETR: Lightweight model for photovoltaic panel surface defect detection. Eng. Res. Express 2025, 7, 015357.
- Ouyang, D.; He, S.; Zhang, G.; Luo, M.; Guo, H.; Zhan, J.; Huang, Z. Efficient multi-scale attention module with cross-spatial learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; IEEE: New York, NY, USA, 2023; pp. 1–5.
- Zhang, C.; Yang, J. Emsd-detr: Efficient small object detection for UAV aerial images based on enhanced RT-DETR model. J. Supercomput. 2025, 81, 1052.
- Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790.
- Tang, F.; Xu, Z.; Huang, Q.; Wang, J.; Hou, X.; Su, J.; Liu, J. DuAT: Dual-aggregation transformer network for medical image segmentation. In Proceedings of the Chinese Conference on Pattern Recognition and Computer Vision (PRCV), Xiamen, China, 13–15 October 2023; Volume 41, pp. 343–356.
- Zhang, A.; Chai, C.; Qie, L.; Zhao, L.; He, J.; Wang, R. Research on LD-YOLO for Surface Defect Detection of Wind Turbine Blades. In Proceedings of the 2025 IEEE 20th Conference on Industrial Electronics and Applications (ICIEA), Shandong, China, 3–6 August 2025; IEEE: New York, NY, USA, 2025; pp. 1–6.
- Hong, Y.; Wang, H.; Guo, S. PFW-YOLO Lightweight Helmet Detection Algorithm; IEEE Access: New York, NY, USA, 2025.
- Du, J.; Li, Y. Efficient real-time detection of complex tire cord defects on airjet loom. In Proceedings of the 2024 9th International Conference on Automation, Control and Robotics Engineering (CACRE), Jeju Island, Republic of Korea, 18–20 July 2024; IEEE: New York, NY, USA, 2024; pp. 242–247.
- Hongjuan, S.; Tongjiang, P.; Bo, L.; Caifeng, M.; Liming, L.; Quanjun, W.; Jiaqi, D.; Xiaoyi, L. Study of oxidation process occurring in natural graphite deposits. RSC Adv. 2017, 7, 51411–51418.
- Human Signal. LabelImg. Available online: https://github.com/HumanSignal/labelImg (accessed on 24 April 2025).
- Yang, W.; Yang, Z.; Wu, M.; Zhang, G.; Zhu, Y.; Sun, Y. SIMCB-Yolo: An efficient multi-scale network for detecting forest fire smoke. Forests 2024, 15, 1137.
- Liu, R.; Zhang, X.; Jin, S.; Wang, Q.; Zeng, L.; Liao, J. A Small Target Detection Model Based on an Improved RT-DETR. In Proceedings of the 2024 4th International Conference on Industrial Automation, Robotics and Control Engineering (IARCE), Chengdu, China, 15–17 November 2024; IEEE: New York, NY, USA, 2024; pp. 434–438.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 1 July 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. Ssd: Single shot multibox detector. In Proceedings of the European conference on computer vision, Amsterdam, The Netherlands, 11–14 October 2016; Springer International Publishing: Cham, Switzerland, 2016; pp. 21–37.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards Real-Time Object Detection with Region Proposal Networks; Advances in Neural Information Processing Systems: San Diego, CA, USA, 2015; p. 28.
- Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.

| Grade | Training Set | Validation Set | Test Set | Total |
|---|---|---|---|---|
| Total | 2660 | 760 | 380 | 3800 |
| 0–10% | 1110 | 339 | 144 | 1593 |
| 10–20% | 1094 | 280 | 159 | 1533 |
| 20%+ | 456 | 141 | 77 | 674 |

| Name | Parameter |
|---|---|
| System | Windows 10 (64-bit) |
| CPU | 12th-generation Intel Core i5-12490F |
| Memory | 31 GB |
| GPU | NVIDIA GeForce RTX 3080 Ti |
| Video Memory | 11 GB |
| Programming Software | Python 3.10.15 |
| Deep Learning Framework | PyTorch 2.5.1 |
| GPU Acceleration Library | CUDA 11.8 |

| Methods | P (%) | R (%) | mAP50 (%) | FPS (frames/s) | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|---|
| RT-DETR-r18 | 81.6 | 84.4 | 81.2 | 80.6 | 56.9 | 19.88 |
| A | 84.3 | 86.8 | 82.5 | 83.9 | 51.4 | 16.90 |
| B | 84.5 | 85.6 | 82.8 | 85.2 | 49.9 | 17.76 |
| C | 83.1 | 84.4 | 82.2 | 81.3 | 55.3 | 19.69 |
| A + B | 84.4 | 86.1 | 83.1 | 88.5 | 44.6 | 14.71 |
| A + C | 83.5 | 84.6 | 82.9 | 82.1 | 53.0 | 17.15 |
| B + C | 84.6 | 86.0 | 83.2 | 85.0 | 49.9 | 17.76 |
| GOG-RT-DETR (A + B + C) | 85.4 | 87.3 | 83.7 | 87.2 | 43.6 | 14.71 |

| Model | P (%) | R (%) | mAP50 (%) | FPS (frames/s) | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|---|
| Wise-IoU | 82.0 | 84.2 | 81.6 | 80.7 | 57.9 | 21.3 |
| Inner-IoU | 82.3 | 83.9 | 82.0 | 81.1 | 58.3 | 20.09 |
| Shape-IoU | 82.5 | 84.0 | 81.5 | 80.8 | 57.4 | 20.02 |
| Wise-Inner-Shape-IoU | 83.1 | 84.4 | 82.2 | 81.3 | 55.3 | 19.69 |
| HSFPN | 84.4 | 86.2 | 82.9 | 82.4 | 54.4 | 18.22 |
| CGRFPN | 83.8 | 85.3 | 82.1 | 83.3 | 48.6 | 19.34 |
| BiFPN-GLSA | 84.5 | 85.6 | 82.8 | 85.2 | 49.9 | 17.76 |
| DualConv | 83.9 | 85.8 | 81.8 | 82.7 | 49.6 | 16.08 |
| iRMB | 84.0 | 86.3 | 82.1 | 83.2 | 50.8 | 16.73 |
| Faster-Rep-EMA | 84.3 | 86.8 | 82.5 | 83.9 | 51.4 | 16.90 |

| Methods | P (%) | R (%) | mAP50 (%) | FPS (frames/s) | FLOPs (G) | Params (M) |
|---|---|---|---|---|---|---|
| SSD | 68.8 | 79.7 | 74.5 | 67.4 | 53.1 | 24.7 |
| Faster-RCNN | 76.1 | 81.8 | 77.6 | 42.5 | 370.2 | 136.5 |
| YOLOv5s | 74.5 | 78.4 | 79.6 | 88.4 | 23.18 | 9.21 |
| YOLOv5n | 76.8 | 79.1 | 81.8 | 89.7 | 7.16 | 2.51 |
| YOLOv8s | 79.4 | 80.5 | 82.3 | 87.9 | 28.3 | 12.12 |
| YOLOv8n | 75.9 | 78.3 | 81.4 | 90.0 | 8.16 | 3.01 |
| YOLOv10s | 80.4 | 81.6 | 82.5 | 88.7 | 24.8 | 8.07 |
| YOLOv10n | 76.3 | 78.6 | 81.7 | 90.3 | 8.40 | 2.71 |
| YOLOv11n | 75.3 | 79.7 | 80.6 | 91.2 | 6.30 | 2.58 |
| RT-DETR-r18 | 81.6 | 84.4 | 81.2 | 80.6 | 56.9 | 19.88 |
| MobileNetV4 | 80.7 | 83.1 | 79.7 | 88.1 | 39.5 | 11.31 |
| RT-DETR-DRB | 82.1 | 82.9 | 79.3 | 84.9 | 42.4 | 13.70 |
| GOG-RT-DETR | 85.4 | 87.3 | 83.7 | 87.2 | 43.6 | 14.71 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sun, Z.; Huang, X.; Qiu, Z.; Wei, B. GOG-RT-DETR: An Improved RT-DETR-Based Method for Graphite Ore Grade Detection. Appl. Sci. 2025, 15, 13195. https://doi.org/10.3390/app152413195

