Enhanced Real-Time Detector for Industrial Vision-Based Corn Impurity Detection
Abstract
1. Introduction
- (1) Introducing receptive field attention convolution (RFAConv) into the backbone network enhances sensitivity to local texture details, improving feature extraction for small, irregular impurities.
- (2) Replacing the conventional bilinear upsampling commonly used in the neck with the dynamic upsampling operator DySample preserves high-frequency edge information and improves detection performance for slender and transparent objects.
- (3) An Inner-Shape-IoU loss function is proposed to accelerate bounding box regression and improve localization accuracy for objects with varying aspect ratios. Experiments on a custom dataset indicate that the model achieves improved detection accuracy while maintaining real-time performance compared with mainstream detectors.
2. Materials and Methods
2.1. Data Collection, Processing, and Enhancement
2.2. Experimental Environment and Parameter Settings
2.3. Backbone Network Optimization: Embedded Receptive Field Attention Convolution (RFAConv)
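As a rough illustration of the receptive-field attention idea named in this section, the PyTorch sketch below follows the publicly described RFAConv design: a grouped convolution unfolds each k×k receptive field, a softmax attention map computed from average-pooled features reweights the k² positions, and a stride-k convolution fuses the result. The class name `RFAConvSketch`, the layer sizes, and the activation choice are assumptions for illustration; the exact configuration used in the backbone is not given here.

```python
import torch
import torch.nn as nn


class RFAConvSketch(nn.Module):
    """Illustrative receptive-field attention convolution (not the authors' exact module)."""

    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        self.k = k
        # Attention branch: pool each receptive field, then expand to k*k logits per channel.
        self.get_weight = nn.Sequential(
            nn.AvgPool2d(kernel_size=k, padding=k // 2, stride=stride),
            nn.Conv2d(in_ch, in_ch * k * k, kernel_size=1, groups=in_ch, bias=False),
        )
        # Feature branch: grouped conv that unfolds each k x k receptive field.
        self.generate_feature = nn.Sequential(
            nn.Conv2d(in_ch, in_ch * k * k, kernel_size=k, padding=k // 2,
                      stride=stride, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch * k * k),
            nn.ReLU(inplace=True),
        )
        # Fusion: a stride-k conv reads each rearranged k x k tile exactly once.
        self.fuse = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=k),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        b, c = x.shape[:2]
        k = self.k
        logits = self.get_weight(x)
        h, w = logits.shape[2:]
        # Softmax over the k*k positions of every receptive field.
        attn = logits.view(b, c, k * k, h, w).softmax(dim=2)
        feat = self.generate_feature(x).view(b, c, k * k, h, w)
        weighted = (feat * attn).view(b, c, k, k, h, w)
        # (b, c, k, k, h, w) -> (b, c, h*k, w*k): lay each field out as a k x k tile.
        tiled = weighted.permute(0, 1, 4, 2, 5, 3).reshape(b, c, h * k, w * k)
        return self.fuse(tiled)
```

With `stride=1` the spatial size is preserved, so a module of this kind could stand in for an ordinary 3×3 convolution; with `stride=2` it also halves the resolution.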
2.4. Neck Network Reconstruction: DySample Dynamic Upsampling
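To make the contrast with fixed bilinear upsampling concrete, the sketch below shows a simplified point-sampling upsampler in the spirit of DySample: a 1×1 convolution predicts a small offset for every output position, and the features are resampled at the shifted positions with `torch.nn.functional.grid_sample`. The class name `DynamicUpsampleSketch`, the `offset_range` value, the zero initialisation, and the omission of channel grouping are simplifying assumptions, not the configuration used in the neck.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DynamicUpsampleSketch(nn.Module):
    """Illustrative point-sampling upsampler in the spirit of DySample (simplified)."""

    def __init__(self, channels, scale=2, offset_range=0.25):
        super().__init__()
        self.scale = scale
        self.offset_range = offset_range  # assumed damping factor for predicted offsets
        # Predict a (dx, dy) offset for each of the scale*scale output sub-positions.
        self.offset = nn.Conv2d(channels, 2 * scale * scale, kernel_size=1)
        # Zero init: the module starts out identical to bilinear upsampling.
        nn.init.zeros_(self.offset.weight)
        nn.init.zeros_(self.offset.bias)

    def forward(self, x):
        b, c, h, w = x.shape
        s = self.scale
        # Offsets in input-pixel units, rearranged to output resolution: (B, 2, sH, sW).
        off = F.pixel_shuffle(self.offset(x) * self.offset_range, s)
        # Base grid: centres of the output pixels in normalised [-1, 1] coordinates.
        ys = (torch.arange(s * h, device=x.device, dtype=x.dtype) + 0.5) / (s * h) * 2 - 1
        xs = (torch.arange(s * w, device=x.device, dtype=x.dtype) + 0.5) / (s * w) * 2 - 1
        grid_y, grid_x = torch.meshgrid(ys, xs, indexing="ij")
        base = torch.stack((grid_x, grid_y), dim=-1).unsqueeze(0)  # (1, sH, sW, 2)
        # Convert pixel offsets to normalised units (2 / size per pixel) and shift the grid.
        to_norm = torch.tensor([2.0 / w, 2.0 / h], device=x.device, dtype=x.dtype)
        grid = base + off.permute(0, 2, 3, 1) * to_norm
        return F.grid_sample(x, grid, mode="bilinear",
                             padding_mode="border", align_corners=False)
```

Because the offsets start at zero, the output initially matches `F.interpolate(..., mode="bilinear", align_corners=False)`; training then moves the sampling points, which is what allows content-dependent sampling near thin edges.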
2.5. Loss Function Upgrade: Inner-Shape-IoU
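The loss named in this section can be sketched by combining the published Inner-IoU idea (IoU computed on auxiliary boxes scaled by `ratio` around each box centre) with the Shape-IoU penalties (centre-distance and shape terms weighted by the ground-truth aspect). The way the two are combined below, the function names, and the default values of `ratio`, `scale`, and `theta` are assumptions for illustration, not the authors' stated formulation.

```python
import math


def box_iou(a, b, eps=1e-7):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + eps)


def scale_box(box, ratio):
    """Auxiliary (inner) box: same centre, width and height multiplied by ratio."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    half_w, half_h = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_shape_iou_loss(pred, gt, ratio=0.75, scale=0.0, theta=4, eps=1e-7):
    """Sketch: 1 - IoU_inner + shape-weighted centre distance + 0.5 * shape cost."""
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    # Shape weights derived from the ground-truth width and height.
    ww = 2 * wg ** scale / (wg ** scale + hg ** scale)
    hh = 2 * hg ** scale / (wg ** scale + hg ** scale)
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps
    # Centre offsets between prediction and ground truth.
    dx = ((pred[0] + pred[2]) - (gt[0] + gt[2])) / 2
    dy = ((pred[1] + pred[3]) - (gt[1] + gt[3])) / 2
    distance = (hh * dx ** 2 + ww * dy ** 2) / c2
    # Shape cost penalises width and height mismatch.
    omega_w = hh * abs(w - wg) / max(w, wg)
    omega_h = ww * abs(h - hg) / max(h, hg)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta
    inner_iou = box_iou(scale_box(pred, ratio), scale_box(gt, ratio))
    return 1 - inner_iou + distance + 0.5 * shape_cost


print(inner_shape_iou_loss((1, 1, 11, 11), (0, 0, 10, 10)))
```

A `ratio` below 1 shrinks the auxiliary boxes, which sharpens the gradient for predictions that already overlap well; a value above 1 does the opposite and may help low-overlap samples.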
3. Results and Discussion
3.1. Comparison Experiment
3.2. Ablation Experiment
3.3. Discussion
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| RT-DETR | Real-Time Detection Transformer |
| RT-DETRCD | RT-DETR with Convolution and Dynamic upsampling |
| CNN | Convolutional Neural Network |
| RFAConv | Receptive Field Attention Convolution |
| DySample | Dynamic Upsampling operator |
| Inner-Shape-IoU | Inner-Shape Intersection over Union |
| mAP | mean Average Precision |
| FPS | Frames Per Second |
| YOLO | You Only Look Once |
| SVM | Support Vector Machine |
| SSD | Single Shot MultiBox Detector |
| Faster R-CNN | Faster Region-based CNN |
| CBAM | Convolutional Block Attention Module |
| SE | Squeeze-and-Excitation |
| ECA | Efficient Channel Attention |
| FPN | Feature Pyramid Network |
| PANet | Path Aggregation Network |
| BiFPN | Bidirectional Feature Pyramid Network |
| DETR | Detection Transformer |
| SGD | Stochastic Gradient Descent |
| ResNet | Residual Network |
| GIoU | Generalized Intersection over Union |
| CIoU | Complete Intersection over Union |
| IoU | Intersection over Union |
| GT | Ground Truth |
| GFLOPS | Giga Floating-Point Operations (per forward pass) |
| P | Precision |
| R | Recall |

| Samples | Quantities |
|---|---|
| Corn husk | 2100 |
| Corn stalk | 1800 |
| Corn cob | 1200 |
| Weed | 950 |
| Gravel | 800 |
| Glass | 600 |
| Moldy kernel | 1500 |
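The class balance implied by the sample counts above can be checked with a few lines of Python. The numbers are copied from the table; the variable names are hypothetical, and whether a count refers to images or to annotated instances is not stated here.

```python
# Sample counts copied from the table above.
counts = {
    "Corn husk": 2100,
    "Corn stalk": 1800,
    "Corn cob": 1200,
    "Weed": 950,
    "Gravel": 800,
    "Glass": 600,
    "Moldy kernel": 1500,
}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}
# Ratio between the most and least frequent categories.
imbalance = max(counts.values()) / min(counts.values())

for name, n in counts.items():
    print(f"{name:<13} {n:>5} {shares[name]:6.1%}")
print(f"Total: {total}, largest/smallest class ratio: {imbalance:.1f}")
```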

| Items | Detailed Specifications |
|---|---|
| CPU | Intel Core i7-14700K |
| GPU | NVIDIA GeForce RTX 4060Ti (16 GB) |
| Memory | 32 GB |
| Operating System | Windows 11 |
| Deep learning framework | PyTorch 1.13.1 |
| Programming language | Python 3.8 |

| Models | mAP50 (%) | mAP50:95 (%) | Parameters (M) | GFLOPS | FPS |
|---|---|---|---|---|---|
| YOLOv5s | 86.2 | 61.5 | 7.2 | 16 | 115 |
| YOLOv8n | 89.4 | 65.8 | 3.2 | 9 | 142 |
| YOLOv10 | 90.1 | 66.5 | 8.0 | 22 | 120 |
| DETR | 78.5 | 52.3 | 41.3 | 86 | 28 |
| RT-DETR (Original) | 91.5 | 68.2 | 32.8 | 50 | 74 |
| Ours | 96.2 | 65.2 | 33.5 | 56 | 68 |
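To read the speed column as per-frame latency and to line up the proposed model against the RT-DETR baseline, the comparison table can be processed as follows. All values are copied from the table; `latency_ms` and the dictionary layout are hypothetical helpers for this sketch.

```python
# Rows copied from the comparison table:
# (mAP50 %, mAP50:95 %, parameters M, GFLOPS, FPS).
results = {
    "YOLOv5s": (86.2, 61.5, 7.2, 16, 115),
    "YOLOv8n": (89.4, 65.8, 3.2, 9, 142),
    "YOLOv10": (90.1, 66.5, 8.0, 22, 120),
    "DETR": (78.5, 52.3, 41.3, 86, 28),
    "RT-DETR (Original)": (91.5, 68.2, 32.8, 50, 74),
    "Ours": (96.2, 65.2, 33.5, 56, 68),
}


def latency_ms(fps):
    """Average time per frame implied by a throughput figure."""
    return 1000.0 / fps


baseline = results["RT-DETR (Original)"]
ours = results["Ours"]
fields = ("mAP50", "mAP50:95", "Params (M)", "GFLOPS", "FPS")
# Difference between the proposed model and the baseline, column by column.
deltas = {f: round(o - b, 1) for f, o, b in zip(fields, ours, baseline)}

for name, row in results.items():
    print(f"{name:<20} mAP50={row[0]:5.1f}  {latency_ms(row[4]):5.1f} ms/frame")
print(deltas)
```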

| Impurity Category | Baseline mAP50 (%) | Ours mAP50 (%) |
|---|---|---|
| Corn husk | 92.3 | 94.5 |
| Corn stalk | 88.5 | 92.7 |
| Corn cob | 90.1 | 91.8 |
| Weed | 89.7 | 91.2 |
| Gravel | 91.0 | 92.3 |
| Glass fragment | 84.3 | 90.1 |
| Moldy kernel | 93.5 | 94.6 |
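The per-category gains can be ranked directly from the table above; the short script below does so with values copied from the table and hypothetical variable names.

```python
# (baseline mAP50 %, proposed mAP50 %) per impurity category, copied from the table.
per_class = {
    "Corn husk": (92.3, 94.5),
    "Corn stalk": (88.5, 92.7),
    "Corn cob": (90.1, 91.8),
    "Weed": (89.7, 91.2),
    "Gravel": (91.0, 92.3),
    "Glass fragment": (84.3, 90.1),
    "Moldy kernel": (93.5, 94.6),
}

gains = {name: ours - base for name, (base, ours) in per_class.items()}
best = max(gains, key=gains.get)

# Print categories from largest to smallest improvement.
for name, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<15} +{gain:.1f}")
print(f"Largest gain: {best} (+{gains[best]:.1f} points)")
```

The largest gain falls on glass fragments, followed by corn stalks, which are the transparent and slender categories targeted by the upsampling change described in the contributions.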

| Models | P (%) | R (%) | mAP50 (%) | mAP50:95 (%) |
|---|---|---|---|---|
| Base model | 94.2 | 89.5 | 91.5 | 68.2 |
| + DySample | 94.8 | 91.6 | 94.8 | 61.4 |
| + RFAConv | 95.1 | 90.3 | 94.2 | 60.2 |
| + Inner-Shape-IoU | 94.5 | 90.8 | 93.9 | 59.6 |
| + DySample + RFAConv | 95.7 | 92.4 | 95.4 | 63.1 |
| Ours | 96.3 | 93.8 | 96.2 | 65.2 |
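Precision and recall from the ablation table can be folded into a single F1 score to compare configurations at a glance. F1 is not reported in the table; it is derived here from the listed P and R values purely as an illustration, and the names used are hypothetical.

```python
def f1(p, r):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r)


# (P %, R %) per configuration, copied from the ablation table.
ablation = {
    "Base model": (94.2, 89.5),
    "+ DySample": (94.8, 91.6),
    "+ RFAConv": (95.1, 90.3),
    "+ Inner-Shape-IoU": (94.5, 90.8),
    "+ DySample + RFAConv": (95.7, 92.4),
    "Ours": (96.3, 93.8),
}

scores = {name: f1(p, r) for name, (p, r) in ablation.items()}
for name, score in scores.items():
    print(f"{name:<22} F1 = {score:.2f}")
```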
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhang, X.; Bian, Y.; Li, X.; Yu, H.; Li, D.; Wu, M. Enhanced Real-Time Detector for Industrial Vision-Based Corn Impurity Detection. Foods 2026, 15, 1065. https://doi.org/10.3390/foods15061065

