YOLO-SW: A Real-Time Weed Detection Model for Soybean Fields Using Swin Transformer and RT-DETR
Abstract
1. Introduction
2. Materials and Methods
2.1. Model Architecture of YOLO-SW
2.1.1. YOLOv8 Baseline
2.1.2. Anti-Interference Block
2.1.3. Detail Capture Upsampling Operator
2.1.4. RealTime-Head
2.1.5. Overview of YOLO-SW
2.2. Dataset Design
2.2.1. Dataset Acquisition
2.2.2. Dataset Construction
2.2.3. Improved Dataset Enhancement
3. Experiment Setup
3.1. Experimental Environment
3.2. Evaluation Metrics
4. Results
4.1. Model Cross-Vertical Comparison Experiments
4.2. Data Enhancement Comparison
4.3. Comparison Experiments of Public Datasets
4.4. Backbone Comparison
4.5. Ablation Experiment
4.6. Effect Comparison Experiment
4.7. Heatmap Visualization
4.8. Model Deployment Testing
5. Discussion
5.1. Dataset Limitations
5.2. Cross-Field Technology Comparison
5.3. The Impact of Hyperparameter Configuration
5.4. Challenges and Solutions for Model Deployment
6. Conclusions
6.1. Research Achievements and Technological Innovation
6.2. Practical Application and Model Value
6.3. Limitations and Future Directions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Weed Species | Precision (%) | Recall (%) | F1-Score (%) |
---|---|---|---|
Galinsoga parviflora | 91.4 | 89.7 | 90.5 |
Alternanthera philoxeroides | 88.6 | 92.3 | 90.4 |
Cerastium glomeratum | 85.2 | 83.6 | 84.4 |
Cardamine hirsuta | 87.3 | 86.1 | 86.7 |
Amaranthus retroflexus | 93.5 | 91.2 | 92.3 |
Pilea peperomioides | 89.8 | 90.5 | 90.1 |
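The F1-scores above follow from the listed precision and recall values as their harmonic mean. A minimal sketch (the helper name is illustrative, not from the paper's code):

```python
# Recompute the per-species F1-scores in the table above from precision
# and recall. F1 is the harmonic mean of the two, reported in percent.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)

# Galinsoga parviflora: P = 91.4, R = 89.7
print(round(f1_score(91.4, 89.7), 1))  # → 90.5, matching the table
```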
Figure 11a: Detection results in a foggy environment

Model | mAP@50 (%) | Small-target recall (%) | False alarm rate (%) |
---|---|---|---|
YOLOv8 | 76.4 | 68.3 | 8.7 |
YOLO-SW | 89.5 * | 85.2 * | 3.2 * |

Figure 11b: Detection results in a low-light environment

Model | mAP@50 (%) | Precision (%) | Inference time (ms) |
---|---|---|---|
YOLOv8 | 80.1 | 77.5 | 14.1 |
YOLO-SW | 91.2 * | 88.6 * | 10.3 * |

Figure 11c: Detection results in dense target scenes

Model | Small-target mAP@50 (%) | Missed detection rate (%) | FPS |
---|---|---|---|
YOLOv8 | 72.3 | 18.7 | 71 |
YOLO-SW | 87.6 * | 5.2 * | 59 * |

Figure 11d: Detection results in a cloudy environment

Model | mAP@50 (%) | Average IoU | Power consumption (W) |
---|---|---|---|
YOLOv8 | 65.4 | 0.68 | 55 |
YOLO-SW | 88.1 * | 0.82 * | 65 * |

Figure 11e: Detection results in a strong-light environment

Model | Texture feature retention rate (%) | Correct detections/total targets |
---|---|---|
YOLOv8 | 58.6 | 42/60 |
YOLO-SW | 79.3 * | 56/60 * |

Figure 11f: Detection results in a rain and snow environment

Model | Anti-interference mAP@50 (%) | Inference stability (fluctuation range) |
---|---|---|
YOLOv8 | 71.2 | 12.5% |
YOLO-SW | 86.7 * | 3.8% * |
References
- Tang, J.; Chen, X.; Miao, R.-H.; Wang, D. Weed detection using image processing under different illumination for site-specific areas spraying. Comput. Electron. Agr. 2016, 122, 103–111. [Google Scholar] [CrossRef]
- Bah, M.D.; Hafiane, A.; Canals, R. Deep Learning with Unsupervised Data Labeling for Weed Detection in Line Crops in UAV Images. Remote Sens. 2018, 10, 1690. [Google Scholar] [CrossRef]
- Tsiafouli, M.A.; Thébault, E.; Sgardelis, S.P.; de Ruiter, P.C.; van der Putten, W.H.; Birkhofer, K.; Hemerik, L.; de Vries, F.T.; Bardgett, R.D.; Brady, M.V.; et al. Intensive Agriculture Reduces Soil Biodiversity across Europe. Glob. Change Biol. 2014, 21, 973–985. [Google Scholar] [CrossRef] [PubMed]
- Mauro, M.; Simone, C.; Salvetti, F.; Angarano, S.; Chiaberge, M. Position-Agnostic Autonomous Navigation in Vineyards with Deep Reinforcement Learning. In Proceedings of the 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), Mexico City, Mexico, 20–24 August 2022. [Google Scholar]
- Olsen, A.; Konovalov, D.A.; Philippa, B.; Ridd, P.; Wood, J.C.; Johns, J.; Banks, W.; Girgenti, B.; Kenny, O.; Whinney, J.; et al. DeepWeeds: A multiclass weed species image dataset for deep learning. Sci. Rep.-Uk 2019, 9, 2058. [Google Scholar] [CrossRef] [PubMed]
- Zhu, H.; Zhang, Y.; Mu, D.; Bai, L.; Zhuang, H.; Li, H. YOLOX-based blue laser weeding robot in corn field. Front Plant Sci. 2022, 13, 1017803. [Google Scholar] [CrossRef] [PubMed]
- Yu, H.; Men, Z.; Bi, C.; Liu, H. Research on field soybean weed identification based on an improved UNet model combined with a channel attention mechanism. Front Plant Sci. 2022, 13, 890051. [Google Scholar] [CrossRef] [PubMed]
- Dos Santos Ferreira, A.; Matte Freitas, D.; da Silva, G.G.; Pistori, H.; Folhes, M.T. Weed detection in soybean crops using ConvNets. Comput. Electron. Agr. 2017, 143, 314–324. [Google Scholar] [CrossRef]
- Sun, T.; Cui, L.; Zong, L.; Zhang, S.; Jiao, Y.; Xue, X.; Jin, Y. Weed Recognition at Soybean Seedling Stage Based on YOLOV8nGP+ NExG Algorithm. Agronomy 2024, 14, 657. [Google Scholar] [CrossRef]
- Xu, Y.; He, R.; Gao, Z.; Li, C.; Zhai, Y.; Jiao, Y. Weed Density Detection Method Based on Absolute Feature Corner Points in Field. Agronomy 2020, 10, 113. [Google Scholar] [CrossRef]
- Jia, Z.; Zhang, M.; Yuan, C.; Liu, Q.; Liu, H.; Qiu, X.; Zhao, W.; Shi, J. ADL-YOLOv8: A Field Crop Weed Detection Model Based on Improved YOLOv8. Agronomy 2024, 14, 2355. [Google Scholar] [CrossRef]
- Ding, Y.; Jiang, C.; Song, L.; Liu, F.; Tao, Y. RVDR-YOLOv8: A Weed Target Detection Model Based on Improved YOLOv8. Electronics 2024, 13, 2182. [Google Scholar] [CrossRef]
- Zhao, K.; Lu, R.; Wang, S.; Yang, X.; Li, Q.; Fan, J. ST-YOLOA: A Swin-transformer-based YOLO model with an attention mechanism for SAR ship detection under complex background. Front. Neurorobotics. 2023, 17. [Google Scholar] [CrossRef] [PubMed]
- Lin, A.; Chen, B.; Xu, J.; Zhang, Z.; Lu, G.; Zhang, D. DS-TransUNet: Dual Swin Transformer U-Net for Medical Image Segmentation. IEEE T Instrum. Meas. 2022, 71, 1–15. [Google Scholar] [CrossRef]
- Ma, L.; Yu, Q.; Yu, H.; Zhang, J. Maize Leaf Disease Identification Based on YOLOv5n Algorithm Incorporating Attention Mechanism. Agronomy 2023, 13, 521. [Google Scholar] [CrossRef]
- Saleem, M.H.; Potgieter, J.; Arif, K.M. Automation in Agriculture by Machine and Deep Learning Techniques: A Review of Recent Developments. Precis. Agric. 2021, 22, 2053–2091. [Google Scholar] [CrossRef]
- Liu, H.; Hou, Y.; Zhang, J.; Zheng, P.; Hou, S. Research on Weed Reverse Detection Methods Based on Improved You Only Look Once (YOLO) v8: Preliminary Results. Agronomy 2024, 14, 1667. [Google Scholar] [CrossRef]
- Guo, B.; Ling, S.; Tan, H.; Wang, S.; Wu, C.; Yang, D. Detection of the Grassland Weed Phlomoides umbrosa Using Multi-Source Imagery and an Improved YOLOv8 Network. Agronomy 2023, 13, 3001. [Google Scholar] [CrossRef]
- Dao, D.-P.; Yang, H.-J.; Ho, N.-H.; Pant, S.; Kim, S.-H.; Lee, G.-S.; Oh, I.-J.; Kang, S.-R. Survival Analysis based on Lung Tumor Segmentation using Global Context-aware Transformer in Multimodality. In Proceedings of the 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada, 21–25 August 2022. [Google Scholar]
- Yaseen, M. What is YOLOv9: An In-Depth Exploration of the Internal Features of the Next-Generation Object Detector. arXiv 2024, arXiv:2409.07813. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. arXiv 2021, arXiv:2103.14030. [Google Scholar] [CrossRef]
- Wang, J.; Chen, K.; Xu, R.; Liu, Z.; Loy, C.C.; Lin, D. Carafe: Content-aware reassembly of features. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October 2019–2 November 2019; pp. 3007–3016. [Google Scholar]
- Zhou, R.; Wan, C. Quantum Image Scaling Based on Bilinear Interpolation with Decimals Scaling Ratio. Int. J. Theor. Phys. 2021, 60, 2115–2144. [Google Scholar] [CrossRef]
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q. DETRs Beat YOLOs on Real-time Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974. [Google Scholar]
- Dai, X.; Chen, Y.; Yang, J.; Zhang, P.; Yuan, L.; Zhang, L. Dynamic DETR: End-to-End Object Detection With Dynamic Attention. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 2988–2997. [Google Scholar]
- Wang, P.; Peteinatos, G.; Efthimiadou, A.; Ma, W. Editorial: Weed identification and integrated control. Front. Plant Sci. 2023, 14, 1351481. [Google Scholar] [CrossRef] [PubMed]
- Liu, S.; Wang, J.; Tao, L.; Li, Z.; Sun, C.; Zhong, X. Farmland Weed Species Identification Based on Computer Vision; Springer: Cham, Switzerland, 2019; pp. 452–461. [Google Scholar]
- Hui, L.; Peng, H. An Improved Sharpening Algorithm for Foggy Picture Based on Dark-Channel Prior; Atlantis Press: Van Godewijckstraat, Dordrecht, 2015. [Google Scholar]
- Wang, P.; Tang, Y.; Luo, F.; Wang, L.; Li, C.; Niu, Q.; Li, H. Weed25: A deep learning dataset for weed identification. Front. Plant Sci. 2022, 13. [Google Scholar] [CrossRef] [PubMed]
- Selvaraju, R.R.; Das, A.; Vedantam, R.; Cogswell, M.; Parikh, D.; Batra, D. Grad-CAM: Why did you say that? arXiv 2016, arXiv:1611.07450. [Google Scholar]
- Shuai, L.; Li, Z.; Chen, Z.; Luo, D.; Mu, J. A research review on deep learning combined with hyperspectral Imaging in multiscale agricultural sensing. Comput. Electron. Agr. 2024, 217, 108577. [Google Scholar] [CrossRef]
Parameters | Value |
---|---|
Input shape | (640, 640, 3) |
Epoch | 200 |
Close mosaic | 10 |
Batch size | 8 |
Workers | 8 |
Optimizer | Adam |
Lr0 | 1 × 10⁻² |
Lr1 | 1 × 10⁻⁴ |
Momentum | 0.937 |
IoU | 0.7 |
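These settings correspond to common YOLOv8 training arguments (the mapping of table rows to Ultralytics-style argument names such as `lr0`/`lrf` and `close_mosaic` is an assumption; the values come from the table):

```python
# Hedged sketch of the training configuration from the table above,
# expressed with Ultralytics-style argument names. The exact training
# script is not given by the paper; only the values are from the table.

train_cfg = {
    "imgsz": 640,        # input shape (640, 640, 3)
    "epochs": 200,
    "close_mosaic": 10,  # disable mosaic augmentation for the final 10 epochs
    "batch": 8,
    "workers": 8,
    "optimizer": "Adam",
    "lr0": 1e-2,         # initial learning rate (table row "Lr0")
    "lrf": 1e-4,         # final learning-rate factor (table row "Lr1", assumed)
    "momentum": 0.937,
    "iou": 0.7,          # IoU threshold used during training/NMS
}
```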
Model | Param/106 | FLOPs (G) | F1-Score | mAP@50 (%) | mAP@75 (%) | mAP@95 (%) | FPS |
---|---|---|---|---|---|---|---|
Faster R-CNN | 43.2 | 96.4 | 66.7 | 58.2 | 45.6 | 28.3 | 55 |
YOLOv5n | 6.5 | 7.8 | 77.5 | 79.8 | 70.2 | 52.4 | 74 |
YOLOv7-tiny | 27.6 | 64.3 | 83.4 | 85.6 | 75.8 | 58.6 | 61 |
RT-DETR | 9.3 | 16.4 | 81.7 | 76.5 | 65.3 | 44.1 | 70 |
YOLOv8n | 6.6 | 8.0 | 85.5 | 88.3 | 80.1 | 62.7 | 70 |
YOLO-SW | 12.6 | 87.7 | 88.1 | 92.3 * | 84.6 | 67.8 | 59 |
Model | Param/106 | FLOPs (G) | F1-Score | mAP@50 (%) | mAP@75 (%) | mAP@95 (%) | FPS |
---|---|---|---|---|---|---|---|
YOLOv8 unaug | 6.6 | 8.0 | 85.5 | 88.3 | 75.2 | 50.1 | 72 |
YOLOv8 aug | 6.8 | 8.2 | 86.7 * | 89.5 * | 76.4 | 51.3 | 71 |
YOLO-SW unaug | 7.4 | 8.7 | 89.6 | 91.1 * | 80.5 | 55.6 | 60 |
YOLO-SW aug | 7.6 | 8.8 | 90.8 * | 92.3 * | 82.8 | 57.7 | 59 |
Model | Param/106 | FLOPs (G) | F1-Score | mAP@50 (%) | mAP@75 (%) | mAP@95 (%) | FPS |
---|---|---|---|---|---|---|---|
YOLOv8 Weed25 | 6.8 | 8.2 | 83.6 | 86.3 | 75.1 | 56.4 | 71 |
YOLO-SW Weed25 | 7.2 | 8.5 | 88.1 | 89.5 | 81.2 | 63.5 | 70 |
Backbone | Param/106 | FLOPs (G) | F1-Score | mAP@50 (%) | mAP@75 (%) | mAP@95 (%) | FPS |
---|---|---|---|---|---|---|---|
YOLOv8 (baseline) | 6.8 | 8.2 | 86.7 | 89.5 | 81.2 | 63.5 | 71 |
YOLOv8 + MobileNetv3 | 5.7 | 2.6 | 83.8 | 84.6 * | 74.5 | 55.6 | 56 |
YOLOv8 + VanillaNet | 20.8 | 72.0 | 86.2 | 87.6 * | 76.5 | 57.8 | 55 |
YOLOv8 + ShuffleNet | 26.9 | 76.5 | 84.1 | 86.4 * | 75.2 | 56.9 | 49 |
YOLOv8 + ResNet | 12.5 | 28.3 | 86.6 | 89.8 | 81.5 | 64.2 | 65 |
YOLOv8 + CSPDarkNet | 9.0 | 18.3 | 86.9 | 90.2 * | 82.1 | 65.3 | 72 |
YOLOv8 + Swin Transformer | 9.9 | 19.9 | 88.0 * | 90.6 * | 82.5 | 65.8 | 63 |
Model | Param/106 | FLOPs (G) | F1-score | mAP@50 (%) | mAP@75 (%) | mAP@95 (%) | FPS |
---|---|---|---|---|---|---|---|
YOLOv8 | 6.8 | 8.2 | 86.7 | 89.5 | 81.2 | 63.5 | 71 |
YOLOv8 + ST | 19.9 | 79.1 | 88.0 | 90.6 * | 82.5 | 65.8 | 63 |
YOLOv8 + CARAFE | 6.8 | 8.2 | 88.2 * | 89.8 * | 81.5 | 64.2 | 71 |
YOLOv8 + RTHead | 9.5 | 16.8 | 88.0 * | 90.8 * | 82.3 | 65.5 | 86 |
YOLOv8 + ST + CARAFE | 9.9 | 8.1 | 88.8 ** | 90.8 ** | 82.6 | 65.9 | 84 |
YOLOv8 + ST + RTHead | 12.6 | 87.7 | 89.9 *** | 91.2 *** | 83.5 | 66.7 | 55 |
YOLO-SW | 12.6 | 87.7 | 90.8 * | 92.3 * | 84.6 | 67.8 | 59 |
Color | Activation Intensity | Semantic Meaning |
---|---|---|
Dark blue | 0–0.2 | Low-concern area (background soil) |
Light blue | 0.2–0.4 | Moderate attention (non-critical leaf area) |
Yellow | 0.4–0.6 | High attention (Leaf edge texture) |
Deep red | 0.6–1.0 | Highest attention (discriminative feature point) |
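The colour bands above partition the Grad-CAM activation range [0, 1] into four bins; a minimal sketch of that mapping (function and band names are illustrative, taken from the table rather than any library):

```python
# Bin a Grad-CAM activation intensity into the colour bands defined in
# the table above: [0, 0.2) dark blue, [0.2, 0.4) light blue,
# [0.4, 0.6) yellow, [0.6, 1.0] deep red.

def activation_band(intensity: float) -> str:
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("activation intensity must lie in [0, 1]")
    if intensity < 0.2:
        return "dark blue"   # low-concern area (background soil)
    if intensity < 0.4:
        return "light blue"  # moderate attention (non-critical leaf area)
    if intensity < 0.6:
        return "yellow"      # high attention (leaf-edge texture)
    return "deep red"        # highest attention (discriminative features)

print(activation_band(0.75))  # → deep red
```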
System Name | Algorithm Architecture | Hardware Equipment | FPS | mAP@50 | F1-Score |
---|---|---|---|---|---|
Weed Identification System (WIS) | CNN | GTX Titan (6 GB) | 25 | 65.4 | 68.3 |
GCN-ResNet101 System | Graph Convolutional Network + ResNet101 | NVIDIA IGX Orin | 46 | 83.2 | 81.7 |
YOLO-SW Detection System | YOLO-SW | NVIDIA Jetson AGX Orin | 59 | 92.3 | 90.8 |
Indicator | YOLOv8 | YOLO-SW | Performance Difference |
---|---|---|---|
FPS | 71 | 59 | −17% |
Power consumption (W) | 55 | 65 | +18% |
GFLOPs | 8.2 | 19.9 | +142% |
VRAM (GB) | 1.3 | 2.1 | +62% |
mAP@50 | 88.3% | 92.3% | +4.5% |
False alarm rate | 3.8% | 2.5% | −1.3% |
Small target recall rate | 78.3% | 89.5% | +11.2% |
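The "Performance Difference" column appears to mix two conventions: for hardware metrics (FPS, power, GFLOPs, VRAM) it is the relative change from YOLOv8 to YOLO-SW, while for accuracy metrics it is the absolute difference in percentage points. A sketch of the relative-change rows (the function is illustrative):

```python
# Reproduce the relative-change entries in the deployment table above,
# i.e. signed percent change from the YOLOv8 baseline to YOLO-SW.

def relative_change(baseline: float, new: float) -> float:
    """Signed percent change from baseline to new."""
    return (new - baseline) / baseline * 100

print(round(relative_change(71, 59)))    # FPS:   → -17
print(round(relative_change(55, 65)))    # power: → 18
print(round(relative_change(1.3, 2.1)))  # VRAM:  → 62
```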
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Shuai, Y.; Shi, J.; Li, Y.; Zhou, S.; Zhang, L.; Mu, J. YOLO-SW: A Real-Time Weed Detection Model for Soybean Fields Using Swin Transformer and RT-DETR. Agronomy 2025, 15, 1712. https://doi.org/10.3390/agronomy15071712