Benchmarking YOLO Models for Crop Growth and Weed Detection in Cotton Fields
Abstract
1. Introduction
- A comprehensive comparison of nineteen YOLO models was conducted under identical experimental settings, ensuring fair evaluation across versions and architectures.
- Model performance was statistically analyzed using standard detection metrics (precision, recall, mAP@0.5, mAP@0.5:0.95) to identify consistent trends and significant differences among models.
- Inference latency was measured uniformly to quantify computational efficiency, followed by a detailed speed–accuracy trade-off analysis highlighting practical deployment considerations.
- Per-class error analysis was performed to examine detection difficulty across cotton growth stages and weed types, providing insights into model strengths and remaining challenges.
2. Related Work
3. Materials and Methods
3.1. Experimental Framework
3.2. Dataset Description
3.3. Implementation Details
3.4. Evaluation Metrics
- Precision ($P$) measures the proportion of correctly identified objects among all predicted detections: $P = \frac{TP}{TP + FP}$, where $TP$ and $FP$ represent true and false positives, respectively. High precision indicates fewer false alarms, which is critical for avoiding incorrect weed treatments in agricultural applications.
- Recall ($R$) measures the proportion of ground-truth objects correctly detected by the model: $R = \frac{TP}{TP + FN}$, where $FN$ denotes false negatives. High recall ensures that most relevant objects (e.g., weeds or cotton plants) are captured.
- Intersection-over-Union (IoU) quantifies localization accuracy between predicted and ground-truth bounding boxes: $\mathrm{IoU} = \frac{|B_p \cap B_{gt}|}{|B_p \cup B_{gt}|}$, where $B_p$ and $B_{gt}$ denote the predicted and ground-truth boxes. A detection is considered correct when IoU exceeds a chosen threshold (commonly 0.5).
- Average Precision (AP) represents the area under the precision–recall curve for a single class: $AP = \int_0^1 P(R)\,dR$.
- Mean Average Precision at 0.5 IoU (mAP@0.5) averages AP across all $C$ classes at a fixed IoU threshold of 0.5: $\mathrm{mAP@0.5} = \frac{1}{C}\sum_{c=1}^{C} AP_c^{\mathrm{IoU}=0.5}$.
- COCO-style Mean Average Precision (mAP@0.5:0.95) computes AP across multiple IoU thresholds (0.5 to 0.95 in 0.05 increments) and averages over all classes: $\mathrm{mAP@0.5{:}0.95} = \frac{1}{C\,|T|}\sum_{c=1}^{C}\sum_{t \in T} AP_c^{\mathrm{IoU}=t}$, where $T = \{0.50, 0.55, \ldots, 0.95\}$ is the set of ten thresholds in this study. This metric provides a stricter and more comprehensive measure of detection robustness, penalizing poor localization.
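
The metric definitions above can be illustrated with a minimal Python sketch. It is not the evaluation code used in this study: the function names are hypothetical, boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, and AP is computed with all-point interpolation of the precision–recall curve, which may differ slightly from the interpolation used by a given detection toolkit.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed corner format

# The ten IoU thresholds 0.50, 0.55, ..., 0.95 used by mAP@0.5:0.95
IOU_THRESHOLDS: List[float] = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def precision(tp: int, fp: int) -> float:
    """P = TP / (TP + FP); returns 0.0 when there are no detections."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """R = TP / (TP + FN); returns 0.0 when there are no ground-truth objects."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(recalls: Sequence[float], precisions: Sequence[float]) -> float:
    """Area under the PR curve (recalls sorted ascending), all-point interpolation."""
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_ap(ap_per_class: Sequence[float]) -> float:
    """mAP at one IoU threshold: the mean of per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class)


def mean_ap_50_95(ap_table: Sequence[Sequence[float]]) -> float:
    """ap_table[c][k] is the AP of class c at IOU_THRESHOLDS[k]."""
    return mean_ap([sum(row) / len(row) for row in ap_table])
```

For example, two unit-overlap boxes `(0, 0, 2, 2)` and `(1, 1, 3, 3)` share an intersection of area 1 and a union of area 7, so their IoU is 1/7 and the pair would not count as a match at the 0.5 threshold.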
4. Results
4.1. Model Performance and Statistical Evaluation
4.2. Inference Latency Analysis
4.3. Speed vs. Accuracy Trade-Off
4.4. Per-Class Error Insight
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest

| Class | Train (3108 img) | Val (888 img) | Test (444 img) | Total Objects |
|---|---|---|---|---|
| Broadleaf Weed | 1680 | 362 | 367 | 2409 |
| Grass Weed | 1425 | 331 | 339 | 2095 |
| Cotton Stage 1 | 3210 | 676 | 688 | 4574 |
| Cotton Stage 2 | 3012 | 652 | 663 | 4327 |
| Cotton Stage 3 | 2747 | 598 | 611 | 3956 |
| Total Objects | 12,074 | 2619 | 2668 | 17,361 |
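
The split proportions and class balance implied by the dataset table can be checked with a few lines of arithmetic. The sketch below only re-uses the counts printed in the table; the variable names are illustrative.

```python
# Image counts per split and object counts per class (train, val, test),
# copied from the dataset table above.
images = {"train": 3108, "val": 888, "test": 444}
objects = {
    "Broadleaf Weed": (1680, 362, 367),
    "Grass Weed": (1425, 331, 339),
    "Cotton Stage 1": (3210, 676, 688),
    "Cotton Stage 2": (3012, 652, 663),
    "Cotton Stage 3": (2747, 598, 611),
}

total_images = sum(images.values())
split_share = {name: n / total_images for name, n in images.items()}

class_totals = {name: sum(counts) for name, counts in objects.items()}
grand_total = sum(class_totals.values())
class_share = {name: n / grand_total for name, n in class_totals.items()}

# Ratio between the most and least frequent class, a simple imbalance indicator.
imbalance_ratio = max(class_totals.values()) / min(class_totals.values())

for name, share in split_share.items():
    print(f"{name:5s} {share:.0%} of {total_images} images")
for name, share in class_share.items():
    print(f"{name:15s} {class_totals[name]:5d} objects ({share:.1%})")
print(f"max/min class ratio: {imbalance_ratio:.2f}")
```

The counts correspond to a 70/20/10 split of 4440 images, and the two weed classes together account for roughly a quarter of all annotated objects.
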

| Parameter | Description |
|---|---|
| Hardware | NVIDIA GeForce RTX 4060 Ti (8 GB VRAM), Intel Core i7 CPU, 16 GB RAM |
| Software | Windows 11, Python 3.10.12, PyTorch 2.3.0 + TorchVision 0.18.0 (CUDA 12.1) |
| Input size | 640 × 640 pixels (RGB) |
| Optimizer | AdamW (optimizer=auto) |
| Learning rate (start) | 0.001111 |
| Betas/Momentum | $\beta_1 = 0.9$, $\beta_2 = 0.999$ (Ultralytics default; momentum = 0.9) |
| Weight decay | 0.0005 |
| Warm-up schedule | 3.0 epochs with momentum ramp from 0.8 to 0.937 and bias learning rate = 0.1 |
| Epochs | 100 |
| Batch/image size | 64 @ 640 × 640 (adjusted per model to avoid memory overflow) |
| Automatic Mixed Precision (AMP) | Enabled (amp=True) for faster and memory-efficient training |
| Augmentations | Random horizontal flip, HSV color jitter, scaling, and mosaic (Ultralytics default pipeline) |
| Initialization | COCO-pretrained weights provided by Ultralytics |
| Evaluation metrics | Precision, recall, mAP@0.5, and mAP@0.5:0.95 (COCO standards) |
| Inference measurement | Average latency (ms/image) with batch size = 1 |
| Model export | PyTorch checkpoint (.pt) saved for each best-performing model |
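
The settings in the table above map onto keyword arguments of the Ultralytics training API roughly as sketched below. This is an illustration under stated assumptions, not the authors' script: the weights file `yolov8n.pt` and the dataset file `cotton.yaml` are hypothetical placeholders, and `auto_lr` reproduces what is, to the best of our understanding, the heuristic Ultralytics applies when `optimizer="auto"` selects AdamW. For the five classes in this dataset it yields the reported starting learning rate of 0.001111.

```python
def auto_lr(num_classes: int) -> float:
    """Starting learning rate assumed to be chosen by optimizer='auto' (AdamW branch)."""
    return round(0.002 * 5 / (4 + num_classes), 6)


# Keyword arguments mirroring the training table; all are standard
# Ultralytics train() settings.
TRAIN_KWARGS = dict(
    epochs=100,
    imgsz=640,
    batch=64,              # reduced for larger models to fit in 8 GB VRAM
    optimizer="auto",
    weight_decay=0.0005,
    warmup_epochs=3.0,
    warmup_momentum=0.8,
    warmup_bias_lr=0.1,
    amp=True,
    pretrained=True,       # start from COCO-pretrained weights
)


def train(weights: str = "yolov8n.pt", data_yaml: str = "cotton.yaml", batch: int = 64):
    """Launch one training run; both default file names are placeholders."""
    from ultralytics import YOLO  # imported lazily so the sketch loads without it

    model = YOLO(weights)
    return model.train(data=data_yaml, **{**TRAIN_KWARGS, "batch": batch})


if __name__ == "__main__":
    print(auto_lr(5))
```

Calling `train("yolov8x.pt", batch=16)`, for instance, would reuse the same configuration with a smaller batch, which is how a single setup can be applied uniformly across model sizes.
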

| YOLO Variant(s) | Batch Size |
|---|---|
| YOLO10x, YOLO9e | 8 |
| YOLO11l, YOLO11x, YOLO10l, YOLO9c, YOLO3u, YOLO8x, YOLO8l, YOLO8m, YOLO8s | 16 |
| YOLO11s, YOLO10s, YOLO9m, YOLO9s, YOLO9t | 32 |
| YOLO11n, YOLO8n | 64 |
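
Because the batch size differs by model, the number of optimizer steps per epoch differs as well. The short sketch below works this out for the 3108 training images, assuming the final partial batch of each epoch is kept rather than dropped (an assumption; the source does not say).

```python
import math

TRAIN_IMAGES = 3108  # training split size from the dataset table
EPOCHS = 100         # from the training configuration table


def steps_per_epoch(batch_size: int, n_images: int = TRAIN_IMAGES) -> int:
    """Number of batches per epoch when the last partial batch is kept."""
    return math.ceil(n_images / batch_size)


# Batch sizes listed in the table above.
steps = {b: steps_per_epoch(b) for b in (8, 16, 32, 64)}

for batch_size, n_steps in steps.items():
    print(f"batch {batch_size:2d}: {n_steps:3d} steps/epoch, "
          f"{n_steps * EPOCHS} steps over {EPOCHS} epochs")
```

Under this assumption, a model trained with batch size 8 takes 389 steps per epoch, about eight times as many as the 49 steps taken with batch size 64.
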

| Model Variant | Precision | Recall | mAP@0.5 | mAP@0.5:0.95 |
|---|---|---|---|---|
| YOLOv3n | 0.794 ± 0.004 (0.788–0.800) | 0.740 ± 0.005 (0.733–0.747) | 0.785 ± 0.006 (0.777–0.793) | 0.548 ± 0.007 (0.539–0.557) |
| YOLOv3s | 0.777 ± 0.003 (0.773–0.781) | 0.759 ± 0.004 (0.754–0.764) | 0.798 ± 0.004 (0.793–0.803) | 0.574 ± 0.005 (0.568–0.580) |
| YOLOv3m | 0.780 ± 0.003 (0.776–0.784) | 0.767 ± 0.003 (0.763–0.771) | 0.801 ± 0.004 (0.796–0.806) | 0.579 ± 0.004 (0.574–0.584) |
| YOLOv3l | 0.780 ± 0.002 (0.777–0.783) | 0.771 ± 0.003 (0.767–0.775) | 0.804 ± 0.003 (0.800–0.808) | 0.585 ± 0.003 (0.581–0.589) |
| YOLOv3x | 0.782 ± 0.002 (0.779–0.785) | 0.773 ± 0.002 (0.770–0.776) | 0.807 ± 0.003 (0.803–0.811) | 0.587 ± 0.003 (0.583–0.591) |
| YOLOv8s | 0.783 ± 0.002 (0.780–0.786) | 0.772 ± 0.003 (0.768–0.776) | 0.808 ± 0.003 (0.804–0.812) | 0.589 ± 0.003 (0.585–0.593) |
| YOLOv8m | 0.817 ± 0.002 (0.814–0.820) | 0.765 ± 0.003 (0.761–0.769) | 0.812 ± 0.003 (0.808–0.816) | 0.588 ± 0.004 (0.583–0.593) |
| YOLOv8l | 0.775 ± 0.003 (0.771–0.779) | 0.771 ± 0.002 (0.768–0.774) | 0.811 ± 0.004 (0.806–0.816) | 0.590 ± 0.003 (0.586–0.594) |
| YOLOv8x | 0.802 ± 0.002 (0.799–0.805) | 0.748 ± 0.003 (0.744–0.752) | 0.811 ± 0.003 (0.807–0.815) | 0.597 ± 0.003 (0.593–0.601) |
| YOLOv9t | 0.791 ± 0.003 (0.787–0.795) | 0.769 ± 0.003 (0.765–0.773) | 0.811 ± 0.004 (0.806–0.816) | 0.589 ± 0.004 (0.584–0.594) |
| YOLOv9s | 0.781 ± 0.003 (0.777–0.785) | 0.764 ± 0.002 (0.761–0.767) | 0.799 ± 0.003 (0.795–0.803) | 0.570 ± 0.004 (0.565–0.575) |
| YOLOv9m | 0.802 ± 0.002 (0.799–0.805) | 0.771 ± 0.002 (0.768–0.774) | 0.804 ± 0.003 (0.800–0.808) | 0.580 ± 0.003 (0.576–0.584) |
| YOLOv10n | 0.784 ± 0.002 (0.781–0.787) | 0.770 ± 0.002 (0.767–0.773) | 0.806 ± 0.003 (0.802–0.810) | 0.582 ± 0.003 (0.578–0.586) |
| YOLOv10s | 0.785 ± 0.002 (0.782–0.788) | 0.772 ± 0.003 (0.768–0.776) | 0.807 ± 0.003 (0.803–0.811) | 0.584 ± 0.003 (0.580–0.588) |
| YOLOv10m | 0.786 ± 0.002 (0.783–0.789) | 0.773 ± 0.002 (0.770–0.776) | 0.808 ± 0.003 (0.804–0.812) | 0.585 ± 0.003 (0.581–0.589) |
| YOLOv10l | 0.778 ± 0.002 (0.775–0.781) | 0.763 ± 0.002 (0.760–0.766) | 0.800 ± 0.003 (0.796–0.804) | 0.572 ± 0.003 (0.568–0.576) |
| YOLOv11n | 0.781 ± 0.002 (0.778–0.784) | 0.794 ± 0.003 (0.790–0.798) | 0.809 ± 0.003 (0.805–0.813) | 0.577 ± 0.003 (0.573–0.581) |
| YOLOv11s | 0.782 ± 0.002 (0.779–0.785) | 0.774 ± 0.002 (0.771–0.777) | 0.808 ± 0.003 (0.804–0.812) | 0.591 ± 0.003 (0.587–0.595) |
| YOLOv11l | 0.779 ± 0.002 (0.776–0.782) | 0.790 ± 0.002 (0.787–0.793) | 0.815 ± 0.003 (0.811–0.819) | 0.592 ± 0.003 (0.588–0.596) |
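
Entries of the form "value ± spread (lower–upper)" are typically obtained by summarizing repeated runs. The sketch below shows one common way to produce such a summary: the sample mean, the sample standard deviation, and a two-sided 95% confidence interval based on Student's t distribution. The source does not state how its intervals were computed or how many runs were used, so the five run values and the choice of interval here are purely hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% critical values of Student's t for small samples,
# keyed by degrees of freedom (n - 1). Standard table values.
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 9: 2.262}


def summarize(values: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (mean, sample SD, CI lower, CI upper) for repeated measurements."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD with n - 1 in the denominator
    half_width = T_CRIT_95[n - 1] * sd / math.sqrt(n)
    return mean, sd, mean - half_width, mean + half_width


# Hypothetical mAP@0.5 values from five repeated runs of one model.
runs = [0.812, 0.809, 0.815, 0.811, 0.813]
mean, sd, lo, hi = summarize(runs)
print(f"{mean:.3f} ± {sd:.3f} ({lo:.3f}–{hi:.3f})")
```
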

| Model Variant | Parameters (M) | FLOPs (GFLOPs) | Latency (ms/Image) |
|---|---|---|---|
| YOLOv3n | 3.01 | 8.2 | 62.4 ± 2.1 (60.1–64.7) |
| YOLOv3s | 11.14 | 28.7 | 27.1 ± 1.2 (25.8–28.4) |
| YOLOv3m | 25.86 | 79.1 | 33.5 ± 1.5 (31.9–35.1) |
| YOLOv3l | 43.63 | 165.4 | 42.6 ± 1.7 (40.7–44.5) |
| YOLOv3x | 68.16 | 258.1 | 54.9 ± 2.3 (52.4–57.4) |
| YOLOv8s | 11.14 | 28.7 | 67.3 ± 3.0 (63.9–70.7) |
| YOLOv8m | 25.86 | 79.1 | 28.8 ± 1.1 (27.6–30.0) |
| YOLOv8l | 43.63 | 165.4 | 36.7 ± 1.3 (35.3–38.1) |
| YOLOv8x | 68.16 | 258.1 | 88.0 ± 4.5 (83.4–92.6) |
| YOLOv9t | 2.01 | 7.9 | 72.4 ± 3.6 (68.6–76.2) |
| YOLOv9s | 7.29 | 27.4 | 29.4 ± 1.2 (28.1–30.7) |
| YOLOv9m | 20.16 | 77.6 | 37.1 ± 1.4 (35.6–38.6) |
| YOLOv9e | 58.15 | 192.7 | 46.8 ± 1.8 (44.8–48.8) |
| YOLOv10n | 2.71 | 8.4 | 59.2 ± 2.2 (56.8–61.6) |
| YOLOv10s | 8.07 | 24.8 | 73.5 ± 3.1 (70.1–76.9) |
| YOLOv10m | 25.86 | 79.1 | 30.1 ± 1.3 (28.7–31.5) |
| YOLOv10l | 25.77 | 127.2 | 38.0 ± 1.5 (36.4–39.6) |
| YOLOv10x | 31.66 | 171.0 | 52.7 ± 2.0 (50.4–55.0) |
| YOLOv11n | 2.59 | 6.4 | 90.2 ± 4.8 (85.4–95.0) |
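
One way to read the speed–accuracy trade-off from the two result tables is to extract the Pareto front: the models for which no other model is both faster and more accurate. The sketch below does this for a subset of models that appear in both tables, pairing the mean latency with the mean mAP@0.5. The helper name `pareto_front` is illustrative, and the uncertainty ranges are ignored for simplicity.

```python
from typing import Dict, List, Tuple

# (mean latency in ms/image, mean mAP@0.5), copied from the tables above.
MODELS: Dict[str, Tuple[float, float]] = {
    "YOLOv3s": (27.1, 0.798),
    "YOLOv8m": (28.8, 0.812),
    "YOLOv9s": (29.4, 0.799),
    "YOLOv10m": (30.1, 0.808),
    "YOLOv8l": (36.7, 0.811),
    "YOLOv10n": (59.2, 0.806),
    "YOLOv9t": (72.4, 0.811),
    "YOLOv11n": (90.2, 0.809),
}


def pareto_front(models: Dict[str, Tuple[float, float]]) -> List[str]:
    """Models not dominated in (lower latency, higher mAP), fastest first."""
    front: List[str] = []
    best_map = float("-inf")
    # Scan from fastest to slowest; ties in latency prefer the higher mAP.
    ordered = sorted(models.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    for name, (_latency, map50) in ordered:
        if map50 > best_map:  # strictly more accurate than every faster model
            front.append(name)
            best_map = map50
    return front


print(pareto_front(MODELS))
```

For this subset the front contains only YOLOv3s and YOLOv8m: every other listed model is slower than YOLOv8m without exceeding its mAP@0.5.
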
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Raza, H.; Abu Bakr, M.; Khan, S.D.; Batool, H.; Ullah, H.; Ullah, M. Benchmarking YOLO Models for Crop Growth and Weed Detection in Cotton Fields. AgriEngineering 2025, 7, 375. https://doi.org/10.3390/agriengineering7110375

