LMS-Res-YOLO: Lightweight and Multi-Scale Cucumber Detection Model with Residual Blocks
Abstract
1. Introduction
- We propose a plug-and-play multi-branch convolutional residual module that can be applied to the backbone, neck, and detection heads of YOLO-series models to improve detection accuracy and recall, especially for small objects.
- We investigated lightweight module design for the detection model and further reduced its parameters, FLOPs, and size while maintaining performance.
- We created an image dataset of greenhouse cucumbers to support research on automatic picking and yield analysis.
2. Related Work
3. Methodology
3.1. Cucumber Dataset
3.2. Benchmark Model YOLOv8_n
3.3. Our Improved Model
3.3.1. HEU Module
3.3.2. DE-HEAD
3.3.3. KWConv
4. Experiments
4.1. Evaluation Metrics
4.2. Ablation Experiments
4.3. Comparison Experiment with Advanced Models
4.4. Visualization Comparison Experiments
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| YOLO | You Only Look Once |
| PAFPN | Path Aggregation Feature Pyramid Network |
| HEU | High-Efficiency Unit |
| Res_HEB | High-Efficiency Block with Residual blocks |
| DE-HEAD | Decoupled and Efficient detection HEAD |
| KWConv | KernelWarehouse convolution |
| FLOPs | Floating-point operations per inference |
| mAP | mean Average Precision |
| FOV | field of view |
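The metric columns used throughout the result tables follow common detection conventions. As a minimal sketch (illustrative only, not the paper's evaluation code), mAP@0.5:0.95 averages AP over the ten COCO IoU thresholds 0.50, 0.55, ..., 0.95, and F1 is the harmonic mean of precision (P) and recall (R):

```python
# Sketch of the reported metrics (illustrative, not the paper's code).

def map_50_95(ap_per_iou):
    """Average AP over the 10 COCO IoU thresholds 0.50:0.05:0.95."""
    assert len(ap_per_iou) == 10
    return sum(ap_per_iou) / len(ap_per_iou)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Example: the LMS-Res-YOLO cucumber-dataset row reports P = 0.969,
# R = 0.949, which reproduces the listed F1 of 0.959.
print(round(f1_score(0.969, 0.949), 3))  # 0.959
```

This is why mAP@0.5:0.95 is always lower than mAP@0.5 in the tables: it folds in the stricter IoU thresholds up to 0.95.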
Appendix A
| Detection Model | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 |
|---|---|---|---|---|---|---|---|
| YOLOv8_n | 0.807 | 0.604 | 3.01 | 8.1 | 0.819 | 0.718 | 0.765 |
| YOLOv10_n | 0.805 | 0.607 | 2.30 | 6.7 | 0.815 | 0.723 | 0.767 |
| LMS-Res-YOLO (ours) | 0.802 | 0.616 | 2.43 | 5.4 | 0.800 | 0.726 | 0.761 |
| Detection Model | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 |
|---|---|---|---|---|---|---|---|
| YOLOv8_n | 0.517 | 0.369 | 3.01 | 8.7 | 0.631 | 0.476 | 0.543 |
| YOLOv10_n | 0.536 | 0.388 | 2.30 | 6.7 | 0.643 | 0.492 | 0.558 |
| LMS-Res-YOLO (ours) | 0.540 | 0.394 | 2.43 | 5.4 | 0.652 | 0.492 | 0.563 |
| Detection Model | mAP@0.5:0.95 (Small) | mAP@0.5:0.95 (Medium) | mAP@0.5:0.95 (Large) |
|---|---|---|---|
| YOLOv8_n | 0.182 | 0.405 | 0.527 |
| YOLOv10_n | 0.184 | 0.424 | 0.558 |
| LMS-Res-YOLO (ours) | 0.195 | 0.434 | 0.570 |
| Detection Model | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 |
|---|---|---|---|---|---|---|---|
| YOLOv8_n | 0.341 | 0.195 | 3.01 | 8.1 | 0.455 | 0.343 | 0.391 |
| YOLOv10_n | 0.341 | 0.199 | 2.30 | 6.7 | 0.477 | 0.333 | 0.396 |
| LMS-Res-YOLO (ours) | 0.343 | 0.200 | 2.43 | 5.4 | 0.479 | 0.339 | 0.399 |
References
- Bao, G.; Cai, S.; Qi, L.; Xun, Y.; Zhang, L.; Yang, Q. Multi-template matching algorithm for cucumber recognition in natural environment. Comput. Electron. Agric. 2016, 127, 754–762. [Google Scholar] [CrossRef]
- Wang, J.; Li, B.; Li, Z.; Zubrycki, I.; Granosik, G. Grasping behavior of the human hand during tomato picking. Comput. Electron. Agric. 2021, 180, 105901. [Google Scholar] [CrossRef]
- Zheng, C.; Chen, P.; Pang, J.; Yang, X.; Chen, C.; Tu, S.; Xue, Y. A mango picking vision algorithm on instance segmentation and key point detection from RGB images in an open orchard. Biosyst. Eng. 2021, 206, 32–54. [Google Scholar] [CrossRef]
- Omer, S.M.; Ghafoor, K.Z.; Askar, S.K. Lightweight improved YOLOv5 model for cucumber leaf disease and pest detection based on deep learning. Signal Image Video Process. 2024, 18, 1329–1342. [Google Scholar] [CrossRef]
- Sun, Y.; Zhang, J.; Wang, H.; Wang, L.; Li, H. Identifying optimal water and nitrogen inputs for high efficiency and low environment impacts of a greenhouse summer cucumber with a model method. Agric. Water Manag. 2019, 212, 23–34. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014. [Google Scholar]
- Shi, S.; Guo, C.; Jiang, L.; Wang, Z.; Shi, J.; Wang, X.; Li, H. PV-RCNN: Point-voxel feature set abstraction for 3D object detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10529–10538. [Google Scholar]
- Sun, P.; Zhang, R.; Jiang, Y.; Kong, T.; Xu, C.; Zhan, W.; Luo, P. Sparse R-CNN: End-to-end object detection with learnable proposals. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 14454–14463. [Google Scholar]
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162. [Google Scholar]
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2961–2969. [Google Scholar]
- Ultralytics YOLO. 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 15 October 2025).
- Liu, C.; Zhu, H.; Guo, W.; Han, X.; Chen, C.; Wu, H. EFDet: An efficient detection method for cucumber disease under natural complex environments. Comput. Electron. Agric. 2021, 189, 106378. [Google Scholar] [CrossRef]
- Li, S.; Li, K.; Qiao, Y.; Zhang, L. A multi-scale cucumber disease detection method in natural scenes based on YOLOv5. Comput. Electron. Agric. 2022, 202, 107363. [Google Scholar] [CrossRef]
- Ultralytics YOLOv5. 2020. Available online: https://github.com/ultralytics/yolov5 (accessed on 15 October 2025).
- Khan, M.A.; Akram, T.; Sharif, M.; Javed, K.; Raza, M.; Saba, T. An automated system for cucumber leaf diseased spot detection and classification using improved saliency method and deep features selection. Multimed. Tools Appl. 2020, 79, 18627–18656. [Google Scholar] [CrossRef]
- Chen, M.; Lang, X.; Zhai, X.; Li, T.; Shi, Y. Intelligent recognition of greenhouse cucumber canopy vine top with deep learning model. Comput. Electron. Agric. 2023, 213, 108219. [Google Scholar] [CrossRef]
- Yang, W.; Wu, J.; Zhang, J.; Gao, K.; Du, R.; Wu, Z.; Firkat, E.; Li, D. Deformable convolution and coordinate attention for fast cattle detection. Comput. Electron. Agric. 2023, 211, 108006. [Google Scholar] [CrossRef]
- Zhong, Z.; Yun, L.; Cheng, F.; Chen, Z.; Zhang, C. Light-YOLO: A lightweight and efficient YOLO-based deep learning model for mango detection. Agriculture 2024, 14, 140. [Google Scholar] [CrossRef]
- Xu, X.; Xie, Z.; Wang, N.; Yan, P.; Shao, C.; Shu, X.; Zhang, J. PMDS-YOLO: A lightweight multi-scale detector for efficient aquatic product detection. Agriculture 2025, 612, 743210. [Google Scholar] [CrossRef]
- Gui, Z.; Peng, D.; Wu, H.; Long, X. MSGC: Multi-scale grid clustering by fusing analytical granularity and visual cognition for detecting hierarchical spatial patterns. Future Gener. Comput. Syst. 2020, 112, 1038–1056. [Google Scholar] [CrossRef]
- Khan, S.I.; Shahrior, A.; Karim, R.; Hasan, M.; Rahman, A. MultiNet: A deep neural network approach for detecting breast cancer through multi-scale feature fusion. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6217–6228. [Google Scholar] [CrossRef]
- Karnati, M.; Seal, A.; Sahu, G.; Yazidi, A.; Krejcar, O. A novel multi-scale based deep convolutional neural network for detecting COVID-19 from X-rays. Appl. Soft Comput. 2022, 125, 109109. [Google Scholar] [CrossRef] [PubMed]
- Hu, X.; Li, X.; Huang, Z.; Chen, Q.; Lin, S. Detecting tea tree pests in complex backgrounds using a hybrid architecture guided by transformers and multi-scale attention mechanism. J. Sci. Food Agric. 2024, 104, 3570–3584. [Google Scholar] [CrossRef]
- Chen, Y.; Yuan, X.; Wang, J.; Wu, R.; Li, X.; Hou, Q.; Cheng, M.M. YOLO-MS: Rethinking multi-scale representation learning for real-time object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 47, 4240–4252. [Google Scholar] [CrossRef] [PubMed]
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768. [Google Scholar]
- Li, C.; Yao, A. KernelWarehouse: Towards parameter-efficient dynamic convolution. arXiv 2023, arXiv:2308.08361. [Google Scholar]
- Guo, Q.; Geng, L.; Xiao, Z.; Zhang, F.; Liu, Y. Mifanet: Multi-scale information fusion attention network for determining hatching eggs activity via detecting PPG signals. Neural Comput. Appl. 2023, 35, 22637–22649. [Google Scholar] [CrossRef]
- Yang, J.; Li, A.; Xiao, S.; Lu, W.; Gao, X. MTD-Net: Learning to detect deepfakes images by multi-scale texture difference. IEEE Trans. Inf. Forensics Secur. 2021, 16, 4234–4245. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Li, C.; Li, L.; Jiang, H.; Weng, K.; Geng, Y.; Li, L.; Ke, Z.; Li, Q.; Cheng, M.; Nie, W.; et al. YOLOv6: A single-stage object detection framework for industrial applications. arXiv 2022, arXiv:2209.02976. [Google Scholar] [CrossRef]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475. [Google Scholar]
- Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO series in 2021. arXiv 2021, arXiv:2107.08430. [Google Scholar] [CrossRef]
- Wang, A.; Chen, H.; Liu, L.; Chen, K.; Lin, Z.; Han, J. YOLOv10: Real-time end-to-end object detection. In Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; pp. 107984–108011. [Google Scholar]
- Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The PASCAL Visual Object Classes (VOC) challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the 2014 European Conference on Computer Vision, Zurich, Switzerland, 6–12 September 2014; pp. 740–755. [Google Scholar]
- Zhu, P.; Wen, L.; Du, D.; Bian, X.; Fan, H.; Hu, Q.; Ling, H. Detection and tracking meet drones challenge. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 7380–7399. [Google Scholar] [CrossRef] [PubMed]
| Index | Parameters C, [k1, k2, k3] |
|---|---|
| 1 | 16, [7, 3, 3] |
| 2 | 32, [9, 3, 3] |
| 3 | 32, [9, 3, 3] |
| 4 | 64, [11, 3, 3] |
| 5 | 64, [11, 3, 3] |
| 6 | 128, [11, 3, 3] |
| 7 | 64, [11, 3, 3] |
| 8 | 32, [11, 3, 3] |
| 9 | 64, [11, 3, 3] |
| 10 | 128, [11, 3, 3] |
| 11 | 64, [7, 3, 3] |
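The internal layout of the HEU/Res_HEB blocks is defined in the paper's figures and is not reproduced here. Purely as a hypothetical illustration of why the large kernels in the configuration table above (k1 up to 11) can stay cheap, the sketch below compares the weight count of a standard k×k convolution with a depthwise k×k convolution plus 1×1 pointwise fusion; the depthwise-separable split is an assumption for illustration, not necessarily the authors' design:

```python
# Hypothetical parameter-count comparison (not the paper's exact block):
# a standard CxC kxk convolution versus a depthwise kxk convolution
# followed by a 1x1 pointwise fusion.

def standard_conv_params(c, k):
    """Weights of a standard conv with c input and c output channels."""
    return c * c * k * k

def depthwise_separable_params(c, k):
    """Depthwise kxk (c * k * k) plus pointwise 1x1 fusion (c * c)."""
    return c * k * k + c * c

# Stage 6 of the configuration table: C = 128, large kernel k1 = 11.
c, k = 128, 11
print(standard_conv_params(c, k))        # 1982464
print(depthwise_separable_params(c, k))  # 31872
```

Under this assumed layout the large-kernel branch costs roughly 1.6% of a standard convolution, which is consistent with the paper's goal of multi-scale receptive fields at low parameter cost.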
| Experimental Environment | Configuration |
|---|---|
| CPU | AMD Ryzen 9 5900X @ 3.70 GHz |
| GPU | NVIDIA GeForce RTX 3090 (24 GB) |
| Memory | 64 GB |
| Operating system | Windows 10, 64-bit |
| Python version | Python 3.8 |
| PyTorch version | PyTorch 1.8.0 (torchvision 0.9.0) |
| CUDA/cuDNN | 11.1/8.0.5 |
| HEU | KWConv | DE-HEAD | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 | Model Size (MB) |
|---|---|---|---|---|---|---|---|---|---|---|
|  |  |  | 0.972 | 0.855 | 3.01 | 8.1 | 0.966 | 0.938 | 0.952 | 5.98 |
| ✓ |  |  | 0.975 | 0.865 | 2.74 | 7.8 | 0.962 | 0.946 | 0.954 | 6.03 |
|  | ✓ |  | 0.975 | 0.859 | 3.02 | 6.9 | 0.968 | 0.940 | 0.954 | 6.04 |
|  |  | ✓ | 0.974 | 0.863 | 2.67 | 6.9 | 0.958 | 0.944 | 0.951 | 5.5 |
| ✓ | ✓ |  | 0.980 | 0.874 | 2.76 | 6.6 | 0.968 | 0.953 | 0.960 | 6.12 |
| ✓ |  | ✓ | 0.980 | 0.877 | 2.41 | 6.6 | 0.972 | 0.951 | 0.961 | 5.55 |
|  | ✓ | ✓ | 0.974 | 0.866 | 2.69 | 5.6 | 0.962 | 0.938 | 0.950 | 5.57 |
| ✓ | ✓ | ✓ | 0.979 | 0.878 | 2.43 | 5.4 | 0.969 | 0.949 | 0.959 | 5.65 |
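Taking the ablation numbers reported above (baseline YOLOv8_n: 3.01 M parameters, 8.1 GFLOPs; full model with all three modules: 2.43 M, 5.4 G), the relative savings can be checked directly:

```python
# Relative reduction of the full model versus the YOLOv8_n baseline,
# computed from the ablation-table values.

def reduction(baseline, ours):
    """Fractional reduction of `ours` relative to `baseline`."""
    return (baseline - ours) / baseline

params_cut = reduction(3.01, 2.43)  # parameters: 3.01 M -> 2.43 M
flops_cut = reduction(8.1, 5.4)     # FLOPs: 8.1 G -> 5.4 G
print(f"params: -{params_cut:.1%}, FLOPs: -{flops_cut:.1%}")
# params: -19.3%, FLOPs: -33.3%
```

The per-module contributions are nearly additive: HEU mainly trims parameters, while KWConv and DE-HEAD each remove about 1.2 GFLOPs.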
| Detection Model | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 | Model Size (MB) |
|---|---|---|---|---|---|---|---|---|
| YOLOv5_s | 0.983 | 0.858 | 7.01 | 15.80 | 0.970 | 0.971 | 0.970 | 13.76 |
| YOLOv6_s | 0.975 | 0.861 | 18.50 | 45.17 | 0.969 | 0.947 | 0.958 | 38.74 |
| YOLOv6_m | 0.973 | 0.864 | 34.80 | 85.62 | 0.970 | 0.928 | 0.948 | 72.50 |
| YOLOv7_tiny | 0.978 | 0.843 | 6.01 | 13.20 | 0.970 | 0.958 | 0.964 | 11.71 |
| YOLOX_tiny | 0.951 | 0.732 | 5.03 | 15.23 | 0.935 | 0.914 | 0.924 | 38.70 |
| YOLOX_s | 0.977 | 0.847 | 8.94 | 26.76 | 0.954 | 0.942 | 0.948 | 68.51 |
| YOLOX_m | 0.980 | 0.873 | 25.28 | 73.73 | 0.963 | 0.953 | 0.958 | 193.35 |
| YOLOv8_s | 0.981 | 0.889 | 11.13 | 28.40 | 0.967 | 0.962 | 0.964 | 21.50 |
| YOLOv10_s | 0.981 | 0.879 | 8.04 | 21.6 | 0.966 | 0.957 | 0.961 | 15.79 |
| LMS-Res-YOLO (ours) | 0.979 | 0.878 | 2.43 | 5.4 | 0.969 | 0.949 | 0.959 | 5.65 |
| Detection Model | mAP@0.5 | mAP@0.5:0.95 | Params (M) | FLOPs (G) | P | R | F1 | Model Size (MB) |
|---|---|---|---|---|---|---|---|---|
| YOLOv5_n | 0.971 | 0.804 | 1.76 | 4.20 | 0.951 | 0.936 | 0.943 | 3.72 |
| YOLOv6_n | 0.970 | 0.842 | 4.63 | 11.34 | 0.966 | 0.940 | 0.953 | 9.95 |
| YOLOv8_n | 0.972 | 0.855 | 3.01 | 8.1 | 0.966 | 0.938 | 0.952 | 5.98 |
| YOLOv10_n | 0.974 | 0.843 | 2.28 | 6.7 | 0.966 | 0.926 | 0.946 | 5.51 |
| LMS-Res-YOLO (ours) | 0.979 | 0.878 | 2.43 | 5.4 | 0.969 | 0.949 | 0.959 | 5.65 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Li, B.; Zhong, G.; Ke, W. LMS-Res-YOLO: Lightweight and Multi-Scale Cucumber Detection Model with Residual Blocks. Sensors 2025, 25, 7305. https://doi.org/10.3390/s25237305