YOLOv8n-Pose-DSW: A Precision Picking Point Localization Model for Zucchini in Complex Greenhouse Environments
Abstract
1. Introduction
- Zucchini Fruit Dataset: A robust dataset was established covering diverse illumination conditions, capture distances, and fruit densities, with additional scenarios simulated through data augmentation.
- YOLOv8n-Pose-DSW Model for Unstructured Environments: The proposed YOLOv8n-Pose-DSW model addresses the missed-detection and low-accuracy problems that arise in unstructured environments. Comparative experiments validate its superiority and demonstrate the contribution of each constituent module.
- Adaptive Dysample Operator for Computational Efficiency: Traditional upsampling was replaced with the Dysample operator, jointly improving computational efficiency and GPU memory consumption while maintaining detection accuracy.
- Slim-Neck Architecture for Feature Representation: A Slim-Neck network structure was developed that enhances computational efficiency and feature representation capability through an optimized bottleneck layer design.
- WIoUv3 Loss Function for Localization Sensitivity: The CIoU loss function was replaced with the WIoUv3 loss function, improving the model's detection sensitivity for zucchini fruits, the precision of picking point localization, and the overall fitting accuracy.
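As a reading aid for the last bullet, the WIoUv3 loss can be sketched for a single box pair as follows. This is a minimal illustration of Tong et al.'s formulation, not the paper's implementation: the focusing hyperparameters `alpha` and `delta` default to values the Wise-IoU authors report as effective, and the running mean of the IoU loss (normally tracked over training) is passed in as an argument.

```python
import math

def iou(box_a, box_b):
    """Plain IoU for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def wiou_v3_loss(pred, target, mean_liou, alpha=1.9, delta=3.0):
    """Wise-IoU v3 sketch: distance-attention term R_WIoU times the IoU
    loss, scaled by a non-monotonic focusing coefficient r computed from
    the 'outlierness' beta = L_IoU / mean(L_IoU)."""
    l_iou = 1.0 - iou(pred, target)
    # box centers
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    # dimensions of the smallest enclosing box (gradient-detached in the paper)
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    # distance attention emphasizing center misalignment
    r_wiou = math.exp(((cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2) / (wg ** 2 + hg ** 2))
    # non-monotonic focusing coefficient (the v3 addition)
    beta = l_iou / mean_liou
    r = beta / (delta * alpha ** (beta - delta))
    return r * r_wiou * l_iou
```

A perfectly aligned prediction yields zero loss, while partially overlapping boxes receive a gradient whose weight is down-scaled for both very easy and very hard (outlier) examples, which is the mechanism the paper credits for improved localization of occluded fruit.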
2. Materials and Methods
2.1. Description of Study Area and Data Collection
2.2. The Dataset
2.2.1. Dataset Annotation and Division
2.2.2. Data Augmentation
2.3. Improved YOLOv8n-Pose Network
2.3.1. Dysample Module
2.3.2. Slim-Neck Module
2.3.3. WIoUv3 Loss Function
3. Experimental Design and Discussion
3.1. Experimental Details
3.2. Metrics for Model Evaluation
3.3. Picking Point Positioning Accuracy Metrics
3.4. Experimental Results
3.4.1. Ablation Experiment
3.4.2. Visualization and Discussion of Ablation Experiment
3.4.3. Comparative Experiments
3.4.4. Visualization and Discussion of Comparative Experiments
3.4.5. Analysis of Picking Point Location Under Different Occlusion Levels
3.4.6. Picking Point Positioning Accuracy Discussion
4. Discussion
4.1. Visual Analysis
4.2. Discussion of Model Limitations and Applications
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Liang, J.; Liu, G. High-Quality and High-Efficiency Management Techniques for Open-Field Cultivation of Zucchini. Fruit Grow. Friend 2025, 26, 78–80. (In Chinese)
- Ben-Noun (Nun), L. Characteristics of Zucchini; B. N. Publication House: Hawthorne, CA, USA, 2019.
- Paris, H.S. Germplasm enhancement of Cucurbita pepo (pumpkin, squash, gourd: Cucurbitaceae): Progress and challenges. Euphytica 2016, 208, 415–438.
- Grumet, R.; McCreight, J.D.; McGregor, C.; Weng, Y.; Mazourek, M.; Reitsma, K.; Labate, J.; Davis, A.; Fei, Z. Genetic resources and vulnerabilities of major cucurbit crops. Genes 2021, 12, 1222.
- Zhao, Y.; Gong, L.; Huang, Y.; Liu, C. A review of key techniques of vision-based control for harvesting robot. Comput. Electron. Agric. 2016, 127, 311–323.
- Luo, L.; Tang, Y.; Zou, X.; Ye, M.; Feng, W.; Li, G. Vision-based extraction of spatial information in grape clusters for harvesting robots. Biosyst. Eng. 2016, 151, 90–104.
- Zhao, C.; Lee, W.S.; He, D. Immature green citrus detection based on colour feature and sum of absolute transformed difference (SATD) using colour images in the citrus grove. Comput. Electron. Agric. 2016, 124, 243–253.
- Rathnayake, N.; Rathnayake, U.; Dang, T.L.; Hoshino, Y. An Efficient Automatic Fruit-360 Image Identification and Recognition Using a Novel Modified Cascaded-ANFIS Algorithm. Sensors 2022, 22, 4401.
- Fu, L.; Feng, Y.; Wu, J.; Liu, Z.; Gao, F.; Majeed, Y.; Al-Mallahi, A.; Zhang, Q.; Li, R.; Cui, Y. Fast and accurate detection of kiwifruit in orchard using improved YOLOv3-tiny model. Precis. Agric. 2021, 22, 754–776.
- Lu, Y.; Young, S. A survey of public datasets for computer vision tasks in precision agriculture. Comput. Electron. Agric. 2020, 178, 105760.
- Tang, Y.; Qiu, J.; Zhang, Y.; Wu, D.; Cao, Y.; Zhao, K.; Zhu, L. Optimization strategies of fruit detection to overcome the challenge of unstructured background in field orchard environment: A review. Precis. Agric. 2023, 24, 1183–1219.
- Wang, H.; Lin, Y.; Xu, X.; Chen, Z.; Wu, Z.; Tang, Y. A study on long-close distance coordination control strategy for litchi picking. Agronomy 2022, 12, 1520.
- Zhang, T.; Wu, F.; Wang, M.; Chen, Z.; Li, L.; Zou, X. Grape-bunch identification and location of picking points on occluded fruit axis based on YOLOv5-GAP. Horticulturae 2023, 9, 498.
- Li, Y.; Wang, W.; Guo, X.; Wang, X.; Liu, Y.; Wang, D. Recognition and positioning of strawberries based on improved YOLOv7 and RGB-D sensing. Agriculture 2024, 14, 624.
- Chen, X.; Dong, G.; Fan, X.; Xu, Y.; Liu, T.; Zhou, J.; Jiang, H. Fruit Stalk Recognition and Picking Point Localization of New Plums Based on Improved DeepLabv3+. Agriculture 2024, 14, 2120.
- Ma, Z.; Dong, N.; Gu, J.; Cheng, H.; Meng, Z.; Du, X. STRAW-YOLO: A detection method for strawberry fruits targets and key points. Comput. Electron. Agric. 2025, 230, 109853.
- Du, X.; Meng, Z.; Ma, Z.; Lu, W.; Cheng, H. Tomato 3D pose detection algorithm based on keypoint detection and point cloud processing. Comput. Electron. Agric. 2023, 212, 108056.
- Wu, Z.; Xia, F.; Zhou, S.; Xu, D. A method for identifying grape stems using keypoints. Comput. Electron. Agric. 2023, 209, 107825.
- Huang, Y.; Zhong, Y.; Zhong, D.; Yang, C.; Wei, L.; Zou, Z.; Chen, R. Pepper-YOLO: An lightweight model for green pepper detection and picking point localization in complex environments. Front. Plant Sci. 2024, 15, 1508258.
- Wang, H.; Yun, L.; Yang, C.; Wu, M.; Wang, Y.; Chen, Z. OW-YOLO: An Improved YOLOv8s Lightweight Detection Method for Obstructed Walnuts. Agriculture 2025, 15, 159.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, W.; Lu, H.; Fu, H.; Cao, Z. Learning to upsample by learning to sample. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 6027–6037.
- Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-neck by GSConv: A lightweight-design for real-time detector architectures. J. Real-Time Image Process. 2024, 21, 62.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–9 February 2020; Volume 34, pp. 12993–13000.
- Tong, Z.; Chen, Y.; Xu, Z.; Yu, R. Wise-IoU: Bounding box regression loss with dynamic focusing mechanism. arXiv 2023, arXiv:2301.10051.
- Toshev, A.; Szegedy, C. DeepPose: Human pose estimation via deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1653–1660.
- Xu, Y.; Zhang, J.; Zhang, Q.; Tao, D. ViTPose: Simple vision transformer baselines for human pose estimation. Adv. Neural Inf. Process. Syst. 2022, 35, 38571–38584.
- Jiang, T.; Lu, P.; Zhang, L.; Ma, N.; Han, R.; Lyu, C.; Li, Y.; Chen, K. RTMPose: Real-time multi-person pose estimation based on MMPose. arXiv 2023, arXiv:2303.07399.
- Maji, D.; Nagori, S.; Mathew, M.; Poddar, D. YOLO-Pose: Enhancing YOLO for multi person pose estimation using object keypoint similarity loss. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–20 June 2022; pp. 2637–2646.
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524.
Training Parameters | Values |
---|---|
Initial learning rate | 0.01 |
Optimizer | SGD |
Optimizer momentum | 0.937 |
Optimizer weight decay rate | 0.0005 |
Number of images per batch | 16 |
Number of epochs | 150 |
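The hyperparameters in the table above can be collected into a single training configuration, shown here as it might be passed to the Ultralytics trainer. This is a sketch: the model config and dataset YAML names below are placeholders, not taken from the paper.

```python
# Training hyperparameters from the table above, using the Ultralytics
# trainer's argument names (lr0 = initial learning rate).
train_cfg = {
    "epochs": 150,          # number of epochs
    "batch": 16,            # number of images per batch
    "optimizer": "SGD",
    "lr0": 0.01,            # initial learning rate
    "momentum": 0.937,      # optimizer momentum
    "weight_decay": 0.0005, # optimizer weight decay rate
}

# Usage sketch (requires the ultralytics package; file names are placeholders):
# from ultralytics import YOLO
# model = YOLO("yolov8n-pose.yaml")
# model.train(data="zucchini-pose.yaml", **train_cfg)
```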
Baseline Network | Dysample | Slim-Neck | WIoUv3 | P (%) | R (%) | mAP@50 (%) | mAP@50–95 (%) | Para (M) | FLOPs (G) | FPS |
---|---|---|---|---|---|---|---|---|---|---|
YOLOv8n-Pose | × | × | × | 88.8 ± 0.2 | 79.0 ± 0.4 | 86.6 ± 0.3 | 56.0 ± 0.5 | 3.08 | 8.7 | 132 ± 5 |
YOLOv8n-Pose | ✓ | × | × | 91.0 ± 0.2 | 89.5 ± 0.5 | 91.3 ± 0.2 | 71.1 ± 0.3 | 3.09 | 8.5 | 131 ± 5 |
YOLOv8n-Pose | × | ✓ | × | 87.2 ± 0.4 | 87.2 ± 0.3 | 92.1 ± 0.2 | 62.8 ± 0.4 | 2.87 | 7.6 | 140 ± 3 |
YOLOv8n-Pose | × | × | ✓ | 91.0 ± 0.1 | 88.6 ± 0.3 | 93.5 ± 0.1 | 71.4 ± 0.3 | 3.07 | 8.4 | 135 ± 3 |
YOLOv8n-Pose | ✓ | ✓ | ✓ | 92.1 ± 0.3 | 90.7 ± 0.2 | 94.0 ± 0.1 | 71.4 ± 0.3 | 3.05 | 8.3 | 137 ± 3 |
Baseline Network | Dysample | Slim-Neck | WIoUv3 | P (%) | R (%) | mAP@50 (%) | mAP@50–95 (%) |
---|---|---|---|---|---|---|---|
YOLOv8n-Pose | × | × | × | 84.3 ± 0.2 | 78.5 ± 0.2 | 84.3 ± 0.1 | 67.3 ± 0.2 |
YOLOv8n-Pose | ✓ | × | × | 91.9 ± 0.1 | 90.7 ± 0.2 | 89.2 ± 0.1 | 79.0 ± 0.1 |
YOLOv8n-Pose | × | ✓ | × | 86.6 ± 0.2 | 86.4 ± 0.2 | 93.3 ± 0.1 | 89.4 ± 0.2 |
YOLOv8n-Pose | × | × | ✓ | 91.5 ± 0.2 | 87.3 ± 0.1 | 93.7 ± 0.1 | 91.0 ± 0.2 |
YOLOv8n-Pose | ✓ | ✓ | ✓ | 93.1 ± 0.1 | 89.5 ± 0.2 | 95.6 ± 0.2 | 95.2 ± 0.2 |
Model | P (%) | R (%) | mAP@50 (%) | mAP@50–95 (%) | Para (M) | FLOPs (G) | FPS |
---|---|---|---|---|---|---|---|
DeepPose | 59.6 ± 0.3 | 60.1 ± 0.2 | 72.6 ± 0.3 | 52.0 ± 0.2 | 23.55 | 42.8 | 39 ± 3 |
RTMPose | 64.4 ± 0.2 | 63.8 ± 0.2 | 75.2 ± 0.1 | 62.5 ± 0.2 | 6.17 | 7.4 | 150 ± 6 |
ViTPose | 67.3 ± 0.3 | 66.7 ± 0.2 | 65.2 ± 0.4 | 51.1 ± 0.3 | 22.46 | 88.9 | 12 ± 1 |
YOLOX-Pose | 79.1 ± 0.3 | 77.3 ± 0.3 | 84.2 ± 0.1 | 66.5 ± 0.1 | 6.04 | 13.7 | 125 ± 4 |
YOLO11n-Pose | 86.0 ± 0.2 | 82.0 ± 0.2 | 90.3 ± 0.1 | 88.7 ± 0.1 | 2.63 | 6.7 | 135 ± 6 |
YOLO12n-Pose | 88.0 ± 0.3 | 84.6 ± 0.3 | 89.5 ± 0.2 | 86.1 ± 0.4 | 2.66 | 6.7 | 104 ± 3 |
YOLOv8n-Pose-DSW | 93.1 ± 0.1 | 89.5 ± 0.2 | 95.6 ± 0.2 | 95.2 ± 0.2 | 3.05 | 8.3 | 137 ± 3 |
Occlusion Degree | Model | P (%) | R (%) | mAP@50 (%) | mAP@50–95 (%)
---|---|---|---|---|---
Light | YOLOv8n-Pose-DSW | 95.8 ± 0.1 | 93.1 ± 0.2 | 97.1 ± 0.2 | 88.8 ± 0.2
Light | YOLOv8n-Pose | 90.1 ± 0.1 | 88.3 ± 0.1 | 92.2 ± 0.1 | 79.8 ± 0.2
Medium | YOLOv8n-Pose-DSW | 90.2 ± 0.3 | 86.0 ± 0.3 | 92.0 ± 0.2 | 78.5 ± 0.1
Medium | YOLOv8n-Pose | 82.1 ± 0.2 | 75.5 ± 0.3 | 80.0 ± 0.2 | 65.2 ± 0.4
Heavy | YOLOv8n-Pose-DSW | 76.3 ± 0.4 | 63.7 ± 0.3 | 71.4 ± 0.2 | 58.0 ± 0.4
Heavy | YOLOv8n-Pose | 64.8 ± 0.4 | 51.9 ± 0.2 | 58.6 ± 0.3 | 44.7 ± 0.5
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Su, H.; Wang, S.; Su, H.; Ma, F.; Li, Y.; Li, J. YOLOv8n-Pose-DSW: A Precision Picking Point Localization Model for Zucchini in Complex Greenhouse Environments. Agriculture 2025, 15, 1954. https://doi.org/10.3390/agriculture15181954