Precision Detection of Dense Litchi Fruit in UAV Images Based on Improved YOLOv5 Model
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Data Collection
2.2. Dataset Construction
2.3. Experimental Environment Setup
2.4. Evaluation Metrics
2.5. Overview of YOLOv5
2.6. The Proposed Model
2.6.1. BiFPN with P2 Feature Layer Fusion
2.6.2. NWD
2.6.3. SAHI
3. Experimental Results and Comparative Analysis
3.1. Ablation Experiments
3.1.1. BiFPN
3.1.2. P2 Feature Layer Fusion
3.1.3. NWD
3.1.4. SAHI
4. Comparative Discussion
4.1. Comparison with Other Object Detection Algorithms
4.2. Analysis of Model Detection Effects
4.3. Test Results on Datasets
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
Dataset | Number of Images | Number of Labels |
---|---|---|
Train | 280 | 29,922 |
Validation | 40 | 4480 |
Test | 80 | 8754 |
Total | 400 | 43,156 |
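The split above is a 70/10/20 division of the 400 annotated images (280 train, 40 validation, 80 test). A minimal sketch of how such a split can be produced; the function name and the fixed seed are illustrative, not taken from the paper:

```python
import random

def split_dataset(items, train_ratio=0.7, val_ratio=0.1, seed=42):
    """Shuffle and split a list of image paths into train/val/test subsets."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = items[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder becomes the test set
    return train, val, test

train, val, test = split_dataset([f"img_{i:04d}.jpg" for i in range(400)])
# With 400 images this yields 280/40/80, matching the table.
```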
Model |  |  |  |  |  |  | Weight (MB) | Params (M) | FPS
---|---|---|---|---|---|---|---|---|---
YOLOv5 | 50.6 | 27.8 | 53.8 | 81.3 | 77.3 | 24.0 | 13.7 | 7.01 | 68.8 |
+NWD | 48.5 | 22.6 | 64.0 | 74.2 | 65.5 | 31.6 | 13.7 | 7.01 | 68.2 |
+BiFPN | 55.2 | 31.6 | 67.9 | 82.3 | 78.6 | 31.8 | 13.8 | 7.08 | 70.9 |
+NWD + BiFPN | 63.6 | 35.8 | 79.5 | 86.4 | 80.1 | 47.1 | 13.8 | 7.08 | 70.4 |
+Fuse P2 + BiFPN | 62.9 | 35.9 | 78.2 | 85.7 | 80.2 | 45.6 | 14.2 | 7.24 | 68.2 |
+Fuse P2 + NWD + BiFPN | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 | 71.4 |
+Fuse P2 + NWD + BiFPN + SAHI | 72.6 | 57.3 | 80.1 | 86.0 | 86.4 | 58.8 | 14.2 | 7.24 | — |
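SAHI (slicing aided hyper inference) raises small-object recall by tiling the UAV image into overlapping slices, running the detector on each slice, shifting each slice's boxes back into global coordinates, and merging with NMS. A minimal sketch of the slice-coordinate step only; the tile size and overlap ratio here are illustrative defaults, not the paper's settings:

```python
def slice_starts(length, tile, overlap=0.2):
    """Start offsets of overlapping tiles that fully cover one image axis."""
    stride = max(1, int(tile * (1 - overlap)))
    starts, x = [], 0
    while x + tile < length:
        starts.append(x)
        x += stride
    starts.append(max(length - tile, 0))  # final tile flush with the image edge
    return starts

def slice_boxes(width, height, tile=512, overlap=0.2):
    """(x1, y1, x2, y2) crop windows covering the full image with overlap."""
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in slice_starts(height, tile, overlap)
            for x in slice_starts(width, tile, overlap)]
```

After inference, a detection found at (bx, by) inside the slice starting at (x1, y1) is shifted to (bx + x1, by + y1) before the global NMS pass.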
Repeated Blocks |  |  |  |  |  |  | Weight (MB) | Params (M)
---|---|---|---|---|---|---|---|---
1× | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 |
2× | 64.2 | 35.6 | 79.7 | 87.7 | 80.4 | 47.9 | 20.3 | 10.4 |
3× | 58.7 | 34.7 | 74.1 | 81.0 | 80.3 | 37.1 | 26.3 | 13.5 |
NWD Weight | CIoU Weight |  |  |  |  |  |  
---|---|---|---|---|---|---|---
0 | 1 | 62.9 | 35.9 | 78.2 | 85.7 | 80.2 | 45.6 |
1 | 0 | 63.8 | 35.8 | 78.7 | 87.4 | 80.4 | 47.1 |
0.5 | 0.5 | 63.9 | 37.1 | 77.6 | 87.1 | 80.8 | 46.9 |
0.8 | 0.2 | 62.8 | 36.7 | 78.5 | 85.3 | 81.3 | 44.3 |
0.2 | 0.8 | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 |
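The weights in the table above blend the NWD-based regression term with the standard CIoU loss, i.e. L_box = w_NWD · (1 − NWD) + w_CIoU · L_CIoU, with the 0.2/0.8 row scoring best. A sketch of the NWD term following Wang et al.'s closed form; the normalizing constant `c` is dataset-dependent and the value below is illustrative:

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between (cx, cy, w, h) boxes.

    Each box is modeled as a 2-D Gaussian; the squared 2-Wasserstein
    distance between the two Gaussians has the closed form below and is
    mapped into (0, 1] with an exponential. c is a dataset-dependent scale.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    w2_sq = ((cxa - cxb) ** 2 + (cya - cyb) ** 2
             + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / c)

def box_loss(nwd_term, ciou_loss, w_nwd=0.2, w_ciou=0.8):
    """Weighted combination from the ablation above (best row: 0.2/0.8)."""
    return w_nwd * (1.0 - nwd_term) + w_ciou * ciou_loss
```

Identical boxes give NWD = 1 (zero distance), so a perfect prediction contributes zero to the NWD part of the loss.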
Model |  |  |  |  |  |  | Weight (MB) | Params (M) | FPS
---|---|---|---|---|---|---|---|---|---
YOLOv5 | 50.6 | 27.8 | 53.8 | 81.3 | 77.3 | 24.0 | 13.7 | 7.01 | 68.8 |
YOLOv5-TinyLitchi | 64.1 | 36.2 | 79.8 | 88.0 | 80.4 | 47.8 | 14.2 | 7.24 | 71.4 |
YOLOv5-TinyLitchi with SAHI | 72.6 | 57.3 | 80.1 | 86.0 | 86.4 | 58.8 | 14.2 | 7.24 | — |
DETR | 25.7 | 6.0 | 28.8 | 55.0 | 31.3 | 20.1 | 158 | 41.28 | 14.0 |
Faster R-CNN | 53.5 | 18.7 | 69.1 | 83.6 | 64.9 | 42.0 | 159 | 41.13 | 10.5 |
RetinaNet | 46.1 | 15.1 | 51.3 | 80.0 | 54.6 | 37.6 | 145 | 36.13 | 12.1 |
SSD | 31.5 | 4.9 | 36.0 | 69.5 | 44.3 | 18.7 | 130 | 23.88 | 33.6 |
YOLOX | 68.1 | 50.2 | 79.4 | 80.3 | 86.9 | 49.3 | 34.4 | 8.94 | 27.9 |
FCOS | 36.2 | 10.3 | 43.4 | 63.6 | 55.7 | 16.6 | 123 | 31.84 | 12.3 |
Figure | Detected | Ground Truth | False Detections | Missed Detections | False Detection Rate | Correct Detection Rate |
---|---|---|---|---|---|---|
A1 | 148 | 155 | 9 | 16 | 6.1% | 89.7% |
A2 | 143 | 139 | 5 | 1 | 3.5% | 99.3% |
B1 | 292 | 286 | 25 | 19 | 8.6% | 93.4% |
B2 | 206 | 219 | 16 | 29 | 7.8% | 86.8% |
C1 | 173 | 199 | 11 | 37 | 6.4% | 81.4% |
C2 | 170 | 153 | 28 | 11 | 16.5% | 92.8% |
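The two rates in the table follow directly from the counts: false detection rate = false detections / detected, and correct detection rate = (ground truth − missed) / ground truth. A quick check against row A1; the helper names are illustrative:

```python
def false_detection_rate(detected, false):
    """Fraction of reported detections that are wrong."""
    return false / detected

def correct_detection_rate(real, omission):
    """Fraction of ground-truth fruits that were found."""
    return (real - omission) / real

# Row A1: 148 detected, 155 ground truth, 9 false, 16 missed.
fdr = false_detection_rate(148, 9)     # ~0.061 -> 6.1%
cdr = correct_detection_rate(155, 16)  # ~0.897 -> 89.7%
```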
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xiong, Z.; Wang, L.; Zhao, Y.; Lan, Y. Precision Detection of Dense Litchi Fruit in UAV Images Based on Improved YOLOv5 Model. Remote Sens. 2023, 15, 4017. https://doi.org/10.3390/rs15164017