Mango Inflorescence Detection Based on Improved YOLOv8 and UAVs-RGB Images
Abstract
1. Introduction
2. Materials and Methods
2.1. Data Acquisition
2.2. Improved YOLOv8 Model
- Image Preprocessing: YOLOv8 typically resizes the input image to a fixed size (e.g., 640 × 640) and normalizes pixel values to [0, 1] or [−1, 1]. Data augmentation is used to improve the model's generalization ability, including random cropping, random rotation, random flipping, and color adjustments (such as variations in brightness and contrast); a minimal sketch of this step appears after this list.
- Feature Extraction: CSPDarknet is an improved version of Darknet that restructures the network to reduce computational complexity while retaining sufficient feature information. The input image passes through convolutional layers, activation functions (e.g., ReLU or Leaky ReLU), and pooling layers; these operations generate feature maps at different scales that capture essential information such as edges, textures, and colors.
- Feature Fusion: YOLOv8 employs the PANet (Path Aggregation Network) feature fusion technique, which combines feature maps of different scales by weighted aggregation and uses skip connections to link lower-layer feature maps directly with higher-layer ones. This allows YOLOv8 to leverage more low-level features to enhance detection performance (see the fusion sketch after this list).
- Head: The head module employs a modern Decoupled-Head architecture, improving performance by separating the classification and regression tasks. It also moves from an Anchor-Based to an Anchor-Free framework: unlike earlier versions, YOLOv8 does not depend on preset anchor boxes and instead predicts an object's position and category directly from every feature-map pixel (or key point). A sketch of such a head appears after this list.
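As a concrete illustration of the preprocessing step, here is a minimal sketch of letterbox resizing to 640 × 640, scaling to [0, 1], and two of the augmentations listed above. It assumes only OpenCV and NumPy; the function names (`letterbox`, `preprocess`, `augment`) are illustrative, not part of YOLOv8's actual API.

```python
# Minimal preprocessing sketch: letterbox-resize to 640x640, scale to [0, 1].
import cv2
import numpy as np

def letterbox(img: np.ndarray, size: int = 640) -> np.ndarray:
    """Resize with preserved aspect ratio, padding the remainder with gray."""
    h, w = img.shape[:2]
    scale = size / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(img, (nw, nh), interpolation=cv2.INTER_LINEAR)
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)  # gray padding
    top, left = (size - nh) // 2, (size - nw) // 2
    canvas[top:top + nh, left:left + nw] = resized
    return canvas

def preprocess(img: np.ndarray) -> np.ndarray:
    """Letterbox, convert BGR->RGB, scale to [0, 1], move channels first."""
    img = letterbox(img)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return img.transpose(2, 0, 1)  # HWC -> CHW for the network

def augment(img: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Two of the augmentations mentioned above: flip and brightness."""
    if rng.random() < 0.5:
        img = img[:, ::-1]          # random horizontal flip
    gain = rng.uniform(0.7, 1.3)    # random brightness gain
    return np.clip(img.astype(np.float32) * gain, 0, 255).astype(np.uint8)
```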
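The top-down/bottom-up fusion idea can be sketched in a few lines of PyTorch. This toy `TinyPAN` module is a simplification, not the exact YOLOv8 neck: the channel counts are arbitrary and the real neck uses C2f blocks rather than plain convolutions.

```python
# Toy PANet-style fusion over three backbone scales P3/P4/P5 (strides 8/16/32).
# Top-down: upsample deep features and concatenate with shallower maps via
# skip connections; bottom-up: downsample and fuse again.
import torch
import torch.nn as nn

def conv_block(c_in: int, c_out: int, k: int = 3, s: int = 1) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, s, k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),  # YOLOv8 uses SiLU activations
    )

class TinyPAN(nn.Module):
    def __init__(self, c3=128, c4=256, c5=512):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        self.fuse4 = conv_block(c5 + c4, c4)   # top-down: P5 -> P4
        self.fuse3 = conv_block(c4 + c3, c3)   # top-down: P4 -> P3
        self.down3 = conv_block(c3, c3, s=2)   # bottom-up: P3 -> P4
        self.fuse4b = conv_block(c3 + c4, c4)
        self.down4 = conv_block(c4, c4, s=2)   # bottom-up: P4 -> P5
        self.fuse5b = conv_block(c4 + c5, c5)

    def forward(self, p3, p4, p5):
        t4 = self.fuse4(torch.cat([self.up(p5), p4], 1))    # skip connection
        t3 = self.fuse3(torch.cat([self.up(t4), p3], 1))    # skip connection
        n4 = self.fuse4b(torch.cat([self.down3(t3), t4], 1))
        n5 = self.fuse5b(torch.cat([self.down4(n4), p5], 1))
        return t3, n4, n5  # fused maps for small/medium/large objects
```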
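Likewise, a minimal anchor-free decoupled head for one pyramid level might look as follows. The real YOLOv8 head additionally predicts distribution focal loss bins for box regression, which this sketch omits.

```python
# Decoupled, anchor-free head for one feature-map scale: classification and
# box regression run through separate branches, and every spatial cell
# predicts a box directly (no preset anchor boxes).
import torch
import torch.nn as nn

class DecoupledHead(nn.Module):
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.cls_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(channels, num_classes, 1),  # class scores per cell
        )
        self.reg_branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.SiLU(),
            nn.Conv2d(channels, 4, 1),            # l, t, r, b offsets per cell
        )

    def forward(self, x: torch.Tensor):
        return self.cls_branch(x), self.reg_branch(x)

# Usage: one (cls, reg) pair per pyramid level.
head = DecoupledHead(channels=256, num_classes=1)  # single class: inflorescence
cls_map, reg_map = head(torch.randn(1, 256, 80, 80))
print(cls_map.shape, reg_map.shape)  # (1, 1, 80, 80) and (1, 4, 80, 80)
```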
2.2.1. FasterNet Module
2.2.2. GAM Attention Mechanism
2.2.3. Loss Function Improvement
2.3. Dataset Construction
2.3.1. Construction of the Mango Inflorescence Dataset
2.3.2. Mango Inflorescence Data
2.3.3. Sample Expansion and Revision
2.3.4. Dataset Format Conversion
2.4. Test Platform and Training Schedule
3. Results and Analysis
3.1. Model Ablation Test and Analysis
3.1.1. Comparative Test and Analysis of Improved FasterNet Module
3.1.2. Comparative Experiment and Analysis of GAM and Other Attention Mechanisms
3.1.3. Loss Function Comparison
3.1.4. Ablation Test
3.1.5. SOTA Model Comparison Test
3.2. Practicality Test of Edge Device Deployment
4. Conclusions
- A specialized dataset for mango bloom identification was developed, incorporating selective sampling, data augmentation, and manual adjustments to align with real-world farming conditions and project needs.
- FasterNet was integrated as the backbone network in place of YOLOv8's default structure. Although precision declined marginally by 0.1%, processing speed and parameter efficiency improved notably, showing that FasterNet streamlines computational demands while maintaining accuracy (a sketch of its core partial-convolution operator follows this list).
- The Global Attention Mechanism (GAM) was embedded in the feature extraction framework, creating a more effective structure. Tests showed that GAM raised average precision by 2.2% over the baseline YOLOv8 model, surpassing alternative attention approaches such as SE, CA, and CBAM (an illustrative GAM module follows this list).
- Replacing the CIoU loss function with WIoU (Weighted Intersection over Union) improved bounding box localization. WIoU accelerated convergence by refining the alignment between predicted and ground-truth boxes, yielding a 1.1% increase in mAP@.5 (a sketch of this loss follows this list).
- Our proposed automatic detection algorithm, based on UAV-RGB imagery and an improved YOLOv8 architecture, was deployed on edge devices for practical evaluation. The results demonstrated a deviation of less than 10% compared to manual counting, indicating the practical applicability and feasibility of this approach.
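For context on the FasterNet result above, its core operator is partial convolution (PConv), which convolves only a fraction of the channels and passes the rest through untouched. Below is a minimal PyTorch sketch following the "Run, Don't Walk" paper; the 1/4 split ratio is that paper's default, and the class name is ours.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """PConv from FasterNet: convolve only 1/n_div of the channels and pass
    the rest through unchanged, cutting FLOPs and memory accesses."""
    def __init__(self, dim: int, n_div: int = 4):
        super().__init__()
        self.dim_conv = dim // n_div          # channels that get convolved
        self.dim_pass = dim - self.dim_conv   # channels passed through as-is
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, 1, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_pass], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)
```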
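The GAM module referenced above applies channel attention (an MLP mixing channels at every pixel) followed by spatial attention (7 × 7 convolutions). The sketch below follows the published GAM design at a high level; the reduction rate and exact layer layout used in this paper's variant are assumptions.

```python
import torch
import torch.nn as nn

class GAM(nn.Module):
    """Global Attention Mechanism: channel attention via a channel-mixing MLP,
    then spatial attention via 7x7 convolutions."""
    def __init__(self, channels: int, rate: int = 4):
        super().__init__()
        hidden = channels // rate
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, channels),
        )
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, hidden, 7, padding=3),
            nn.BatchNorm2d(hidden), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 7, padding=3),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # channel attention: move C last so the MLP mixes channels per pixel
        att = self.channel_mlp(x.permute(0, 2, 3, 1).reshape(b, -1, c))
        att = att.reshape(b, h, w, c).permute(0, 3, 1, 2)
        x = x * torch.sigmoid(att)
        # spatial attention over the channel-refined map
        return x * torch.sigmoid(self.spatial(x))
```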
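The WIoU idea can be illustrated as follows: the plain IoU loss is rescaled by a distance-based focusing factor computed against the smallest enclosing box, so poorly centered predictions receive larger gradients. This sketch implements WIoU v1 as commonly described in the literature; whether this paper used v1 or a later variant is not stated here, so treat it as illustrative.

```python
import torch

def wiou_v1_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """WIoU v1 loss for boxes in (x1, y1, x2, y2) format, shape (N, 4)."""
    # intersection / union
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + 1e-7)

    # squared distance between predicted and ground-truth box centers
    cx_p, cy_p = (pred[:, 0] + pred[:, 2]) / 2, (pred[:, 1] + pred[:, 3]) / 2
    cx_t, cy_t = (target[:, 0] + target[:, 2]) / 2, (target[:, 1] + target[:, 3]) / 2
    dist2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2

    # squared diagonal of the smallest enclosing box, detached per the WIoU paper
    enc_wh = torch.max(pred[:, 2:], target[:, 2:]) - torch.min(pred[:, :2], target[:, :2])
    diag2 = (enc_wh[:, 0] ** 2 + enc_wh[:, 1] ** 2).detach() + 1e-7

    # focusing factor * plain IoU loss
    return (torch.exp(dist2 / diag2) * (1.0 - iou)).mean()
```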
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Venkata, R.R.P.; Gundappa, B.; Chakravarthy, A.K. Pests of mango. In Pests and Their Management; Springer: Singapore, 2018; pp. 415–440.
- Han, D.Y.; Li, L.; Niu, L.; Chen, J.; Zhang, F.; Ding, S.; Fu, Y. Spatial Distribution Pattern and Sampling of Thrips on Mango Trees. Chin. J. Trop. Crops 2019, 40, 323–327.
- Li, J.; Lin, L.; Tian, K.; Al, A. Detection of leaf diseases of balsam pear in the field based on improved Faster R-CNN. Trans. Chin. Soc. Agric. Eng. 2020, 36, 179–185.
- Girshick, R. Fast R-CNN. In Proceedings of the International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 7–12 December 2015.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.; Berg, A. SSD: Single shot multibox detector. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 8–16 October 2016.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016.
- Lin, T.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Yu, B.; Lü, M. Improved YOLOv3 algorithm and its application in military target detection. J. Ordnance Eng. 2022, 43, 345–354.
- Peng, H.; Li, Z.; Zou, X.; Wang, H.; Xiong, J. Research on litchi image detection in orchard using UAV based on improved YOLOv5. Expert Syst. Appl. 2025, 263, 125828.
- Guo, B.; Wang, B.; Zhang, Z.; Wu, S.; Li, P.; Hu, L. Improved YOLOv3 crop target detection algorithm. J. Agric. Big Data 2024, 6, 40–47.
- Tian, Y.; Yang, G.; Wang, Z.; Wang, H.; Li, E.; Liang, Z. Apple detection during different growth stages in orchards using the improved YOLO-V3 model. Comput. Electron. Agric. 2019, 157, 417–426.
- Kuznetsova, A.; Maleva, T.; Soloviev, V. Using YOLOv3 Algorithm with Pre- and Post-Processing for Apple Detection in Fruit-Harvesting Robot. Agronomy 2020, 10, 1016.
- Chen, C.; Peng, C.; Cheng, Y.; Wang, Y.; Zou, S. Research on food crop and weed recognition based on lightweight YOLOv5. Agric. Technol. 2023, 43, 36–40.
- Cai, Z.; Cai, Y.; Zeng, F.; Yue, X. Rice panicle recognition based on improved YOLOv5 in field. J. South China Agric. Univ. 2024, 45, 108–115.
- Hu, J.; Li, G.; Mo, H.; Lv, Y.; Qian, T.; Chen, M.; Lu, S. Crop node detection and internode length estimation using an improved YOLOv5 model. Agriculture 2023, 13, 473.
- Zhao, J.; Zhang, X.; Yan, J.; Qiu, X.; Yao, X.; Tian, Y.; Zhu, Y.; Cao, W. A wheat spike detection method in UAV images based on improved YOLOv5. Remote Sens. 2021, 13, 3095.
- Ye, J.; Qiu, W.; Yang, J.; Yi, W.; Ma, Z. A deep learning-based lychee pest identification method. Lab. Res. Explor. 2021, 40, 29–32.
- Wen, T.; Wang, T.; Huang, S. Crop and Amaranth detection algorithm based on improved YOLOv8: MES-YOLO. Comput. Eng. Sci. 2024, 14, 29888.
- Chen, F.; Li, S.; Han, J.; Ren, F.; Yang, Z. Review of Lightweight Deep Convolutional Neural Networks. Arch. Comput. Methods Eng. 2024, 31, 1915–1937.
- Chen, J.; Kao, S.; He, H.; Zhou, W.; Wen, S.; Lee, C.H.; Chan, S.H.G. Run, Don’t Walk: Chasing Higher FLOPS for Faster Neural Networks. arXiv 2023, arXiv:2303.03667.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Zhang, X.; Xu, R.; Yu, H.; Zou, H.; Cui, P. Gradient Norm Aware Minimization Seeks First-Order Flatness and Improves Generalization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023.
- Gupta, R.; Jain, S.; Kumar, M. Enhanced Thermal Object Detection and Classification with MobileNetV3: A Cutting-Edge Deep Learning Solution. In Proceedings of the 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 24–28 June 2024.
- Zhang, H.; Li, Y. GhostStereoNet: Stereo Matching from Cheap Operations. In Proceedings of the 2024 4th International Conference on Computer Science, Electronic Information Engineering and Intelligent Control Technology (CEI), Guangzhou, China, 8–10 November 2024.
- Zhou, H.; Su, Y.; Chen, J.; Li, J.; Ma, L.; Liu, X.; Lu, S.; Wu, Q. Maize Leaf Disease Recognition Based on Improved Convolutional Neural Network ShuffleNetV2. Plants 2024, 13, 1621.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the Computer Vision–ECCV 2020, Glasgow, UK, 23–28 August 2020.
- Mehta, S.; Rastegari, M. MobileViT: Light-weight, general-purpose, and mobile-friendly vision transformer. arXiv 2021, arXiv:2110.02178.
Test platform configuration (Section 2.4):

Item | Configuration |
---|---|
Operating system | Windows-10-10.0.18362-SP0 |
CPU | Intel(R) Xeon(R) Gold 6130 @ 2.10 GHz |
GPU | NVIDIA Quadro RTX 8000 |
CUDA version | 12.1 |
Python version | 3.12 |
Base model | YOLOv8s.pt |
Comparison of lightweight backbone networks (Section 3.1.1):

Backbone Network | mAP@.5/% | mAP@.5:.95/% | Params/MB | Time/ms |
---|---|---|---|---|
YOLOv8-FasterNet | 90.0 | 70.7 | 8.05 | 2.4 |
MobileNetV3 | 90.1 | 69.2 | 37.21 | 12.1 |
GhostNet | 89.6 | 68.1 | 15.53 | 7.7 |
ShuffleNetV2 | 89.0 | 67.8 | 11.85 | 6.2 |
Comparison of attention mechanisms (Section 3.1.2):

Model | mAP@.5/% | mAP@.5:.95/% | Params/MB | Time/ms |
---|---|---|---|---|
YOLOv8 | 90.1 | 71.0 | 26.3 | 5.4 |
YOLOv8-SE | 84.2 | 51.9 | 29.5 | 6.2 |
YOLOv8-CA | 81.4 | 58.4 | 28.7 | 5.1 |
YOLOv8-CBAM | 91.6 | 66.3 | 30.1 | 6.7 |
YOLOv8-GAM | 92.3 | 72.7 | 31.8 | 6.5 |
Ablation experiments (Section 3.1.4); √ marks the component enabled in each run:

Experiment | Baseline Model | FasterNet | GAM | WIoU | mAP@.5/% | mAP@.5:.95/% | Params/MB | Time/ms |
---|---|---|---|---|---|---|---|---|
1 | YOLOv8 |  |  |  | 90.1 | 71.0 | 26.3 | 5.4 |
2 | YOLOv8 | √ |  |  | 90.0 | 70.7 | 8.05 | 2.4 |
3 | YOLOv8 |  | √ |  | 92.3 | 72.7 | 31.8 | 6.5 |
4 | YOLOv8 |  |  | √ | 91.2 | 70.0 | 32.4 | 6.1 |
5 | YOLOv8 | √ | √ |  | 93.7 | 74.8 | 13.7 | 3.2 |
6 | YOLOv8 | √ |  | √ | 93.1 | 70.9 | 12.8 | 4.7 |
7 | YOLOv8 |  | √ | √ | 92.8 | 71.2 | 34.6 | 7.1 |
8 | YOLOv8 | √ | √ | √ | 94.0 | 74.4 | 18.1 | 4.2 |
Comparison with SOTA models (Section 3.1.5):

Num | Model | mAP@.5/% | mAP@.5:.95/% | Params/MB | Time/ms |
---|---|---|---|---|---|
1 | YOLOv8n | 87.2 | 70.1 | 3.1 | 1.8 |
2 | YOLOv8l | 91.5 | 72.5 | 43.4 | 8.2 |
3 | DETR-R50 | 89.7 | 71.2 | 41.2 | 15.3 |
4 | RT-DETR-L | 92.1 | 72.6 | 32.0 | 6.5 |
5 | MobileViT-XXS | 88.3 | 70.9 | 5.6 | 3.4 |
6 | This study | 94.0 | 75.1 | 18.1 | 4.2 |
Edge-deployment counting test, five detection runs per tree (Section 3.2):

Tree ID | Automatic Detected | Manual Counted | Std. Dev. | 95% Confidence Interval | Mean Time/ms |
---|---|---|---|---|---|
21 | 88,88,89,90,89 | 91 | 0.836 | [87.76, 89.84] | 14.2 |
13 | 47,47,48,49,48 | 49 | 0.836 | [46.76, 48.84] | 14.3 |
38 | 50,50,52,51,52 | 52 | 1.000 | [49.76, 52.24] | 15.0 |
84 | 86,87,86,86,88 | 88 | 0.894 | [85.49, 87.71] | 13.8 |
105 | 70,70,71,72,71 | 74 | 0.836 | [69.76, 71.84] | 15.2 |
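The per-tree confidence intervals in the table can be reproduced from the five detection runs with a Student-t interval (the Std. Dev. column is the sample standard deviation of the runs, which the numbers confirm). A short check for tree 21:

```python
# Reproduce the tree-21 row: sample mean, sample standard deviation, and a
# Student-t 95% confidence interval over five runs. t_{0.975, df=4} = 2.776.
from math import sqrt
from statistics import mean, stdev

runs = [88, 88, 89, 90, 89]          # automatic counts for tree 21
m, s = mean(runs), stdev(runs)       # 88.8 and 0.837, matching the table
half = 2.776 * s / sqrt(len(runs))   # t-critical * standard error
print(f"mean={m:.2f} std={s:.3f} CI=[{m - half:.2f}, {m + half:.2f}]")
# -> mean=88.80 std=0.837 CI=[87.76, 89.84]
```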