Lightweight Object Detector Based on Images Captured Using Unmanned Aerial Vehicle
Abstract
1. Introduction
2. Materials and Methods
2.1. Materials
2.1.1. Object Detection
2.1.2. FasterNet Block
2.1.3. Coordinate Attention
2.2. Methods
2.2.1. Improvements in the YOLOv8 Model
2.2.2. C2f Module in Conjunction with FasterNet
2.2.3. Self-Weight Coordinate Attention
2.2.4. C2f Module Incorporating Lightweight Self-Weight Coordinate Attention
3. Results
3.1. Dataset and Experimental Environment
3.2. Indicators for Model Evaluation
3.3. Experiments and Analysis of Results
3.3.1. Comparative Experiments on Model Lightweighting
3.3.2. Comprehensive Comparison Experiments
3.3.3. Ablation Experiments
4. Conclusions
5. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
CFS | C2f-Faster self-weight coordinate attention |
CA | Coordinate attention |
SWCA | Self-weight coordinate attention |

Item | Configuration |
---|---|
Operating system | Ubuntu 16.04 |
GPU | NVIDIA RTX 2080 Ti |
CUDA | 10.2 |
Deep learning framework | PyTorch 1.12.0 |
Language | Python 3.8 |

Model | mAP50 (%) | mAP50:95 (%) | GFLOPs | Params (M) |
---|---|---|---|---|
Baseline | 39.9 | 23.8 | 28.5 | 11.13 |
C2f-Faster (Backbone) | 39.3 | 23.4 | 21.4 | 8.30 |
C2f-Faster (Neck) | 40.0 | 24.1 | 25.6 | 9.75 |
C2f-Faster (All) | 39.3 | 23.8 | 21.4 | 8.10 |

Model | Image Size | Params (M) | mAP50 (%) | mAP50:95 (%) |
---|---|---|---|---|
Yolov3-tiny | 640 × 640 | 8.68 | 13.5 | 5.8 |
Yolov5s | 640 × 640 | 7.03 | 26.4 | 14.2 |
VA-Yolo | 640 × 640 | 6.56 | 23.6 | 12.6 |
PP-Yolo | 640 × 640 | 52.2 | 39.6 | 24.6 |
FasterRCNN ResNeXt101 | 640 × 640 | - | 40.2 | 22.6 |
YoloX-X | 640 × 640 | 99.1 | 43.2 | 25.8 |
Swin-T | 640 × 640 | 38.6 | 42.5 | 23.1 |
DDETR | 640 × 640 | 39.8 | 42.7 | 24.8 |
Yolov8s | 640 × 640 | 11.13 | 39.9 | 23.8 |
Yolo-CFS | 640 × 640 | 8.61 | 40.0 | 23.8 |

Model | mAP50 (%) | mAP50:95 (%) | GFLOPs | Params (M) |
---|---|---|---|---|
Baseline | 39.9 | 23.8 | 28.5 | 11.13 |
+C2f-Faster | 39.3 | 23.8 | 21.4 | 8.30 |
+C2f-CA | 40.4 | 24.4 | 28.4 | 11.15 |
+C2f-SWCA | 40.6 | 24.4 | 28.5 | 11.56 |
+C2f-Faster-SWCA | 40.0 | 23.8 | 21.3 | 8.61 |
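
The trade-off in the ablation table is easier to read as relative changes against the baseline. The following is a minimal sketch that recomputes those changes from the table rows; the dictionary keys and the helper names `percent_change` and `summarize` are illustrative assumptions, not part of the paper.

```python
# Rows copied from the ablation table above. The key names
# (map50, map50_95, gflops, params_m) are illustrative, not from the paper.
ABLATION = {
    "Baseline":         {"map50": 39.9, "map50_95": 23.8, "gflops": 28.5, "params_m": 11.13},
    "+C2f-Faster":      {"map50": 39.3, "map50_95": 23.8, "gflops": 21.4, "params_m": 8.30},
    "+C2f-CA":          {"map50": 40.4, "map50_95": 24.4, "gflops": 28.4, "params_m": 11.15},
    "+C2f-SWCA":        {"map50": 40.6, "map50_95": 24.4, "gflops": 28.5, "params_m": 11.56},
    "+C2f-Faster-SWCA": {"map50": 40.0, "map50_95": 23.8, "gflops": 21.3, "params_m": 8.61},
}


def percent_change(new: float, base: float) -> float:
    """Relative change of `new` with respect to `base`, in percent."""
    return (new - base) / base * 100.0


def summarize(rows: dict, baseline: str = "Baseline") -> dict:
    """Compare every non-baseline row with the baseline row.

    Accuracy is reported as an absolute difference in mAP points;
    cost (GFLOPs, parameters) is reported as a relative change in percent.
    """
    base = rows[baseline]
    summary = {}
    for name, row in rows.items():
        if name == baseline:
            continue
        summary[name] = {
            "map50_delta": row["map50"] - base["map50"],
            "map50_95_delta": row["map50_95"] - base["map50_95"],
            "gflops_pct": percent_change(row["gflops"], base["gflops"]),
            "params_pct": percent_change(row["params_m"], base["params_m"]),
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(ABLATION).items():
        print(
            f"{name:18s} mAP50 {s['map50_delta']:+.1f}  "
            f"mAP50:95 {s['map50_95_delta']:+.1f}  "
            f"GFLOPs {s['gflops_pct']:+.1f}%  "
            f"Params {s['params_pct']:+.1f}%"
        )
```

With the tabulated values, the +C2f-Faster-SWCA row works out to roughly 22.6% fewer parameters and 25.3% fewer GFLOPs than the baseline, with mAP50 higher by 0.1 point and mAP50:95 unchanged.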
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, D.; Sui, J.; Zhang, J.; Wang, H. Lightweight Object Detector Based on Images Captured Using Unmanned Aerial Vehicle. Appl. Sci. 2025, 15, 7482. https://doi.org/10.3390/app15137482