An Improved Lightweight Real-Time Detection Algorithm Based on the Edge Computing Platform for UAV Images
Abstract
1. Introduction
1.1. Background
1.2. Contributions
- (1) We compare, via calculation, the parameter compression achieved by the depthwise separable convolution module and the C3 module.
- (2) The lightweight MobileNetV3 and the ECA attention mechanism are used to compress the backbone network of YOLOv5.
- (3) A prediction head is added to improve the detection ability of the model.
- (4) The FocalEIoU loss function is introduced into YOLOv5 to improve the localization accuracy.
- (5) Two kinds of neck structures are designed to meet the needs of different embedded devices.
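Contribution (1) rests on a parameter count. As a minimal sketch of that kind of calculation, the snippet below compares a standard convolution with a depthwise separable one using the textbook formulas, ignoring bias and batch-norm terms. The function names and the example channel sizes are illustrative assumptions, and the C3 module itself is not modeled here.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def compression_ratio(k: int, c_in: int, c_out: int) -> float:
    """Depthwise separable parameter count as a fraction of the standard count."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


# Hypothetical layer widths, chosen only for illustration.
for channels in (64, 128, 256):
    ratio = compression_ratio(3, channels, channels)
    print(f"{channels} -> {channels} channels, 3x3 kernel: ratio = {ratio:.4f}")
```

The ratio simplifies to 1/c_out + 1/k², so for 3×3 kernels it approaches 1/9 as the output width grows.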
1.3. Organization
2. Methods
2.1. MobileNetV3 Backbone Network
2.2. ECA Attention Mechanism
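As a rough illustration of how ECA computes channel weights, the sketch below follows the adaptive kernel-size rule from the ECA-Net paper (with its defaults γ = 2, b = 1) and applies a 1D convolution plus sigmoid across the pooled channel descriptor. It is a pure-Python sketch rather than the authors' implementation; the function names, the zero padding at the channel boundaries, and the example kernel weights are assumptions made for illustration.

```python
import math


def eca_kernel_size(channels: int, gamma: int = 2, b: int = 1) -> int:
    """Adaptive 1D kernel size: nearest odd value of (log2(C) + b) / gamma."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_weights(pooled, kernel):
    """Channel weights from a globally average-pooled descriptor.

    pooled: one average value per channel.
    kernel: odd-length list of 1D convolution weights shared by all channels.
    """
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(pooled) + [0.0] * pad  # zero padding (assumed)
    weights = []
    for i in range(len(pooled)):
        s = sum(w * padded[i + j] for j, w in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate
    return weights


# Hypothetical 4-channel descriptor and kernel weights, for illustration only.
print(eca_kernel_size(128))
print(eca_weights([0.2, -0.1, 0.4, 0.0], [0.5, 1.0, 0.5]))
```

Each feature map would then be multiplied by its weight. Unlike an SE block, no fully connected layers are involved, so the only learned parameters are the few kernel weights.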
3. YOLOv5 Algorithm Improvement
3.1. Improvement of the Backbone Network
3.2. Improvement of the Prediction Head
3.3. Loss Function Improvement
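For orientation, the sketch below evaluates a Focal-EIoU loss for a single pair of axis-aligned boxes, following the published definition: the EIoU term adds center-distance, width, and height penalties to 1 − IoU, and the focal form weights it by IoU^γ. It is a scalar pure-Python sketch, not the training code used in the paper; the function name, the (x1, y1, x2, y2) box format, γ = 0.5, and the epsilon guard are assumptions.

```python
def focal_eiou_loss(box, gt, gamma: float = 0.5, eps: float = 1e-9) -> float:
    """Focal-EIoU loss for two boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter + eps)

    # Smallest box enclosing both boxes
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    eiou = (
        1.0 - iou
        + rho2 / (cw ** 2 + ch ** 2 + eps)   # center-distance penalty
        + (w - gw) ** 2 / (cw ** 2 + eps)    # width penalty
        + (h - gh) ** 2 / (ch ** 2 + eps)    # height penalty
    )
    return (iou ** gamma) * eiou             # focal re-weighting by IoU


# Hypothetical boxes, for illustration only.
print(focal_eiou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

In this plain form the IoU^γ factor gives zero weight to boxes that do not overlap at all, so a practical implementation may need to treat that case separately.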
4. Experiments
4.1. Experimental Introduction
4.1.1. Experimental Environment
4.1.2. Experimental Dataset
4.1.3. Evaluating Indicators
4.2. Experimental Results and Comparisons
4.3. Performance Comparison of Mainstream YOLO Series Algorithms
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| V3 | ECA | SL | Method | Params (M) | FLOPs (G) | Inference Time (ms) |
|---|---|---|---|---|---|---|
| | | | YOLOv5l (baseline) | 46.2 | 107.9 | 65.5 |
| ✓ | | | M-YOLOv5 | 21.8 | 40.2 | 35.7 |
| ✓ | ✓ | | ME-YOLOv5 | 20.3 | 40.2 | 34.5 |
| ✓ | ✓ | ✓ | MEL-YOLOv5-S | 3.4 | 9.6 | 21.8 |
| ✓ | ✓ | ✓ | MEL-YOLOv5-L | 5.8 | 56.8 | 43.5 |
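
To make the size of these savings easier to read, the following sketch computes the percentage reduction of each variant relative to the YOLOv5l baseline. The numbers are copied from the table above; the dictionary layout and the helper name `reduction_pct` are illustrative assumptions.

```python
# (Params in M, FLOPs in G, inference time in ms), copied from the table above.
baseline = (46.2, 107.9, 65.5)  # YOLOv5l
variants = {
    "M-YOLOv5": (21.8, 40.2, 35.7),
    "ME-YOLOv5": (20.3, 40.2, 34.5),
    "MEL-YOLOv5-S": (3.4, 9.6, 21.8),
    "MEL-YOLOv5-L": (5.8, 56.8, 43.5),
}


def reduction_pct(base: float, new: float) -> float:
    """Percentage reduction of `new` relative to `base`."""
    return 100.0 * (base - new) / base


for name, values in variants.items():
    params, flops, time_ms = (reduction_pct(b, v) for b, v in zip(baseline, values))
    print(f"{name}: params -{params:.1f}%, FLOPs -{flops:.1f}%, time -{time_ms:.1f}%")
```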

| Method | mAP@0.5 (%) | mAP@0.5:0.95 (%) |
|---|---|---|
| MEL-YOLOv5-S | 34.8 | 18.3 |
| MELF-YOLOv5-S | 34.8 | 18.7 |
| MEL-YOLOv5-L | 46.8 | 26.8 |
| MELF-YOLOv5-L | 46.9 | 27.7 |

| Method | Size | mAP@0.5 (%) | mAP@0.5:0.95 (%) | Params (M) | FLOPs (G) | Inference Time (ms) |
|---|---|---|---|---|---|---|
| YOLOv3-Tiny | 448 | 11.5 | 4.8 | 8.7 | 12.9 | 19.4 |
| | 640 | 16.2 | 7.0 | | | 21.2 |
| | 832 | 19.6 | 8.4 | | | 23.0 |
| YOLOv4-Tiny | 446 | 16.6 | 9.0 | 5.9 | 16.2 | 15.5 |
| | 640 | 24.4 | 13.5 | | | 17.7 |
| | 832 | 29.8 | 16.7 | | | 22.7 |
| YOLOv5s | 446 | 26.5 | 13.8 | 7.0 | 15.9 | 20.0 |
| | 640 | 33.5 | 17.8 | | | 21.5 |
| | 832 | 37.1 | 20.0 | | | 24.6 |
| MELF-YOLOv5-S | 446 | 29.1 | 14.4 | 3.4 | 9.8 | 20.4 |
| | 640 | 34.8 | 18.7 | | | 21.8 |
| | 832 | 39.3 | 21.2 | | | 29.6 |
| YOLOv3 | 448 | 32.8 | 17.7 | 61.5 | 154.9 | 53.4 |
| | 640 | 40.3 | 22.3 | | | 81.5 |
| | 832 | 43.4 | 24.3 | | | 121.2 |
| YOLOv4 | 446 | 36.5 | 21.3 | 64.0 | 141.6 | 67.2 |
| | 640 | 45.3 | 27.3 | | | 67.6 |
| | 832 | 50.4 | 30.8 | | | 83.3 |
| YOLOv5l | 446 | 31.5 | 17.3 | 46.2 | 107.9 | 44.2 |
| | 640 | 39.6 | 22.5 | | | 65.5 |
| | 832 | 43.4 | 25.0 | | | 99.1 |
| YOLOv5x | 446 | 33.2 | 18.6 | 86.2 | 204.2 | 78.5 |
| | 640 | 41.0 | 23.6 | | | 118.2 |
| | 832 | 44.9 | 26.0 | | | 175.5 |
| MELF-YOLOv5-L | 446 | 39.0 | 22.4 | 5.8 | 56.8 | 33.7 |
| | 640 | 46.9 | 27.7 | | | 43.5 |
| | 832 | 50.2 | 30.2 | | | 63.9 |
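
Because real-time suitability is usually judged in frames per second, the sketch below converts the reported inference times at input size 640 into throughput. The values are copied from the comparison table above; the list layout and the helper name `fps` are illustrative assumptions, and the conversion assumes one image per inference with no other pipeline overhead.

```python
# (method, mAP@0.5 in %, inference time in ms) at input size 640, from the table above.
rows_640 = [
    ("YOLOv3-Tiny", 16.2, 21.2),
    ("YOLOv4-Tiny", 24.4, 17.7),
    ("YOLOv5s", 33.5, 21.5),
    ("MELF-YOLOv5-S", 34.8, 21.8),
    ("YOLOv3", 40.3, 81.5),
    ("YOLOv4", 45.3, 67.6),
    ("YOLOv5l", 39.6, 65.5),
    ("YOLOv5x", 41.0, 118.2),
    ("MELF-YOLOv5-L", 46.9, 43.5),
]


def fps(time_ms: float) -> float:
    """Frames per second implied by a per-image inference time in milliseconds."""
    return 1000.0 / time_ms


# List the methods from highest to lowest mAP@0.5 with their implied throughput.
for method, map50, time_ms in sorted(rows_640, key=lambda r: r[1], reverse=True):
    print(f"{method:15s} mAP@0.5 = {map50:4.1f}%  {fps(time_ms):5.1f} FPS")
```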
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cao, L.; Song, P.; Wang, Y.; Yang, Y.; Peng, B. An Improved Lightweight Real-Time Detection Algorithm Based on the Edge Computing Platform for UAV Images. Electronics 2023, 12, 2274. https://doi.org/10.3390/electronics12102274