A Efficient and Accurate UAV Detection Method Based on YOLOv5s
Abstract
1. Introduction
- The backbone network of YOLOv5s was rebuilt with ShuffleNetV2 and a coordinate attention mechanism, which simplifies the network structure and reduces the number of model parameters.
- The neck network of YOLOv5s was rebuilt with Bi-FPN and Ghost convolution to extract richer and more accurate features.
- The CIoU loss function was replaced with EIoU, which speeds up network convergence and improves the model's localization ability.
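
As a rough illustration of why Ghost convolution reduces the size of the neck, the sketch below compares the weight count of an ordinary convolution with that of a Ghost module, following the GhostNet formulation (a primary convolution produces a fraction of the output channels, and cheap depthwise operations generate the rest). The layer sizes, the ratio `s`, and the depthwise kernel size `d` are illustrative assumptions, not values taken from this paper; biases and batch-norm parameters are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, s=2, d=3):
    """Weight count of a Ghost module (bias ignored).

    s: ratio between total and intrinsic output channels (assumed value).
    d: kernel size of the cheap depthwise operations (assumed value).
    """
    if c_out % s != 0:
        raise ValueError("c_out must be divisible by s")
    intrinsic = c_out // s
    # Primary convolution produces only the intrinsic feature maps.
    primary = conv_params(c_in, intrinsic, k)
    # Each intrinsic map spawns (s - 1) ghost maps via a d x d depthwise filter.
    cheap = (s - 1) * intrinsic * d * d
    return primary + cheap


ordinary = conv_params(128, 128, 3)
ghost = ghost_params(128, 128, 3, s=2, d=3)
print(ordinary, ghost, round(ordinary / ghost, 2))
```

With `s = 2` the Ghost module needs roughly half the weights of the ordinary convolution, which is the source of the savings the neck redesign relies on.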
2. Related Work
2.1. Small Object Detection Method
2.2. Lightweight Network
3. Materials and Methods
3.1. Lightweight Feature Extraction Module
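
ShuffleNetV2 units rely on a channel shuffle step so that information can flow between channel groups. The sketch below shows that operation on a flat list of channel labels; it is a minimal, framework-free illustration (the function name and the list representation are assumptions made for this example), not the implementation used in this work.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping (groups, n) -> transposing -> flattening.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # View the flat channel list as a groups x per_group grid.
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Reading the grid column by column interleaves the groups.
    return [grid[g][i] for i in range(per_group) for g in range(groups)]


print(channel_shuffle(["a0", "a1", "a2", "b0", "b1", "b2"], groups=2))
```

After the shuffle, every consecutive pair of channels contains one channel from each original group, so the next grouped or split operation sees features from both branches.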
3.2. Multi-Scale Feature Fusion Module
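
Bi-FPN, as introduced with EfficientDet, fuses feature maps with learnable non-negative weights using fast normalized fusion: each input is scaled by its weight and the sum is divided by the total weight plus a small constant. The sketch below applies that rule to plain Python lists that stand in for feature maps already resized to a common resolution. The function name, the list representation, and the value of `eps` are assumptions for illustration only.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors with non-negative normalized weights."""
    if len(features) != len(weights):
        raise ValueError("one weight per input feature is required")
    # ReLU keeps every weight non-negative so the normalization stays stable.
    clipped = [max(0.0, w) for w in weights]
    total = eps + sum(clipped)
    length = len(features[0])
    return [
        sum(w * f[i] for w, f in zip(clipped, features)) / total
        for i in range(length)
    ]


fused = fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0])
print(fused)
```

Equal weights give approximately the element-wise mean of the inputs, while a weight driven to zero or below removes that input from the fusion.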
3.3. Bounding Box EIoU Loss Function
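
The EIoU loss, as defined by Zhang et al., adds three penalties to the IoU term: the normalized distance between box centers, and the normalized differences in width and in height, each measured against the smallest enclosing box. The sketch below computes it for a single pair of axis-aligned boxes. The `(x1, y1, x2, y2)` box format, the function name, and the `eps` constant are assumptions made for this example; it is not the training code of this work.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)

    # Squared distance between the two box centers.
    center_dist2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
        + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    return (
        1.0 - iou
        + center_dist2 / (cw ** 2 + ch ** 2 + eps)  # center penalty
        + (w - gw) ** 2 / (cw ** 2 + eps)           # width penalty
        + (h - gh) ** 2 / (ch ** 2 + eps)           # height penalty
    )


print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
print(eiou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike CIoU, which penalizes the aspect ratio as a single term, this formulation penalizes width and height errors separately, so the loss keeps a useful gradient even when the predicted box has the correct aspect ratio but the wrong size.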
4. Results
4.1. UAV Dataset
4.2. Evaluation Indicators for Experiments
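
The result tables report precision (P), recall (R), and mAP. Assuming the usual detection definitions, precision is TP / (TP + FP), recall is TP / (TP + FN), and average precision is the area under the precision-recall curve. The sketch below computes these from counts and from a ranked list of detections, using all-point interpolation for AP; the exact AP protocol used in the experiments is not stated here, so treat the interpolation choice and the function names as assumptions.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def average_precision(is_tp, num_gt):
    """AP from detections sorted by descending confidence.

    is_tp: list of booleans, True where the detection matches a ground truth.
    num_gt: number of ground-truth objects.
    """
    recalls, precisions = [], []
    tp = 0
    for rank, hit in enumerate(is_tp, start=1):
        tp += hit
        recalls.append(tp / num_gt)
        precisions.append(tp / rank)
    # Replace each precision with the best precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the resulting step curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


print(precision_recall(8, 2, 2))
print(average_precision([True, False, True], num_gt=2))
```

mAP is then the mean of the per-class AP values; with a single UAV class it reduces to the AP of that class.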
4.3. Experimental Environment Setup
4.4. Experimental Results Analysis
5. Discussion
6. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Algorithms | Small Object Detection Methods | Backbone Network | ||||||
---|---|---|---|---|---|---|---|---|
Multi-Scale Prediction | Data Augmentation | Improving Feature Resolution | Based on Contextual Information | New Backbone Network | ResNet−101 | ResNeXt−101 | DetNet−59 | |
SSD513 | √ | √ | √ | |||||
DSSD513 | √ | √ | √ | |||||
FPN | √ | √ | ||||||
PANet | √ | √ | ||||||
SNIPER | √ | √ | ||||||
CoupleNet | √ | √ | ||||||
DetNet | √ | √ | ||||||
DetectoRS | √ | √ |

Name | Related Parameters |
---|---|
GPU | NVIDIA GeForce RTX 4070 |
CPU / RAM | Intel Core i9-14900KF / 32 GB |
GPU Memory | 12 GB |
Operating System | Ubuntu 20.04 |
Computing Platform | CUDA 12.1 |

Model | mAP/% | Recall/% | Precision/% | GFLOPs | FPS |
---|---|---|---|---|---|
YOLOv5s | 93.9 | 90.2 | 93.0 | 16.0 | 153 |
YOLOv5s-G | 91.7 | 88.7 | 92.4 | 5.8 | 178 |
YOLOv5s-SV1 | 92.2 | 88.5 | 92.5 | 3.6 | 177 |
YOLOv5s-MV2 | 92.5 | 87.1 | 92.6 | 2.8 | 183 |
YOLOv5s-SV2 | 91.1 | 86.3 | 91.2 | 2.3 | 186 |

Model | mAP/% | Recall/% | Precision/% | GFLOPs | FPS |
---|---|---|---|---|---|
YOLOv5s-SV2-P | 91.1 | 86.3 | 91.2 | 2.3 | 186 |
YOLOv5s-SV2-B | 91.4 | 87.4 | 91.8 | 2.2 | 187 |

Model | mAP/% | Recall/% | Precision/% | Params/M | GFLOPs | FPS |
---|---|---|---|---|---|---|
YOLOv5s | 93.9 | 90.2 | 93.0 | 44.5 | 16.0 | 153 |
YOLOv5s-SG | 90.9 | 85.7 | 90.6 | 22 | 2.1 | 195 |
YOLOv5s-SGB | 91.9 | 87.0 | 91.6 | 18.3 | 2.2 | 189 |
YOLOv5s-SGBC | 92.6 | 88.2 | 92.0 | 16.3 | 2.2 | 188 |
EDU-YOLO | 92.8 | 88.6 | 92.1 | 16 | 2.2 | 188 |
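
The trade-off in the table above can be summarized as relative changes between the YOLOv5s baseline and EDU-YOLO. The short script below derives them from the tabulated values only; the dictionary layout and helper name are assumptions for this example, and the percentages are computed from the table, not quoted from the text.

```python
# Values copied from the YOLOv5s and EDU-YOLO rows of the table above.
baseline = {"map": 93.9, "params_m": 44.5, "gflops": 16.0, "fps": 153}
edu_yolo = {"map": 92.8, "params_m": 16.0, "gflops": 2.2, "fps": 188}


def relative_change(old, new):
    """Signed change of new relative to old, in percent."""
    return (new - old) / old * 100.0


summary = {
    "params_change_pct": relative_change(baseline["params_m"], edu_yolo["params_m"]),
    "gflops_change_pct": relative_change(baseline["gflops"], edu_yolo["gflops"]),
    "fps_change_pct": relative_change(baseline["fps"], edu_yolo["fps"]),
    "map_change_points": edu_yolo["map"] - baseline["map"],
}

for name, value in summary.items():
    print(f"{name}: {value:+.2f}")
```

On these numbers the parameter count falls by about 64%, the computation by about 86%, and the frame rate rises by about 23%, at a cost of 1.1 percentage points of mAP.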

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Feng, Y.; Wang, T.; Jiang, Q.; Zhang, C.; Sun, S.; Qian, W. A Efficient and Accurate UAV Detection Method Based on YOLOv5s. Appl. Sci. 2024, 14, 6398. https://doi.org/10.3390/app14156398