Underwater Robot Target Detection Algorithm Based on YOLOv8
Abstract
1. Introduction
- (1) Underwater images are frequently deformed, so a convolution that samples along a preset, fixed path processes them poorly. For this reason, we modify only the convolutional layer at layer 9, replacing the original convolution with the adaptive deformable convolution DCNv3, which captures shape changes and subtle features of underwater targets more effectively and handles underwater image deformation with fewer parameters.
- (2) SPPFCSPC (spatial pyramid pooling-fast with cross-stage partial connections) is used to improve multi-scale target recognition and to strengthen feature expression by fusing shallow, low-level features with high-level semantic features.
- (3) The CIoU (complete intersection over union) loss function is replaced with WIoU loss v3, which improves the model's overall performance.
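
The loss swap in contribution (3) can be illustrated with a minimal sketch for a single predicted/ground-truth box pair, following the published Wise-IoU formulation (Tong et al., 2023). The function names, the `(x1, y1, x2, y2)` box layout, and the hyperparameter values `alpha=1.9` and `delta=3.0` are assumptions made for illustration; they are not taken from this article. In a real training loop the enclosing-box size and `beta` would be excluded from the gradient, and `mean_iou_loss` would be a running average over recent batches.

```python
import math


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic focusing gain r = beta / (delta * alpha**(beta - delta))."""
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred, gt, mean_iou_loss, alpha=1.9, delta=3.0):
    """WIoU v3 loss for one box pair; boxes are (x1, y1, x2, y2)."""
    # Plain IoU loss: 1 - intersection / union.
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    union = ((pred[2] - pred[0]) * (pred[3] - pred[1])
             + (gt[2] - gt[0]) * (gt[3] - gt[1]) - inter)
    iou_loss = 1.0 - inter / union

    # Distance term: squared centre distance normalised by the squared
    # diagonal of the smallest box enclosing both boxes.
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])
    dx = (pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2
    r_wiou = math.exp((dx * dx + dy * dy) / (wg * wg + hg * hg))

    # Outlier degree: this box's IoU loss relative to the running mean
    # (assumed to be supplied by the caller).
    beta = iou_loss / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_wiou * iou_loss


if __name__ == "__main__":
    print(wiou_v3((0, 0, 2, 2), (1, 1, 3, 3), mean_iou_loss=6 / 7))
```

With these assumed defaults the gain peaks near beta = 1/ln(alpha), roughly 1.56, and equals 1 at beta = delta, so boxes of ordinary quality receive the largest weight while clear outliers are down-weighted.
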
2. Underwater Target Detection Algorithm Based on Improved YOLOv8
2.1. YOLOv8 Object Detection Algorithm
2.2. Improvement of YOLOv8 Object Detection Network
2.2.1. Improvement of Original Convolution
2.2.2. Improvement of Spatial Pyramid Pooling
2.2.3. Improvement of Loss Function
2.3. Improved YOLOv8 Object Detection Network
3. Experiments and Analysis
3.1. Datasets and Experimental Environment
3.2. Underwater Robot Experimental Platform
3.3. Evaluation Indicators
3.4. Experimental Results and Analysis
3.4.1. Verification of the Enhanced YOLOv8 Object Detection Algorithm
3.4.2. Comparison of Detection Performance Across Object Detection Models
3.4.3. Ablation Experiment
3.4.4. Underwater Robot Prototype Grasping Experiment and Result Analysis
- (1) Artificial water tank experiment
- (2) Natural water experiments
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ROV | Remotely Operated Vehicle |
| CNN | Convolutional Neural Network | 
| R-CNN | Region-Based Convolutional Neural Network | 
| Fast R-CNN | Fast Region-Based Convolutional Neural Network | 
| Faster R-CNN | Faster Region-Based Convolutional Neural Network | 
| SSD | Single Shot MultiBox Detector | 
| YOLO | You Only Look Once | 
| ResNet | Residual Network |
| SPPF | Spatial Pyramid Pooling-Fast | 
| SPPFCSPC | Spatial Pyramid Pooling-Fast and Cross-Stage Partial Connection | 
| WIoU loss | Wise Intersection over Union Loss |
| CIoU loss | Complete Intersection over Union Loss | 
| DCNv3 | Deformable ConvNet v3 | 
| URPC | Underwater Robot Professional Challenge | 

| Parameter | Value |
|---|---|
| Batch size | 4 |
| Learning rate | 0.01 |
| Optimizer | SGD |
| Weight decay | 0.0005 |
| Confidence threshold | 0.5 |

| Model Name | AP (%), Sea Urchin | AP (%), Sea Cucumber | AP (%), Starfish | AP (%), Scallop | mAP (%) | FPS |
|---|---|---|---|---|---|---|
| SSD | 74.7 | 69.9 | 75.2 | 60.2 | 70.0 | 21 |
| YOLOv5s | 91.3 | 75.1 | 85.0 | 84.4 | 83.9 | 97 |
| RetinaNet | 77.2 | 68.1 | 78.3 | 61.2 | 71.2 | 26 |
| Faster R-CNN | 87.4 | 69.4 | 80.5 | 61.3 | 74.4 | 12 |
| YOLOv8s | 90.1 | 74.7 | 87.3 | 85.5 | 84.4 | 96 |
| Improved YOLOv8s | 92.1 | 76.5 | 90.2 | 87.2 | 86.5 | 85 |

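
The mAP column of the comparison table can be reproduced from the per-class AP columns. The short sketch below assumes that mAP here is the unweighted mean of the four class APs (sea urchin, sea cucumber, starfish, scallop); the dictionary and function names are illustrative only, and the values are copied from the two YOLOv8 rows of the table.

```python
# Per-class AP (%) in the order: sea urchin, sea cucumber, starfish, scallop.
per_class_ap = {
    "YOLOv8s": [90.1, 74.7, 87.3, 85.5],
    "Improved YOLOv8s": [92.1, 76.5, 90.2, 87.2],
}


def mean_ap(values):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for name, values in per_class_ap.items():
        print(f"{name}: mAP = {mean_ap(values):.1f}%")
```

Under that assumption the two rows average to 84.4% and 86.5%, matching the reported mAP values, a gain of about 2.1 percentage points over the unmodified YOLOv8s.
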
| Model | B | S | D | W | mAP (%) | FPS | FLOPs (G) | Parameters (M) |
|---|---|---|---|---|---|---|---|---|
| 1 | √ |  |  |  | 84.9 | 96.3 | 28.4 | 11.1 |
| 2 |  | √ |  |  | 85.2 | 90.2 | 31.6 | 14.5 |
| 3 |  | √ | √ |  | 86.3 | 87.9 | 33.2 | 17.6 |
| 4 |  | √ | √ | √ | 86.5 | 85.7 | 33.2 | 17.6 |
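
The accuracy and cost trade-off in the ablation table can be made explicit by comparing Model 1 with Model 4. This is a small arithmetic sketch using only the figures reported in those two rows; the dictionary keys and the helper name are illustrative.

```python
# Figures copied from the Model 1 and Model 4 rows of the ablation table.
model_1 = {"mAP": 84.9, "FPS": 96.3, "FLOPs": 28.4, "params": 11.1}
model_4 = {"mAP": 86.5, "FPS": 85.7, "FLOPs": 33.2, "params": 17.6}


def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # mAP is already a percentage, so report its change in points.
    print(f"mAP: {model_4['mAP'] - model_1['mAP']:+.1f} points")
    for key in ("FPS", "FLOPs", "params"):
        print(f"{key}: {relative_change(model_1[key], model_4[key]):+.1f}%")
```

On these figures the full model gains 1.6 mAP points over Model 1 while running about 11% slower, with roughly 17% more FLOPs and about 59% more parameters.
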
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Song, G.; Chen, W.; Zhou, Q.; Guo, C. Underwater Robot Target Detection Algorithm Based on YOLOv8. Electronics 2024, 13, 3374. https://doi.org/10.3390/electronics13173374