EGM-YOLOv8: A Lightweight Ship Detection Model with Efficient Feature Fusion and Attention Mechanisms
Abstract
1. Introduction
- We propose a ship detection method, EGM-YOLOv8, which integrates the efficient ECA module into the YOLOv8 backbone to strengthen the network's ability to extract vessel features.
- We combine the lightweight GELAN with PANet to build a lighter, more efficient neck that better fuses vessel information across feature levels, balancing model precision, speed, and parameter count.
- We replace the original bounding-box loss with MPDIoU, which accelerates the convergence of the detection model and improves the regression accuracy of predicted boxes.
- Comprehensive experiments on the publicly available Seaships and Mcships datasets validate the contribution of each improvement module and of different attention mechanisms to ship detection performance. In addition, comparisons with both domain-specific and general CNN-based detectors demonstrate reliable inshore vessel detection.
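For concreteness, a minimal PyTorch sketch of the ECA module (following Wang et al., CVPR 2020) is shown below. This is an illustrative reference implementation of the attention mechanism itself; the exact placement inside the YOLOv8 backbone in EGM-YOLOv8 may differ.

```python
import math

import torch
import torch.nn as nn


class ECA(nn.Module):
    """Efficient Channel Attention (Wang et al., CVPR 2020).

    Applies a 1-D convolution over the globally pooled channel
    descriptor, with the kernel size adapted to the channel count,
    so cross-channel interaction costs almost no extra parameters.
    """

    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Adaptive kernel size k = |log2(C)/gamma + b/gamma|, forced odd.
        t = int(abs(math.log2(channels) / gamma + b / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (B, C, H, W) -> (B, C, 1, 1) channel descriptor
        y = self.pool(x)
        # Run the 1-D conv across the channel dimension: (B, 1, C)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        y = y.transpose(-1, -2).unsqueeze(-1)
        # Re-weight the input channels
        return x * self.sigmoid(y)
```

Because the attention weights multiply the input elementwise, the module preserves feature-map shape and can be dropped after any backbone stage.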
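The GELAN-style aggregation used in the neck can be sketched as follows. This is a simplified, hypothetical layout (branch count and stage depth are illustrative, not the paper's exact configuration): the input is split into two branches, one branch passes through a chain of cheap conv stages, and all intermediate outputs are concatenated before a 1x1 fusion conv.

```python
import torch
import torch.nn as nn


def conv_bn_act(c_in: int, c_out: int, k: int = 1) -> nn.Sequential:
    """Conv-BN-SiLU unit, the basic building block of YOLO-style necks."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(inplace=True),
    )


class GELANBlock(nn.Module):
    """Simplified GELAN-style block (illustrative sketch only).

    Split -> chain of lightweight stages -> concatenate every
    intermediate output -> 1x1 fuse. Aggregating all stage outputs
    is what lets the block keep accuracy with fewer parameters.
    """

    def __init__(self, c_in: int, c_out: int, c_hidden: int, n_stages: int = 2):
        super().__init__()
        self.split = conv_bn_act(c_in, 2 * c_hidden, 1)
        self.stages = nn.ModuleList(
            conv_bn_act(c_hidden, c_hidden, 3) for _ in range(n_stages)
        )
        # 2 split branches + one output per stage are concatenated.
        self.fuse = conv_bn_act((2 + n_stages) * c_hidden, c_out, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a, b = self.split(x).chunk(2, dim=1)
        outs = [a, b]
        for stage in self.stages:
            outs.append(stage(outs[-1]))
        return self.fuse(torch.cat(outs, dim=1))
```

In a PANet-style neck, blocks like this would replace the heavier C2f modules on the fusion paths, which is consistent with the parameter reduction reported in the ablation (43.61 M to 38.40 M).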
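The MPDIoU loss (Ma and Xu, 2023) can be written in a few lines: it is the plain IoU penalized by the squared distances between the matching top-left and bottom-right corners of the predicted and ground-truth boxes, normalized by the image diagonal. A dependency-free sketch, assuming corner-format boxes (x1, y1, x2, y2):

```python
def mpdiou(box_pred, box_gt, img_w, img_h):
    """MPDIoU = IoU - d1^2/(w^2+h^2) - d2^2/(w^2+h^2), where d1 and d2
    are the top-left and bottom-right corner distances and (w, h) is
    the input image size used for normalization."""
    x1, y1, x2, y2 = box_pred
    X1, Y1, X2, Y2 = box_gt
    # Intersection and union for the plain IoU term
    iw = max(0.0, min(x2, X2) - max(x1, X1))
    ih = max(0.0, min(y2, Y2) - max(y1, Y1))
    inter = iw * ih
    union = (x2 - x1) * (y2 - y1) + (X2 - X1) * (Y2 - Y1) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared corner distances, normalized by the squared image diagonal
    norm = img_w ** 2 + img_h ** 2
    d1 = ((x1 - X1) ** 2 + (y1 - Y1) ** 2) / norm
    d2 = ((x2 - X2) ** 2 + (y2 - Y2) ** 2) / norm
    return iou - d1 - d2


def mpdiou_loss(box_pred, box_gt, img_w, img_h):
    """Loss = 1 - MPDIoU; zero for a perfect match, above 1 when the
    boxes are disjoint and their corners are far apart."""
    return 1.0 - mpdiou(box_pred, box_gt, img_w, img_h)
```

Because the corner terms remain informative even when the boxes do not overlap (where plain IoU is flat at zero), the gradient keeps pointing toward alignment, which is the intuition behind the faster convergence claimed above.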
2. Related Work
2.1. Object Detection
2.2. The YOLOv8 Model
3. Proposed Method
3.1. Improvement of the Backbone Module
3.2. Improvement of the Neck Module
3.3. Loss Function
4. Experimental Design and Results Analysis
4.1. Dataset
4.2. Implementation Details
4.3. Evaluation Metrics
4.4. Experimental Results and Discussion
4.4.1. Ablation Study on Overall Architecture
4.4.2. Comparative Experiment on the Embedding of Different Attention Modules
4.4.3. Compared with Common Detection Methods
4.4.4. Comparisons with Domain-Specific Models
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Lehtola, V.; Montewka, J.; Goerlandt, F.; Guinness, R.; Lensu, M. Finding safe and efficient shipping routes in ice-covered waters: A framework and a model. Cold Reg. Sci. Technol. 2019, 165, 102795.1–102795.14.
- Namgung, H.; Kim, J.S. Collision risk inference system for maritime autonomous surface ships using COLREGs rules compliant collision avoidance. IEEE Access 2021, 9, 7823–7835.
- Vagale, A.; Oucheikh, R.; Bye, R.T.; Osen, O.L.; Fossen, T.I. Path planning and collision avoidance for autonomous surface vehicles I: A review. J. Mar. Sci. Technol. 2021, 26, 1292–1306.
- Qian, L.; Zheng, Y.; Li, L.; Ma, Y.; Zhou, C.; Zhang, D. A new method of inland water ship trajectory prediction based on long short-term memory network optimized by genetic algorithm. Appl. Sci. 2022, 12, 4073–4088.
- Namgung, H. Local route planning for collision avoidance of maritime autonomous surface ships in compliance with COLREGs rules. Sustainability 2021, 14, 198.
- Vagale, A.; Bye, R.T.; Oucheikh, R.; Osen, O.L.; Fossen, T.I. Path planning and collision avoidance for autonomous surface vehicles II: A comparative study of algorithms. J. Mar. Sci. Technol. 2021, 26, 1307–1323.
- Zwemer, M.H.; Wijnhoven, R.G.; de With, P.H.N. Ship detection in harbour surveillance based on large-scale data and CNNs. In Proceedings of the VISIGRAPP, Funchal-Madeira, Portugal, 27–29 January 2018; Volume 5, pp. 153–160.
- Hu, C.; Zhu, Z.; Yu, Z. Ship identification based on improved SSD. In Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering, Xiamen, China, 21–23 October 2022; pp. 476–482.
- Shao, Z.; Wang, L.; Wang, Z.; Du, W.; Wu, W. Saliency-aware convolution neural network for ship detection in surveillance video. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 781–794.
- Li, H.; Deng, L.; Yang, C.; Liu, J.; Gu, Z. Enhanced YOLO v3 tiny network for real-time ship detection from visual image. IEEE Access 2021, 9, 16692–16706.
- Zhou, S.; Yin, J. YOLO-Ship: A visible light ship detection method. In Proceedings of the 2022 2nd International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 14–16 January 2022; pp. 113–118.
- Chen, Z.; Liu, C.; Filaretov, V.F.; Yukhimets, D.A. Multi-scale ship detection algorithm based on YOLOv7 for complex scene SAR images. Remote Sens. 2023, 15, 2071.
- Corbane, C.; Najman, L.; Pecoul, E.; Demagistri, L.; Petit, M. A complete processing chain for ship detection using optical satellite imagery. Int. J. Remote Sens. 2010, 31, 5837–5854.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 142–158.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems 28, Montreal, QC, Canada, 7–12 December 2015; Volume 28, pp. 1–14.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Zhu, L.; Xie, Z.; Liu, L.; Tao, B.; Tao, W. IoU-uniform R-CNN: Breaking through the limitations of RPN. Pattern Recognit. 2021, 112, 107816.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988.
- Kim, J.H.; Kim, N.; Won, C.S. High-speed drone detection based on YOLO-V8. In Proceedings of the ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–2.
- Zhu, X.; Lyu, S.; Wang, X.; Zhao, Q. TPH-YOLOv5: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual Conference, 11–17 October 2021; pp. 2778–2788.
- Zhao, Q.; Liu, B.; Lyu, S.; Wang, C.; Zhang, H. TPH-YOLOv5++: Boosting object detection on drone-captured scenarios with cross-layer asymmetric transformer. Remote Sens. 2023, 15, 1687.
- Wang, F.; Wang, H.; Qin, Z.; Tang, J. UAV target detection algorithm based on improved YOLOv8. IEEE Access 2023, 11, 116534–116544.
- Shen, L.; Lang, B.; Song, Z. DS-YOLOv8-based object detection method for remote sensing images. IEEE Access 2023, 11, 125122–125137.
- Huang, Y.; Han, D.; Han, B.; Wu, Z. ADV-YOLO: Improved SAR ship detection model based on YOLOv8. J. Supercomput. 2025, 81, 34.
- Chen, Y.; Ren, J.; Li, J.; Shi, Y. Enhanced adaptive detection of nearby and distant ships in fog: A real-time multi-scale target detection strategy. Digit. Signal Process. 2024, 158, 104961.
- Li, Z.; Ma, H.; Guo, Z. MAEE-Net: SAR ship target detection network based on multi-input attention and edge feature enhancement. Digit. Signal Process. 2025, 156, 104810.
- Liu, D.; Zhang, Y.; Zhao, Y.; Shi, Z.; Zhang, J.; Zhang, Y.; Zhang, Y. AARN: Anchor-guided attention refinement network for inshore ship detection. IET Image Process. 2023, 17, 2225–2237.
- Zhou, W.; Peng, Y. Ship detection based on multi-scale weighted fusion. Displays 2023, 78, 102448.
- Liu, W.; Chen, Y. IL-YOLOv5: A ship detection method based on incremental learning. In Proceedings of the International Conference on Intelligent Computing, Chennai, India, 28–29 April 2023; pp. 588–600.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Lin, T.Y.; Dollar, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 8759–8768.
- Li, X.; Wang, W.; Wu, L.; Chen, S.; Hu, X.; Li, J.; Yang, J. Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection. Adv. Neural Inf. Process. Syst. 2020, 33, 21002–21012.
- Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and better learning for bounding box regression. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12993–13000.
- Zhang, T.; Qi, G.J.; Xiao, B.; Wang, J. Interleaved group convolutions. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4373–4382.
- Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 11534–11542.
- Wang, C.Y.; Liao, H.Y.M.; Wu, Y.H.; Chen, P.Y.; Hsieh, J.W.; Yeh, I.H. CSPNet: A new backbone that can enhance learning capability of CNN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020; pp. 390–391.
- Wang, C.Y.; Liao, H.Y.M.; Yeh, I.H. Designing network design strategies through gradient path analysis. arXiv 2022, arXiv:2211.04800.
- Lee, Y.; Hwang, J.W.; Lee, S.; Bae, Y.; Park, J. An energy and GPU-computation efficient backbone network for real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–20 June 2019; pp. 1–11.
- Ma, S.; Xu, Y. MPDIoU: A loss for efficient and accurate bounding box regression. arXiv 2023, arXiv:2307.07662.
- Shao, Z.; Wu, W.; Wang, Z.; Du, W.; Li, C. Seaships: A large-scale precisely annotated dataset for ship detection. IEEE Trans. Multimedia 2018, 20, 2593–2604.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual Conference, 19–25 June 2021; pp. 13713–13722.
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Chen, J. DETRs beat YOLOs on real-time object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 16965–16974.
- Xu, X.; Jiang, Y.; Chen, W.; Huang, Y.; Zhang, Y.; Sun, X. DAMO-YOLO: A report on real-time object detection design. arXiv 2022, arXiv:2211.15444.
- Chen, Y.; Yuan, X.; Wang, J.; Wu, R.; Li, X.; Hou, Q.; Cheng, M.M. YOLO-MS: Rethinking multi-scale representation learning for real-time object detection. IEEE Trans. Pattern Anal. Mach. Intell. 2025, 1–14.
- Zheng, Y.; Zhang, S. Mcships: A large-scale ship dataset for detection and fine-grained categorization in the wild. In Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME), London, UK, 6–10 July 2020; pp. 1–6.
- Wang, S.; Li, Y.; Qiao, S. ALF-YOLO: Enhanced YOLOv8 based on multiscale attention feature fusion for ship detection. Ocean Eng. 2024, 308, 118233.
Ablation study on the overall architecture (√ = module enabled, × = module disabled).

Model | ECA | GELAN | MPDIoU | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameters (M) | Inference Time (ms)
---|---|---|---|---|---|---|---|---|---
YOLOv8 | × | × | × | 0.978 | 0.972 | 0.987 | 0.844 | 43.61 | 15.2
YOLOv8+ECA | √ | × | × | 0.988 | 0.963 | 0.989 | 0.843 | 43.61 | 14.9
YOLOv8+GELAN | × | √ | × | 0.969 | 0.985 | 0.990 | 0.840 | 38.40 | 18.7
YOLOv8+MPDIoU | × | × | √ | 0.974 | 0.981 | 0.989 | 0.846 | 43.61 | 15.7
w/o ECA | × | √ | √ | 0.975 | 0.986 | 0.990 | 0.840 | 38.40 | 18.2
w/o GELAN | √ | × | √ | 0.971 | 0.981 | 0.990 | 0.843 | 43.61 | 17.7
w/o MPDIoU | √ | √ | × | 0.979 | 0.984 | 0.990 | 0.843 | 38.40 | 20.0
Ours | √ | √ | √ | 0.978 | 0.983 | 0.991 | 0.848 | 38.40 | 22.3
Comparison of different attention modules embedded in the C2f block.

Types of Attention | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameters (M) | Inference Time (ms)
---|---|---|---|---|---|---
C2f_SE | 0.986 | 0.978 | 0.991 | 0.842 | 38.49 | 20.5
C2f_CBAM | 0.976 | 0.981 | 0.989 | 0.843 | 38.60 | 28.1
C2f_CA | 0.985 | 0.981 | 0.992 | 0.842 | 38.49 | 21.1
Ours | 0.978 | 0.983 | 0.991 | 0.848 | 38.40 | 22.3
Comparison with common detection methods on the Seaships dataset.

Model | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameters (M) | Inference Time (ms)
---|---|---|---|---|---|---
Faster R-CNN | 0.718 | 0.972 | 0.962 | 0.601 | 51.75 | 70.0
YOLOv6 | 0.978 | 0.976 | 0.989 | 0.827 | 110.87 | 33.6
YOLOv7 | 0.980 | 0.980 | 0.993 | 0.816 | 36.51 | 12.3
YOLOv8 | 0.978 | 0.972 | 0.987 | 0.844 | 43.61 | 15.2
TPH-YOLOv5 | 0.967 | 0.969 | 0.986 | 0.781 | 45.40 | 33.2
TPH-YOLOv5++ | 0.977 | 0.967 | 0.987 | 0.801 | 41.52 | 19.2
DAMO-YOLO | 0.986 | 0.975 | 0.988 | 0.842 | 51.97 | 18.6
YOLO-MS | 0.983 | 0.969 | 0.989 | 0.831 | 50.36 | 15.6
RT-DETR | 0.966 | 0.964 | 0.988 | 0.797 | 31.99 | 36.4
Ours | 0.978 | 0.983 | 0.991 | 0.848 | 38.40 | 22.3
Comparison with common detection methods on the Mcships dataset.

Model | P | R | mAP@0.5 | mAP@0.5:0.95 | Parameters (M) | Inference Time (ms)
---|---|---|---|---|---|---
Faster R-CNN | 0.531 | 0.906 | 0.858 | 0.477 | 51.75 | 80.9
YOLOv6 | 0.909 | 0.848 | 0.919 | 0.668 | 110.87 | 16.5
YOLOv7 | 0.914 | 0.868 | 0.925 | 0.630 | 36.51 | 12.3
YOLOv8 | 0.928 | 0.866 | 0.933 | 0.687 | 43.61 | 20.4
TPH-YOLOv5 | 0.876 | 0.786 | 0.867 | 0.588 | 45.40 | 35.4
TPH-YOLOv5++ | 0.904 | 0.850 | 0.910 | 0.632 | 41.52 | 21.7
DAMO-YOLO | 0.929 | 0.833 | 0.908 | 0.666 | 51.97 | 23.2
YOLO-MS | 0.886 | 0.847 | 0.905 | 0.648 | 50.36 | 15.7
RT-DETR | 0.910 | 0.839 | 0.889 | 0.638 | 31.99 | 26.2
Ours | 0.932 | 0.866 | 0.934 | 0.690 | 38.40 | 20.3
Comparison with domain-specific ship detection models, with per-class results (OC: ore carrier; BCC: bulk cargo carrier; GCS: general cargo ship; CS: container ship; FB: fishing boat; PS: passenger ship).

Model | Metric | All | OC | BCC | GCS | CS | FB | PS | Parameters (M) | FPS
---|---|---|---|---|---|---|---|---|---|---
AARN | mAP@0.5 | 0.947 | 0.948 | 0.947 | 0.958 | 0.980 | 0.927 | 0.923 | 35.82 | 45
AARN | mAP@0.5:0.95 | 0.702 | 0.677 | 0.708 | 0.718 | 0.786 | 0.659 | 0.666 | |
YOLOv5ship | mAP@0.5 | 0.976 | 0.984 | 0.963 | 0.975 | 0.983 | 0.972 | 0.980 | 40.30 | 60
YOLOv5ship | mAP@0.5:0.95 | 0.710 | 0.644 | 0.678 | 0.741 | 0.794 | 0.656 | 0.744 | |
IL-YOLOv5 | mAP@0.5 | 0.989 | 0.990 | 0.992 | 0.987 | 0.981 | 0.992 | 0.991 | 29.80 | 94
IL-YOLOv5 | mAP@0.5:0.95 | 0.790 | 0.759 | 0.792 | 0.831 | 0.834 | 0.750 | 0.777 | |
ALF-YOLO | mAP@0.5 | 0.991 | 0.995 | 0.994 | 0.986 | 0.985 | 0.991 | 0.995 | 42.51 | 38
ALF-YOLO | mAP@0.5:0.95 | 0.850 | 0.850 | 0.859 | 0.870 | 0.866 | 0.796 | 0.857 | |
Ours | mAP@0.5 | 0.991 | 0.995 | 0.993 | 0.990 | 0.985 | 0.990 | 0.991 | 38.40 | 45
Ours | mAP@0.5:0.95 | 0.848 | 0.845 | 0.853 | 0.871 | 0.866 | 0.798 | 0.854 | |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, Y.; Wang, S. EGM-YOLOv8: A Lightweight Ship Detection Model with Efficient Feature Fusion and Attention Mechanisms. J. Mar. Sci. Eng. 2025, 13, 757. https://doi.org/10.3390/jmse13040757