Ship Plate Detection Algorithm Based on Improved RT-DETR
Abstract
1. Introduction
- Proposing an improved High-Frequency Enhancement Residual Block (HFERB) and fusing it into the RT-DETR backbone. The HFERB module uses a dual-branch structure that processes high-frequency information and local features separately and then fuses them. Deformable convolution is introduced into HFERB to adapt dynamically to the shape and position changes of ship plates caused by occlusion, so that partially visible plate information can still be captured. A spatially adaptive filter in the local branch tracks nonlinear deformations, so that text and numeric features on tilted or distorted plates remain clear and distinguishable.
- Introducing a Pinwheel-shaped Convolution (PConv) employing multi-directional convolution kernels to extract ship plate features from multiple angles, precisely capturing edge contours and local details such as characters and digits, enhancing small target detection accuracy.
- Employing Adaptive Sparse Self-Attention (ASSA) to improve the AIFI module. ASSA automatically selects important regions and filters noise, which improves computational efficiency and information utilization and strengthens the model’s ability to relate global and local ship plate features.
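The selection-and-filtering behavior attributed to ASSA can be illustrated with a small sketch. It assumes a formulation that this section does not spell out: a sparse branch using squared ReLU, a dense softmax branch, and two learnable scalars (hypothetical names `a_sparse` and `a_dense`) normalized into blending weights. It works on one query's raw similarity scores only and is not the authors' implementation.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def squared_relu(values):
    """Sparse branch: negative scores become exactly zero, positive ones are squared."""
    return [max(v, 0.0) ** 2 for v in values]


def adaptive_sparse_weights(scores, a_sparse=0.0, a_dense=0.0):
    """Blend sparse and dense attention weights for one query.

    a_sparse and a_dense stand in for learnable scalars (assumed names).
    """
    w_sparse, w_dense = softmax([a_sparse, a_dense])
    sparse = squared_relu(scores)
    dense = softmax(scores)
    return [w_sparse * s + w_dense * d for s, d in zip(sparse, dense)]


# One query attending to four keys; negative scores model noisy regions.
scores = [2.0, -1.0, 0.5, -3.0]
print(squared_relu(scores))  # [4.0, 0.0, 0.25, 0.0]
print([round(w, 3) for w in adaptive_sparse_weights(scores)])
```

With the sparse branch alone, the two negative-score keys receive zero weight. The dense branch keeps a small contribution from every key, and the blending weights control the balance between the two.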
2. Materials and Methods
2.1. RT-DETR
- Attention-based Intra-scale Feature Interaction (AIFI) module: This module performs intra-scale interactions on high-level features (S5) to capture relationships between conceptual entities in the image.
- CNN-based Cross-scale Feature Fusion Module (CCFM): This module fuses features across different scales to fully leverage multi-scale information.
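A quick calculation suggests why intra-scale attention is applied only to S5. The sketch below assumes a 640 × 640 input and backbone strides of 8, 16, and 32 for S3, S4, and S5; neither is stated in this section. It counts the tokens each feature map would contribute and the number of pairwise attention scores that self-attention would compute over them.

```python
def token_count(image_size, stride):
    """Number of spatial positions in a square feature map at the given stride."""
    side = image_size // stride
    return side * side


IMAGE_SIZE = 640  # assumed input resolution
STRIDES = {"S3": 8, "S4": 16, "S5": 32}  # assumed backbone strides

for name, stride in STRIDES.items():
    tokens = token_count(IMAGE_SIZE, stride)
    # Self-attention compares every token with every token: tokens ** 2 scores.
    print(f"{name}: {tokens} tokens, {tokens ** 2} attention scores")
```

Under these assumptions S5 has 400 tokens against 6400 for S3, so attention over S5 needs 256 times fewer pairwise scores, leaving the lower-level maps to the cheaper convolutional fusion in CCFM.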
2.2. RT-DETR-HPA
2.2.1. High-Frequency Enhancement Residual Block
2.2.2. Pinwheel-Shaped Convolution
2.2.3. Adaptive Sparse Self-Attention
3. Results
3.1. Dataset Construction and Preprocessing
3.1.1. Data Acquisition Environment and Equipment Configuration
3.1.2. Data Collection Scheme Design
3.1.3. Data Statistics and Sample Distribution
3.1.4. Data Cleaning and Storage
3.1.5. Data Augmentation Strategy
3.1.6. Dataset Partitioning Strategy
3.2. Experimental Platform and Environment Configuration
3.3. Evaluation Metrics
3.4. Training Process
4. Discussion
4.1. Ablation Study Analysis
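- Each proposed module improves ship plate detection over the baseline: added individually, HFERB, PConv, and ASSA raise mAP50 from 93.76 to 94.53, 94.37, and 94.65, respectively.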
- HFERB and PConv exhibit positive synergy at the feature extraction level, while ASSA enhances feature discriminability through spatial attention mechanisms.
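- Combining all three modules gives the highest accuracy (mAP50 of 97.12, against 93.76 for the baseline) at 40.1 FPS, above the baseline's 37.5 FPS. This is a favorable accuracy-speed trade-off and a reliable basis for practical engineering applications.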
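The synergy claim can be checked against the mAP50 column of the ablation table. The sketch below compares the gain of each two-module combination with the sum of the gains of its two modules used alone. The helper name `gain` is illustrative, and the values are copied from the table.

```python
BASELINE = 93.76  # mAP50 of plain RT-DETR

# mAP50 values copied from the ablation table.
MAP50 = {
    ("HFERB",): 94.53,
    ("PConv",): 94.37,
    ("ASSA",): 94.65,
    ("HFERB", "PConv"): 96.41,
    ("HFERB", "ASSA"): 96.28,
    ("PConv", "ASSA"): 96.60,
    ("HFERB", "PConv", "ASSA"): 97.12,
}


def gain(modules):
    """mAP50 improvement over the baseline, in percentage points."""
    return round(MAP50[tuple(modules)] - BASELINE, 2)


for combo in [("HFERB", "PConv"), ("HFERB", "ASSA"), ("PConv", "ASSA")]:
    separate = round(sum(gain((m,)) for m in combo), 2)
    print(combo, "combined:", gain(combo), "sum of single gains:", separate)
```

Each pair gains more than the sum of its parts (for example, 2.65 against 1.38 points for HFERB + PConv), which is consistent with the synergy described above. The table reports no variance, so the comparison is indicative rather than a statistical test.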
4.2. Comparative Experimental Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Hardware | Software |
---|---|
GPU: RTX 4090 (24 GB × 4) | OS: Ubuntu 20.04 LTS |
CPU: Intel Xeon 8375C | Framework: PyTorch 2.0 |
Memory: 64 GB | CUDA Version: 11.8 |
Storage: 16 TB SSD | Libraries: NumPy 1.26.4, Matplotlib 3.9.4 |
Methods | Precision | Recall | mAP50 | mAP50–90 | FPS |
---|---|---|---|---|---|
RT-DETR | 92.44 | 91.65 | 93.76 | 59.18 | 37.5 |
RT-DETR + HFERB | 93.25 | 92.70 | 94.53 | 60.39 | 36.2 |
RT-DETR + PConv | 93.56 | 92.48 | 94.37 | 60.12 | 38.3 |
RT-DETR + ASSA | 93.89 | 92.35 | 94.65 | 60.41 | 42.3 |
RT-DETR + HFERB + PConv | 94.65 | 94.48 | 96.41 | 61.72 | 38.7 |
RT-DETR + HFERB + ASSA | 95.34 | 94.23 | 96.28 | 61.78 | 40.7 |
RT-DETR + PConv + ASSA | 95.84 | 94.45 | 96.60 | 61.35 | 40.9 |
RT-DETR + HFERB + PConv + ASSA | 96.26 | 94.88 | 97.12 | 61.90 | 40.1 |
Methods | Precision | Recall | mAP50 | mAP50–90 | FPS |
---|---|---|---|---|---|
Faster R-CNN | 82.35 | 78.68 | 52.41 | 39.86 | 20.5 |
YOLOv3m | 88.90 | 81.12 | 65.34 | 49.26 | 28.6 |
YOLOv5m | 91.80 | 84.25 | 82.75 | 56.83 | 39.2 |
YOLOv8m | 93.65 | 86.38 | 88.23 | 58.74 | 44.1 |
YOLOv12m | 95.28 | 89.41 | 91.65 | 60.03 | 52.6 |
RT-DETR | 92.44 | 91.65 | 93.76 | 59.18 | 37.5 |
RT-DETR-HPA | 96.26 | 94.88 | 97.12 | 61.90 | 40.1 |
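Because the comparison table reports both accuracy and speed, one way to read it is to ask which detectors are not beaten on both mAP50 and FPS by some other detector. The sketch below applies that Pareto-style check to values copied from the table. The function name `dominated` is illustrative.

```python
# (mAP50, FPS) pairs copied from the comparison table.
RESULTS = {
    "Faster R-CNN": (52.41, 20.5),
    "YOLOv3m": (65.34, 28.6),
    "YOLOv5m": (82.75, 39.2),
    "YOLOv8m": (88.23, 44.1),
    "YOLOv12m": (91.65, 52.6),
    "RT-DETR": (93.76, 37.5),
    "RT-DETR-HPA": (97.12, 40.1),
}


def dominated(name):
    """True if another method is at least as good on both metrics and better on one."""
    acc, fps = RESULTS[name]
    return any(
        other != name and o_acc >= acc and o_fps >= fps and (o_acc > acc or o_fps > fps)
        for other, (o_acc, o_fps) in RESULTS.items()
    )


front = [name for name in RESULTS if not dominated(name)]
print(front)  # ['YOLOv12m', 'RT-DETR-HPA']
```

Only YOLOv12m (the fastest) and RT-DETR-HPA (the most accurate) remain. Relative to YOLOv12m, RT-DETR-HPA gives up 12.5 FPS for 5.47 points of mAP50.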
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, L.; Huang, L. Ship Plate Detection Algorithm Based on Improved RT-DETR. J. Mar. Sci. Eng. 2025, 13, 1277. https://doi.org/10.3390/jmse13071277