Context-Guided SAR Ship Detection with Prototype-Based Model Pretraining and Check–Balance-Based Decision Fusion
Abstract
1. Introduction
- Significant ship scale diversity, as shown in Figure 1. Ship targets in SAR images exhibit huge size variations: small fishing boats spanning only dozens of pixels coexist with large freighters and oil tankers spanning hundreds of pixels. This intra-class size variation stems from differences in imaging distance, ship tonnage, and orientation, making it difficult for a single model to accurately detect targets of all scales at once; small targets are easily missed, and large targets are easily mislocalized.
- Complex interference in inshore scenarios, as shown in Figure 2. Inshore areas such as harbours and islands contain many strong-scattering man-made targets (e.g., cranes, oil storage tanks, breakwaters) whose scattering characteristics resemble those of ships, together with land background clutter. These cause a surge in the false-alarm rate, especially along densely moored shorelines where small ships lie close to land.
- A training strategy for SAR ship detection that incorporates harbour-layout and ship-prototype prior knowledge is proposed. The harbour layout and ship prototypes are taken as key prior knowledge and integrated into the training of the deep neural network, which significantly improves the model’s discriminative ability in the presence of ship size diversity and complex inshore backgrounds.
- An adaptive decision-level fusion strategy based on dynamic confidence threshold selection is proposed. A novel weighted fusion mechanism mimicking the “president–senate” check–balance (PSCB) principle is introduced, in which a dominant leading model (the president) and a committee of expert models (the senate) are designated according to their performance during model pretraining. Experimental results show that, with the proposed strategy, the mAP can be improved by up to 3.8% over the dominant model alone when the ratio of training data is 50%.
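The president–senate idea above can be sketched in a few lines of code. This is a minimal illustration, not the authors’ exact implementation: the fusion weight `w_pres`, the IoU agreement threshold, and the score-blending rule are assumptions made for illustration, and the paper’s dynamic size-dependent confidence thresholds and Soft-NMS grouping are omitted.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def fuse(president, committee, w_pres=0.6, iou_thr=0.5):
    """Check-balance fusion sketch: the president's detections are kept and
    their scores blended with agreeing committee scores; committee-only
    detections are admitted with a down-weighted score (the 'check').
    Detections are (box, score) pairs."""
    w_com = 1.0 - w_pres
    fused = []
    for box_p, s_p in president:
        # committee scores that agree with this president box
        agree = [s_c for box_c, s_c in committee if iou(box_p, box_c) >= iou_thr]
        if agree:
            s_p = w_pres * s_p + w_com * sum(agree) / len(agree)
        fused.append((box_p, s_p))
    for box_c, s_c in committee:
        # boxes no president detection supports enter with reduced weight
        if all(iou(box_c, box_p) < iou_thr for box_p, _ in president):
            fused.append((box_c, w_com * s_c))
    return fused
```

In the paper, the president and committee roles (and hence the weighting) are assigned from each model’s pretraining performance, and overlapping boxes in dense ship groups are merged with Soft-NMS rather than simply concatenated as here.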
2. Related Work
2.1. SAR Ship Detection Based on FCOS
2.2. SAR Ship Detection Based on ATSS
2.3. TOOD, PP-YOLOE and PP-YOLOE+
3. Methods
3.1. Pre-Processing of the FAIR-CSAR Dataset
3.2. Pretraining Based on Prior Ship Prototype Knowledge
3.3. Harbour Area Mask Processing
3.4. Adaptive Decision-Level Fusion Based on Dynamic Confidence Threshold Selection
3.4.1. Dynamic Confidence Threshold Assignment Strategy Based on the Sizes of Targets
3.4.2. Weighted Fusion Mechanism Based on President–Senate Check–Balance
3.4.3. Soft-NMS-Based Dense Group Target Bounding Box Fusion
4. Experiment
4.1. Dataset
4.2. Evaluation Metrics
4.3. Experimental Settings
4.4. Comparative Experiments
4.4.1. Effectiveness of the Delimitation Masks
4.4.2. Effectiveness of the FUSAR-Based Pretraining
4.4.3. Effectiveness of the Adaptive Decision-Level Fusion Framework
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A tutorial on synthetic aperture radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43.
- Achim, A.; Kuruoglu, E.E.; Zerubia, J. SAR image filtering based on the heavy-tailed Rayleigh model. IEEE Trans. Image Process. 2006, 15, 2686–2693.
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M. ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
- Geng, Z.; Zhang, S.; Xu, C.; Zhou, H.; Li, W.; Yu, X.; Zhu, D.; Zhang, G. Context-driven automatic target detection with cross-modality real-synthetic image merging. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 5600–5618.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Zhou, X.; Zhuo, J.; Krahenbuhl, P. Bottom-up object detection by grouping extreme and center points. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 850–859.
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint triplets for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6569–6578.
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully convolutional one-stage object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9627–9636.
- Kong, T.; Sun, F.; Liu, H.; Jiang, Y.; Li, L.; Shi, J. FoveaBox: Beyond anchor-based object detection. IEEE Trans. Image Process. 2020, 29, 7389–7398.
- Cui, Z.; Li, Q.; Cao, Z.; Liu, N. Dense attention pyramid networks for multi-scale ship detection in SAR images. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8983–8997.
- Deng, Z.; Sun, H.; Zhou, S.; Zhao, J.; Lei, L.; Zou, H. Multi-scale object detection in remote sensing imagery with convolutional neural networks. ISPRS J. Photogramm. Remote Sens. 2018, 145, 3–22.
- Fu, J.; Sun, X.; Wang, Z.; Fu, K. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 1331–1344.
- Sun, Z.; Dai, M.; Leng, X.; Lei, Y.; Xiong, B.; Ji, K.; Kuang, G. An anchor-free detection method for ship targets in high-resolution SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 7799–7816.
- Zhao, X.; Zhang, B.; Tian, Z.; Xu, C.; Wu, F.; Sun, C. An anchor-free method for arbitrary-oriented ship detection in SAR images. In Proceedings of the 2021 SAR in Big Data Era (BIGSARDATA), Nanjing, China, 22–24 September 2021; pp. 1–4.
- Zhou, L.; Yu, H.; Wang, Y.; Xu, S.; Gong, S.; Xing, M. LASDNet: A lightweight anchor-free ship detection network for SAR images. In Proceedings of the IGARSS 2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 2630–2633.
- Yang, S.; An, W.; Li, S.; Wei, G.; Zou, B. An improved FCOS method for ship detection in SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 8910–8927.
- Zhang, D.; Wang, C.; Fu, Q. OFCOS: An oriented anchor-free detector for ship detection in remote sensing images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 6004005.
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 9759–9768.
- Zhang, Q.; Han, Z. DRSNet: Rotated-ROI ship segmentation for SAR images based on dual-scale cross attention. IEEE Geosci. Remote Sens. Lett. 2024, 21, 4012805.
- Feng, C.; Zhong, Y.; Gao, Y.; Scott, M.R.; Huang, W. TOOD: Task-aligned one-stage object detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 3490–3499.
- Xu, S.; Wang, X.; Lv, W.; Chang, Q.; Cui, C.; Deng, K.; Wang, G.; Dang, Q.; Wei, S.; Du, Y. PP-YOLOE: An evolved version of YOLO. arXiv 2022, arXiv:2203.16250.
- PaddlePaddle Authors. PaddleDetection, Object Detection and Instance Segmentation Toolkit Based on PaddlePaddle. 2022. Available online: https://github.com/PaddlePaddle/PaddleDetection (accessed on 6 June 2025).
- Wang, X.; Wang, G.; Dang, Q.; Liu, Y.; Hu, X.; Yu, D. PP-YOLOE-R: An efficient anchor-free rotated object detector. arXiv 2022, arXiv:2211.02386.
- Wu, Y.; Suo, Y.; Meng, Q.; Dai, W.; Miao, T.; Zhao, W.; Yan, Z.; Diao, W.; Xie, G.; Ke, Q.; et al. FAIR-CSAR: A benchmark dataset for fine-grained object detection and recognition based on single look complex SAR images. IEEE Trans. Geosci. Remote Sens. 2024, 63, 5201022.
- Hou, X.; Ao, W.; Song, Q.; Lai, J.; Wang, H.; Xu, F. FUSAR-Ship: Building a high-resolution SAR-AIS matchup dataset of Gaofen-3 for ship detection and recognition. Sci. China Inf. Sci. 2020, 63, 140303.
- Rosenfeld, A.; Thurston, M. Edge and curve detection for visual scene analysis. IEEE Trans. Comput. 1971, C-20, 562–569.
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L.S. Soft-NMS: Improving object detection with one line of code. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5562–5570.
- Li, J.; Qu, C.; Shao, J. Ship detection in SAR images based on an improved faster R-CNN. In Proceedings of the 2017 SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA), Beijing, China, 13–14 November 2017; pp. 1–6.
- Lei, S.; Lu, D.; Qiu, X.; Ding, C. SRSDD-v1.0: A high-resolution SAR rotation ship detection dataset. Remote Sens. 2021, 13, 5104.
- Xian, S.; Zhirui, W.; Yuanrui, S.; Wenhui, D.; Yue, Z.; Kun, F. AIR-SARShip-1.0: High-resolution SAR ship detection dataset. Radars 2019, 8, 852–863.
- Wei, S.; Zeng, X.; Qu, Q.; Wang, M.; Su, H.; Shi, J. HRSID: A high-resolution SAR images dataset for ship detection and instance segmentation. IEEE Access 2020, 8, 120234–120254.
- Li, Y.; Li, X.; Li, W.; Hou, Q.; Liu, L.; Cheng, M.-M.; Yang, J. SARDet-100K: Towards open-source benchmark and toolkit for large-scale SAR object detection. In Proceedings of the 38th Conference on Neural Information Processing Systems (NeurIPS 2024), Vancouver, BC, Canada, 9–15 December 2024; pp. 1–25.
| Name | SAR Source | Images | Instances | Resolution | Image Size (pixels) |
|---|---|---|---|---|---|
| SSDD/SSDD+ | RadarSat-2, TerraSAR, Sentinel-1 | 1160 | 2456 | 1 m–15 m | 190–668 |
| SAR-Ship-Dataset | Gaofen-3, Sentinel-1 | 43,918 | 59,535 | 3 m–25 m | 256 × 256 |
| HRSID | Sentinel-1 | 5604 | 16,951 | 0.5 m, 1 m, 3 m | 800 × 800 |
| SRSDD-SAR | Gaofen-3 | 666 | 2275 | 1 m | 1024 × 1024 |
| RSDD-SAR | Gaofen-3, TerraSAR | 7000 | 10,263 | 2 m–20 m | 512 × 512 |
| FUSAR-Ship | Gaofen-3 | 126 | 6252 | 1.124 m × 1.728 m | 512 × 512 |
| SARDet-100K | Gaofen-3, Sentinel-1, TerraSAR, TanDEM-X, HISEA-1, RadarSat-2 | 116,598 | 245,653 | 0.5 m–25 m | 512 × 512 |
| FAIR-CSAR | Gaofen-3 | 29,948 | 340,000 | 1 m, 5 m | 1024 × 1024 |
| Scene | AP@0.25 | F1@0.25 | Precision | Recall | AP@0.25 by Image | AP (XS, <15 m) | AP (S, 15–45 m) | AP (M, 45–85 m) | AP (L, 85–200 m) | AP (XL, >200 m) |
|---|---|---|---|---|---|---|---|---|---|---|
| Total | 0.76 ± 0.02 | 0.74 ± 0.02 | 0.71 | 0.77 | 0.79 ± 0.02 | 0.69 | 0.76 | 0.89 | 0.91 | 0.91 |
| Inshore | 0.74 ± 0.02 | 0.73 ± 0.02 | 0.69 | 0.77 | 0.75 ± 0.02 | 0.69 | 0.76 | 0.89 | 0.90 | 0.89 |
| Offshore | 0.90 ± 0.01 | 0.86 ± 0.01 | 0.84 | 0.87 | 0.92 ± 0.01 | 0.26 | 0.70 | 0.95 | 0.95 | 0.95 |
| Model | Toolbox | Epochs | Initial LR | Scheduler | Tr (IoU) | Batch Size | Augmentation |
|---|---|---|---|---|---|---|---|
| PP-YOLOE+ | PaddleDetection | 80 | 1.25 × 10⁻⁴ | DEF | 0.7 | 8 | Norm. |
| PP-YOLOE | PaddleDetection | 240 | 1.25 × 10⁻³ | DEF | 0.7 | 8 | Norm. |
| TOOD | PaddleDetection | 100 | 1.25 × 10⁻³ | DEF | 0.6 | 4 | Norm. |
| FCOS | MMDetection | 100 | 5 × 10⁻⁴ | CosineAnnealingLR | 0.5 | 2 | Norm. + RandRot. + Albument. |
| ATSS | MMDetection | 180 | 1.5 × 10⁻³ | CosineAnnealingLR | 0.6 | 2 | Norm. + RandRot. + Albument. |
| Metric | PP-YOLOE+ | PP-YOLOE | FCOS | ATSS | TOOD |
|---|---|---|---|---|---|
| AP@0.25 (%) | 95.7 | 94.8 | 93.9 | 89.2 | 94.1 |
| AP@0.5 (%) | 94.3 | 92.6 | 90.4 | 85.6 | 91.9 |
Column headers give the metric, the ratio of training data (70% or 50%), and MaxDets (MD50 or MD100); AR denotes AR@0.25–0.75.

| Model | Setting | AP@0.25 (70%, MD50) | AP@0.25 (70%, MD100) | AP@0.25 (50%, MD50) | AP@0.25 (50%, MD100) | AR (70%, MD50) | AR (70%, MD100) | AR (50%, MD50) | AR (50%, MD100) |
|---|---|---|---|---|---|---|---|---|---|
| PP-YOLOE+ | original | 94.0 | 96.2 | 91.6 | 93.6 | 93.1 | 95.5 | 91.5 | 93.6 |
| | FUSAR_pretrained | 94.6 (+0.6) | 96.7 (+0.5) | 92.5 (+0.9) | 93.9 (+0.3) | 93.2 (+0.1) | 95.4 (−0.1) | 91.1 (−0.4) | 93.5 (−0.1) |
| | FUSAR + mask | 94.5 (+0.5) | 97.0 (+0.8) | 93.1 (+1.5) | 94.0 (+0.4) | 93.3 (+0.2) | 96.0 (+0.5) | 91.5 | 93.0 (−0.6) |
| PP-YOLOE | original | 89.7 | 92.0 | 85.5 | 90.4 | 89.7 | 94.8 | 86.9 | 92.6 |
| | FUSAR + mask | 93.4 (+3.7) | 95.7 (+3.7) | 89.3 (+3.8) | 93.3 (+2.9) | 92.3 (+2.6) | 95.5 (+0.7) | 89.4 (+2.5) | 93.9 (+1.3) |
| ATSS | original | 83.6 | 86.0 | 77.7 | 80.6 | 84.1 | 88.2 | 78.2 | 82.0 |
| | FUSAR + mask | 86.4 (+2.8) | 88.7 (+2.7) | 81.1 (+3.4) | 82.7 (+2.1) | 85.8 (+1.7) | 89.4 (+1.2) | 81.4 (+3.2) | 83.4 (+1.4) |
| FCOS | original | 85.5 | 87.5 | 81.7 | 82.9 | 81.8 | 84.4 | 76.8 | 79.2 |
| | FUSAR + mask | 86.5 (+1.0) | 88.4 (+0.9) | 82.2 (+0.5) | 84.3 (+1.4) | 83.0 (+1.2) | 84.9 (+0.5) | 78.3 (+1.5) | 80.0 (+0.8) |
| TOOD | original | 83.6 | 85.6 | 79.3 | 81.9 | 79.2 | 83.0 | 75.8 | 80.0 |
| | FUSAR + mask | 87.7 (+4.1) | 89.4 (+3.8) | 85.7 (+6.4) | 87.3 (+5.4) | 82.8 (+3.6) | 85.2 (+2.2) | 81.3 (+5.5) | 83.8 (+3.8) |
Model | FLOPs |
---|---|
PP-YOLOE+ | 0.126 T |
PP-YOLOE | 0.125 T |
ATSS | 0.179 T |
FCOS | 0.174 T |
TOOD | 0.403 T |
| President Model | Committee Models | MaxDets | Training Data Ratio | AP@0.25 (%) | AP@0.5 (%) |
|---|---|---|---|---|---|
| PP-YOLOE+ | FCOS, ATSS, TOOD | 100 | 70% | 97.2 (+0.2) | 95.0 |
| | | 100 | 50% | 94.1 (+0.1) | 90.5 (−0.1) |
| | | 50 | 70% | 94.6 (+0.1) | 92.1 (−0.3) |
| | | 50 | 50% | 93.2 (+0.1) | 89.3 (−0.5) |
| PP-YOLOE | FCOS, TOOD | 100 | 70% | 96.2 (+0.5) | 93.2 (+0.2) |
| | | 100 | 50% | 94.2 (+0.9) | 90.1 (+0.4) |
| | | 50 | 70% | 94.0 (+0.6) | 90.5 (+0.1) |
| | | 50 | 50% | 90.8 (+1.5) | 85.9 (+1.2) |
| TOOD | FCOS, ATSS | 100 | 70% | 90.4 (+1.0) | 82.9 (+1.3) |
| | | 100 | 50% | 88.0 (+0.7) | 78.2 (+0.4) |
| | | 50 | 70% | 90.7 (+3.0) | 84.0 (+3.8) |
| | | 50 | 50% | 88.2 (+2.5) | 78.5 (+2.6) |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhou, H.; Geng, Z.; Sun, M.; Wu, L.; Yan, H. Context-Guided SAR Ship Detection with Prototype-Based Model Pretraining and Check–Balance-Based Decision Fusion. Sensors 2025, 25, 4938. https://doi.org/10.3390/s25164938