RSDB-Net: A Novel Rotation-Sensitive Dual-Branch Network with Enhanced Local Features for Remote Sensing Ship Detection
Highlights
- We propose RSDB-Net, a novel network for rotation-aware ship detection in remote sensing.
- We designed STCBackbone that uniquely fuses Swin Transformer and CNN via FCCM.
- We developed RCFHead for cross-branch feature fusion to boost orientation robustness.
- The eFPN employs learnable transposed convolutions for adaptive multi-scale feature fusion.
- RSDB-Net achieves 89.13% AP-ship on DOTA-v1.0 and 90.10% AP on HRSC2016.
Abstract
1. Introduction
- We present a tightly coupled dual-branch backbone, the Swin Transformer–CNN Backbone (STCBackbone), which integrates ST and CNN branches. Through residual feature injection and the proposed FCCM, the model effectively aligns and enhances local and global features, enabling robust multi-scale representation of ships in complex remote sensing scenes.
- We propose a rotation-sensitive RCFHead that incorporates cross-branch feature interaction and weight sharing between the classification and regression tasks, improving detection accuracy and orientation robustness.
- We develop an eFPN equipped with learnable transposed convolutions for adaptive upsampling. This design helps preserve fine-grained spatial details while promoting consistent and efficient multi-scale feature fusion.
- Extensive experiments demonstrate that RSDB-Net consistently outperforms existing methods in capturing detailed spatial structures and in detection performance under complex conditions.
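The adaptive upsampling in eFPN is built on transposed convolutions with learnable kernels. As a minimal single-channel illustration in plain NumPy (not the authors' implementation; the kernel `w` stands in for the learned weights), a stride-2 transposed convolution scatters each input value, scaled by the kernel, onto a stride-spaced output grid:

```python
import numpy as np

def transposed_conv2d(x, w, stride=2):
    """Minimal stride-s transposed convolution for a single channel.

    x: (H, W) input feature map; w: (k, k) kernel (learnable in eFPN).
    Each input value is scattered onto the output grid, scaled by w.
    """
    H, W = x.shape
    k = w.shape[0]
    out = np.zeros((stride * (H - 1) + k, stride * (W - 1) + k))
    for i in range(H):
        for j in range(W):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * w
    return out

# A 2x2 map with a 2x2 kernel and stride 2 doubles the resolution.
x = np.arange(4.0).reshape(2, 2)      # [[0, 1], [2, 3]]
w = np.ones((2, 2))                   # fixed here purely for illustration
y = transposed_conv2d(x, w)           # shape (4, 4)
```

Because the kernel is trained end to end, this upsampling can adapt to the feature statistics, unlike fixed bilinear or nearest-neighbor interpolation.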
2. Related Work
2.1. Traditional Machine Learning Approaches
2.2. CNN-Based Detectors
2.3. Transformer-Based Ship Detection
3. Materials and Methods
3.1. Feature Extraction Backbone: STCBackbone
| Algorithm 1: FCCM |
|---|
| Input: CNN features and ST token sequence for Unit i. Output: Fused token sequence for Unit i. |
| Step 1: Channel Alignment. 1. For each Unit i = 1, 2, 3, 4: 2. Project the CNN features via a 1 × 1 convolution to match the Swin dimensionality. |
| Step 2: Reshape CNN Features into a Token Sequence. 3. Flatten the spatial dimensions and permute to reorder (B, C, H, W) → (B, H, W, C) for tokenization. Comment: this operation ensures that the spatial tokens from the CNN branch share the same ordering and dimension layout as the Swin tokens, facilitating token-wise fusion. |
| Step 3: Additive Coupling. 4. Retrieve the ST token sequence. 5. Fuse the aligned CNN and Swin features by element-wise addition. |
| Step 4: Forward Propagation. 6. Pass the fused tokens to the next ST Unit. |
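The four FCCM steps can be sketched at the shape level in NumPy. This is a sketch under stated assumptions, not the authors' code: the dimensions are illustrative, the learned 1 × 1 projection is replaced by a random matrix, and a 1 × 1 convolution is written as a matmul over the channel axis:

```python
import numpy as np

rng = np.random.default_rng(0)
B, C_cnn, H, W, C_swin = 2, 256, 8, 8, 96

# CNN branch output (B, C, H, W) and Swin token sequence (B, H*W, C_swin)
f_cnn = rng.standard_normal((B, C_cnn, H, W))
f_swin = rng.standard_normal((B, H * W, C_swin))

# Step 1: channel alignment -- a 1x1 convolution is a per-pixel linear
# map, so it reduces to a matmul over the channel axis.
W_proj = rng.standard_normal((C_cnn, C_swin)) / np.sqrt(C_cnn)
f_proj = np.einsum('bchw,cd->bdhw', f_cnn, W_proj)    # (B, C_swin, H, W)

# Step 2: reshape into a token sequence: (B, C, H, W) -> (B, H*W, C)
tokens_cnn = f_proj.transpose(0, 2, 3, 1).reshape(B, H * W, C_swin)

# Step 3: additive coupling with the Swin tokens of the same unit
fused = tokens_cnn + f_swin                           # (B, H*W, C_swin)

# Step 4: `fused` would be forwarded to the next ST unit
```

The key point is that after the projection and permutation, the CNN features occupy exactly the same token layout as the Swin sequence, so the coupling is a plain element-wise addition.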
3.2. Feature Fusion Neck: eFPN
3.3. Detection Head: RCFHead
4. Experiments
4.1. Datasets and Metrics
4.2. Ablation Experiments
4.2.1. Effective Backbone Experiments
4.2.2. Neck Experiments
4.2.3. Effective Head Experiments
4.2.4. Experiments on DOTA-v1.0 and HRSC2016
4.3. RSDB-Net Visual Effects
4.4. Comparison with State-of-the-Art Networks
5. Conclusions
| Algorithm 2: RSDB-Net |
| Input: Preprocessed training data I and corresponding labels t. Output: Trained object detection network and detection results. |
| Step 1: Feature Extraction. 1. For each training iteration t = 1, 2, …, T: 2. For each Unit i = 1, 2, …, 4: 3. Extract features MC using the CNN branch in Unit i. 4. Extract features MS using the ST branch in Unit i − 1. 5. If the features are from the CNN branch: 6. Fuse MC with MS using the FCCM to obtain the fused features. 7. Pass the fused features to the next unit. |
| Step 2: Feature Fusion. 8. Propagate features through the feature pyramid module. 9. Apply transposed convolution to perform upsampling and enhance the feature maps. 10. Construct the feature pyramid P = {P1, P2, P3, P4, P5}. 11. For each feature map Pi ∈ P, predict the detection results r. |
| Step 3: Post-Processing Detection Results. 12. Compute the overlap between the predicted boxes pb and all ground-truth boxes. 13. If max_overlap > 0.5, then: 14. Assign the corresponding ground truth as the object and label the predicted box as a positive example. 15. Return the final detection results. |
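The positive-example assignment in Step 3 is the standard max-overlap rule. The following minimal sketch uses axis-aligned IoU purely for illustration (the paper matches rotated boxes, whose IoU is more involved) to show the max_overlap > 0.5 criterion:

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes (x1, y1, x2, y2).

    Axis-aligned IoU is a simplification: RSDB-Net assigns labels
    using overlaps between *rotated* boxes.
    """
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def assign_positives(pred_boxes, gt_boxes, thr=0.5):
    """Label a prediction positive iff its best ground-truth overlap > thr."""
    labels = []
    for pb in pred_boxes:
        overlaps = [iou(pb, gt) for gt in gt_boxes]
        best = int(np.argmax(overlaps))
        labels.append(best if overlaps[best] > thr else -1)  # -1: negative
    return labels

preds = [(0, 0, 10, 10), (20, 20, 30, 30)]
gts = [(1, 1, 10, 10), (100, 100, 110, 110)]
print(assign_positives(preds, gts))  # [0, -1]
```

The first prediction overlaps its ground truth with IoU 0.81, so it is assigned to that object; the second has no overlap above the threshold and is labeled negative.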
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Ding, J.; Xue, N.; Long, Y.; Xia, G.S.; Lu, Q. Learning RoI Transformer for Oriented Object Detection in Aerial Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2849–2858. [Google Scholar]
- Lin, J.; Yang, X.; Xiao, S.; Yu, Y.; Jia, C. A Line Segment-Based Inshore Ship Detection Method. In Future Control and Automation, Proceedings of the 2nd International Conference on Future Control and Automation (ICFCA 2012), Changsha, China, 1–2 July 2012; Springer: Berlin/Heidelberg, Germany, 2012; Volume 2, pp. 261–269. [Google Scholar] [CrossRef]
- Yang, G.; Li, B.; Ji, S.; Gao, F.; Xu, Q. Ship detection from optical satellite images based on sea surface analysis. IEEE Geosci. Remote Sens. Lett. 2013, 11, 641–645. [Google Scholar] [CrossRef]
- Yu, D.; Guo, H.; Zhao, C.; Liu, X.; Xu, Q.; Lin, Y.; Ding, L. An Anchor-Free and Angle-Free Detector for Oriented Object Detection Using Bounding Box Projection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5618517. [Google Scholar] [CrossRef]
- Zhou, K.; Zhang, M.; Dong, Y.; Tan, J.; Zhao, S.; Wang, H. Vector Decomposition-Based Arbitrary-Oriented Object Detection for Optical Remote Sensing Images. Remote Sens. 2023, 15, 4738. [Google Scholar] [CrossRef]
- Xie, X.; You, Z.H.; Chen, S.B.; Huang, L.L.; Tang, J.; Luo, B. Feature Enhancement and Alignment for Oriented Object Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 778–787. [Google Scholar] [CrossRef]
- Yan, Z.; Li, Z.; Xie, Y.; Li, C.; Li, S.; Sun, F. ReBiDet: An Enhanced Ship Detection Model Utilizing ReDet and Bi-Directional Feature Fusion. Appl. Sci. 2023, 13, 7080. [Google Scholar] [CrossRef]
- Gao, F.; Cai, C.; Tang, W.; Tian, Y.; Huang, K. RA2DC-Net: A Residual Augment-Convolutions and Adaptive Deformable Convolution for Points-Based Anchor-Free Orientation Detection Network in Remote Sensing Images. Expert Syst. Appl. 2024, 238, 122299. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022. [Google Scholar]
- Wang, Y.; Li, X. Ship-DETR: A Transformer-Based Model for Efficient Ship Detection in Complex Maritime Environments. IEEE Access 2025, 13, 3559107. [Google Scholar] [CrossRef]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Computer Vision–ECCV 2020, Proceedings of the 16th European Conference, Glasgow, UK, 23–28 August 2020; Springer International Publishing: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
- Zhang, W.; Cai, M.; Zhang, T.; Lei, G.; Zhuang, Y.; Mao, X. Popeye: A Unified Visual-Language Model for Multi-Source Ship Detection from Remote Sensing Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 20050–20063. [Google Scholar] [CrossRef]
- Cao, H.; Wang, Y.; Chen, J.; Jiang, D. Swin-UNet: Unet-Like Pure Transformer for Medical Image Segmentation. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2022, Proceedings of the 25th International Conference, Singapore, 18–22 September 2022; Springer Nature: Cham, Switzerland, 2022; pp. 205–218. [Google Scholar]
- Chen, Z.; Zhu, Y.; Zhao, C.; Xu, H.; Liu, H. DPT: Deformable Patch-Based Transformer for Visual Recognition. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual Event, China, 20–24 October 2021; pp. 2899–2907. [Google Scholar]
- Yao, X.; Zhang, H.; Wen, S.; Shi, Z.; Jiang, Z. Single-Image Super Resolution for RGB Remote Sensing Imagery Via Multi-Scale CNN-Transformer Feature Fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 18, 1302–1316. [Google Scholar] [CrossRef]
- Zhu, Q.; Huang, X.; Guan, Q. TabCtNet: Target-Aware Bilateral CNN-Transformer Network for Single Object Tracking in Satellite Videos. Int. J. Appl. Earth Obs. Geoinf. 2024, 128, 103723. [Google Scholar] [CrossRef]
- Zhang, X.; Xu, C.; Fan, G.; Wu, Q. FSCMF: A Dual-Branch Frequency–Spatial Joint Perception Cross-Modality Network for Visible and Infrared Image Fusion. Neurocomputing 2025, 587, 130376. [Google Scholar] [CrossRef]
- Wang, W.; Chen, W.; Qiu, Q.; Chen, L.; Wu, B.; Lin, B.; Li, L.; Liu, W. Crossformer++: A Versatile Vision Transformer Hinging on Cross-Scale Attention. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 46, 3123–3136. [Google Scholar] [CrossRef] [PubMed]
- Jamali, A.; Roy, S.K.; Ghamisi, P. WetMapFormer: A Unified Deep CNN and Vision Transformer for Complex Wetland Mapping. Int. J. Appl. Earth Obs. Geoinf. 2023, 120, 103333. [Google Scholar] [CrossRef]
- Ji, R.; Tan, K.; Wang, X.; Tang, S.; Sun, J.; Niu, C.; Pan, C. PatchOut: A Novel Patch-Free Approach Based on a Transformer-CNN Hybrid Framework for Fine-Grained Land-Cover Classification on Large-Scale Airborne Hyperspectral Images. Int. J. Appl. Earth Obs. Geoinf. 2025, 138, 104457. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- Xia, G.-S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3974–3983. [Google Scholar]
- Liu, Z.; Yuan, L.; Weng, L.; Yang, Y. A High Resolution Optical Satellite Image Dataset for Ship Recognition and Some New Baselines. In Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods (ICPRAM 2017), Porto, Portugal, 24–26 February 2017; pp. 324–331. [Google Scholar]
- Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal Visual Object Classes (VOC) Challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
- Han, J.; Ding, J.; Xue, N.; Shao, Z. ReDet: A Rotation-Equivariant Detector for Aerial Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 2786–2795. [Google Scholar]
- Pan, X.; Ren, Y.; Sheng, K.; Dong, W.; Yuan, H.; Guo, X.; Ma, Z.; Xu, C. Dynamic Refinement Network for Oriented and Densely Packed Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11207–11216. [Google Scholar]
- Zhao, T.; Yuan, M.; Jiang, F.; Celik, T.; Li, H.-C. Removal Then Selection: A Coarse-to-Fine Fusion Perspective for RGB-Infrared Object Detection. arXiv 2024, arXiv:2401.10731. [Google Scholar]
- Ou, Z.; Chen, Z.; Shen, S.; Fan, L.; Yao, S.; Song, M.; Hui, P. Free3Net: Gliding Free, Orientation Free, and Anchor Free Network for Oriented Object Detection. IEEE Trans. Multimed. 2022, 25, 7089–7100. [Google Scholar] [CrossRef]
- Huang, Q.; Yao, R.; Lu, X.; Zhu, J.; Xiong, S.; Chen, Y. Oriented Object Detector with Gaussian Distribution Cost Label Assignment and Task-Decoupled Head. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–15. [Google Scholar] [CrossRef]
- Ming, Q.; Miao, L.; Zhou, Z.; Dong, Y. CFC-Net: A Critical Feature Capturing Network for Arbitrary-Oriented Object Detection in Remote-Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–14. [Google Scholar] [CrossRef]
- Yang, X.; Yan, J.; Feng, Z.; He, T. R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3163–3171. [Google Scholar] [CrossRef]
- Yao, Y.; Cheng, G.; Lang, C.; Xie, X.; Han, J. Centric Probability-Based Sample Selection for Oriented Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 10289–10302. [Google Scholar] [CrossRef]
- Zhang, X.; Zhao, C.; Hu, B.; Li, J.; Plaza, A. Efficient Object Detection in Large-Scale Remote Sensing Images via Situation-Aware Model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 22486–22498. [Google Scholar] [CrossRef]
- Fu, R.; Chen, C.; Yan, S.; Li, W.; Wang, P. FADL-Net: Frequency-Assisted Dynamic Learning Network for Oriented Object Detection in Remote Sensing Images. IEEE Trans. Ind. Inform. 2024, 20, 9939–9951. [Google Scholar] [CrossRef]
- Zhao, J.; Ding, Z.; Zhou, Y.; Zhu, H.; Du, W.L.; Yao, R.; El Saddik, A. OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–16. [Google Scholar] [CrossRef]
- Zhang, C.; Chen, Z.; Xiong, B.; Li, Y.; Wang, J. EOOD: End-to-end Oriented Object Detection. Neurocomputing 2025, 621, 129251. [Google Scholar] [CrossRef]
- Cheng, G.; Yao, Y.; Li, S.; Li, K.; Xie, X.; Wang, J.; Han, J. Dual-Aligned Oriented Detector. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–11. [Google Scholar] [CrossRef]
- Dong, Y.; Wei, M.; Gao, G.; Li, C.; Liu, Z. SARFA-Net: Shape-Aware Label Assignment and Refined Feature Alignment for Arbitrary-Oriented Object Detection in Remote Sensing Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2025, 18, 8865–8881. [Google Scholar] [CrossRef]
- Yang, X.; Yan, J.; Liao, W.; Yang, X.; Tang, J.; He, T. SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 2384–2399. [Google Scholar] [CrossRef]
- Wuan, S.; Zheng, W.; Zhijing, X. Ship-Yolo: A Deep Learning Approach for Ship Detection in Remote Sensing Images. J. Mar. Sci. Eng. 2025, 13, 737. [Google Scholar] [CrossRef]
- Rao, C.; Wang, J.; Cheng, G.; Xie, X.; Han, J. Learning orientation-aware distances for oriented object detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 1–11. [Google Scholar] [CrossRef]
- Du, X.; Wu, X. Small Object Detection in Synthetic Aperture Radar with Modular Feature Encoding and Vectorized Box Regression. Remote Sens. 2025, 17, 3094. [Google Scholar] [CrossRef]
- Yan, C.; Qi, N. LTGS: An optical remote sensing tiny ship detection model. Pattern Anal. Appl. 2025, 28, 124. [Google Scholar] [CrossRef]
- Dai, L.; Liu, H.; Tang, H.; Wu, Z.; Song, P. AO2-DETR: Arbitrary-Oriented Object Detection Transformer. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 2342–2356. [Google Scholar] [CrossRef]
- Xu, Y.; Fu, M.; Wang, Q.; Wang, Y.; Chen, K.; Xia, G.S.; Bai, X. Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1452–1459. [Google Scholar] [CrossRef]
- Cui, Z.; Leng, J.; Liu, Y.; Zhang, T.; Quan, P.; Zhao, W. SKNet: Detecting Rotated Ships as Keypoints in Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2021, 59, 8826–8840. [Google Scholar] [CrossRef]
- Kang, Y.; Zheng, B.; Shen, W. Research on Oriented Object Detection in Aerial Images Based on Architecture Search with Decoupled Detection Heads. Appl. Sci. 2025, 15, 8370. [Google Scholar] [CrossRef]
- Song, J.; Miao, L.; Zhou, Z.; Ming, Q.; Dong, Y. Optimized Point Set Representation for Oriented Object Detection in Remote-Sensing Images. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
- Zhu, L.; Jing, D.; Lu, B.; Zheng, D.; Ren, S.; Chen, Z. Shape-Aware Dynamic Alignment Network for Oriented Object Detection in Aerial Images. Symmetry 2025, 17, 779. [Google Scholar] [CrossRef]
- Pan, Y.; Xu, Y.; Wu, Z.; Wei, Z.; Plaza, J.; Plaza, A. A Mask Guided Oriented Object Detector Based on Rotated Size-Adaptive Tricube Kernel. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5615815. [Google Scholar] [CrossRef]
- Zhao, T.; Liu, N.; Celik, T.; Li, H.-C. An Arbitrary-Oriented Object Detector Based on Variant Gaussian Label in Remote Sensing Image. IEEE Geosci. Remote Sens. Lett. 2022, 19, 8013605. [Google Scholar] [CrossRef]
- Wang, J.; Li, L.; Bi, H. Gaussian Focal Loss: Learning Distribution Polarized Angle Prediction for Rotated Object Detection in Aerial Image. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4707013. [Google Scholar] [CrossRef]
- Zhao, Z.; Li, L. ABFL: Angular Boundary Discontinuity Free Loss for Arbitrary-Oriented Object Detection in Aerial Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–14. [Google Scholar] [CrossRef]
- You, D.; Zhao, B.; Lei, D.; Mao, Y. A Robust Multi-Scale Ship Detection Approach Leveraging Edge Focus Enhancement and Dilated Residual Aggregation. J. Supercomput. 2025, 81, 1524. [Google Scholar] [CrossRef]
- Zhuang, Y.; Liu, Y.; Zhang, T.; Chen, H. Contour Modeling Arbitrary-Oriented Ship Detection from Very High-Resolution Optical Remote Sensing Imagery. IEEE Geosci. Remote Sens. Lett. 2023, 20, 1–5. [Google Scholar] [CrossRef]
| Method Category | Feature Strategy | Modeling Capability | Rotation Sensitivity | Performance |
|---|---|---|---|---|
| Traditional ML | Hand-crafted features | × | × | Sensitive to backgrounds; poor generalization |
| CNN-based | Layer-wise convolutional features | ✓ | ✓ (angle encoding) | Strong local extraction but lacks global context |
| Transfer Learning | Layer reuse | ✓ | ✓ | Improved convergence with limited data, but domain shift remains challenging |
| Hybrid Transformer | Simple concatenation; linear fusion | ✓✓ | ✓ | Excellent context modeling but limited fine-grained alignment |
| RSDB-Net | Residual feature injection + FCCM cross-branch coupling | ✓✓✓ | ✓✓✓ (rotation-aware fusion) | Superior accuracy and robustness in complex maritime environments |
| Unit | CNN Branch | FCCM | ST Branch | Output |
|---|---|---|---|---|
| Input | C × H0 × W0 | - | (K + 1) × E | - |
| 0 | Conv 7 × 7, s = 2, C: 3 → 64; MaxPool 3 × 3, s = 2 | - | PatchEmbed p = 4, s = 4, d = 96; flatten → transpose | - |
| 1 | [1 × 1, 64] → [3 × 3, 64, s = 1] → [1 × 1, 256] × 3 | 256 × 256 × 256 → 1 × 1, 96 | w = 7, h = 3, d = 96, blocks × 2 (Swin) | C1: 96 × (H0/4) × (W0/4) |
| 2 | [1 × 1, 128] → [3 × 3, 128, s = 2] → [1 × 1, 512] × 4 | 512 × 128 × 128 → 1 × 1, 192 | PatchMerging (↓2); w = 7, h = 3, d = 192, blocks × 6 | C2: 192 × (H0/8) × (W0/8) |
| 3 | [1 × 1, 256] → [3 × 3, 256, s = 2] → [1 × 1, 1024] × 6 | 1024 × 64 × 64 → 1 × 1, 384 | PatchMerging (↓2); w = 7, h = 12, d = 384, blocks × 6 | C3: 384 × (H0/16) × (W0/16) |
| 4 | [1 × 1, 512] → [3 × 3, 512, s = 2] → [1 × 1, 2048] × 3 | 2048 × 32 × 32 → 1 × 1, 768 | PatchMerging (↓2); w = 7, h = 24, d = 768, blocks × 2 | C4: 768 × (H0/32) × (W0/32) |
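As a quick sanity check on the table above, the cumulative strides of Units 1–4 (/4, /8, /16, /32) and token dimensions d = 96, 192, 384, 768 determine the output shapes C1–C4. The snippet below assumes a 1024 × 1024 input, the crop size used in the experiments:

```python
# Feature-map shapes produced by STCBackbone, following the
# /4, /8, /16, /32 stride pattern of Units 1-4 in the table above.
H0 = W0 = 1024                       # assumed input resolution
dims = [96, 192, 384, 768]           # token dimension d per unit
shapes = []
for i, d in enumerate(dims, start=1):
    s = 4 * 2 ** (i - 1)             # cumulative stride of unit i: 4, 8, 16, 32
    shapes.append((d, H0 // s, W0 // s))
    print(f"C{i}: {d} x {H0 // s} x {W0 // s}")
```

For a 1024 × 1024 image this yields C1 = 96 × 256 × 256 down to C4 = 768 × 32 × 32, matching the output column of the table.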
| Dataset | Head | Backbone | AP % (Ship) | Dataset | Head | Backbone | AP % (Ship) |
|---|---|---|---|---|---|---|---|
| DOTA-v1.0 | RoITransformer | ResNet50 | 83.59 | HRSC2016 | RoITransformer | ResNet50 | 65.1 |
| | | Swin Transformer | 82.65 | | | Swin Transformer | 45.7 |
| | | STCBackbone | 87.83 ↑ | | | STCBackbone | 88.0 ↑ |
| | Faster_RCNN | ResNet50 | 85.80 | | Faster_RCNN | ResNet50 | 54.3 |
| | | Swin Transformer | 87.82 | | | Swin Transformer | 41.5 |
| | | STCBackbone | 88.08 ↑ | | | STCBackbone | 88.3 ↑ |
| | RCFHead | ResNet50 | 87.82 | | RCFHead | ResNet50 | 88.3 |
| | | Swin Transformer | 87.88 | | | Swin Transformer | 88.0 |
| | | STCBackbone | 88.12 ↑ | | | STCBackbone | 88.4 ↑ |
| Head | Backbone | Neck | AP % (Ship) |
|---|---|---|---|
| RoITransformer | Swin Transformer | FPN | 87.29 |
| | | FPN + P0 | 88.40 ↑ |
| | | eFPN | 87.74 ↑↑ |
| | ResNet50 | FPN | 87.47 |
| | | FPN + P0 | 87.50 ↑ |
| | | eFPN | 87.61 ↑↑ |
| | RSDB-Net | FPN | 87.83 |
| | | FPN + P0 | 88.04 ↑ |
| | | eFPN | 88.40 ↑↑ |
| Backbone | Neck | Head | AP % (Ship) |
|---|---|---|---|
| ResNet50 | FPN | RoITransformer | 83.59 |
| | | Faster_RCNN | 85.80 |
| | | RCFHead | 87.82 |
| Swin Transformer | | RoITransformer | 82.65 |
| | | Faster_RCNN | 87.82 |
| | | RCFHead | 87.88 |
| STCBackbone | | RoITransformer | 87.83 |
| | | Faster_RCNN | 88.08 |
| | | RCFHead | 88.12 |
| Network | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RoITransformer | 88.55 | 82.30 | 52.92 | 71.49 | 78.72 | 83.07 | 87.30 | 90.90 | 83.57 | 86.63 | 63.61 | 62.44 | 75.41 | 72.01 | 52.62 | 75.43 |
| RoITransformer + MRPN | 89.11 | 84.96 | 53.66 | 73.69 | 79.60 | 83.36 | 87.84 | 90.91 | 85.97 | 87.07 | 63.35 | 64.02 | 75.46 | 73.76 | 58.74 | 76.70 |
| RoITransformer + MRPN + MRoI | 89.09 | 83.21 | 54.06 | 72.17 | 80.81 | 84.21 | 88.08 | 90.91 | 85.36 | 86.39 | 66.11 | 66.01 | 77.38 | 73.87 | 62.25 | 77.27 |
| Backbone | Head | IoU50 | IoU55 | IoU60 | IoU65 | IoU70 | IoU75 | IoU80 | IoU85 | IoU90 | IoU95 | Slope |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ResNet50 | faster_rcnn | 54.3 | 44.9 | 38.5 | 27.1 | 15.9 | 8.0 | 3.6 | 1.0 | 0.2 | 0.0 | −1.852 |
| | roitransformer | 65.1 | 62.2 | 51.2 | 40.9 | 30.8 | 20.8 | 11.5 | 4.9 | 0.9 | 0.3 | −1.772 |
| | redet | 83.1 | 75.8 | 75.0 | 65.2 | 53.9 | 43.3 | 31.0 | 15.6 | 9.1 | 0.2 | −1.592 |
| | mixhead (ours) | 88.3 | 88.1 | 87.0 | 78.8 | 77.3 | 66.2 | 48.6 | 29.9 | 12.5 | 1.5 | −0.884 |
| Swin Transformer | faster_rcnn | 41.5 | 29.0 | 18.9 | 10.3 | 6.3 | 1.5 | 0.5 | 0.1 | 0.0 | 0.0 | −1.600 |
| | roitransformer | 45.7 | 37.0 | 28.0 | 21.0 | 15.2 | 11.2 | 9.1 | 1.3 | 1.3 | 0.0 | −1.380 |
| | redet | 60.0 | 50.1 | 40.0 | 29.1 | 21.4 | 14.2 | 10.7 | 4.5 | 0.4 | 0.1 | −1.832 |
| | mixhead (ours) | 88.0 | 87.8 | 86.9 | 78.8 | 75.9 | 65.6 | 44.8 | 24.3 | 5.3 | 1.8 | −0.896 |
| STCBackbone | faster_rcnn | 45.0 | 36.1 | 27.4 | 13.7 | 7.4 | 3.3 | 0.9 | 0.4 | 0.1 | 0.0 | −1.668 |
| | roitransformer | 75.5 | 72.5 | 62.6 | 51.5 | 37.2 | 22.2 | 10.3 | 2.8 | 1.2 | 0.0 | −2.132 |
| | redet | 70.3 | 60.9 | 50.3 | 40.1 | 29.4 | 19.0 | 11.3 | 9.1 | 0.8 | 0.0 | −2.052 |
| | mixhead (ours) | 88.4 | 88.1 | 87.2 | 79.2 | 77.0 | 66.5 | 45.5 | 24.5 | 7.5 | 0.3 | −0.876 |
| Dataset | Backbone | Neck | Head | AP %(Ship)/AP50 |
|---|---|---|---|---|
| DOTA-v1.0 | Swin Transformer | FPN | RoITransformer | 87.29 |
| | STCBackbone | FPN | RoITransformer | 87.83 |
| | STCBackbone | eFPN | RoITransformer | 88.04 |
| | STCBackbone | eFPN | RCFHead | 88.75 |
| HRSC2016 | Swin Transformer | FPN | RoITransformer | 45.70 |
| | STCBackbone | FPN | RoITransformer | 75.50 |
| | STCBackbone | eFPN | RoITransformer | 88.40 |
| | STCBackbone | eFPN | RCFHead | 90.10 |
| Network | Backbone | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC | mAP % |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DRN [30] | H-104 | 88.91 | 80.22 | 43.52 | 63.35 | 73.48 | 70.69 | 84.94 | 90.14 | 83.85 | 84.11 | 50.12 | 58.41 | 67.62 | 68.60 | 52.50 | 70.70 |
| RSDet [31] | ResNet101 | 89.80 | 82.90 | 48.60 | 65.20 | 69.50 | 70.10 | 70.20 | 90.50 | 85.60 | 83.40 | 62.50 | 63.90 | 65.60 | 67.20 | 68.00 | 72.20 |
| Free3Net [32] | ResNet50 | 89.32 | 80.23 | 48.23 | 66.23 | 66.09 | 83.80 | 85.05 | 89.60 | 77.01 | 86.33 | 70.57 | 59.46 | 72.73 | 63.88 | 61.91 | 73.36 |
| GTDET [33] | CSPDarknets | 87.03 | 79.76 | 47.09 | 69.12 | 79.16 | 83.54 | 88.68 | 90.91 | 85.48 | 86.31 | 58.59 | 60.47 | 73.70 | 71.17 | 40.96 | 73.46 |
| CFC-Net [34] | ResNet50 | 89.08 | 80.41 | 52.41 | 70.02 | 76.28 | 78.11 | 87.21 | 90.89 | 84.47 | 85.64 | 60.51 | 61.52 | 67.82 | 68.02 | 50.09 | 73.5 |
| R3Det [35] | ResNet101 | 88.76 | 83.09 | 50.91 | 67.27 | 76.23 | 80.39 | 86.72 | 90.78 | 84.68 | 83.24 | 61.98 | 61.35 | 66.91 | 70.63 | 53.94 | 73.79 |
| FCOS-O + CPSS [36] | ResNet50 | 88.78 | 76.60 | 52.30 | 72.50 | 80.19 | 78.09 | 87.38 | 90.88 | 81.84 | 83.84 | 61.48 | 65.41 | 66.39 | 70.23 | 52.45 | 73.83 |
| SFANet [37] | ResNet50 | 89.08 | 83.13 | 50.77 | 76.97 | 77.76 | 74.17 | 86.25 | 90.89 | 86.63 | 85.32 | 60.93 | 63.54 | 64.26 | 68.34 | 65.11 | 74.16 |
| FADL-Net [38] | Swin-T | 89.17 | 80.90 | 51.37 | 71.74 | 79.70 | 81.09 | 87.75 | 90.89 | 87.08 | 86.17 | 60.83 | 64.44 | 66.49 | 72.74 | 59.13 | 75.30 |
| OrientedFormer [39] | ResNet50 | 88.14 | 79.13 | 51.96 | 67.34 | 81.02 | 83.26 | 88.29 | 90.90 | 85.57 | 86.25 | 60.84 | 66.36 | 73.81 | 71.23 | 56.49 | 75.37 |
| EOOD [40] | ResNet50 | 87.85 | 80.70 | 56.12 | 69.20 | 79.13 | 81.57 | 87.68 | 90.88 | 83.35 | 83.47 | 57.62 | 62.88 | 73.74 | 75.09 | 60.49 | 75.31 |
| DODet [41] | ResNet50 | 89.34 | 84.31 | 51.39 | 71.04 | 79.04 | 82.86 | 88.15 | 90.90 | 86.88 | 84.91 | 62.69 | 67.63 | 75.47 | 72.22 | 45.54 | 75.49 |
| SARFA-Net [42] | ResNet50 | 89.06 | 82.92 | 49.96 | 71.60 | 77.56 | 81.10 | 87.72 | 90.88 | 85.02 | 86.03 | 62.05 | 66.94 | 75.34 | 71.45 | 57.39 | 75.66 |
| SCRDet++ [43] | ResNet101 | 90.01 | 82.23 | 61.94 | 68.62 | 69.62 | 81.17 | 78.83 | 90.86 | 86.32 | 85.10 | 65.10 | 61.12 | 77.69 | 80.86 | 64.25 | 76.24 |
| RA2DC-Net [10] | ResNet50 | 88.39 | 83.12 | 53.78 | 74.01 | 79.50 | 78.81 | 87.30 | 90.91 | 85.69 | 84.22 | 63.48 | 68.42 | 75.89 | 73.91 | 56.33 | 76.25 |
| ReBiDet [9] | ReResNet50 | 89.36 | 83.77 | 53.04 | 71.55 | 79.01 | 83.56 | 88.41 | 90.89 | 87.31 | 86.00 | 65.45 | 61.79 | 76.73 | 70.30 | 60.36 | 76.50 |
| Ship-Yolo [44] | - | 89.30 | 83.70 | 54.60 | 72.40 | 83.20 | 79.00 | 88.60 | 90.90 | 87.50 | 85.70 | 64.30 | 66.70 | 74.10 | 70.80 | 58.20 | 76.60 |
| FEADet [8] | ResNet50 | 89.20 | 82.07 | 54.24 | 72.67 | 81.24 | 83.72 | 88.23 | 90.84 | 86.13 | 84.34 | 69.56 | 63.39 | 76.20 | 75.43 | 58.48 | 77.05 |
| RSDB-Net (Ours) | STCBackbone | 88.89 | 83.69 | 55.99 | 69.42 | 80.07 | 84.84 | 89.13 | 90.85 | 83.75 | 88.15 | 76.97 | 66.17 | 76.87 | 80.48 | 72.27 | 77.81 |
| Network | FPS | GFLOPs | Params(M) | AP-Ship |
|---|---|---|---|---|
| RoITransformer | 5.9 | 225 | - | 83.59 |
| FCOSF [45] | 23.6 | 202 | 31.8 | 87.8 |
| S2ANet [8] | 17.2 | - | 198 | 85.72 |
| RSDB-Net (Ours) | 72 | 101.6 | 89.49 | 89.13 |
| Method | Backbone | Image Size | AP50% |
|---|---|---|---|
| DVDNet [46] | ResNet50 | 800 × 800 | 80.7 |
| LTGS [47] | LTGS | - | 84.5 |
| RoITransformer [3] | ResNet101 | 512 × 800 | 86.2 |
| CFC-Net [34] | ResNet50 | 416 × 416 | 86.3 |
| AO2-DETR [48] | ResNet50 | - | 88.12 |
| Gliding vertex [49] | ResNet101 | 512 × 800 | 88.2 |
| SKNet [50] | Hourglass-104 | 511 × 511 | 88.3 |
| FAS [51] | FAS-Inception-ResNet101 | 640 × 640 | 89.1 |
| PSD + PPSS [52] | ResNet50 | 800 × 1333 | 89.53 |
| SADA-Net [53] | ResNet50-DRRCM | 512 × 800 | 89.58 |
| MRSDet [54] | ResNet101 | 612 × 612 | 89.6 |
| CenterNet + VGL + CPA [55] | DLA34-DCN | 608 × 608 | 89.78 |
| GF-CSL [56] | ResNet50 | 512 × 512 | 89.8 |
| RFCOS-ABFL [57] | ResNet50 | 800 × 800 | 89.98 |
| MSDA [58] | YOLOv8 | - | 90.00 |
| CMDet [59] | ResNet50 | 640 × 640 | 90.1 |
| RSDB-Net (ours) | STCBackbone | 1024 × 1024 | 90.1 |
Share and Cite
Zhou, D.; Xiong, Y.; Yu, S.; Feng, P.; Liu, J.; Wu, N.; Dou, R.; Liu, L. RSDB-Net: A Novel Rotation-Sensitive Dual-Branch Network with Enhanced Local Features for Remote Sensing Ship Detection. Remote Sens. 2025, 17, 3925. https://doi.org/10.3390/rs17233925