A Novel Dynamic Context Branch Attention Network for Detecting Small Objects in Remote Sensing Images
Abstract
1. Introduction
- We propose a DCSA Block with a multi-branch architecture that adaptively captures the most suitable range of contextual information for small objects in RSIs. We also introduce an EBA module that uses dynamic weight adjustment to strengthen the network’s use of the branches carrying critical information.
- We propose a DGFFN that introduces gating mechanisms into the feedforward network, reducing the parameter count and computational complexity while improving the detection speed of the model.
- We construct a novel lightweight backbone network, the Dynamic Context Branch Attention Network (DCBANet), by repeatedly stacking the proposed components. The network performs strongly on public remote sensing datasets, indicating that the components adapt dynamically to small targets of various sizes and that the model applies to small objects across the full small-object scale range (area < 1024 pixels).
2. Related Works
2.1. RSOD Frameworks
2.2. Small Object Detection
2.3. Attention Mechanisms
2.4. Feedforward Network
3. Method
3.1. DCBANet Backbone Architecture
3.2. Dynamic Context Scale-Aware Block
3.3. Efficient Branch Attention
Algorithm 1: Forward Propagation Process of the DCSA Block |
Input: Feature map X_in |
Output: Feature map X_out |
1: Split X_in into n branches: X_1, X_2, …, X_n |
2: Initialize an empty list of branch outputs: Outputs = [] |
3: for i = 1 to n do |
4: //Decomposed large kernel convolution |
5: U_s_i = Conv1×1(DWConv_k1(X_i)) |
6: U_l_i = Conv1×1(DWDConv_k2,d(U_s_i)) |
7: //Context Adaptive Selection Module (CASM) |
8: U_combined_i = Concat(Conv1×1(U_s_i), Conv1×1(U_l_i)) |
9: [W_s_i, W_l_i] = σ(Conv1×1(Concat(P_avg(U_combined_i), P_max(U_combined_i)))) |
10: X_branch_out_i = W_s_i × U_s_i + W_l_i × U_l_i |
11: Append X_branch_out_i to Outputs |
12: end for |
13: //Efficient Branch Attention (EBA) |
14: X_cat = Concat(Outputs) |
15: Y_pooled = GlobalAvgPool(X_cat) |
16: w_eba = σ(Conv1×1(Y_pooled)) |
17: X_eba = Σ (w_eba_i × X_branch_out_i) |
18: //Residual Connection |
19: X_out = Conv1×1(X_eba) + X_in |
20: return X_out |
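
The EBA stage of Algorithm 1 (concatenate, pool, weight, sum) can be illustrated with a minimal pure-Python sketch. It rests on several assumptions that are not stated in the algorithm: each branch output is a list of channels, each channel is a small 2D grid, the 1×1 convolution applied to the pooled vector acts as a linear map producing one scalar weight per branch, and `conv_w` and `conv_b` are hypothetical placeholder parameters rather than learned values. It is not the authors' implementation.

```python
import math


def global_avg_pool(channels):
    """Average each H x W channel down to a single scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def eba(branch_outputs, conv_w, conv_b):
    """Weight and sum branch outputs (EBA stage of Algorithm 1).

    branch_outputs: n branches, each a list of C channels of H x W values.
    conv_w: n rows of n*C weights, conv_b: n biases (hypothetical parameters).
    """
    # Concatenate all branches along the channel axis (X_cat).
    x_cat = [ch for branch in branch_outputs for ch in branch]
    # Global average pooling: one descriptor per channel (Y_pooled).
    y_pooled = global_avg_pool(x_cat)
    # A 1x1 conv on a 1x1 map is a linear layer; sigmoid gives per-branch weights.
    w_eba = [sigmoid(sum(wt * y for wt, y in zip(row, y_pooled)) + bias)
             for row, bias in zip(conv_w, conv_b)]
    # Weighted sum of the branch outputs, element by element (X_eba).
    n_ch = len(branch_outputs[0])
    rows = len(branch_outputs[0][0])
    cols = len(branch_outputs[0][0][0])
    x_eba = [[[sum(g * br[c][r][q] for g, br in zip(w_eba, branch_outputs))
               for q in range(cols)]
              for r in range(rows)]
             for c in range(n_ch)]
    return w_eba, x_eba


# Two branches, each with one 2x2 channel; zero weights give sigmoid(0) = 0.5.
branches = [
    [[[1.0, 1.0], [1.0, 1.0]]],
    [[[3.0, 3.0], [3.0, 3.0]]],
]
weights, fused = eba(branches, conv_w=[[0.0, 0.0], [0.0, 0.0]], conv_b=[0.0, 0.0])
print(weights)  # [0.5, 0.5]
print(fused)    # [[[2.0, 2.0], [2.0, 2.0]]]
```

With all-zero parameters every branch receives the same weight, so the fused map is simply the average of the branches. Non-zero parameters would let the pooled statistics favor one branch over another.
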
3.4. Dual-Gated Feedforward Network
4. Experiments
4.1. Datasets
4.2. Experimental Setup and Evaluation Metrics
4.2.1. Experimental Setup
4.2.2. Evaluation Metrics
4.3. Ablation Study
4.4. Main Result
4.5. Visual Analytics
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
RSOD | Remote Sensing Object Detection |
RSIs | Remote Sensing Images |
DCBANet | Dynamic Context Branch Attention Network |
DCSA Block | Dynamic Context Scale-Aware Block |
EBA | Efficient Branch Attention |
FFN | Feedforward Network |
DGFFN | Dual-Gated Feedforward Network |
SGU | Spatial Gating Unit |
CGU | Channel Gating Unit |
MLP | Multilayer Perceptron |
FLOPs | Floating-point Operations |
AP | Average Precision |
mAP | Mean Average Precision |
FPS | Frames Per Second |
References
1. Gui, S.; Song, S.; Qin, R.; Tang, Y. Remote Sensing Object Detection in the Deep Learning Era—A Review. Remote Sens. 2024, 16, 327.
2. Guo, Z.; Liu, C.; Zhang, X.; Jiao, J.; Ji, X.; Ye, Q. Beyond Bounding-Box: Convex-hull Feature Adaptation for Oriented and Densely Packed Object Detection. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8788–8797.
3. Lyu, C.; Zhang, W.; Huang, H.; Zhou, Y.; Wang, Y.; Liu, Y.; Zhang, S.; Chen, K. RTMDet: An Empirical Study of Designing Real-Time Object Detectors. arXiv 2022, arXiv:2212.07784.
4. Ding, J.; Xue, N.; Long, Y.; Xia, G.-S.; Lu, Q. Learning RoI Transformer for Oriented Object Detection in Aerial Images. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2844–2853.
5. Xie, X.; Cheng, G.; Wang, J.; Yao, X.; Han, J. Oriented R-CNN for Object Detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 3500–3509.
6. Cheng, G.; Yuan, X.; Han, X.J. Towards Large Scale Small Object Detection: Survey and Benchmarks. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 13467–13488.
7. Chen, G.; Wang, H.; Chen, K.; Li, Z.; Song, Z. A Survey of the Four Pillars for Small Object Detection: Multiscale Representation, Contextual Information, Super-Resolution, and Region Proposal. IEEE Trans. Syst. Man Cybern. Syst. 2022, 52, 936–953.
8. Hua, X.; Cui, X.; Xu, X. Weakly Supervised Underwater Object Real-time Detection Based on High-resolution Attention Class Activation Mapping and Category Hierarchy. Pattern Recognit. 2025, 159, 111111.
9. Gao, T.; Xia, S.; Liu, M. MSNet: Multi-Scale Network for Object Detection in Remote Sensing Images. Pattern Recognit. 2025, 158, 110983.
10. Chen, C.; Zeng, W.; Zhang, X. HFPNet: Super Feature Aggregation Pyramid Network for Maritime Remote Sensing Small-Object Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 5973–5989.
11. Wei, C.; Bai, L.; Chen, X.; Han, J. Cross-Modality Data Augmentation for Aerial Object Detection with Representation Learning. Remote Sens. 2024, 16, 4649.
12. Cheng, G.; Wang, J.; Li, K.; Xie, X. Anchor-Free Oriented Proposal Generator for Object Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5625411.
13. Song, G.; Du, H.; Zhang, X. Small Object Detection in Unmanned Aerial Vehicle Images Using Multi-Scale Hybrid Attention. Eng. Appl. Artif. Intell. 2024, 128, 107455.
14. Li, Y.; Hou, Q.; Zheng, Z.; Cheng, M.-M.; Yang, J.; Li, X. Large Selective Kernel Network for Remote Sensing Object Detection. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 16748–16759.
15. Cui, L.; Lv, P.; Jiang, X.; Gao, Z. Context-Aware Block Net for Small Object Detection. IEEE Trans. Cybern. 2022, 52, 2300–2313.
16. He, X.; Zheng, X.; Hao, X.; Jin, H.; Zhou, X.; Shao, L. Improving Small Object Detection via Context-Aware and Feature-Enhanced Plug-and-Play Modules. J. Real-Time Image Process. 2024, 21, 44.
17. Wu, C.; Zeng, Z. YOLOX-CA: A Remote Sensing Object Detection Model Based on Contextual Feature Enhancement and Attention Mechanism. IEEE Access 2024, 12, 84632.
18. Li, Z.; Wang, Y.; Zhang, Y. Context Feature Integration and Balanced Sampling Strategy for Small Weak Object Detection in Remote Sensing Imagery. IEEE Geosci. Remote Sens. Lett. 2024, 21, 6009105.
19. Ding, S.; Xiong, M.; Wang, X. Dynamic feature and context enhancement network for faster detection of small objects. Expert Syst. Appl. 2025, 265, 125732.
20. Yang, H.; Qiu, S. A Novel Dynamic Contextual Feature Fusion Model for Small Object Detection in Satellite Remote-Sensing Images. Information 2024, 15, 230.
21. Wang, B.; Ji, R.; Zhang, L. Learning to zoom: Exploiting mixed-scale contextual information for object detection. Expert Syst. Appl. 2025, 264, 125871.
22. Yang, X.; Liu, Q.; Yan, J.; Li, A. R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object. arXiv 2019, arXiv:1908.05612.
23. Han, J.; Ding, J.; Li, J.; Xia, G. Align Deep Features for Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5602511.
24. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving Into High Quality Object Detection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162.
25. Zhang, H.; Chang, H.; Ma, B. Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training. In Computer Vision–ECCV 2020; Lecture Notes in Computer Science, Volume 12360; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Springer: Cham, Switzerland, 2020; pp. 260–275.
26. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
27. Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Computer Vision–ECCV 2020; Lecture Notes in Computer Science, Volume 12346; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Springer: Cham, Switzerland, 2020; pp. 213–229.
28. Zhao, J.; Ding, Z.; Zhou, Y.; Zhu, H.; Du, W. OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5640816.
29. Zhu, X.; Su, W.; Lu, L. Deformable DETR: Deformable Transformers for End-to-End Object Detection. In Proceedings of the International Conference on Learning Representations (ICLR), Vienna, Austria, 3–7 May 2021; pp. 1–16.
30. Yang, C.; Huang, Z.; Wang, N. QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 13658–13667.
31. Deng, C.; Wang, M.; Liu, L. Extended Feature Pyramid Network for Small Object Detection. IEEE Trans. Multimed. 2022, 24, 1968–1979.
32. Zeng, N.; Wu, P.; Wang, Z. A Small-Sized Object Detection Oriented Multi-Scale Feature Fusion Approach with Application to Defect Detection. IEEE Trans. Instrum. Meas. 2022, 71, 3507014.
33. Liu, Y.; Li, Q.; Yuan, Y.; Du, Q.; Wang, Q. ABNet: Adaptive Balanced Network for Multiscale Object Detection in Remote Sensing Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5614914.
34. Yang, X.; Yan, J.; Liao, W.; Yang, X.; Tang, J.; He, T. SCRDet++: Detecting Small, Cluttered and Rotated Objects via Instance-Level Feature Denoising and Rotation Loss Smoothing. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 2384–2399.
35. Vaswani, A.; Shazeer, N.M.; Parmar, N. Attention is All You Need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017.
36. Guo, M.; Xu, T.; Liu, J. Attention Mechanisms in Computer Vision: A Survey. Comput. Vis. Media 2022, 8, 331–368.
37. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
38. Wang, Q.; Wu, B.; Zhu, P. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11531–11539.
39. Hu, J.; Shen, L.; Albanie, S.; Sun, G.; Vedaldi, A. Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks. arXiv 2018, arXiv:1810.12348.
40. Zhao, H.; Zhang, Y.; Liu, S.; Shi, J. PSANet: Point-wise Spatial Attention Network for Scene Parsing. In Computer Vision–ECCV 2018; Lecture Notes in Computer Science, Volume 11213; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer: Cham, Switzerland, 2018; pp. 270–286.
41. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
42. Zhang, H.; Wu, C.; Zhang, Z.; Zhu, Y. ResNeSt: Split-Attention Networks. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 2735–2745.
43. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 9992–10002.
44. Chen, Z.; Zhu, Y.; Li, Z.; Yang, F.; Zhao, C.; Wang, J.; Tang, M. The Devil is in Details: Delving Into Lite FFN Design for Vision Transformers. In Proceedings of the ICASSP 2024–2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 4130–4134.
45. Xu, H.; Zhou, Z.; He, D. Vision Transformer with Attention Map Hallucination and FFN Compaction. arXiv 2023, arXiv:2306.10875.
46. Yang, Z.; Zhu, L.; Wu, Y. Gated Channel Transformation for Visual Recognition. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11791–11800.
47. Wang, Y.; Li, Y.; Wang, G. Multi-Scale Attention Network for Single Image Super-Resolution. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 17–18 June 2024; pp. 5950–5960.
48. Li, S.; Wang, Z.; Liu, Z.; Tan, C.; Lin, H.; Wu, D.; Chen, Z.; Zheng, J.; Li, S. MogaNet: Multi-Order Gated Aggregation Network. In Proceedings of the International Conference on Learning Representations (ICLR), Toulon, France, 25–29 April 2022.
49. Guo, M.H.; Lu, C.Z.; Liu, Z.N.; Cheng, M.M.; Hu, S.M. Visual Attention Network. Comput. Vis. Media 2023, 9, 733–752.
50. Yu, W.; Si, C.; Zhou, P.; Luo, M. MetaFormer Baselines for Vision. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 896–912.
51. Xia, G.-S.; Bai, X.; Ding, J.; Zhu, Z. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 3974–3983.
52. Haroon, M.; Shahzad, M.; Fraz, M. Multisized Object Detection Using Spaceborne Optical Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3032–3046.
53. Cheng, G.; Han, J.; Zhou, P. Multi-Class Geospatial Object Detection and Geographic Image Classification Based on Collection of Part Detectors. ISPRS J. Photogramm. Remote Sens. 2014, 98, 119–132.
54. Wang, J.; Yang, W.; Guo, H.; Zhang, R.; Xia, G. Tiny Object Detection in Aerial Images. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 3791–3798.
55. Zeng, Y.; Chen, Y.; Yang, X. ARS-DETR: Aspect Ratio-Sensitive Detection Transformer for Aerial Oriented Object Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5610315.
56. Xie, X.; Cheng, G.; Rao, C. Oriented Object Detection via Contextual Dependence Mining and Penalty-Incentive Allocation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5618010.
57. Hou, L.; Lu, K.; Xue, J. Shape-Adaptive Selection and Measurement for Oriented Object Detection. In Proceedings of the AAAI Conference on Artificial Intelligence; 2022; Volume 36, pp. 923–932.
58. Cheng, G.; Yao, Y.; Li, S.; Li, K. Dual-Aligned Oriented Detector. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5618111.
59. Han, J.; Ding, J.; Xue, N.; Xia, G. ReDet: A Rotation-Equivariant Detector for Aerial Object Detection. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 2785–2794.
60. Lin, T.; Goyal, P.; Girshick, R. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
61. Tian, Z.; Shen, C.; Chen, H. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9626–9635.
62. Chen, Q.; Wang, Y.; Yang, T. You Only Look One-Level Feature. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 19–25 June 2021; pp. 13034–13043.
63. Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5987–5995.
Group | Branch Number | RF | (k, d) | mAP | FPS |
---|---|---|---|---|---|
baseline | 1 | 23 | (5,1), (7,3) | 74.72 | 20.7 |
DCSA Block | 2 | 15; 27 | (3,1), (7,2); (3,1), (9,3) | 74.16 | 20.2 |
DCSA Block | 3 | 21; 23; 27 | (3,1), (7,3); (5,1), (7,3); (3,1), (9,3) | 74.55 | 18.9 |
DCSA Block | 3 | 15; 23; 31 | (3,1), (7,2); (5,1), (7,3); (7,1), (9,3) | 75.05 | 19.6 |
DCSA Block | 4 | 15; 23; 29; 39 | (3,1), (7,2); (5,1), (7,3); (5,1), (7,4); (7,1), (9,4) | 74.68 | 15.2 |
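
The RF column of the table above can be reproduced from the listed pairs. The sketch below assumes each pair is (kernel size, dilation) for the two stacked stride-1 convolutions of a branch, and uses the standard rule that a layer with kernel k and dilation d widens the receptive field by (k − 1)·d. The function name `receptive_field` is illustrative and does not come from the paper.

```python
def receptive_field(k1, d1, k2, d2):
    """Receptive field of two stacked stride-1 convolutions.

    Each layer with kernel k and dilation d widens the field by (k - 1) * d.
    """
    return 1 + (k1 - 1) * d1 + (k2 - 1) * d2


# (kernel, dilation) pairs copied from the table, keyed by the RF it reports.
table_rows = {
    15: ((3, 1), (7, 2)),
    21: ((3, 1), (7, 3)),
    23: ((5, 1), (7, 3)),
    27: ((3, 1), (9, 3)),
    29: ((5, 1), (7, 4)),
    31: ((7, 1), (9, 3)),
    39: ((7, 1), (9, 4)),
}

for reported_rf, ((k1, d1), (k2, d2)) in table_rows.items():
    # Every branch configuration in the table agrees with the formula.
    assert receptive_field(k1, d1, k2, d2) == reported_rf
```

Under this reading, the best-performing three-branch setting covers receptive fields of 15, 23, and 31, spaced evenly around the single-branch baseline of 23.
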
Backbone | FFN | Param (M) | FLOPs (G) | mAP (%) | FPS |
---|---|---|---|---|---|
LSKNet [14] | MLP | 13.84 | 54.31 | 74.72 | 20.7 |
LSKNet [14] | DGFFN | 11.72 (↓15.3%) | 44.96 (↓17.2%) | 74.89 (↑0.17%) | 24.1 (↑16.4%) |
Swin-t [43] | MLP | 27.49 | 95.03 | 73.96 | 27.7 |
Swin-t [43] | DGFFN | 23.27 (↓15.4%) | 81.73 (↓14.0%) | 74.12 (↑0.16%) | 29.6 (↑6.4%) |
DCBANet | MLP | 13.38 | 52.50 | 75.05 | 19.6 |
DCBANet | DGFFN | 11.28 (↓15.7%) | 43.14 (↓17.8%) | 75.16 (↑0.11%) | 23.4 (↑16.2%) |
DCBANet | GSAU [47] | 8.18 (↓38.8%) | 27.06 (↓48.4%) | 74.02 (↓1.03%) | 24.4 (↑24.5%) |
DCBANet | CAFFN [48] | 12.39 (↓7.3%) | 51.62 (↓1.6%) | 74.32 (↓0.73%) | 19.2 (↓1.0%) |
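
The parenthesized Param and FLOPs changes in the table above read as relative changes against the MLP row of the same backbone, while the mAP entries appear to be absolute differences in percentage points (for example, 74.89 − 74.72 = 0.17). A short sketch, using the LSKNet rows and a hypothetical helper `relative_change`, shows the relative calculation:

```python
def relative_change(baseline, value):
    """Signed percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# LSKNet rows: MLP baseline vs. DGFFN (values copied from the table).
param_change = relative_change(13.84, 11.72)  # parameters, in millions
flops_change = relative_change(54.31, 44.96)  # FLOPs, in billions

print(f"Param: {param_change:+.1f}%  FLOPs: {flops_change:+.1f}%")
# Param: -15.3%  FLOPs: -17.2%
```
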
DCSA Block | EBA | DGFFN | Param (M) | FLOPs (G) | DOTA mAP | NWPU VHR-10 mAP | SIMD mAP | SIMD | SIMD | SIMD | AI-TOD mAP@0.5 | AI-TOD mAP@0.75 | AI-TOD | AI-TOD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
× | × | × | 13.84 | 54.31 | 76.79 | 88.03 | 77.10 | 63.75 | 86.53 | 91.03 | 63.2 | 44.9 | 38.2 | 76.3 |
√ | × | × | 13.28 | 49.96 | 77.03 | 88.66 | 79.06 | 66.35 | 89.27 | 91.10 | 64.3 | 46.0 | 38.7 | 77.7 |
√ | √ | × | 13.99 | 51.33 | 77.24 | 89.05 | 79.80 | 67.17 | 90.36 | 91.32 | 64.8 | 46.8 | 39.7 | 78.9 |
√ | √ | √ | 11.28 | 43.14 | 77.33 | 89.17 | 80.27 | 67.45 | 91.14 | 91.85 | 65.0 | 47.1 | 39.9 | 79.3 |
Method | Backbone | mAP | PL | BD | BR | GTF | SV | LV | SH | TC | BC | ST | SBF | RA | HA | SP | HC |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
End-to-end | |||||||||||||||||
D.DETR-O [29] | R50 | 69.48 | 84.89 | 70.71 | 46.04 | 61.92 | 73.99 | 78.83 | 87.71 | 90.07 | 77.97 | 78.41 | 47.07 | 54.48 | 66.87 | 67.66 | 55.62 |
ARS-DETR [55] | R50 | 74.16 | 86.97 | 75.56 | 48.32 | 69.20 | 77.92 | 77.94 | 87.69 | 90.50 | 77.31 | 82.86 | 60.28 | 64.58 | 74.88 | 71.76 | 66.62 |
OrientedFormer [28] | R50 | 75.37 | 88.14 | 79.13 | 51.96 | 67.34 | 81.02 | 83.26 | 88.29 | 90.90 | 85.57 | 86.25 | 60.84 | 66.36 | 73.81 | 71.23 | 56.49 |
One-stage | |||||||||||||||||
R3Det [22] | R101 | 73.79 | 88.76 | 83.09 | 50.91 | 67.27 | 76.23 | 80.39 | 86.72 | 90.78 | 84.68 | 83.24 | 61.98 | 61.35 | 66.91 | 70.63 | 53.94 |
S2A-Net [23] | R50 | 74.12 | 89.11 | 82.84 | 48.37 | 71.11 | 78.11 | 78.38 | 87.25 | 90.83 | 84.90 | 85.64 | 60.36 | 62.60 | 65.26 | 63.19 | 57.94 |
CFA [2] | R50 | 74.51 | 88.34 | 83.09 | 51.92 | 72.23 | 79.95 | 78.68 | 87.25 | 90.90 | 85.38 | 85.71 | 59.63 | 63.05 | 73.33 | 70.36 | 47.86 |
DFDet [56] | R50 | 74.71 | 88.92 | 79.25 | 48.40 | 70.00 | 80.22 | 78.85 | 87.21 | 90.90 | 83.13 | 83.92 | 60.07 | 66.49 | 68.27 | 76.78 | 58.11 |
SASM [57] | R50 | 74.92 | 86.42 | 78.97 | 52.47 | 69.84 | 77.30 | 75.99 | 86.72 | 90.89 | 82.63 | 85.66 | 60.13 | 68.25 | 73.98 | 72.22 | 62.37 |
RTMDet [3] | CSPNeXt-t | 75.36 | 89.21 | 80.03 | 47.88 | 69.73 | 82.05 | 83.33 | 88.63 | 90.91 | 86.31 | 86.85 | 59.94 | 52.30 | 74.23 | 71.97 | 57.03 |
Two-stage | |||||||||||||||||
RoI Transformer [4] | R50 | 74.61 | 88.65 | 82.60 | 52.53 | 70.87 | 77.93 | 76.67 | 86.87 | 90.71 | 83.83 | 52.81 | 53.95 | 67.61 | 74.67 | 68.75 | 61.03 |
AOPG [12] | R50 | 75.24 | 89.27 | 83.49 | 52.50 | 69.97 | 73.51 | 82.31 | 87.95 | 90.89 | 87.64 | 84.71 | 60.01 | 66.12 | 74.19 | 68.30 | 57.80 |
DODet [58] | R50 | 75.49 | 89.34 | 84.31 | 51.39 | 71.04 | 79.04 | 82.86 | 88.15 | 90.90 | 86.88 | 84.91 | 62.69 | 67.63 | 75.47 | 72.22 | 45.54 |
Oriented RCNN [5] | R50 | 75.87 | 89.46 | 82.12 | 54.78 | 70.86 | 78.93 | 83.00 | 88.20 | 90.90 | 87.50 | 84.68 | 63.97 | 67.63 | 74.94 | 68.84 | 52.28 |
SCRDet++ [34] | R101 | 76.20 | 89.77 | 83.90 | 56.30 | 73.98 | 72.60 | 75.63 | 82.82 | 90.76 | 87.89 | 86.14 | 65.24 | 63.17 | 76.05 | 68.06 | 70.24 |
ReDet [59] | ReR50 | 76.25 | 88.79 | 82.64 | 53.97 | 74.00 | 78.16 | 84.06 | 88.04 | 90.89 | 87.78 | 85.75 | 61.76 | 60.39 | 75.96 | 68.07 | 63.59 |
DCBANet | - | 77.33 | 89.78 | 82.34 | 53.59 | 76.55 | 80.30 | 84.89 | 88.45 | 90.90 | 87.65 | 85.24 | 58.44 | 64.80 | 76.24 | 77.82 | 62.90 |
DCBANet * | - | 80.91 | 89.04 | 85.11 | 60.66 | 81.90 | 81.21 | 85.48 | 88.44 | 90.87 | 88.31 | 87.90 | 69.78 | 66.91 | 79.20 | 81.58 | 77.25 |
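
The mAP column in the table above is consistent with the unweighted mean of the 15 per-class AP values. The following sketch checks this for the DCBANet row. The list name `dcbanet_ap` is illustrative, and the values are copied from the table in column order (PL through HC).

```python
from statistics import mean

# Per-class AP for the DCBANet row (PL ... HC), copied from the table.
dcbanet_ap = [89.78, 82.34, 53.59, 76.55, 80.30, 84.89, 88.45, 90.90,
              87.65, 85.24, 58.44, 64.80, 76.24, 77.82, 62.90]

# mAP as the unweighted mean over classes, rounded to two decimals.
map_score = round(mean(dcbanet_ap), 2)
print(map_score)  # 77.33
```
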
Method | mAP | PL | SH | ST | BD | TC | BC | GTF | HA | BR | VE |
---|---|---|---|---|---|---|---|---|---|---|---|
RetinaNet [60] | 81.80 | 98.50 | 77.49 | 95.18 | 96.66 | 72.05 | 73.90 | 99.17 | 73.33 | 63.60 | 68.11 |
FCOS [61] | 82.87 | 99.77 | 75.87 | 95.28 | 97.73 | 71.05 | 77.31 | 98.67 | 78.25 | 66.35 | 68.48 |
YOLOF [62] | 85.39 | 99.32 | 82.98 | 82.78 | 98.31 | 72.89 | 87.30 | 99.16 | 83.79 | 80.20 | 67.20 |
Faster-RCNN [26] | 85.43 | 99.92 | 76.68 | 89.71 | 97.77 | 84.62 | 83.54 | 99.78 | 82.66 | 76.10 | 63.56 |
Cascade-RCNN [24] | 86.11 | 99.36 | 73.00 | 93.01 | 98.10 | 84.43 | 79.47 | 99.95 | 91.98 | 77.09 | 64.69 |
Dynamic-RCNN [25] | 87.40 | 99.69 | 73.45 | 92.93 | 96.67 | 88.83 | 86.87 | 99.88 | 87.38 | 82.22 | 66.03 |
LSKNet [14] | 88.03 | 99.78 | 82.09 | 95.54 | 97.24 | 79.54 | 73.79 | 99.98 | 88.19 | 88.79 | 75.40 |
DCBANet | 89.17 | 99.52 | 74.97 | 90.48 | 97.41 | 91.76 | 86.34 | 99.88 | 88.88 | 89.91 | 72.55 |
Method | Param (M) | FLOPs (G) | mAP | |||
---|---|---|---|---|---|---|
FCOS [61] | 32.15 | 206.72 | 70.88 | 53.43 | 84.24 | 88.04 |
Cascade-RCNN [24] | 41.42 | 225.06 | 75.95 | 59.75 | 88.92 | 91.43 |
LSKNet [14] | 31.03 | 187.16 | 77.10 | 63.75 | 86.53 | 91.03 |
ResNeXt [63] | 40.89 | 229.05 | 77.63 | 62.43 | 90.61 | 91.26 |
Dynamic-RCNN [25] | 41.75 | 225.72 | 78.31 | 62.85 | 91.36 | 92.32 |
ResNeSt [42] | 44.66 | 481.30 | 79.57 | 66.05 | 91.34 | 91.38 |
Swin-t [43] | 44.81 | 233.00 | 80.08 | 65.78 | 92.65 | 92.63 |
DCBANet | 28.49 | 176.05 | 80.27 | 67.45 | 91.14 | 91.85 |
Method | mAP@0.5:0.95 | mAP@0.5 | mAP@0.75 | ||
---|---|---|---|---|---|
YOLOF [62] | 8.5 | 20.0 | 6.2 | 8.6 | 25.9 |
Faster-RCNN [26] | 32.2 | 60.7 | 33.8 | 31.6 | 66.7 |
Dynamic-RCNN [25] | 34.5 | 60.1 | 34.9 | 33.5 | 62.2 |
Swin-t [43] | 34.7 | 57.8 | 35.5 | 35.1 | 70.5 |
ResNeSt [42] | 36.8 | 61.1 | 37.6 | 35.3 | 65.9 |
ResNeXt [63] | 37.7 | 63.1 | 40.5 | 35.9 | 72.7 |
LSKNet [14] | 40.4 | 63.2 | 44.9 | 38.2 | 76.6 |
DCBANet | 42.4 | 65.0 | 47.1 | 39.9 | 79.3 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jin, H.; Song, Y.; Bai, T.; Sun, K.; Chen, Y. A Novel Dynamic Context Branch Attention Network for Detecting Small Objects in Remote Sensing Images. Remote Sens. 2025, 17, 2415. https://doi.org/10.3390/rs17142415