MambaShadowDet: A High-Speed and High-Accuracy Moving Target Shadow Detection Network for Video SAR
Abstract
1. Introduction
2. Methodology
2.1. Preliminaries
2.2. Architecture
2.3. Mamba-Backbone
2.3.1. 2-Dimension-Selective-Scan
2.3.2. Local-Spatial Block

2.3.3. Residual-Gated Block

2.4. Slim-PAFPN
2.4.1. PAFPN
2.4.2. SCSP
3. Experiments
3.1. Dataset
3.2. Experimental Details
3.3. Evaluation Indices
4. Results
4.1. Quantitative Results
- MambaShadowDet leads in both accuracy and speed for video SAR shadow detection. It achieves an AP of 71.99%, an F1 of 80.32%, a speed of 44.44 FPS, a computational cost of only 10.65 G FLOPs, and a model size of only 4.47 M parameters (#Para), outperforming seven other state-of-the-art (SOTA) methods on these indices. This performance is largely due to its two core contributions: Mamba-Backbone and Slim-PAFPN.
- MambaShadowDet achieves notable F1 improvements, with gains of 15.16, 17.98, 5.94, 7.71, 7.83, 5.98, and 4.53 percentage points over YOLOX, RetinaNet, CenterNet, Faster R-CNN, Cascade R-CNN, Deformable DETR, and ShadowDeNet, respectively. In particular, although MambaShadowDet only slightly exceeds the second-best model, ShadowDeNet, in AP (71.99% vs. 71.90%), its F1 is considerably higher (80.32% vs. 75.79%). This demonstrates the accuracy superiority of MambaShadowDet in moving target shadow detection for video SAR images.
- MambaShadowDet achieves notable reductions in model complexity (FLOPs and #Para). Its FLOPs are lower by 2.70 G, 131.47 G, 318.65 G, 82.34 G, 82.34 G, 50.61 G, and 89.32 G than those of YOLOX, RetinaNet, CenterNet, Faster R-CNN, Cascade R-CNN, Deformable DETR, and ShadowDeNet, respectively, and its #Para is lower by 4.47 M, 27.73 M, 79.15 M, 36.88 M, 64.68 M, 35.63 M, and 43.37 M, respectively. In particular, MambaShadowDet has the lowest space complexity (4.47 M #Para) and the lowest computational complexity (10.65 G FLOPs) of all compared models. This demonstrates the model complexity superiority of MambaShadowDet in moving target shadow detection for video SAR images.
- MambaShadowDet achieves notable FPS improvements, with increases of 16.16, 6.85, 16.08, 18.40, 22.63, 33.78, and 24.00 FPS over YOLOX, RetinaNet, CenterNet, Faster R-CNN, Cascade R-CNN, Deformable DETR, and ShadowDeNet, respectively. In particular, MambaShadowDet is the only model whose inference speed exceeds 40 FPS, outperforming the second-fastest model, RetinaNet, by 6.85 FPS. This demonstrates the speed superiority of MambaShadowDet in moving target shadow detection for video SAR images.

| Method | GT | TP | FP | FN | P (%) | R (%) | AP (%) | F1 (%) | 
|---|---|---|---|---|---|---|---|---|
| YOLOX [65] | 1694 | 1140 | 665 | 554 | 63.16 | 67.30 | 64.04 | 65.16 | 
| RetinaNet [66] | 1694 | 798 | 68 | 896 | 92.15 | 47.11 | 45.63 | 62.34 | 
| CenterNet [44] | 1694 | 1064 | 103 | 630 | 91.17 | 62.81 | 60.93 | 74.38 | 
| Faster R-CNN [38] | 1694 | 1156 | 334 | 538 | 77.58 | 68.24 | 65.87 | 72.61 | 
| Cascade R-CNN [67] | 1694 | 1248 | 501 | 446 | 71.36 | 73.67 | 69.98 | 72.49 | 
| Deformable DETR [42] | 1694 | 1289 | 485 | 405 | 72.66 | 76.09 | 71.52 | 74.34 | 
| ShadowDeNet [31] | 1694 | 1263 | 376 | 431 | 77.06 | 74.56 | 71.90 | 75.79 | 
| MambaShadowDet | 1694 | 1241 | 155 | 453 | 88.90 | 73.26 | 71.99 | 80.32 | 

| Method | FLOPs (G) | #Para (M) | FPS | 
|---|---|---|---|
| YOLOX [65] | 13.35 | 8.94 | 28.28 | 
| RetinaNet [66] | 142.12 | 32.20 | 37.59 | 
| CenterNet [44] | 329.30 | 83.62 | 28.36 | 
| Faster R-CNN [38] | 92.99 | 41.35 | 26.04 | 
| Cascade R-CNN [67] | 92.99 | 69.15 | 21.81 | 
| Deformable DETR [42] | 61.26 | 40.10 | 10.66 | 
| ShadowDeNet [31] | 99.97 | 47.84 | 20.44 | 
| MambaShadowDet | 10.65 | 4.47 | 44.44 | 
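
The precision, recall, and F1 columns in the accuracy table follow directly from the detection counts. As a minimal sketch, assuming the standard definitions P = TP/(TP+FP), R = TP/(TP+FN), and F1 = 2PR/(P+R) (the paper's own definitions in Section 3.3 are not reproduced here), the following Python snippet recomputes these indices from the TP/FP/FN counts and derives the F1 gains of MambaShadowDet over each baseline. The names `COUNTS`, `prf1`, and `f1_gains` are illustrative and do not come from the paper.

```python
from typing import Dict, Tuple

# (TP, FP, FN) counts copied from the accuracy table above.
COUNTS: Dict[str, Tuple[int, int, int]] = {
    "YOLOX": (1140, 665, 554),
    "RetinaNet": (798, 68, 896),
    "CenterNet": (1064, 103, 630),
    "Faster R-CNN": (1156, 334, 538),
    "Cascade R-CNN": (1248, 501, 446),
    "Deformable DETR": (1289, 485, 405),
    "ShadowDeNet": (1263, 376, 431),
    "MambaShadowDet": (1241, 155, 453),
}


def prf1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return precision, recall, and F1 in percent (standard definitions, assumed)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return 100 * precision, 100 * recall, 100 * f1


def f1_gains(counts: Dict[str, Tuple[int, int, int]],
             ours: str = "MambaShadowDet") -> Dict[str, float]:
    """F1 gain of `ours` over every other method, in percentage points.

    Each F1 is rounded to two decimals first, mirroring how the table reports it.
    """
    ours_f1 = round(prf1(*counts[ours])[2], 2)
    return {
        name: round(ours_f1 - round(prf1(*c)[2], 2), 2)
        for name, c in counts.items()
        if name != ours
    }


if __name__ == "__main__":
    for name, c in COUNTS.items():
        p, r, f1 = prf1(*c)
        print(f"{name:16s} P={p:6.2f}  R={r:6.2f}  F1={f1:6.2f}")
    for name, gain in f1_gains(COUNTS).items():
        print(f"F1 gain over {name}: {gain:.2f}")
```

The gains are differences between rounded F1 values, so they are expressed in percentage points rather than as relative improvements.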
4.2. Qualitative Results
- MambaShadowDet demonstrates effective false alarm suppression in video SAR shadow detection. For example, in the images of the first column, every model except MambaShadowDet and RetinaNet misjudges the vehicle-shaped interference in the upper-right area of the image, producing false alarm boxes. Specifically, there are six false alarm boxes for YOLOX, none for RetinaNet, one for CenterNet, one for Faster R-CNN, one for Cascade R-CNN, two for Deformable DETR, two for ShadowDeNet, and none for MambaShadowDet. Although RetinaNet produces no false alarm boxes, it misses a large number of targets (seven), which shows its poor shadow feature extraction ability.
- MambaShadowDet demonstrates effective missed detection suppression in video SAR shadow detection. For example, in the images of the fourth column, every model except MambaShadowDet, ShadowDeNet, Deformable DETR, and Cascade R-CNN misses more than one target. Specifically, there are two missed detections for YOLOX, six for RetinaNet, two for CenterNet, two for Faster R-CNN, one for Cascade R-CNN, one for Deformable DETR, one for ShadowDeNet, and one for MambaShadowDet.
- Together, the false alarm suppression and missed detection suppression shown in the visualization results on the SNL dataset intuitively demonstrate the superiority of MambaShadowDet for video SAR shadow detection.
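
False alarm and missed detection counts like those listed above are typically obtained by matching predicted boxes to ground-truth boxes. The sketch below is one common way to do this, not the paper's confirmed protocol: it assumes axis-aligned boxes in `(x1, y1, x2, y2)` format, greedy one-to-one matching, predictions already sorted by descending confidence, and an IoU threshold of 0.5. The function names `iou` and `count_errors` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_errors(preds: List[Box], gts: List[Box],
                 thr: float = 0.5) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) for one image.

    Each prediction is greedily matched to the unmatched ground-truth box
    with the highest IoU, provided that IoU is at least `thr` (assumed 0.5).
    Unmatched predictions are false alarms (FP); unmatched ground-truth
    boxes are missed detections (FN).
    """
    matched = set()
    tp = 0
    for p in preds:  # assumed sorted by descending confidence
        best_j, best_iou = -1, thr
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    return tp, len(preds) - tp, len(gts) - tp


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
    print(count_errors(preds, gts))
```

In the toy example, the first prediction overlaps a ground-truth box well enough to count as a detection, the second prediction is a false alarm, and the second ground-truth box is a missed detection.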


5. Ablation Study
5.1. Ablation Study on Mamba-Backbone
5.2. Ablation Study on Slim-PAFPN
6. Discussion
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Abbreviation | Full Name | 
|---|---|
| Adam | Adaptive moment estimation | 
| AP | Average precision | 
| BCE | Binary cross entropy | 
| CNN | Convolutional neural network | 
| CSP | Cross-stage partial network | 
| CV | Computer vision | 
| CIoU | Complete intersection over union | 
| DETR | Detection transformer | 
| DL | Deep learning | 
| DWSConv | Depth-wise separable convolution | 
| FLOPs | Floating-point operations | 
| FPN | Feature pyramid network | 
| FPS | Frames per second | 
| GSConv | Group shuffle convolution | 
| IoU | Intersection over union | 
| LSB | Local-spatial block | 
| MLP | Multi-layer perceptron | 
| MDPI | Multidisciplinary Digital Publishing Institute | 
| NLP | Natural language processing | 
| NMS | Non-maximum suppression | 
| OBB | Oriented bounding boxes | 
| PC | Personal computer | 
| PAFPN | Path aggregation feature pyramid network | 
| RGB | Residual-gated block | 
| S4 | Structured state–space sequence | 
| SAR | Synthetic aperture radar | 
| SConv | Standard convolution | 
| SCSP | Slim cross stage partial | 
| SNL | Sandia National Laboratories | 
| SOTA | State-of-the-art | 
| SS2D | Two-dimension selective scan | 
| SSM | State space model | 
| ViT | Vision transformer | 
| ZOH | Zero-order-hold | 
References
1. Zhou, Y.; Wang, S.; Ren, H.; Hu, J.; Zou, L.; Wang, X. Multi-Level Feature-Refinement Anchor-Free Framework with Consistent Label-Assignment Mechanism for Ship Detection in SAR Imagery. Remote Sens. 2024, 16, 975.
2. Hu, Q.; Hu, S.; Liu, S. BANet: A Balance Attention Network for Anchor-Free Ship Detection in SAR Images. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5222212.
3. Zhang, T.; Zhang, X.; Shi, J.; Wei, S. Depthwise Separable Convolution Neural Network for High-Speed SAR Ship Detection. Remote Sens. 2019, 11, 2483.
4. Kong, L.; Gao, F.; He, X.; Wang, J.; Sun, J.; Zhou, H.; Hussain, A. Few-Shot Class-Incremental SAR Target Recognition Via Orthogonal Distributed Features. IEEE Trans. Aerosp. Electron. Syst. 2024, 1–18.
5. Zhang, X.; Feng, S.; Zhao, C.; Sun, Z.; Zhang, S.; Ji, K. MGSFA-Net: Multiscale Global Scattering Feature Association Network for SAR Ship Target Recognition. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 4611–4625.
6. Xu, X.; Zhang, X.; Shao, Z.; Shi, J.; Wei, S.; Zhang, T.; Zeng, T. A Group-Wise Feature Enhancement-and-Fusion Network with Dual-Polarization Feature Enrichment for SAR Ship Detection. Remote Sens. 2022, 14, 5276.
7. Hu, Q.; Hu, S.; Liu, S.; Xu, S.; Zhang, Y.-D. FINet: A Feature Interaction Network for SAR Ship Object-Level and Pixel-Level Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5239215.
8. Xu, X.; Zhang, X.; Zhang, T.; Zhang, W.; Shi, J.; Wei, S.; Shao, Z.; Xu, Y.; Zeng, T. Distribution-Based Anchor Assignment and Comprehensive Score Voting with Distance Penalty IoU Loss for SAR Remote Sensing Ship Detection. IEEE Trans. Instrum. Meas. 2024, 73, 5040318.
9. Zhang, T.; Zhang, X. High-Speed Ship Detection in SAR Images Based on a Grid Convolutional Neural Network. Remote Sens. 2019, 11, 1206.
10. Gao, F.; Kong, L.; Lang, R.; Sun, J.; Wang, J.; Hussain, A.; Zhou, H. SAR Target Incremental Recognition Based on Features With Strong Separability. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5202813.
11. Hu, X.; Xie, H.; Yi, S.; Zhang, L.; Lu, Z. An Improved NLCS Algorithm Based on Series Reversion and Elliptical Model Using Geosynchronous Spaceborne–Airborne UHF UWB Bistatic SAR for Oceanic Scene Imaging. Remote Sens. 2024, 16, 1131.
12. Zeng, T.; Zhang, T.; Shao, Z.; Xu, X.; Zhang, W.; Shi, J.; Wei, S.; Zhang, X. CFAR-DP-FW: A CFAR-guided Dual-Polarization Fusion Framework for Large Scene SAR Ship Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 7242–7259.
13. Zhong, C.; Ding, J.; Zhang, Y. Joint Tracking of Moving Target in Single-Channel Video SAR. IEEE Trans. Geosci. Remote Sens. 2021, 60, 5212718.
14. Tian, X.; Liu, J.; Mallick, M.; Huang, K. Simultaneous Detection and Tracking of Moving-Target Shadows in ViSAR Imagery. IEEE Trans. Geosci. Remote Sens. 2021, 59, 1182–1199.
15. Ding, J. Focusing Algorithms and Moving Target Detection Based on Video SAR. J. Radars 2020, 9, 321–334.
16. Huang, X.; Xu, Z.; Ding, J. Video SAR Image Despeckling by Unsupervised Learning. IEEE Trans. Geosci. Remote Sens. 2020, 59, 10151–10160.
17. Yang, X.; Shi, J.; Zhou, Y.; Wang, C.; Hu, Y.; Zhang, X.; Wei, S. Ground Moving Target Tracking and Refocusing Using Shadow in Video-SAR. Remote Sens. 2020, 12, 3083.
18. Zhou, Y.; Shi, J.; Wang, C.; Hu, Y.; Zhou, Z.; Yang, X.; Wei, S. SAR Ground Moving Target Refocusing by Combining Mre3 Network and Tvβ-Lstm. IEEE Trans. Geosci. Remote Sens. 2020, 60, 5200814.
19. Damini, A.; Balaji, B.; Parry, C.; Mantle, V. A VideoSAR Mode for the X-band Wideband Experimental Airborne Radar. In Algorithms for Synthetic Aperture Radar Imagery, 17th ed.; International Society for Optics and Photonics: Bellingham, WA, USA, 2010; p. 76990.
20. Zhang, W.; Zhang, X.; Xu, X.; Xu, Y.; Shao, Z.; Shi, J.; Wei, S.; Zeng, T. GNN-JFL: Graph Neural Network for Video SAR Shadow Tracking with Joint Motion-Appearance Feature Learning. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5209117.
21. Zhao, B.; Han, Y.; Wang, H.; Tang, L.; Liu, X.; Wang, T. Robust Shadow Tracking for Video SAR. IEEE Geosci. Remote Sens. Lett. 2020, 18, 821–825.
22. Xu, X.; Zhang, X.; Zhang, T.; Yang, Z.; Shi, J.; Zhan, X. Shadow-Background-Noise 3D Spatial Decomposition Using Sparse Low-Rank Gaussian Properties for Video-SAR Moving Target Shadow Enhancement. IEEE Geosci. Remote Sens. Lett. 2022, 19, 4516105.
23. Huang, X.; Ding, J.; Guo, Q. Unsupervised Image Registration for Video SAR. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 1075–1083.
24. He, Z.; Chen, X.; Yu, C.; Li, Z.; Yu, A.; Dong, Z. A Robust Moving Target Shadow Detection and Tracking Method for VideoSAR. J. Electron. Inf. Technol. 2021, 44, 3882–3890.
25. Bao, J.; Zhang, X.; Zhang, T.; Shi, J.; Wei, S. A Novel Guided Anchor Siamese Network for Arbitrary Target-of-Interest Tracking in Video-SAR. Remote Sens. 2021, 13, 4504.
26. Ding, J.; Wen, L.; Zhong, C.; Loffeld, O. Video SAR Moving Target Indication Using Deep Neural Network. IEEE Trans. Geosci. Remote Sens. 2020, 58, 7194–7204.
27. Wen, L.; Ding, J.; Loffeld, O. Video SAR Moving Target Detection Using Dual Faster R-CNN. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 2984–2994.
28. Huang, X.; Liang, D.; Ding, J. Moving Target Detection in Video SAR Based on Improved Faster R-CNN. In Proceedings of the European Conference on Synthetic Aperture Radar (EUSAR), Online Event, 29 March–1 April 2021; pp. 1–5.
29. Yan, H.; Huang, J.; Li, R.; Wang, X.; Zhang, J.; Zhu, D. Research on Video SAR Moving Target Detection Algorithm Based on Improved Faster Region-based CNN. J. Electron. Inf. Technol. 2021, 43, 615–622.
30. Hu, Y. Research on Shadow-Based SAR Multi-Target Tracking Method. Master's Thesis, University of Electronic Science and Technology of China, Chengdu, China, 2021.
31. Bao, J.; Zhang, X.; Zhang, T.; Xu, X. ShadowDeNet: A Moving Target Shadow Detection Network for Video SAR. Remote Sens. 2022, 14, 320.
32. Wu, Z.; Xie, H.; Gao, T.; Zhang, Y.; Liu, H. Moving Target Shadow Detection Method Based on Improved ViBe in VideoSAR Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 14575–14587.
33. Zhang, J.; Xie, H.; Zhang, L.; Lu, Z. Information Extraction and Three-Dimensional Contour Reconstruction of Vehicle Target Based on Multiple Different Pitch-Angle Observation Circular Synthetic Aperture Radar Data. Remote Sens. 2024, 16, 401.
34. Fang, H.; Liao, G.; Liu, Y.; Zeng, C.; He, X.; Meng, Q. A Dual-mode Framework for Robust Long-term Tracking in Video SAR. IEEE Sens. J. 2024, 24, 13028–13042.
35. Shang, S.; Wu, F.; Zhou, Y.; Liu, Z. Moving Target Velocity Estimation of Video SAR Based on Shadow Detection. In Proceedings of the Cross Strait Radio Science & Wireless Technology Conference (CSRSWTC), Fuzhou, China, 13–16 December 2020; pp. 1–3.
36. Yu, F.; Li, W.; Li, Q.; Liu, Y.; Shi, X.; Yan, J. POI: Multiple Object Tracking with High Performance Detection and Appearance Feature. In Proceedings of the European Conference on Computer Vision Workshops, Amsterdam, The Netherlands, 8–10 and 15–16 October 2016; pp. 36–42.
37. Wang, Z.; Zheng, L.; Liu, Y.; Li, Y.; Wang, S. Towards Real-Time Multi-Object Tracking. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 107–122.
38. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
39. Zhang, H.; Liu, Z. Moving Target Shadow Detection Based on Deep Learning in Video SAR. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Brussels, Belgium, 11–16 July 2021; pp. 4155–4158.
40. Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
41. Wang, W.; Zhou, Y.; Xie, Z.; Zhang, T.; Shi, J.; Zhang, X. Moving Target Shadow Detection using Transformer in Video SAR. In Proceedings of the IGARSS 2022 - 2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; pp. 2614–2617.
42. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv 2020, arXiv:2010.04159.
43. Wang, W.; Hu, Y.; Zou, Z.; Zhou, Y.; Wang, C.; Shi, J.; Zhang, X. Video SAR Ground Moving Target Indication Based on Multi-Target Tracking Neural Network. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 4584–4587.
44. Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint Triplets for Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 27 October–2 November 2019; pp. 6569–6578.
45. Wei, B.; Yu, A.; Dong, Z.; He, Z. Video SAR Target Detection and Tracking Method Based on Yolov5+Bytetrack. In Proceedings of the 2023 8th International Conference on Signal and Image Processing (ICSIP), Wuxi, China, 8–10 July 2023; pp. 151–155.
46. Xu, X.; Zhang, X.; Zhang, T. Lite-YOLOv5: A Lightweight Deep Learning Detector for On-Board Ship Detection in Large-Scene Sentinel-1 SAR Images. Remote Sens. 2022, 14, 1018.
47. Gu, A.; Dao, T. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv 2023, arXiv:2312.00752.
48. Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Liu, Y. VMamba: Visual State Space Model. arXiv 2024, arXiv:2401.10166.
49. Zhu, L.; Liao, B.; Zhang, Q.; Wang, X.; Liu, W.; Wang, X. Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model. arXiv 2024, arXiv:2401.09417.
50. National Technology and Engineering Solutions of Sandia. Pathfinder Radar ISR & SAR Systems. Eubank Gate and Traffic VideoSAR. 2021. Available online: http://www.sandia.gov/radar/video (accessed on 1 November 2024).
51. Gu, A.; Goel, K.; Ré, C. Efficiently Modeling Long Sequences with Structured State Spaces. arXiv 2022, arXiv:2111.00396.
52. Li, Y.; Hu, J.; Wen, Y.; Evangelidis, G.; Salahi, K.; Wang, Y.; Tulyakov, S.; Ren, J. Rethinking Vision Transformers for MobileNet Size and Speed. In Proceedings of the IEEE International Conference on Computer Vision, Paris, France, 4–6 October 2023.
53. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929.
54. Wang, Z.; Li, C.; Xu, H.; Zhu, X. Mamba YOLO: SSMs-Based YOLO For Object Detection. arXiv 2024, arXiv:2406.05835v1.
55. Dauphin, Y.N.; Fan, A.; Auli, M.; Grangier, D. Language Modeling with Gated Convolutional Networks. arXiv 2017, arXiv:1612.08083.
56. Rajagopal, A.; Nirmala, V. Convolutional Gated MLP: Combining Convolutions and gMLP. In Proceedings of the International Conference on Big Data, Machine Learning, and Applications, Singapore, 19–20 December 2021; Springer Nature: Singapore, 2021; pp. 721–735.
57. Liu, S.; Qi, L.; Qin, H. Path Aggregation Network for Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768.
58. Li, H.; Li, J.; Wei, H.; Liu, Z.; Zhan, Z.; Ren, Q. Slim-Neck by GSConv: A Lightweight-Design for Real-Time Detector Architectures. J. Real-Time Image Process. 2024, 21, 62.
59. Ketkar, N. Introduction to Pytorch. Deep Learning with Python: A Hands-On Introduction; Apress: Berkeley, CA, USA, 2017; pp. 195–208. Available online: https://link.springer.com/chapter/10.1007/978-1-4842-2766-4_12 (accessed on 1 November 2024).
60. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980.
61. Hosang, J.; Benenson, R.; Schiele, B. Learning Non-Maximum Suppression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4507–4515.
62. Zheng, Z.; Wang, P.; Liu, W.; Li, J.; Ye, R.; Ren, D. Distance-IoU loss: Faster and Better Learning for Bounding Box Regression. In Proceedings of the 34th AAAI Conference on Artificial Intelligence, AAAI 2020, New York, NY, USA, 7–12 February 2020; pp. 12993–13000.
63. Zhang, T.; Zhang, X. Shipdenet-20: An Only 20 Convolution Layers and <1-Mb Lightweight SAR Ship Detector. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1234–1238.
64. Li, X.; Lv, C.; Wang, W.; Li, G.; Yang, L.; Yang, J. Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 3139–3153.
65. Ge, Z.; Liu, S.; Wang, F.; Li, Z.; Sun, J. YOLOX: Exceeding YOLO Series in 2021. arXiv 2021, arXiv:2107.08430.
66. Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 318–327.
67. Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into High Quality Object Detection. In Proceedings of the 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, 18–22 June 2018; pp. 6154–6162.

| Mamba-Backbone | GT | TP | FP | FN | P (%) | R (%) | AP (%) | F1 (%) | 
|---|---|---|---|---|---|---|---|---|
| ✕ | 1694 | 1165 | 223 | 529 | 83.93 | 68.77 | 65.69 | 75.60 | 
| ✓ | 1694 | 1241 | 155 | 453 | 88.90 | 73.26 | 71.99 | 80.32 | 

| Slim-PAFPN | GT | TP | FP | FN | P (%) | R (%) | AP (%) | F1 (%) | 
|---|---|---|---|---|---|---|---|---|
| ✕ | 1694 | 1169 | 58 | 525 | 95.27 | 69.01 | 68.87 | 80.04 | 
| ✓ | 1694 | 1241 | 155 | 453 | 88.90 | 73.26 | 71.99 | 80.32 | 

| Slim-PAFPN | FLOPs (G) | #Para (M) | FPS | 
|---|---|---|---|
| ✕ | 11.36 | 4.59 | 41.32 | 
| ✓ | 10.65 | 4.47 | 44.44 | 

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xu, X.; Zhang, T.; Zhang, X.; Zhang, W.; Ke, X.; Zeng, T. MambaShadowDet: A High-Speed and High-Accuracy Moving Target Shadow Detection Network for Video SAR. Remote Sens. 2025, 17, 214. https://doi.org/10.3390/rs17020214