Hierarchical Local-Global Feature Fusion Network for Robust Ship Target Recognition in Complex Maritime Environment
Highlights
- This paper proposes a hierarchical local-global feature fusion model that integrates local structural features extracted by convolutional neural networks with global semantic dependencies modeled by Transformer architectures through a progressive multilayer self-attention mechanism.
- Extensive experiments on both the FUSAR dataset and a measured dataset demonstrate that the proposed model achieves higher classification accuracy and F1 scores than traditional CNNs, pure Transformer models, and representative recent vision architectures, while maintaining competitive inference efficiency. The model also exhibits strong robustness under low signal-to-noise ratios and limited-sample conditions.
- Hierarchical encoding of local structural features and global contextual dependencies provides a novel approach for extracting vessel target features under complex sea conditions, enhancing the reliability of maritime target recognition.
- Transfer learning methods based on partial fine-tuning can efficiently adapt to limited labeled data, enabling rapid deployment of high-precision recognition systems in resource-constrained environments.
Abstract
1. Introduction
2. Methods
2.1. Overall Framework
2.2. Datasets and Data Preparation
2.2.1. Datasets
2.2.2. Data Preparation and Augmentation
- Three-channel conversion of images to better approximate real-world visual information and help the network capture finer target details;
- Dimensionality restructuring to accommodate the network input requirements;
- Central cropping to reinforce the model’s focus on the central area of targets;
- Random rotation to increase the diversity of data viewpoints and enable the model to adapt to different observation angles;
- Addition of Gaussian noise to improve the model’s robustness against noise conditions;
- Tensorization and normalization of data to ensure consistency and effectiveness of data inputs.
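The preparation steps listed above can be sketched as a small NumPy pipeline. This is only an illustration under stated assumptions, not the authors' implementation: the resize and crop sizes (256 and 224), the rotation range, the noise standard deviation, and the normalization mean and standard deviation are hypothetical values, "dimensionality restructuring" is interpreted here as a resize to a fixed input size, and nearest-neighbour sampling is used only to keep the example dependency-free.

```python
import numpy as np

def to_three_channels(img):
    """Replicate a single-channel (H, W) image into (H, W, 3)."""
    return np.repeat(img[:, :, None], 3, axis=2)

def resize_nearest(img, size):
    """Nearest-neighbour resize to (size, size); assumed stand-in for restructuring."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def center_crop(img, size):
    """Crop a size x size window around the image centre."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def random_rotate(img, max_deg, rng):
    """Rotate about the centre by a random angle in [-max_deg, max_deg]."""
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate its source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(img)  # pixels rotated in from outside stay zero
    out[valid] = img[sy[valid], sx[valid]]
    return out

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return img + rng.normal(0.0, sigma, size=img.shape)

def normalize(img, mean, std):
    """Channel-wise standardisation, then HWC -> CHW layout as float32."""
    out = (img - mean) / std
    return np.transpose(out, (2, 0, 1)).astype(np.float32)

def prepare(img, rng, resize=256, crop=224, max_deg=15.0, sigma=0.05):
    """Apply the six steps in the listed order (all defaults are assumptions)."""
    x = img.astype(np.float64) / 255.0        # scale 8-bit values to [0, 1]
    x = to_three_channels(x)
    x = resize_nearest(x, resize)
    x = center_crop(x, crop)
    x = random_rotate(x, max_deg, rng)
    x = add_gaussian_noise(x, sigma, rng)
    mean = np.array([0.5, 0.5, 0.5])          # placeholder statistics
    std = np.array([0.5, 0.5, 0.5])
    return normalize(x, mean, std)

rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(300, 280), dtype=np.uint8)  # synthetic image
x = prepare(demo, rng)
print(x.shape)  # (3, 224, 224)
```

In practice the rotation and noise steps would normally be applied only to training samples, with the validation and test data passing through the deterministic steps alone.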
2.3. Transfer Learning Strategy
2.4. Network Architecture of HLGF-Net
2.4.1. Overview of the Network Architecture
2.4.2. Hierarchical Local–Global Feature Modeling Principle
- Extracting the classification token, which serves as a global representation of the input;
- Applying global average pooling (GAP) over the 49 patch embeddings to obtain a pooled feature vector;
- Concatenating the classification token and the pooled feature vector to form the final feature vector.
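The three steps above amount to a simple token-level fusion, sketched below in NumPy. The embedding width `D = 768`, the function name `fuse_cls_and_gap`, and the convention that row 0 of the token matrix holds the classification token are assumptions made for illustration; only the count of 49 patch embeddings comes from the text.

```python
import numpy as np

def fuse_cls_and_gap(tokens):
    """Fuse a (1 + N, D) token matrix into a single (2 * D,) feature vector.

    Row 0 is assumed to be the classification token; rows 1..N are the
    patch embeddings.
    """
    cls_token = tokens[0]                 # global representation of the input
    pooled = tokens[1:].mean(axis=0)      # global average pooling over patches
    return np.concatenate([cls_token, pooled])

rng = np.random.default_rng(0)
D = 768                                   # hypothetical embedding width
tokens = rng.standard_normal((1 + 49, D)) # 1 classification token + 49 patches
fused = fuse_cls_and_gap(tokens)
print(fused.shape)  # (1536,)
```

Concatenation doubles the feature width, so a classifier head placed on top of the fused vector would take an input of size `2 * D` under these assumptions.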
3. Results
3.1. Experimental Setup
3.2. Overall Performance Comparison
3.3. Robustness Evaluation Under Different SNR Conditions
3.4. Confusion Matrix Analysis
3.5. Ablation Analysis
4. Discussion
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest






| Category | Length Overall (m) | Beam (m) | Draft (m) | Sample Size |
|---|---|---|---|---|
| General Cargo (353274000) | 153 | 24 | 9.2 | 200 |
| Vehicle Carrier A (372399000) | 200 | 32 | 9.0 | 205 |
| Vehicle Carrier B (431546000) | 180 | 32 | 7.7 | 200 |
| Vehicle Carrier C (477625700) | 179 | 32 | 7.6 | 219 |
| Rescue Vessel (413021140) | 99 | 15 | 6.0 | 195 |
| Container Ship (413454350) | 136 | 23 | 7.8 | 254 |
| Dredger (413699050) | 84 | 13 | 3.9 | 205 |

| Model | Final Test Accuracy | F1 Score | Parameters (M) | Computational Cost (MACs) | Inference Time per Sample (ms) | Training Time per Sample (ms) |
|---|---|---|---|---|---|---|
| LeNet | 58.02% | 0.4780 | 5.41 | 55.56 MMac | 0.635 | 2.155 |
| ResNet50 | 86.77% | 0.8683 | 25.56 | 4.13 GMac | 2.214 | 8.323 |
| VGG | 84.73% | 0.8467 | 119.57 | 15.52 GMac | 3.965 | 6.821 |
| ViT | 82.44% | 0.8235 | 85.80 | 16.87 GMac | 23.931 | 5.395 |
| Proposed | 91.35% | 0.9130 | 155.21 | 31.66 GMac | 3.902 | 12.157 |
| Swin-T | 86.77% | 0.8667 | 27.52 | 2.19 GMac | 3.368 | 10.525 |
| ConvNeXt-T | 86.73% | 0.8681 | 27.82 | 4.46 GMac | 2.859 | 18.480 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, X.; Zhang, S.; Chen, S.; Li, J.; Luo, Y. Hierarchical Local-Global Feature Fusion Network for Robust Ship Target Recognition in Complex Maritime Environment. Sensors 2026, 26, 29. https://doi.org/10.3390/s26010029

