A Learnable Feature Processing Front-End Based Multimodal Fusion Network for SAR Ship Classification

Wang, Bowen; Liu, Liguo; Zhang, Qingyi

doi:10.3390/rs18101610

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article


by Bowen Wang †, Liguo Liu * and Qingyi Zhang †

Naval University of Engineering, Wuhan 430033, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Remote Sens. 2026, 18(10), 1610; https://doi.org/10.3390/rs18101610

Submission received: 25 March 2026 / Revised: 29 April 2026 / Accepted: 7 May 2026 / Published: 17 May 2026

(This article belongs to the Special Issue Multimodal Data Fusion for Synthetic Aperture Radar (SAR) Image Processing)


Abstract

Ship classification in synthetic aperture radar (SAR) imagery is essential for maritime surveillance but remains challenging due to limited resolution, insufficient textural details, and difficulties in effectively fusing multimodal information. Existing methods either rely on handcrafted features with limited adaptability or employ simplistic fusion strategies that fail to fully exploit the complementary guidance across modalities. To address these issues, we propose a multimodal fusion network based on a learnable feature preprocessing front-end (LFPF-MFN), which integrates polarimetric, textural, and geometric information in an end-to-end learnable manner. Specifically, LFPF-MFN introduces a learnable preprocessing front-end to embed scattering and enhanced textural features. Meanwhile, geometric information from the Automatic Identification System (AIS) is incorporated through textual embedding, and effective multimodal fusion is achieved via a bidirectional cross-attention mechanism. Extensive experiments on the OpenSARShip 2.0 dataset demonstrate that the proposed method achieves state-of-the-art performance in both three-class and six-class classification tasks, validating the effectiveness of each designed module and the superiority of the multimodal fusion strategy.

Keywords: multimodal fusion; learnable feature preprocessing; dual-polarization SAR; AIS; scattering and texture feature; ship classification
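The bidirectional cross-attention fusion named in the abstract can be illustrated with a minimal sketch. This page does not describe the authors' architecture, so everything below is an assumption: a generic single-head scaled dot-product cross-attention in plain Python, with hypothetical names (`cross_attention`, `bidirectional_fusion`, `image_tokens`, `ais_tokens`), random untrained weights, residual connections, and a mean-pool-then-concatenate readout chosen only for illustration.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Numerically stable softmax applied to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def cross_attention(queries, context, w_q, w_k, w_v):
    """Tokens in `queries` attend to tokens in `context` (single head)."""
    q = matmul(queries, w_q)
    k = matmul(context, w_k)
    v = matmul(context, w_v)
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = [[s * scale for s in row] for row in matmul(q, transpose(k))]
    weights = softmax_rows(scores)
    return matmul(weights, v), weights

def add_residual(tokens, update):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(tokens, update)]

def mean_pool(tokens):
    return [sum(col) / len(tokens) for col in zip(*tokens)]

def bidirectional_fusion(image_tokens, ais_tokens, params):
    """Run attention in both directions, then pool and concatenate.

    Assumption: residual connections and mean pooling are illustrative
    choices, not taken from the paper.
    """
    img_upd, img_w = cross_attention(image_tokens, ais_tokens, *params["img_from_ais"])
    ais_upd, ais_w = cross_attention(ais_tokens, image_tokens, *params["ais_from_img"])
    img_out = add_residual(image_tokens, img_upd)
    ais_out = add_residual(ais_tokens, ais_upd)
    fused = mean_pool(img_out) + mean_pool(ais_out)  # list concatenation
    return fused, img_w, ais_w

rng = random.Random(0)
d = 8                                      # hypothetical embedding width
image_tokens = random_matrix(6, d, rng)    # stand-in for SAR feature tokens
ais_tokens = random_matrix(4, d, rng)      # stand-in for AIS text-token embeddings
params = {
    "img_from_ais": [random_matrix(d, d, rng) for _ in range(3)],
    "ais_from_img": [random_matrix(d, d, rng) for _ in range(3)],
}
fused, img_w, ais_w = bidirectional_fusion(image_tokens, ais_tokens, params)
```

In this sketch each modality queries the other, so `img_w` has one row per image token and one column per AIS token, and `ais_w` the reverse. The fused vector of width `2 * d` would then feed a classifier head, which is omitted here.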

Share and Cite

MDPI and ACS Style

Wang, B.; Liu, L.; Zhang, Q. A Learnable Feature Processing Front-End Based Multimodal Fusion Network for SAR Ship Classification. Remote Sens. 2026, 18, 1610. https://doi.org/10.3390/rs18101610

