Article

SAR-to-Optical Remote Sensing Image Translation Method Based on InternImage and Cascaded Multi-Head Attention

College of Electrical and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(1), 55; https://doi.org/10.3390/rs18010055
Submission received: 11 November 2025 / Revised: 18 December 2025 / Accepted: 23 December 2025 / Published: 24 December 2025

Abstract

Synthetic aperture radar (SAR), with its all-weather and all-day observation capabilities, plays a significant role in the field of remote sensing. However, due to the unique imaging mechanism of SAR, its interpretation is challenging. To enhance the interpretability of SAR images, translating them into optical remote sensing images has become a research hotspot in recent years. This paper proposes a deep learning-based method for SAR-to-optical remote sensing image translation. The network comprises three parts: a global representor, a generator with cascaded multi-head attention, and a multi-scale discriminator. The global representor, built upon InternImage with deformable convolution v3 (DCNv3) as its core operator, leverages its global receptive field and adaptive spatial aggregation capabilities to extract global semantic features from SAR images. The generator follows the classic "encoder-bottleneck-decoder" structure, where the encoder focuses on extracting local detail features from SAR images. The cascaded multi-head attention module within the bottleneck layer optimizes local detail features and facilitates feature interaction between global semantics and local details. The discriminator adopts a multi-scale structure based on the local receptive field PatchGAN, enabling joint global and local discrimination. Furthermore, for the first time in SAR image translation tasks, structural similarity index metric (SSIM) loss is combined with adversarial loss, perceptual loss, and feature matching loss as the loss function. A series of experiments demonstrate the effectiveness and reliability of the proposed method. Compared to mainstream image translation methods, our method generates higher-quality optical remote sensing images that are semantically consistent, texturally authentic, clearly detailed, and visually plausible.
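The combined objective described above (adversarial + perceptual + feature-matching + SSIM loss) can be sketched in plain NumPy. This is a minimal illustration, not the paper's implementation: the single-window SSIM below omits the usual Gaussian sliding window, and the weight `w_ssim` is an illustrative placeholder, not a value from the article.

```python
import numpy as np

def ssim(x, y, c1=0.01**2, c2=0.03**2):
    """Global SSIM for two images scaled to [0, 1].

    Simplified: statistics are computed over the whole image rather
    than over Gaussian-weighted local windows."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    return num / den

def total_loss(adv, perc, fm, pred, target, w_ssim=1.0):
    """Illustrative combined objective: adversarial + perceptual +
    feature-matching losses plus an SSIM term written as (1 - SSIM),
    so that higher structural similarity lowers the loss."""
    return adv + perc + fm + w_ssim * (1.0 - ssim(pred, target))
```

Writing the SSIM term as `1 - SSIM` keeps the objective a minimization problem: identical generator output and target give SSIM = 1, so the term vanishes.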
Keywords: SAR-to-optical image translation; InternImage; cascaded multi-head attention; structural similarity index metric loss

Share and Cite

MDPI and ACS Style

Xu, C.; Kong, Y. SAR-to-Optical Remote Sensing Image Translation Method Based on InternImage and Cascaded Multi-Head Attention. Remote Sens. 2026, 18, 55. https://doi.org/10.3390/rs18010055

