Multimodal Learning and Explainable AI for Remote Sensing Image Interpretation

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: 30 June 2026 | Viewed by 994

Special Issue Editors

Dr. Le Yang

Guest Editor

The School of Geo-Science & Technology, Zhengzhou University, Zhengzhou 450001, China
Interests: remote sensing image quality improvement; calibration; intelligent processing

Prof. Dr. Xiaoli Ding

Guest Editor

Department of Land Surveying & Geo-Informatics, The Hong Kong Polytechnic University, Hong Kong, China
Interests: SAR; InSAR; PSInSAR; deformation monitoring; landslides; geohazards; GNSS; mobile surveying

Dr. Lei Shi

Guest Editor

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
Interests: processing of TomoSAR and InSAR data for forest structure estimation and vegetation height retrieval

Dr. Yuchen Li

Guest Editor

School of Geography, University of Leeds, Leeds LS2 9JT, UK
Interests: GIS; urban data science; health geography; environmental epidemiology; transportation

Dr. Weidong Sun

Guest Editor

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
Interests: SAR calibration; soil moisture retrieval; multi-source remote sensing image processing

Special Issue Information

Dear Colleagues,

With the development of modern space technology, remote sensing imagery has become increasingly prevalent. However, significant differences in radiometric and spatial properties across types of remote sensing imagery pose major challenges for accurate image interpretation. Multimodal machine learning, which processes and integrates information from diverse modalities, offers a promising solution. Multimodal learning for the interpretation of remote sensing imagery is an emerging field in Earth observation and computer vision. Because the modalities contribute unequally to the final prediction, explainable AI is needed to clarify each modality's contribution and to identify potential biases in remote sensing image interpretation.

Multimodal machine learning remains challenging in the context of rapidly evolving remote sensing, and the complexity of current models often limits their explainability and transparency. This Special Issue aims to explore recent advances, challenges, and practical applications of multimodal learning and explainable AI for interpreting multisource remote sensing imagery. Contributions may address new theories, methodologies, or applications of multimodal learning for the processing and analysis of remote sensing data, along with related challenges and opportunities.

Articles may address, but are not limited to, the following topics:

Multimodal Fusion Architectures for Heterogeneous Remote Sensing Data;
Cross-Modal Alignment, Registration Representation, and Translation Learning;
Interpretable Multimodal Learning Frameworks in Remote Sensing;
Explainable AI Techniques for Transparent Multisource Data Interpretation;
Research on Multimodal Remote Sensing Image Matching;
Domain Adaptation and Cross-Modal Transfer Learning for Remote Sensing Data;
Multimodal Data Reconstruction and Quality Enhancement with Explainable Fusion Strategies;
Application of Multimodal Machine Learning for Earth Observation: Case Studies in Land Use, Ecosystem Monitoring, and Disaster Response;
Environmental Monitoring with Multimodal Learning Technology;
Benchmarks, Datasets, and Evaluation Metrics for Multimodal Remote Sensing.

Dr. Le Yang
Prof. Dr. Xiaoli Ding
Dr. Lei Shi
Dr. Yuchen Li
Dr. Weidong Sun
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

remote sensing
multimodal machine learning
explainable AI
image interpretation
data fusion
heterogeneous data
image registration
earth observation
environmental monitoring
model interpretability

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)

Research

29 pages, 2266 KB

Open Access Article

Test-Time Candidate-Aware Dual Refinement for Remote Sensing Image–Text Retrieval

by Bofan Zhang and Hao Wu

Remote Sens. 2026, 18(9), 1389; https://doi.org/10.3390/rs18091389 - 30 Apr 2026

Viewed by 441

Abstract

Remote sensing image–text retrieval (RSITR) is a pivotal task aimed at achieving efficient bidirectional matching between visual content and textual descriptions in large-scale remote sensing databases. Nevertheless, it faces a fundamental challenge: the severe information asymmetry between sparse, abstract captions and dense, multi-scale overhead imagery. Prior works predominantly focus on learning static cross-modal representations during training; however, this frozen inference process is fundamentally limited in bridging the asymmetry due to its inability to dynamically compensate for missing details or resolve visual ambiguities in heterogeneous scenes. To overcome this limitation, we propose CADRE (Test-Time Candidate-Aware Dual Refinement), a retrieval-backbone-agnostic framework exploiting retrieved candidates as feedback for bidirectional alignment. Operating on a novel Inject-and-Suppress paradigm, CADRE comprises two complementary modules. First, the Visual-Context Injection (VCI) module addresses textual sparsity by incorporating an adaptive filtering mechanism to efficiently mine hierarchical visual evidence from high-confidence candidates and inject it into the query via a domain-adapted Multimodal Large Language Model (MLLM). Second, the Query-Guided Disambiguation (QGD) module targets visual ambiguity by generating multi-view visual hypotheses and utilizing the query as a semantic probe to suppress background noise. Extensive experiments on three standard benchmarks (RSICD, RSITMD, and UCM) demonstrate good transferability across several strong RSITR backbones.

(This article belongs to the Special Issue Multimodal Learning and Explainable AI for Remote Sensing Image Interpretation)
