

Bridging AI and Remote Sensing: Multimodal Learning for Advanced Semantic Understanding

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 31 March 2026

Special Issue Editors

School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
Interests: big data; machine learning; remote sensing

Guest Editor
College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China
Interests: wireless sensor network; forestry internet of things

Special Issue Information

Dear Colleagues,

Advanced semantic understanding of remote sensing data is central to fully unleashing the potential of remote sensing technology, providing core support for recognizing complex land cover patterns, capturing urban dynamics, analyzing environmental evolution, and responding to disaster events. However, traditional semantic analysis methods have long struggled to cope with the inherent complexity of remote sensing data: first, the fusion of multi-source data (such as optical data, synthetic aperture radar data, LiDAR data, and auxiliary geospatial information) faces substantial challenges in feature alignment and information fusion; second, the ambiguity of semantic concepts in large-scale heterogeneous scenes severely limits the accuracy of interpretation results; and third, the urgent need for real-time, fine-grained analysis has far exceeded the capabilities of single-modality or rule-based methods.

In recent years, the rapid advancement of artificial intelligence (AI), particularly multimodal learning, has opened up unprecedented opportunities to bridge the gap between AI and remote sensing. Multimodal learning techniques enable the synergistic integration of complementary information from diverse data sources, allowing models to capture richer contextual and structural features than single-modality methods can. This fusion not only enhances the robustness and accuracy of semantic understanding but also expands the scope of remote sensing applications, from precision agriculture and smart cities to climate change monitoring and national security. Yet, despite promising progress, critical challenges remain: how can we effectively handle the heterogeneity of remote sensing modalities, how can we design lightweight multimodal models adaptable to resource-constrained platforms, how can we ensure model interpretability in high-stakes applications, and how can we generalize models across different geographic regions and scenarios?


To address these challenges and showcase cutting-edge research at the intersection of AI and remote sensing, we are pleased to announce the launch of a Special Issue entitled "Bridging AI and Remote Sensing: Multimodal Learning for Advanced Semantic Understanding".

This special issue aims to provide a premier platform for researchers and practitioners to share innovative methods, novel applications, and insights into multimodal learning-driven semantic understanding in remote sensing. We welcome high-quality original research papers, review articles, and technical notes that advance the state-of-the-art in this rapidly evolving field.

Topics of interest include, but are not limited to:

  • Multimodal data fusion strategies for remote sensing semantic understanding (e.g., fusion of optical, Synthetic Aperture Radar (SAR), LiDAR, and remote sensing text data);
  • Novel multimodal model architectures tailored for remote sensing data (e.g., transformer-based, graph neural network-based, and diffusion model-based multimodal methods);
  • Few-shot, zero-shot, and transfer learning for multimodal remote sensing semantic understanding;
  • Interpretability and trustworthiness of multimodal remote sensing models;
  • Lightweight and edge-deployable multimodal models for real-time remote sensing applications;
  • Multimodal remote sensing semantic segmentation, object detection, and scene classification;
  • Applications of multimodal remote sensing semantic understanding (e.g., urban planning, crop yield estimation, disaster assessment, biodiversity monitoring);
  • Benchmark datasets and evaluation metrics for multimodal remote sensing semantic tasks;
  • Challenges and solutions for handling noisy, incomplete, or imbalanced multimodal remote sensing data.
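To make the first topic above concrete, the sketch below illustrates the simplest feature-level fusion strategy for two co-registered modalities: per-band normalization followed by band stacking. This is a toy example of our own (the shapes, band counts, and normalization choice are illustrative assumptions), not a method prescribed by this call.

```python
import numpy as np

def zscore(x, axis=(0, 1), eps=1e-8):
    """Per-band z-score normalization so modalities share a common scale."""
    mu = x.mean(axis=axis, keepdims=True)
    sd = x.std(axis=axis, keepdims=True)
    return (x - mu) / (sd + eps)

def late_fuse(optical, sar):
    """Feature-level fusion: normalize each modality, then stack along bands.

    optical: (H, W, C1) reflectance bands; sar: (H, W, C2) backscatter bands.
    Returns an (H, W, C1 + C2) array a downstream classifier can consume.
    """
    assert optical.shape[:2] == sar.shape[:2], "modalities must be co-registered"
    return np.concatenate([zscore(optical), zscore(sar)], axis=-1)

# Toy co-registered patches: 4 optical bands, 2 SAR polarizations.
rng = np.random.default_rng(0)
fused = late_fuse(rng.normal(size=(64, 64, 4)), rng.normal(size=(64, 64, 2)))
print(fused.shape)  # (64, 64, 6)
```

Much of the research solicited here concerns what to do when this naive stacking fails, e.g., when modalities are misaligned, missing, or statistically heterogeneous.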

Prof. Dr. Min Xia
Dr. Liguo Weng
Dr. Haifeng Lin
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • multimodal learning
  • remote sensing semantic understanding
  • multi-source remote sensing data fusion
  • deep learning

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (1 paper)


Research

26 pages, 11141 KB  
Article
MISA-Net: Multi-Scale Interaction and Supervised Attention Network for Remote-Sensing Image Change Detection
by Haoyu Yin, Junzhe Wang, Shengyan Liu, Yuqi Wang, Yi Liu, Tengyue Guo and Min Xia
Remote Sens. 2026, 18(2), 376; https://doi.org/10.3390/rs18020376 - 22 Jan 2026
Cited by 1
Abstract
Change detection in remote sensing imagery plays a vital role in land use analysis, disaster assessment, and ecological monitoring. However, existing remote sensing change detection methods often lack a structured and tightly coupled interaction paradigm to jointly reconcile multi-scale representation, bi-temporal discrimination, and fine-grained boundary modeling under practical computational constraints. To address this fundamental challenge, we propose a Multi-scale Interaction and Supervised Attention Network (MISANet). To improve the model’s ability to perceive changes at multiple scales, we design a Progressive Multi-Scale Feature Fusion Module (PMFFM), which employs a progressive fusion strategy to effectively integrate multi-granular cross-scale features. To enhance the interaction between bi-temporal features, we introduce a Difference-guided Gated Attention Interaction (DGAI) module. This component leverages difference information between the two time phases and employs a gating mechanism to retain fine-grained details, thereby improving semantic consistency. Furthermore, to guide the model’s focus on change regions, we design a Supervised Attention Decoder Module (SADM). This module utilizes a channel–spatial joint attention mechanism to reweight the feature maps. In addition, a deep supervision strategy is incorporated to direct the model’s attention toward both fine-grained texture differences and high-level semantic changes during training. Experiments conducted on the LEVIR-CD, SYSU-CD, and GZ-CD datasets demonstrate the effectiveness of our method, achieving F1-scores of 91.19%, 82.25%, and 88.35%, respectively. Compared with the state-of-the-art BASNet model, MISANet achieves performance gains of 0.50% F1 and 0.85% IoU on LEVIR-CD, 2.13% F1 and 3.02% IoU on SYSU-CD, and 1.28% F1 and 2.03% IoU on GZ-CD. The proposed method demonstrates strong generalization capabilities and is applicable to various complex change detection scenarios.
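The channel–spatial joint attention idea mentioned in the abstract can be understood generically as computing a per-channel weight and a per-pixel weight and using both to rescale the feature map. The parameter-free sketch below is our own illustration of that general pattern, not code from the paper; the actual SADM is a learned module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_reweight(feat):
    """Generic channel-spatial joint attention over a (C, H, W) feature map.

    Channel weights come from global average pooling over space; spatial
    weights come from the mean over channels. Both are squashed to (0, 1)
    and applied multiplicatively. Illustrative only: no learned parameters.
    """
    c_att = sigmoid(feat.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    s_att = sigmoid(feat.mean(axis=0, keepdims=True))       # (1, H, W)
    return feat * c_att * s_att

feat = np.random.default_rng(1).normal(size=(8, 16, 16))
out = channel_spatial_reweight(feat)
print(out.shape)  # (8, 16, 16)
```

Because both attention maps lie in (0, 1), the reweighting can only attenuate responses, which is why such modules are typically paired with residual connections in practice.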