

Artificial Intelligence Remote Sensing for Earth Observation

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: 15 September 2025 | Viewed by 5666

Special Issue Editors


Guest Editor
School of Cybersecurity, Northwestern Polytechnical University, Xi’an 710129, China
Interests: remote sensing; image processing; visual language model

Guest Editor
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi’an 710071, China
Interests: deep learning; object detection and tracking; reinforcement learning; hyperspectral image processing

Guest Editor
Department of Aerospace and Geodesy, Technical University of Munich, 85521 Munich, Germany
Interests: remote sensing; image segmentation; visual language model

Guest Editor
School of Computer Science, Xi’an University of Posts & Telecommunications, Xi’an 710121, China
Interests: remote sensing; image processing; machine learning
Special Issues, Collections and Topics in MDPI journals

Special Issue Information

Dear Colleagues,

Remote sensing imaging captures electromagnetic radiation across various wavelengths, producing multimodal images with rich information. Consequently, remote sensing images have a wide range of applications in Earth observation, including environmental monitoring, agriculture, urban planning, and geological exploration. The development of artificial intelligence (AI) presents both opportunities and challenges for remote sensing-based Earth observations. Over the past decade, researchers have observed significant advancements in remote sensing image processing techniques driven by deep learning.

In recent years, the field of AI has seen new developments. The remarkable success of ChatGPT has sparked a renewed wave of interest in AI, and advances in visual language models (VLMs) have pushed this enthusiasm to new heights. As in previous waves, remote sensing is embracing these advances and reaching a new level. Technological progress has enabled the design of more efficient and lightweight AI models for specific remote sensing tasks, and even allows us to move beyond traditional discriminative models, using a generative paradigm to solve problems that were previously impossible to model. However, the application of new techniques such as Mamba, test-time training (TTT), and VLMs in remote sensing is still relatively limited, and their use must overcome challenges unique to the field, such as modality gaps and resolution differences. Therefore, more effort should be devoted to exploiting advanced AI techniques, e.g., CLIP, VLMs, Mamba, and large foundation models, to facilitate the wide application of remote sensing images.
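Among the techniques named above, contrastive language–image pretraining (as in CLIP) is also one of this Special Issue's keyword topics. The sketch below shows the symmetric contrastive objective in minimal NumPy form; the image and text encoders are omitted, and the input arrays stand in for their batch embeddings:

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    img_emb, txt_emb: (N, D) arrays; row i of each is a matched pair.
    """
    # L2-normalize so the dot product is cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature           # (N, N) similarity matrix
    labels = np.arange(len(logits))              # matched pairs lie on the diagonal

    def cross_entropy(lg, lb):
        lg = lg - lg.max(axis=1, keepdims=True)  # numerical stability
        log_probs = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(lb)), lb].mean()

    # average of the image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits, labels) + cross_entropy(logits.T, labels))
```

Training a batch of N matched remote sensing image–caption pairs with this loss pulls the diagonal of the similarity matrix up and pushes all mismatched pairs down, which is the mechanism behind the "contrastive language and remote sensing image pretraining" topic below.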

For this Special Issue, we encourage submissions that utilise advanced AI techniques to address remote sensing image processing tasks. This includes both traditional tasks such as image segmentation and fusion, and emerging tasks such as remote sensing-based visual question answering (VQA) and AI for scientific applications.

This Special Issue welcomes high-quality submissions that provide the community with the most recent advancements in remote sensing for Earth observation, including but not limited to the following:

  • Spatial and spectral remote sensing image super-resolution;
  • Remote sensing image segmentation/classification;
  • Multimodal remote sensing image fusion;
  • Remote sensing object detection;
  • Contrastive language and remote sensing image pretraining;
  • Remote sensing image-based visual language model for Earth observation;
  • Other topics on applications of remote sensing for Earth observation.

Prof. Dr. Haokui Zhang
Prof. Dr. Jie Feng
Dr. Xizhe Xue
Dr. Chen Ding
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image super-resolution
  • image segmentation
  • image classification
  • multimodal fusion
  • language–image contrastive learning
  • visual language model

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (5 papers)


Research

21 pages, 1426 KiB  
Article
Adaptive Conditional Reasoning for Remote Sensing Visual Question Answering
by Yiqun Gao, Zongwen Bai, Meili Zhou, Bolin Jia, Peiqi Gao and Rui Zhu
Remote Sens. 2025, 17(8), 1338; https://doi.org/10.3390/rs17081338 - 9 Apr 2025
Viewed by 291
Abstract
Remote Sensing Visual Question Answering (RS-VQA) is a research task that combines remote sensing image processing and natural language understanding. The increasing complexity and diversity of question types in RS-VQA pose significant challenges for unified multimodal reasoning within a single model architecture. We therefore propose the Adaptive Conditional Reasoning (ACR) network, a novel framework that dynamically tailors reasoning pathways to question semantics through type-aware feature fusion. The ACR module selectively applies different reasoning strategies depending on whether a question is open-ended or closed-ended, tailoring the reasoning process to the specific nature of the question. To enhance multimodal fusion across question types, the ACR model further integrates visual and textual features through type-guided cross-attention. Meanwhile, a Dual-Reconstruction Feature Enhancer mitigates spatial and channel redundancy in remote sensing images via spatial and channel reconstruction convolution, enhancing discriminative feature extraction for key regions. Experimental results demonstrate that our method achieves 78.5% overall accuracy on the EarthVQA dataset, showcasing the effectiveness of adaptive reasoning in remote sensing applications.
(This article belongs to the Special Issue Artificial Intelligence Remote Sensing for Earth Observation)
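The type-guided cross-attention named in the abstract can be illustrated with a deliberately simplified single-head sketch. The shapes and the per-region gate are hypothetical; the actual ACR module is defined in the paper:

```python
import numpy as np

def type_guided_cross_attention(q_text, k_img, v_img, type_gate):
    """Cross-attention where a question-type gate reweights the visual
    regions before the softmax (a simplified illustration).

    q_text:    (Lq, D) text-token queries
    k_img:     (Lv, D) visual-region keys
    v_img:     (Lv, D) visual-region values
    type_gate: (Lv,)   weights derived from the question type
    """
    d = q_text.shape[1]
    scores = q_text @ k_img.T / np.sqrt(d)        # (Lq, Lv) scaled dot products
    scores = scores + np.log(type_gate + 1e-9)    # bias attention toward gated regions
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v_img                           # (Lq, D) fused features
```

Setting a region's gate to zero effectively removes it from the attention pool, which is how a type signal (open-ended vs. closed-ended) can steer which visual evidence the answer is computed from.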

22 pages, 8743 KiB  
Article
A Lightweight and Adaptive Image Inference Strategy for Earth Observation on LEO Satellites
by Bo Wang, Yuhang Fang, Dongyan Huang, Zelin Lu and Jiaqi Lv
Remote Sens. 2025, 17(7), 1175; https://doi.org/10.3390/rs17071175 - 26 Mar 2025
Viewed by 263
Abstract
Low Earth Orbit (LEO) satellites equipped with image inference capabilities (LEO-IISat) offer significant potential for Earth Observation (EO) missions. However, the dual challenges of limited computational capacity and an unbalanced energy supply present significant obstacles. This paper introduces the Accuracy-Energy Efficiency (AEE) index to quantify inference accuracy per unit of energy consumption and evaluate the inference performance of LEO-IISat. It also proposes a lightweight and adaptive image inference strategy based on a Markov Decision Process (MDP) and a Deep Q Network (DQN), which dynamically optimizes model selection to balance accuracy and energy efficiency under varying conditions. Simulations demonstrate a 31.3% improvement in inference performance over a fixed-model strategy at the same energy consumption, achieving a maximum inference accuracy of 91.8% and an average inference accuracy of 89.1%. Compared to MDP-Policy Gradient and MDP-Q Learning strategies, the proposed strategy improves the AEE by 12.2% and 6.09%, respectively.
(This article belongs to the Special Issue Artificial Intelligence Remote Sensing for Earth Observation)
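The model-selection MDP in the abstract can be illustrated with a much-reduced sketch. The paper trains a Deep Q Network; the tabular Q-learning below solves the same kind of decision problem (pick an inference model per image under an energy budget), and the accuracy and energy numbers are invented for illustration:

```python
import random

MODELS = [  # (name, accuracy, energy cost per image) -- illustrative values only
    ("tiny",  0.80, 1.0),
    ("base",  0.88, 2.5),
    ("large", 0.92, 5.0),
]

def train_q(episodes=2000, levels=10, alpha=0.2, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning over discretized energy levels.

    State = remaining energy (0..levels); action = which model to run;
    reward = accuracy obtained for the current image.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(MODELS) for _ in range(levels + 1)]
    for _ in range(episodes):
        energy = levels  # start each pass fully charged
        while energy > 0:
            # epsilon-greedy action selection
            a = rng.randrange(len(MODELS)) if rng.random() < eps \
                else max(range(len(MODELS)), key=lambda i: q[energy][i])
            _, acc, cost = MODELS[a]
            nxt = max(0, energy - round(cost))
            best_next = max(q[nxt]) if nxt > 0 else 0.0
            q[energy][a] += alpha * (acc + gamma * best_next - q[energy][a])
            energy = nxt
    return q
```

With these toy numbers the learned policy favors the cheap model, since it processes more images per charge than the occasional accuracy gain of the large model is worth; the paper's AEE index formalizes exactly this accuracy-per-energy trade-off.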

20 pages, 707 KiB  
Article
Remote Sensing Cross-Modal Text-Image Retrieval Based on Attention Correction and Filtering
by Xiaoyu Yang, Chao Li, Zhiming Wang, Hao Xie, Junyi Mao and Guangqiang Yin
Remote Sens. 2025, 17(3), 503; https://doi.org/10.3390/rs17030503 - 31 Jan 2025
Viewed by 939
Abstract
Remote sensing cross-modal text-image retrieval constitutes a pivotal component of multi-modal retrieval in remote sensing, central to which is the process of learning integrated visual and textual representations. Prior research predominantly emphasized the overarching characteristics of remote sensing images, or employed attention mechanisms for meticulous alignment. However, these investigations, to some degree, overlooked the intricacies inherent in the textual descriptions accompanying remote sensing images. In this paper, we introduce a novel cross-modal retrieval model, specifically tailored for remote sensing image-text, leveraging attention correction and filtering mechanisms. The proposed model is architected around four primary components: an image feature extraction module, a text feature extraction module, an attention correction module, and an attention filtering module. Within the image feature extraction module, the Visual Graph Neural Network (VIG) serves as the principal encoder, augmented by a multi-tiered node feature fusion mechanism, ensuring a comprehensive understanding of remote sensing images. For text feature extraction, both the Bidirectional Gated Recurrent Unit (BGRU) and the Graph Attention Network (GAT) are employed as encoders, furnishing the model with an enriched understanding of the associated text. The attention correction module minimizes potential misalignments in image-text pairings, specifically by modulating attention weightings in cases where there is a unique correlation between visual area attributes and textual descriptors. Concurrently, the attention filtering module diminishes the influence of extraneous visual sectors and terms in the image-text matching process, thereby enhancing the precision of cross-modal retrieval. Extensive experiments carried out on both the RSICD and RSITMD datasets yielded commendable results, attesting to the efficacy of the proposed methodology in remote sensing cross-modal text-image retrieval.
(This article belongs to the Special Issue Artificial Intelligence Remote Sensing for Earth Observation)
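The attention-filtering idea (suppressing extraneous visual regions during image–text matching) can be sketched as follows. The fixed threshold here is purely illustrative; the paper's filtering module is learned:

```python
import numpy as np

def filter_attention(sim, threshold=0.2):
    """Zero out image regions (rows) whose best cross-modal similarity to
    any word falls below `threshold`, then renormalize the remaining
    weights. `sim` is a (regions, words) similarity matrix.
    """
    keep = sim.max(axis=1) >= threshold           # regions with at least one strong match
    filtered = np.where(keep[:, None], sim, 0.0)  # drop weakly-matched regions entirely
    total = filtered.sum()
    return filtered / total if total > 0 else filtered
```

Removing regions that match no word at all keeps background clutter, which is abundant in remote sensing scenes, from diluting the final image–text similarity score.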

20 pages, 2388 KiB  
Article
The Spectrum Difference Enhanced Network for Hyperspectral Anomaly Detection
by Shaohua Liu, Huibo Guo, Shiwen Gao and Wuxia Zhang
Remote Sens. 2024, 16(23), 4518; https://doi.org/10.3390/rs16234518 - 2 Dec 2024
Viewed by 939
Abstract
Most deep learning-based hyperspectral anomaly detection (HAD) methods focus on modeling or reconstructing the hyperspectral background to obtain residual maps from the original hyperspectral images. However, these methods typically do not pay enough attention to spectral similarity in complex environments, resulting in inadequate distinction between background and anomalies. Moreover, although some anomalies and the background are different objects, they are sometimes recognized as objects with the same spectrum. To address these issues, this paper proposes a Spectrum Difference Enhanced Network (SDENet) for HAD, which employs variational mapping and a Transformer to amplify spectrum differences. The proposed network is based on an encoder–decoder structure comprising a CSWin-Transformer encoder, a Variational Mapping Module (VMModule), and a CSWin-Transformer decoder. First, the CSWin-Transformer encoder and decoder supplement image information by extracting deep semantic features, where a cross-shaped window self-attention mechanism provides strong modeling capability at minimal computational cost. Second, to enhance the spectral difference between anomalies and background, a randomly sampling VMModule is presented for feature space transformation. Finally, all fully connected mapping operations are replaced with convolutional layers to reduce model parameters and computational load. The effectiveness of SDENet is verified on three datasets, and experimental results show that it achieves better detection accuracy and lower model complexity than existing methods.
(This article belongs to the Special Issue Artificial Intelligence Remote Sensing for Earth Observation)
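The residual-map step shared by reconstruction-based HAD methods, including SDENet, can be sketched as follows; the variational mapping and CSWin-Transformer stages that produce the reconstruction are not reproduced here:

```python
import numpy as np

def anomaly_score(cube, reconstructed):
    """Per-pixel anomaly score as the spectral reconstruction error.

    cube, reconstructed: (H, W, B) hyperspectral arrays, where B is the
    number of spectral bands. Pixels the background model reconstructs
    poorly (large residual) are flagged as anomalous.
    """
    residual = cube - reconstructed
    return np.sqrt((residual ** 2).sum(axis=2))  # (H, W) L2 residual per pixel
```

Because the network is trained to reconstruct only the background, anomalous materials produce large spectral residuals and therefore high scores in this map.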

23 pages, 6153 KiB  
Article
An Enhanced Shuffle Attention with Context Decoupling Head with Wise IoU Loss for SAR Ship Detection
by Yunshan Tang, Yue Zhang, Jiarong Xiao, Yue Cao and Zhongjun Yu
Remote Sens. 2024, 16(22), 4128; https://doi.org/10.3390/rs16224128 - 5 Nov 2024
Viewed by 2363
Abstract
Synthetic Aperture Radar (SAR) imagery is widely utilized in military and civilian applications. Recent deep learning advancements have led to improved ship detection algorithms, enhancing accuracy and speed over traditional Constant False-Alarm Rate (CFAR) methods. However, challenges remain with complex backgrounds and multi-scale ship targets amidst significant interference. This paper introduces a novel method that features a context-based decoupled head, leveraging positioning and semantic information, and incorporates shuffle attention to enhance feature map interpretation. Additionally, we propose a new loss function with a dynamic non-monotonic focusing mechanism to tackle these issues. Experimental results on the HRSID and SAR-Ship-Dataset demonstrate that our approach significantly improves detection performance over the original YOLOv5 algorithm and other existing methods.
(This article belongs to the Special Issue Artificial Intelligence Remote Sensing for Earth Observation)
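The IoU quantity underlying the proposed loss can be computed as below. Wise-IoU adds a dynamic non-monotonic focusing coefficient on top of this term, which is not reproduced here:

```python
def box_iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0
```

A basic IoU loss is then 1 - IoU; the focusing mechanism reweights this loss per box so that ordinary-quality predictions, rather than the hardest or easiest ones, dominate the gradients.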
