Advanced Artificial Intelligence and Deep Learning for Remote Sensing (3rd Edition)

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (28 February 2026) | Viewed by 11775

Special Issue Editors


Guest Editor
College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 401331, China
Interests: radar signal detection; target detection and recognition; radar system

Special Issue Information

Dear Colleagues,

Remote sensing is a fundamental tool for observing the world from afar. Advances in artificial intelligence (AI) and deep learning (DL) have opened new research opportunities in fields such as remote sensing, which spans Earth observation, disaster warning, and environmental monitoring. In recent years, the continued development of remote sensing technologies, in particular the emergence of new detection sensors and detection systems together with the steady accumulation of historical data and samples, has made it possible to train AI and DL models on big data, and the field has become a research hotspot.

This Special Issue aims to report the latest advances and trends in advanced AI and DL techniques applied to remote sensing data processing. Papers of both a theoretical and an applied nature, as well as contributions presenting new AI and DL techniques to the remote sensing research community, are welcome. We invite experts and scholars in the field to contribute their latest research progress on AI and DL in Earth observation, disaster warning, multi-temporal surface change detection, environmental remote sensing, optical remote sensing, and detection and imaging with different sensors, to further promote technological progress in this field.

The topics include but are not limited to the following:

  • Object detection in high-resolution remote sensing imagery.
  • SAR object detection and scene classification.
  • Target-oriented multi-temporal change detection.
  • Infrared target detection and recognition.
  • LiDAR point cloud data processing and scene reconstruction.
  • UAV remote sensing and scene perception.
  • Big data mining in remote sensing.
  • Interpretable deep learning in remote sensing.

This Special Issue is the third edition in the series, following “Advanced Artificial Intelligence and Deep Learning for Remote Sensing II”.

Prof. Dr. Zhenming Peng
Prof. Dr. Zhengzhou Li
Dr. Yimian Dai
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • artificial intelligence
  • deep learning
  • scene reconstruction
  • scene perception
  • data mining
  • change detection
  • object recognition

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (8 papers)


Research

31 pages, 23615 KB  
Article
A Memory-Efficient Class-Incremental Learning Framework for Remote Sensing Scene Classification via Feature Replay
by Yunze Wei, Yuhan Liu, Ben Niu, Xiantai Xiang, Jingdun Lin, Yuxin Hu and Yirong Wu
Remote Sens. 2026, 18(6), 896; https://doi.org/10.3390/rs18060896 - 15 Mar 2026
Viewed by 164
Abstract
Most existing deep learning models for remote sensing scene classification (RSSC) adopt an offline learning paradigm, where all classes are jointly optimized on fixed-class datasets. In dynamic real-world scenarios with streaming data and emerging classes, such paradigms are inherently prone to catastrophic forgetting when models are incrementally trained on new data. Recently, a growing number of class-incremental learning (CIL) methods have been proposed to tackle these issues, some of which achieve promising performance by rehearsing training data from previous tasks. However, implementing such strategy in real-world scenarios is often challenging, as the requirement to store historical data frequently conflicts with strict memory constraints and data privacy protocols. To address these challenges, we propose a novel memory-efficient feature-replay CIL framework (FR-CIL) for RSSC that retains compact feature embeddings, rather than raw images, as exemplars for previously learned classes. Specifically, a progressive multi-scale feature enhancement (PMFE) module is proposed to alleviate representation ambiguity. It adopts a progressive construction scheme to enable fine-grained and interactive feature enhancement, thereby improving the model’s representation capability for remote sensing scenes. Then, a specialized feature calibration network (FCN) is trained in a transductive learning paradigm with manifold consistency regularization to adapt stored feature descriptors to the updated feature space, thereby effectively compensating for feature space drift and enabling a unified classifier. Following feature calibration, a bias rectification (BR) strategy is employed to mitigate prediction bias by exclusively optimizing the classifier on a balanced exemplar set. As a result, this memory-efficient CIL framework not only addresses data privacy concerns but also mitigates representation drift and classifier bias. 
Extensive experiments on public datasets demonstrate the effectiveness and robustness of the proposed method. Notably, FR-CIL outperforms the leading state-of-the-art CIL methods in mean accuracy by margins of 3.75%, 3.09%, and 2.82% on the six-task AID, seven-task RSI-CB256, and nine-task NWPU-45 datasets, respectively. At the same time, it reduces memory storage requirements by over 94.7%, highlighting its strong potential for real-world RSSC applications under strict memory constraints. Full article
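The feature-replay idea at the heart of the abstract above can be illustrated with a minimal NumPy sketch. This is a hypothetical illustration, not the authors' FR-CIL code: the function names, the per-class exemplar budget, and the plain concatenation-based replay are all assumptions made for clarity.

```python
import numpy as np

def build_feature_exemplars(features, labels, per_class=5):
    """Keep a few feature vectors per class instead of raw images."""
    memory = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0][:per_class]
        memory[int(c)] = features[idx].copy()
    return memory

def replay_training_set(memory, new_features, new_labels):
    """Concatenate stored old-class features with new-task features."""
    old_x = np.concatenate(list(memory.values()), axis=0)
    old_y = np.concatenate([[c] * len(v) for c, v in memory.items()])
    x = np.concatenate([old_x, new_features], axis=0)
    y = np.concatenate([old_y, new_labels], axis=0)
    return x, y
```

The memory footprint per old class is only `per_class × feature_dim` floats, which is where the large storage savings reported above would come from relative to keeping raw image exemplars.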

34 pages, 13605 KB  
Article
BUM: Bayesian Uncertainty Minimization for Transferable Adversarial Examples in SAR Recognition
by Hongqiang Wang, Yuqing Lan, Fuzhan Yue, Zhenghuan Xia and Tao Zhang
Remote Sens. 2026, 18(5), 693; https://doi.org/10.3390/rs18050693 - 26 Feb 2026
Viewed by 229
Abstract
Adversarial examples pose a significant threat to Deep Neural Networks (DNNs) underpinning Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) systems, as these models exhibit acute susceptibility to such malicious inputs. While white-box attacks achieve high success rates, their transferability to unknown black-box models—particularly across different network architectures (e.g., from CNNs to Vision Transformers)—remains a significant challenge. Existing gradient-based iterative methods often overfit the specific decision boundary of the surrogate model, resulting in poor generalization. To address this, we propose a novel generative attack framework termed BUM. Instead of merely maximizing the classification error, BUM explicitly models and minimizes the epistemic uncertainty of the surrogate model. By leveraging Monte Carlo (MC) Dropout to simulate a Bayesian ensemble, we train a generator to craft perturbations that are consistently adversarial across stochastic sub-models. This regularization forces the attack to target high-level, structure-aware semantic features shared among architectures, rather than low-level, model-specific artifacts. Extensive experiments on the MSTAR and FUSAR datasets demonstrate the superior black-box transferability of BUM. Full article
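The Monte Carlo Dropout ensemble described in the abstract can be sketched in a few lines of NumPy. This is a generic illustration of MC Dropout uncertainty estimation, not the BUM generator or its surrogate network; the single-layer linear "model" and the variance-based uncertainty proxy are assumptions made for illustration.

```python
import numpy as np

def mc_dropout_logits(x, W, n_samples=50, p_drop=0.3, seed=0):
    """Stochastic forward passes with dropout kept active at inference."""
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(n_samples):
        mask = rng.random(x.shape) >= p_drop          # Bernoulli keep-mask
        outs.append(((x * mask) / (1 - p_drop)) @ W)  # inverted dropout
    return np.stack(outs)                             # (n_samples, n_classes)

def predictive_uncertainty(logits):
    """Variance of softmax probabilities across the stochastic ensemble."""
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return float(probs.var(axis=0).sum())  # scalar epistemic-uncertainty proxy
```

An attack in the spirit of the abstract would add this uncertainty term to the adversarial objective, pushing perturbations toward regions where the stochastic sub-models agree.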

28 pages, 11618 KB  
Article
Cascaded Multi-Attention Feature Recurrent Enhancement Network for Spectral Super-Resolution Reconstruction
by He Jin, Jinhui Lan, Zhixuan Zhuang and Yiliang Zeng
Remote Sens. 2026, 18(2), 202; https://doi.org/10.3390/rs18020202 - 8 Jan 2026
Viewed by 414
Abstract
Hyperspectral imaging (HSI) captures the same scene across multiple spectral bands, providing richer spectral characteristics of materials than conventional RGB images. The spectral reconstruction task seeks to map RGB images into hyperspectral images, enabling high-quality HSI data acquisition without additional hardware investment. Traditional methods based on linear models or sparse representations struggle to effectively model the nonlinear characteristics of hyperspectral data. Although deep learning approaches have made significant progress, issues such as detail loss and insufficient modeling of spatial–spectral relationships persist. To address these challenges, this paper proposes the Cascaded Multi-Attention Feature Recurrent Enhancement Network (CMFREN). This method achieves targeted breakthroughs over existing approaches through a cascaded architecture of feature purification, spectral balancing and progressive enhancement. This network comprises two core modules: (1) the Hierarchical Residual Attention (HRA) module, which suppresses artifacts in illumination transition regions through residual connections and multi-scale contextual feature fusion, and (2) the Cascaded Multi-Attention (CMA) module, which incorporates a Spatial–Spectral Balanced Feature Extraction (SSBFE) module and a Spectral Enhancement Module (SEM). The SSBFE combines Multi-Scale Residual Feature Enhancement (MSRFE) with Spectral-wise Multi-head Self-Attention (S-MSA) to achieve dynamic optimization of spatial–spectral features, while the SEM synergistically utilizes attention and convolution to progressively enhance spectral details and mitigate spectral aliasing in low-resolution scenes. Experiments across multiple public datasets demonstrate that CMFREN achieves state-of-the-art (SOTA) performance on metrics including RMSE, PSNR, SAM, and MRAE, validating its superiority under complex illumination conditions and detail-degraded scenarios. Full article
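Among the evaluation metrics listed above, the Spectral Angle Mapper (SAM) has a simple standard definition that can be sketched directly. This is the textbook formula, not code from the paper:

```python
import numpy as np

def spectral_angle(a, b, eps=1e-12):
    """Spectral Angle Mapper: angle (radians) between two spectra."""
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def mean_sam(pred, ref):
    """Mean SAM over all pixels; pred/ref are (H, W, bands) cubes."""
    p = pred.reshape(-1, pred.shape[-1])
    r = ref.reshape(-1, ref.shape[-1])
    return float(np.mean([spectral_angle(pi, ri) for pi, ri in zip(p, r)]))
```

Because SAM measures only the angle between spectra, it is insensitive to per-pixel brightness scaling, which is why it is typically reported alongside magnitude-sensitive metrics such as RMSE and PSNR.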

19 pages, 1976 KB  
Article
GRADE: A Generalization Robustness Assessment via Distributional Evaluation for Remote Sensing Object Detection
by Decheng Wang, Yi Zhang, Baocun Bai, Xiao Yu, Xiangbo Shu and Yimian Dai
Remote Sens. 2025, 17(22), 3771; https://doi.org/10.3390/rs17223771 - 20 Nov 2025
Viewed by 895
Abstract
The performance of remote sensing object detectors often degrades severely when deployed in new operational environments due to covariate shift in the data distribution. Existing evaluation paradigms, which primarily rely on aggregate performance metrics such as mAP, generally lack the analytical depth to provide insights into the mechanisms behind such generalization failures. To fill this critical gap, we propose the GRADE (Generalization Robustness Assessment via Distributional Evaluation) framework, a multi-dimensional, systematic methodology for assessing model robustness. The framework quantifies shifts in background context and object-centric features through a hierarchical analysis of distributional divergence, utilizing Scene-level Fréchet Inception Distance (FID) and Instance-level FID, respectively. These divergence measures are systematically integrated with a standardized performance decay metric to form a unified, adaptively weighted Generalization Score (GS). This composite score serves not only as an evaluation tool but also as a powerful analytical tool, enabling the fine-grained attribution of performance loss to specific sources of domain shift—whether originating from scene variations or anomalies in object appearance. Compared to conventional single-dimensional evaluation methods, the GRADE framework offers enhanced interpretability, a standardized evaluation protocol, and reliable cross-model comparability, establishing a principled theoretical foundation for cross-domain generalization assessment. Extensive empirical validation on six mainstream remote sensing benchmark datasets and multiple state-of-the-art detection models demonstrates that the model rankings produced by the GRADE framework exhibit high fidelity to real-world performance, thereby effectively quantifying and explaining the cross-domain generalization penalty. Full article
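The abstract does not give the exact formula for the Generalization Score, so the sketch below is purely hypothetical: it only illustrates the stated idea of combining two divergence channels (scene-level and instance-level FID) with adaptive weights and a standardized performance-decay term into one bounded score. Every constant and functional form here is an assumption.

```python
import numpy as np

def generalization_score(scene_fid, instance_fid, map_source, map_target):
    """Hypothetical composite score in [0, 1]: larger distributional
    divergence and larger performance decay both lower the score."""
    decay = max(0.0, (map_source - map_target) / max(map_source, 1e-12))
    # adaptive weights: the larger divergence channel gets more weight
    total = scene_fid + instance_fid + 1e-12
    w_scene, w_inst = scene_fid / total, instance_fid / total
    divergence = w_scene * scene_fid + w_inst * instance_fid
    return float(np.exp(-divergence / 100.0) * (1.0 - decay))
```

The useful property such a composite gives, per the abstract, is attribution: comparing the scene-level and instance-level contributions indicates whether a performance drop stems from background shift or object-appearance shift.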

21 pages, 13741 KB  
Article
Individual Tree Species Classification Using Pseudo Tree Crown (PTC) on Coniferous Forests
by Kongwen (Frank) Zhang, Tianning Zhang and Jane Liu
Remote Sens. 2025, 17(17), 3102; https://doi.org/10.3390/rs17173102 - 5 Sep 2025
Viewed by 1527
Abstract
Coniferous forests in Canada play a vital role in carbon sequestration, wildlife conservation, climate change mitigation, and long-term sustainability. Traditional methods for classifying and segmenting coniferous trees have primarily relied on the direct use of spectral or LiDAR-based data. In 2024, we introduced a novel data representation method, pseudo tree crown (PTC), which provides a pseudo-3D pixel-value view that enhances the informational richness of images and significantly improves classification performance. While our original implementation was successfully tested on urban and deciduous trees, this study extends the application of PTC to Canadian conifer species, including jack pine, Douglas fir, spruce, and aspen. We address key challenges such as snow-covered backgrounds and evaluate the impact of training dataset size on classification results. Classification was performed using Random Forest, PyTorch (ResNet50), and YOLO versions v10, v11, and v12. The results demonstrate that PTC can substantially improve individual tree classification accuracy by up to 13%, reaching the high 90% range. Full article

26 pages, 29132 KB  
Article
DCS-YOLOv8: A Lightweight Context-Aware Network for Small Object Detection in UAV Remote Sensing Imagery
by Xiaozheng Zhao, Zhongjun Yang and Huaici Zhao
Remote Sens. 2025, 17(17), 2989; https://doi.org/10.3390/rs17172989 - 28 Aug 2025
Cited by 5 | Viewed by 2692
Abstract
Small object detection in UAV-based remote sensing imagery is crucial for applications such as traffic monitoring, emergency response, and urban management. However, aerial images often suffer from low object resolution, complex backgrounds, and varying lighting conditions, leading to missed or false detections. To address these challenges, we propose DCS-YOLOv8, an enhanced object detection framework tailored for small target detection in UAV scenarios. The proposed model integrates a Dynamic Convolution Attention Mixture (DCAM) module to improve global feature representation and combines it with the C2f module to form the C2f-DCAM block. The C2f-DCAM block, together with a lightweight SCDown module for efficient downsampling, constitutes the backbone DCS-Net. In addition, a dedicated P2 detection layer is introduced to better capture high-resolution spatial features of small objects. To further enhance detection accuracy and robustness, we replace the conventional CIoU loss with a novel Scale-based Dynamic Balanced IoU (SDBIoU) loss, which dynamically adjusts loss weights based on object scale. Extensive experiments on the VisDrone2019 dataset demonstrate that the proposed DCS-YOLOv8 significantly improves small object detection performance while maintaining efficiency. Compared to the baseline YOLOv8s, our model increases precision from 51.8% to 54.2%, recall from 39.4% to 42.1%, mAP0.5 from 40.6% to 44.5%, and mAP0.5:0.95 from 24.3% to 26.9%, while reducing parameters from 11.1 M to 9.9 M. Moreover, real-time inference on RK3588 embedded hardware validates the model’s suitability for onboard UAV deployment in remote sensing applications. Full article
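The scale-dependent reweighting idea behind SDBIoU can be illustrated with a standard IoU computation plus a hypothetical scale weight. The weighting scheme below (linear in relative target area, assuming 640×640 images) is an illustration only, not the paper's loss:

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0

def scale_weighted_iou_loss(pred, target, img_area=640 * 640):
    """Hypothetical scale weighting: small targets get a larger weight."""
    t_area = (target[2] - target[0]) * (target[3] - target[1])
    weight = 1.0 + (1.0 - min(1.0, t_area / img_area))  # weight in [1, 2]
    return weight * (1.0 - iou(pred, target))
```

The design intuition matches the abstract: because small objects contribute tiny absolute IoU errors, up-weighting their loss keeps them from being dominated by large objects during training.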

20 pages, 7167 KB  
Article
FM-Net: Frequency-Aware Masked-Attention Network for Infrared Small Target Detection
by Yongxian Liu, Zaiping Lin, Boyang Li, Ting Liu and Wei An
Remote Sens. 2025, 17(13), 2264; https://doi.org/10.3390/rs17132264 - 1 Jul 2025
Cited by 3 | Viewed by 1742
Abstract
Infrared small target detection (IRSTD) aims to locate and separate targets from complex backgrounds. The challenges in IRSTD primarily come from extremely sparse target features and strong background clutter interference. However, existing methods typically perform discrimination directly on the features extracted by deep networks, neglecting the distinct characteristics of weak and small targets in the frequency domain, thereby limiting the improvement of detection capability. In this paper, we propose a frequency-aware masked-attention network (FM-Net) that leverages multi-scale frequency clues to assist in representing global context and suppressing noise interference. Specifically, we design the wavelet residual block (WRB) to extract multi-scale spatial and frequency features, which introduces a wavelet pyramid as the intermediate layer of the residual block. Then, to perceive global information on the long-range skip connections, a frequency-modulation masked-attention module (FMM) is used to interact with multi-layer features from the encoder. FMM contains two crucial elements: (a) a mask attention (MA) mechanism for injecting broad contextual feature efficiently to promote full-level semantic correlation and focus on salient regions, and (b) a channel-wise frequency modulation module (CFM) for enhancing the most informative frequency components and suppressing useless ones. Extensive experiments on three benchmark datasets (e.g., SIRST, NUDT-SIRST, IRSTD-1k) demonstrate that FM-Net achieves superior detection performance. Full article
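The wavelet pyramid used in the WRB builds on standard 2D wavelet decompositions. A one-level 2D Haar transform, which splits an image into one coarse approximation and three detail (high-frequency) subbands, can be sketched as follows (textbook Haar, not the paper's block):

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar decomposition of an even-sized grayscale image.
    Returns (LL, LH, HL, HH): approximation plus three detail subbands."""
    a = img[0::2, 0::2].astype(float)
    b = img[0::2, 1::2].astype(float)
    c = img[1::2, 0::2].astype(float)
    d = img[1::2, 1::2].astype(float)
    ll = (a + b + c + d) / 2.0   # low-low: coarse approximation
    lh = (a - b + c - d) / 2.0   # horizontal detail
    hl = (a + b - c - d) / 2.0   # vertical detail
    hh = (a - b - c + d) / 2.0   # diagonal detail
    return ll, lh, hl, hh
```

Small infrared targets concentrate energy in the detail subbands while smooth background clutter stays in LL, which is the frequency-domain distinction the abstract says spatial-only networks tend to miss.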

23 pages, 31391 KB  
Article
A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms
by Shufang Xu, Heng Li, Tianci Liu and Hongmin Gao
Remote Sens. 2025, 17(7), 1118; https://doi.org/10.3390/rs17071118 - 21 Mar 2025
Cited by 5 | Viewed by 2931
Abstract
In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persistent challenges remain in achieving robust small-target detection under complex all-weather conditions. This paper presents an innovative multimodal fusion framework incorporating photometric perception and cross-attention mechanisms to address the critical limitations of current single-modality detection systems, particularly their susceptibility to reduced accuracy and elevated false-negative rates in adverse environmental conditions. Our architecture introduces three novel components: (1) a bidirectional hierarchical feature extraction network that enables the synergistic processing of heterogeneous sensor data; (2) a cross-modality attention mechanism that dynamically establishes inter-modal feature correlations through learnable attention weights; (3) an adaptive photometric weighting fusion module that implements spectral characteristic-aware feature recalibration. The proposed system achieves multimodal complementarity through two-phase integration: first by establishing cross-modal feature correspondences through attention-guided feature alignment, then performing weighted fusion based on photometric reliability assessment. Comprehensive experiments demonstrate that our framework achieves an improvement of at least 3.6% in mAP compared to the other models on the challenging LLVIP dataset, and with particular improvements in detection reliability on the KAIST dataset. This research advances the state of the art in aerial target detection by providing a principled approach for multimodal sensor fusion, with significant implications for surveillance, disaster response, and precision agriculture applications. Full article
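The adaptive photometric weighting described in component (3) can be caricatured with a tiny sketch: a reliability weight for the visible channel derived from exposure, used to blend visible and infrared features. The exposure heuristic and the convex blend are assumptions for illustration, not the paper's module:

```python
import numpy as np

def photometric_weight(vis_patch):
    """Hypothetical reliability weight for the visible modality:
    well-exposed patches (mean brightness near mid-range) score higher.
    Assumes pixel values normalized to [0, 1]."""
    m = float(np.mean(vis_patch))
    return 1.0 - abs(m - 0.5) * 2.0   # 1 at mid-exposure, 0 at the extremes

def fuse_features(vis_feat, ir_feat, w_vis):
    """Convex combination of visible and infrared feature maps."""
    return w_vis * vis_feat + (1.0 - w_vis) * ir_feat
```

Under this scheme a badly under- or over-exposed visible patch yields a weight near zero, so the fused representation falls back on the infrared features, mirroring the all-weather robustness motivation above.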
