
Remote Sensing Image Classification and Semantic Segmentation (Second Edition)

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 31 July 2025 | Viewed by 17513

Special Issue Editors


Guest Editor
The State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
Interests: remote sensing image processing; spectral super-resolution; 3D computer vision; deep learning

Guest Editor
INRIA, University Grenoble Alpes, Grenoble, France
Interests: image analysis; hyperspectral remote sensing; data fusion; machine learning; artificial intelligence

Guest Editor
The State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
Interests: hyperspectral image processing; deep learning

Guest Editor
The State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
Interests: image/video codec; computer vision; 3D computer vision; remote sensing image processing

Guest Editor
The State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
Interests: image/video processing; coding and transmission; chip design; high-performance computing

Special Issue Information

Dear Colleagues,

With the rapid growth of remote sensing imaging technology, vast amounts of remote sensing data are being generated for Earth, Mars, and other targets, supporting applications such as land monitoring, national security, agriculture, medicine, and atmospheric science. In recent decades, deep learning techniques have had a significant impact on remote sensing data processing and analysis, especially image classification and semantic segmentation. However, several challenges remain: the limited number of annotated datasets, restricted computing resources, the special characteristics of different sensors and data sources, and the complexity and diversity of large-scale areas all make deep-learning-based algorithms harder to deploy in real-world applications. Therefore, novel deep neural networks combining few-shot learning, meta-learning, attention mechanisms, or new transformer technologies deserve greater attention, as they are of vital importance to remote sensing image classification and semantic segmentation. It is also necessary to develop lightweight, explainable, and robust networks for remote sensing image applications, especially image classification and semantic segmentation.

This Special Issue aims to develop state-of-the-art deep networks for more accurate remote sensing image classification and semantic segmentation. It also aims to achieve cross-domain performance with high efficiency through lightweight network design.

This Special Issue encourages authors to submit research articles, review articles, or application-oriented articles on remote sensing image classification, semantic segmentation, detection, spectral super-resolution, and related understanding tasks; these include, but are not limited to, the following topics:

  • Machine/deep-learning-based algorithms;
  • Remote sensing image processing and pattern recognition;
  • Image classification;
  • Semantic segmentation;
  • Target detection/change detection;
  • Image or data fusion/fusion classification;
  • Lightweight deep neural networks;
  • Domain-adaptation/few-shot-learning/meta-learning-based algorithms;
  • Onboard real-time applications.

Dr. Jiaojiao Li
Prof. Dr. Qian Du
Prof. Dr. Jocelyn Chanussot
Prof. Dr. Wei Li
Dr. Bobo Xi
Prof. Dr. Rui Song
Prof. Dr. Yunsong Li
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • remote sensing
  • deep learning
  • semantic segmentation
  • classification
  • cross-domain
  • Earth observation

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (10 papers)


Research

24 pages, 12113 KiB  
Article
Hyperspectral Image Mixed Denoising via Robust Representation Coefficient Image Guidance and Nonlocal Low-Rank Approximation
by Jiawei Song, Baolong Guo, Zhe Yuan, Chao Wang, Fangliang He and Cheng Li
Remote Sens. 2025, 17(6), 1021; https://doi.org/10.3390/rs17061021 - 14 Mar 2025
Viewed by 350
Abstract
Recently, hyperspectral image (HSI) mixed denoising methods based on nonlocal subspace representation (NSR) have achieved significant success. However, most of these methods focus on optimizing the denoiser for representation coefficient images (RCIs) without considering how to construct RCIs that better inherit the spatial structure of the clean HSI, thereby affecting subsequent denoising performance. Although existing works have constructed RCIs from the perspective of sparse principal component analysis (SPCA), the refinement of RCIs under mixed noise still leaves much to be desired. To address these challenges, in this paper, we reconstruct robust RCIs based on SPCA under mixed noise to better preserve the spatial structure of the clean HSI. Furthermore, we propose to utilize the robust RCIs as prior information and perform iterative denoising in a denoiser that incorporates low-rank approximation. Extensive experiments conducted on both simulated and real HSI datasets demonstrate that the proposed robust RCI guidance and nonlocal low-rank approximation method, denoted as RRGNLA, exhibits competitive performance in terms of mixed denoising accuracy and computational efficiency. For instance, on the Washington DC Mall (WDC) dataset in Case 3, the denoising quantitative metrics of the mean peak signal-to-noise ratio (MPSNR), mean structural similarity index (MSSIM), and spectral angle mean (SAM) are 36.06 dB, 0.963, and 3.449, respectively, with a running time of 35.24 s. On the Pavia University (PaU) dataset in Case 4, the denoising quantitative metrics of MPSNR, MSSIM, and SAM are 34.34 dB, 0.924, and 5.505, respectively, with a running time of 32.79 s.
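The subspace machinery behind methods like this can be illustrated with a minimal sketch: unfold the HSI cube along its spectral dimension and keep only the top singular components. This is a generic low-rank approximation, not the authors' RRGNLA pipeline; the function name and toy data below are invented for illustration.

```python
import numpy as np

def low_rank_denoise(hsi, rank):
    """Keep only the top-`rank` spectral singular components of a
    band-unfolded HSI cube (H, W, B) -- the low-rank approximation
    step that subspace-based mixed-denoising methods build on."""
    h, w, b = hsi.shape
    x = hsi.reshape(-1, b)                          # (pixels, bands)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    x_lr = (u[:, :rank] * s[:rank]) @ vt[:rank]     # rank-r reconstruction
    return x_lr.reshape(h, w, b)

# toy check: a rank-1 cube plus light noise is recovered almost exactly
rng = np.random.default_rng(0)
base = np.outer(rng.random(16), rng.random(8)).reshape(4, 4, 8)
noisy = base + 0.01 * rng.standard_normal(base.shape)
denoised = low_rank_denoise(noisy, rank=1)
rel_err = np.linalg.norm(denoised - base) / np.linalg.norm(base)
```

Real HSI denoisers operate on noise-robust subspaces rather than a plain SVD, but the unfold-project-reconstruct pattern is the common core.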

30 pages, 13252 KiB  
Article
GLCANet: Global–Local Context Aggregation Network for Cropland Segmentation from Multi-Source Remote Sensing Images
by Jinglin Zhang, Yuxia Li, Zhonggui Tong, Lei He, Mingheng Zhang, Zhenye Niu and Haiping He
Remote Sens. 2024, 16(24), 4627; https://doi.org/10.3390/rs16244627 - 10 Dec 2024
Cited by 1 | Viewed by 873
Abstract
Cropland is a fundamental basis for agricultural development and a prerequisite for ensuring food security. The segmentation and extraction of croplands using remote sensing images are important measures and prerequisites for detecting and protecting farmland. This study addresses the challenges of diverse image sources, multi-scale representations of cropland, and the confusion of features between croplands and other land types in large-area remote sensing image information extraction. To this end, a multi-source self-annotated dataset was developed using satellite images from GaoFen-2, GaoFen-7, and WorldView, which was integrated with public datasets GID and LoveDA to create the CRMS dataset. A novel semantic segmentation network, the Global–Local Context Aggregation Network (GLCANet), was proposed. This method integrates the Bilateral Feature Encoder (BFE) of CNNs and Transformers with a global–local information mining module (GLM) to enhance global context extraction and improve cropland separability. It also employs a multi-scale progressive upsampling structure (MPUS) to refine the accuracy of diverse arable land representations from multi-source imagery. To tackle the issue of inconsistent features within the cropland class, a loss function based on hard sample mining and multi-scale features was constructed. The experimental results demonstrate that GLCANet improves OA and mIoU by 3.2% and 2.6%, respectively, compared to the existing advanced networks on the CRMS dataset. Additionally, the proposed method also demonstrated high precision and practicality in segmenting large-area croplands in Chongzhou City, Sichuan Province, China.
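For reference, the mIoU metric reported in the abstract can be computed from a confusion matrix as in the following sketch (a generic implementation, unrelated to the GLCANet code itself):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union from a confusion matrix."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for p, t in zip(pred.ravel(), target.ravel()):
        cm[t, p] += 1
    inter = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - inter
    ious = inter / np.maximum(union, 1)
    return ious[union > 0].mean()   # average over classes that occur

pred = np.array([[0, 0, 1],
                 [1, 1, 1]])
target = np.array([[0, 0, 1],
                   [0, 1, 1]])
miou = mean_iou(pred, target, num_classes=2)   # (2/3 + 3/4) / 2
```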

27 pages, 5518 KiB  
Article
Small Object Detection in UAV Remote Sensing Images Based on Intra-Group Multi-Scale Fusion Attention and Adaptive Weighted Feature Fusion Mechanism
by Zhe Yuan, Jianglei Gong, Baolong Guo, Chao Wang, Nannan Liao, Jiawei Song and Qiming Wu
Remote Sens. 2024, 16(22), 4265; https://doi.org/10.3390/rs16224265 - 15 Nov 2024
Cited by 1 | Viewed by 1722
Abstract
In view of the issues of missed and false detections encountered in small object detection for UAV remote sensing images, and the inadequacy of existing algorithms in terms of complexity and generalization ability, we propose a small object detection model named IA-YOLOv8 in this paper. This model integrates the intra-group multi-scale fusion attention mechanism and the adaptive weighted feature fusion approach. In the feature extraction phase, the model employs a hybrid pooling strategy that combines Avg and Max pooling to replace the single Max pooling operation used in the original SPPF framework. Such modifications enhance the model’s ability to capture the minute features of small objects. In addition, an adaptive feature fusion module is introduced, which is capable of automatically adjusting the weights based on the significance and contribution of features at different scales to improve the detection sensitivity for small objects. Simultaneously, a lightweight intra-group multi-scale fusion attention module is implemented, which aims to effectively mitigate background interference and enhance the saliency of small objects. Experimental results indicate that the proposed IA-YOLOv8 model has a parameter quantity of 10.9 MB, attaining a mean average precision (mAP) value of 42.1% on the Visdrone2019 test set, an mAP value of 82.3% on the DIOR test set, and an mAP value of 39.8% on the AI-TOD test set. All these results outperform the existing detection algorithms, demonstrating the superior performance of the IA-YOLOv8 model in the task of small object detection for UAV remote sensing.
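The hybrid pooling idea described above, averaging the outputs of max and average pooling over each window, can be sketched in a few lines of NumPy. This is an illustrative stand-in, not the authors' SPPF modification:

```python
import numpy as np

def hybrid_pool(x, k=2):
    """Average of max pooling and average pooling over k x k windows."""
    h, w = x.shape
    x = x[: h - h % k, : w - w % k]     # crop so both dims divide by k
    win = x.reshape(h // k, k, w // k, k)
    return 0.5 * (win.max(axis=(1, 3)) + win.mean(axis=(1, 3)))

x = np.array([[1., 2., 3., 4.],
              [5., 6., 7., 8.],
              [9., 8., 7., 6.],
              [5., 4., 3., 2.]])
y = hybrid_pool(x)   # top-left window: 0.5 * (max 6 + mean 3.5) = 4.75
```

Compared with max pooling alone, the average term retains some of the faint responses that small, dim objects produce.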

20 pages, 24465 KiB  
Article
Unsupervised Multi-Scale Hybrid Feature Extraction Network for Semantic Segmentation of High-Resolution Remote Sensing Images
by Wanying Song, Fangxin Nie, Chi Wang, Yinyin Jiang and Yan Wu
Remote Sens. 2024, 16(20), 3774; https://doi.org/10.3390/rs16203774 - 11 Oct 2024
Cited by 3 | Viewed by 1869
Abstract
Generating pixel-level annotations for semantic segmentation tasks of high-resolution remote sensing images is both time-consuming and labor-intensive, which has led to increased interest in unsupervised methods. Therefore, in this paper, we propose an unsupervised multi-scale hybrid feature extraction network based on the CNN-Transformer architecture, referred to as MSHFE-Net. The MSHFE-Net consists of three main modules: a Multi-Scale Pixel-Guided CNN Encoder, a Multi-Scale Aggregation Transformer Encoder, and a Parallel Attention Fusion Module. The Multi-Scale Pixel-Guided CNN Encoder is designed for multi-scale, fine-grained feature extraction in unsupervised tasks, efficiently recovering local spatial information in images. Meanwhile, the Multi-Scale Aggregation Transformer Encoder introduces a multi-scale aggregation module, which further enhances the unsupervised acquisition of multi-scale contextual information, obtaining global features with stronger feature representation. The Parallel Attention Fusion Module employs an attention mechanism to fuse global and local features in both channel and spatial dimensions in parallel, enriching the semantic relations extracted during unsupervised training and improving the performance of unsupervised semantic segmentation. K-means clustering is then performed on the fused features to achieve high-precision unsupervised semantic segmentation. Experiments with MSHFE-Net on the Potsdam and Vaihingen datasets demonstrate its effectiveness in significantly improving the accuracy of unsupervised semantic segmentation.
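The final clustering step, K-means on per-pixel fused features, can be sketched generically as follows; the feature data here are synthetic and the function is not part of MSHFE-Net:

```python
import numpy as np

def kmeans_labels(feats, k, iters=20, seed=0):
    """Plain K-means on per-pixel feature vectors: assign each vector
    to its nearest center, then move centers to cluster means."""
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = feats[labels == c].mean(axis=0)
    return labels

# two well-separated synthetic "feature" blobs cluster cleanly
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0.0, 0.1, (50, 4)),
                   rng.normal(5.0, 0.1, (50, 4))])
labels = kmeans_labels(feats, k=2)
```

In an unsupervised segmentation pipeline, `feats` would be the fused per-pixel features reshaped to (pixels, channels), and the returned labels would be reshaped back to the image grid.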

18 pages, 4074 KiB  
Article
Infrared Weak Target Detection in Dual Images and Dual Areas
by Junbin Zhuang, Wenying Chen, Baolong Guo and Yunyi Yan
Remote Sens. 2024, 16(19), 3608; https://doi.org/10.3390/rs16193608 - 27 Sep 2024
Cited by 2 | Viewed by 1076
Abstract
This study proposes a novel approach for detecting weak small infrared (IR) targets, called double-image and double-local contrast measurement (DDLCM), designed to overcome challenges of low contrast and complex backgrounds in images. In this approach, the original image is decomposed into odd and even images, and the gray difference contrast is determined using a dual-neighborhood sliding window structure, enhancing target saliency and contrast by increasing the distinction between the target and the local background. A central unit is then constructed to capture relationships between neighboring and non-neighboring units, aiding in clutter suppression and eliminating bright non-target interference. Lastly, the output value is derived by extracting the lowest contrast value of the weak small targets from the saliency map in each direction. Experimental results on two datasets demonstrate that the DDLCM algorithm significantly enhances real-time IR dim target detection, achieving an average performance improvement of 32.83%. The area under the ROC curve (AUC) decline is effectively controlled, with a maximum reduction limited to 3%. Certain algorithms demonstrate a notable AUC improvement of up to 43.96%. To advance infrared dim target detection research, we introduce the IFWS dataset for benchmarking and validating algorithm performance.
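A minimal stand-in for the local contrast idea (each pixel compared against the mean of its surrounding neighborhood) is sketched below; the paper's dual-image, dual-neighborhood measure is more involved, and the odd/even row split shown is only one plausible reading of the decomposition:

```python
import numpy as np

def local_contrast(img, r=1):
    """Ratio of each pixel to the mean of its surrounding ring --
    a minimal stand-in for a dual-neighborhood contrast measure."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for i in range(r, h - r):
        for j in range(r, w - r):
            patch = img[i - r:i + r + 1, j - r:j + r + 1].astype(float)
            ring_mean = (patch.sum() - img[i, j]) / (patch.size - 1)
            out[i, j] = img[i, j] / (ring_mean + 1e-9)
    return out

# a dim point target on a flat background stands out sharply
img = np.zeros((5, 5))
img[2, 2] = 10.0
odd_rows, even_rows = img[0::2], img[1::2]   # one plausible odd/even split
cmap = local_contrast(img)
```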

20 pages, 7707 KiB  
Article
Relevance Pooling Guidance and Class-Balanced Feature Enhancement for Fine-Grained Oriented Object Detection in Remote Sensing Images
by Yu Wang, Hao Chen, Ye Zhang and Guozheng Li
Remote Sens. 2024, 16(18), 3494; https://doi.org/10.3390/rs16183494 - 20 Sep 2024
Cited by 3 | Viewed by 1176
Abstract
Fine-grained object detection in remote sensing images is highly challenging due to class imbalance and high inter-class indistinguishability. The strategies employed by most existing methods to resolve these two challenges are relatively rudimentary, resulting in suboptimal model performance. To address these issues, we propose a fine-grained oriented object detection method based on relevance pooling guidance and class-balanced feature enhancement. Firstly, we propose a global attention mechanism that dynamically retains spatial features pertinent to objects through relevance pooling during down-sampling, thereby enabling the model to acquire more discriminative features. Next, a class balance correction module is proposed to alleviate the class imbalance problem. This module employs feature translation and a learnable reinforcement coefficient to highlight the boundaries of tail class features while maintaining their distinctiveness. Furthermore, we present an enhanced contrastive learning strategy. By dynamically adjusting the contribution of inter-class samples and intra-class similarity measures, this strategy not only constrains inter-class feature distances but also facilitates tighter intra-class clustering, making it more suitable for imbalanced datasets. Evaluation on the FAIR1M and MAR20 datasets demonstrates that our method is superior compared to other methods in object detection and achieves 46.44% and 85.05% mean average precision, respectively.

22 pages, 9248 KiB  
Article
Developing a Comprehensive Oil Spill Detection Model for Marine Environments
by Farkhod Akhmedov, Rashid Nasimov and Akmalbek Abdusalomov
Remote Sens. 2024, 16(16), 3080; https://doi.org/10.3390/rs16163080 - 21 Aug 2024
Cited by 3 | Viewed by 4087
Abstract
Detecting oil spills in marine environments is crucial for avoiding environmental damage and facilitating rapid response efforts. In this study, we propose a robust method for oil spill detection leveraging state-of-the-art (SOTA) deep learning techniques. We constructed an extensive dataset comprising images and frames extracted from video sourced from Google, significantly augmenting the dataset through frame extraction techniques. Each image is meticulously labeled to ensure high-quality training data. Utilizing the Yolov8 segmentation model, we trained our oil spill detection model to accurately identify and segment oil spills in ocean environments. K-means and Truncated Linear Stretching algorithms are combined with the trained model weights to increase detection accuracy. The model demonstrated exceptional performance, yielding high detection accuracy and precise segmentation capabilities. Our results indicate that this approach is highly effective for real-time oil spill detection, offering a promising tool for environmental monitoring and disaster management. In training, the model reached over 97% accuracy within 100 epochs. In evaluation, the model achieved its best detection rates: 94% on the F1 curve, 93.9% on the precision curve, and 95.5% mAP@0.5 on the recall curve.
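Truncated linear stretching is a standard contrast enhancement: clip the image at low and high percentiles, then rescale linearly. A generic sketch follows (the percentile thresholds are assumptions, not the paper's settings):

```python
import numpy as np

def truncated_linear_stretch(img, lo_pct=2.0, hi_pct=98.0):
    """Clip at the given percentiles, then rescale linearly to [0, 1]."""
    lo, hi = np.percentile(img, [lo_pct, hi_pct])
    return np.clip((img - lo) / max(hi - lo, 1e-9), 0.0, 1.0)

img = np.linspace(0.0, 255.0, 100).reshape(10, 10)   # synthetic band
stretched = truncated_linear_stretch(img)
```

Discarding the extreme tails before rescaling spreads the bulk of the intensity range over [0, 1], which tends to make faint features such as thin slicks easier to separate.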

20 pages, 7699 KiB  
Article
SSANet-BS: Spectral–Spatial Cross-Dimensional Attention Network for Hyperspectral Band Selection
by Chuanyu Cui, Xudong Sun, Baijia Fu and Xiaodi Shang
Remote Sens. 2024, 16(15), 2848; https://doi.org/10.3390/rs16152848 - 3 Aug 2024
Cited by 5 | Viewed by 1523
Abstract
Band selection (BS) aims to reduce redundancy in hyperspectral imagery (HSI). Existing BS approaches typically model HSI only in a single dimension, either spectral or spatial, without exploring the interactions between different dimensions. To this end, we propose an unsupervised BS method based on a spectral–spatial cross-dimensional attention network, named SSANet-BS. This network comprises three stages: a band attention module (BAM) that employs an attention mechanism to adaptively identify and select highly significant bands; two parallel spectral–spatial attention modules (SSAMs), which fuse complex spectral–spatial structural information across dimensions in HSI; and a multi-scale reconstruction network that learns spectral–spatial nonlinear dependencies in the SSAM-fusion image at various scales and guides the BAM weights to automatically converge to the target bands via backpropagation. The three-stage structure of SSANet-BS enables the BAM weights to fully represent the saliency of the bands, so that valuable bands are obtained automatically. Experimental results on four real hyperspectral datasets demonstrate the effectiveness of SSANet-BS.

25 pages, 4045 KiB  
Article
MBT-UNet: Multi-Branch Transform Combined with UNet for Semantic Segmentation of Remote Sensing Images
by Bin Liu, Bing Li, Victor Sreeram and Shuofeng Li
Remote Sens. 2024, 16(15), 2776; https://doi.org/10.3390/rs16152776 - 29 Jul 2024
Cited by 3 | Viewed by 1680
Abstract
Remote sensing (RS) images play an indispensable role in many key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks have the problem of limited receptive fields. To address this problem, this paper introduces a hybrid network model that combines the advantages of CNN and Transformer, called MBT-UNet. First, a multi-branch encoder design based on the pyramid vision transformer (PVT) is proposed to effectively capture multi-scale feature information; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to further refine the segmentation results and enhance segmentation accuracy. We conduct experiments on the ISPRS Vaihingen dataset, the Potsdam dataset, the LoveDA dataset, and the UAVid dataset. Experimental results show that MBT-UNet surpasses state-of-the-art algorithms in key performance indicators, confirming its superior performance in high-precision remote sensing image segmentation tasks.

21 pages, 22540 KiB  
Article
JointNet: Multitask Learning Framework for Denoising and Detecting Anomalies in Hyperspectral Remote Sensing
by Yingzhao Shao, Shuhan Li, Pengfei Yang, Fei Cheng, Yueli Ding and Jianguo Sun
Remote Sens. 2024, 16(14), 2619; https://doi.org/10.3390/rs16142619 - 17 Jul 2024
Cited by 1 | Viewed by 1294
Abstract
One of the significant challenges with traditional single-task learning-based anomaly detection using noisy hyperspectral images (HSIs) is the loss of anomaly targets during denoising, especially when the noise and anomaly targets are similar. This issue significantly affects the detection accuracy. To address this problem, this paper proposes a multitask learning (MTL)-based method for detecting anomalies in noisy HSIs. Firstly, a preliminary detection approach based on the JointNet model, which decomposes the noisy HSI into a pure background and a noise–anomaly target mixing component, is introduced. This approach integrates the minimum noise fraction rotation (MNF) algorithm into an autoencoder (AE), effectively isolating the noise while retaining critical features for anomaly detection. Building upon this, the JointNet model is further optimized to ensure that the noise information is shared between the denoising and anomaly detection subtasks, preserving the integrity of the training data during the anomaly detection process and resolving the issue of losing anomaly targets during denoising. A novel loss function is designed to enable the joint learning of both subtasks under the multitask learning model. In addition, a noise score evaluation metric is introduced to calculate the probability of a pixel being an anomaly target, allowing for a clear distinction between noise and anomaly targets, thus providing the final anomaly detection results. The effectiveness of the proposed model and method is validated via testing on the HYDICE and San Diego datasets. The denoising metrics (PSNR, SSIM, and SAM) are 41.79, 0.91, and 4.350 on HYDICE and 42.83, 0.93, and 3.558 on San Diego, respectively. The anomaly detection AUC is 0.943 and 0.959, respectively. The proposed method outperforms the other algorithms, demonstrating that the reconstructed images using this method exhibit lower noise levels and more complete image information, and the JointNet model outperforms the mainstream HSI anomaly detection algorithms in both quantitative evaluation and visual effect, showcasing its improved detection capabilities.
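The PSNR and SAM metrics quoted above are standard and can be computed as in this sketch (the per-band peak value and averaging convention are assumptions; the paper may differ in detail):

```python
import numpy as np

def psnr(ref, rec, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def sam_degrees(ref, rec):
    """Mean per-pixel spectral angle, in degrees, over an (H, W, B) cube."""
    num = (ref * rec).sum(axis=-1)
    den = np.linalg.norm(ref, axis=-1) * np.linalg.norm(rec, axis=-1)
    return np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0))).mean()

ref = np.full((4, 4, 8), 0.5)   # toy "clean" cube
rec = ref + 0.05                # uniform offset: spectra stay parallel
p = psnr(ref, rec)              # 10 * log10(1 / 0.0025) ≈ 26.02 dB
s = sam_degrees(ref, rec)       # parallel spectra -> angle ≈ 0
```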
