

Object Detection and Information Extraction Based on Remote Sensing Imagery (Second Edition)

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 15 August 2025 | Viewed by 6038

Special Issue Editors


Prof. Dr. Jie Feng
Guest Editor
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi’an 710071, China
Interests: deep learning; object detection and tracking; reinforcement learning; hyperspectral image processing

Prof. Dr. Gui-Song Xia
Guest Editor
State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
Interests: mathematical models for visual information; graph matching problem and its applications; computer vision and machine learning; large-scale 3D reconstruction of visual scenes; information processing, fusion, and scene understanding in unmanned intelligent systems; interpretation and information mining of remote sensing images

Prof. Dr. Xiangrong Zhang
Guest Editor
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi’an 710071, China
Interests: remote sensing image processing; hyperspectral remote sensing; deep learning in remote sensing; change detection in remote sensing; remote sensing applications in urban planning; geospatial data analysis and modeling; SAR remote sensing

Prof. Dr. Gong Cheng
Guest Editor
School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
Interests: computer vision; pattern recognition; image processing; machine learning; deep learning; object detection and tracking; video analysis; remote sensing applications

Prof. Dr. Lichao Mou
Guest Editor
1. International AI Future Lab on AI4EO, TUM, Munich, Germany
2. Visual Learning and Reasoning Team, Department EO Data Science, DLR-IMF, Oberpfaffenhofen, Germany
Interests: natural language and earth observation; UAV video understanding; 3D structure inference from monocular optical/SAR imagery; recognition in remote sensing imagery

Special Issue Information

Dear Colleagues,

We are pleased to launch the second edition of the Remote Sensing Special Issue entitled “Object Detection and Information Extraction Based on Remote Sensing Imagery”.

Remote sensing technology has become a fundamental means of observing the Earth and has driven progress in many application fields, such as environmental surveillance, disaster monitoring, ocean situational awareness, traffic management, and military applications. However, the intelligent interpretation of remote sensing data poses unique challenges due to limited imaging capabilities, extremely high annotation costs, and insufficient multimodal data fusion. In recent years, deep learning techniques, represented by convolutional neural networks (CNNs) and transformers, have achieved remarkable success in computer vision tasks owing to their powerful feature extraction and representation capabilities. However, their application to remote sensing imagery is still relatively limited. In this Special Issue, we aim to compile state-of-the-art research on the application of machine learning methods to object detection and information extraction based on remote sensing imagery.

This Special Issue aims to present the latest advancements and emerging trends in object detection and information extraction in remote sensing imagery. Specifically, the topics of interest include, but are not limited to, the following themes:

  • Object detection and tracking in remote sensing images/videos;
  • Scene recognition, road extraction, and semantic segmentation;
  • Anomaly detection and quality evaluation of remote sensing data;
  • Multimodal remote sensing information extraction and fusion;
  • Few/zero-shot learning in remote sensing data.

Prof. Dr. Jie Feng
Prof. Dr. Gui-Song Xia
Prof. Dr. Xiangrong Zhang
Prof. Dr. Gong Cheng
Prof. Dr. Lichao Mou
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection of remote sensing images
  • object detection and tracking of remote sensing videos
  • few/zero-shot learning
  • multi-source data fusion
  • weakly supervised learning
  • semantic segmentation
  • remote sensing image classification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (4 papers)


Research

23 pages, 21858 KiB  
Article
High-Order Temporal Context-Aware Aerial Tracking with Heterogeneous Visual Experts
by Shichao Zhou, Xiangpan Fan, Zhuowei Wang, Wenzheng Wang and Yunpu Zhang
Remote Sens. 2025, 17(13), 2237; https://doi.org/10.3390/rs17132237 - 29 Jun 2025
Abstract
Visual tracking from the unmanned aerial vehicle (UAV) perspective has been at the core of many low-altitude remote sensing applications. Most aerial trackers follow the “tracking-by-detection” paradigm or its temporal-context-embedded variants, where only the visual appearance cue is used for representation learning and for estimating the spatial likelihood of the target. However, the variation of the target appearance among consecutive frames is inherently unpredictable, which degrades the robustness of the temporal context-aware representation. To address this concern, we advocate an additional visual motion cue, which exhibits predictable temporal continuity, to complete the temporal context-aware representation, and we introduce a dual-stream tracker involving explicit heterogeneous visual tracking experts. Our technical contributions are threefold: (1) a high-order temporal context-aware representation integrates motion and appearance cues over a temporal context queue; (2) bidirectional cross-domain refinement enhances feature representation through cross-attention-based mutual guidance; and (3) consistent decision-making allows for anti-drifting localization via dynamic gating and failure-aware recovery. Extensive experiments on four UAV benchmarks (UAV123, UAV123@10fps, UAV20L, and DTB70) show that our method outperforms existing aerial trackers in terms of success rate and precision, particularly in occlusion and fast-motion scenarios. Such superior tracking stability highlights its potential for real-world UAV applications.
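
The bidirectional cross-domain refinement named in contribution (2) can be pictured as two attention passes in which each feature stream queries the other. Below is a minimal PyTorch sketch of such cross-attention-based mutual guidance; the module name, token shapes, and dimensions are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class BidirectionalCrossRefine(nn.Module):
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # appearance attends to motion, and vice versa
        self.app_from_mot = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.mot_from_app = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm_app = nn.LayerNorm(dim)
        self.norm_mot = nn.LayerNorm(dim)

    def forward(self, app_tokens, mot_tokens):
        # app_tokens, mot_tokens: (batch, num_tokens, dim)
        app_ref, _ = self.app_from_mot(app_tokens, mot_tokens, mot_tokens)
        mot_ref, _ = self.mot_from_app(mot_tokens, app_tokens, app_tokens)
        # residual connections keep each stream's original cue intact
        return self.norm_app(app_tokens + app_ref), self.norm_mot(mot_tokens + mot_ref)

app = torch.randn(1, 64, 256)   # appearance tokens (hypothetical shapes)
mot = torch.randn(1, 64, 256)   # motion tokens from the temporal context queue
app_refined, mot_refined = BidirectionalCrossRefine()(app, mot)

The residual connections ensure that the refinement adds complementary evidence from the other expert rather than overwriting each stream's own cue.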

22 pages, 21162 KiB  
Article
SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Image via Shallow-Layer Enhancement and Multi-Scale Adaptation
by Zhenchuan Wu, Hang Zhen, Xiaoxinxi Zhang, Xuechen Bai and Xinghua Li
Remote Sens. 2025, 17(11), 1917; https://doi.org/10.3390/rs17111917 - 31 May 2025
Viewed by 934
Abstract
Small object detection remains a challenge in the remote sensing field due to feature loss during downsampling and interference from complex backgrounds. A novel network, termed SEMA-YOLO, is proposed in this paper as an enhanced YOLOv11-based framework incorporating three technical advancements. By fundamentally reducing information loss and incorporating a cross-scale feature fusion mechanism, the proposed framework significantly enhances small object detection performance. First, the Shallow Layer Enhancement (SLE) strategy reduces backbone depth and introduces small-object detection heads, thereby increasing feature map size and improving small object detection performance. Then, the Global Context Pooling-enhanced Adaptively Spatial Feature Fusion (GCP-ASFF) architecture is designed to optimize cross-scale feature interaction across four detection heads. Finally, the RFA-C3k2 module, which integrates Receptive Field Adaptation (RFA) with the C3k2 structure, is introduced to achieve more refined feature extraction. SEMA-YOLO demonstrates significant advantages in complex urban environments and dense target areas, while its generalization capability meets the detection requirements across diverse scenarios. The experimental results show that SEMA-YOLO achieves mAP50 scores of 72.5% on the RS-STOD dataset and 61.5% on the AI-TOD dataset, surpassing state-of-the-art models.
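
The GCP-ASFF component builds on adaptively spatial feature fusion (ASFF), in which maps from several pyramid levels are resized to a common resolution and blended with learned per-pixel weights. The following is a minimal sketch of plain ASFF for three levels; it omits the paper's global context pooling branch, and the channel counts are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleASFF(nn.Module):
    def __init__(self, channels=256, num_levels=3):
        super().__init__()
        # a 1x1 conv predicts one fusion-weight map per input level
        self.weight_conv = nn.Conv2d(channels * num_levels, num_levels, kernel_size=1)

    def forward(self, feats, out_size):
        # feats: list of (B, C, Hi, Wi) maps from different pyramid levels
        resized = [F.interpolate(f, size=out_size, mode="bilinear",
                                 align_corners=False) for f in feats]
        # per-pixel softmax so the level weights sum to one everywhere
        weights = torch.softmax(self.weight_conv(torch.cat(resized, dim=1)), dim=1)
        return sum(w * f for w, f in zip(weights.split(1, dim=1), resized))

p3 = torch.randn(1, 256, 80, 80)   # hypothetical pyramid levels
p4 = torch.randn(1, 256, 40, 40)
p5 = torch.randn(1, 256, 20, 20)
fused = SimpleASFF()([p3, p4, p5], out_size=(80, 80))  # fuse at the P3 scale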

19 pages, 21587 KiB  
Article
LocaLock: Enhancing Multi-Object Tracking in Satellite Videos via Local Feature Matching
by Lingyu Kong, Zhiyuan Yan, Hanru Shi, Ting Zhang and Lei Wang
Remote Sens. 2025, 17(3), 371; https://doi.org/10.3390/rs17030371 - 22 Jan 2025
Cited by 2 | Viewed by 1138
Abstract
Multi-object tracking (MOT) in satellite videos is a challenging task due to the small size and blurry features of objects, which often lead to intermittent detection and tracking instability. Many existing object detection and tracking models struggle with these issues, as they are not designed to handle the unique characteristics of satellite videos. To address these challenges, we propose LocaLock, a joint detection and tracking framework for MOT that incorporates feature matching concepts from single object tracking (SOT) to enhance tracking stability and reduce intermittent tracking results. Specifically, LocaLock utilizes an anchor-free detection backbone for efficiency and employs a local cost volume (LCV) module to perform precise feature matching in the local area. This provides valuable object priors to the detection head, enabling the model to “lock” onto objects with greater accuracy and mitigate the instability associated with small object detection. Additionally, the local computation within the LCV module ensures low computational complexity and memory usage. Furthermore, LocaLock incorporates a novel motion flow (MoF) module to accumulate and exploit temporal information, further enhancing feature robustness and consistency across frames. Rigorous evaluations on the VISO dataset demonstrate the superior performance of LocaLock, surpassing existing methods in tracking accuracy and precision within the demanding satellite video analysis domain. Notably, LocaLock achieved state-of-the-art performance on the VISO benchmark, reaching a multi-object tracking accuracy (MOTA) of 62.6 while maintaining a fast running speed.
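
The local cost volume restricts feature matching to a small neighbourhood around each position, which is what keeps the module's computation and memory usage low. A minimal sketch of such a local correlation follows; the search radius and tensor shapes are illustrative assumptions, not the published code.

import torch
import torch.nn.functional as F

def local_cost_volume(prev_feat, curr_feat, radius=4):
    # prev_feat, curr_feat: (B, C, H, W) features of consecutive frames
    B, C, H, W = prev_feat.shape
    k = 2 * radius + 1
    # unfold gathers a k x k neighbourhood of curr_feat around every position
    curr_patches = F.unfold(curr_feat, kernel_size=k, padding=radius)
    curr_patches = curr_patches.view(B, C, k * k, H * W)
    prev_flat = prev_feat.view(B, C, 1, H * W)
    # dot-product similarity over channels: one matching score per offset
    cost = (prev_flat * curr_patches).sum(dim=1) / C ** 0.5
    return cost.view(B, k * k, H, W)

prev = torch.randn(1, 64, 32, 32)
curr = torch.randn(1, 64, 32, 32)
cost = local_cost_volume(prev, curr)  # (1, 81, 32, 32) for radius=4

For a 32 x 32 feature map this compares each position against only 81 candidate offsets instead of all 1024 positions, which is the source of the savings.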

27 pages, 39262 KiB  
Article
Advanced Object Detection in Low-Light Conditions: Enhancements to YOLOv7 Framework
by Dewei Zhao, Faming Shao, Sheng Zhang, Li Yang, Heng Zhang, Shaodong Liu and Qiang Liu
Remote Sens. 2024, 16(23), 4493; https://doi.org/10.3390/rs16234493 - 29 Nov 2024
Cited by 5 | Viewed by 3053
Abstract
Object detection in low-light conditions is increasingly relevant across various applications, presenting a challenge for improving accuracy. This study employs the popular YOLOv7 framework and examines low-light image characteristics, implementing performance enhancement strategies tailored to these conditions. We integrate an agile hybrid convolutional module to enhance edge information extraction, improving detailed discernment in low-light scenes. Convolutional attention and deformable convolutional modules are added to extract rich semantic information. Cross-layer connection structures are established to reinforce critical information, enhancing feature representation. We use brightness-adjusted data augmentation and a novel bounding box loss function to improve detection performance. Evaluations on the ExDark dataset show that our method achieved an mAP50 of 80.1% and an mAP50:95 of 52.3%, improving by 8.6% and 11.5% over the baseline model, respectively. These results validate the effectiveness of our approach for low-light object detection.
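
Brightness-adjusted data augmentation of the kind described above can be as simple as random gamma and gain jitter applied to training images, so the detector sees a wider range of exposures. The sketch below illustrates the idea; the parameter ranges are assumptions, not the values used in the paper.

import torch

def brightness_jitter(img, gamma_range=(0.5, 2.0), gain_range=(0.8, 1.2)):
    # img: float tensor in [0, 1], shape (C, H, W)
    gamma = torch.empty(1).uniform_(*gamma_range).item()
    gain = torch.empty(1).uniform_(*gain_range).item()
    # gamma > 1 darkens, gamma < 1 brightens; gain rescales overall exposure
    return (gain * img.clamp(0, 1) ** gamma).clamp(0, 1)

img = torch.rand(3, 640, 640)   # hypothetical training image
augmented = brightness_jitter(img)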
