Deep Learning-Based Small-Target Detection in Remote Sensing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (15 March 2026) | Viewed by 11075

Special Issue Editors

Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
Interests: object detection; remote sensing and scene perception; infrared image processing
Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
Interests: synthetic aperture radar image processing; synthetic aperture radar target detection and feature analysis

Department of Computer Science, Faculty of Environment, Science and Economy, University of Exeter, Exeter EX4 4RN, UK
Interests: small-target detection; reinforcement learning; machine learning; remote sensing

Special Issue Information

Dear Colleagues,

Small target detection in remote sensing is a challenging task due to the limited pixel representation of targets, complex backgrounds, and varying environmental conditions. Recently, the application of deep learning techniques has demonstrated significant improvements in detecting small objects in remote sensing imagery.

This Special Issue aims to explore cutting-edge developments in deep learning-based small target detection, with a strong focus on remote sensing data acquisition, preprocessing, and multi-sensor integration. We encourage contributions that investigate sensor-driven enhancements, geospatial data fusion, and innovative applications of AI to detect small targets under complex observation conditions. Topics of interest include, but are not limited to, the following:

  • Small target detection in high-resolution optical, Synthetic Aperture Radar, hyperspectral, and thermal remote sensing imagery
  • Deep learning models tailored for small target identification in remote sensing data
  • Noise reduction and contrast enhancement methods for improving target visibility
  • Multi-sensor fusion strategies for detecting small objects in complex environments
  • Change detection and anomaly-based approaches for identifying small targets
  • Advances in Synthetic Aperture Radar imaging techniques combined with AI for small object recognition
  • Spatio-temporal modeling of small target movement using remote sensing and AI
  • UAV-based remote sensing for real-time small target detection and tracking
  • Applications in environmental monitoring, disaster response, maritime surveillance, and security

We look forward to receiving your contributions.

Dr. Yuhan Liu
Prof. Dr. Zhenming Peng
Dr. Fei Teng
Dr. Xiaoyang Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • small target detection
  • deep learning
  • remote sensing image processing
  • hyperspectral and infrared sensing
  • multi-sensor data fusion
  • feature extraction
  • target recognition

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (8 papers)


Research

25 pages, 42196 KB  
Article
Frequency–Spatial Domain Jointly Guided Perceptual Network for Infrared Small Target Detection
by Yeteng Han, Minrui Ye, Bohan Liu, Jie Li, Chaoxian Jia, Wennan Cui and Tao Zhang
Remote Sens. 2026, 18(7), 1000; https://doi.org/10.3390/rs18071000 - 26 Mar 2026
Viewed by 556
Abstract
Infrared small target detection is a critical task in remote sensing. However, it remains highly challenging due to low contrast, heavy background clutter, and large variations in target scale. Traditional convolutional networks are inadequate for joint modeling, as they cannot effectively capture both fine structural details and global contextual dependencies. To address these issues, we propose FSGPNet, a frequency–spatial domain jointly guided perceptual network that explicitly exploits complementary representations in both the frequency and spatial domains. Specifically, a Frequency–Spatial Enhancement Module (FSEM) is introduced to strengthen target details while suppressing background interference through high-frequency enhancement and Perona–Malik diffusion. To enhance global context modeling, we propose a Multi-Scale Global Perception (MSGP) module that integrates non-local attention with multi-scale dilated convolutions, enabling robust background modeling. Furthermore, a Gabor Transformer Attention Module (GTAM) is designed to achieve selective frequency–spatial feature aggregation via self-attention over multi-directional and multi-scale Gabor responses, effectively highlighting discriminative structures of various small targets. Extensive experiments are conducted on two benchmark datasets (IRSTD-1K and NUDT-SIRST) that cover typical remote sensing infrared scenarios. Quantitative and qualitative results demonstrate that FSGPNet consistently outperforms state-of-the-art methods across multiple evaluation metrics. These findings validate the effectiveness and robustness of the proposed FSGPNet for detecting small infrared targets in remote sensing applications. Full article
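The GTAM described above attends over multi-directional, multi-scale Gabor responses. As an illustrative sketch only (the paper's actual filter parameters and attention design are not given here), a bank of oriented Gabor kernels might be generated as follows; `size`, `sigma`, and `wavelength` are assumed placeholder values:

```python
import math

def gabor_kernel(size, theta, sigma=2.0, wavelength=4.0):
    """Real part of a 2-D Gabor filter: a Gaussian envelope times a
    cosine carrier oriented at angle `theta` (radians)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            carrier = math.cos(2 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# A bank of four orientations, as a multi-directional front end might use.
bank = [gabor_kernel(7, k * math.pi / 4) for k in range(4)]
```

Each kernel in such a bank responds most strongly to edges perpendicular to its orientation, which is why multi-directional responses help separate small structured targets from unstructured clutter.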
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)

24 pages, 38139 KB  
Article
Improved Multispectral Target Detection Using Target-Specific Spectral Reconstruction
by Nicola Acito, Michael Alibani and Marco Diani
Remote Sens. 2026, 18(5), 760; https://doi.org/10.3390/rs18050760 - 3 Mar 2026
Viewed by 342
Abstract
Hyperspectral sensors provide high spectral resolution, enabling accurate material discrimination and effective target detection. However, their practical use is constrained by limited spatial resolution and high acquisition costs. This paper proposes a novel framework to enhance small-target detection in multispectral imagery by leveraging deep learning-based spectral reconstruction to generate high-resolution hyperspectral representations from multispectral inputs. Two state-of-the-art reconstruction networks, MST++ and MIRNet, are trained using paired multispectral–hyperspectral samples derived from AVIRIS-NG data through proper spectral response functions. To improve discriminative capability for the target of interest, a rapid, target-specific fine-tuning stage is introduced, allowing the models to adapt to spectral signatures that are poorly represented or absent in the original training data. Target detection is performed using a spectral signature-based detector applied to the reconstructed hyperspectral data. The proposed framework is evaluated in a real-world scenario involving known field-deployed targets and hyperspectral imagery acquired from an unmanned aerial vehicle. Experimental results demonstrate that the proposed approach significantly outperforms baseline detection applied directly to multispectral data. These findings underscore the effectiveness of spectral reconstruction for downstream tasks such as target detection, particularly in scenarios where hyperspectral data are expensive or unavailable. Full article
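The training pairs above are produced by applying spectral response functions (SRFs) to hyperspectral data. A minimal sketch of that simulation step, assuming each SRF is simply a per-band weight vector (the function name and values below are illustrative, not from the paper):

```python
def apply_srf(hyper_pixel, srfs):
    """Project one hyperspectral pixel (list of band radiances) onto
    multispectral bands using normalized spectral response functions."""
    ms = []
    for srf in srfs:
        weight_sum = sum(srf)  # normalize so each band is a weighted mean
        ms.append(sum(r * w for r, w in zip(hyper_pixel, srf)) / weight_sum)
    return ms

# Four hyperspectral bands collapsed into two broad multispectral bands.
ms_pixel = apply_srf([1.0, 2.0, 3.0, 4.0], [[1, 1, 0, 0], [0, 0, 1, 1]])
```

Running the reconstruction network in the opposite direction (multispectral in, hyperspectral out) is what the paper's fine-tuning stage then adapts to the target's signature.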

28 pages, 66640 KB  
Article
SSABNet: Spatial-Semantic Aggregation and Balancing Network for Small-Target Detection in UAV Remote Sensing Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Shigeng Wang, Yang Guo and Meina Song
Remote Sens. 2026, 18(4), 550; https://doi.org/10.3390/rs18040550 - 9 Feb 2026
Viewed by 484
Abstract
The precise localization of small objects in UAV-captured remote sensing imagery remains a formidable challenge due to their limited spatial support, coarse resolution, and severe background clutter. These factors often cause weak target cues to be progressively overwhelmed during deep feature extraction. Existing deep learning-based detectors typically suffer from two fundamental limitations: the irreversible loss of fine-grained spatial details during hierarchical feature fusion and the scale-insensitive optimization of conventional loss functions, which inadequately emphasize hard-to-detect small targets. To address these issues, we propose a novel Spatial-Semantic Aggregation and Balancing Network (SSABNet) tailored for UAV-based small-target detection. First, a Spatial-Semantic Aggregation (SSA) module is introduced to establish a high-fidelity restoration pathway that recovers fine-grained texture and boundary information from shallow layers. By employing content-aware operators, SSA effectively reconciles the structural discrepancy between spatial details and semantic abstractions, enabling precise cross-scale feature fusion while suppressing aliasing artifacts. Second, we design a Scale-Aware Balancing Loss (SABL) to mitigate the gradient instability and vanishing-gradient issues commonly encountered when optimizing non-overlapping small targets. SABL adopts a scale-dependent modulation mechanism that smoothly transitions from Wasserstein distance for distributional alignment of small objects to Euclidean distance for geometric refinement of larger targets, thereby ensuring stable and balanced optimization across object scales. Extensive experiments on the VisDrone benchmark demonstrate that SSABNet outperforms state-of-the-art detectors, achieving gains of 1.3% in overall AP and 2.5% in APs. Further evaluation on the UAVDT dataset confirms its strong generalization capability, yielding improvements of 0.5% in AP and 16.9% in APs. These results validate the effectiveness of jointly addressing feature representation and scale-aware optimization for UAV small-target detection. Full article
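The abstract describes SABL as a smooth transition between a Wasserstein distance for small boxes and a Euclidean distance for larger ones. A hedged sketch of such a scale-dependent blend, using the closed-form 2-Wasserstein distance between boxes modeled as 2-D Gaussians; the `pivot` and `k` parameters are invented for illustration and do not reproduce the paper's actual modulation:

```python
import math

def wasserstein_box(b1, b2):
    """2-Wasserstein distance between boxes (cx, cy, w, h) modeled as
    axis-aligned 2-D Gaussians with std devs w/2 and h/2."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = b1, b2
    return math.sqrt((cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
                     + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2)

def euclidean_center(b1, b2):
    """Plain center-point distance, useful once boxes overlap reliably."""
    return math.hypot(b1[0] - b2[0], b1[1] - b2[1])

def scale_aware_distance(pred, gt, pivot=32.0, k=0.2):
    """Blend the two distances: small ground-truth boxes lean on the
    Wasserstein term, large boxes on the Euclidean term."""
    scale = math.sqrt(gt[2] * gt[3])                       # geometric mean of w, h
    alpha = 1.0 / (1.0 + math.exp(-k * (scale - pivot)))   # ~0 small, ~1 large
    return (1 - alpha) * wasserstein_box(pred, gt) + alpha * euclidean_center(pred, gt)
```

The Wasserstein term stays smooth even when boxes do not overlap, which is exactly the regime where IoU-style losses for tiny objects produce vanishing gradients.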

28 pages, 4151 KB  
Article
FANet: Frequency-Aware Attention-Based Tiny-Object Detection in Remote Sensing Images
by Zixiao Wen, Peifeng Li, Yuhan Liu, Jingming Chen, Xiantai Xiang, Yuan Li, Huixian Wang, Yongchao Zhao and Guangyao Zhou
Remote Sens. 2025, 17(24), 4066; https://doi.org/10.3390/rs17244066 - 18 Dec 2025
Cited by 4 | Viewed by 1377
Abstract
In recent years, deep learning-based remote sensing object detection has achieved remarkable progress, yet the detection of tiny objects remains a significant challenge. Tiny objects in remote sensing images typically occupy only a few pixels, resulting in low contrast, poor resolution, and high sensitivity to localization errors. Their diverse scales and appearances, combined with complex backgrounds and severe class imbalance, further complicate the detection tasks. Conventional spatial feature extraction methods often struggle to capture the discriminative characteristics of tiny objects, especially in the presence of noise and occlusion. To address these challenges, we propose a frequency-aware attention-based tiny-object detection network with two plug-and-play modules that leverage frequency-domain information to enhance the targets. Specifically, we introduce a Multi-Scale Frequency Feature Enhancement Module (MSFFEM) to adaptively highlight the contour and texture details of tiny objects while suppressing background noise. Additionally, a Channel Attention-based RoI Enhancement Module (CAREM) is proposed to selectively emphasize high-frequency responses within RoI features, further improving object localization and classification. Furthermore, to mitigate sample imbalance, we employ multi-directional flip sample augmentation and redundancy filtering strategies, which significantly boost detection performance for few-shot categories. Extensive experiments on public object detection datasets, i.e., AI-TOD, VisDrone2019, and DOTA-v1.5, demonstrate that the proposed FANet consistently improves detection performance for tiny objects, outperforming existing methods and providing new insights into the integration of frequency-domain analysis and attention mechanisms for robust tiny-object detection in remote sensing applications. Full article
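MSFFEM is described as highlighting the contours and textures of tiny objects while suppressing background noise. As a loose illustration of high-frequency enhancement (not the paper's module), an unsharp-mask style filter that amplifies the residual between an image and its local low-pass average:

```python
def highpass_enhance(image, gain=1.0):
    """Boost high-frequency detail: subtract a 3x3 box blur (low-pass)
    from each interior pixel and add the residual back, scaled by `gain`."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # borders are left unchanged
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            low = sum(image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
            out[y][x] = image[y][x] + gain * (image[y][x] - low)
    return out
```

A flat background produces a zero residual and passes through untouched, while a few-pixel bright target sits well above its local average and is amplified, which is the basic intuition behind frequency-aware enhancement of tiny objects.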

31 pages, 4757 KB  
Article
MFEF-YOLO: A Multi-Scale Feature Extraction and Fusion Network for Small Object Detection in Aerial Imagery over Open Water
by Qi Liu, Haiyang Yu, Ping Zhang, Tingting Geng, Xinru Yuan, Bingqian Ji, Shengmin Zhu and Ruopu Ma
Remote Sens. 2025, 17(24), 3996; https://doi.org/10.3390/rs17243996 - 11 Dec 2025
Cited by 3 | Viewed by 1188
Abstract
Current object detection using UAV platforms in open water faces challenges such as low detection accuracy, limited storage, and constrained computational capabilities. To address these issues, we propose MFEF-YOLO, a small object detection network based on multi-scale feature extraction and fusion. First, we introduce a Dual-Branch Spatial Pyramid Pooling Fast (DBSPPF) module in the backbone network to replace the original SPPF module, while integrating ODConv and C3k2 modules to collectively enhance feature extraction capabilities. Second, we improve small object detection by adding a P2 detection head and reduce model parameters by removing the P5 detection head. Finally, we design an Island-based Multi-scale Feature Fusion Network (IMFFNet) and employ a Coordinate-guided Multi-scale Feature Fusion Module (CMFFM) to strengthen contextual information and boost detection accuracy. We validate the effectiveness of MFEF-YOLO using the public dataset SeaDronesSee and our custom dataset TPDNV. Experimental results show that compared to the baseline model, mAP50 improves by 0.11 and 0.03 using the two datasets, respectively, while model parameters are reduced by 11.54%. Furthermore, DBSPPF and IMFFNet demonstrate superior performance in comparative studies with other methods, confirming their effectiveness. These improvements and outstanding performance make MFEF-YOLO particularly suitable for UAV-based object detection in open waters. Full article

29 pages, 48102 KB  
Article
Infrared Temporal Differential Perception for Space-Based Aerial Targets
by Lan Guo, Xin Chen, Cong Gao, Zhiqi Zhao and Peng Rao
Remote Sens. 2025, 17(20), 3487; https://doi.org/10.3390/rs17203487 - 20 Oct 2025
Viewed by 1062
Abstract
Space-based infrared (IR) detection, with wide coverage, all-time operation, and stealth, is crucial for aerial target surveillance. Under low signal-to-noise ratio (SNR) conditions, however, its small target size, limited features, and strong clutters often lead to missed detections and false alarms, reducing stability and real-time performance. To overcome these issues of energy-integration imaging in perceiving dim targets, this paper proposes a biomimetic vision-inspired Infrared Temporal Differential Detection (ITDD) method. The ITDD method generates sparse event streams by triggering pixel-level radiation variations and establishes an irradiance-based sensitivity model with optimized threshold voltage, spectral bands, and optical aperture parameters. IR sequences are converted into differential event streams with inherent noise, upon which a lightweight multi-modal fusion detection network is developed. Simulation experiments demonstrate that ITDD reduces data volume by three orders of magnitude and improves the SNR by 4.21 times. On the SITP-QLEF dataset, the network achieves a detection rate of 99.31% and a false alarm rate of 1.97 × 10⁻⁵, confirming its effectiveness and application potential under complex backgrounds. As the current findings are based on simulated data, future work will focus on building an ITDD demonstration system to validate the approach with real-world IR measurements. Full article
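ITDD converts IR sequences into sparse differential event streams by thresholding pixel-level radiation changes. A simplified sketch of that conversion step (the threshold value and the ON/OFF tuple encoding are illustrative; the paper's irradiance-based sensitivity model is not reproduced here):

```python
def differential_events(prev_frame, curr_frame, threshold=0.1):
    """Emit sparse (x, y, polarity) events wherever the pixel change
    between consecutive frames exceeds a contrast threshold."""
    events = []
    for y, (row_p, row_c) in enumerate(zip(prev_frame, curr_frame)):
        for x, (p, c) in enumerate(zip(row_p, row_c)):
            delta = c - p
            if delta > threshold:        # brightening: ON event
                events.append((x, y, 1))
            elif delta < -threshold:     # dimming: OFF event
                events.append((x, y, -1))
    return events
```

Static background pixels emit nothing, which is how a differential representation can cut data volume by orders of magnitude while still registering a moving dim target.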

24 pages, 4844 KB  
Article
DSAD: Multi-Directional Contrast Spatial Attention-Driven Feature Distillation for Infrared Small Target Detection
by Yonghao Li, Boyang Li, Guoliang Zhang, Jun Chen, Siyi Deng and Hanxiao Zhang
Remote Sens. 2025, 17(20), 3466; https://doi.org/10.3390/rs17203466 - 17 Oct 2025
Cited by 4 | Viewed by 1173
Abstract
Recent deep learning methods have achieved promising performance in infrared small target detection (IRSTD) but with high computational cost, limiting deployment or operation on resource-limited scenarios. There is an urgent need to develop both lightweight and high-precision model compression methods. In this paper, we propose a Multi-Directional Contrast Spatial Attention-driven Feature Distillation (DSAD) method for achieving quick and high-performance IRSTD. Specifically, we first extract feature maps from teacher and student networks. Then, a standard Gaussian transformation is adopted to eliminate magnitude effects. After that, a Multi-Directional Contrast Spatial Attention (DSA) is designed to capture multi-directional spatial information from teacher features, which can make student networks pay more attention to small target areas while suppressing background. Finally, we propose a Perceptual Weighted Mean Square Error (PWMSE) distillation loss by combining the DSA with feature discrepancies, guiding student networks to learn more effective information from small target features. Experimental results on the two benchmark datasets (e.g., NUDT-SIRST and NUAA-SIRST) demonstrate that our distillation method can achieve remarkable detection performance compared with the teacher counterparts on several benchmark IRSTD networks (e.g., DNANet, AMFU-Net, and DMFNet) and introduce consistent gains in inference speed (i.e., 2× more) on edge devices (NVIDIA AGX and HUAWEI Ascend-310B). Full article
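The DSAD pipeline standardizes teacher and student features with a Gaussian transformation to remove magnitude effects, then applies an attention-weighted distillation loss. A minimal sketch of those two steps on flattened feature maps (the `attention` weights here merely stand in for the paper's DSA maps, and the loss is plain weighted MSE rather than the full PWMSE):

```python
import math

def standardize(feat):
    """Map a flattened feature map to zero mean and unit variance,
    so teacher/student magnitudes no longer dominate the loss."""
    n = len(feat)
    mean = sum(feat) / n
    var = sum((v - mean) ** 2 for v in feat) / n
    std = math.sqrt(var) or 1.0          # guard against constant maps
    return [(v - mean) / std for v in feat]

def weighted_mse(teacher, student, attention):
    """Attention-weighted MSE between standardized feature maps:
    larger weights push the student toward small-target regions."""
    t, s = standardize(teacher), standardize(student)
    return sum(a * (tv - sv) ** 2
               for a, tv, sv in zip(attention, t, s)) / len(t)
```

When the attention map concentrates on target pixels, mismatches on background pixels contribute little, so the compact student spends its capacity where the teacher's small-target evidence lives.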

32 pages, 8925 KB  
Article
HSF-DETR: Hyper Scale Fusion Detection Transformer for Multi-Perspective UAV Object Detection
by Yi Mao, Haowei Zhang, Rui Li, Feng Zhu, Rui Sun and Pingping Ji
Remote Sens. 2025, 17(12), 1997; https://doi.org/10.3390/rs17121997 - 9 Jun 2025
Cited by 7 | Viewed by 3457
Abstract
Unmanned aerial vehicle (UAV) imagery detection faces challenges in preserving small object features during multi-level downsampling, handling angle and altitude-dependent variations in aerial scenes, achieving accurate localization in dense environments, and performing real-time detection. To address these limitations, we propose HSF-DETR, a lightweight transformer-based detector specifically designed for UAV imagery. First, we design a hybrid progressive fusion network (HPFNet) as the backbone, which adaptively modulates receptive fields to capture multi-scale information while preserving fine-grained details critical for small object detection. Second, building upon features extracted by HPFNet, we develop MultiScaleNet, which enhances feature representation through dual-layer optimization and cross-domain feature learning, significantly improving the model’s capability to handle complex aerial scenarios with diverse object orientations. Finally, to address spatial–semantic alignment challenges, we devise a position-aware align context and spatial tuning (PACST) module that ensures effective feature calibration through precise alignment and adaptive fusion across scales. This hierarchical architecture is complemented by our novel AdaptDist-IoU loss with dynamic weight allocation, which enhances localization accuracy, particularly in dense environments. Extensive experiments using standard detection metrics (mAP50 and mAP50:95) on the VisDrone2019 test dataset demonstrate that HSF-DETR achieves superior performance with 0.428 mAP50 (+5.4%) and 0.253 mAP50:95 (+4%) when compared with RT-DETR, while maintaining real-time inference (69.3 FPS) on an NVIDIA RTX 4090D GPU with only 15.24M parameters and 63.6 GFLOPs. Further validation across multiple public remote sensing datasets confirms the robust generalization capability of HSF-DETR in diverse aerial scenarios, offering a practical solution for resource-constrained UAV applications where both detection quality and processing speed are crucial. Full article