Search Results (248)

Search Parameters:
Keywords = small-scale drones

20 pages, 49845 KB  
Article
DDF-YOLO: A Small Target Detection Model Using Multi-Scale Dynamic Feature Fusion for UAV Aerial Photography
by Ziang Ma, Chao Wang, Chuanzhi Chen, Jinbao Chen and Guang Zheng
Aerospace 2025, 12(10), 920; https://doi.org/10.3390/aerospace12100920 - 13 Oct 2025
Abstract
Unmanned aerial vehicle (UAV)-based object detection shows promising potential in intelligent transportation and disaster response. However, detecting small targets remains challenging due to inherent limitations (long-distance and low-resolution imaging) and environmental interference (complex backgrounds and occlusions). To address these issues, this paper proposes an enhanced small target detection model, DDF-YOLO, which achieves higher detection performance. First, a dynamic feature extraction module (C2f-DCNv4) employs deformable convolutions to effectively capture features from irregularly shaped objects. In addition, a dynamic upsampling module (DySample) optimizes multi-scale feature fusion by combining shallow spatial details with deep semantic features, preserving critical low-level information while enhancing generalization across scales. Finally, to balance rapid convergence with precise localization, an adaptive Focaler-ECIoU loss function dynamically adjusts training weights based on sample quality during bounding box regression. Extensive experiments on VisDrone2019 and UAVDT benchmarks demonstrate DDF-YOLO’s superiority. Compared to YOLOv8n, our model achieves gains of 8.6% and 4.8% in mAP50, along with improvements of 5.0% and 3.3% in mAP50-95, respectively. Furthermore, it exhibits superior efficiency, requiring only 7.3 GFLOPs and attaining an inference speed of 179 FPS. These results validate the model’s robustness for UAV-based detection, particularly in small-object scenarios. Full article
(This article belongs to the Section Aeronautics)
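DDF-YOLO's Focaler-ECIoU loss is described as re-weighting box-regression samples by quality. As a rough illustration of the Focaler idea (not the paper's exact loss), the sketch below linearly remaps IoU over an interval [d, u]; the bounds and the omitted ECIoU geometric penalty are assumptions.

```python
import torch

def focaler_iou_loss(iou: torch.Tensor, d: float = 0.0, u: float = 0.95) -> torch.Tensor:
    """Hedged sketch: linearly remap IoU over [d, u], clamp to [0, 1], take 1 - result.

    Boxes with IoU below d get maximal loss, boxes above u get none, so the
    interval picks which quality band dominates training. The ECIoU geometric
    penalty that DDF-YOLO adds on top is omitted here.
    """
    iou_focaler = ((iou - d) / (u - d)).clamp(min=0.0, max=1.0)
    return 1.0 - iou_focaler

# Three predicted boxes of increasing quality.
print(focaler_iou_loss(torch.tensor([0.10, 0.50, 0.97])))  # tensor([0.8947, 0.4737, 0.0000])
```

Tuning d and u toward lower values concentrates the learning signal on hard, low-IoU samples; raising them emphasizes already well-localized boxes.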

39 pages, 13725 KB  
Article
SRTSOD-YOLO: Stronger Real-Time Small Object Detection Algorithm Based on Improved YOLO11 for UAV Imageries
by Zechao Xu, Huaici Zhao, Pengfei Liu, Liyong Wang, Guilong Zhang and Yuan Chai
Remote Sens. 2025, 17(20), 3414; https://doi.org/10.3390/rs17203414 - 12 Oct 2025
Viewed by 177
Abstract
To address the challenges of small target detection in UAV aerial images—such as difficulty in feature extraction, complex background interference, high miss rates, and stringent real-time requirements—this paper proposes an innovative model series named SRTSOD-YOLO, based on YOLO11. The backbone network incorporates a Multi-scale Feature Complementary Aggregation Module (MFCAM), designed to mitigate the loss of small target information as network depth increases. By integrating channel and spatial attention mechanisms with multi-scale convolutional feature extraction, MFCAM effectively locates small objects in the image. Furthermore, we introduce a novel neck architecture termed Gated Activation Convolutional Fusion Pyramid Network (GAC-FPN). This module enhances multi-scale feature fusion by emphasizing salient features while suppressing irrelevant background information. GAC-FPN employs three key strategies: adding a detection head with a small receptive field while removing the original largest one, leveraging large-scale features more effectively, and incorporating gated activation convolutional modules. To tackle the issue of positive-negative sample imbalance, we replace the conventional binary cross-entropy loss with an adaptive threshold focal loss in the detection head, accelerating network convergence. Additionally, to accommodate diverse application scenarios, we develop multiple versions of SRTSOD-YOLO by adjusting the width and depth of the network modules: a nano version (SRTSOD-YOLO-n), small (SRTSOD-YOLO-s), medium (SRTSOD-YOLO-m), and large (SRTSOD-YOLO-l). Experimental results on the VisDrone2019 and UAVDT datasets demonstrate that SRTSOD-YOLO-n improves the mAP@0.5 by 3.1% and 1.2% compared to YOLO11n, while SRTSOD-YOLO-l achieves gains of 7.9% and 3.3% over YOLO11l, respectively. Compared to other state-of-the-art methods, SRTSOD-YOLO-l attains the highest detection accuracy while maintaining real-time performance, underscoring the superiority of the proposed approach. Full article
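GAC-FPN's "gated activation convolutional modules" are only named in the abstract; one common way to realize a gated convolution is shown below, where a sigmoid gate branch scales a content branch so that background activations can be suppressed before fusion. The channel sizes here are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class GatedConv(nn.Module):
    """One common gated-convolution layout: a content branch scaled by a sigmoid gate."""

    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.feat = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)  # content branch
        self.gate = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)  # gating branch

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Strong responses pass through; background activations are damped by the gate,
        # the role the abstract assigns to the gated modules in GAC-FPN.
        return self.feat(x) * torch.sigmoid(self.gate(x))

x = torch.randn(1, 64, 80, 80)
print(GatedConv(64, 128)(x).shape)  # torch.Size([1, 128, 80, 80])
```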

24 pages, 76400 KB  
Article
MBD-YOLO: An Improved Lightweight Multi-Scale Small-Object Detection Model for UAVs Based on YOLOv8
by Bo Xu, Di Cai, Kelin Sui, Zheng Wang, Chuangchuang Liu and Xiaolong Pei
Appl. Sci. 2025, 15(20), 10877; https://doi.org/10.3390/app152010877 - 10 Oct 2025
Viewed by 245
Abstract
To address the challenges of low detection accuracy and weak generalization in UAV aerial imagery caused by complex ground environments, significant scale variations among targets, dense small objects, and background interference, this paper proposes an improved lightweight multi-scale small-object detection model, MBD-YOLO (MBFF module, BiMS-FPN, and Dual-Stream Head). Specifically, to enhance multi-scale feature extraction capabilities, we introduce the Multi-Branch Feature Fusion (MBFF) module, which dynamically adjusts receptive fields through parallel branches and adaptive depthwise convolutions, expanding the receptive field while preserving detail perception. We further design a lightweight Bidirectional Multi-Scale Feature Aggregation Pyramid Network (BiMS-FPN), integrating bidirectional propagation paths and a Multi-Scale Feature Aggregation (MSFA) module to mitigate feature spatial misalignment and improve small-target detection. Additionally, the Dual-Stream Head with NMS-free architecture leverages a task-aligned architecture and dynamic matching strategies to boost inference speed without compromising accuracy. Experiments on the VisDrone2019 dataset demonstrate that MBD-YOLO-n surpasses YOLOv8n by 6.3% in mAP50 and 8.2% in mAP50–95, with accuracy gains of 17.96–55.56% for several small-target categories, while increasing parameters by merely 3.1%. Moreover, MBD-YOLO-s achieves superior detection accuracy, efficiency, and generalization with only 12.1 million parameters, outperforming state-of-the-art models and proving suitable for resource-constrained embedded deployment scenarios. The superior performance of MBD-YOLO, which harmonizes high precision with low computational demand, fulfills the critical requirements for real-time deployment on resource-limited UAVs, showing great promise for applications in traffic monitoring, urban security, and agricultural surveying. Full article
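The MBFF module is described as widening the receptive field through parallel branches and adaptive depthwise convolutions. The sketch below shows that general pattern under assumed kernel sizes (3, 5, 7) and a plain 1x1 fusion convolution; the dynamic receptive-field adjustment of the actual module is not reproduced.

```python
import torch
import torch.nn as nn

class MultiBranchDW(nn.Module):
    """Parallel depthwise branches with different kernel sizes, fused by a 1x1 conv."""

    def __init__(self, ch: int, kernels=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, k, padding=k // 2, groups=ch) for k in kernels
        )
        self.fuse = nn.Conv2d(ch * len(kernels), ch, 1)  # pointwise fusion across branches

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each depthwise branch sees a different receptive field; concatenating and
        # fusing them widens the effective receptive field at low cost.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

print(MultiBranchDW(32)(torch.randn(1, 32, 40, 40)).shape)  # torch.Size([1, 32, 40, 40])
```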

14 pages, 1304 KB  
Article
RoadNet: A High-Precision Transformer-CNN Framework for Road Defect Detection via UAV-Based Visual Perception
by Long Gou, Yadong Liang, Xingyu Zhang and Jianfeng Yang
Drones 2025, 9(10), 691; https://doi.org/10.3390/drones9100691 - 9 Oct 2025
Viewed by 110
Abstract
Automated road defect detection using Unmanned Aerial Vehicles (UAVs) has emerged as an efficient and safe solution for large-scale infrastructure inspection. However, object detection in aerial imagery poses unique challenges, including the prevalence of extremely small targets, complex backgrounds, and significant scale variations. Mainstream deep learning-based detection models often struggle with these issues, exhibiting limitations in detecting small cracks, high computational demands, and insufficient generalization ability for UAV perspectives. To address these challenges, this paper proposes a novel comprehensive network, RoadNet, specifically designed for high-precision road defect detection in UAV-captured imagery. RoadNet innovatively integrates Transformer modules with a convolutional neural network backbone and detection head. This design not only significantly enhances the global feature modeling capability crucial for understanding complex aerial contexts but also maintains the computational efficiency necessary for potential real-time applications. The model was trained and evaluated on a self-collected UAV road defect dataset (UAV-RDD). In comparative experiments, RoadNet achieved an outstanding mAP@0.5 score of 0.9128 while maintaining a processing speed of 210.01 ms per image, outperforming other state-of-the-art models. The experimental results demonstrate that RoadNet possesses superior detection performance for road defects in complex aerial scenarios captured by drones. Full article
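The abstract does not detail how RoadNet couples its Transformer modules with the CNN backbone; the snippet below is only a generic sketch of the usual pattern, flattening a CNN feature map into tokens, applying a Transformer encoder layer for global context, and reshaping back for a convolutional head. The stand-in backbone stage, channel width, and input size are assumptions.

```python
import torch
import torch.nn as nn

cnn = nn.Conv2d(3, 256, kernel_size=7, stride=4, padding=3)                   # stand-in backbone stage
encoder = nn.TransformerEncoderLayer(d_model=256, nhead=8, batch_first=True)

img = torch.randn(1, 3, 224, 224)
feat = cnn(img)                                  # (1, 256, 56, 56) feature map
tokens = feat.flatten(2).transpose(1, 2)         # (1, 3136, 256): one token per location
tokens = encoder(tokens)                         # global self-attention over the whole map
feat = tokens.transpose(1, 2).reshape_as(feat)   # back to (1, 256, 56, 56) for a conv head
print(feat.shape)
```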

14 pages, 2759 KB  
Article
Unmanned Airborne Target Detection Method with Multi-Branch Convolution and Attention-Improved C2F Module
by Fangyuan Qin, Weiwei Tang, Haishan Tian and Yuyu Chen
Sensors 2025, 25(19), 6023; https://doi.org/10.3390/s25196023 - 1 Oct 2025
Viewed by 181
Abstract
In this paper, a target detection network based on a multi-branch convolution and attention-improved Cross-Stage Partial-Fusion Bottleneck with Two Convolutions (C2F) module is proposed for the difficult task of detecting small targets from unmanned aerial vehicles. A C2F variant that fuses partial convolutional (PConv) layers was designed to improve the speed and efficiency of feature extraction, and multi-scale feature fusion combined with a channel-spatial attention mechanism was applied in the neck network. An FA-Block module was designed to improve feature fusion and attention to small targets’ features; this design enlarges the minuscule target layer, allowing richer feature information about the small targets to be retained. Finally, the lightweight up-sampling operator Content-Aware ReAssembly of Features was used to replace the original up-sampling method to expand the network’s receptive field. Experimental tests were conducted on a self-compiled mountain pedestrian dataset and the public VisDrone dataset. Compared with the base algorithm, the improved algorithm improved the mAP50, mAP50-95, P-value, and R-value by 2.8%, 3.5%, 2.3%, and 0.2%, respectively, on the Mountain Pedestrian dataset and by 9.2%, 6.4%, 7.7%, and 7.6%, respectively, on the VisDrone dataset. Full article
(This article belongs to the Section Sensing and Imaging)
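Partial convolution (PConv) convolves only a fraction of the channels and passes the remainder through unchanged, which is what makes a PConv-fused C2F block cheaper. A minimal sketch follows; the 1/4 convolved-channel ratio is an assumption.

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Convolve only the first `ratio` of the channels; pass the rest through untouched."""

    def __init__(self, ch: int, ratio: float = 0.25, k: int = 3):
        super().__init__()
        self.conv_ch = int(ch * ratio)
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = x[:, : self.conv_ch], x[:, self.conv_ch :]
        return torch.cat([self.conv(x1), x2], dim=1)  # untouched channels keep FLOPs and memory traffic low

print(PConv(64)(torch.randn(1, 64, 40, 40)).shape)  # torch.Size([1, 64, 40, 40])
```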

24 pages, 57744 KB  
Article
A Small Landslide as a Big Lesson: Drones and GIS for Monitoring and Teaching Slope Instability
by Benito Zaragozí, Pablo Giménez-Font, Joan Cano-Aladid and Juan Antonio Marco-Molina
Geosciences 2025, 15(10), 375; https://doi.org/10.3390/geosciences15100375 - 30 Sep 2025
Viewed by 351
Abstract
Small landslides, though frequent, are often overlooked despite their significant potential impact on human-affected areas. This study presents an analysis of the Bella Orxeta landslide in Alicante, Spain, a rotational landslide event that occurred in March 2017 following intense and continued rainfall. Utilizing multitemporal datasets, including LiDAR from 2009 and 2016 and drone-based photogrammetry from 2021 and 2023, we generated high-resolution digital terrain models (DTMs) to assess morphological changes, estimate displaced volumes of approximately 3500 cubic meters, and monitor slope activity. Our analysis revealed substantial mass movement between 2016 and 2021, followed by relatively minor changes between 2021 and 2023, primarily related to fluvial erosion. This study demonstrates the effectiveness of UAV and DTM differencing techniques for landslide detection, volumetric analysis, and long-term monitoring in urbanized settings. Beyond its scientific contributions, the Bella Orxeta case offers pedagogical value across academic disciplines, supporting practical training in geomorphology, geotechnical assessment, GIS, and risk planning. It also highlights policy gaps in existing territorial risk plans, particularly regarding the integration of modern monitoring tools for small-scale but recurrent geohazards. Given climate change projections indicating more frequent high-intensity rainfall events in Mediterranean areas, the paper advocates for the systematic documentation of local landslide cases to improve hazard preparedness, urban resilience, and geoscience education. Full article
(This article belongs to the Special Issue Remote Sensing Monitoring of Geomorphological Hazards)
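The displaced-volume estimate above comes from differencing successive terrain models. A minimal numpy sketch of such a DEM-of-Difference calculation is given below; the grid, the 25 cm cell size, the 0.1 m change-detection threshold, and the synthetic elevations are placeholders rather than the study's data.

```python
import numpy as np

cell_area = 0.25 * 0.25                            # 25 cm DTM resolution -> m^2 per cell
dtm_before = 300.0 + np.random.rand(400, 400) * 2  # placeholder elevation grids (m)
dtm_after = dtm_before - np.random.rand(400, 400) * 0.5

dz = dtm_after - dtm_before                        # elevation change per cell (m)
dz[np.abs(dz) < 0.1] = 0.0                         # ignore change below an uncertainty threshold
eroded = -dz[dz < 0].sum() * cell_area             # displaced (lost) volume, m^3
deposited = dz[dz > 0].sum() * cell_area           # accumulated volume, m^3
print(f"erosion: {eroded:.0f} m^3, deposition: {deposited:.0f} m^3")
```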

24 pages, 14851 KB  
Article
LiteFocus-YOLO: An Efficient Network for Identifying Dense Tassels in Field Environments
by Heyang Wang, Jinghuan Hu, Yunlong Ji, Chong Peng, Yu Bao, Hang Zhu, Caocan Zhu, Mengchao Chen, Ye Mu and Hongyu Guo
Agriculture 2025, 15(19), 2036; https://doi.org/10.3390/agriculture15192036 - 28 Sep 2025
Viewed by 266
Abstract
High-efficiency and precise detection of crop ears in the field is a core component of intelligent agricultural yield estimation. However, challenges such as overlapping ears caused by dense planting, complex background interference, and blurred boundaries of small targets severely limit the accuracy and practicality of existing detection models. This paper introduces LiteFocus-YOLO (LF-YOLO), an efficient small-object detection model. By synergistically enhancing feature expression through cross-scale texture optimization and attention mechanisms, it achieves high-precision identification of maize tassels and wheat ears. The model incorporates two new modules: the Lightweight Target-Aware Attention Module (LTAM), which strengthens high-frequency feature expression for small targets while reducing background interference, enhancing robustness in densely occluded scenes, and the Cross-Feature Fusion Module (CFFM), which addresses semantic detail loss through deep-shallow feature fusion modulation, improving small-target localization accuracy. Performance was validated on a drone-based maize tassel dataset: LF-YOLO achieved an mAP50 of 97.9%, with mAP50 scores of 94.6% and 95.7% on the publicly available maize tassel and wheat ear datasets, respectively. It generalizes across different crops while maintaining high accuracy and recall. Compared to current mainstream object detection models, LF-YOLO delivers higher precision at lower computational cost, providing efficient technical support for dense small object detection tasks in agricultural fields. Full article
(This article belongs to the Special Issue Plant Diagnosis and Monitoring for Agricultural Production)
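LTAM's internals are not given in the abstract, so the sketch below uses a generic CBAM-style spatial attention gate (channel-wise max and mean maps, a convolution, and a sigmoid) as a stand-in for how an attention module of this kind typically re-weights feature maps to suppress background.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """CBAM-style spatial gate: channel-wise max and mean maps -> conv -> sigmoid -> re-weight."""

    def __init__(self, k: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, k, padding=k // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat([x.max(dim=1, keepdim=True).values,
                            x.mean(dim=1, keepdim=True)], dim=1)
        # The sigmoid map boosts locations with strong responses and damps the background.
        return x * torch.sigmoid(self.conv(pooled))

print(SpatialAttention()(torch.randn(1, 96, 40, 40)).shape)  # torch.Size([1, 96, 40, 40])
```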

17 pages, 2172 KB  
Article
GLDS-YOLO: An Improved Lightweight Model for Small Object Detection in UAV Aerial Imagery
by Zhiyong Ju, Jiacheng Shui and Jiameng Huang
Electronics 2025, 14(19), 3831; https://doi.org/10.3390/electronics14193831 - 27 Sep 2025
Viewed by 603
Abstract
To enhance small object detection in UAV aerial imagery suffering from low resolution and complex backgrounds, this paper proposes GLDS-YOLO, an improved lightweight detection model. The model integrates four core modules: Group Shuffle Attention (GSA) to strengthen small-scale feature perception, Large Separable Kernel Attention (LSKA) to capture global semantic context, DCNv4 to enhance feature adaptability with fewer parameters, and a newly proposed Small-object-enhanced Multi-scale and Structure Detail Enhancement (SMSDE) module, which enhances edge-detail representation of small objects while maintaining lightweight efficiency. Experiments on VisDrone2019 and DOTA1.0 demonstrate that GLDS-YOLO achieves superior detection performance. On VisDrone2019, it improves mAP@0.5 and mAP@0.5:0.95 by 12.1% and 7%, respectively, compared with YOLOv11n, while maintaining competitive results on DOTA. These results confirm the model’s effectiveness, robustness, and adaptability for complex small object detection tasks in UAV scenarios. Full article
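Large Separable Kernel Attention factorizes a large depthwise kernel into a 1xk and a kx1 depthwise strip plus a pointwise convolution and uses the result as an attention map. The sketch below keeps only that separable core and omits the dilated branch of the published LSKA design; the kernel size of 11 is arbitrary.

```python
import torch
import torch.nn as nn

class SeparableLargeKernelAttention(nn.Module):
    """Simplified LSKA core: 1xk and kx1 depthwise strips plus a 1x1 conv form an attention map."""

    def __init__(self, ch: int, k: int = 11):
        super().__init__()
        self.h = nn.Conv2d(ch, ch, (1, k), padding=(0, k // 2), groups=ch)  # horizontal strip
        self.v = nn.Conv2d(ch, ch, (k, 1), padding=(k // 2, 0), groups=ch)  # vertical strip
        self.pw = nn.Conv2d(ch, ch, 1)                                      # channel mixing

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pw(self.v(self.h(x)))  # large effective kernel at depthwise cost
        return x * attn

print(SeparableLargeKernelAttention(64)(torch.randn(1, 64, 40, 40)).shape)
```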

23 pages, 11401 KB  
Article
HSFANet: Hierarchical Scale-Sensitive Feature Aggregation Network for Small Object Detection in UAV Aerial Images
by Hongxing Zhang, Zhonghong Ou, Siyuan Yao, Yifan Zhu, Yangfu Zhu, Hailin Li, Shigeng Wang, Yang Guo and Meina Song
Drones 2025, 9(9), 659; https://doi.org/10.3390/drones9090659 - 19 Sep 2025
Viewed by 536
Abstract
Small object detection in aerial images, particularly from Unmanned Aerial Vehicle (UAV) platforms, remains a significant challenge due to limited object resolution, dense scenes, and background interference. However, existing small object detectors often fail to make full use of hierarchical features and inevitably introduce noise interference because of hierarchical upsampling operations, and commonly used loss metrics lack sensitivity to scale information; these two issues jointly lead to performance deterioration. To address these issues, we propose Hierarchical Scale-Sensitive Feature Aggregation Network (HSFANet), a novel framework that conducts robust cross-layer feature interaction to perceive the small objects’ position information in hierarchical feature pyramids and forces the model to balance the multi-scale prediction heads for accurate instance localization. HSFANet introduces a Dynamic Position Aggregation (DPA) module to explicitly enhance the object area in both shallow and deep layers, which is capable of exploiting the complementary salient representations of small objects. Additionally, an efficient Scale-Sensitive Loss (SSL) is proposed to balance the small object detection outputs in hierarchical prediction heads, thereby effectively improving the performance of small object detection. Extensive experiments on two challenging UAV benchmarks, VisDrone and UAVDT, demonstrate that HSFANet achieves state-of-the-art (SOTA) results, with a 1.3% gain in overall average precision (AP) and a notable 2.2% improvement in AP for small objects on VisDrone. On UAVDT, HSFANet outperforms previous methods by 0.3% in overall AP and 16.7% in small object AP. These results highlight the effectiveness of HSFANet in enhancing small object detection performance in complex aerial imagery, making it well suited for practical UAV-based applications. Full article
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)

26 pages, 7650 KB  
Article
ACD-DETR: Adaptive Cross-Scale Detection Transformer for Small Object Detection in UAV Imagery
by Yang Tong, Hui Ye, Jishen Yang and Xiulong Yang
Sensors 2025, 25(17), 5556; https://doi.org/10.3390/s25175556 - 5 Sep 2025
Viewed by 1330
Abstract
Small object detection in UAV imagery remains challenging due to complex aerial perspectives and the presence of dense, small targets with blurred boundaries. To address these challenges, we propose ACD-DETR, an adaptive end-to-end Transformer detector tailored for UAV-based small object detection. The framework introduces three core modules: the Multi-Scale Edge-Enhanced Feature Fusion Module (MSEFM) to preserve fine-grained details; the Omni-Grained Boundary Calibrator (OG-BC) for boundary-aware semantic fusion; and the Dynamic Position Bias Attention-based Intra-scale Feature Interaction (DPB-AIFI) to enhance spatial reasoning. Furthermore, we introduce ACD-DETR-SBA+, a fusion-enhanced variant that removes OG-BC and DPB-AIFI while deploying densely connected Semantic–Boundary Aggregation (SBA) modules to intensify boundary–semantic fusion. This design sacrifices computational efficiency in exchange for higher detection precision, making it suitable for resource-rich deployment scenarios. On the VisDrone2019 dataset, ACD-DETR achieves 50.9% mAP@0.5, outperforming the RT-DETR-R18 baseline by 3.6 percentage points, while reducing parameters by 18.5%. ACD-DETR-SBA+ further improves accuracy to 52.0% mAP@0.5, demonstrating the benefit of SBA-based fusion. Extensive experiments on the VisDrone2019 and DOTA datasets demonstrate that ACD-DETR achieves a state-of-the-art trade-off between accuracy and efficiency, while ACD-DETR-SBA+ achieves further performance improvements at higher computational cost. Ablation studies and visual analyses validate the effectiveness of the proposed modules and design strategies. Full article
(This article belongs to the Section Remote Sensors)

20 pages, 8561 KB  
Article
LCW-YOLO: An Explainable Computer Vision Model for Small Object Detection in Drone Images
by Dan Liao, Rengui Bi, Yubi Zheng, Cheng Hua, Liangqing Huang, Xiaowen Tian and Bolin Liao
Appl. Sci. 2025, 15(17), 9730; https://doi.org/10.3390/app15179730 - 4 Sep 2025
Viewed by 1292
Abstract
Small targets in drone imagery are often difficult to accurately locate and identify due to scale imbalance, limited pixel representation, and dynamic environmental interference, and balancing detection accuracy against the model’s resource consumption poses a further challenge. Therefore, we propose an interpretable computer vision framework based on YOLOv12m, called LCW-YOLO. First, we adopt multi-scale heterogeneous convolutional kernels to improve the lightweight channel-level and spatial attention combined context (LA2C2f) structure, enhancing spatial perception capabilities while reducing model computational load. Second, to enhance feature fusion capabilities, we propose the Convolutional Attention Integration Module (CAIM), enabling the fusion of original features across channels, spatial dimensions, and layers, thereby strengthening contextual attention. Finally, the model incorporates Wise-IoU (WIoU) v3, which dynamically allocates loss weights for detected objects. This allows the model to adjust its focus on samples of average quality during training based on object difficulty, thereby improving the model’s generalization capabilities. According to experimental results, LCW-YOLO reduces the parameter count by 0.4 M and improves mAP@0.5 by 3.3% on the VisDrone2019 dataset compared to YOLOv12m, and it improves mAP@0.5 by 1.9% on the UAVVaste dataset. In the task of identifying small objects with drones, LCW-YOLO, as an explainable AI (XAI) model, provides visual detection results and effectively balances accuracy, lightweight design, and generalization capabilities. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)

17 pages, 16767 KB  
Article
AeroLight: A Lightweight Architecture with Dynamic Feature Fusion for High-Fidelity Small-Target Detection in Aerial Imagery
by Hao Qiu, Xiaoyan Meng, Yunjie Zhao, Liang Yu and Shuai Yin
Sensors 2025, 25(17), 5369; https://doi.org/10.3390/s25175369 - 30 Aug 2025
Viewed by 748
Abstract
Small-target detection in Unmanned Aerial Vehicle (UAV) aerial images remains a significant and unresolved challenge in aerial image analysis, hampered by low target resolution, dense object clustering, and complex, cluttered backgrounds. In order to cope with these problems, we present AeroLight, a novel and efficient detection architecture that achieves high-fidelity performance in resource-constrained environments. AeroLight is built upon three key innovations. First, we have optimized the feature pyramid at the architectural level by integrating a high-resolution head specifically designed for minute object detection. This design enhances sensitivity to fine-grained spatial details while streamlining redundant and computationally expensive network layers. Second, a Dynamic Feature Fusion (DFF) module is proposed to adaptively recalibrate and merge multi-scale feature maps, mitigating information loss during integration and strengthening object representation across diverse scales. Finally, we enhance the localization precision of irregular-shaped objects by refining bounding box regression using a Shape-IoU loss function. AeroLight is shown to improve mAP50 and mAP50-95 by 7.5% and 3.3%, respectively, on the VisDrone2019 dataset, while reducing the parameter count by 28.8% when compared with the baseline model. Further validation on the RSOD dataset and Huaxing Farm Drone dataset confirms its superior performance and generalization capabilities. AeroLight provides a powerful and efficient solution for real-world UAV applications, setting a new standard for lightweight, high-precision object recognition in aerial imaging scenarios. Full article
(This article belongs to the Section Remote Sensors)

30 pages, 25011 KB  
Article
Multi-Level Contextual and Semantic Information Aggregation Network for Small Object Detection in UAV Aerial Images
by Zhe Liu, Guiqing He and Yang Hu
Drones 2025, 9(9), 610; https://doi.org/10.3390/drones9090610 - 29 Aug 2025
Viewed by 606
Abstract
In recent years, detection methods for generic object detection have achieved significant progress. However, due to the large number of small objects in aerial images, mainstream detectors struggle to achieve a satisfactory detection performance. The challenges of small object detection in aerial images are primarily twofold: (1) Insufficient feature representation: The limited visual information for small objects makes it difficult for models to learn discriminative feature representations. (2) Background confusion: Abundant background information introduces more noise and interference, causing the features of small objects to easily be confused with the background. To address these issues, we propose a Multi-Level Contextual and Semantic Information Aggregation Network (MCSA-Net). MCSA-Net includes three key components: a Spatial-Aware Feature Selection Module (SAFM), a Multi-Level Joint Feature Pyramid Network (MJFPN), and an Attention-Enhanced Head (AEHead). The SAFM employs a sequence of dilated convolutions to extract multi-scale local context features and combines a spatial selection mechanism to adaptively merge these features, thereby obtaining the critical local context required for the objects, which enriches the feature representation of small objects. The MJFPN introduces multi-level connections and weighted fusion to fully leverage the spatial detail features of small objects in feature fusion and enhances the fused features further through a feature aggregation network. Finally, the AEHead is constructed by incorporating a sparse attention mechanism into the detection head. The sparse attention mechanism efficiently models long-range dependencies by computing the attention between the most relevant regions in the image while suppressing background interference, thereby enhancing the model’s ability to perceive targets and effectively improving the detection performance. Extensive experiments on four datasets, VisDrone, UAVDT, MS COCO, and DOTA, demonstrate that the proposed MCSA-Net achieves an excellent detection performance, particularly in small object detection, surpassing several state-of-the-art methods. Full article
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)
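SAFM builds multi-scale local context from dilated convolutions and then merges the scales with a spatial selection mechanism. The sketch below shows only the first half of that idea in generic form, parallel 3x3 convolutions at assumed dilation rates of 1, 2, and 3 fused by a 1x1 convolution, without the paper's selection mechanism.

```python
import torch
import torch.nn as nn

class DilatedContext(nn.Module):
    """Parallel 3x3 convolutions at growing dilation rates, fused by a 1x1 conv."""

    def __init__(self, ch: int, rates=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=r, dilation=r) for r in rates
        )
        self.fuse = nn.Conv2d(ch * len(rates), ch, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Each dilation rate summarizes context at a different scale around every pixel.
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

print(DilatedContext(48)(torch.randn(1, 48, 64, 64)).shape)  # torch.Size([1, 48, 64, 64])
```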

15 pages, 1690 KB  
Article
OTB-YOLO: An Enhanced Lightweight YOLO Architecture for UAV-Based Maize Tassel Detection
by Yu Han, Xingya Wang, Luyan Niu, Song Shi, Yingbo Gao, Kuijie Gong, Xia Zhang and Jiye Zheng
Plants 2025, 14(17), 2701; https://doi.org/10.3390/plants14172701 - 29 Aug 2025
Viewed by 603
Abstract
To tackle the challenges posed by substantial variations in target scale, intricate background interference, and the likelihood of missing small targets in multi-temporal UAV maize tassel imagery, an optimized lightweight detection model derived from YOLOv11 is introduced, named OTB-YOLO. Here, “OTB” is an acronym derived from the initials of the model’s core improved modules: Omni-dimensional dynamic convolution (ODConv), Triplet Attention, and Bi-directional Feature Pyramid Network (BiFPN). This model integrates the PaddlePaddle open-source maize tassel recognition benchmark dataset with the public Multi-Temporal Drone Corn Dataset (MTDC). Traditional convolutional layers are substituted with omni-dimensional dynamic convolution (ODConv) to mitigate computational redundancy. A triplet attention module is incorporated to refine feature extraction within the backbone network, while a bidirectional feature pyramid network (BiFPN) is engineered to enhance accuracy via multi-level feature pyramids and bidirectional information flow. Empirical analysis demonstrates that the enhanced model achieves a precision of 95.6%, recall of 92.1%, and mAP@0.5 of 96.6%, marking improvements of 3.2%, 2.5%, and 3.1%, respectively, over the baseline model. Concurrently, the model’s computational complexity is reduced to 6.0 GFLOPs, rendering it appropriate for deployment on UAV edge computing platforms. Full article
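BiFPN's contribution over a plain feature pyramid is its fast normalized fusion, in which each incoming level gets a learnable non-negative weight that is normalized before summation. The sketch below shows that weighting step for two inputs already resized to a common shape; how OTB-YOLO wires the pyramid levels is not reproduced here.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized fusion: out = sum_i w_i * x_i / (sum_i w_i + eps), w_i kept >= 0 by ReLU."""

    def __init__(self, n_inputs: int, eps: float = 1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, xs):
        w = torch.relu(self.w)
        w = w / (w.sum() + self.eps)                 # normalized, learnable importance per level
        return sum(wi * x for wi, x in zip(w, xs))

p_deep_up = torch.randn(1, 64, 40, 40)   # deeper level, assumed already upsampled to this size
p_lateral = torch.randn(1, 64, 40, 40)   # lateral feature at the same resolution
print(WeightedFusion(2)([p_deep_up, p_lateral]).shape)  # torch.Size([1, 64, 40, 40])
```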

26 pages, 29132 KB  
Article
DCS-YOLOv8: A Lightweight Context-Aware Network for Small Object Detection in UAV Remote Sensing Imagery
by Xiaozheng Zhao, Zhongjun Yang and Huaici Zhao
Remote Sens. 2025, 17(17), 2989; https://doi.org/10.3390/rs17172989 - 28 Aug 2025
Viewed by 957
Abstract
Small object detection in UAV-based remote sensing imagery is crucial for applications such as traffic monitoring, emergency response, and urban management. However, aerial images often suffer from low object resolution, complex backgrounds, and varying lighting conditions, leading to missed or false detections. To address these challenges, we propose DCS-YOLOv8, an enhanced object detection framework tailored for small target detection in UAV scenarios. The proposed model integrates a Dynamic Convolution Attention Mixture (DCAM) module to improve global feature representation and combines it with the C2f module to form the C2f-DCAM block. The C2f-DCAM block, together with a lightweight SCDown module for efficient downsampling, constitutes the backbone DCS-Net. In addition, a dedicated P2 detection layer is introduced to better capture high-resolution spatial features of small objects. To further enhance detection accuracy and robustness, we replace the conventional CIoU loss with a novel Scale-based Dynamic Balanced IoU (SDBIoU) loss, which dynamically adjusts loss weights based on object scale. Extensive experiments on the VisDrone2019 dataset demonstrate that the proposed DCS-YOLOv8 significantly improves small object detection performance while maintaining efficiency. Compared to the baseline YOLOv8s, our model increases precision from 51.8% to 54.2%, recall from 39.4% to 42.1%, mAP0.5 from 40.6% to 44.5%, and mAP0.5:0.95 from 24.3% to 26.9%, while reducing parameters from 11.1 M to 9.9 M. Moreover, real-time inference on RK3588 embedded hardware validates the model’s suitability for onboard UAV deployment in remote sensing applications. Full article
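The abstract does not expand SCDown; in recent YOLO variants the name refers to spatial-channel decoupled downsampling, a 1x1 convolution for the channel change followed by a stride-2 depthwise convolution for the spatial reduction. The sketch below follows that reading, with the channel widths chosen arbitrarily.

```python
import torch
import torch.nn as nn

class SCDown(nn.Module):
    """Spatial-channel decoupled downsampling: 1x1 conv for channels, stride-2 depthwise conv for resolution."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.pw = nn.Conv2d(in_ch, out_ch, 1)                                        # channel projection
        self.dw = nn.Conv2d(out_ch, out_ch, 3, stride=2, padding=1, groups=out_ch)   # spatial halving

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.dw(self.pw(x))

print(SCDown(128, 256)(torch.randn(1, 128, 80, 80)).shape)  # torch.Size([1, 256, 40, 40])
```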