Search Results (176)

Search Parameters:
Keywords = multi-drone applications

19 pages, 3520 KiB  
Article
Vision-Guided Maritime UAV Rescue System with Optimized GPS Path Planning and Dual-Target Tracking
by Suli Wang, Yang Zhao, Chang Zhou, Xiaodong Ma, Zijun Jiao, Zesheng Zhou, Xiaolu Liu, Tianhai Peng and Changxing Shao
Drones 2025, 9(7), 502; https://doi.org/10.3390/drones9070502 - 16 Jul 2025
Viewed by 116
Abstract
With the global increase in maritime activities, the frequency of maritime accidents has risen, underscoring the urgent need for faster and more efficient search and rescue (SAR) solutions. This study presents an intelligent unmanned aerial vehicle (UAV)-based maritime rescue system that combines GPS-driven dynamic path planning with vision-based dual-target detection and tracking. Developed within the Gazebo simulation environment and based on modular ROS architecture, the system supports stable takeoff and smooth transitions between multi-rotor and fixed-wing flight modes. An external command module enables real-time waypoint updates. This study proposes three path-planning schemes based on the characteristics of drones. Comparative experiments have demonstrated that the triangular path is the optimal route. Compared with the other schemes, this path reduces the flight distance by 30–40%. Robust target recognition is achieved using a darknet-ROS implementation of the YOLOv4 model, enhanced with data augmentation to improve performance in complex maritime conditions. A monocular vision-based ranging algorithm ensures accurate distance estimation and continuous tracking of rescue vessels. Furthermore, a dual-target-tracking algorithm—integrating motion prediction with color-based landing zone recognition—achieves a 96% success rate in precision landings under dynamic conditions. Experimental results show a 4% increase in the overall mission success rate compared to traditional SAR methods, along with significant gains in responsiveness and reliability. This research delivers a technically innovative and cost-effective UAV solution, offering strong potential for real-world maritime emergency response applications. Full article
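The monocular ranging step described in the abstract above reduces, in its simplest form, to pinhole-camera similar triangles: distance ≈ known target height × focal length (in pixels) ÷ detected bounding-box height. A minimal sketch of that geometry, with illustrative names — not the authors' calibrated pipeline:

```python
def monocular_distance_m(real_height_m, focal_px, bbox_height_px):
    """Pinhole-model ranging: distance = H * f / h, where H is the known
    physical height of the target, f the focal length in pixels, and h the
    height of the detected bounding box. A toy sketch of the idea, not the
    paper's implementation."""
    return real_height_m * focal_px / bbox_height_px
```

For example, a 2 m target imaged at 40 px by an 800 px focal-length camera comes out at 40 m; real systems must additionally correct for lens distortion and camera tilt.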

21 pages, 3826 KiB  
Article
UAV-OVD: Open-Vocabulary Object Detection in UAV Imagery via Multi-Level Text-Guided Decoding
by Lijie Tao, Guoting Wei, Zhuo Wang, Zhaoshuai Qi, Ying Li and Haokui Zhang
Drones 2025, 9(7), 495; https://doi.org/10.3390/drones9070495 - 14 Jul 2025
Viewed by 182
Abstract
Object detection in drone-captured imagery has attracted significant attention due to its wide range of real-world applications, including surveillance, disaster response, and environmental monitoring. Although the majority of existing methods are developed under closed-set assumptions, and some recent studies have begun to explore open-vocabulary or open-world detection, their application to UAV imagery remains limited and underexplored. In this paper, we address this limitation by exploring the relationship between images and textual semantics to extend object detection in UAV imagery to an open-vocabulary setting. We propose a novel and efficient detector named Unmanned Aerial Vehicle Open-Vocabulary Detector (UAV-OVD), specifically designed for drone-captured scenes. To facilitate open-vocabulary object detection, we propose improvements from three complementary perspectives. First, at the training level, we design a region–text contrastive loss to replace conventional classification loss, allowing the model to align visual regions with textual descriptions beyond fixed category sets. Structurally, building on this, we introduce a multi-level text-guided fusion decoder that integrates visual features across multiple spatial scales under language guidance, thereby improving overall detection performance and enhancing the representation and perception of small objects. Finally, from the data perspective, we enrich the original dataset with synonym-augmented category labels, enabling more flexible and semantically expressive supervision. Experiments conducted on two widely used benchmark datasets demonstrate that our approach achieves significant improvements in both mAP and Recall. For instance, for zero-shot detection on xView, UAV-OVD achieves 9.9 mAP and 67.3 Recall, 1.1 and 25.6 points higher than YOLO-World. In terms of speed, UAV-OVD achieves 53.8 FPS, nearly twice as fast as YOLO-World and five times faster than DetrReg, demonstrating its strong potential for real-time open-vocabulary detection in UAV imagery. Full article
(This article belongs to the Special Issue Applications of UVs in Digital Photogrammetry and Image Processing)
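The region–text contrastive loss described in the entry above rests on a simple matching step: score each visual region embedding against the embeddings of candidate class names, so supervision is not tied to a fixed label set. A toy illustration of that scoring with cosine similarity (hand-made vectors, not UAV-OVD's actual encoders):

```python
import math

def region_text_scores(region_vec, text_vecs):
    """Cosine similarity between one region embedding and each class-name
    embedding -- the core matching step behind region-text contrastive
    training. Vectors here are toy inputs, not learned features."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    def norm(u):
        return math.sqrt(dot(u, u))
    return [dot(region_vec, t) / (norm(region_vec) * norm(t)) for t in text_vecs]
```

In training, these scores would be pushed toward 1 for the matching description and toward 0 (or apart) for the rest, typically via a softmax cross-entropy over the similarity row.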

28 pages, 19790 KiB  
Article
HSF-DETR: A Special Vehicle Detection Algorithm Based on Hypergraph Spatial Features and Bipolar Attention
by Kaipeng Wang, Guanglin He and Xinmin Li
Sensors 2025, 25(14), 4381; https://doi.org/10.3390/s25144381 - 13 Jul 2025
Viewed by 236
Abstract
Special vehicle detection in intelligent surveillance, emergency rescue, and reconnaissance faces significant challenges in accuracy and robustness under complex environments, necessitating advanced detection algorithms for critical applications. This paper proposes HSF-DETR (Hypergraph Spatial Feature DETR), integrating four innovative modules: a Cascaded Spatial Feature Network (CSFNet) backbone with Cross-Efficient Convolutional Gating (CECG) for enhanced long-range detection through hybrid state-space modeling; a Hypergraph-Enhanced Spatial Feature Modulation (HyperSFM) network utilizing hypergraph structures for high-order feature correlations and adaptive multi-scale fusion; a Dual-Domain Feature Encoder (DDFE) combining Bipolar Efficient Attention (BEA) and Frequency-Enhanced Feed-Forward Network (FEFFN) for precise feature weight allocation; and a Spatial-Channel Fusion Upsampling Block (SCFUB) improving feature fidelity through depth-wise separable convolution and channel shift mixing. Experiments conducted on a self-built special vehicle dataset containing 2388 images demonstrate that HSF-DETR achieves mAP50 and mAP50-95 of 96.6% and 70.6%, respectively, representing improvements of 3.1% and 4.6% over baseline RT-DETR while maintaining computational efficiency at 59.7 GFLOPs and 18.07 M parameters. Cross-domain validation on VisDrone2019 and BDD100K datasets confirms the method’s generalization capability and robustness across diverse scenarios, establishing HSF-DETR as an effective solution for special vehicle detection in complex environments. Full article
(This article belongs to the Section Sensing and Imaging)

29 pages, 16466 KiB  
Article
DMF-YOLO: Dynamic Multi-Scale Feature Fusion Network-Driven Small Target Detection in UAV Aerial Images
by Xiaojia Yan, Shiyan Sun, Huimin Zhu, Qingping Hu, Wenjian Ying and Yinglei Li
Remote Sens. 2025, 17(14), 2385; https://doi.org/10.3390/rs17142385 - 10 Jul 2025
Viewed by 404
Abstract
Target detection in UAV aerial images has found increasingly widespread applications in emergency rescue, maritime monitoring, and environmental surveillance. However, traditional detection models suffer significant performance degradation due to challenges including substantial scale variations, high proportions of small targets, and dense occlusions in UAV-captured images. To address these issues, this paper proposes DMF-YOLO, a high-precision detection network based on YOLOv10 improvements. First, we design Dynamic Dilated Snake Convolution (DDSConv) to adaptively adjust the receptive field and dilation rate of convolution kernels, enhancing local feature extraction for small targets with weak textures. Second, we construct a Multi-scale Feature Aggregation Module (MFAM) that integrates dual-branch spatial attention mechanisms to achieve efficient cross-layer feature fusion, mitigating information conflicts between shallow details and deep semantics. Finally, we propose an Expanded Window-based Bounding Box Regression Loss Function (EW-BBRLF), which optimizes localization accuracy through dynamic auxiliary bounding boxes, effectively reducing missed detections of small targets. Experiments on the VisDrone2019 and HIT-UAV datasets demonstrate that DMF-YOLOv10 achieves 50.1% and 81.4% mAP50, respectively, significantly outperforming the baseline YOLOv10s by 27.1% and 2.6%, with parameter increases limited to 24.4% and 11.9%. The method exhibits superior robustness in dense scenarios, complex backgrounds, and long-range target detection. This approach provides an efficient solution for UAV real-time perception tasks and offers novel insights for multi-scale object detection algorithm design. Full article

32 pages, 2740 KiB  
Article
Vision-Based Navigation and Perception for Autonomous Robots: Sensors, SLAM, Control Strategies, and Cross-Domain Applications—A Review
by Eder A. Rodríguez-Martínez, Wendy Flores-Fuentes, Farouk Achakir, Oleg Sergiyenko and Fabian N. Murrieta-Rico
Eng 2025, 6(7), 153; https://doi.org/10.3390/eng6070153 - 7 Jul 2025
Viewed by 734
Abstract
Camera-centric perception has matured into a cornerstone of modern autonomy, from self-driving cars and factory cobots to underwater and planetary exploration. This review synthesizes more than a decade of progress in vision-based robotic navigation through an engineering lens, charting the full pipeline from sensing to deployment. We first examine the expanding sensor palette—monocular and multi-camera rigs, stereo and RGB-D devices, LiDAR–camera hybrids, event cameras, and infrared systems—highlighting the complementary operating envelopes and the rise of learning-based depth inference. The advances in visual localization and mapping are then analyzed, contrasting sparse and dense SLAM approaches, as well as monocular, stereo, and visual–inertial formulations. Additional topics include loop closure, semantic mapping, and LiDAR–visual–inertial fusion, which enables drift-free operation in dynamic environments. Building on these foundations, we review the navigation and control strategies, spanning classical planning, reinforcement and imitation learning, hybrid topological–metric memories, and emerging visual language guidance. Application case studies—autonomous driving, industrial manipulation, autonomous underwater vehicles, planetary rovers, aerial drones, and humanoids—demonstrate how tailored sensor suites and algorithms meet domain-specific constraints. Finally, the future research trajectories are distilled: generative AI for synthetic training data and scene completion; high-density 3D perception with solid-state LiDAR and neural implicit representations; event-based vision for ultra-fast control; and human-centric autonomy in next-generation robots. By providing a unified taxonomy, a comparative analysis, and engineering guidelines, this review aims to inform researchers and practitioners designing robust, scalable, vision-driven robotic systems. Full article
(This article belongs to the Special Issue Interdisciplinary Insights in Engineering Research)

28 pages, 1210 KiB  
Article
A Multi-Ray Channel Modelling Approach to Enhance UAV Communications in Networked Airspace
by Fawad Ahmad, Muhammad Yasir Masood Mirza, Iftikhar Hussain and Kaleem Arshid
Inventions 2025, 10(4), 51; https://doi.org/10.3390/inventions10040051 - 1 Jul 2025
Cited by 1 | Viewed by 303
Abstract
In recent years, the use of unmanned aerial vehicles (UAVs), commonly known as drones, has significantly surged across civil, military, and commercial sectors. Ensuring reliable and efficient communication between UAVs and between UAVs and base stations is challenging due to dynamic factors such as altitude, mobility, environmental obstacles, and atmospheric conditions, which existing communication models fail to address fully. This paper presents a multi-ray channel model that captures the complexities of the airspace network, applicable to both ground-to-air (G2A) and air-to-air (A2A) communications to ensure reliability and efficiency within the network. The model outperforms conventional line-of-sight assumptions by integrating multiple rays to reflect the multipath transmission of UAVs. The multi-ray channel model considers UAV flights’ dynamic and 3-D nature and the conditions in which UAVs typically operate, including urban, suburban, and rural environments. A technique that calculates the received power at a target UAV within a networked airspace is also proposed, utilizing the reflective characteristics of UAV surfaces along with the multi-ray channel model. The developed multi-ray channel model further facilitates the characterization and performance evaluation of G2A and A2A communications. Additionally, this paper explores the effects of various factors, such as altitude, the number of UAVs, and the spatial separation between them on the power received by the target UAV. The simulation outcomes are validated by empirical data and existing theoretical models, providing comprehensive insight into the proposed channel modelling technique. Full article
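The multipath idea behind the multi-ray model above is easiest to see in the textbook two-ray case: the received field is the coherent sum of the line-of-sight ray and one ground-reflected ray, whose path-length difference sets their relative phase. A sketch of that simplest case (perfect reflector, reflection coefficient −1) — far simpler than the paper's multi-ray G2A/A2A model:

```python
import cmath
import math

def two_ray_power_w(pt_w, gt, gr, ht, hr, d, f_hz):
    """Received power (W) from a coherent sum of the direct ray and a
    ground-reflected ray, assuming a perfectly reflecting ground.
    pt_w: transmit power; gt, gr: antenna gains; ht, hr: antenna
    heights (m); d: horizontal separation (m); f_hz: carrier frequency."""
    lam = 3e8 / f_hz
    d_los = math.hypot(d, ht - hr)            # direct path length
    d_ref = math.hypot(d, ht + hr)            # reflected path length
    dphi = 2 * math.pi * (d_ref - d_los) / lam  # relative phase shift
    amp = math.sqrt(pt_w * gt * gr) * lam / (4 * math.pi)
    # field amplitudes fall off as 1/distance; rays add with phase dphi
    field = amp * abs(1 / d_los - cmath.exp(-1j * dphi) / d_ref)
    return field ** 2
```

At large distances this reproduces the classic far-field result P ≈ Pt·Gt·Gr·(ht·hr)²/d⁴, i.e. power decaying with the fourth power of distance rather than the free-space second power.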

29 pages, 18908 KiB  
Article
Toward Efficient UAV-Based Small Object Detection: A Lightweight Network with Enhanced Feature Fusion
by Xingyu Di, Kangning Cui and Rui-Feng Wang
Remote Sens. 2025, 17(13), 2235; https://doi.org/10.3390/rs17132235 - 29 Jun 2025
Viewed by 514
Abstract
UAV-based small target detection is crucial in environmental monitoring, circuit detection, and related applications. However, UAV images often face challenges such as significant scale variation, dense small targets, high inter-class similarity, and intra-class diversity, which can lead to missed detections, thus reducing performance. To solve these problems, this study proposes a lightweight and high-precision model, UAV-YOLO, based on YOLOv8s. Firstly, a double separation convolution (DSC) module is designed to replace the Bottleneck structure in the C2f module with a fusion of depthwise separable convolution and pointwise convolution, which can reduce the model parameters and calculation complexity while enhancing feature expression. Secondly, a new SPPL module is proposed, which combines spatial pyramid pooling rapid fusion (SPPF) with long-distance dependency modeling (LSKA) to improve the robustness of the model to multi-scale targets through cross-level feature association. Then, DyHead is used to replace the original detector head, and the discrimination ability of small targets in complex backgrounds is enhanced by adaptive weight allocation and cross-scale feature optimization fusion. Finally, the WIPIoU loss function is proposed, which integrates the advantages of Wise-IoU, MPDIoU and Inner-IoU, and incorporates the geometric center of the bounding box, aspect ratio, and overlap degree into a unified measure to improve the localization accuracy of small targets and accelerate convergence. The experimental results on the VisDrone2019 dataset showed that compared to YOLOv8s, UAV-YOLO achieved improvements of 8.9% in mAP@0.5 and 6.8% in recall, while the parameters and calculations were reduced by 23.4% and 40.7%, respectively. Additional evaluations on the DIOR, RSOD, and NWPU VHR-10 datasets demonstrate the generalization capability of the model. Full article
(This article belongs to the Special Issue Geospatial Intelligence in Remote Sensing)
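Composite losses such as the WIPIoU described in the entry above (and Wise-IoU, MPDIoU, Inner-IoU generally) all build on plain intersection-over-union between predicted and ground-truth boxes. For reference, the base overlap term looks like this (simplified; not the paper's WIPIoU):

```python
def iou(box_a, box_b):
    """Intersection-over-union for axis-aligned boxes given as
    [x1, y1, x2, y2] -- the overlap term that IoU-family regression
    losses extend with center-distance and aspect-ratio penalties."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # overlap extents, clamped at zero when the boxes are disjoint
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0
```

The extensions the abstract lists add extra penalty terms (box center distance, aspect ratio, auxiliary inner/outer boxes) on top of this quantity to speed up and stabilize small-target regression.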

18 pages, 13123 KiB  
Article
Field Study of UAV Variable-Rate Spraying Method for Orchards Based on Canopy Volume
by Pengchao Chen, Haoran Ma, Zongyin Cui, Zhihong Li, Jiapei Wu, Jianhong Liao, Hanbing Liu, Ying Wang and Yubin Lan
Agriculture 2025, 15(13), 1374; https://doi.org/10.3390/agriculture15131374 - 27 Jun 2025
Viewed by 377
Abstract
The use of unmanned aerial vehicle (UAV) pesticide spraying technology in precision agriculture is becoming increasingly important. However, traditional spraying methods struggle to address the precision application need caused by the canopy differences of fruit trees in orchards. This study proposes a UAV orchard variable-rate spraying method based on canopy volume. A DJI M300 drone equipped with LiDAR was used to capture high-precision 3D point cloud data of tree canopies. An improved progressive TIN densification (IPTD) filtering algorithm and a region-growing algorithm were applied to segment the point cloud of fruit trees, construct a canopy volume-based classification model, and generate a differentiated prescription map for spraying. A distributed multi-point spraying strategy was employed to optimize droplet deposition performance. Field experiments were conducted in a citrus (Citrus reticulata Blanco) orchard (73 trees) and a litchi (Litchi chinensis Sonn.) orchard (82 trees). Data analysis showed that variable-rate treatment in the litchi area achieved a maximum canopy coverage of 14.47% for large canopies, reducing ground deposition by 90.4% compared to the continuous spraying treatment; variable-rate treatment in the citrus area reached a maximum coverage of 9.68%, with ground deposition reduced by approximately 64.1% compared to the continuous spraying treatment. By matching spray volume to canopy demand, variable-rate spraying significantly improved droplet deposition targeting, validating the feasibility of the proposed method in reducing pesticide waste and environmental pollution and providing a scalable technical path for precision plant protection in orchards. Full article
(This article belongs to the Special Issue Smart Spraying Technology in Orchards: Innovation and Application)
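The prescription map described in the entry above amounts to binning each tree's measured canopy volume into a spray-volume class. A minimal stand-in for that classification step (thresholds and rates below are made-up illustration values, not the paper's calibrated prescription):

```python
def spray_rate_l(canopy_volume_m3,
                 classes=((5.0, 0.5), (15.0, 1.0), (float("inf"), 1.5))):
    """Map a tree's canopy volume (m^3) to a per-tree spray volume (L).
    `classes` is a sequence of (max_volume, litres) bins in ascending
    order; the values here are hypothetical."""
    for vmax, litres in classes:
        if canopy_volume_m3 <= vmax:
            return litres
```

Each segmented tree from the LiDAR point cloud would be passed through such a lookup to build the differentiated prescription map the UAV then flies.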

20 pages, 4198 KiB  
Article
HiDRA-DCDNet: Dynamic Hierarchical Attention and Multi-Scale Context Fusion for Real-Time Remote Sensing Small-Target Detection
by Jiale Wang, Zhe Bai, Ximing Zhang, Yuehong Qiu, Fan Bu and Yuancheng Shao
Remote Sens. 2025, 17(13), 2195; https://doi.org/10.3390/rs17132195 - 25 Jun 2025
Viewed by 304
Abstract
Small-target detection in remote sensing presents three fundamental challenges: limited pixel representation of targets, multi-angle imaging-induced appearance variance, and complex background interference. This paper introduces a dual-component neural architecture comprising Hierarchical Dynamic Refinement Attention (HiDRA) and Densely Connected Dilated Block (DCDBlock) to address these challenges systematically. The HiDRA mechanism implements a dual-phase feature enhancement process: channel competition through bottleneck compression for discriminative feature selection, followed by spatial-semantic reweighting for foreground–background decoupling. The DCDBlock architecture synergizes multi-scale dilated convolutions with cross-layer dense connections, establishing persistent feature propagation pathways that preserve critical spatial details across network depths. Extensive experiments on AI-TOD, VisDrone, MAR20, and DOTA-v1.0 datasets demonstrate our method’s consistent superiority, achieving average absolute gains of +1.16% (mAP50), +0.93% (mAP95), and +1.83% (F1-score) over prior state-of-the-art approaches across all benchmarks. With 8.1 GFLOPs computational complexity and 2.6 ms inference speed per image, our framework demonstrates practical efficacy for real-time remote sensing applications, achieving superior accuracy–efficiency trade-off compared to existing approaches. Full article

21 pages, 5194 KiB  
Article
LMEC-YOLOv8: An Enhanced Object Detection Algorithm for UAV Imagery
by Xuchuan Tai and Xinjun Zhang
Electronics 2025, 14(13), 2535; https://doi.org/10.3390/electronics14132535 - 23 Jun 2025
Viewed by 428
Abstract
Despite the rapid development of UAV (Unmanned Aerial Vehicle) technology, its application to object detection in complex scenarios faces challenges from small target sizes and environmental interference. This paper proposes an improved algorithm, LMEC-YOLOv8, based on YOLOv8n, which aims to enhance detection accuracy and real-time performance for small targets in UAV imagery. We propose three key enhancements: (1) a lightweight multi-scale module (LMS-PC2F) to replace C2f; (2) a multi-scale attention mechanism (MSCBAM) for optimized feature extraction; and (3) an adaptive pyramid module (ESPPM) and a bidirectional feature network (CBiFPN) to boost fusion capability. Experimental results on the VisDrone2019 dataset demonstrate that LMEC-YOLOv8 achieves a 10.1% improvement in mAP50, a 20% reduction in parameter count, and a frame rate of 42 FPS compared to the baseline YOLOv8n. When compared to other state-of-the-art algorithms, the proposed model achieves an optimal balance between accuracy and speed, validating its robustness and practicality in complex environments. Full article
(This article belongs to the Special Issue Deep Learning for Computer Vision, 2nd Edition)

31 pages, 2868 KiB  
Article
Optimized Scheduling for Multi-Drop Vehicle–Drone Collaboration with Delivery Constraints Using Large Language Models and Genetic Algorithms with Symmetry Principles
by Mingyang Geng and Anping Chen
Symmetry 2025, 17(6), 934; https://doi.org/10.3390/sym17060934 - 12 Jun 2025
Viewed by 438
Abstract
With the rapid development of e-commerce and globalization, logistics distribution systems have become integral to modern economies, directly impacting transportation efficiency, resource utilization, and supply chain flexibility. However, solving the Vehicle and Multi-Drone Cooperative Delivery Problem with Delivery Restrictions is challenging due to complex constraints, including limited payloads, short endurance, regional restrictions, and multi-objective optimization. Traditional optimization methods, particularly genetic algorithms, struggle to address these complexities, often relying on static rules or single-objective optimization that fails to balance exploration and exploitation, resulting in local optima and slow convergence. The concept of symmetry plays a crucial role in optimizing the scheduling process, as many logistics problems inherently possess symmetrical properties. By exploiting these symmetries, we can reduce the problem’s complexity and improve solution efficiency. This study proposes a novel and scalable scheduling approach to address the Vehicle and Multi-Drone Cooperative Delivery Problem with Delivery Restrictions, tackling its high complexity, constraint handling, and real-world applicability. Specifically, we propose a logistics scheduling method called Loegised, which integrates large language models with genetic algorithms while incorporating symmetry principles to enhance the optimization process. Loegised includes three innovative modules: a cognitive initialization module to accelerate convergence by generating high-quality initial solutions, a dynamic operator parameter adjustment module to optimize crossover and mutation rates in real-time for better global search, and a local optimum escape mechanism to prevent stagnation and improve solution diversity. The experimental results on benchmark datasets show that Loegised achieves an average delivery time of 14.80, significantly outperforming six state-of-the-art baseline methods, with improvements confirmed by Wilcoxon signed-rank tests (p<0.001). In large-scale scenarios, Loegised reduces delivery time by over 20% compared to conventional methods, demonstrating strong scalability and practical applicability. These findings validate the effectiveness and real-world potential of symmetry-enhanced, language model-guided optimization for advanced logistics scheduling. Full article
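The dynamic operator adjustment and local-optimum escape modules described in the entry above can be illustrated with a plain genetic algorithm whose mutation spread widens whenever the best score stagnates. This is a generic stand-in under assumed parameter values, not Loegised itself (which additionally uses an LLM for initialization and operator tuning):

```python
import random

def adaptive_ga(fitness, dim, pop_size=40, gens=200, seed=0):
    """Minimise `fitness` over `dim` real genes with elitism, tournament
    selection, one-point crossover, and a mutation sigma that doubles
    after 10 stagnant generations (the 'escape' move) and resets on
    improvement. All hyperparameters are illustrative."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-10, 10) for _ in range(dim)] for _ in range(pop_size)]
    best = min(pop, key=fitness)
    best_f, stall, sigma = fitness(best), 0, 0.5
    for _ in range(gens):
        nxt = [best[:]]                              # elitism: keep the best
        while len(nxt) < pop_size:
            a, b = (min(rng.sample(pop, 3), key=fitness) for _ in range(2))
            cut = rng.randrange(dim)                 # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, sigma) if rng.random() < 0.2 else g
                     for g in child]                 # per-gene mutation
            nxt.append(child)
        pop = nxt
        cand = min(pop, key=fitness)
        if fitness(cand) < best_f:
            best, best_f, stall, sigma = cand, fitness(cand), 0, 0.5
        else:
            stall += 1
            if stall >= 10:                          # stagnation: widen search
                sigma, stall = sigma * 2.0, 0
    return best, best_f
```

On a toy sphere objective this reliably converges near the optimum; in the paper's setting the genome would instead encode vehicle routes and drone assignments under the delivery constraints.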

32 pages, 8925 KiB  
Article
HSF-DETR: Hyper Scale Fusion Detection Transformer for Multi-Perspective UAV Object Detection
by Yi Mao, Haowei Zhang, Rui Li, Feng Zhu, Rui Sun and Pingping Ji
Remote Sens. 2025, 17(12), 1997; https://doi.org/10.3390/rs17121997 - 9 Jun 2025
Viewed by 584
Abstract
Unmanned aerial vehicle (UAV) imagery detection faces challenges in preserving small object features during multi-level downsampling, handling angle and altitude-dependent variations in aerial scenes, achieving accurate localization in dense environments, and performing real-time detection. To address these limitations, we propose HSF-DETR, a lightweight transformer-based detector specifically designed for UAV imagery. First, we design a hybrid progressive fusion network (HPFNet) as the backbone, which adaptively modulates receptive fields to capture multi-scale information while preserving fine-grained details critical for small object detection. Second, building upon features extracted by HPFNet, we develop MultiScaleNet, which enhances feature representation through dual-layer optimization and cross-domain feature learning, significantly improving the model’s capability to handle complex aerial scenarios with diverse object orientations. Finally, to address spatial–semantic alignment challenges, we devise a position-aware align context and spatial tuning (PACST) module that ensures effective feature calibration through precise alignment and adaptive fusion across scales. This hierarchical architecture is complemented by our novel AdaptDist-IoU loss with dynamic weight allocation, which enhances localization accuracy, particularly in dense environments. Extensive experiments using standard detection metrics (mAP50 and mAP50:95) on the VisDrone2019 test dataset demonstrate that HSF-DETR achieves superior performance with 0.428 mAP50 (+5.4%) and 0.253 mAP50:95 (+4%) when compared with RT-DETR, while maintaining real-time inference (69.3 FPS) on an NVIDIA RTX 4090D GPU with only 15.24M parameters and 63.6 GFLOPs. Further validation across multiple public remote sensing datasets confirms the robust generalization capability of HSF-DETR in diverse aerial scenarios, offering a practical solution for resource-constrained UAV applications where both detection quality and processing speed are crucial. Full article
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)

29 pages, 44456 KiB  
Article
AUHF-DETR: A Lightweight Transformer with Spatial Attention and Wavelet Convolution for Embedded UAV Small Object Detection
by Hengyu Guo, Qunyong Wu and Yuhang Wang
Remote Sens. 2025, 17(11), 1920; https://doi.org/10.3390/rs17111920 - 31 May 2025
Viewed by 745
Abstract
Real-time object detection on embedded unmanned aerial vehicles (UAVs) is crucial for emergency rescue, autonomous driving, and target tracking applications. However, UAVs’ hardware limitations create conflicts between model size and detection accuracy. Moreover, challenges such as complex backgrounds from the UAV’s perspective, severe occlusion, densely packed small targets, and uneven lighting conditions complicate real-time detection for embedded UAVs. To tackle these challenges, we propose AUHF-DETR, an embedded detection model derived from RT-DETR. In the backbone, we introduce a novel WTC-AdaResNet paradigm that utilizes reversible connections to decouple small-object features. We further replace the original global attention mechanism with the PSA module to strengthen inter-feature relationships within each ROI, thereby resolving the embedded challenges posed by RT-DETR’s complex token computations. In the encoder, we introduce a BDFPN for multi-scale feature fusion, effectively mitigating the small-object detection difficulties caused by the baseline’s Hungarian assignment. Extensive experiments on the public VisDrone2019, HIT-UAV, and CARPK datasets demonstrate that compared with RT-DETR-r18, AUHF-DETR achieves a 2.1% increase in APs on VisDrone2019, reduces the parameter count by 49.0%, and attains 68 FPS (AGX Xavier), thus satisfying the real-time requirements for small-object detection in embedded UAVs. Full article
19 pages, 3448 KiB  
Article
Method for Multi-Target Wireless Charging for Oil Field Inspection Drones
by Yilong Wang, Li Ji and Ming Zhang
Drones 2025, 9(5), 381; https://doi.org/10.3390/drones9050381 - 20 May 2025
Viewed by 403
Abstract
Wireless power transfer (WPT) systems are critical for enabling safe and efficient charging of inspection drones in flammable oilfield environments, yet existing solutions struggle with multi-target compatibility and reactive power losses. This study proposes a novel frequency-regulated LCC-S topology that achieves both constant current (CC) and constant voltage (CV) charging modes for heterogeneous drones using a single hardware configuration. By dynamically adjusting the operating frequency, the system minimizes the input impedance angle (θ < 10°) while maintaining load-independent CC and CV outputs, thereby reducing reactive power by 92% and ensuring spark-free operation in explosive atmospheres. Experimental validation with two distinct oilfield inspection drones demonstrates seamless mode transitions, zero-phase-angle (ZPA) resonance, and peak efficiencies of 92.57% and 91.12%, respectively. The universal design eliminates the need for complex alignment mechanisms, offering a scalable solution for multi-drone fleets in energy, agriculture, and disaster response applications. Full article
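The core idea behind the frequency-regulated design is to operate the converter at a frequency where the input impedance is almost purely resistive (θ < 10°), so the source delivers essentially no reactive power. As a rough illustration only, the snippet below computes the series resonance frequency and input impedance angle of a plain R-L-C branch; the component values are invented for the example and are not the paper's LCC-S parameters.

```python
import math
import cmath

def input_impedance(f, L, C, R):
    """Complex input impedance of a series R-L-C branch at frequency f (Hz)."""
    w = 2 * math.pi * f
    return complex(R, w * L - 1.0 / (w * C))

# Invented example values -- NOT the paper's LCC-S component values.
L, C, R = 120e-6, 25e-9, 10.0                  # 120 uH, 25 nF, 10 ohm load
f0 = 1.0 / (2 * math.pi * math.sqrt(L * C))    # series resonance frequency
theta = math.degrees(cmath.phase(input_impedance(f0, L, C, R)))
print(f"resonance ~= {f0 / 1e3:.1f} kHz, impedance angle ~= {theta:.3f} deg")
```

At resonance the inductive and capacitive reactances cancel, so the impedance angle collapses to roughly 0° and the source supplies only real power; the paper's contribution is choosing the operating frequency of the LCC-S network so that this condition holds in both the CC and the CV charging mode.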
22 pages, 3708 KiB  
Article
A Hybrid Optimization Framework for Dynamic Drone Networks: Integrating Genetic Algorithms with Reinforcement Learning
by Mustafa Ulaş, Anıl Sezgin and Aytuğ Boyacı
Appl. Sci. 2025, 15(9), 5176; https://doi.org/10.3390/app15095176 - 6 May 2025
Viewed by 907
Abstract
The growing use of unmanned aerial vehicles (UAVs) in settings as diverse as disaster recovery, rural regions, and smart cities calls for effective techniques for establishing dynamic drone networks. Conventional optimization techniques such as genetic algorithms (GAs) and particle swarm optimization (PSO) struggle with real-time adaptation to changing environments and with multi-objective constraints. This paper proposes a hybrid optimization framework that combines genetic algorithms with reinforcement learning (RL) to improve drone network deployment. We integrate Q-learning into the GA mutation step so that drones adaptively adjust their positions in real time under coverage, connectivity, and energy constraints. In large-scale simulations of wildfire tracking, disaster response, and urban monitoring tasks, the hybrid approach outperforms standalone GA and PSO, with gains of up to 6.7% in coverage, 7.5% shorter average link distance, and faster convergence to the optimal deployment. The proposed framework lets drones form strong, stable networks that adapt to changing mission demands through efficient real-time coordination, making it well suited to autonomous UAV systems in mission-critical settings where adaptability and robustness are essential. Full article
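To make the abstract's key mechanism concrete, the toy sketch below embeds a bandit-style simplification of Q-learning into the mutation step of an elitist GA for a 1-D drone placement problem. The fitness function, action set, and all parameter values are invented for illustration and do not come from the paper.

```python
import random

# Toy 1-D deployment: each drone picks an x-position. Fitness rewards swarm
# spread (coverage) and penalizes links longer than 3 units (connectivity).
# A Q-table biases the GA mutation toward position shifts that paid off before.
ACTIONS = [-1.0, 0.0, 1.0]  # shift a drone left / hold / right
Q = {}                      # state (rounded position) -> per-action values

def fitness(pos):
    pos = sorted(pos)
    coverage = pos[-1] - pos[0]
    broken_links = sum(b - a > 3 for a, b in zip(pos, pos[1:]))
    return coverage - 5 * broken_links

def q_mutate(pos, eps=0.2, alpha=0.5):
    """Mutate one drone's position, choosing the shift epsilon-greedily from
    the Q-table and updating it with the observed fitness change (a one-step,
    bandit-style simplification of Q-learning)."""
    i = random.randrange(len(pos))
    state = round(pos[i])
    qs = Q.setdefault(state, [0.0] * len(ACTIONS))
    a = random.randrange(len(ACTIONS)) if random.random() < eps else qs.index(max(qs))
    child = pos[:]
    child[i] += ACTIONS[a]
    reward = fitness(child) - fitness(pos)   # improvement serves as reward
    qs[a] += alpha * (reward - qs[a])        # move Q toward observed reward
    return child

random.seed(0)
popn = [[random.uniform(0, 5) for _ in range(6)] for _ in range(20)]
best_init = max(map(fitness, popn))
for _ in range(200):  # elitist GA loop with Q-guided mutation
    popn.sort(key=fitness, reverse=True)
    popn = popn[:10] + [q_mutate(random.choice(popn[:10])) for _ in range(10)]
best = max(map(fitness, popn))
print(f"initial best fitness {best_init:.2f} -> final best fitness {best:.2f}")
```

Because the top individuals are carried over each generation (elitism), the best fitness never decreases, while the Q-table steers mutations toward shifts that improved fitness in the past.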