Search Results (1,857)

Search Parameters:
Keywords = UAV drones

36 pages, 2263 KB  
Article
Probabilistic Evaluation of Measurement Uncertainty and Decision Risk in UAV-Based Dimensional Inspection
by Dmytro Malakhov, Tatiana Kelemenová and Michal Kelemen
Drones 2026, 10(6), 405; https://doi.org/10.3390/drones10060405 - 24 May 2026
Abstract
Unmanned aerial vehicles (UAVs) are increasingly used for remote dimensional inspection in transportation monitoring and infrastructure control. In such applications, measurement results are often interpreted relative to regulatory thresholds, making the reliability of inspection decisions strongly dependent on measurement uncertainty. This study presents a probabilistic framework for evaluating measurement uncertainty and decision risk in UAV-based dimensional inspection tasks. A measurement model describing uncertainty scaling with observation geometry is formulated, and the probability of exceedance relative to a regulatory limit is derived. The framework integrates probabilistic measurement modeling with a risk-based decision formulation that accounts for false-positive and false-negative inspection outcomes. The resulting integral inspection risk is analyzed for representative sensing modalities commonly used in UAV platforms, including vision-based systems, LiDAR, and radar sensors. The results demonstrate that uncertainty scaling with flight altitude significantly influences exceedance probability and decision reliability. Sensors with lower intrinsic dispersion maintain sharper threshold transitions and therefore provide more stable regulatory decisions. Sensitivity analysis further confirms that moderate variations in measurement uncertainty can substantially affect inspection risk. The proposed framework provides a quantitative tool for evaluating sensing technologies in UAV-based inspection missions and supports the design of reliable drone-assisted dimensional compliance monitoring systems. Full article
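As a rough illustration of the exceedance probability and decision risk described in this abstract, the sketch below assumes a Gaussian measurement model whose standard deviation grows linearly with flight altitude. Both assumptions, and every name and number in the code (`sigma_at_altitude`, the 10 m reference altitude, the cost weights), are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sigma_at_altitude(sigma_ref: float, altitude_m: float,
                      ref_altitude_m: float = 10.0) -> float:
    """Assumed linear scaling of measurement dispersion with altitude."""
    return sigma_ref * altitude_m / ref_altitude_m


def exceedance_probability(measured: float, limit: float, sigma: float) -> float:
    """P(true value > limit) for a Gaussian centred on the measured value."""
    return 1.0 - normal_cdf((limit - measured) / sigma)


def decision_risk(measured: float, limit: float, sigma: float,
                  cost_fp: float = 1.0, cost_fn: float = 1.0) -> float:
    """Expected cost of the naive decision 'flag if measured > limit'.

    Flagging risks a false positive (true value within the limit);
    passing risks a false negative (true value above the limit).
    """
    p_exceed = exceedance_probability(measured, limit, sigma)
    if measured > limit:
        return cost_fp * (1.0 - p_exceed)
    return cost_fn * p_exceed


# Hypothetical example: 0.02 m dispersion at 10 m altitude, flown at 50 m.
sigma = sigma_at_altitude(0.02, 50.0)
print(round(exceedance_probability(4.1, 4.0, sigma), 4))
```

Under these assumptions, a reading that sits one standard deviation above the limit still leaves a noticeable chance that the true dimension is compliant, which is the kind of threshold softening the abstract attributes to higher-dispersion sensors.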
28 pages, 5551 KB  
Article
Capacity-Aware Lightweight Object Detection for UAV Remote Sensing: Dynamic Coupling Regularity and the SP-YOLO Model Family
by Shihao Yin and Weiqiang Tang
Appl. Sci. 2026, 16(11), 5249; https://doi.org/10.3390/app16115249 - 23 May 2026
Abstract
Object detection in UAV remote sensing imagery is confronted with three primary challenges: severe scale variation, densely clustered small targets, and constrained computational resources. This work introduces a family of lightweight detection models guided by the “Capacity-Aware Configuration Regularity” and incorporates a Feature-Refinement C2f module to enhance representational efficiency. A dynamic coupling mechanism is identified between detection head capacity and the representational quality of Backbone features, which is further validated through systematic ablation studies spanning three parameter magnitudes. Evaluated on the VisDrone2019 benchmark, the proposed model family exhibits a progressive parameter scaling from 1.67 M to 6.15 M. The nano variant achieves 31.7% mAP50 using only 55% of the parameter budget of YOLOv8n, surpassing it by 0.7 percentage points. The small variant, with a parameter budget comparable to YOLOv8n, attains 36.7% mAP50, exceeding it by 5.7 points. The medium variant reaches 43.1% mAP50 with 58% of the parameters of YOLOv8s, outperforming it by 4.1 points. The improvements are pronounced under the stricter mAP50–95 metric, where the small variant outperforms YOLOv8n by 3.3 points and the medium variant surpasses YOLOv8s by 2.8 points, demonstrating robust localization accuracy across a wide range of IoU thresholds. This consistent superiority in the accuracy–efficiency trade-off extends to the DIOR dataset, confirming the robust generalization of the proposed models across diverse remote sensing scenarios. Moreover, the uncovered capacity-matching regularity offers transferable methodological guidance for designing lightweight detection models tailored to resource-constrained platforms. Full article
(This article belongs to the Section Applied Industrial Technologies)
26 pages, 13961 KB  
Article
A UAV–3DGS–VR Workflow for Scenario-Comparable Immersive Review in Heritage Landscapes
by Xintong Li, Wenqi Sheng, Yixuan Tang, Yingwen Yu and Yuyang Peng
Drones 2026, 10(6), 404; https://doi.org/10.3390/drones10060404 - 23 May 2026
Abstract
Unmanned aerial vehicles (UAVs) are widely used for documentation, surveying, and 3D modeling in the built environment, yet their outputs often remain difficult to reuse for immersive comparison of alternative construction scenarios. This study presents a low-cost UAV-to-3DGS-to-VR workflow for constructing scenario-comparable immersive environments for built-environment review. The workflow combines multi-angle UAV imagery, point-cloud-based geometric anchoring, 3D Gaussian Splatting (3DGS), and Unity-based virtual reality (VR) to transform drone-captured reality into a reusable scene for controlled scenario comparison. The workflow is demonstrated in Middenbeemster, the central town of the Beemster polder World Heritage property. One present-condition scene (M0) and three alternative construction scenarios (M1 to M3) were created within a shared spatial reference. Reconstruction quality was assessed using PSNR and SSIM, and the VR scenes were further evaluated through eye-tracking, head-motion recording, and subjective ranking. The results indicate that the workflow can generate visually reliable and directly comparable immersive scenes from UAV data in this case study. Behavioral and subjective findings showed a consistent pattern, with M1 appearing more compatible than M2 and M3 in this pilot evaluation. The study contributes a pilot UAV-based workflow that links reality capture, immersive scenario comparison, and supplementary behavioral evidence within one process. Full article
(This article belongs to the Topic 3D Documentation of Natural and Cultural Heritage)
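The abstract above assesses reconstruction quality with PSNR and SSIM. For reference, a minimal PSNR sketch follows; it uses the standard definition for 8-bit images and operates on flat lists of pixel values. The function names are illustrative and this is not the authors' evaluation code. SSIM is omitted because it needs windowed local statistics.

```python
import math
from typing import Sequence


def mean_squared_error(reference: Sequence[float], rendered: Sequence[float]) -> float:
    """Mean squared difference between two equally sized pixel sequences."""
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(reference, rendered)) / len(reference)


def psnr(reference: Sequence[float], rendered: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    mse = mean_squared_error(reference, rendered)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example with made-up pixel values.
print(psnr([10, 20, 30, 40], [12, 18, 30, 41]))
```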
20 pages, 58594 KB  
Article
FLKFormer: Frequency-Enhanced Large-Kernel Framework for Object Detection in UAV Imagery
by Yunhao Chen, Wen-Zhun Huang, Zhen Wang, Sihao Zeng and Chen Yang
Remote Sens. 2026, 18(11), 1686; https://doi.org/10.3390/rs18111686 - 22 May 2026
Viewed by 87
Abstract
UAV object detection remains challenging due to large scale variation, dense small objects, frequent occlusion, and complex background interference. Existing CNN-based detectors are often limited by weak small-object representation, while Transformer-based detectors may not adequately preserve local details in dense aerial scenes. This paper proposes a dual-path detection framework that integrates frequency-domain enhancement with large-kernel convolution and Transformer-based global modeling. An FFT Large-Kernel Convolution (FFLKC) module is introduced to enhance high-frequency details and enlarge the effective receptive field. A Transformer pathway with Full-Process Feature Attention (FPFA) is designed to strengthen long-range dependency modeling and semantic representation. A Frequency-Semantic Memory-guided Adaptive Fusion (FMSAF) module is further employed to integrate local detail features and global contextual information. Experiments on UAVDT and VisDrone demonstrate that the proposed method achieves superior overall detection performance and stronger small-object perception than mainstream detectors. The method reaches 58.7 AP and 51.8 AP_S on UAVDT, and 39.4 AP and 30.5 AP_S on VisDrone. Qualitative and quantitative results verify the effectiveness of the proposed design in improving detection quality under complex UAV backgrounds. Full article
19 pages, 5072 KB  
Article
MDCL-DETR: Multi-Domain Enhancement and Cross-Layer Feature Fusion for Small Object Detection
by Tianran Hao, Xiao Zhang and Bing Zhou
Sensors 2026, 26(11), 3305; https://doi.org/10.3390/s26113305 - 22 May 2026
Viewed by 148
Abstract
Small object detection in uncrewed aerial vehicle (UAV) imagery is hindered by limited pixels, insufficient detailed information, and strong background interference, leading to weak feature representation and poor contextual modeling. To address these issues, we propose a multi-domain enhancement and cross-layer feature fusion detection Transformer (MDCL-DETR) with progressive feature processing. First, a multi-domain enhancement module (MDEM) based on CSP (cross stage partial) structure is proposed, which fuses spatial and frequency-domain features in a lightweight manner to enhance object detail and global structures while effectively distinguishing object features from background interference. Second, a cross-layer feature extraction module (CLEM) is introduced to aggregate multi-scale features across layers, alleviate information loss caused by downsampling, and preserve spatial details of small objects while integrating high-level contextual semantics. Meanwhile, a gated Mamba fusion module (GMFM) is proposed, which adopts the Mamba architecture for long-range dependency modeling of multi-scale features and integrates a gating mechanism to realize the dynamic weighted fusion of local details and global context, further improving feature discriminability and global modeling capability. Finally, a fine-grained enhancement module (FGEM) is designed, which leverages feature reorganization and adaptive feature extraction to reinforce and compensate fine-grained features. Extensive experimental results validate the effectiveness and generalization of the proposed method, achieving mAP50 scores of 54.1% and 56.2% on the VisDrone2019 and AI-TOD datasets. Full article
(This article belongs to the Section Sensing and Imaging)
25 pages, 14069 KB  
Article
RSMamDet: Efficient UAV Remote Sensing Vehicle Detection via Linear State Space Models and Adaptive Multi-Level Feature Fusion
by Man Wu, Xiaozhang Liu, Xiulai Li and Wenbiao Gan
Drones 2026, 10(5), 396; https://doi.org/10.3390/drones10050396 - 21 May 2026
Viewed by 108
Abstract
Accurate and efficient vehicle detection from unmanned aerial vehicle (UAV) imagery is essential for intelligent transportation, urban monitoring, and public safety, yet this task remains challenging due to high target density, extreme scale variation, complex backgrounds, and stringent onboard computational constraints. Existing DETR-based detectors model global context through self-attention but incur quadratic O(N²) complexity that is prohibitive for high-resolution UAV images, while CNN-based methods lack the long-range contextual awareness needed for dense small-object scenarios. We propose RSMamDet, an efficient end-to-end detection framework built upon RT-DETR that replaces quadratic self-attention with linear O(N) State Space Model scanning. The framework integrates a MobileMamba backbone with a Selective Feature Scanning module for efficient global context modeling, a Dimension-Aware Selective Integration module for adaptive cross-scale feature fusion, a Poly Kernel Inception Network encoder for multi-receptive-field feature enrichment, and an Adaptive Multi-Level Feature Fusion module for content-aware dynamic upsampling, complemented by an Uncertainty-Minimal Composite loss for stable query selection in cluttered aerial scenes. Experiments on DroneVehicle and VisDrone2019 demonstrate that RSMamDet achieves mAP50 of 72.6% and 40.2%, surpassing state-of-the-art methods by 4.1% and 2.2%, respectively, while maintaining real-time inference at 186.2 FPS with only 19.8M parameters and 42.3 GFLOPs, representing a 6.14× reduction in computational cost and a 3.86× reduction in model parameters compared to the strongest baseline. Full article
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)
24 pages, 2250 KB  
Article
From Generic to Adaptive: Similarity-Adaptive Receptive-Field Cross DETR for Remote-Sensing Object Detection
by Chenyu Lin, Yunzhan Fu, Hang Xu, Xuyang Teng and Tingyu Wang
Remote Sens. 2026, 18(10), 1670; https://doi.org/10.3390/rs18101670 - 21 May 2026
Viewed by 102
Abstract
Object detection in optical remote sensing imagery faces persistent challenges from severe instance overlap, extreme spatial density, and motion or atmospheric blur. These degradations cause conventional detectors to over-mix neighboring instance features and fail to separate closely packed objects. To address these limitations, we propose SARC-DETR, a detection framework that augments the RT-DETR architecture with two complementary plug-in modules: Similarity Adaptive Convolution (SAC) and Receptive Field Cross Convolution (RCC). SAC introduces a reproducing-kernel-Hilbert-space (RKHS) motivated similarity gate that selectively suppresses responses inconsistent with local feature prototypes, thereby reducing cross-instance interference in overlapped and blurred regions. RCC constructs a large directional receptive field through orthogonal strip-based aggregation and content-adaptive fusion, enabling efficient long-range context capture without quadratic complexity overhead. Both modules can be integrated into existing DETR-style detectors without modifying the detection head or training protocol. On VisDrone2019-DET, SARC-DETR improves AP^val from 29.7 to 34.8, AP50^val from 49.5 to 56.2, and AP_S^val from 19.2 to 24.8. On DIOR, AP rises from 57.9 to 68.4, and on NWPU VHR-10, from 44.4 to 66.5, demonstrating robust cross-dataset generalization. After structural reparameterization, the additional overhead is less than 0.75 M parameters and 0.36 G FLOPs, confirming deployment suitability for UAV and satellite-based remote sensing applications. Full article
18 pages, 477 KB  
Systematic Review
Human-Drone Interaction in Older Adults: A Systematic Review
by Agustín Gómez-López, Yuxa Maya-López, Pablo Olivos-Jara and Rafael Morales
Drones 2026, 10(5), 389; https://doi.org/10.3390/drones10050389 - 20 May 2026
Viewed by 255
Abstract
An aging population, increased life expectancy and loneliness among older people constitute a growing challenge, driving interest in technological solutions such as home drones. The aim of this study is to analyze their potential for older adults through a systematic review following PRISMA guidelines, including articles indexed in Web of Science, Scopus, PubMed and the ACM Digital Library up to February 2026 and following the Joanna Briggs Institute (JBI) methodology. A total of 285 records were initially identified and imported into JBI, of which 41 duplicate records were removed, and 231 studies were excluded after screening, resulting in 13 studies meeting the inclusion criteria. The reviewed studies suggest generally favorable perceptions among some older adults regarding the use of drones in the areas of health, support and safety, alongside barriers related to usability, trust and user interaction. Recent studies incorporate practical applications, highlighting the potential applicability of drones in supporting aspects related to autonomy, health and safety among older adults. Overall, the literature, though still limited, shows a shift towards more specific applications, highlighting the potential of drones to support the autonomy, health and safety of older adults, although their implementation remains influenced by factors of acceptance and user experience. Full article
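The record counts reported in this abstract can be cross-checked with a few lines of arithmetic. The variable names below are illustrative; the numbers are the ones quoted in the abstract.

```python
# Screening flow as reported in the abstract.
records_identified = 285
duplicates_removed = 41
excluded_after_screening = 231

records_screened = records_identified - duplicates_removed
studies_included = records_screened - excluded_after_screening

print(records_screened, studies_included)
```

The result matches the 13 included studies stated in the abstract.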
26 pages, 9060 KB  
Article
Synergistic Multi-Model Fusion for Efficient–Accurate Multi-Defect Detection in Power Lines
by Linfeng Xi, Tao Shen, Guanglong Zhao, Nan Wang and Zhi Li
Sensors 2026, 26(10), 3185; https://doi.org/10.3390/s26103185 - 18 May 2026
Viewed by 333
Abstract
In unmanned aerial vehicle (UAV)-based power line inspection, multi-scale defects and complex backgrounds challenge the balance between detection accuracy, speed, and model lightweighting, limiting automated grid inspection. This paper proposes a Multi-Scale Mamba Framework (MS-Mamba) for efficient and accurate defect perception. A drone inspection dataset containing 5137 images from 14 defect categories was constructed and divided into training and validation sets with an 8:2 split. To address the large scale variation among defects, the categories are decoupled into macroscopic, mesoscopic, and microscopic groups according to physical attributes and visual scales. As the core perception engine, a lightweight state-space mechanism is designed to balance accuracy and deployability. A spatial resolution-aware hierarchical reconstruction strategy and a dynamic feature selection mechanism are integrated to enhance feature extraction, reduce background redundancy, and improve small-target representation. Compared with the YOLOv5s baseline, MS-Mamba achieves an mAP@0.5 of 0.749, corresponding to a 15.6 percentage-point improvement, while reducing parameters by 0.13 M and computational cost by 1.7 GFLOPs. Ablation studies and visual analyses further confirm fewer missed and false detections in complex backgrounds. The developed end-to-end inspection system was validated through closed-loop engineering tests, demonstrating strong potential for industrial deployment. Full article
29 pages, 1625 KB  
Article
EfficientIR-Det Towards Efficient and Accurate DETR for UAV Infrared Object Detection
by Xiang Yang, Hanbin Li and Xiaolan Xie
Sensors 2026, 26(10), 3129; https://doi.org/10.3390/s26103129 - 15 May 2026
Viewed by 125
Abstract
Infrared (IR) object detection on unmanned aerial vehicle (UAV) platforms is fundamentally challenged by low signal-to-noise ratios and extremely tight onboard computational budgets. Conventional CNNs lack sufficient global context, while Transformers suffer from quadratic complexity, hindering real-time deployment. To address these bottlenecks, we propose EfficientIR-Det, a lightweight end-to-end detector featuring a holistic optimization of the backbone, encoder, and sampling mechanisms. Specifically, we design a Partial Star Network (PSN) backbone that achieves implicit high-dimensional feature expansion via element-wise multiplication to amplify weak IR signals with minimal redundancy. Furthermore, a Hierarchical Mamba (HiMamba) encoder leverages selective state-space modeling to provide linear-complexity global enhancement with superior hardware efficiency. To refine cross-scale representations, we introduce an Adaptive Gated Sampling (AGS) module and a Hierarchical Sampling Strategy (HSS) to optimize feature fusion and sampling budget allocation toward dim-small targets. On HIT-UAV, EfficientIR-Det achieves 88.4% mAP@0.5, outperforming the RT-DETR-R18 baseline by 3.3 points while reducing FLOPs and parameters by 48.9% and 44.2%, respectively. On the larger-scale DroneVehicle dataset, it consistently leads with a 74.1% mAP@0.5 and a high inference speed of 140.8 FPS. Our results offer a promising research scheme for robust, real-time infrared perception on edge-constrained UAV platforms. Full article
24 pages, 4429 KB  
Article
SDP-YOLOv8: A Lightweight Enhancement Algorithm for Small Object Detection in UAV Aerial Photography
by You-Chao Lu, Yi-Han Xu, Wen Zhou and Ding Zhou
Appl. Sci. 2026, 16(10), 4941; https://doi.org/10.3390/app16104941 - 15 May 2026
Viewed by 137
Abstract
To overcome the limitations of existing UAV object detection algorithms—particularly missed detections, false alarms, and the progressive loss of fine-grained features for small objects—this paper proposes SDP-YOLOv8, a lightweight and parameter-efficient enhancement of YOLOv8. The design aims to improve small-object detection accuracy while maintaining a lightweight architecture suitable for deployment on memory-constrained UAV platforms. Four lightweight-oriented modifications are introduced: (1) SCFS, which combines SPD-Conv for low-information-loss downsampling with a C2f block and SimAM attention; (2) DCSPPF, expanding the receptive field via parallel dilated convolutions; (3) a GhostConv-infused Patch Merging upsampling layer for local context enhancement; and (4) an extra small-scale detection head to preserve fine details. On VisDrone2019, experimental results show that SDP-YOLOv8 improved mAP@0.5 by 3.90% and mAP@0.5:0.95 by 2.60%, with a 14.4% reduction in parameters. The model maintains real-time performance (53.5 FPS on an RTX 3090 at FP32 with batch size 1, 38.7 FPS on a Jetson Orin Nano with TensorRT FP16 at batch size 1) and offers a favorable trade-off between detection accuracy, parameter efficiency, and memory footprint, making it a potential candidate for onboard deployment on resource-limited UAVs in aerial monitoring scenarios, pending further validation on diverse datasets and hardware platforms. Full article
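The abstract mentions SPD-Conv as a downsampling step with low information loss. The idea behind its space-to-depth stage can be sketched on a single-channel image: instead of discarding pixels with a strided convolution, each `scale × scale` neighborhood is redistributed into separate channels. The function below is a minimal plain-Python illustration; the channel ordering is an assumption, and the non-strided convolution that normally follows is left out.

```python
from typing import List

Image = List[List[int]]


def space_to_depth(image: Image, scale: int = 2) -> List[Image]:
    """Split one H x W image into scale*scale sub-images of size H/scale x W/scale.

    Every input pixel appears exactly once in the output, so resolution is
    reduced without dropping any values.
    """
    height, width = len(image), len(image[0])
    if height % scale or width % scale:
        raise ValueError("image sides must be divisible by scale")
    channels = []
    for dy in range(scale):          # row offset inside each block
        for dx in range(scale):      # column offset inside each block
            channels.append([row[dx::scale] for row in image[dy::scale]])
    return channels


# Toy 4 x 4 input with distinct values so the rearrangement is easy to follow.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
sub_images = space_to_depth(image)
print(sub_images[0])
```

A strided 2× downsampling would keep only one of these four sub-images; stacking all four is what preserves the fine detail that small objects depend on.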
30 pages, 21776 KB  
Article
LDSNet: A Lightweight Detail-Sensitive Network for Small Object Detection in Low-Altitude UAV Scenarios
by Tong Tan, Xianrong Peng, Jianlin Zhang, Haorui Zuo, Yao Zhang, Yunhao Wu and Hui Li
J. Imaging 2026, 12(5), 209; https://doi.org/10.3390/jimaging12050209 - 14 May 2026
Viewed by 284
Abstract
Object detection in Unmanned Aerial Vehicle (UAV) imagery faces significant challenges due to the unique aerial perspective. A major bottleneck is the weak feature representation of small objects, which limits both detection accuracy and computational efficiency. To address this issue, we propose a Lightweight Detail-Sensitive Network (LDSNet). Specifically, LDSNet consists of three key components: (1) Lightweight Detail-Sensitive Downsampling (LDSDown), which combines anti-aliasing smoothing with dual-path feature extraction to preserve the spatial details of small objects during downsampling; (2) Shared Recursive Dilated Convolution (SRDC), which uses weight-shared multi-rate dilated convolutions to capture multi-scale context and enlarge the receptive field without introducing extra parameters; and (3) Deeply Decoupled Grouped Head (DGHead), which employs high-ratio grouped convolutions to significantly reduce the computational cost of processing high-resolution inputs. Extensive experiments on the VisDrone2019 and HIT-UAV datasets demonstrate that LDSNet achieves an excellent trade-off between accuracy and efficiency. Compared to the YOLOv11n baseline, LDSNet reduces parameters by 84.6% (from 2.6 M to 0.4 M) and FLOPs by 29.2% (from 6.5 G to 4.6 G), while improving mAP50 by 2.2% on VisDrone2019 and achieving 94.5% on HIT-UAV. Full article
(This article belongs to the Special Issue AI-Driven Remote Sensing Image Processing and Pattern Recognition)
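The weight sharing behind the SRDC component in the LDSNet abstract can be illustrated in one dimension: the same kernel is applied at several dilation rates, so the receptive field grows while the number of learnable weights stays at the kernel length. The sketch below is a simplified plain-Python analogue; zero padding and fusing the branches by summation are assumptions, not details from the paper.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Same-length 1D convolution with zero padding and a given dilation."""
    span = (len(kernel) - 1) * dilation
    left = span // 2
    padded = [0.0] * left + list(signal) + [0.0] * (span - left)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def shared_multi_rate_conv(signal: Sequence[float], kernel: Sequence[float],
                           rates: Sequence[int] = (1, 2, 3)) -> List[float]:
    """Apply one shared kernel at several dilation rates and sum the branches."""
    branches = [dilated_conv1d(signal, kernel, rate) for rate in rates]
    return [sum(values) for values in zip(*branches)]


# An impulse input makes the enlarged receptive field visible.
impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(shared_multi_rate_conv(impulse, [1.0, 1.0, 1.0], rates=(1, 2)))
```

Adding more rates widens the context that each output position sees, yet the parameter count is still the three kernel weights.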
17 pages, 2705 KB  
Article
A Cooperative Network Management Architecture for Manned–Unmanned Aircraft Teaming Using Network Drones
by Changmin Park and Hwangnam Kim
Electronics 2026, 15(10), 2102; https://doi.org/10.3390/electronics15102102 - 14 May 2026
Viewed by 200
Abstract
Conventional direct communication in Manned–Unmanned Teaming (MUM-T) suffers from fundamental scalability and security limitations. As the number of Unmanned Aerial Vehicles (UAVs) increases, the communication burden on the manned aircraft (MA) grows significantly, while security threats originating from UAVs may directly propagate to the MA. To address these challenges, this paper proposes a hierarchical communication architecture that introduces dedicated Network Drones (NDs) as intermediate communication mediators and trust boundaries between the MA and multiple UAV swarms. In the proposed design, the MA interacts exclusively with NDs, while UAV swarms communicate through ND-mediated links, effectively bounding the number of MA-facing connections and enabling scalable communication. Building on this structured communication model, a message-level Zero-Trust framework is enforced at the MA–ND interface. Each message is evaluated using a multi-dimensional risk model that incorporates authentication consistency, behavioral consistency, content validity, and contextual information, enabling early detection and containment of compromised UAV behavior. Furthermore, the architecture incorporates backup planning mechanisms, including dynamic reassociation and hot-standby operation, to ensure robust communication under ND failure conditions. Experimental results demonstrate that the proposed approach reduces MA-facing communication overhead, stabilizes end-to-end latency, and improves detection performance in terms of false positives and false negatives, while maintaining system robustness under failure scenarios. Full article
(This article belongs to the Special Issue Intelligent Technologies for Vehicular Networks, 2nd Edition)
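The message-level risk evaluation described in the abstract above combines four kinds of evidence. One simple way to picture such a model is a weighted score with an admission threshold, sketched below. The weights, the threshold, the linear combination, and all names (`MessageEvidence`, `risk_score`, `admit`) are hypothetical choices for illustration; the abstract does not specify how the paper combines the dimensions.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MessageEvidence:
    """Per-message scores in [0, 1], where 1.0 means fully trustworthy."""
    auth_consistency: float
    behavior_consistency: float
    content_validity: float
    context_score: float


# Hypothetical weights; a real system would have to calibrate these.
DEFAULT_WEIGHTS: Dict[str, float] = {
    "auth_consistency": 0.4,
    "behavior_consistency": 0.3,
    "content_validity": 0.2,
    "context_score": 0.1,
}


def risk_score(evidence: MessageEvidence,
               weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-dimension risk (1 - trust score)."""
    total = sum(weights.values())
    weighted = sum(w * (1.0 - getattr(evidence, name)) for name, w in weights.items())
    return weighted / total


def admit(evidence: MessageEvidence, threshold: float = 0.3) -> bool:
    """Forward the message only if its risk stays below the threshold."""
    return risk_score(evidence) < threshold


suspicious = MessageEvidence(0.2, 0.5, 0.9, 0.8)
print(round(risk_score(suspicious), 3), admit(suspicious))
```

Scoring each message independently is what lets a mediator contain a compromised sender early, since no long-lived trust is granted to the link itself.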
23 pages, 2910 KB  
Article
MD-YOLO: A Multi-Scale Adaptive and Dual-Attention Enhanced YOLOv11 for Small Object Detection
by Wenyan Zhou and Gu Gong
Electronics 2026, 15(10), 2099; https://doi.org/10.3390/electronics15102099 - 14 May 2026
Viewed by 217
Abstract
Recent YOLO-based object detection methods have demonstrated strong performance in real-time applications due to their efficient end-to-end architecture. However, in complex scenarios such as VisDrone2019, existing methods still face limitations in small object detection and multi-scale feature modeling capability. These performance bottlenecks are not only attributed to model-level constraints, such as the loss of low-level spatial details during progressive downsampling and the insufficient preservation of fine-grained structural information in high-level semantic representations during feature propagation, which consequently limits multi-scale feature representation and fusion, but are also influenced by data-level factors, including long-tailed distributions and spatial distribution bias. To address these limitations, this paper proposes an improved model named MD-YOLO. First, a Multi-scale Adaptive Channel (MAC) module is introduced into the backbone to replace conventional stride-based downsampling, enhancing multi-scale feature representation while preserving fine-grained information. Second, a Dual Attention Feature Fusion (DAFA) module is designed to align features across different resolutions and further enhance fused representations using both channel and spatial attention mechanisms. Furthermore, a high-resolution P2 detection head is incorporated to enhance the detection capability for dense small objects. Experimental results on the VisDrone2019 dataset demonstrate that the proposed method substantially outperforms the YOLOv11s baseline, improving mAP@0.5 from 38.5% to 45.6% and mAP@0.5:0.95 from 22.8% to 27.1%, while maintaining a reasonable computational cost. Full article
24 pages, 6298 KB  
Article
Siamese-ViT: A Local–Global Feature Fusion Method for Real-Time Visual Navigation of UAVs in Real-World Environments
by Yu Cheng, Xixiang Liu, Shuai Chen and Chuan Xu
Remote Sens. 2026, 18(10), 1556; https://doi.org/10.3390/rs18101556 - 13 May 2026
Viewed by 172
Abstract
Visual scene matching navigation (VSMN) for unmanned aerial vehicles (UAVs) boasts advantages such as high precision, high reliability, and autonomy. The biggest challenge lies in the tension between local fine-grained information and global semantics, as well as limited generalization ability in real-world environments. While existing Transformer-based cross-view geolocation methods enhance global context modeling capabilities, they still generally face issues such as high demands on training data and computational resources, insufficient fusion of local fine-grained information and global semantics, and limited real-time performance in complex real-world environments. To address these problems, we propose a scene matching and localization algorithm based on the Siamese-ViT. For feature extraction, we use the ViT model to extract global features and K-means clustering to aggregate local features. Combined with the global features extracted by the ViT, a robust local–global feature representation vector is generated. For feature matching, incremental principal component analysis (IPCA) is used to reduce the dimensionality of the high-dimensional feature space, and a KD-tree is constructed for fast feature retrieval to improve matching efficiency. We validated our algorithm on the University-1652 dataset and a dataset of real-world satellite-drone image pairs. The results show that our Siamese-ViT outperforms other models in both Recall and AP. We conduct flight experiments in real-world environments, capturing drone images of complex scenes, including farmland, urban buildings, and waterways. The results show that, at a flight altitude of 350 m, our algorithm achieves an average absolute error of 6.2063 m in latitude, 6.7552 m in longitude, and 10.1922 m in horizontal position. Therefore, our Siamese-ViT demonstrates ideal overall positioning accuracy. Full article
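The three error figures quoted for the 350 m flight can be sanity-checked against each other. Assuming the horizontal figure is the mean of per-frame Euclidean errors (an assumption; the abstract does not say how it was averaged), it can never be smaller than the Euclidean combination of the two mean per-axis errors, because the norm of an average is at most the average of the norms. The variable names are illustrative.

```python
import math

# Values quoted in the abstract, in metres.
mean_abs_lat_error = 6.2063
mean_abs_lon_error = 6.7552
mean_horizontal_error = 10.1922

# Lower bound on the mean horizontal error implied by the per-axis means.
lower_bound = math.hypot(mean_abs_lat_error, mean_abs_lon_error)

print(round(lower_bound, 4), lower_bound <= mean_horizontal_error)
```

The reported horizontal error sits above this bound, so the three numbers are mutually consistent under that reading.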