Search Results (40)

Search Parameters:
Keywords = cross-domain unmanned aerial vehicle

24 pages, 5245 KB  
Article
Mobility-Aware Joint Optimization for Hybrid RF-Optical UAV Communications
by Jing Wang, Zhuxian Lian, Fei Wang and Tong Xue
Photonics 2025, 12(12), 1205; https://doi.org/10.3390/photonics12121205 - 7 Dec 2025
Viewed by 314
Abstract
This paper investigates a UAV-assisted wireless communication system that integrates optical wireless communication (LiFi) with conventional RF links to enhance network capacity in crowd-gathering scenarios. While the unmanned aerial vehicle (UAV) serves as a flying base station providing downlink transmission to mobile ground users, the study places particular emphasis on the role of LiFi as a complementary physical layer technology within heterogeneous networks—an aspect closely connected to optical and photonics advancements. The proposed system is designed for environments such as theme parks and public events, where user groups move collectively toward points of interest (PoIs). To maintain quality of service (QoS) under dynamic mobility, we develop a joint optimization framework that simultaneously designs the UAV’s flight path and resource allocation over time. Given the problem’s non-convexity, a block coordinate descent (BCD) based approach is introduced, which decomposes the problem into power allocation and path planning subproblems. The power allocation step is solved using convex optimization techniques, while the path planning subproblem is handled via successive convex approximation (SCA). Simulation results demonstrate that the proposed algorithm achieves rapid convergence within 3–5 iterations while guaranteeing 100% heterogeneous QoS satisfaction, ultimately yielding nearly 15.00 bps/Hz system capacity enhancement over baseline approaches. These findings motivate the integration of coordinated three-dimensional trajectory planning for multi-UAV cooperation as a promising direction for further enhancement. Although LiFi is implemented in free-space optics rather than fiber-based sensing, this work highlights a relevant optical technology that may inspire future cross-domain applications, including those in optical sensing, where UAVs and reconfigurable optical links play a role. Full article
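The block coordinate descent (BCD) idea mentioned in this abstract can be illustrated with a toy problem. The sketch below is not the paper's formulation: the objective, the variable names `p` and `q`, and the closed-form block updates are all hypothetical stand-ins for the power-allocation and path-planning subproblems, and the toy objective is convex, whereas the paper describes a non-convex problem handled with successive convex approximation.

```python
def objective(p, q):
    # Hypothetical toy objective with a coupling term p*q (jointly convex).
    return (p - 3.0) ** 2 + (q + 1.0) ** 2 + p * q


def update_p(q):
    # Exact minimizer over p with q held fixed: 2(p - 3) + q = 0.
    return 3.0 - q / 2.0


def update_q(p):
    # Exact minimizer over q with p held fixed: 2(q + 1) + p = 0.
    return -1.0 - p / 2.0


def block_coordinate_descent(p=0.0, q=0.0, tol=1e-9, max_iters=100):
    """Alternate exact block updates until the objective stops improving."""
    history = [objective(p, q)]
    for _ in range(max_iters):
        p = update_p(q)   # block 1 (stand-in for power allocation)
        q = update_q(p)   # block 2 (stand-in for path planning)
        history.append(objective(p, q))
        if history[-2] - history[-1] < tol:
            break
    return p, q, history


if __name__ == "__main__":
    p, q, history = block_coordinate_descent()
    print(f"p={p:.4f}, q={q:.4f}, iterations={len(history) - 1}")
```

Each block update solves its subproblem exactly with the other block frozen, so the objective cannot increase from one iteration to the next; that monotonicity is what makes the alternating scheme converge on this toy problem.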

22 pages, 14869 KB  
Article
WMFA-AT: Adaptive Teacher with Weighted Multi-Layer Feature Alignment for Cross-Domain UAV Object Detection
by Gui Cheng, Hao Yang, Yan Tian, Meilin Xie, Chaoya Dang, Qing Ding and Xubin Feng
Remote Sens. 2025, 17(23), 3854; https://doi.org/10.3390/rs17233854 - 28 Nov 2025
Viewed by 511
Abstract
Unmanned Aerial Vehicle (UAV) object detection has witnessed rapid progress in recent years. However, its heavy reliance on labeled data and the assumption of consistent data distributions between training and deployment domains limit its generalization ability, leading to significant performance degradation under domain shifts. To address this challenge arising from substantial discrepancies in feature distributions across UAV images captured under diverse conditions, we propose a novel framework: Adaptive Teacher with Weighted Multi-layer Feature Alignment (WMFA-AT) for cross-domain UAV object detection. WMFA-AT adopts a teacher–student mutual learning paradigm, integrating domain adversarial learning with weighted multi-layer feature alignment and strong-weak data augmentation to effectively mitigate domain discrepancies. Specifically, the student model performs adversarial alignment using multiple domain discriminators applied to different feature layers, where layer-wise transferability is quantitatively estimated and used to adaptively weight the alignment process. This strategy ensures that features from the source and target domains are aligned in a distribution-aware manner. Meanwhile, the teacher model benefits from the student model via mutual learning, incorporating knowledge from both source and target domains while avoiding overfitting to the source. To comprehensively evaluate the proposed approach, we construct four challenging cross-domain UAV object detection benchmarks covering cross-time, cross-camera, cross-view, and cross-weather scenarios. Experimental results demonstrate that WMFA-AT consistently improves detection accuracy across diverse domain shifts, highlighting its robustness, generalization capability, and practical applicability in real-world UAV deployment settings. Full article

23 pages, 1757 KB  
Review
A Survey on Privacy Preservation Techniques in IoT Systems
by Rupinder Kaur, Tiago Rodrigues, Nourin Kadir and Rasha Kashef
Sensors 2025, 25(22), 6967; https://doi.org/10.3390/s25226967 - 14 Nov 2025
Viewed by 1547
Abstract
The Internet of Things (IoT) has become deeply embedded in modern society, enabling applications across smart homes, healthcare, industrial automation, and environmental monitoring. However, as billions of interconnected devices continuously collect and exchange sensitive data, privacy and security concerns have escalated. This survey systematically reviews the state-of-the-art privacy-preserving techniques in IoT systems, emphasizing approaches that protect user data during collection, transmission, and storage. Peer-reviewed studies from 2016 to 2025 and technical reports were analyzed to examine applied mechanisms, datasets, and analytical models. Our analysis shows that blockchain and federated learning are the most prevalent decentralized privacy-preserving methods, while homomorphic encryption and differential privacy have recently gained traction for lightweight and edge-based IoT implementations. Despite these advancements, challenges persist, including computational overhead, limited scalability, and real-time performance constraints in resource-constrained devices. Furthermore, gaps remain in cross-domain interoperability, energy-efficient cryptographic designs, and privacy solutions for Unmanned Aerial Vehicle (UAV) and vehicular IoT systems. This survey offers a comprehensive overview of current research trends, identifies critical limitations, and outlines promising future directions to guide the design of secure and privacy-aware IoT architectures. Full article
(This article belongs to the Special Issue Security and Privacy in Wireless Sensor Networks (WSNs))

31 pages, 17949 KB  
Article
Domain-Unified Adaptive Detection Framework for Small Vehicle Targets in Monostatic/Bistatic SAR Images
by Zheng Ye and Peng Zhou
Remote Sens. 2025, 17(22), 3671; https://doi.org/10.3390/rs17223671 - 7 Nov 2025
Viewed by 723
Abstract
Benefiting from the advantages of unmanned aerial vehicle (UAV) platforms such as low cost, rapid deployment capability, and miniaturization, the application of UAV-borne synthetic aperture radar (SAR) has developed rapidly. Utilizing a self-developed monostatic Miniaturized SAR (MiniSAR) system and a bistatic MiniSAR system, our team conducted multiple imaging missions over the same vehicle equipment display area at different times. However, system disparities and time-varying factors lead to a mismatch between the distributions of the training and test data. Additionally, small ground vehicle targets under complex background clutter exhibit limited size and weak scattering characteristics. These two issues pose significant challenges to the precise detection of small ground vehicle targets. To address these issues, this article proposes a domain-unified adaptive target detection framework (DUA-TDF). The approach consists of two stages: (1) image-to-image translation and (2) feature extraction and target detection. In the first stage, a multi-scale detail-aware CycleGAN (MSDA-CycleGAN) is proposed to align the source and target domains at the image level by achieving unpaired image style transfer while emphasizing both global structure and local details of the generated images. In the second stage, a cross-window axial self-attention target detection network (CWASA-Net) is proposed. This network employs a hybrid backbone centered on the cross-window axial self-attention mechanism to enhance feature representation, coupled with a convolution-based stacked cross-scale feature fusion network to strengthen multi-scale feature interaction. To validate the effectiveness and generalization capability of the proposed algorithm, comprehensive experiments are conducted on both self-developed monostatic/bistatic SAR datasets and a public dataset. Experimental results demonstrate that our method achieves an mAP50 exceeding 90% in within-domain tests and maintains over 80% in cross-domain scenarios, demonstrating exceptional and robust detection performance as well as cross-domain adaptability. Full article
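The mAP50 figures quoted in this abstract rest on an intersection-over-union (IoU) threshold of 0.5 for deciding whether a predicted box matches a ground-truth box. The following is a minimal sketch of that matching criterion only, assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)`; the function names are illustrative and the paper's actual evaluation code is not shown in the source.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred_box, gt_box):
    # A prediction counts as a match for AP50 when IoU >= 0.5.
    return iou(pred_box, gt_box) >= 0.5


if __name__ == "__main__":
    print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

For small targets a shift of a few pixels can push IoU below 0.5, which is one reason small-object detection scores tend to be sensitive to localization error.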

40 pages, 10478 KB  
Review
Unmanned Aerial Underwater Vehicles: Research Progress and Prospects
by Hangyu Zhou, Weiqiang Hu, Zhaoyu Wei, Yuehui Teng and Liyang Dong
Appl. Sci. 2025, 15(22), 11868; https://doi.org/10.3390/app152211868 - 7 Nov 2025
Viewed by 2380
Abstract
Unmanned aerial underwater vehicles (UAUVs) will play significant roles in several complex application scenarios including observation of mesoscale ocean phenomena, monitoring of offshore platforms, ocean protection, and maritime rescue. These innovative vehicles can be used in the air and underwater and can easily enter and exit water. This review systematically analyzes the research progress, design challenges, and future prospects of UAUVs, emphasizing their potential to revolutionize integrated cross-domain collaboration. We classify UAUVs into five categories—rotary-wing, fixed-wing, folding-wing, hybrid-wing, and flapping-wing—based on propulsion configurations, and critically evaluate their prototypes, highlighting technological milestones and functional limitations. Unlike prior reviews focused solely on technical developments, this study advocates for a paradigm shift from a technology-push to a market-pull and technology-push interactive development model. Combining the design of UAUV with solutions to technical challenges and specific application requirements is crucial for practical deployment. By synthesizing historical context, current advancements, and future developments, this review not only provides possible strategies for design challenges but also lays a roadmap for UAUV commercialization. Full article
(This article belongs to the Special Issue Advances in Autonomous Underwater Vehicle Technology)

40 pages, 10740 KB  
Article
Structural Design of an Unmanned Aerial Underwater Vehicle with Coaxial Twin Propellers and the Numerical Simulation of the Cross-Domain Characteristics
by Jiancheng Wang, Yikun Feng, Guoqing Zhang, Qiqian Ge, Haobin Jin and Zhewei Zhang
Drones 2025, 9(11), 766; https://doi.org/10.3390/drones9110766 - 6 Nov 2025
Viewed by 1126
Abstract
This paper addresses the structural adaptability and dynamic stability challenges faced by unmanned aerial underwater vehicles (UAUVs) during the transition between air and water. To overcome these issues, this paper proposes a UAUV that uses coaxial twin propellers for propulsion and presents a detailed overall structural design and subsystem design for it. Accurate prediction of the kinematic characteristics of a UAUV during cross-domain motion is of great significance for the design of high-performance UAUVs. Therefore, a numerical simulation method for UAUV cross-domain motion based on the STAR CCM+ (version 202402) software, the volume of fluid (VOF) method, and the dynamic fluid body interaction (DFBI) module was established. The results showed that when the water-entry speed is small, the UAUV’s movement trajectory exhibits continuous undulating motion as the water-entry angle increases. Moreover, during the water-exit process, the smaller the water-exit speed and angle, the greater the change in attitude. The analysis of the dynamic characteristics of cavitation during the UAUV’s water-entry process reveals that the premature rupture of the cavities is detrimental to the UAUV’s movement along the initial entry direction. During the UAUV’s exit from the water, the detachment of water adhering to the UAUV surface causes certain disturbances to its attitude. The findings of this study provide key theoretical insights and technical references for optimizing the structural design of UAUVs. Full article

25 pages, 8305 KB  
Article
SAHI-Tuned YOLOv5 for UAV Detection of TM-62 Anti-Tank Landmines: Small-Object, Occlusion-Robust, Real-Time Pipeline
by Dejan Dodić, Vuk Vujović, Srđan Jovković, Nikola Milutinović and Mitko Trpkoski
Computers 2025, 14(10), 448; https://doi.org/10.3390/computers14100448 - 21 Oct 2025
Cited by 1 | Viewed by 846
Abstract
Anti-tank landmines endanger post-conflict recovery. Detecting camouflaged TM-62 landmines in low-altitude unmanned aerial vehicle (UAV) imagery is challenging because targets occupy few pixels and are low-contrast and often occluded. We introduce a single-class anti-tank dataset and a YOLOv5 pipeline augmented with a SAHI-based small-object stage and Weighted Boxes Fusion. The evaluation combines COCO metrics with an operational operating point (score = 0.25; IoU = 0.50) and stratifies by object size and occlusion. On a held-out test partition representative of UAV acquisition, the baseline YOLOv5 attains mAP@0.50:0.95 = 0.553 and AP@0.50 = 0.851. With tuned SAHI (768 px tiles, 40% overlap) plus fusion, performance rises to mAP@0.50:0.95 = 0.685 and AP@0.50 = 0.935—ΔmAP = +0.132 (+23.9% rel.) and ΔAP@0.50 = +0.084 (+9.9% rel.). At the operating point, precision = 0.94 and recall = 0.89 (F1 = 0.914), implying a 58.4% reduction in missed detections versus a non-optimized SAHI baseline and a +14.3 AP@0.50 gain on the small/occluded subset. Ablations attribute gains to tile size, overlap, and fusion, which boost recall on low-pixel, occluded landmines without inflating false positives. The pipeline sustains real-time UAV throughput and supports actionable triage for humanitarian demining, as well as motivating RGB–thermal fusion and cross-season/-domain adaptation. Full article
(This article belongs to the Special Issue Advanced Image Processing and Computer Vision (2nd Edition))
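The headline numbers in the abstract above can be cross-checked with a few lines of arithmetic, and the tiling parameters (768 px tiles, 40% overlap) can be turned into concrete tile offsets. This is a sketch under stated assumptions: the function names are illustrative, the 1920 px image width in the usage example is hypothetical, and the stride rule (`tile - int(tile * overlap_ratio)`, with the last tile shifted back to end at the image border) is one common convention that may differ from the authors' pipeline.

```python
def relative_gain(baseline, improved):
    """Return (absolute delta, relative gain in percent) between two scores."""
    delta = improved - baseline
    return delta, 100.0 * delta / baseline


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def tile_starts(length, tile, overlap_ratio):
    """Start offsets of overlapping tiles along one image axis.

    Assumption: consecutive tiles share int(tile * overlap_ratio) pixels,
    and the final tile is aligned to the image border.
    """
    if length <= tile:
        return [0]
    stride = tile - int(tile * overlap_ratio)
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


if __name__ == "__main__":
    # Values taken from the abstract.
    print(relative_gain(0.553, 0.685))   # mAP@0.50:0.95, baseline vs tuned
    print(relative_gain(0.851, 0.935))   # AP@0.50, baseline vs tuned
    print(f1_score(0.94, 0.89))          # operating point
    # Hypothetical 1920 px wide frame, tile settings from the abstract.
    print(tile_starts(1920, 768, 0.40))
```

With these settings the stride works out to 461 px, so a 1920 px axis is covered by four tiles; the reported relative gains (+23.9% and +9.9%) and F1 = 0.914 are consistent with the stated baseline and tuned scores.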

16 pages, 3792 KB  
Article
Design and Implementation of Polar UAV and Ice-Based Buoy Cross-Domain Observation System
by Teng Wang, Yuan Liu, Songwei Zhang, Guangyu Zuo, Liwei Kou and Yinke Dou
J. Mar. Sci. Eng. 2025, 13(9), 1701; https://doi.org/10.3390/jmse13091701 - 3 Sep 2025
Viewed by 1032
Abstract
Polar environmental research requires advanced detection methods to understand rapid changes in these regions. Unmanned aerial vehicles (UAVs) bridge the gap between satellite remote sensing and in situ ice-based buoy measurements, offering improved spatiotemporal resolution and operational efficiency. However, their widespread use in polar regions remains limited due to insufficient endurance capabilities. To address this problem, this paper presents a new monitoring system, the so-called UAV and Ice-based buoy cross-domain observation system (UBCOS). Particularly, the ice-based buoy integrates a Real-Time Kinematic (RTK) base station, a contact-based charging system, and an Iridium communication system, providing UAVs with centimeter-level positioning correction, low-temperature charging support, and remote data transmission capabilities. UAVs equipped with pod-mounted cameras capture imagery of sea ice surface characteristics within a 4 km radius of the buoy. Field tests conducted in the Arctic in 2024 demonstrate that the system achieved expected performance in both monitoring task execution and data collection, validating its practicality and reliability for polar sea ice monitoring. Full article

21 pages, 1143 KB  
Review
A Review of Robotic Applications in the Management of Structural Health Monitoring in the Saudi Arabian Construction Sector
by Yazeed Hamdan Alazmi, Mohammad Al-Zu'bi, Mazen J. Al-Kheetan and Musab Rabi
Buildings 2025, 15(16), 2965; https://doi.org/10.3390/buildings15162965 - 21 Aug 2025
Cited by 2 | Viewed by 2181
Abstract
The integration of robotics into Structural Health Monitoring (SHM) is rapidly reshaping how infrastructure is assessed and maintained. This review critically examines the current landscape of robotic technologies applied in SHM, with a specific focus on their implementation within the Saudi Arabian construction sector. It explores recent advancements in robotic platforms, such as unmanned aerial vehicles (UAVs), wall-climbing robots, and AI-driven inspection systems, and assesses their roles in damage detection, vibration monitoring, and real-time diagnostics. In addition to outlining technological capabilities, this paper identifies major adoption challenges related to system readiness, regulatory gaps, workforce limitations, and environmental constraints. Drawing on comparative experiences in the healthcare, energy, and legal domains, this review extracts cross-sectoral insights that offer practical guidance for accelerating robotic integration in SHM. This paper concludes by outlining research gaps and actionable recommendations to support scholars, policymakers, and industry professionals in advancing robotics-based monitoring in complex infrastructure environments. Full article
(This article belongs to the Section Construction Management, and Computers & Digitization)

36 pages, 8958 KB  
Article
Dynamic Resource Target Assignment Problem for Laser Systems’ Defense Against Malicious UAV Swarms Based on MADDPG-IA
by Wei Liu, Lin Zhang, Wenfeng Wang, Haobai Fang, Jingyi Zhang and Bo Zhang
Aerospace 2025, 12(8), 729; https://doi.org/10.3390/aerospace12080729 - 17 Aug 2025
Cited by 1 | Viewed by 1433
Abstract
The widespread adoption of Unmanned Aerial Vehicles (UAVs) in civilian domains, such as airport security and critical infrastructure protection, has introduced significant safety risks that necessitate effective countermeasures. High-Energy Laser Systems (HELSs) offer a promising defensive solution; however, when confronting large-scale malicious UAV swarms, the Dynamic Resource Target Assignment (DRTA) problem becomes critical. To address the challenges of complex combinatorial optimization problems, a method combining precise physical models with multi-agent reinforcement learning (MARL) is proposed. Firstly, an environment-dependent HELS damage model was developed. This model integrates atmospheric transmission effects and thermal effects to precisely quantify the required irradiation time to achieve the desired damage effect on a target. This forms the foundation of the HELS–UAV–DRTA model, which employs a two-stage dynamic assignment structure designed to maximize the target priority and defense benefit. An innovative MADDPG-IA (I: intrinsic reward, and A: attention mechanism) algorithm is proposed to meet the MARL challenges in the HELS–UAV–DRTA problem: an attention mechanism compresses variable-length target states into fixed-size encodings, while a Random Network Distillation (RND)-based intrinsic reward module delivers dense rewards that alleviate the extreme reward sparsity. Large-scale scenario simulations (100 independent runs per scenario) involving 50 UAVs and 5 HELS across diverse environments demonstrate the method’s superiority, achieving mean damage rates of 99.65% ± 0.32% vs. 72.64% ± 3.21% (rural), 79.37% ± 2.15% vs. 51.29% ± 4.87% (desert), and 91.25% ± 1.78% vs. 67.38% ± 3.95% (coastal). The method autonomously evolved effective strategies such as delaying decision-making to await the optimal timing and cross-region coordination. The ablation and comparison experiments further confirm MADDPG-IA’s superior convergence, stability, and exploration capabilities. This work bridges the gap between complex mathematical and physical mechanisms and real-time collaborative decision optimization. It provides an innovative theoretical and methodological basis for public-security applications. Full article
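The abstract above mentions an attention mechanism that compresses variable-length target states into fixed-size encodings. A minimal sketch of that general idea is softmax attention pooling, shown below in plain Python. It is not the MADDPG-IA architecture: the `query` vector would normally be learned, the feature layout is hypothetical, and the function name is illustrative.

```python
import math


def attention_pool(states, query):
    """Pool a variable number of feature vectors into one fixed-size vector.

    states: non-empty list of feature vectors, each the same length as query.
    query:  vector used to score each state (learned in practice; fixed here).
    Returns (pooled_vector, attention_weights).
    """
    dim = len(query)
    scale = math.sqrt(dim)

    # Scaled dot-product score for every target state.
    scores = [sum(s_i * q_i for s_i, q_i in zip(state, query)) / scale
              for state in states]

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of states: output length is dim, independent of len(states).
    pooled = [sum(w * state[d] for w, state in zip(weights, states))
              for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    query = [1.0, 0.0, -1.0]  # hypothetical scoring vector
    few = [[0.2, 0.1, 0.0], [0.9, 0.4, 0.3]]
    many = few + [[0.5, 0.5, 0.5], [0.1, 0.8, 0.2], [0.7, 0.0, 0.6]]
    print(len(attention_pool(few, query)[0]), len(attention_pool(many, query)[0]))
```

Because the output length depends only on the feature dimension, the same downstream policy network can consume the encoding whether 2 or 50 targets are currently visible.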

20 pages, 10557 KB  
Article
HAUV-USV Collaborative Operation System for Hydrological Monitoring
by Qiusheng Wang, Shuibo Hu, Zhou Yang and Guofeng Wu
J. Mar. Sci. Eng. 2025, 13(8), 1540; https://doi.org/10.3390/jmse13081540 - 11 Aug 2025
Cited by 1 | Viewed by 1272
Abstract
Research in marine hydrographic environmental monitoring continues to deepen, necessitating a hardware platform capable of traversing air–water interfaces to collect vertical gradient parameters across oceanographic profiles. This paper proposes a deeply integrated heterogeneous monitoring platform for marine hydrological vertical profiling, addressing the functional limitations of conventional unmanned surface vehicles (USVs) and unmanned aerial vehicles (UAVs) in subsurface monitoring. By co-designing a hybrid aerial underwater vehicle (HAUV) with cross-domain capabilities and a USV, the system leverages USVs for long-endurance surface operations and HAUVs for high-speed vertical column monitoring. Key innovations include (1) a distributed collaborative architecture enabling “Air–Sea–Air” cyclic operations; (2) dynamic modeling of HAUV-USV interactions incorporating aerodynamic and hydrodynamic coupling; (3) an MPC-based collaborative tracking algorithm for real-time USV pursuit under marine disturbances; and (4) a vision-guided synchronous landing strategy achieving decimeter-level docking accuracy under adverse conditions. Simulation experiments validate the system’s efficacy in trajectory tracking and precision landing. This work bridges the critical gap in marine vertical profile monitoring while demonstrating robust cross-domain coordination. Full article
(This article belongs to the Section Ocean Engineering)

30 pages, 2470 KB  
Review
Open-Vocabulary Object Detection in UAV Imagery: A Review and Future Perspectives
by Yang Zhou, Junjie Li, Congyang Ou, Dawei Yan, Haokui Zhang and Xizhe Xue
Drones 2025, 9(8), 557; https://doi.org/10.3390/drones9080557 - 8 Aug 2025
Cited by 2 | Viewed by 6011
Abstract
Due to its extensive applications, aerial image object detection has long been a hot topic in computer vision. In recent years, advancements in unmanned aerial vehicle (UAV) technology have further propelled this field to new heights, giving rise to a broader range of application requirements. However, traditional UAV aerial object detection methods primarily focus on detecting predefined categories, which significantly limits their applicability. The advent of cross-modal text–image alignment (e.g., CLIP) has overcome this limitation, enabling open-vocabulary object detection (OVOD), which can identify previously unseen objects through natural language descriptions. This breakthrough significantly enhances the intelligence and autonomy of UAVs in aerial scene understanding. This paper presents a comprehensive survey of OVOD in the context of UAV aerial scenes. We begin by aligning the core principles of OVOD with the unique characteristics of UAV vision, setting the stage for a specialized discussion. Building on this foundation, we construct a systematic taxonomy that categorizes existing OVOD methods for aerial imagery and provides a comprehensive overview of the relevant datasets. This structured review enables us to critically dissect the key challenges and open problems at the intersection of these fields. Finally, based on this analysis, we outline promising future research directions and application prospects. This survey aims to provide a clear road map and a valuable reference for both newcomers and seasoned researchers, fostering innovation in this rapidly evolving domain. We keep track of related works in a public GitHub repository. Full article

21 pages, 12122 KB  
Article
RA3T: An Innovative Region-Aligned 3D Transformer for Self-Supervised Sim-to-Real Adaptation in Low-Altitude UAV Vision
by Xingrao Ma, Jie Xie, Di Shao, Aiting Yao and Chengzu Dong
Electronics 2025, 14(14), 2797; https://doi.org/10.3390/electronics14142797 - 11 Jul 2025
Viewed by 916
Abstract
Low-altitude unmanned aerial vehicle (UAV) vision is critically hindered by the Sim-to-Real Gap, where models trained exclusively on simulation data degrade under real-world variations in lighting, texture, and weather. To address this problem, we propose RA3T (Region-Aligned 3D Transformer), a novel self-supervised framework that enables robust Sim-to-Real adaptation. Specifically, we first develop a dual-branch strategy for self-supervised feature learning, integrating Masked Autoencoders and contrastive learning. This approach extracts domain-invariant representations from unlabeled simulated imagery to enhance robustness against occlusion while reducing annotation dependency. Leveraging these learned features, we then introduce a 3D Transformer fusion module that unifies multi-view RGB and LiDAR point clouds through cross-modal attention. By explicitly modeling spatial layouts and height differentials, this component significantly improves recognition of small and occluded targets in complex low-altitude environments. To address persistent fine-grained domain shifts, we finally design region-level adversarial calibration that deploys local discriminators on partitioned feature maps. This mechanism directly aligns texture, shadow, and illumination discrepancies which challenge conventional global alignment methods. Extensive experiments on UAV benchmarks VisDrone and DOTA demonstrate the effectiveness of RA3T. The framework achieves +5.1% mAP on VisDrone and +7.4% mAP on DOTA over the 2D adversarial baseline, particularly on small objects and sparse occlusions, while maintaining real-time performance of 17 FPS at 1024 × 1024 resolution on an RTX 4080 GPU. Visual analysis confirms that the synergistic integration of 3D geometric encoding and local adversarial alignment effectively mitigates domain gaps caused by uneven illumination and perspective variations, establishing an efficient pathway for simulation-to-reality UAV perception. Full article
(This article belongs to the Special Issue Innovative Technologies and Services for Unmanned Aerial Vehicles)

19 pages, 3044 KB  
Review
Deep Learning-Based Sound Source Localization: A Review
by Kunbo Xu, Zekai Zong, Dongjun Liu, Ran Wang and Liang Yu
Appl. Sci. 2025, 15(13), 7419; https://doi.org/10.3390/app15137419 - 2 Jul 2025
Cited by 1 | Viewed by 4116
Abstract
As a fundamental technology in environmental perception, sound source localization (SSL) plays a critical role in public safety, marine exploration, and smart home systems. However, traditional methods such as beamforming and time-delay estimation rely on manually designed physical models and idealized assumptions, which struggle to meet practical demands in dynamic and complex scenarios. Recent advancements in deep learning have revolutionized SSL by leveraging its end-to-end feature adaptability, cross-scenario generalization capabilities, and data-driven modeling, significantly enhancing localization robustness and accuracy in challenging environments. This review systematically examines the progress of deep learning-based SSL across three critical domains: marine environments, indoor reverberant spaces, and unmanned aerial vehicle (UAV) monitoring. In marine scenarios, complex-valued convolutional networks combined with adversarial transfer learning mitigate environmental mismatch and multipath interference through phase information fusion and domain adaptation strategies. For indoor high-reverberation conditions, attention mechanisms and multimodal fusion architectures achieve precise localization under low signal-to-noise ratios by adaptively weighting critical acoustic features. In UAV surveillance, lightweight models integrated with spatiotemporal Transformers address dynamic modeling of non-stationary noise spectra and edge computing efficiency constraints. Despite these advancements, current approaches face three core challenges: the insufficient integration of physical principles, prohibitive data annotation costs, and the trade-off between real-time performance and accuracy. Future research should prioritize physics-informed modeling to embed acoustic propagation mechanisms, unsupervised domain adaptation to reduce reliance on labeled data, and sensor-algorithm co-design to optimize hardware-software synergy. These directions aim to propel SSL toward intelligent systems characterized by high precision, strong robustness, and low power consumption. This work provides both theoretical foundations and technical references for algorithm selection and practical implementation in complex real-world scenarios. Full article

22 pages, 21858 KB  
Article
High-Order Temporal Context-Aware Aerial Tracking with Heterogeneous Visual Experts
by Shichao Zhou, Xiangpan Fan, Zhuowei Wang, Wenzheng Wang and Yunpu Zhang
Remote Sens. 2025, 17(13), 2237; https://doi.org/10.3390/rs17132237 - 29 Jun 2025
Cited by 1 | Viewed by 1066
Abstract
Visual tracking from the unmanned aerial vehicle (UAV) perspective has been at the core of many low-altitude remote sensing applications. Most aerial trackers follow “tracking-by-detection” paradigms or their temporal-context-embedded variants, in which only the visual appearance cue is used for representation learning and for estimating the spatial likelihood of the target. However, the variation of the target appearance among consecutive frames is inherently unpredictable, which degrades the robustness of the temporal context-aware representation. To address this concern, we advocate adding a visual motion cue, which exhibits predictable temporal continuity, to obtain a complete temporal context-aware representation, and we introduce a dual-stream tracker involving explicit heterogeneous visual tracking experts. Our technical contributions are threefold: (1) a high-order temporal context-aware representation integrates motion and appearance cues over a temporal context queue, (2) bidirectional cross-domain refinement enhances feature representation through cross-attention-based mutual guidance, and (3) consistent decision-making allows for anti-drifting localization via dynamic gating and failure-aware recovery. Extensive experiments on four UAV benchmarks (UAV123, UAV123@10fps, UAV20L, and DTB70) illustrate that our method outperforms existing aerial trackers in terms of success rate and precision, particularly in occlusion and fast motion scenarios. Such superior tracking stability highlights its potential for real-world UAV applications.