Search Results (369)

Search Parameters:
Keywords = vision-based navigation system

18 pages, 1283 KB  
Article
Design and Performance Validation of 4D Radar ICP-Integrated Navigation with Stochastic Cloning Augmentation
by Hyeongseob Shin, Dongha Kwon and Sangkyung Sung
Sensors 2026, 26(5), 1660; https://doi.org/10.3390/s26051660 - 5 Mar 2026
Abstract
Automotive radar has emerged as a pivotal technology for navigation in GNSS-denied environments, offering superior robustness to adverse weather and fluctuating lighting conditions compared to vision or LiDAR-based sensors. Despite these advantages, the inherent sparsity and noise of radar measurements often lead to degraded estimation accuracy and system reliability. To address these challenges, various radar-based localization frameworks have been explored, ranging from optimization-based and Extended Kalman Filter (EKF) approaches fused with Inertial Measurement Units (IMUs) to point cloud registration techniques like Iterative Closest Point (ICP). While filter-based methods are favored in multi-sensor fusion for their proven stability, ICP is widely utilized for high-precision pose estimation in point-cloud-centric systems. In this study, we propose a novel Radar-Inertial Odometry (RIO) framework that synergistically integrates ICP-based relative pose estimation with model-based sensor fusion. The proposed methodology leverages relative transformations derived from ICP alongside ego-velocity estimations obtained from radar Doppler measurements. To effectively incorporate relative ICP constraints, a stochastic cloning technique is implemented to augment previous states and their associated covariances, ensuring that the uncertainty of historical poses is explicitly accounted for. The performance of the proposed method is validated using public open-source datasets, demonstrating higher localization accuracy and more consistent performance than existing algorithms.
(This article belongs to the Section Navigation and Positioning)
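
The stochastic cloning step described above can be illustrated with a short sketch: when a relative (ICP) constraint ties the current pose to a past pose, the past pose block is duplicated into the state vector so that its covariance and cross-correlations are carried explicitly. A minimal numpy sketch, assuming a hypothetical state layout rather than the authors' implementation:

```python
# Minimal sketch of stochastic cloning (hypothetical state layout, not the
# authors' code): augment the state and covariance with a copy of the pose
# block so relative-pose updates can reference it with correct uncertainty.
import numpy as np

def clone_state(x, P, pose_idx):
    """Augment state x and covariance P with a copy of the pose sub-state."""
    n = x.size
    J = np.vstack([np.eye(n), np.eye(n)[pose_idx]])  # selects the cloned block
    x_aug = J @ x                                    # [x; x_pose]
    P_aug = J @ P @ J.T                              # carries cross-covariances
    return x_aug, P_aug

# Example: 6-D state (position + velocity), clone the 3-D position block.
x = np.zeros(6)
P = 0.1 * np.eye(6)
x_aug, P_aug = clone_state(x, P, pose_idx=[0, 1, 2])
print(x_aug.shape, P_aug.shape)  # (9,) (9, 9)
```
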
34 pages, 2334 KB  
Review
Survey on Reconnaissance Autonomous Robotic Systems for Disaster Management
by Sahaj Sinha, Sinjae Lee and Saurabh Singh
Sensors 2026, 26(5), 1659; https://doi.org/10.3390/s26051659 - 5 Mar 2026
Abstract
Systems that operate in dangerous environments are becoming essential in emergencies. This survey reviews the latest ground reconnaissance robots using computer vision (CV), machine learning (ML), MCU-based control, LoRa communication, DC motors, and dual-power systems, analyzing their hardware, algorithms, and performance in the field and the lab. Focusing exclusively on Unmanned Ground Vehicles (UGVs) for disaster reconnaissance, it examines recent advances in hardware, software, and autonomy, highlights improvements in navigation, sensor fusion, and situational awareness, and identifies remaining challenges such as energy limitations, robustness in harsh conditions, and the lack of standardized benchmarks. The analysis synthesizes findings from over 190 recent studies (2020–2025) in ground-based disaster robotics, providing a comprehensive overview of current capabilities and research gaps, and pairs each open issue with candidate remedies for future disaster-response systems.
(This article belongs to the Special Issue Advanced Sensors and AI Integration for Human–Robot Teaming)
27 pages, 12041 KB  
Article
FPGA-Based CNN Acceleration on Zynq-7020 for Embedded Ship Recognition in Unmanned Surface Vehicles
by Abdelilah Haijoub, Aissam Bekkari, Anas Hatim, Mounir Arioua, Mohamed Nabil Srifi and Antonio Guerrero-Gonzalez
Sensors 2026, 26(5), 1626; https://doi.org/10.3390/s26051626 - 5 Mar 2026
Abstract
Unmanned surface vehicles (USVs) increasingly rely on vision-based perception for safe navigation and maritime surveillance, while onboard computing is constrained by strict size, weight, and power (SWaP) budgets. Although deep convolutional neural networks (CNNs) offer strong recognition performance, their computational and memory requirements pose significant challenges for deployment on low-cost embedded platforms. This paper presents a hardware–software co-design architecture and deployment study for CNN acceleration on a heterogeneous ARM–FPGA system, targeting energy-efficient near-sensor processing for embedded maritime applications. The proposed approach exploits a fully streaming hardware architecture in the FPGA fabric, based on line-buffered convolutions and AXI-Stream dataflow, while the ARM processing system is responsible for lightweight configuration, scheduling, and data movement. The architecture was evaluated using representative CNN models trained on a maritime ship dataset. Our experimental results on a Zynq-7020 system-on-chip demonstrate that the proposed co-design strategy achieves a balanced trade-off between throughput, resource utilisation, and power consumption under tight embedded constraints, highlighting its suitability as a practical building block for onboard perception in USVs.
(This article belongs to the Section Vehicular Sensing)
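
The line-buffered convolution the abstract refers to can be modeled in software: two row buffers hold the previous image rows so that a full 3x3 window is available for every incoming pixel of a raster-scan stream. The following pure-Python emulation is illustrative only; the actual design is RTL on the FPGA fabric:

```python
# Software model of a line-buffered 3x3 streaming convolution (illustrative
# emulation, not the authors' RTL). Pixels arrive one per cycle; two line
# buffers supply the two rows above, so one output is produced per cycle
# once the window is valid.
from collections import deque

def stream_conv3x3(pixels, width, kernel):
    line1, line2 = deque(maxlen=width), deque(maxlen=width)
    window = [[0] * 3 for _ in range(3)]
    out = []
    for i, px in enumerate(pixels):
        row, col = divmod(i, width)
        top = line2[0] if len(line2) == width else 0   # pixel at (row-2, col)
        mid = line1[0] if len(line1) == width else 0   # pixel at (row-1, col)
        for r in range(3):                             # shift window left
            window[r][0], window[r][1] = window[r][1], window[r][2]
        window[0][2], window[1][2], window[2][2] = top, mid, px
        line2.append(mid)
        line1.append(px)
        if row >= 2 and col >= 2:                      # valid window: emit
            out.append(sum(window[r][c] * kernel[r][c]
                           for r in range(3) for c in range(3)))
    return out

img = list(range(16))                 # 4x4 image as a raster-scan stream
k = [[0, 0, 0], [0, 1, 0], [0, 0, 0]] # identity kernel for checking
print(stream_conv3x3(img, 4, k))      # inner 2x2 region: [5, 6, 9, 10]
```
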

17 pages, 761 KB  
Article
Obstacle Avoidance in Mobile Robotics: A CNN-Based Approach Using CMYD Fusion of RGB and Depth Images
by Chaymae El Mechal, Mostefa Mesbah and Najiba El Amrani El Idrissi
Digital 2026, 6(1), 20; https://doi.org/10.3390/digital6010020 - 2 Mar 2026
Abstract
Over the last few years, deep neural networks have achieved outstanding results in computer vision, and have been widely integrated into mobile robot obstacle avoidance systems, where perception-driven classification supports navigation decisions. Most existing approaches rely on either color images (RGB) or depth images (D) as the primary source of information, which limits their ability to jointly exploit appearance and geometric cues. This paper proposes a deep learning-based classification approach that simultaneously exploits RGB and depth information for mobile robot obstacle avoidance. The method adopts an early-stage fusion strategy in which RGB images are first converted into the CMYK color space, after which the K (black) channel is replaced by a normalized depth map to form a four-channel CMYD representation. This representation preserves chromatic information while embedding geometric structure in an intensity-consistent channel and is used as input to a convolutional neural network (CNN). The proposed method is evaluated using locally acquired data under different training options and hyperparameter settings. Experimental results show that, when using the baseline CNN architecture, the proposed fusion strategy achieves an overall classification accuracy of 93.3%, outperforming depth-only inputs (86.5%) and RGB-only images (92.9%). When the refined CNN architecture is employed, classification accuracy is further improved across all tested input representations, reaching approximately 93.9% for RGB images, 91.0% for depth-only inputs, 94.6% for the CMYK color space, and 96.2% for the proposed CMYD fusion. These results demonstrate that combining appearance and depth information through CMYD fusion is beneficial regardless of the network variant, while the refined CNN architecture further enhances the effectiveness of the fused representation for robust obstacle avoidance.
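
The CMYD early-fusion step is concrete enough to sketch: convert RGB to CMYK, drop the K channel, and substitute a normalized depth map. A minimal numpy version, with channel ordering and normalization details assumed rather than taken from the paper:

```python
# Sketch of CMYD early fusion: RGB -> CMYK, then the K channel is replaced
# by a normalized depth map (illustrative assumptions, not the paper's code).
import numpy as np

def rgb_depth_to_cmyd(rgb, depth):
    """rgb: HxWx3 uint8, depth: HxW float. Returns HxWx4 float32 in [0, 1]."""
    rgb = rgb.astype(np.float32) / 255.0
    k = 1.0 - rgb.max(axis=2)                    # K channel of CMYK
    denom = np.clip(1.0 - k, 1e-6, None)
    c = (1.0 - rgb[..., 0] - k) / denom          # standard RGB -> CMY(K)
    m = (1.0 - rgb[..., 1] - k) / denom
    y = (1.0 - rgb[..., 2] - k) / denom
    d = (depth - depth.min()) / max(depth.max() - depth.min(), 1e-6)
    return np.stack([c, m, y, d], axis=2).astype(np.float32)

cmyd = rgb_depth_to_cmyd(
    np.random.randint(0, 256, (120, 160, 3), dtype=np.uint8),
    np.random.rand(120, 160))
print(cmyd.shape)  # (120, 160, 4) -> four-channel CNN input
```
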

21 pages, 6235 KB  
Article
Vision-Based Smart Wearable Assistive Navigation System Using Deep Learning for Visually Impaired People
by Syed Salman Shah, Abid Imran, Saad-Ur-Rehman, Arsalan Arif, Khurram Khan, Muhammad Arsalan, Sajjad Manzoor and Ghulam Jawad Sirewal
Automation 2026, 7(2), 41; https://doi.org/10.3390/automation7020041 - 1 Mar 2026
Abstract
People affected by vision impairment experience significant challenges in mobility and daily life activities. In this paper, a smart assistive navigation system is proposed to address mobility challenges and to enhance the independence of visually impaired individuals. Three modules are integrated into the proposed system. The vision module detects obstacles and interactive objects such as doors, chairs, people, fire extinguishers, etc. The depth camera-based distance module provides the distance of detected objects and obstacles. The voice module provides auditory feedback to visually impaired individuals about the detected objects and obstacles that fall under the pre-defined threshold distance. Finally, the proposed system is optimized in terms of performance and user experience. Jetson Nano is used to reduce the cost of the overall system; however, it has compatibility issues with many of the latest object detection models. The YOLOv5n model is used considering compatibility for object detection, but it has low Mean Average Precision (mAP) and frame rate. To improve the performance of the vision module, various hyperparameters of YOLOv5n are fine-tuned along with transfer learning to enhance the mAP@50 from the original 0.457 to 0.845 and mAP@50-95 from 0.28 to 0.593. TensorRT optimization is employed to increase the frame rate to deploy the model in a real scenario. The real-time experimentation shows that the proposed system successfully alerts users to key objects, hazards, and obstacles, enabling independent and confident navigation.
(This article belongs to the Section Intelligent Control and Machine Learning)
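
The distance-gated alert logic the abstract describes reduces to filtering detections by a depth threshold and phrasing an announcement. A hedged sketch in which the detection format, helper names, and the 2 m threshold are illustrative assumptions:

```python
# Sketch of distance-gated voice alerts: announce only detections whose
# depth falls under the threshold (formats and names are assumptions).
def alerts(detections, depth_lookup, threshold_m=2.0):
    """detections: [(label, (x, y))]; depth_lookup(x, y) -> meters."""
    msgs = []
    for label, (x, y) in detections:
        d = depth_lookup(x, y)
        if d <= threshold_m:
            msgs.append(f"{label} ahead at {d:.1f} meters")
    return msgs

# Example with a stubbed depth map:
print(alerts([("door", (40, 60)), ("chair", (90, 30))],
             lambda x, y: 1.4 if x < 50 else 3.2))
# ['door ahead at 1.4 meters']
```
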

25 pages, 7095 KB  
Article
AGCNeRF: Air–Ground Collaborative Visual Mapping and Navigation via Landmark-Enhanced Neural Radiance Fields
by Chenxi Lu, Meng Yu, Yin Wang and Hua Li
Drones 2026, 10(3), 171; https://doi.org/10.3390/drones10030171 - 28 Feb 2026
Abstract
Unmanned vehicles are becoming increasingly essential in executing high-risk missions in unknown environments such as search and rescue. As the complexity of operational environments escalates, carrying out unmanned tasks becomes cumbersome or even infeasible for a single vehicle, hampered by limited perception and operational constraints. Aiming at enhancing the flexibility of unmanned operations under complicated scenarios, this study introduces AGC-NeRF, an innovative air–ground collaborative exploration framework that harnesses the functional complementarity of UAVs and UGVs—enabling a UGV to navigate through a complex scenario with the assistance of a UAV via referencing a neural radiance map. First, a UAV is employed to collect aerial images for reconstructing the environment to be explored by a UGV, leveraging its aerial perspective to achieve wide-area coverage and global environmental perception that is unattainable for a single UGV. Concurrently, an innovative image saliency evaluation approach is introduced to meticulously select landmarks that are contributive to the UGV’s navigation system, yielding a pre-trained NeRF model of the operation scene. Then, a landmark-aware 6-DOF ego-motion estimator and collision-free trajectory optimizer are designed for the UGV based on the NeRF map. Finally, an online replanning architecture is established which relies on a ground station for NeRF training and state optimization by synergizing the trajectory planner and the state estimator, which forms a dual-agent vision-only navigation pipeline. Simulations and experiments validate that AGC-NeRF enables reliable UGV trajectory planning and state estimation in unknown environments, demonstrating superior efficacy and robustness of the air–ground collaborative paradigm.
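
As a stand-in for the paper's unspecified saliency measure, landmark selection of this kind can be sketched by ranking candidate image crops with a texture proxy such as Laplacian variance and keeping the top-k; the scoring function below is an assumption, not the authors' method:

```python
# Illustrative landmark selection: rank candidate crops by Laplacian
# variance (a common sharpness/texture proxy) and keep the top-k.
# The actual saliency measure in the paper is not reproduced here.
import numpy as np
from scipy.ndimage import laplace

def select_landmarks(crops, k=3):
    scores = [float(np.var(laplace(c.astype(np.float32)))) for c in crops]
    ranked = sorted(range(len(crops)), key=lambda i: -scores[i])
    return ranked[:k]

crops = [np.random.rand(32, 32) for _ in range(10)]
print(select_landmarks(crops))  # indices of the 3 most textured candidates
```
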

15 pages, 960 KB  
Article
ArmTenna: Two-Armed RFID Explorer for Dynamic Warehouse Management
by Abdussalam A. Alajami and Rafael Pous
Sensors 2026, 26(5), 1513; https://doi.org/10.3390/s26051513 - 27 Feb 2026
Abstract
Efficient RFID spatial exploration in dynamic warehouse environments is challenging due to occlusions, sensing geometry constraints, and the weak coupling between information acquisition and navigation decisions. Many existing inventory robots treat RFID sensing as a passive data source during exploration, without explicitly optimizing sensing pose or prioritizing inventory-driven frontiers, which can result in incomplete coverage and redundant traversal. This paper presents ArmTenna, an articulated mobile robotic platform that formulates RFID inventory exploration as an active perception problem. The system integrates dual 4-DOF robotic arms carrying directional UHF RFID antennas and a 2-DOF neck-mounted RGB-D camera, enabling adaptive interrogation of candidate regions. We propose a multi-modal frontier exploration framework that combines newly detected EPC tags, average RSSI values, and vision-based product detections into a composite utility function for goal selection. By embedding articulated antenna control directly into the frontier evaluation loop, the robot tightly couples sensing geometry with exploration decisions. Experimental validation with 150 tagged items across three separated warehouse zones shows that ArmTenna achieves up to 97% map coverage, compared to 72% for a baseline platform, while reducing missed-tag regions. These results demonstrate that integrating active sensing pose control with multi-modal frontier evaluation provides an effective and scalable solution for RFID-driven warehouse inventory automation.
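
The composite utility for frontier selection can be sketched directly from the abstract's ingredients (new EPC tags, mean RSSI, vision-based detections); the travel-cost penalty, weights, and normalizations below are illustrative assumptions rather than the paper's formulation:

```python
# Sketch of a composite frontier utility combining RFID and vision cues.
# Weights, RSSI normalization, and the travel-cost term are assumptions.
def frontier_utility(new_epc_tags, mean_rssi_dbm, product_detections,
                     travel_cost_m, w=(1.0, 0.5, 0.8, 0.2)):
    rssi_norm = (mean_rssi_dbm + 90.0) / 60.0  # map roughly [-90,-30] dBm -> [0,1]
    return (w[0] * new_epc_tags + w[1] * rssi_norm
            + w[2] * product_detections - w[3] * travel_cost_m)

# Example: frontier -> (new tags, mean RSSI, detections, distance).
frontiers = {"A": (12, -55, 3, 4.0), "B": (2, -48, 1, 1.5)}
best = max(frontiers, key=lambda f: frontier_utility(*frontiers[f]))
print(best)  # 'A' under these weights: many new tags dominate
```
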

25 pages, 13812 KB  
Article
Robust and Cost-Effective Vision-Based Indoor UAV Localization with RWA-YOLO
by Feifei Wang, Kun Sun and Yuanqing Wang
Sensors 2026, 26(5), 1469; https://doi.org/10.3390/s26051469 - 26 Feb 2026
Abstract
Accurate indoor localization for unmanned aerial vehicles (UAVs) remains challenging in GPS-denied environments, especially for small-object detection and under low-light conditions. We propose Robust Wavelet-Aware YOLO (RWA-YOLO), a vision-based detection framework that integrates a wavelet-aware attention fusion module with a dual multi-path aggregation mechanism to enhance small-object detection and multi-scale feature representation. UAV-mounted LEDs are utilized to ensure robust visual perception in low-light indoor scenarios. The UAV’s three-dimensional position is estimated through multi-view geometric triangulation without relying on external beacons or artificial markers. Beyond static localization, the system is validated under dynamic flight conditions, demonstrating smooth and temporally coherent trajectory reconstruction suitable for real-time control loops (update rate: 25 FPS). Extensive experiments in real indoor environments achieve centimeter-level localization accuracy (root mean square error: 9.9 mm, 95th percentile error: 13.5 mm), outperforming state-of-the-art vision-based methods and achieving accuracy comparable to or better than representative hybrid ultra-wideband–vision systems reported in the literature. These results confirm the effectiveness, robustness, and real-time capability of RWA-YOLO for indoor UAV navigation in constrained environments.
(This article belongs to the Section Navigation and Positioning)
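
The multi-view geometric triangulation underlying the position estimate can be illustrated with the standard linear (DLT) construction: each calibrated view contributes two homogeneous equations, and the 3-D point is the null vector of the stacked system. A minimal sketch with made-up camera matrices, not the paper's calibration:

```python
# Linear (DLT) triangulation of one 3-D point from pixel observations in
# two or more calibrated views (illustrative camera matrices).
import numpy as np

def triangulate(proj_mats, pixels):
    """proj_mats: list of 3x4 P; pixels: list of (u, v). Returns 3-vector."""
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        rows.append(u * P[2] - P[0])   # two homogeneous equations per view
        rows.append(v * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.asarray(rows))
    X = vt[-1]                          # null vector of the stacked system
    return X[:3] / X[3]

# Two axis-aligned cameras with a 1 m baseline observing the point (0, 0, 5):
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])
print(triangulate([P1, P2], [(0.0, 0.0), (-0.2, 0.0)]))  # ~[0, 0, 5]
```
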

9 pages, 3625 KB  
Proceeding Paper
A Framework for Integrity Monitoring for Positioning Through Graph-Based SLAM Optimization
by Sam Bekkers and Heiko Engwerda
Eng. Proc. 2026, 126(1), 25; https://doi.org/10.3390/engproc2026126025 - 25 Feb 2026
Abstract
As satellite navigation systems show vulnerabilities in specific circumstances such as urban canyons or jamming and spoofing situations, additional sensors such as cameras may be incorporated on the platform. Despite advancements in the robotics and computer vision community, which have led to increasingly accurate Simultaneous Localization and Mapping (SLAM) positioning solutions, visual navigation has its own vulnerabilities. It therefore remains of critical importance for many applications to study the integrity of fused navigation algorithms and their components, a topic that has received far less attention for SLAM than for satellite navigation. In this paper, a framework for integrity monitoring (IM) of a visual SLAM algorithm is proposed. A sensor-level IM scheme analyses feature reprojection errors. It is demonstrated that, in dynamic environments, multiple hypotheses can be generated from different subsets of extracted features. Additionally, the factor graph-based framework employs a fusion-level IM scheme which deals with these multiple hypotheses and selects the most probable one by calculating the sum of weighted measurement residuals. These concepts are applied to scenarios from real and simulated experiments in order to demonstrate applicability.
(This article belongs to the Proceedings of European Navigation Conference 2025)
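
The fusion-level selection rule, choosing among feature-subset hypotheses by the sum of weighted measurement residuals, can be sketched as follows; residual and weight shapes are assumptions for illustration:

```python
# Sketch of fusion-level hypothesis selection: pick the feature-subset
# hypothesis minimizing the weighted sum of squared residuals.
import numpy as np

def select_hypothesis(hypotheses):
    """hypotheses: list of (residuals r, weights w), both 1-D arrays."""
    costs = [float(np.sum(w * r ** 2)) for r, w in hypotheses]
    return int(np.argmin(costs)), costs

# Two hypotheses from different feature subsets (e.g., static vs. mixed):
h_static = (np.array([0.8, 1.1, 0.6]), np.ones(3))
h_mixed = (np.array([0.7, 5.2, 0.5]), np.ones(3))
best, costs = select_hypothesis([h_static, h_mixed])
print(best, costs)  # 0 -- the subset without the moving-object outlier
```
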

26 pages, 11920 KB  
Article
Autonomous Control of Satellite Swarms Using Minimal Vision-Based Behavioral Control
by Marco Sabatini
Aerospace 2026, 13(3), 207; https://doi.org/10.3390/aerospace13030207 - 24 Feb 2026
Abstract
In recent years, the trend toward spacecraft miniaturization has led to the widespread adoption of micro- and nanosatellites, driven by their reduced development costs and simplified launch logistics. Operating these platforms in coordinated fleets, or swarms, represents a promising approach to overcoming the inherent limitations of individual spacecraft by distributing sensing and processing capabilities across multiple units. For systems of this scale, decentralized guidance and control architectures based on so-called behavioral strategies offer an attractive solution. These approaches are inspired by biological swarms, which exhibit remarkable robustness and adaptability through simple local interactions, minimal information exchange, and the absence of centralized supervision, but their application to space scenarios is limited, if not negligible. This work investigates the feasibility of autonomous swarm maintenance subject to orbital forces, under the stringent actuation, sensing, and computational constraints typical of nanosatellite platforms. Each spacecraft is assumed to carry a single monocular camera aligned with the along-track direction. The proposed behavioral control framework enables decentralized formation keeping without ground intervention or centralized coordination. Since control actions rely on the relative motion of neighboring satellites, a lightweight relative navigation capability is required. The results indicate that complex vision pipelines can be replaced by simple blob-based image processing, although a (rough) reconstruction of relative parameters remains essential to avoid unnecessary control effort arising from suboptimal guidance decisions.
(This article belongs to the Special Issue Progress in Satellite Formation Flying Technologies)
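
The simple blob-based image processing the abstract argues for amounts to thresholding, connected-component labeling, and centroid extraction. A minimal scipy sketch, with threshold and size constants assumed:

```python
# Illustrative blob-centroid extraction of neighbor satellites from a
# monocular frame (threshold and minimum size are assumptions).
import numpy as np
from scipy.ndimage import label, center_of_mass

def blob_centroids(frame, thresh=0.5, min_pixels=3):
    mask = frame > thresh
    labels, n = label(mask)                     # connected components
    sizes = np.bincount(labels.ravel())
    keep = [i for i in range(1, n + 1) if sizes[i] >= min_pixels]
    return center_of_mass(mask, labels, keep)   # (row, col) per neighbor

frame = np.zeros((64, 64))
frame[10:13, 20:23] = 1.0   # one bright neighbor
frame[40:44, 50:53] = 1.0   # another
print(blob_centroids(frame))  # ~[(11.0, 21.0), (41.5, 51.0)]
```
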

23 pages, 1013 KB  
Article
Occlusion-Robust Swarm Motion via Pheromone-Modulated Orientation Change
by Liwei Xuan, Mingyong Liu, Guoyuan He and Zhiqiang Yan
J. Mar. Sci. Eng. 2026, 14(4), 399; https://doi.org/10.3390/jmse14040399 - 22 Feb 2026
Abstract
Effective collective motion hinges on the seamless transfer of local information, yet vision-based mechanisms, while potent for generating rapid consensus, are inherently fragile. Visual links can be severed instantly by occlusions, leading to a phenomenon characterized as “sensory amnesia.” Seeking to fortify this vulnerability, Pheromone-Modulated Body Orientation Change (PM-BOC) is introduced as a dual-channel framework that fuses transient visual cues with a persistent environmental memory. Rather than treating these inputs in isolation, motion salience is quantified via BOC and mapped onto a decaying virtual pheromone field, dynamically modulating interaction weights by coupling instantaneous visual projections with local pheromone concentrations. This strategy effectively constructs a temporal buffer, bridging the informational voids left by blind spots. Validation, spanning from systematic physics simulations to high-fidelity simulations with a swarm of 50 UUVs, reveals that PM-BOC sustains superior cohesion in obstacle-laden environments where baseline visual models falter. Notably, this coupling suppresses high-frequency sensory noise while inducing resilient, scale-free velocity correlations that scale linearly with system size. By reconciling the trade-off between the immediacy of visual responsiveness and the robustness of environmental memory, this study offers a scalable paradigm for engineering resilient swarm systems capable of navigating the uncertainties of perception-limited environments.
(This article belongs to the Section Ocean Engineering)
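
The decaying pheromone field and its coupling to interaction weights can be sketched on a grid: deposits record motion salience, exponential decay provides the temporal buffer, and a visual neighbor's weight is boosted by the local concentration. All constants below are assumptions, not the paper's parameters:

```python
# Sketch of a decaying virtual pheromone field modulating interaction
# weights (grid size, decay rate, and coupling are assumptions).
import numpy as np

class PheromoneField:
    def __init__(self, shape, decay=0.95):
        self.grid = np.zeros(shape)
        self.decay = decay

    def deposit(self, cell, salience):
        self.grid[cell] += salience     # motion salience quantified via BOC

    def step(self):
        self.grid *= self.decay         # exponential temporal decay

    def weight(self, cell, visual_weight):
        # couple the instantaneous visual channel with the memory channel
        return visual_weight * (1.0 + self.grid[cell])

field = PheromoneField((32, 32))
field.deposit((10, 10), salience=2.0)
for _ in range(5):                      # five decay steps of occlusion
    field.step()
print(round(field.weight((10, 10), visual_weight=1.0), 3))  # > 1: memory persists
```
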

29 pages, 31856 KB  
Article
A Vision–Locomotion Framework Toward Obstacle Avoidance for a Bio-Inspired Gecko Robot
by Wenrui Xiang, Barmak Honarvar Shakibaei Asli and Aihong Ji
Electronics 2026, 15(4), 882; https://doi.org/10.3390/electronics15040882 - 20 Feb 2026
Abstract
This paper presents the design and experimental evaluation of a bio-inspired gecko robot, focusing on mechanical design, vision-based obstacle perception, and rhythmic locomotion control as enabling technologies for future obstacle avoidance in complex environments. The robot features a 17-degrees-of-freedom mechanical structure with a flexible spine and multi-jointed limbs, providing a physical basis for adaptive locomotion. For perception, a custom obstacle detection dataset was constructed from the robot’s onboard camera view and used to train a YOLOv5-based detection model. Experimental results show that the trained model achieves a mean average precision (mAP) of 0.979 and a maximum F1-score of 0.97 at an optimal confidence threshold, demonstrating reliable real-time obstacle perception under diverse indoor conditions. For motion control, a central pattern generator (CPG) based on Hopf oscillators is implemented to generate rhythmic locomotion. Experimental evaluations confirm stable diagonal gait generation, with coordinated joint trajectories oscillating at 1 Hz. The flexible spine exhibits periodic lateral deflection with peak amplitudes of ±15°, ±10°, and ±8° across spinal joints, enhancing locomotion continuity and turning capability. Physical robot experiments further demonstrate smooth straight-line crawling enabled by the coupled limb–spine motion. While visual perception and CPG-based locomotion are experimentally validated as independent subsystems, their real-time closed-loop integration is not implemented in this study. Instead, this work establishes a system-level framework and experimental baseline for future perception–motion coupling, providing a foundation for closed-loop obstacle avoidance and autonomous navigation in bio-inspired gecko robots.
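
A single Hopf oscillator of the kind used in the CPG has a stable limit cycle whose radius and frequency are set by its parameters; driving it at the abstract's 1 Hz gait frequency looks like this (Euler integration; the convergence gain is an assumption):

```python
# One Hopf oscillator: converges to a stable limit cycle of radius
# sqrt(mu) at angular frequency omega (here 1 Hz, as in the abstract).
import math

def hopf_step(x, y, dt, mu=1.0, omega=2 * math.pi * 1.0, gamma=10.0):
    r2 = x * x + y * y
    dx = gamma * (mu - r2) * x - omega * y
    dy = gamma * (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

x, y = 0.1, 0.0                      # start near the unstable origin
for _ in range(2000):                # 2 s at dt = 1 ms
    x, y = hopf_step(x, y, dt=1e-3)
print(round(math.hypot(x, y), 3))    # ~1.0: settled on the 1 Hz limit cycle
```

In a full CPG, several such oscillators are phase-coupled so that limb and spine joints keep fixed phase offsets, which is what produces the diagonal gait reported above.
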

25 pages, 6514 KB  
Article
An Optimization-Based Method for Relative Pose Estimation for Collaborating UAVs Using Observed Predefined Trajectories
by Guven Cetinkaya and Yakup Genc
Drones 2026, 10(2), 135; https://doi.org/10.3390/drones10020135 - 14 Feb 2026
Abstract
Accurate relative pose estimation between unmanned aerial vehicles (UAVs) is a key requirement for cooperative navigation, formation control, and swarm operation in GNSS-denied environments. In multi-UAV systems, monocular vision is attractive due to its low weight and power requirements; however, bearing-only measurements can lead to angular ambiguities, particularly under symmetric or planar target motion. This paper presents a geometric framework for monocular relative pose estimation using observed known motion patterns, rather than relying on complex distributed system architectures. The method exploits trajectory-induced geometric constraints by back-projecting the observed image-plane trajectory of a target UAV into three-dimensional space and tracing rays from the camera center toward a geometrically parameterized reference trajectory. Relative pose parameters are refined through nonlinear optimization using Levenberg–Marquardt, enabling accurate estimation under noisy conditions. Beyond the estimation framework, the influence of cooperative trajectory geometry on angular observability is investigated through simulation experiments. The results indicate that planar collaborative motion may induce angular ambiguity despite numerical convergence, whereas introducing modest out-of-plane excitation through three-dimensional trajectories significantly improves observability. In addition to simulation-based evaluation, a limited real-world flight experiment is conducted to qualitatively validate the observed ambiguity patterns under practical sensing conditions. In particular, three-dimensional eight-shaped trajectories are shown to significantly suppress large angular outliers and improve estimation robustness without increasing computational complexity, providing validated guidance for active trajectory design to ensure observability in vision-based aerial scenarios.
(This article belongs to the Section Artificial Intelligence in Drones (AID))
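
The Levenberg–Marquardt refinement stage can be illustrated in a reduced 2-D form: fit the offset and yaw of a known parameterized reference trajectory to noisy observed points. The circular trajectory and scipy-based solver below are a simplified stand-in for the paper's 3-D ray-tracing formulation:

```python
# Reduced 2-D illustration of LM refinement: align a known circular
# reference trajectory (offset + yaw) to noisy observed points.
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0, 2 * np.pi, 60)
ref = np.stack([np.cos(t), np.sin(t)], axis=1)   # unit-circle reference

def residuals(p, observed):
    tx, ty, yaw = p
    R = np.array([[np.cos(yaw), -np.sin(yaw)],
                  [np.sin(yaw),  np.cos(yaw)]])
    return (ref @ R.T + [tx, ty] - observed).ravel()

# Synthesize observations at the "true" pose (2.0, -1.0, 0.3 rad) + noise.
Rt = np.array([[np.cos(0.3), -np.sin(0.3)], [np.sin(0.3), np.cos(0.3)]])
obs = ref @ Rt.T + [2.0, -1.0] + 0.01 * np.random.randn(60, 2)

sol = least_squares(residuals, x0=[0, 0, 0], args=(obs,), method="lm")
print(np.round(sol.x, 2))  # ~[ 2.  -1.   0.3]
```
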

23 pages, 16184 KB  
Article
A Lightweight Drone Vision System for Autonomous Inspection with Real-Time Processing
by Zhengran Zhou, Wei Wang, Hao Wu, Tong Wang and Satoshi Suzuki
Drones 2026, 10(2), 126; https://doi.org/10.3390/drones10020126 - 11 Feb 2026
Abstract
Automated inspection of power infrastructure with drones requires processing video streams in real time and performing object recognition from image data with constrained resources. Server-based object recognition algorithms depend on transmitting data over a network and require considerable computational resources. In this study, we present an automated system designed to inspect power infrastructure using drones in real time. The proposed system is implemented on the Rockchip RK3588 platform and uses a lightweight YOLOv8 architecture incorporating a Slim-Neck model with a VanillaBlock module integrated into the backbone. To support real-time operation, we developed a digital video stream processing system (DVSPS) to coordinate multimedia processor (MPP)-based hardware video decoding, with inference performed on a multicore neural processing unit (NPU) using thread pooling. The system can navigate autonomously using a closed-loop machine vision system that computes the latitude and longitude of electrical towers to perform multilevel inspections. The proposed model attained an 84.2% mAP50 and 52.5% mAP50:95 with 3.7 GFLOPs and an average throughput of 111.3 FPS with 34% fewer parameters. These results demonstrate that the proposed method is an efficient and scalable solution for autonomous inspection across diverse operational conditions.
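
The decode-then-infer pipeline with thread pooling across NPU cores can be skeletonized as below; the decoder and inference functions are stubs standing in for the MPP and NPU runtimes, and all names are assumptions rather than Rockchip APIs:

```python
# Skeleton of decode -> thread-pooled inference (stubs, not Rockchip APIs):
# one worker per NPU core pulls decoded frames and runs detection.
from concurrent.futures import ThreadPoolExecutor

def decode_frames(n):              # stand-in for the hardware video decoder
    for i in range(n):
        yield {"id": i, "pixels": b"..."}

def infer(frame):                  # stand-in for one NPU-core YOLO inference
    return (frame["id"], [("tower", 0.91)])

with ThreadPoolExecutor(max_workers=3) as pool:   # one worker per NPU core
    for frame_id, dets in pool.map(infer, decode_frames(8)):
        print(frame_id, dets)      # results arrive in submission order
```
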

23 pages, 3879 KB  
Article
Simultaneous Digital Twin: Chaining Climbing-Robot, Defect Segmentation, and Model Updating for Building Facade Inspection
by Changhao Song, Chang Lu, Yilong Shi, Aili He, Jiarui Lin and Zhiliang Ma
Buildings 2026, 16(3), 646; https://doi.org/10.3390/buildings16030646 - 4 Feb 2026
Abstract
The rapid deterioration of building facades presents substantial safety hazards in urban environments, necessitating advanced, automated inspection solutions. While computer vision (CV) and deep learning (DL) techniques have shown promise for defect analysis, critical gaps remain in achieving real-time, quantitative, and generalizable damage assessment suitable for robotic deployment. Current methods often lack precise metric quantification, struggle with diverse material appearances, and are computationally intensive for on-site processing. To address these limitations, this paper introduces a fully automated, end-to-end inspection framework integrating a wall-climbing robot, a real-time vision-based analysis system, and a digital twin management platform. The primary contributions are threefold: (1) a novel, fully integrated robotic framework for autonomous navigation, multi-sensor data collection, and real-time analysis; (2) a lightweight, synthetic data-augmented DL model for real-time defect segmentation and metric quantification, achieving a mean Average Precision (mAP) of 0.775 for segmentation, an average defect length error of 1.140 cm, and an average center position error of 0.826 cm; (3) a cloud-based digital twin platform enabling quantitative defect visualization, spatiotemporal traceability, and data-driven project management, with the on-site inspection cycle demonstrating a responsive latency of 2.8–4.8 s. Validated through laboratory tests and real building projects, the framework demonstrates significant improvements in inspection efficiency, quantitative accuracy, and decision support over conventional methods.
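
Metric quantification of a segmented defect reduces to converting pixel measurements with a known scale: the mask's extent gives length, its centroid gives position. A sketch, with the cm-per-pixel scale assumed for illustration:

```python
# Sketch of metric defect quantification from a segmentation mask using a
# known ground sample distance (the cm/px value is an assumption).
import numpy as np

def quantify_defect(mask, cm_per_px):
    ys, xs = np.nonzero(mask)
    length_px = np.hypot(ys.max() - ys.min(), xs.max() - xs.min())
    center_px = (xs.mean(), ys.mean())
    return length_px * cm_per_px, (center_px[0] * cm_per_px,
                                   center_px[1] * cm_per_px)

mask = np.zeros((100, 100), dtype=bool)
mask[20:22, 10:60] = True                 # a thin horizontal crack
length_cm, center_cm = quantify_defect(mask, cm_per_px=0.25)
print(round(length_cm, 1), center_cm)     # ~12.3 cm, centered near (8.6, 5.1) cm
```
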
