Search Results (263)

Search Parameters:
Keywords = lidar-based perception

25 pages, 6368 KB  
Article
Comfort-Oriented Pothole Traversal Using Multi-Sensor Perception and Fuzzy Control
by Chaochun Yuan, Shiqi Hang, Youguo He, Jie Shen, Long Chen, Yingfeng Cai, Shuofeng Weng and Junxian Wang
Sensors 2026, 26(6), 1925; https://doi.org/10.3390/s26061925 - 19 Mar 2026
Abstract
Potholes are typical negative road obstacles that can significantly compromise vehicle safety and ride comfort when traversed at inappropriate speeds. To address this issue, this paper proposes a pothole-detection-based, comfort-oriented pothole traversal algorithm that integrates multi-sensor fusion perception, comfort-constrained speed planning, and fuzzy control. A camera and a single-point ranging LiDAR are first fused to extract key geometric features of potholes, including contour, area, and depth. Based on these features, a vehicle–pothole dynamic model is developed in ADAMS to quantify the influence of pothole area and depth on vehicle vertical vibration. The vertical frequency-weighted root-mean-square (RMS) acceleration is adopted as the ride comfort indicator, based on which the maximum allowable traversal speed under different pothole geometries is determined. Furthermore, a longitudinal pothole traversal control strategy based on fuzzy theory is designed to regulate vehicle acceleration, enabling the vehicle to reach the comfort-constrained limiting speed within a finite preview distance while ensuring braking safety. The proposed method is validated through multi-scenario co-simulations using MATLAB/Simulink and CarSim, as well as real-vehicle experiments. Results demonstrate that the proposed strategy can effectively adjust vehicle speed before pothole traversal, satisfying comfort constraints and improving ride comfort without sacrificing driving safety. Full article
(This article belongs to the Section Vehicular Sensing)
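The comfort indicator named in this abstract, the vertical frequency-weighted RMS acceleration, can be computed directly from an acceleration trace. Below is a minimal, self-contained sketch; the piecewise weighting curve is a rough stand-in for the ISO 2631-1 Wk filter, not the authors' implementation, and all signal parameters are illustrative.

```python
import numpy as np

def weighted_rms_acceleration(a_z, fs):
    """Frequency-weighted RMS of vertical acceleration a_z [m/s^2].

    A crude stand-in for the ISO 2631-1 Wk weighting (assumption): unit gain
    in the 4-12.5 Hz band where humans are most sensitive, rolled off elsewhere.
    """
    n = len(a_z)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum = np.fft.rfft(a_z)
    w = np.where((freqs >= 4.0) & (freqs <= 12.5), 1.0,
                 np.where(freqs < 4.0, freqs / 4.0,
                          12.5 / np.maximum(freqs, 1e-9)))
    a_w = np.fft.irfft(spectrum * w, n=n)   # weighted time-domain signal
    return float(np.sqrt(np.mean(a_w ** 2)))

# Example: 5 s of simulated vertical acceleration sampled at 100 Hz.
fs = 100.0
t = np.arange(0, 5, 1 / fs)
a_z = 0.8 * np.sin(2 * np.pi * 6.0 * t) + 0.1 * np.random.randn(t.size)
print(f"weighted RMS = {weighted_rms_acceleration(a_z, fs):.3f} m/s^2")
```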

19 pages, 2755 KB  
Article
CA-Adv: Curvature-Adaptive Weighted Adversarial 3D Point Cloud Generation Method for Remote Sensing Scenarios
by Yanwen Sun, Shijia Xiao, Weiquan Liu, Min Huang, Chaozhi Cheng, Shiwei Lin, Jinhe Su, Zongyue Wang and Guorong Cai
Remote Sens. 2026, 18(6), 882; https://doi.org/10.3390/rs18060882 - 13 Mar 2026
Viewed by 142
Abstract
Adversarial robustness in 3D point cloud recognition models is a critical concern in remote sensing applications, such as autonomous driving and infrastructure monitoring. Existing adversarial attack methods can compromise model performance; moreover, they often neglect the intrinsic geometric properties of point clouds, leading to perceptually unnatural perturbations that limit their practicality for robustness evaluation in real-world scenarios. To address this, we propose CA-Adv, a novel curvature-adaptive weighted adversarial generation method for 3D point clouds. Our approach first employs Shapley values to assess regional sensitivity and identify salient regions. It then adaptively partitions these regions based on local curvature and assigns perturbation weights accordingly, concentrating the attack on geometrically sensitive areas while preserving overall structural consistency through explicit geometric constraints. Extensive experiments on real-world remote sensing data (KITTI) and synthetic benchmarks (ModelNet40, ShapeNet) demonstrate that CA-Adv achieves a high attack success rate with a minimal perturbation budget. The generated adversarial examples maintain superior visual naturalness and geometric fidelity. The method provides a practical tool for evaluating the robustness of 3D recognition models in applications such as autonomous driving, urban-scale LiDAR perception, and remote sensing point cloud analysis. Full article
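As an illustration of curvature-adaptive weighting, the sketch below estimates per-point curvature via local PCA and scales random perturbations by it. It omits the paper's Shapley-value saliency step, and the neighbourhood size and noise scale are assumptions, not CA-Adv's settings.

```python
import numpy as np

def curvature_weights(points, k=16):
    """Per-point perturbation weights from local curvature (a sketch).

    Curvature is estimated as the PCA 'surface variation'
    lambda_min / (lambda_0 + lambda_1 + lambda_2) over k nearest neighbours;
    the paper's Shapley-value saliency step is omitted here.
    """
    n = points.shape[0]
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    knn = np.argsort(d2, axis=1)[:, 1:k + 1]        # exclude the point itself
    curv = np.empty(n)
    for i in range(n):
        cov = np.cov(points[knn[i]].T)              # 3x3 local covariance
        eig = np.sort(np.linalg.eigvalsh(cov))      # ascending eigenvalues
        curv[i] = eig[0] / max(eig.sum(), 1e-12)
    return curv / max(curv.max(), 1e-12)            # normalise to [0, 1]

pts = np.random.rand(256, 3).astype(np.float32)
w = curvature_weights(pts)
perturbed = pts + 0.02 * w[:, None] * np.random.randn(*pts.shape)  # weighted noise
```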

15 pages, 3088 KB  
Article
Lightweight Semantic Segmentation Algorithm Based on Gated Visual State Space Models
by Kui Di, Jinming Cheng, Lili Zhang and Yubin Bao
Electronics 2026, 15(6), 1175; https://doi.org/10.3390/electronics15061175 - 12 Mar 2026
Viewed by 234
Abstract
LiDAR serves as the primary sensor for acquiring environmental information in intelligent driving systems. However, under adverse weather conditions, point cloud signals obtained by LiDAR suffer from intensity attenuation and noise interference, leading to a decline in segmentation accuracy. To address these issues, this paper designs a lightweight semantic segmentation system based on the Gated Visual State Space Model (VMamba), named RainMamba. Specifically, the system utilizes spherical projection to transform point clouds into 2D sequences and constructs a physical perception feature embedding module guided by the Beer–Lambert law to explicitly model and suppress spatial noise at the source. Subsequently, an uncertainty-weighted cross-modal correction module is employed to incorporate RGB images for dynamically calibrating the degraded point cloud data. Finally, a VMamba backbone is adopted to establish global dependencies with linear complexity. Experimental results on the SemanticKITTI dataset demonstrate that the system achieves an inference speed of 83 FPS, with a relative mIoU improvement of approximately 7.2% compared to the real-time baseline PolarNet. Furthermore, zero-shot evaluations on the real-world SemanticSTF dataset validate the system’s robust Sim-to-Real generalization capability. Notably, RainMamba delivers highly competitive accuracy comparable to the state-of-the-art heavy-weight model PTv3 while requiring a significantly lower parameter footprint, thereby demonstrating its immense potential for practical edge-computing deployment. Full article
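Spherical projection of a point cloud into a 2D range image, the first step of RainMamba's pipeline, can be sketched as follows; the 64 × 1024 resolution and field-of-view bounds are common 64-beam defaults, not values taken from the paper.

```python
import numpy as np

def spherical_projection(points, H=64, W=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR cloud onto an H x W range image (a sketch)."""
    fov_up, fov_down = np.radians(fov_up), np.radians(fov_down)
    fov = fov_up - fov_down
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1) + 1e-9
    yaw = np.arctan2(y, x)                        # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)                      # elevation
    u = ((1.0 - (yaw + np.pi) / (2 * np.pi)) * W).astype(int) % W
    v = ((fov_up - pitch) / fov * H).clip(0, H - 1).astype(int)
    img = np.zeros((H, W), dtype=np.float32)      # range channel only, for brevity
    img[v, u] = r                                 # later points overwrite earlier
    return img

cloud = np.random.uniform(-20, 20, size=(5000, 3)).astype(np.float32)
range_image = spherical_projection(cloud)         # input to the 2D backbone
```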

26 pages, 3911 KB  
Article
Integrated Multimodal Perception and Predictive Motion Forecasting via Cross-Modal Adaptive Attention
by Bakhita Salman, Alexander Chavez and Muneeb Yassin
Future Transp. 2026, 6(2), 64; https://doi.org/10.3390/futuretransp6020064 - 11 Mar 2026
Viewed by 203
Abstract
Accurate environmental perception is fundamental to safe autonomous driving; however, most existing multimodal systems rely on fixed or heuristic sensor fusion strategies that cannot adapt to scene-dependent variations in sensor reliability. This paper proposes Cross-Modal Adaptive Attention (CMAA), a unified end-to-end Bird’s-Eye-View (BEV) perception framework that dynamically fuses camera, LiDAR, and RADAR information through learnable, context-aware modality gating. Unlike static fusion approaches, CMAA adaptively reweights sensor contributions based on global scene descriptors, enabling the robust integration of semantic, geometric, and motion cues without manual tuning. The proposed architecture jointly performs 3D object detection, multi-object tracking, and motion forecasting within a shared BEV representation, preserving spatial alignment across tasks and supporting efficient real-time deployment. Experiments conducted on the official nuScenes validation split demonstrate that CMAA achieves 0.528 mAP and 0.691 NDS, outperforming fixed-weight fusion baselines while maintaining a compact model size and efficient inference. Additional tracking evaluation using the official nuScenes tracking devkit reports improved tracking performance, while motion forecasting experiments show reduced trajectory displacement errors (minADE and minFDE). Ablation studies further confirm the complementary contributions of adaptive modality gating and bidirectional cross-modal refinement, and a stratified dynamic analysis reveals consistent reductions in velocity estimation error across object classes, motion regimes, and environmental conditions. These results demonstrate that adaptive multimodal fusion improves robustness, motion reasoning, and perception reliability in complex traffic environments while remaining computationally efficient for deployment in safety-critical autonomous driving systems. Full article
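A minimal PyTorch sketch of the modality-gating idea: a global scene descriptor drives softmax weights that rescale per-modality BEV features before fusion. Layer sizes and the pooling choice are illustrative assumptions, not CMAA's actual architecture.

```python
import torch
import torch.nn as nn

class ModalityGate(nn.Module):
    """Context-aware gating over per-modality BEV features (a sketch).

    A global descriptor (channel-wise average pool of each modality's BEV
    map) is mapped to one softmax weight per modality; the fused map is
    the weighted sum. Sizes are illustrative, not the paper's.
    """
    def __init__(self, channels, n_modalities=3):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels * n_modalities, 64), nn.ReLU(),
            nn.Linear(64, n_modalities),
        )

    def forward(self, feats):                     # list of (B, C, H, W) maps
        desc = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)
        w = torch.softmax(self.mlp(desc), dim=1)  # (B, M) modality weights
        stacked = torch.stack(feats, dim=1)       # (B, M, C, H, W)
        return (w[:, :, None, None, None] * stacked).sum(dim=1)

gate = ModalityGate(channels=32)
cam, lidar, radar = (torch.randn(2, 32, 128, 128) for _ in range(3))
fused = gate([cam, lidar, radar])                 # (2, 32, 128, 128)
```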

21 pages, 3931 KB  
Article
Vehicle Speed Estimation Using Infrastructure-Mounted LiDAR via Rectangle Edge Matching
by Injun Hong and Manbok Park
Appl. Sci. 2026, 16(5), 2513; https://doi.org/10.3390/app16052513 - 5 Mar 2026
Viewed by 201
Abstract
Smart transportation infrastructure is increasingly deployed, and cooperative perception using stationary Light Detection and Ranging (LiDAR) sensors installed at intersections and along roadsides is becoming more important. However, infrastructure LiDAR often suffers from sparse point-cloud data (PCD) at long ranges and frequent occlusions, which can degrade the stability of inter-frame displacement and speed estimation. This paper proposes a real-time vehicle speed estimation method that operates robustly under sparse and partially observed conditions. The proposed approach extracts boundary points from clustered vehicle PCD and removes outliers, and then fits a 2D rectangle to the vehicle contour via Gauss–Newton optimization by minimizing distance-based residuals between boundary points and rectangle edges. To further improve robustness, we incorporate Hessian augmentation terms that account for boundary states and size variations, thereby alleviating excessive boundary violations and abnormal deformation of the width and height parameters during iterations. Next, from the fitted rectangles in consecutive frames, we select the corner nearest the LiDAR origin together with an auxiliary point, and perform 2D SVD-based alignment using only these two representative points. This enables efficient computation of inter-frame displacement and speed without full point-cloud registration (e.g., iterative closest point (ICP)). Experiments conducted at an intersection in K-City (Hwaseong, Republic of Korea) using a 40-channel LiDAR, a test vehicle (Genesis G70), and a real-time kinematic (RTK) system (MRP-2000) show that the proposed method stably preserves representative points and fits rectangles, even in sparse regions where only about two LiDAR rings are observed. Using CAN-based vehicle speed as the reference, the proposed method achieves an MAE of 0.76–1.37 kph and an RMSE of 0.90–1.58 kph over the tested speed settings (30, 50, and 70 kph, as well as high speed (~90 kph)) and trajectory scenarios. Furthermore, per-object processing-time measurements confirm the real-time feasibility of the proposed algorithm. Full article
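The two-point SVD alignment can be sketched in a few lines of NumPy: a standard 2D Kabsch solve over the nearest-corner and auxiliary points of consecutive frames, followed by a speed estimate. The frame rate and coordinates below are made up for illustration.

```python
import numpy as np

def align_2d(src, dst):
    """Rigid 2D alignment (rotation R, translation t) via SVD (Kabsch).

    src, dst: (2, 2) arrays holding the two representative points
    (nearest rectangle corner and auxiliary point) in consecutive frames.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflections
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Speed from the inter-frame corner displacement (10 Hz scan rate assumed).
prev = np.array([[2.0, 1.0], [3.5, 1.0]])   # [nearest corner, auxiliary point]
curr = np.array([[2.4, 1.1], [3.9, 1.1]])
R, t = align_2d(prev, curr)
dt = 0.1
speed_mps = np.linalg.norm(curr[0] - prev[0]) / dt
print(f"speed = {speed_mps * 3.6:.1f} kph")
```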

22 pages, 5554 KB  
Article
Image Inpainting-Based Point Cloud Restoration for Enhancing Tactical Classification of Unmanned Surface Vehicles
by Hyunjun Jeon, Eon-ho Lee, Jane Shin and Sejin Lee
Sensors 2026, 26(5), 1637; https://doi.org/10.3390/s26051637 - 5 Mar 2026
Viewed by 177
Abstract
The operational effectiveness of Unmanned Surface Vehicles (USVs) in modern naval scenarios depends on robust situational awareness. While LiDAR sensors are integral to 3D perception, their performance is frequently affected by incomplete data resulting from long-range sparsity and target occlusion. This study investigates a framework to restore incomplete point clouds to support improved surface vessel classification. The framework first estimates the target’s heading angle using a 2D area projection technique, combined with a descriptor to address orientation ambiguity. Subsequently, the 3D point cloud is converted into a 2D multi-channel image representation to leverage a deep learning-based image inpainting algorithm for data restoration. Finally, a high-density keypoint extraction method is applied to the completed point cloud to generate features for classification. This image-based approach is designed to prioritize computational efficiency and inference speed, facilitating deployment on resource-constrained maritime platforms. Experiments conducted on a simulator dataset reveal that the classification of restored point clouds yields higher accuracy compared to using the original, incomplete LiDAR data, particularly at extended distances (>70 m) and challenging aspect angles (0° and 180°). The results suggest the framework’s potential to address perception failures in sparse data scenarios, thereby supporting the operational envelope of USVs in contested environments. Full article
(This article belongs to the Section Sensors and Robotics)
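A rough sketch of turning a point cloud into a multi-channel 2D image suitable for an inpainting network follows; the bird's-eye layout and channel choice (max height, density, mean intensity) are assumptions, since the abstract does not specify the paper's exact representation.

```python
import numpy as np

def cloud_to_multichannel_image(points, intensity, res=0.1, size=64):
    """Rasterise an (N, 3) cloud into a (3, size, size) image (a sketch).

    Channels: max height, point density, mean intensity. Grid resolution
    and extent are illustrative; the paper's channel layout may differ.
    """
    img = np.zeros((3, size, size), dtype=np.float32)
    cnt = np.zeros((size, size), dtype=np.int32)
    half = size * res / 2
    for (x, y, z), i in zip(points, intensity):
        u, v = int((x + half) / res), int((y + half) / res)
        if 0 <= u < size and 0 <= v < size:
            img[0, v, u] = max(img[0, v, u], z)   # max height
            img[1, v, u] += 1.0                   # density
            img[2, v, u] += i                     # intensity sum
            cnt[v, u] += 1
    mask = cnt > 0
    img[2][mask] /= cnt[mask]                     # turn sum into mean
    return img

pts = np.random.uniform(-3, 3, (2000, 3)).astype(np.float32)
inten = np.random.rand(2000).astype(np.float32)
image = cloud_to_multichannel_image(pts, inten)   # feed to the inpainting model
```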

29 pages, 5420 KB  
Article
Theoretical Analysis and Systematic Comparison of Local Navigation Control Strategies in Semi-Structured Environments: A Systems Approach
by Claudio Urrea and Kevin Valencia-Aragón
Systems 2026, 14(3), 228; https://doi.org/10.3390/systems14030228 - 24 Feb 2026
Viewed by 330
Abstract
This study benchmarks three ROS 2 Navigation2 local controllers—Dynamic Window Approach Based (DWB), Regulated Pure Pursuit (RPP), and Model Predictive Path Integral (MPPI)—under three complementary operational stressors in simulation: (i) a structured corridor with a transient dynamic obstacle, (ii) a sloped environment where terrain inclination biases a planar 2D LiDAR costmap through spurious occupancy projections, and (iii) a narrow corridor that amplifies inflation effects. A reproducible rosbag2-based protocol records five key performance indicators per trial: time-to-goal, lateral tracking RMSE, stopped time, heading oscillations, and control effort. With 15 independent repetitions per cell (scene × controller × direction), the design yields 270 trials. The results expose complementary value profiles: RPP minimizes mission time, DWB produces the fewest heading oscillations through critic-based shaping, and MPPI achieves the lowest control effort via smooth trajectory generation. In the sloped scene, the tracking RMSE differences compress across all controllers—a signature of a perception-limited regime in which costmap bias overshadows controller logic. These findings translate into an actionable controller-selection guide and a reproducible baseline for quantifying gains from upstream perception and cost-representation improvements. In concrete terms, we contribute (i) a controlled benchmark with fixed planning, localization, and costmaps, (ii) full configuration disclosure (controller parameters, costmap settings, and software versions with package pinning), and (iii) a scene-specific costmap distortion index that links slope-induced local cost bias to measurable performance shifts, underpinning a decision matrix for controller selection in semi-structured environments. Full article
(This article belongs to the Section Systems Engineering)
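The five key performance indicators can be computed from a single trial log roughly as below; the exact definitions (e.g., heading oscillation as total yaw churn, control effort as summed command changes) are assumptions, not the paper's formulas.

```python
import numpy as np

def controller_kpis(t, xy, path_xy, yaw, cmd, v, stop_thresh=0.05):
    """Five KPIs from one trial log (a sketch; definitions are assumptions).

    t: (N,) timestamps; xy: (N, 2) poses; path_xy: (M, 2) reference path;
    yaw: (N,) headings [rad]; cmd: (N, 2) [v, omega] commands; v: (N,) speeds.
    """
    time_to_goal = float(t[-1] - t[0])
    # Lateral RMSE: distance from each pose to the nearest reference point.
    d = np.linalg.norm(xy[:, None, :] - path_xy[None, :, :], axis=-1).min(axis=1)
    lateral_rmse = float(np.sqrt(np.mean(d ** 2)))
    stopped_time = float(np.sum(np.diff(t)[v[1:] < stop_thresh]))
    heading_osc = float(np.sum(np.abs(np.diff(np.unwrap(yaw)))))   # total yaw churn
    control_effort = float(np.sum(np.abs(np.diff(cmd, axis=0))))   # command changes
    return time_to_goal, lateral_rmse, stopped_time, heading_osc, control_effort

# Example with a synthetic 10 s run along a straight reference path.
t = np.linspace(0, 10, 101)
xy = np.c_[t, 0.05 * np.sin(t)]
path = np.c_[np.linspace(0, 10, 201), np.zeros(201)]
yaw = np.gradient(xy[:, 1], xy[:, 0])
cmd = np.c_[np.ones_like(t), yaw]
print(controller_kpis(t, xy, path, yaw, cmd, v=np.ones_like(t)))
```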

22 pages, 1546 KB  
Article
Multimodal Fusion Attention Network for Real-Time Obstacle Detection and Avoidance for Low-Altitude Aircraft
by Xiaoqi Xu and Yiyang Zhao
Symmetry 2026, 18(2), 384; https://doi.org/10.3390/sym18020384 - 22 Feb 2026
Viewed by 306
Abstract
The rapid expansion of low-altitude unmanned aerial vehicles demands robust obstacle detection and avoidance systems capable of operating under diverse environmental conditions. This paper proposes a multimodal fusion attention network that integrates visual imagery and Light Detection and Ranging (LiDAR) point cloud data for real-time obstacle perception. The architecture incorporates a bidirectional cross-modal attention mechanism that learns dynamic correspondences between heterogeneous sensor modalities, enabling adaptive feature integration based on contextual reliability. An adaptive weighting component automatically modulates modal contributions according to estimated sensor confidence under varying environmental conditions. The network further employs gated fusion units and multi-scale feature pyramids to ensure comprehensive obstacle representation across different distances. A hierarchical avoidance decision framework translates detection outputs into executable control commands through threat assessment and graduated response strategies. Experimental evaluation on both public benchmarks and a purpose-collected low-altitude obstacle dataset demonstrates that the proposed method achieves 84.9% mean Average Precision (mAP) while maintaining 47.3 frames per second (FPS) on Graphics Processing Unit (GPU) hardware and 23.6 FPS on embedded platforms. Ablation studies confirm the contribution of each architectural component, with cross-modal attention providing the most substantial performance improvement. Full article
(This article belongs to the Section Computer)
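A gated fusion unit of the kind mentioned here can be sketched as a per-location sigmoid gate between two modality feature maps; the PyTorch layer sizes below are illustrative, not the paper's design.

```python
import torch
import torch.nn as nn

class GatedFusionUnit(nn.Module):
    """Per-location sigmoid gate between two feature maps (a sketch).

    The gate is predicted from the concatenated features, so each spatial
    location can lean on whichever sensor is locally more reliable.
    """
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, f_img, f_lidar):
        g = self.gate(torch.cat([f_img, f_lidar], dim=1))  # (B, C, H, W) in [0, 1]
        return g * f_img + (1.0 - g) * f_lidar

fuse = GatedFusionUnit(channels=64)
out = fuse(torch.randn(1, 64, 80, 80), torch.randn(1, 64, 80, 80))
```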

22 pages, 1472 KB  
Review
Innovations in Robots for Weed and Pest Control: A Systematic Review of Cutting-Edge Research
by Nicola Furnitto, Giuseppe Todde, Maria Spagnuolo, Giuseppe Sottosanti, Maria Caria, Giampaolo Schillaci and Sabina I. G. Failla
Mach. Learn. Knowl. Extr. 2026, 8(2), 51; https://doi.org/10.3390/make8020051 - 22 Feb 2026
Viewed by 540
Abstract
In recent years, agriculture has begun to transform thanks to the arrival of robots and autonomous vehicles capable of performing complex operations such as weeding and spraying in an intelligent and targeted manner. In fact, new-generation agricultural robots use artificial intelligence (AI), cameras, and sensors to recognise weeds, analyse crop conditions, and apply plant protection products only where necessary, thus reducing waste and environmental impact. Some systems combine drones and ground vehicles to achieve even more accurate results. This systematic review synthesises recent advances in agricultural robotics for weed and pest management through a PRISMA-based approach. Literature was collected from major scientific databases (Scopus, Web of Science, IEEE Xplore, Google Scholar) and complementary sources, leading to the inclusion of 83 eligible studies. The selected evidence was structured into four application domains: (i) weed detection and mapping, (ii) robotic and non-chemical weed control (mechanical and laser-based approaches), (iii) selective/variable-rate spraying for pest and disease management, and (iv) integrated weeding–spraying solutions, including cooperative Unmanned Aerial Vehicle–Unmanned Ground Vehicle (UAV–UGV) systems. Overall, the reviewed studies confirm rapid progress in real-time perception (deep learning-based detection), navigation/localization (e.g., GNSS/RTK, LiDAR, sensor fusion) and targeted actuation (spot spraying and precision interventions), while also revealing persistent limitations: heterogeneous evaluation protocols, limited system-level comparisons in terms of work rate, scalability, costs and robustness under variable field conditions, and an often unclear distinction between prototype platforms and solutions close to commercialization. However, the large-scale spread of these technologies is still hampered by high costs, technical complexity, and cultural resistance. The review highlights how the integration of automation, sustainability, and accessibility is key to the agriculture of the future. Full article
(This article belongs to the Section Thematic Reviews)

18 pages, 15632 KB  
Article
Design of a 3D High-Definition Map Visualizer for Pose Estimation and Autonomous Navigation in Dynamic Environments
by Yunchen Ge, Marcelo Contreras, Neel P. Bhatt and Ehsan Hashemi
Sensors 2026, 26(4), 1344; https://doi.org/10.3390/s26041344 - 19 Feb 2026
Viewed by 283
Abstract
A high-definition (HD) map development framework providing real-time visualization of multimodal perception data for state estimation, motion planning, and decision-making in autonomous navigation is presented and experimentally validated. The proposed framework integrates synchronized visual and LiDAR data and generates consistent frame transformations to construct accurate and interpretable HD maps suitable for navigation in dynamic environments. In addition, the framework enables flexible customization of essential map elements, including road features and static landmarks, facilitating efficient map generation and visualization. Building upon the developed HD map visualizer, a semantic-aware visual odometry (VO)-based pose estimation module is designed and verified through extensive evaluations and under perceptually degraded conditions. To ensure the reliability of synchronized multimodal data used by downstream perception and pose estimation modules, a sensor health monitoring system is also developed and validated in urban canyon scenarios with intermittent or unavailable global navigation satellite system (GNSS) measurements. Experimental results demonstrate that the proposed HD map visualizer and associated perception modules are transferable for autonomous navigation and can be effectively employed as benchmarking tools for state estimation and motion planning algorithms in autonomous driving. Full article
(This article belongs to the Section Navigation and Positioning)
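Consistent frame transformations of the sort the framework generates are typically chained as 4 × 4 homogeneous matrices; below is a small sketch with made-up frames, mounting offsets, and numbers.

```python
import numpy as np

def se3(R, t):
    """Build a 4 x 4 homogeneous transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, t
    return T

def yaw_rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

# Chain map <- vehicle <- LiDAR: a point sensed in the LiDAR frame lands in
# the map frame via one matrix product (frames and values are illustrative).
T_map_vehicle = se3(yaw_rot(np.radians(30)), np.array([10.0, 5.0, 0.0]))
T_vehicle_lidar = se3(np.eye(3), np.array([1.2, 0.0, 1.6]))  # mounting offset
T_map_lidar = T_map_vehicle @ T_vehicle_lidar

p_lidar = np.array([4.0, 0.0, 0.0, 1.0])                     # homogeneous point
p_map = T_map_lidar @ p_lidar
print(p_map[:3])
```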

25 pages, 15267 KB  
Article
3D Semantic Map Reconstruction for Orchard Environments Using Multi-Sensor Fusion
by Quanchao Wang, Yiheng Chen, Jiaxiang Li, Yongxing Chen and Hongjun Wang
Agriculture 2026, 16(4), 455; https://doi.org/10.3390/agriculture16040455 - 15 Feb 2026
Viewed by 535
Abstract
Semantic point cloud maps play a pivotal role in smart agriculture. They provide not only core three-dimensional data for orchard management but also empower robots with environmental perception, enabling safer and more efficient navigation and planning. However, traditional point cloud maps primarily model surrounding obstacles from a geometric perspective, failing to capture distinctions and characteristics between individual obstacles. In contrast, semantic maps encompass semantic information and even topological relationships among objects in the environment. Furthermore, existing semantic map construction methods are predominantly vision-based, making them ill-suited to handle rapid lighting changes in agricultural settings that can cause positioning failures. Therefore, this paper proposes a positioning and semantic map reconstruction method tailored for orchards. It integrates visual, LiDAR, and inertial sensors to obtain high-precision pose and point cloud maps. By combining open-vocabulary detection and semantic segmentation models, it projects two-dimensional detected semantic information onto the three-dimensional point cloud, ultimately generating a point cloud map enriched with semantic information. The resulting 2D occupancy grid map is utilized for robotic motion planning. Experimental results demonstrate that on a custom dataset, the proposed method achieves 74.33% mIoU for semantic segmentation accuracy, 12.4% relative error for fruit recall rate, and 0.038803 m mean translation error for localization. The deployed semantic segmentation network Fast-SAM achieves a processing speed of 13.36 ms per frame. These results demonstrate that the proposed method combines high accuracy with real-time performance in semantic map reconstruction. This exploratory work provides theoretical and technical references for future research on more precise localization and more complete semantic mapping, offering broad application prospects and providing key technological support for intelligent agriculture. Full article
(This article belongs to the Special Issue Advances in Robotic Systems for Precision Orchard Operations)
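Projecting 2D semantic labels onto a 3D cloud reduces to a pinhole projection plus a label lookup; here is a sketch with assumed intrinsics and label values (the paper's calibration, detector, and segmentation models are not reproduced).

```python
import numpy as np

def paint_point_cloud(points_cam, sem_mask, K):
    """Assign 2D semantic labels to 3D points in the camera frame (a sketch).

    points_cam: (N, 3) points, z forward; sem_mask: (H, W) int label image;
    K: 3 x 3 pinhole intrinsics. Points projecting outside the image or
    behind the camera get label -1.
    """
    H, W = sem_mask.shape
    labels = np.full(points_cam.shape[0], -1, dtype=np.int32)
    front = points_cam[:, 2] > 0.1
    uvw = (K @ points_cam[front].T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    idx = np.flatnonzero(front)[inside]
    labels[idx] = sem_mask[v[inside], u[inside]]
    return labels

K = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])
mask = np.random.randint(0, 5, size=(480, 640))          # fake segmentation
pts = np.random.uniform([-2, -2, 1], [2, 2, 10], size=(1000, 3))
labels = paint_point_cloud(pts, mask, K)                 # per-point semantics
```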

10 pages, 10777 KB  
Proceeding Paper
Blender-Based Simulation and Evaluation Framework for GNSS-LiDAR Sensor Fusion
by Adam Kalisz, Muhammad Khalil, Iñigo Cortés, Santiago Urquijo, Katrin Dietmayer, Matthias Overbeck, Christoph Miksovsky and Alexander Rügamer
Eng. Proc. 2026, 126(1), 21; https://doi.org/10.3390/engproc2026126021 - 14 Feb 2026
Viewed by 171
Abstract
The fusion of Global Navigation Satellite System (GNSS) and Light Detection and Ranging (LiDAR) sensors has emerged as a critical research area for high-precision navigation and mapping applications. While GNSS provides absolute positioning, it is susceptible to multipath errors, signal occlusions, and atmospheric disturbances. LiDAR, on the other hand, offers high-resolution environmental perception but lacks absolute localization and is sensitive to sensor noise and drift over time. To address these limitations, robust sensor fusion architectures are necessary to improve positioning accuracy, reliability, and robustness in diverse environments. This research focuses on the systematic modeling of GNSS and LiDAR errors to enhance sensor fusion performance. A key aspect of this work is the design of fusion architectures that optimize trade-offs between accuracy, environmental dependency, and robustness to sensor failures. To this end, this research investigates trajectory alignment, geometric similarity, and sensor signal dropouts. Various fusion strategies, including tightly coupled and loosely coupled approaches, are explored to evaluate their effectiveness under different operational conditions. Simulation-based evaluation is a core component of this study, enabling controlled analysis of sensor errors, fusion methodologies, and performance metrics. A custom Blender-based simulation framework has been developed to facilitate reproducible experiments and allow for the benchmarking of different fusion strategies. By systematically analyzing fusion performance in terms of accuracy, consistency, and computational cost, this work aims to provide valuable insights into the optimal integration of GNSS and LiDAR for real-world applications. The simulation framework generates a reusable output format in order to demonstrate the flexibility of this methodology by running a selected fusion approach on real data (Sim2Real). The proposed framework and findings contribute to the research community by providing tools and methodologies for evaluating sensor fusion strategies, fostering advancements in precise and resilient localization solutions for autonomous systems, robotics, and geospatial applications in challenging environments. Full article
(This article belongs to the Proceedings of European Navigation Conference 2025)
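A loosely coupled GNSS-LiDAR fusion step can be sketched as a Kalman filter whose prediction comes from LiDAR odometry and whose update comes from GNSS position fixes; all noise levels and motions below are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Loosely coupled fusion (a sketch): LiDAR odometry supplies the motion
# prediction, GNSS supplies absolute position measurements, and a Kalman
# update blends them. Noise levels are illustrative assumptions.
x = np.zeros(2)                      # position estimate [east, north]
P = np.eye(2) * 1.0                  # estimate covariance
Q = np.eye(2) * 0.05                 # LiDAR-odometry process noise
R = np.eye(2) * 2.0                  # GNSS measurement noise

rng = np.random.default_rng(0)
truth = np.zeros(2)
for step in range(50):
    truth = truth + np.array([0.5, 0.1])              # actual motion per step
    odom = np.array([0.5, 0.1]) + rng.normal(0, 0.05, 2)   # LiDAR-derived motion
    x, P = x + odom, P + Q                            # predict
    gnss = truth + rng.normal(0, np.sqrt(R[0, 0]), 2) # noisy absolute fix
    K = P @ np.linalg.inv(P + R)                      # Kalman gain
    x = x + K @ (gnss - x)                            # update
    P = (np.eye(2) - K) @ P
print("error vs truth:", np.round(x - truth, 3))
```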

30 pages, 19923 KB  
Article
Curriculum-Based Reinforcement Learning for Autonomous UAV Navigation in Unknown Curved Tubular Conduits
by Zamirddine Mari, Jérôme Pasquet and Julien Seinturier
Sensors 2026, 26(4), 1236; https://doi.org/10.3390/s26041236 - 13 Feb 2026
Viewed by 287
Abstract
Autonomous drone navigation in confined tubular environments remains a major challenge due to the constraining geometry of the conduits, the proximity of the walls, and the perceptual limitations inherent to such scenarios. We propose a reinforcement learning (RL) approach enabling a drone to navigate unknown three-dimensional tubes without any prior knowledge of their geometry, relying solely on local observations from a Light Detection and Ranging (LiDAR) sensor and a conditional visual detection of the tube center. In contrast, the Pure Pursuit algorithm, used as a deterministic baseline, benefits from explicit access to the centerline, creating an information asymmetry designed to assess the ability of RL to compensate for the absence of a geometric model. The agent is trained through a progressive curriculum learning strategy that gradually exposes it to increasingly curved geometries, where the tube center frequently disappears from the visual field. A turning-negotiation mechanism, based on the combination of direct visibility, directional memory, and LiDAR symmetry cues, proves essential for ensuring stable navigation under such partial observability conditions. Experiments show that the Proximal Policy Optimization (PPO) policy acquires robust and generalizable behavior, consistently outperforming the deterministic controller despite its limited access to geometric information. Validation in a high-fidelity three-dimensional environment further confirms the transferability of the learned behavior to continuous physical dynamics. In particular, this work introduces an explicit formulation of the turn negotiation problem in tubular navigation, coupled with a reward design and evaluation metrics that make turn-handling behavior measurable and analyzable. This explicit focus on turn negotiation distinguishes our approach from prior learning-based and heuristic methods. The proposed approach thus provides a complete framework for autonomous navigation in unknown tubular environments and opens perspectives for industrial, underground, or medical applications where progressing through narrow and weakly perceptive conduits represents a central challenge. Full article
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)
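The LiDAR symmetry cue can be approximated by mirroring one half of a planar scan onto the other; the sketch below is a plausible reading of the abstract, not the authors' exact formulation.

```python
import numpy as np

def lidar_symmetry_cue(ranges):
    """Left/right symmetry of a planar scan (a sketch, not the paper's cue).

    ranges: (N,) distances over a symmetric angular sweep centred on the
    heading. Returns a value in [0, 1]; 1 means the left half mirrors the
    right half, suggesting the drone is centred in the tube.
    """
    half = len(ranges) // 2
    left, right = ranges[:half], ranges[-half:][::-1]   # mirror the right side
    diff = np.abs(left - right) / (left + right + 1e-9)
    return float(1.0 - diff.mean())

# Centred in the tube -> symmetric scan -> cue near 1.
angles = np.linspace(-np.pi / 2, np.pi / 2, 181)
centred = 1.0 / np.maximum(np.abs(np.cos(angles)), 0.2)
print(lidar_symmetry_cue(centred))                              # ~1.0
print(lidar_symmetry_cue(centred + np.linspace(0, 0.5, 181)))   # lower
```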

16 pages, 5384 KB  
Article
In-Pixel Time-to-Digital Converter with 156 ps Accuracy in dToF Image Sensors
by Liying Chen, Bangtian Li and Chuantong Cheng
Photonics 2026, 13(2), 158; https://doi.org/10.3390/photonics13020158 - 6 Feb 2026
Viewed by 271
Abstract
As the mainstream technology for depth-imaging LiDAR, dToF measurement has been widely applied in emerging fields such as environmental perception and obstacle recognition, 3D terrain reconstruction, real-time motion capture, and drone obstacle-avoidance navigation, owing to its high resolution, long-range detection capability, and high sensitivity. To suit applications in different scenarios, the TDC resolution must be adjustable and remain stable across operating environments. To this end, this article studies the pixel array and TDC circuit of the chip: a voltage-controlled ring oscillator (VCRO) with the same structure as the in-pixel one is locked to a fixed frequency through a PLL, and the control voltage of the locked VCRO is then copied to the control terminal of the TDC in each pixel. Ideally, this control voltage makes the oscillation frequency of the in-pixel VCRO match the locking frequency of the VCRO inside the PLL, rendering it insensitive to PVT variations. This study developed a modularly expandable 16 × 16-pixel array dToF sensor chip based on this TDC architecture in CMOS technology. Finally, six configurable 16 × 16-pixel subarrays were integrated through modular splicing into a 32 × 48 large-scale dToF sensor chip. The top-level layout was completed in SMIC 180 nm technology, with a layout area of 5285 µm × 3669 µm. Post-layout simulation showed that, with a 400 MHz system clock and a 33.3 kHz frame rate, the chip achieves a time measurement resolution of 156 ps, DNL < 1 LSB, INL < 0.85 LSB, and absolute ranging accuracy better than 2.5 cm. Full article
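For reference, the stated 156 ps resolution is consistent with the reported <2.5 cm ranging accuracy, since one TDC bin corresponds to c·Δt/2 of range (the round trip halves the distance per unit time):

```python
C = 299_792_458.0                # speed of light [m/s]
t_lsb = 156e-12                  # TDC time resolution [s]
range_lsb = C * t_lsb / 2        # round trip -> divide by two
print(f"one TDC bin = {range_lsb * 100:.2f} cm")   # ~2.34 cm
```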

19 pages, 5725 KB  
Article
Real-Time 3D Scene Understanding for Road Safety: Depth Estimation and Object Detection for Autonomous Vehicle Awareness
by Marcel Simeonov, Andrei Kurdiumov and Milan Dado
Vehicles 2026, 8(2), 28; https://doi.org/10.3390/vehicles8020028 - 2 Feb 2026
Viewed by 627
Abstract
Accurate depth perception is vital for autonomous driving and roadside monitoring. Traditional stereo vision methods are cost-effective but often fail under challenging conditions such as low texture, reflections, or complex lighting. This work presents a perception pipeline built around FoundationStereo, a Transformer-based stereo depth estimation model. At low resolutions, FoundationStereo achieves real-time performance (up to 26 FPS) on embedded platforms such as the NVIDIA Jetson AGX Orin with TensorRT acceleration and power-of-two input sizes, enabling deployment in roadside cameras and in-vehicle systems. For Full HD stereo pairs, the same model delivers dense and precise environmental scans, complementing LiDAR while maintaining a high level of accuracy. A YOLO11 object detection and segmentation model runs in parallel for object extraction. Detected objects are removed from the depth maps generated by FoundationStereo prior to point cloud generation, producing cleaner 3D reconstructions of the environment. This approach demonstrates that advanced stereo networks can operate efficiently on embedded hardware. Rather than replacing LiDAR or radar, it complements existing sensors by providing dense depth maps in situations where other sensors may be limited. By improving depth completeness and robustness, and by enabling filtered point clouds, the proposed system supports safer navigation, collision avoidance, and scalable roadside infrastructure scanning for autonomous mobility. Full article
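Removing detected objects from the depth map before back-projection can be sketched as boolean masking followed by a pinhole inverse projection; the intrinsics, image size, and mask below are assumptions for illustration.

```python
import numpy as np

def depth_to_clean_cloud(depth, obj_mask, K):
    """Drop detected-object pixels from a depth map, then back-project (a sketch).

    depth: (H, W) metric depth from the stereo network; obj_mask: (H, W) bool,
    True where a YOLO-style segmentation found an object; K: pinhole intrinsics.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    keep = (~obj_mask) & (depth > 0)          # static scene, valid depth only
    z = depth[keep]
    x = (u[keep] - K[0, 2]) * z / K[0, 0]
    y = (v[keep] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)        # (M, 3) static-scene cloud

K = np.array([[1000.0, 0, 960], [0, 1000.0, 540], [0, 0, 1]])
depth = np.random.uniform(2, 40, (1080, 1920)).astype(np.float32)
mask = np.zeros((1080, 1920), dtype=bool)
mask[300:600, 800:1200] = True                # pretend a detected vehicle
cloud = depth_to_clean_cloud(depth, mask, K)
```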
