Search Results (351)

Search Parameters:
Keywords = LiDAR-based 3D mapping

47 pages, 3288 KB  
Review
LiDAR-Based Road Surface Damage Classification: A Survey
by Trevor Greene, Meisam Shayegh Moradi, Muhammad Umair, Nafiul Nawjis, Naima Kaabouch and Timothy Pasch
Sensors 2026, 26(8), 2338; https://doi.org/10.3390/s26082338 - 10 Apr 2026
Viewed by 29
Abstract
Unlike image-only systems that falter in shadows, glare, and low contrast, LiDAR directly records surface geometry and supports depth-aware quantification. This survey examines LiDAR-based road surface damage classification across the entire pipeline, encompassing acquisition with mobile and terrestrial laser scanning, preprocessing and representation choices, supervised, semi-supervised, and unsupervised learning techniques, as well as multisensor fusion at early, mid, and late stages. A consistent thread is measurement, not just detection: we describe how LiDAR damage classification maps to agency practices such as the Distress Identification Manual and the Pavement Condition Index. We summarize datasets and evaluation protocols for detection, segmentation, 3D reconstruction, and ride quality. We outline practical concerns for corridor-scale deployment: calibration and timing, intensity normalization, tiling/streaming, and runtime budgeting. The review concludes with open problems and outlines directions for robust, severity-aware, and scalable field systems.
(This article belongs to the Section Remote Sensors)
38 pages, 3132 KB  
Article
Lightweight Semantic-Aware Route Planning on Edge Hardware for Indoor Mobile Robots: Monocular Camera–2D LiDAR Fusion with Penalty-Weighted Nav2 Route Server Replanning
by Bogdan Felician Abaza, Andrei-Alexandru Staicu and Cristian Vasile Doicin
Sensors 2026, 26(7), 2232; https://doi.org/10.3390/s26072232 - 4 Apr 2026
Viewed by 652
Abstract
The paper introduces a computationally efficient semantic-aware route planning framework for indoor mobile robots, designed for real-time execution on resource-constrained edge hardware (Raspberry Pi 5, CPU-only). The proposed architecture fuses monocular object detection with 2D LiDAR-based range estimation and integrates the resulting semantic annotations into the Nav2 Route Server for penalty-weighted route selection. Object localization in the map frame is achieved through the Angular Sector Fusion (ASF) pipeline, a deterministic geometric method requiring no parameter tuning. The ASF projects YOLO bounding boxes onto LiDAR angular sectors and estimates the object range using a 25th-percentile distance statistic, providing robustness to sparse returns and partial occlusions. All intrinsic and extrinsic sensor parameters are resolved at runtime via ROS 2 topic introspection and the URDF transform tree, enabling platform-agnostic deployment. Detected entities are classified according to mobility semantics (dynamic, static, and minor) and persistently encoded in a GeoJSON-based semantic map, with these annotations subsequently propagated to navigation graph edges as additive penalties and velocity constraints. Route computation is performed by the Nav2 Route Server through the minimization of a composite cost functional combining geometric path length with semantic penalties. A reactive replanning module monitors semantic cost updates during execution and triggers route invalidation and re-computation when threshold violations occur. Experimental evaluation over 115 navigation segments (legs) on three heterogeneous robotic platforms (two single-board RPi5 configurations and one dual-board setup with inference offloading) yielded an overall success rate of 97% (baseline: 100%, adaptive: 94%), with 42 replanning events observed in 57% of adaptive trials. Navigation time distributions exhibited statistically significant departures from normality (Shapiro–Wilk, p < 0.005). While central tendency differences between the baseline and adaptive modes were not significant (Mann–Whitney U, p = 0.157), the adaptive planner reduced temporal variance substantially (σ = 11.0 s vs. 31.1 s; Levene’s test W = 3.14, p = 0.082), primarily by mitigating AMCL recovery-induced outliers. On-device YOLO26n inference, executed via the NCNN backend, achieved 5.5 ± 0.7 FPS (167 ± 21 ms latency), and distributed inference reduced the average system CPU load from 85% to 48%. The study further reports deployment-level observations relevant to the Nav2 ecosystem, including GeoJSON metadata persistence constraints, graph discontinuity (“path-gap”) artifacts, and practical Route Server configuration patterns for semantic cost integration.
(This article belongs to the Special Issue Advances in Sensing, Control and Path Planning for Robotic Systems)
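The 25th-percentile range statistic at the heart of the ASF pipeline is easy to picture in code. The sketch below is illustrative only (the function names and sector-selection details are our assumptions, not the authors' implementation): it keeps the LiDAR returns whose bearing falls inside the angular sector spanned by a projected YOLO bounding box and reports a low percentile of their ranges, so background points behind a partially occluding object do not drag the estimate backward.

    import numpy as np

    def asf_range(scan_angles, scan_ranges, sector_min, sector_max, q=25.0):
        """Estimate object range from 2D LiDAR returns in an angular sector.

        scan_angles, scan_ranges: 1-D arrays of one scan (radians, metres).
        sector_min, sector_max:   sector bounds from projecting a YOLO
                                  bounding box into the LiDAR frame.
        A low percentile, rather than the mean, keeps the estimate on the
        object itself: background returns inflate the upper tail.
        """
        in_sector = (scan_angles >= sector_min) & (scan_angles <= sector_max)
        r = scan_ranges[in_sector]
        r = r[np.isfinite(r)]            # drop no-return beams
        return float(np.percentile(r, q)) if r.size else None

    def edge_cost(length_m, semantic_penalties):
        # Composite cost for penalty-weighted route selection:
        # geometric edge length plus additive semantic penalties.
        return length_m + sum(semantic_penalties)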
25 pages, 12227 KB  
Article
Air–Ground Collaborative Autonomous Exploration and Mapping Method for Complex Multi-Grain Pile Environments
by Lan Wu, Menghao Chen and Xuhui Liang
Sensors 2026, 26(7), 2184; https://doi.org/10.3390/s26072184 - 1 Apr 2026
Viewed by 392
Abstract
Prompt 3D mapping of grain storage is essential for effective management. However, standard mapping algorithms encounter a number of challenges, as the typical granary environment contains dust, grain piles, and narrow aisles. A single robotic agent cannot provide complete area coverage, and most multi-robot approaches re-scan the same areas due to a lack of explicit viewpoint-based task allocation. To overcome these issues, we propose an air–ground collaborative exploration system for complex multi-grain pile scenarios. Exploration redundancy is reduced by estimating the utility of candidate viewpoints through ray tracing and assigning the tops of the grain piles to aerial robots and the lower regions and narrow aisles to ground vehicles. To manage dense dust (5–15 mg/m3), a quality-aware fusion strategy evaluates the reliability of the sensing distance and point density to reduce the influence of degraded aerial depth data; mapping itself relies on LiDAR data to ensure map quality. A re-scanning mechanism then drives coverage-oriented exploitation of insufficiently explored regions. Simulation results show that the design achieved a grain pile coverage of 97.2%, with total exploration time reduced by 20.1% over single-robot baselines. The results indicate that viewpoint-aware task allocation and dust-sensitive perception fusion offer a practical solution for autonomous inspection in GPS-restricted, dust-rich industrial environments such as granary facilities.
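The quality-aware weighting can be sketched generically; the exponential range decay and density saturation below are our assumptions (the abstract states only that sensing distance and point density are evaluated):

    import numpy as np

    def fusion_weight(ranges, local_density, r0=10.0, d0=50.0):
        """Toy reliability weight for fusing aerial and ground LiDAR in dust.

        ranges:        per-point sensing distance in metres
        local_density: per-point neighbour count within a fixed radius
        Degraded aerial depth data (long range, sparse returns) receives
        a small weight and thus contributes little to the fused map.
        """
        w_range = np.exp(-ranges / r0)                      # distant returns trusted less
        w_density = np.clip(local_density / d0, 0.0, 1.0)   # sparse regions trusted less
        return w_range * w_density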
21 pages, 40575 KB  
Article
Navigation Error Characteristics of LIO-, VIO-, and RIMU-Assisted INS/GNSS Multi-Sensor Fusion Schemes in a GNSS-Denied Environment
by Kai-Wei Chiang, Syun Tsai, Chi-Hsin Huang, Yang-En Lu, Surachet Srinara, Meng-Lun Tsai, Naser El-Sheimy and Mengchi Ai
Sensors 2026, 26(7), 2068; https://doi.org/10.3390/s26072068 - 26 Mar 2026
Viewed by 425
Abstract
Autonomous vehicles at level 3 and above must maintain high navigation accuracy, particularly in global navigation satellite system (GNSS)-denied environments. The main innovations of this work are threefold. First, we integrate visual inertial odometry (VIO) and light detection and ranging (LiDAR) inertial odometry (LIO) as external updates to mitigate the rapid drift of micro-electromechanical system (MEMS)-based industrial-grade inertial measurement units (IMUs) during long-term GNSS outages. Second, we adopt a redundant IMU (RIMU) approach that fuses multiple low-cost IMUs to reduce sensor noise and improve reliability. Third, we propose a system calibration methodology using both static and dynamic vehicle motion to estimate extrinsic parameters (boresight angles and lever arms) of the sensors, achieving an overall boresight angle root-mean-square error of 0.04 degrees in the simulation. Experiments were conducted under a 7 min GNSS-denied scenario in an underground parking lot, allowing for comparison of the error characteristics of multi-sensor fusion schemes against a navigation-grade reference. The INS/GNSS/LIO framework achieved a two-dimensional root-mean-square position error of 1.22 m (95% position error within 2.5 m), meeting the lane-level (1.5 m) accuracy requirement under a GNSS outage exceeding 7 min without prior maps. In contrast, the RINS/GNSS/VIO framework yielded a 4.71 m 2D mean position error under the same conditions. This paper provides a quantitative comparison of the baseline error characteristics of VIO-, LIO-, and RIMU-assisted INS/GNSS fusion under a GNSS-denied navigation scenario.
(This article belongs to the Section Remote Sensors)
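The reported figures follow the standard definitions; a minimal sketch, assuming time-aligned estimated and reference trajectories in a local East/North frame:

    import numpy as np

    def horizontal_error_stats(est_en, ref_en):
        """2D RMSE and 95th-percentile position error (est_en, ref_en: (N, 2))."""
        errors = np.linalg.norm(est_en - ref_en, axis=1)
        rmse = float(np.sqrt(np.mean(errors ** 2)))
        p95 = float(np.percentile(errors, 95))
        return rmse, p95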
23 pages, 11145 KB  
Article
DiffLiGS: Diffusion-Guided LiDAR-Enhanced 3D Gaussian Splatting
by Shucheng Gong, Hong Xie, Jiang Song, Longze Zhu and Hongping Zhang
ISPRS Int. J. Geo-Inf. 2026, 15(4), 140; https://doi.org/10.3390/ijgi15040140 - 24 Mar 2026
Viewed by 620
Abstract
Multi-view 3D reconstruction is essential for smart cities, supporting applications such as smart city planning and autonomous navigation. While traditional reconstruction pipelines and recent neural implicit methods, such as NeRF, achieve high visual fidelity, they often struggle with geometric accuracy and sparse-view scenarios. To address this challenge, we present DiffLiGS, a novel multi-modal 3D reconstruction framework that integrates LiDAR point clouds and LiDAR-guided diffusion-based priors into the 3D Gaussian Splatting (3DGS) pipeline, enabling high-fidelity and geometrically accurate models. Our method first densifies sparse LiDAR depths using a diffusion model and refines them through multi-view geometric constraints, producing dense LiDAR depth maps that provide robust supervision for 3DGS optimization. Leveraging these dense depth maps, we guide a Stable Video Diffusion model to synthesize novel view images, which are incorporated into training to enhance reconstruction completeness and visual realism. By jointly fusing rich appearance cues from multi-view images with precise LiDAR-derived geometry and diffusion priors, DiffLiGS achieves unified, geometry-aware 3D scene representations. Our extensive experiments demonstrate that our approach significantly improves both geometric accuracy and rendering quality compared to existing 3D reconstruction methods, enabling real-time, high-precision modeling of complex urban environments.
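The abstract does not spell out how the dense depth maps supervise 3DGS; a conventional choice, and purely our assumption here, is a masked L1 depth term added to the photometric loss:

    import numpy as np

    def depth_supervision(rendered_depth, lidar_depth, lam=0.1):
        """Masked L1 depth term (an assumed form, not the paper's exact loss).

        rendered_depth: depth rasterized from the current 3D Gaussians
        lidar_depth:    densified LiDAR depth map; zeros mark invalid pixels
        The returned value would be added to the photometric loss, e.g.
        L = L_photo + depth_supervision(...), during 3DGS optimization.
        """
        valid = lidar_depth > 0
        if not valid.any():
            return 0.0
        return lam * float(np.abs(rendered_depth[valid] - lidar_depth[valid]).mean())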
20 pages, 11919 KB  
Article
Optimized UAV-LiDAR Workflows for Fine-Scale Stream Network Mapping in Low-Gradient Wetlands: A Case Study of the Kushiro Wetland, Japan
by Waruth Pojsilapachai, Takehiko Ito and Tomohito J. Yamada
Water 2026, 18(6), 693; https://doi.org/10.3390/w18060693 - 16 Mar 2026
Viewed by 388
Abstract
Accurate delineation of stream networks in low-gradient wetlands remains challenging due to subtle topographic variation and dense vegetation cover. This study systematically evaluated 48 Unmanned Aerial Vehicle Light Detection and Ranging (UAV-LiDAR) processing workflows through 1128 pairwise comparisons to identify optimal approaches for mapping fine-scale channels in Japan’s Kushiro Wetland, a Ramsar-designated ecosystem. The workflows combined three ground filtering methods (Progressive Morphological Filter, Cloth Simulation Filter, Multiscale Curvature Classification), four interpolation techniques (Inverse Distance Weighting, Triangulated Irregular Network, Kriging, Multilevel B-spline Approximation), two sink-filling algorithms (Planchon & Darboux; Wang & Liu), and two flow direction models (D8, D-infinity). Performance was first assessed using pixel-based Intersection over Union (IoU) metrics to quantify inter-method consensus. Independent plausibility-based validation was then conducted using near-contemporaneous Sentinel-2 imagery. Although pairwise statistical analysis identified workflows that achieved high inter-method consensus (median IoU = 0.90), external validation demonstrated that the CSF-MBA-Planchon-D8 workflow provided the most realistic representation of optically observable channel corridors (validation IoU ≈ 0.85). These findings reveal that high inter-method agreement does not necessarily imply accurate landscape representation; multiple workflows may converge on systematically biased solutions. Ground filtering exerted the strongest influence on pairwise consensus, whereas plausibility-based validation highlighted the importance of selecting workflow combinations that preserve subtle channel morphology. Sink-filling and flow direction choices exerted comparatively minor effects in this low-gradient setting. The proposed dual-validation framework provides methodological guidance for wetland restoration planning and highlights the necessity of external validation in LiDAR-derived hydrological feature extraction.
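The pixel-based IoU used for both the inter-method consensus and the Sentinel-2 validation is the usual ratio over binary channel masks; for reference:

    import numpy as np

    def pixel_iou(mask_a, mask_b):
        """Intersection over Union between two binary channel rasters."""
        inter = np.logical_and(mask_a, mask_b).sum()
        union = np.logical_or(mask_a, mask_b).sum()
        return inter / union if union else 1.0  # two empty masks agree trivially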
26 pages, 3911 KB  
Article
Integrated Multimodal Perception and Predictive Motion Forecasting via Cross-Modal Adaptive Attention
by Bakhita Salman, Alexander Chavez and Muneeb Yassin
Future Transp. 2026, 6(2), 64; https://doi.org/10.3390/futuretransp6020064 - 11 Mar 2026
Viewed by 426
Abstract
Accurate environmental perception is fundamental to safe autonomous driving; however, most existing multimodal systems rely on fixed or heuristic sensor fusion strategies that cannot adapt to scene-dependent variations in sensor reliability. This paper proposes Cross-Modal Adaptive Attention (CMAA), a unified end-to-end Bird’s-Eye-View (BEV) perception framework that dynamically fuses camera, LiDAR, and RADAR information through learnable, context-aware modality gating. Unlike static fusion approaches, CMAA adaptively reweights sensor contributions based on global scene descriptors, enabling the robust integration of semantic, geometric, and motion cues without manual tuning. The proposed architecture jointly performs 3D object detection, multi-object tracking, and motion forecasting within a shared BEV representation, preserving spatial alignment across tasks and supporting efficient real-time deployment. Experiments conducted on the official nuScenes validation split demonstrate that CMAA achieves 0.528 mAP and 0.691 NDS, outperforming fixed-weight fusion baselines while maintaining a compact model size and efficient inference. Additional tracking evaluation using the official nuScenes tracking devkit reports improved tracking performance, while motion forecasting experiments show reduced trajectory displacement errors (minADE and minFDE). Ablation studies further confirm the complementary contributions of adaptive modality gating and bidirectional cross-modal refinement, and a stratified dynamic analysis reveals consistent reductions in velocity estimation error across object classes, motion regimes, and environmental conditions. These results demonstrate that adaptive multimodal fusion improves robustness, motion reasoning, and perception reliability in complex traffic environments while remaining computationally efficient for deployment in safety-critical autonomous driving systems.
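The scene-conditioned gating can be pictured as a small learned map from a global scene descriptor to softmax weights over the three modalities. The sketch below is schematic (the descriptor size, gate architecture, and fusion by weighted sum are our assumptions; the abstract does not specify them):

    import numpy as np

    rng = np.random.default_rng(0)
    W_gate = 0.01 * rng.normal(size=(3, 128))   # learned in practice, random here

    def modality_gates(scene_descriptor):
        """Map a 128-D global scene descriptor to camera/LiDAR/RADAR weights."""
        logits = W_gate @ scene_descriptor
        e = np.exp(logits - logits.max())
        return e / e.sum()                      # softmax: weights sum to 1

    def fuse_bev(bev_cam, bev_lidar, bev_radar, g):
        # Reweight per-modality BEV feature maps before the shared task heads.
        return g[0] * bev_cam + g[1] * bev_lidar + g[2] * bev_radar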
16 pages, 2080 KB  
Article
Lidar–Vision Depth Fusion for Robust Loop Closure Detection in SLAM Systems
by Bingzhuo Liu, Panlong Wu, Rongting Chen, Yidan Zheng and Mengyu Li
Machines 2026, 14(3), 282; https://doi.org/10.3390/machines14030282 - 3 Mar 2026
Viewed by 436
Abstract
Loop Closure Detection (LCD) is a key component of Simultaneous Localization and Mapping (SLAM) systems, responsible for correcting odometric drift and maintaining global consistency in localization and mapping. However, single-modality LCD methods suffer from inherent limitations: LiDAR-based approaches are affected by point cloud sparsity, limiting feature representation in unstructured environments, while vision-based methods are sensitive to illumination and weather variations, reducing robustness. To address these issues, this paper presents a LiDAR–vision multimodal fusion LCD algorithm. Spatiotemporal alignment between LiDAR point clouds and images is achieved through extrinsic calibration and timestamp interpolation to ensure cross-modal consistency. Harris corner detection and BRIEF descriptors are employed to extract visual features, and a LiDAR-projected sparse depth map is used to complete depth information, mapping 2D features into 3D space. A hybrid feature representation is then constructed by fusing LiDAR geometric triangle descriptors with visual BRIEF descriptors, enabling efficient loop candidate retrieval via hash indexing. Finally, an improved RANSAC algorithm performs geometric verification to enhance the robustness of relative pose estimation. Experiments on the KITTI and NCLT datasets show that the proposed method achieves average F1 scores of 85.28% and 77.63%, respectively, outperforming both unimodal and existing multimodal approaches. When integrated into a SLAM framework, it reduces the Absolute Trajectory Error (ATE) RMSE by 11.2–16.4% compared with LiDAR-only methods, demonstrating improved loop detection accuracy and overall system robustness in complex environments.
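The step that maps 2D features into 3D is standard pinhole back-projection through the completed depth map; a minimal sketch, assuming calibrated intrinsics K and keypoints that land on valid depth pixels:

    import numpy as np

    def lift_to_3d(uv, depth_map, K):
        """Back-project 2D feature locations into camera-frame 3D points.

        uv:        (N, 2) pixel coordinates of Harris corners
        depth_map: LiDAR-projected and completed depth image, in metres
        K:         3x3 camera intrinsic matrix
        """
        fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
        u, v = uv[:, 0], uv[:, 1]
        z = depth_map[v.astype(int), u.astype(int)]
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.stack([x, y, z], axis=1)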
20 pages, 3202 KB  
Article
Robust LiDAR-Based Train Detection via Point Cloud Segmentation for Railway Safety
by Yuxing Yang, Siyue Yu and Jimin Xiao
Sensors 2026, 26(5), 1514; https://doi.org/10.3390/s26051514 - 27 Feb 2026
Viewed by 316
Abstract
Ensuring railway safety requires reliable monitoring of trains in critical safety areas, such as station throat zones and railway crossings. Compared with cameras, roadside LiDAR can more reliably capture the geometry of trains under low-light, high-speed, and adverse weather conditions. However, industrial LiDAR solutions still primarily use the background comparison technique, which compares each sample against a pre-recorded clean map and then applies a size-based filter. Such approaches are highly sensitive to point cloud background changes arising from varying LiDAR installation distances, train speeds, and surface materials, often resulting in fragmented clustering and missed detections. In this paper, train detection is reformulated as a point-level semantic segmentation problem. A lightweight 3D segmentation network that directly predicts train points from raw data is designed, and clustering-based post-processing is applied to generate train-level events in real time. Experiments on real railway data under various operating conditions show that the proposed method achieves higher detection accuracy and greater robustness than traditional compare-based methods and representative deep learning benchmark methods, and is therefore suitable for practical railway safety monitoring.
(This article belongs to the Section Intelligent Sensors)
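For context, the background comparison baseline the paper argues against can be sketched in a few lines (thresholds are illustrative; per-cluster size filtering is collapsed into a single count here):

    import numpy as np
    from scipy.spatial import cKDTree

    def background_compare(frame_pts, clean_map_pts, dist_thresh=0.3, min_pts=200):
        """Flag points far from a pre-recorded clean map as foreground.

        The fragility described in the abstract arises exactly here: the
        right dist_thresh depends on installation distance, train speed,
        and surface material, which motivates learned segmentation instead.
        """
        d, _ = cKDTree(clean_map_pts).query(frame_pts)
        foreground = frame_pts[d > dist_thresh]
        return foreground if len(foreground) >= min_pts else np.empty((0, 3))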
23 pages, 15134 KB  
Article
Multi-Technique Data Fusion for Obtaining High-Resolution 3D Models of Narrow Gorges and Canyons to Determine Water Level in Flooding Events
by José Luis Pérez-García, José Miguel Gómez-López, Antonio Tomás Mozas-Calvache and Diego Vico-García
GeoHazards 2026, 7(1), 25; https://doi.org/10.3390/geohazards7010025 - 17 Feb 2026
Viewed by 466
Abstract
Precise modeling of narrow gorges is challenging due to extreme confinement, hindering visibility and accessibility. These environments often render Global Navigation Satellite Systems (GNSS)-based positioning unfeasible, a difficulty compounded by water and dense vegetation. Consequently, multi-technique data fusion is required. This study proposes a robust methodology to generate high-resolution 3D models of such complex environments by integrating multiple aerial (e.g., Unmanned Aerial Vehicles, UAVs) and terrestrial techniques. A multi-sensor approach combined UAV-Light Detection and Ranging (LiDAR) and UAV-photogrammetry for external areas with Terrestrial laser scanning (TLS), Mobile Mapping System (MMS), and Spherical Photogrammetry (SP) for the canyon floor. Furthermore, the representativeness of these 3D models was analyzed against standard Digital Terrain Models (DTMs) for determining water height levels during flood events. A one-dimensional hydraulic (1DH) model compared the 3D mesh approach with the traditional 2.5D perspective in a challenging, narrow canyon prone to flooding. Our results show that traditional 2.5D DTMs significantly over- or underestimate water levels in narrow sections, failing to account for overhangs and vertical wall irregularities, whereas high-resolution 3D meshes provide a more realistic representation of hydraulic behavior. This work demonstrates that multi-sensor data fusion is essential for accurate flood risk management and infrastructure planning in complex fluvial environments.
25 pages, 15267 KB  
Article
3D Semantic Map Reconstruction for Orchard Environments Using Multi-Sensor Fusion
by Quanchao Wang, Yiheng Chen, Jiaxiang Li, Yongxing Chen and Hongjun Wang
Agriculture 2026, 16(4), 455; https://doi.org/10.3390/agriculture16040455 - 15 Feb 2026
Viewed by 737
Abstract
Semantic point cloud maps play a pivotal role in smart agriculture. They provide not only core three-dimensional data for orchard management but also empower robots with environmental perception, enabling safer and more efficient navigation and planning. However, traditional point cloud maps primarily model surrounding obstacles from a geometric perspective, failing to capture distinctions and characteristics between individual obstacles. In contrast, semantic maps encompass semantic information and even topological relationships among objects in the environment. Furthermore, existing semantic map construction methods are predominantly vision-based, making them ill-suited to handle rapid lighting changes in agricultural settings that can cause positioning failures. Therefore, this paper proposes a positioning and semantic map reconstruction method tailored for orchards. It integrates visual, LiDAR, and inertial sensors to obtain high-precision pose and point cloud maps. By combining open-vocabulary detection and semantic segmentation models, it projects two-dimensional detected semantic information onto the three-dimensional point cloud, ultimately generating a point cloud map enriched with semantic information. The resulting 2D occupancy grid map is utilized for robotic motion planning. Experimental results demonstrate that on a custom dataset, the proposed method achieves 74.33% mIoU for semantic segmentation accuracy, 12.4% relative error for fruit recall rate, and 0.038803 m mean translation error for localization. The deployed semantic segmentation network Fast-SAM achieves a processing time of 13.36 ms per frame. These results demonstrate that the proposed method combines high accuracy with real-time performance in semantic map reconstruction. This exploratory work provides theoretical and technical references for future research on more precise localization and more complete semantic mapping, offering broad application prospects and providing key technological support for intelligent agriculture.
(This article belongs to the Special Issue Advances in Robotic Systems for Precision Orchard Operations)
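The projection of 2D semantic labels onto the point cloud is geometrically routine; a sketch, assuming known camera intrinsics K and a LiDAR-to-camera extrinsic transform (both names are ours):

    import numpy as np

    def label_points(points_lidar, seg_mask, K, T_cam_lidar):
        """Attach per-pixel class ids from a segmentation mask to LiDAR points.

        points_lidar: (N, 3) points in the LiDAR frame
        seg_mask:     (H, W) integer class ids from the 2D model
        Returns an (N,) label array; -1 marks points outside the image.
        """
        h, w = seg_mask.shape
        pts_h = np.c_[points_lidar, np.ones(len(points_lidar))]
        pc = (T_cam_lidar @ pts_h.T).T[:, :3]      # into the camera frame
        in_front = pc[:, 2] > 0
        uv = (K @ pc[in_front].T).T
        uv = (uv[:, :2] / uv[:, 2:3]).astype(int)
        ok = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
        labels = np.full(len(points_lidar), -1)
        idx = np.flatnonzero(in_front)[ok]
        labels[idx] = seg_mask[uv[ok, 1], uv[ok, 0]]
        return labels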
19 pages, 5725 KB  
Article
Real-Time 3D Scene Understanding for Road Safety: Depth Estimation and Object Detection for Autonomous Vehicle Awareness
by Marcel Simeonov, Andrei Kurdiumov and Milan Dado
Vehicles 2026, 8(2), 28; https://doi.org/10.3390/vehicles8020028 - 2 Feb 2026
Viewed by 857
Abstract
Accurate depth perception is vital for autonomous driving and roadside monitoring. Traditional stereo vision methods are cost-effective but often fail under challenging conditions such as low texture, reflections, or complex lighting. This work presents a perception pipeline built around FoundationStereo, a Transformer-based stereo depth estimation model. At low resolutions, FoundationStereo achieves real-time performance (up to 26 FPS) on embedded platforms like NVIDIA Jetson AGX Orin with TensorRT acceleration and power-of-two input sizes, enabling deployment in roadside cameras and in-vehicle systems. For Full HD stereo pairs, the same model delivers dense and precise environmental scans, complementing LiDAR while maintaining a high level of accuracy. YOLO11 object detection and segmentation are deployed in parallel for object extraction. Detected objects are removed from depth maps generated by FoundationStereo prior to point cloud generation, producing cleaner 3D reconstructions of the environment. This approach demonstrates that advanced stereo networks can operate efficiently on embedded hardware. Rather than replacing LiDAR or radar, it complements existing sensors by providing dense depth maps in situations where other sensors may be limited. By improving depth completeness and robustness, and by enabling filtered point clouds, the proposed system supports safer navigation, collision avoidance, and scalable roadside infrastructure scanning for autonomous mobility.
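The object-removal step ahead of point cloud generation reduces to masking the depth map; a minimal sketch (the helper name is ours), assuming downstream code treats zero depth as "no point":

    import numpy as np

    def remove_detected_objects(depth, instance_masks):
        """Zero out depth pixels covered by segmentation masks so the
        resulting point cloud contains only the static environment."""
        clean = depth.copy()
        for m in instance_masks:   # each m: (H, W) boolean instance mask
            clean[m] = 0.0
        return clean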
38 pages, 6725 KB  
Article
A BIM-Based Digital Twin Framework for Urban Roads: Integrating MMS and Municipal Geospatial Data for AI-Ready Urban Infrastructure Management
by Vittorio Scolamiero and Piero Boccardo
Sensors 2026, 26(3), 947; https://doi.org/10.3390/s26030947 - 2 Feb 2026
Viewed by 806
Abstract
Digital twins (DTs) are increasingly adopted to enhance the monitoring, management, and planning of urban infrastructure. While DT development for buildings is well established, applications to urban road networks remain limited, particularly in integrating heterogeneous geospatial datasets into semantically rich, multi-scale representations. This study presents a methodology for developing a BIM-based DT of urban roads by integrating geospatial data from Mobile Mapping System (MMS) surveys with semantic information from municipal geodatabases. The approach follows a multi-modal (point clouds, imagery, vector data), multi-scale, and multi-level framework, where ‘multi-level’ refers to modeling at different scopes, from a city-wide level offering a generalized representation of the entire road network, down to asset-level detail capturing parametric BIM elements for individual road segments or specific components such as road signs and road markers, lamp posts, and traffic lights. MMS-derived LiDAR point clouds allow accurate 3D reconstruction of road surfaces, curbs, and ancillary infrastructure, while municipal geodatabases enrich the model with thematic layers including pavement condition, road classification, and street furniture. The resulting DT framework supports multi-scale visualization, asset management, and predictive maintenance. By combining geometric precision with semantic richness, the proposed methodology delivers an interoperable and scalable framework for sustainable urban road management, providing a foundation for AI-ready applications such as automated defect detection, traffic simulation, and predictive maintenance planning. The resulting DT achieved a geometric accuracy of ±3 cm and integrated more than 45 km of urban road network, enabling multi-scale analyses and AI-ready data fusion.
(This article belongs to the Special Issue Intelligent Sensors and Artificial Intelligence in Building)
20 pages, 4015 KB  
Article
Adaptive Kalman Filter-Based SLAM in LiDAR-Degenerated Environments
by Ran Ma, Tao Zhou and Liang Chen
Sensors 2026, 26(3), 861; https://doi.org/10.3390/s26030861 - 28 Jan 2026
Viewed by 876
Abstract
Owing to their low cost, small size, and ease of installation, 2D LiDAR sensors have been widely used in mobile robots for simultaneous localization and mapping (SLAM). However, traditional 2D LiDAR SLAM methods have low robustness and accuracy in LiDAR-degenerated environments. To improve the robustness of SLAM in such environments, an innovative SLAM method is developed, comprising two parts: front-end positioning and back-end optimization. Specifically, in the front-end, an adaptive Kalman filter (AKF) is applied to estimate the pose of the mobile robot, the zero biases of the accelerometer and gyroscope, the lever arm length, and the mounting angle. The adaptive factor of the AKF dynamically adjusts the variances of the process and measurement noises based on the residual. In the back-end, a particle filter (PF) is employed to optimize the pose estimation and build the map, where a pose-domain constraint from the output of the front-end is introduced into the PF to avoid mismatches and enhance positioning accuracy. To verify the performance of the method, a series of experiments was carried out in four typical environments. The experimental results show that positioning precision improved by about 61.3–97.9%, 35.7–99.0%, and 43.8–93.0% compared to Karto SLAM, Hector SLAM, and Cartographer, respectively.
(This article belongs to the Section Navigation and Positioning)
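The residual-driven adaptation can be illustrated with the classic innovation-based covariance scaling; the generic form below is our sketch, not the paper's exact formulation:

    import numpy as np

    def adapt_measurement_noise(innovation, S_pred, R, alpha=0.95):
        """Inflate R when the observed residual exceeds what the filter predicts.

        innovation: measurement residual z - H @ x_pred
        S_pred:     predicted innovation covariance H @ P @ H.T + R
        alpha:      forgetting factor smoothing the adaptation over time
        """
        observed = np.outer(innovation, innovation)
        scale = max(1.0, np.trace(observed) / np.trace(S_pred))
        return alpha * R + (1.0 - alpha) * scale * R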
35 pages, 10558 KB  
Article
Cave of Altamira (Spain): UAV-Based SLAM Mapping, Digital Twin and Segmentation-Driven Crack Detection for Preventive Conservation in Paleolithic Rock-Art Environments
by Jorge Angás, Manuel Bea, Carlos Valladares, Cristian Iranzo, Gonzalo Ruiz, Pilar Fatás, Carmen de las Heras, Miguel Ángel Sánchez-Carro, Viola Bruschi, Alfredo Prada and Lucía M. Díaz-González
Drones 2026, 10(1), 73; https://doi.org/10.3390/drones10010073 - 22 Jan 2026
Cited by 1 | Viewed by 943
Abstract
The Cave of Altamira (Spain), a UNESCO World Heritage site, contains one of the most fragile and inaccessible Paleolithic rock-art environments in Europe, where geomatics documentation is constrained not only by severe spatial, lighting and safety limitations but also by conservation-driven restrictions on time, access and operational procedures. This study applies a confined-space UAV equipped with LiDAR-based SLAM navigation to document and assess the stability of the vertical rock wall leading to “La Hoya” Hall, a structurally sensitive sector of the cave. Twelve autonomous and assisted flights were conducted, generating dense LiDAR point clouds and video sequences processed through videogrammetry to produce high-resolution 3D meshes. A Mask R-CNN deep learning model was trained on manually segmented images to explore automated crack detection under variable illumination and viewing conditions. The results reveal active fractures, overhanging blocks and sediment accumulations located on inaccessible ledges, demonstrating the capacity of UAV-SLAM workflows to overcome the limitations of traditional surveys in confined subterranean environments. All datasets were integrated into the DiGHER digital twin platform, enabling traceable storage, multitemporal comparison, and collaborative annotation. Overall, the study demonstrates the feasibility of combining UAV-based SLAM mapping, videogrammetry and deep learning segmentation as a reproducible baseline workflow to inform preventive conservation and future multitemporal monitoring in Paleolithic caves and similarly constrained cultural heritage contexts.
(This article belongs to the Topic 3D Documentation of Natural and Cultural Heritage)