Search Results (793)

Search Parameters:
Keywords = simultaneous localization and mapping (SLAM)

62 pages, 10380 KB  
Review
Semantic SLAM with Multi-Modal Perception: Survey on Robust Long-Term Localization for Autonomous Vehicles
by Álvaro Navarro-Pérez, Bladimir Bacca-Cortés and Eduardo Caicedo-Bravo
Robotics 2026, 15(5), 88; https://doi.org/10.3390/robotics15050088 - 28 Apr 2026
Abstract
Long-term localization in dynamic and changing environments remains a key challenge for autonomous vehicles. Semantic Simultaneous Localization and Mapping (SLAM) enhances traditional SLAM by integrating high-level semantic understanding, enabling robust mapping and localization even under complex scenarios. In this context, multi-modal sensor fusion—particularly the combination of LiDAR and camera data—has proven essential in leveraging complementary strengths: the geometric accuracy of LiDAR and the rich semantic cues from images. A significant advancement in this domain is the adoption of graph-based semantic localization frameworks, where semantic entities and spatial relationships are encoded in graph structures to improve map consistency, loop closure detection, and data association over time. This review presents a comprehensive survey of recent developments in Semantic SLAM, with a focus on long-term localization for autonomous vehicles using multi-modal fusion strategies. We categorize existing methods into traditional SLAM, vision-based, point-cloud-based, and graph-based techniques, emphasizing the role of semantic data association and loop closure in maintaining long-term consistency. Additionally, we discuss the integration of deep learning techniques for semantic segmentation and feature extraction. Finally, we analyze widely used datasets and evaluation metrics, identifying current limitations and proposing directions for future research on robust, scalable, and semantically enriched localization. Full article

27 pages, 2173 KB  
Article
Efficient Incremental SLAM via Information-Guided Gating and Selective Partial Optimization
by Reza Arablouei
Robotics 2026, 15(5), 87; https://doi.org/10.3390/robotics15050087 - 27 Apr 2026
Abstract
We present an efficient incremental SLAM back-end that reduces computation while preserving accuracy close to that of a full incremental Gauss–Newton (GN) solver across benchmark pose-graph datasets. The method combines information-guided gating (IGG), which uses a log-determinant-based information surrogate to decide when broad updates are warranted, with selective partial optimization (SPO), which confines multi-iteration GN updates to variables that remain affected after each iteration. We provide a local perturbation analysis, showing that, under standard regularity conditions, the proposed approximation tracks full GN within a threshold-controlled neighborhood and recovers the same local minimizer and asymptotic convergence rate when the effective approximation error vanishes asymptotically. Experiments on benchmark pose-graph SLAM datasets show competitive final and increment-averaged accuracy together with substantial reductions in update and solve FLOPs. These results support IGG-SPO as a practically promising SLAM back-end for robots operating under limited onboard computational resources. Full article
(This article belongs to the Special Issue State of the Art in Mobile Robot Localization)
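The log-determinant gate at the heart of IGG can be illustrated with a small sketch: broad updates are requested only when an information surrogate (here, the log-determinant of the approximate Hessian) grows by more than a threshold. This is not the paper's implementation; the function names, the toy 2x2 matrices, and the threshold value are illustrative assumptions.

```python
import numpy as np

def logdet_information(H):
    """Log-determinant of an information (Hessian) matrix via Cholesky,
    which is numerically stabler than np.linalg.det for large matrices."""
    L = np.linalg.cholesky(H)
    return 2.0 * np.sum(np.log(np.diag(L)))

def should_broad_update(H_prev, H_new, threshold=0.5):
    """Gate: trigger a broad (multi-variable) Gauss-Newton update only when
    the information-gain surrogate exceeds the threshold."""
    gain = logdet_information(H_new) - logdet_information(H_prev)
    return gain > threshold

# Toy 2x2 information matrices: a new measurement adds information on one axis.
H_prev = np.eye(2)
H_new = H_prev + np.array([[1.0, 0.0], [0.0, 0.0]])
print(should_broad_update(H_prev, H_new, threshold=0.5))  # True (gain = ln 2 ≈ 0.69)
```

In a real back-end the surrogate would be maintained incrementally rather than recomputed from scratch; the sketch only shows the gating logic.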

37 pages, 4727 KB  
Article
UWB-Assisted Intelligent Light-Band Navigation System for Driverless Mining Vehicles: A Case Study in Underground Mines
by Junhong Liu, Xiaoquan Li and Chenglin Yin
Eng 2026, 7(5), 195; https://doi.org/10.3390/eng7050195 - 26 Apr 2026
Abstract
Autonomous driving in underground mines faces significant challenges due to Global Navigation Satellite System (GNSS) denial and harsh environmental conditions. Mainstream multi-sensor fusion and Simultaneous Localization and Mapping (SLAM) schemes have achieved substantial progress in underground navigation, but their deployment in feature-sparse tunnels may still face challenges related to computational burden and perception robustness. This study explores an infrastructure-assisted navigation architecture that transforms the roadway into a structured luminous guidance channel by deploying programmable Light Emitting Diode (LED) strips along the tunnel roof. The proposed system simplifies complex three-dimensional pose estimation into a two-dimensional visual servoing task targeting optical signals. Central to this approach is a robust data fusion strategy that utilizes a topology matching algorithm to map noisy Ultra-Wide-band (UWB) coordinates onto a discrete LED index space, thereby providing a reliable global positioning reference. Furthermore, a hierarchical fault-tolerant controller based on a Finite State Machine (FSM) is designed to facilitate seamless degradation to a UWB-assisted ultrasonic wall-following mode in the event of visual degradation, supporting fault-tolerant operation under controlled laboratory conditions. Experimental results in a laboratory simulation environment demonstrate that the system achieves millimeter-level static initialization accuracy, a dynamic tracking Root Mean Square Error of approximately 4 cm, and a 100% autonomous recovery rate from visual failures in straight tunnels. These results demonstrate the feasibility of the proposed infrastructure-assisted route under controlled laboratory conditions and suggest its potential as an engineering reference for structured underground transport scenarios with acceptable infrastructure modification. Full article
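The topology-matching idea of mapping a noisy UWB fix onto a discrete LED index can be sketched, for a straight tunnel, as a projection onto the tunnel axis followed by snapping to the nearest LED. All names and parameters below (spacing, axis, clamping) are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def uwb_to_led_index(uwb_xy, axis_origin, axis_dir, led_spacing, n_leds):
    """Map a noisy UWB fix to a discrete LED index by projecting onto the
    tunnel axis and snapping to the nearest LED position."""
    axis_dir = np.asarray(axis_dir, float)
    axis_dir = axis_dir / np.linalg.norm(axis_dir)
    # Arc length of the fix along the tunnel axis.
    s = np.dot(np.asarray(uwb_xy, float) - np.asarray(axis_origin, float), axis_dir)
    idx = int(round(s / led_spacing))
    return min(max(idx, 0), n_leds - 1)  # clamp to the valid index range

# A fix 5.2 m down a straight tunnel with LEDs every 0.5 m snaps to index 10.
print(uwb_to_led_index((5.2, 0.3), (0.0, 0.0), (1.0, 0.0), 0.5, 100))  # 10
```

The snapping step is what absorbs UWB noise: any fix within half an LED spacing of the true position maps to the correct index.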
28 pages, 3354 KB  
Article
Loop Closure with 3D Gaussian Splatting for Dynamic SLAM
by Zhanwu Ma, Wansheng Cheng and Song Fan
Sensors 2026, 26(9), 2669; https://doi.org/10.3390/s26092669 - 25 Apr 2026
Abstract
Robust pose estimation and high-fidelity scene reconstruction in dynamic environments represent core challenges in the field of Visual Simultaneous Localization and Mapping (SLAM). Although 3D Gaussian Splatting (3DGS)-based techniques have demonstrated significant potential, existing methods typically assume static scenes and struggle to address the inconsistency between photometric and geometric observations in dynamic settings, leading to a notable degradation in pose estimation and map accuracy. To address these issues, this paper presents a novel dynamic SLAM method: Loop Closure with 3D Gaussian Splatting for Dynamic SLAM (LCD-Splat). Taking RGB-D images as input, LCD-Splat integrates Mask R-CNN with an improved multi-view geometry approach to detect dynamic objects, generating static scene maps and filling in occluded backgrounds. By leveraging 3DGS submaps and a frame-to-model tracking strategy, LCD-Splat achieves dense map construction. The method initiates online loop closure detection and employs a novel coarse-to-fine 3DGS registration algorithm to compute loop closure constraints between submaps. Global consistency is ultimately ensured through robust pose graph optimization. Experimental results on real-world datasets such as TUM RGB-D and Bonn demonstrate that LCD-Splat outperforms existing state-of-the-art SLAM methods in terms of tracking, scene reconstruction, and rendering performance. This approach provides novel insights for high-precision SLAM in dynamic environments and holds significant implications for scene understanding in complex settings. Full article
29 pages, 16631 KB  
Article
Stretch-ICP: A Continuous-Trajectory Registration and Deskewing Algorithm in Scenarios of Aggressive Motions
by Simon-Pierre Deschênes, Veronica Vannini, Philippe Giguère and François Pomerleau
Sensors 2026, 26(8), 2567; https://doi.org/10.3390/s26082567 - 21 Apr 2026
Abstract
Robust robotic autonomy remains challenging in complex environments, where loss of stability on uneven or slippery terrain can induce extreme accelerations and angular velocities. Such motions corrupt sensor measurements and degrade state estimation, motivating the need for improved algorithmic robustness. To investigate this issue, we introduce the Tumbling-Induced Gyroscope Saturation (TIGS) dataset, which consists of recordings from a mechanical lidar and an Inertial Measurement Unit (IMU) tumbling down a hill. The dataset contains angular speeds up to four times higher than those in similar datasets and is publicly available. We then propose two complementary methods to improve Simultaneous Localization And Mapping (SLAM) robustness and evaluate them on TIGS. First, Saturation-Aware Angular Velocity Estimation (SAAVE) estimates angular velocities when gyroscope measurements become saturated during aggressive motions, reducing angular speed estimation error by 83.4%. Second, Stretch-ICP, a novel registration and deskewing algorithm, enables reconstruction of smoother 6-Degrees Of Freedom (DOF) trajectories under aggressive motions compared to classical Iterative Closest Point (ICP). Stretch-ICP reduces linear and angular velocity errors by 95.2% and 94.8%, respectively, at scan boundaries. Together, these contributions improve the robustness and consistency of lidar-inertial state estimation under aggressive motions. Full article
(This article belongs to the Special Issue New Challenges and Sensor Techniques in Robot Positioning)
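For context, the classical constant-velocity deskewing that Stretch-ICP generalizes can be sketched in the planar case: each point, captured at a normalized time within the scan, is transformed by a pose interpolated between the scan-start and scan-end poses. This is a simplified 3-DOF illustration under assumed names, not the paper's 6-DOF method.

```python
import numpy as np

def deskew_points_2d(points, times, T0, T1):
    """Constant-velocity deskew (planar sketch): each point captured at
    normalized time t in [0, 1] is expressed in the world frame using the
    sensor pose linearly interpolated between scan start T0 and end T1.
    Poses are (x, y, yaw) tuples."""
    x0, y0, a0 = T0
    x1, y1, a1 = T1
    out = np.empty_like(points, dtype=float)
    for i, (p, t) in enumerate(zip(points, times)):
        # Pose interpolated at the point's capture time.
        a = a0 + t * (a1 - a0)
        tx = x0 + t * (x1 - x0)
        ty = y0 + t * (y1 - y0)
        c, s = np.cos(a), np.sin(a)
        # Rotate then translate the point into the world frame.
        out[i] = [c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty]
    return out
```

Under the aggressive motions targeted by the TIGS dataset, the constant-velocity assumption inside a single scan breaks down, which is precisely what a continuous-trajectory formulation addresses.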

21 pages, 23093 KB  
Article
Keyframe-Guided Crack Segmentation and 3D Localization for UAV-Based Monocular Inspection
by Feifei Tang, Wuyuntana Gongzhabayier, Jing Li, Tao Zhou, Yue Qiu, Yong Zhan and Qiulin Song
Symmetry 2026, 18(4), 657; https://doi.org/10.3390/sym18040657 - 15 Apr 2026
Abstract
In unmanned aerial vehicle (UAV)-based monocular inspection, cracks typically present as geometrically asymmetric, elongated, low-contrast weak targets, making accurate segmentation and spatial localization challenging. Existing methods are susceptible to missed detections and false positives when handling slender cracks, and monocular 3D reconstruction for localization is often burdened by redundant frames, resulting in limited modeling efficiency. To mitigate these issues, we propose a high-precision framework for crack segmentation and spatial localization from UAV imagery. First, Oriented FAST and Rotated BRIEF–Simultaneous Localization and Mapping, version 3 (ORB-SLAM3) is adopted for keyframe selection to suppress data redundancy and improve reconstruction stability. Second, we develop an enhanced YOLOv11-seg model by integrating the Dilation-wise Residual Segmentation (DWRSeg) module, the Weighted IoU (WIoU) loss, and the Lightweight shared convolutional separator batch-normalization detection head (LSCSBD) to strengthen feature discrimination and segmentation robustness for slender cracks, yielding high-quality crack masks. Finally, the predicted masks are projected onto the reconstructed 3D surface to obtain precise spatial localization. Our experimental results demonstrate that the proposed approach improves the segmentation mAP@50 by 7.2% over the baseline while reducing computational complexity from 10.2 to 9.8 GFLOPs. In addition, keyframe-based processing reduces the 3D modeling time by 59.4% compared to that with full-frame reconstruction. Overall, the proposed framework jointly enhances crack segmentation accuracy and substantially accelerates 3D modeling and localization, providing an effective solution for efficient UAV-based crack inspection. Full article
(This article belongs to the Special Issue Symmetry/Asymmetry in Intelligent Transportation)
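Projecting a predicted crack mask into 3D rests on pinhole back-projection. A minimal sketch, assuming a per-pixel depth map and known intrinsics K; the paper projects onto a reconstructed 3D surface instead, and all names here are hypothetical.

```python
import numpy as np

def backproject_mask(mask, depth, K):
    """Lift mask pixels to 3D camera-frame points with the pinhole model:
    X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy, Z = depth[v, u]."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    vs, us = np.nonzero(mask)          # pixel coordinates inside the mask
    zs = depth[vs, us]
    xs = (us - cx) * zs / fx
    ys = (vs - cy) * zs / fy
    return np.stack([xs, ys, zs], axis=1)  # (N, 3) points

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
mask = np.zeros((480, 640), bool)
mask[240, 320] = True                  # single pixel at the principal point
depth = np.full((480, 640), 2.0)
print(backproject_mask(mask, depth, K))  # [[0. 0. 2.]]
```

A pixel at the principal point back-projects onto the optical axis, which makes the example easy to verify by hand.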

23 pages, 3484 KB  
Article
IFA-ICP: A Low-Complexity and Image Feature-Assisted Iterative Closest Point (ICP) Scheme for Odometry Estimation in SLAM, and Its FPGA-Based Hardware Accelerator Design
by Jia-En Li and Yin-Tsung Hwang
Sensors 2026, 26(8), 2326; https://doi.org/10.3390/s26082326 - 9 Apr 2026
Abstract
Odometry estimation, which calculates the trajectory of a moving object across timeframes, is a critical and time-consuming function in SLAM (Simultaneous Localization and Mapping) systems. Although LiDAR-based sensing is most popular for outdoor and long-range applications because of its ranging accuracy, the sparsity of the laser point cloud poses a significant challenge to feature extraction and matching in odometry estimation. In this paper, we investigate odometry estimation from two aspects, i.e., algorithm optimization and system design/implementation. In algorithm optimization, we present an image feature-assisted odometry estimation scheme that leverages the richness of image information captured by a companion camera to enhance the accuracy of laser point cloud matching. This also serves as a screening mechanism to reduce the matching size and lower the computing complexity for a higher estimation rate. In addition, various schemes, such as adaptive thresholding in image feature point selection, principal component analysis (PCA)-based plane fitting for laser point interpolation, and Gauss–Newton optimization for calculating the transform matrix, are also employed to improve the accuracy of odometry estimation. The performance of the improved odometry estimation is verified using an existing FLOAM (Fast Lidar Odometry and Mapping) framework. The KITTI dataset for autonomous vehicles with ground truth was used as the test bench. Simulation results indicate that the translation error and rotation error can be reduced by 16.6% and 1.3%, respectively. Computing complexity, measured as the software execution time, was also reduced by 63%. In system implementation, a hardware/software (HW/SW) co-design strategy was adopted, where complexity profiling was first conducted to determine the task partitioning, and time-consuming tasks were offloaded to a hardware accelerator. This facilitates real-time execution on a resource-constrained embedded platform consisting of a microprocessor module (Raspberry Pi) and an attached FPGA board (Pynq Z2). Efficient hardware designs for customized DSP functions (adaptive thresholding and PCA) were developed in an FPGA capable of completing one data frame in 20 ms. The final system implementation met the target throughput of 10 estimations per second and can be scaled up further. Full article
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)
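PCA-based plane fitting, one of the schemes named in the abstract above, can be sketched generically: the plane normal is the eigenvector of the local point covariance with the smallest eigenvalue, and sparse returns are interpolated by projecting onto that plane. A generic sketch, not the authors' code; function names are illustrative.

```python
import numpy as np

def fit_plane_pca(points):
    """PCA plane fit: the normal is the eigenvector of the point covariance
    with the smallest eigenvalue (direction of least variance); the plane
    passes through the centroid."""
    centroid = points.mean(axis=0)
    cov = np.cov((points - centroid).T)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    normal = eigvecs[:, 0]
    return centroid, normal

def interpolate_on_plane(query, centroid, normal):
    """Project a query point onto the fitted plane, a simple way to densify
    sparse LiDAR returns under a locally planar assumption."""
    return query - np.dot(query - centroid, normal) * normal
```

For well-spread planar neighborhoods the smallest eigenvalue is near zero and the fit is stable; degenerate (near-collinear) neighborhoods would need an explicit check.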

20 pages, 13040 KB  
Article
SLAM Mobile Mapping for Complex Archaeological Environments: Integrated Above–Below-Ground Surveying
by Gabriele Bitelli, Anna Forte and Emanuele Mandanici
Geomatics 2026, 6(2), 31; https://doi.org/10.3390/geomatics6020031 - 26 Mar 2026
Abstract
Archaeological sites characterized by the coexistence of extensive above-ground terrain and hypogeum structures present major challenges for accurate and comprehensive geospatial documentation. Conventional survey approaches—such as static terrestrial laser scanning (TLS), total-station measurements, and aerial photogrammetry—often suffer from operational constraints, particularly in the presence of narrow underground spaces, low or absent illumination, harsh environmental conditions, and restrictions on UAV deployment. Additional complexity arises when both surface and subterranean elements must be consistently georeferenced to a common global reference system, especially where establishing a traditional topographic–geodetic control network is impractical. Within the framework of the EIMAWA Egyptian–Italian Mission conducted by the University of Milano since 2018, the Geomatics group of the University of Bologna designed and implemented a multi-scale, multi-technique 3D documentation workflow, with a prominent role assumed by Simultaneous Localization and Mapping (SLAM) mobile laser scanning. The approach was supported by GNSS measurements providing centimetric accuracy. SLAM was employed to document both the surface necropolis and multiple hypogeal tombs, enabling rapid acquisition of dense three-dimensional data in environments where traditional techniques are limited. All datasets were integrated within a unified reference system, resulting in a coherent, multi-layered spatial dataset representing both landscape and underground spaces. The results demonstrate that SLAM can produce dense point clouds that document both above- and below-ground contexts continuously and with few-centimetre-level accuracy. Quantitative analyses of the co-registration and mutual alignment of multiple SLAM datasets confirm a high degree of internal consistency, further enhanced through post-processing refinement. Overall, the experience indicates that this solution represents a practical and reliable technique for complex archaeological surveying. Full article

22 pages, 26802 KB  
Article
Attention-Guided Semantic Segmentation and Scan-to-Model Geometric Reconstruction of Underground Tunnels from Mobile Laser Scanning
by Yingjia Huang, Jiang Ye, Xiaohui Li and Jingliang Du
Appl. Sci. 2026, 16(6), 3042; https://doi.org/10.3390/app16063042 - 21 Mar 2026
Abstract
Mobile Laser Scanning (MLS) integrated with Simultaneous Localization and Mapping (SLAM) has emerged as a key technology for digitizing GNSS-denied environments, such as underground mines. However, the automated interpretation of unstructured, high-density point clouds into semantic engineering models remains challenging due to extreme geometric anisotropy in point distributions and severe class imbalance inherent to narrow tunnel environments. To address these issues, this study proposes a highly automated scan-to-model framework for precise semantic segmentation and vectorized two-dimensional (2D) profile reconstruction. First, an enhanced hierarchical deep learning network tailored for point clouds is introduced. The architecture incorporates a context-aware sampling strategy with an expanded receptive field of up to 10 m to preserve axial continuity, coupled with a spatial–geometric dual-attention mechanism to refine boundary delineation. In addition, a composite Focal–Dice loss function is employed to alleviate the dominance of wall points during network training. Experimental validation on a field-collected dataset comprising 16 mine tunnels demonstrates that the proposed model achieves a mean Intersection over Union (mIoU) of 85.15% (±0.29%) and an Overall Accuracy (OA) of 95.13% (±0.13%). Building on this semantic foundation, a robust geometric modeling pipeline is established using curvature-guided filtering and density-adaptive B-spline fitting. The reconstructed profiles accurately recover the geometric mean surface of the tunnel wall, yielding an overall filtered Root Mean Square Error (RMSE) of 4.96 ± 0.48 cm. The proposed framework provides an efficient end-to-end solution for deformation analysis and digital twinning of underground mining infrastructure. Full article
(This article belongs to the Special Issue Artificial Intelligence Applications in Underground Space Technology)
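The composite Focal–Dice loss mentioned above combines a focal term, which down-weights easy examples, with a Dice term, which is insensitive to class imbalance. A minimal NumPy sketch with an assumed mixing weight alpha; the paper's exact weighting and per-class handling are not specified here.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), robust to class imbalance."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-6):
    """Binary focal loss: down-weights easy examples by (1 - p_t)^gamma."""
    pred = np.clip(pred, eps, 1.0 - eps)
    pt = np.where(target == 1, pred, 1.0 - pred)
    return float(np.mean(-((1.0 - pt) ** gamma) * np.log(pt)))

def focal_dice_loss(pred, target, alpha=0.5):
    """Composite loss: convex combination of the focal and Dice terms
    (alpha is an assumed mixing weight, not the paper's value)."""
    return alpha * focal_loss(pred, target) + (1.0 - alpha) * dice_loss(pred, target)
```

A perfect prediction drives both terms toward zero, while thin, rare structures such as cracks mainly benefit from the Dice term's overlap normalization.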

19 pages, 894 KB  
Review
Indoor Mapping as a Spatiotemporal Framework for Mitigating Greenhouse Gas Emissions in Buildings: A Review
by Vinuri Nilanika Goonetilleke, Muditha K. Heenkenda and Kamil Zaniewski
Geomatics 2026, 6(2), 27; https://doi.org/10.3390/geomatics6020027 - 19 Mar 2026
Abstract
Climate change is a critical global challenge, and the building sector accounts for nearly 30% of global greenhouse gas (GHG) emissions, remaining a key target for mitigation. Indoor environments contribute significantly to GHG emissions, primarily through heating, cooling, lighting, and occupant-driven energy use. Indoor mapping, serving as the foundation for Digital Twins (DTs), provides a spatiotemporal framework that integrates sensor data with Building Information Modelling (BIM), Geographic Information Systems (GIS), and Internet of Things (IoT) to support energy-efficient, low-carbon building operations. This review examined the role of indoor mapping in understanding, modelling, and reducing GHG emissions in buildings. It synthesized current advancements in indoor spatial data acquisition, ranging from Light Detection And Ranging (LiDAR) and Simultaneous Localization and Mapping (SLAM) to deep learning-based floor plan extraction, and evaluated their contribution to improved indoor environmental analysis. The review highlighted emerging techniques, challenges, and gaps, particularly the limited integration of physical indoor spaces with virtual layers representing assets, occupants, and equipment. Addressing this gap requires embedding spatial modelling as an intermediate analytical layer that structures and contextualizes sensor data to support spatiotemporal decision-making. Overall, this review demonstrated that indoor mapping plays a critical role in transforming spatial information into actionable insights, enabling more accurate energy modelling, enhanced real-time building management, and stronger data-driven strategies for GHG mitigation in the built environment. Full article

23 pages, 6668 KB  
Article
Development of a Visual SLAM-Based Autonomous UAV System for Greenhouse Plant Monitoring
by Jing-Heng Lin and Ta-Te Lin
Drones 2026, 10(3), 205; https://doi.org/10.3390/drones10030205 - 15 Mar 2026
Abstract
Autonomous monitoring is essential for precision agriculture in greenhouses, yet deploying unmanned aerial vehicles (UAVs) in confined, GPS-denied environments remains limited by payload, power, and cost constraints. This study developed and validated an autonomous UAV system for reliable, low-cost operation in such conditions. The proposed system employs a dual-link edge-computing architecture: a lightweight onboard controller handles flight control and sensor acquisition, while visual simultaneous localization and mapping (V-SLAM) is offloaded to an edge computer via the FPV video link. Phenotyping (flower detection and tracking/counting) is performed offline from the side-view RGB stream and does not participate in the flight control loop. Using muskmelon (Cucumis melo L.) flower development as a case study, the UAV autonomously executed daily missions for 27 days in a commercial greenhouse, performing flower detection and tracking to monitor phenological dynamics. Localization and control accuracy were evaluated against a validated UWB reference system, achieving 5.4~8.0 cm 2D RMSE for trajectory tracking and 12.7 cm translation RMSE for greenhouse mapping. This work demonstrates a practical architecture for autonomous monitoring in GPS-denied agricultural environments, with operational boundaries characterized through the sustained field deployment. The system’s design principles may extend to other indoor or communication-limited scenarios requiring lightweight, intelligent robotic operation. Full article
(This article belongs to the Section Drones in Agriculture and Forestry)
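The reported 2D RMSE against the UWB reference is the usual root-mean-square of point-to-point distances over time-aligned tracks. A minimal sketch with synthetic data; the function name and the toy trajectories are illustrative, not the paper's evaluation code.

```python
import numpy as np

def trajectory_rmse_2d(est, ref):
    """2D RMSE between time-aligned estimated and reference (e.g. UWB) tracks:
    the square root of the mean squared point-to-point distance."""
    d2 = np.sum((np.asarray(est, float) - np.asarray(ref, float)) ** 2, axis=1)
    return float(np.sqrt(d2.mean()))

ref = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
est = ref + np.array([[0.0, 0.05], [0.0, -0.05], [0.0, 0.05]])  # 5 cm lateral error
print(trajectory_rmse_2d(est, ref))  # 0.05
```

Time alignment between the two tracks (interpolating the reference to the estimate's timestamps) must be done before this computation; the sketch assumes it already holds.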

17 pages, 1708 KB  
Article
Robust Visual–Inertial SLAM and Biomass Assessment for AUVs in Marine Ranching
by Yangyang Wang, Ziyu Liu, Tianzhu Gao and Xijun Du
Symmetry 2026, 18(3), 495; https://doi.org/10.3390/sym18030495 - 13 Mar 2026
Abstract
Environmental perception is a cornerstone for autonomous underwater vehicles (AUVs) to achieve robust self-localization and scene understanding, which are pivotal for the intelligent management of marine ranching. However, underwater image degradation and weak-textured scenes significantly hinder reliable self-localization and fine-grained environmental perception. To address the perceptual asymmetry arising from these challenges, this paper proposes a robust visual–inertial simultaneous localization and mapping (SLAM) and biomass assessment scheme for marine ranching. Specifically, we first propose a robust tightly coupled underwater visual–inertial localization scheme, which leverages a multi-sensor fusion strategy to solve the image degradation problem of localization in complex underwater environments. Furthermore, we propose a novel underwater scene perception method, which enables the simultaneous visual reconstruction of aquaculture species and the quantitative mapping of their spatial distribution in marine ranching. Finally, we develop a low-cost, agile, and portable multisensor-integrated system that consolidates autonomous localization and aquaculture biomass assessment modules, with its performance validated through extensive real-world underwater experiments. The experimental results demonstrate that the proposed methods can effectively overcome the interference of complex underwater environments and provide high-precision perception support for both AUV state estimation and aquaculture asset management. Full article
(This article belongs to the Special Issue Symmetry in Next-Generation Intelligent Information Technologies)

20 pages, 24767 KB  
Article
VINA-SLAM: A Voxel-Based Inertial and Normal-Aligned LiDAR–IMU SLAM
by Ruyang Zhang and Bingyu Sun
Sensors 2026, 26(6), 1810; https://doi.org/10.3390/s26061810 - 13 Mar 2026
Abstract
Environments with sparse or repetitive geometric structures, such as long corridors and narrow stairwells, remain challenging for LiDAR–inertial simultaneous localization and mapping (LiDAR–IMU SLAM) due to insufficient geometric observability and unreliable data associations. To address these issues, we propose VINA-SLAM, a novel LiDAR–IMU SLAM framework that constructs a unified global voxel map to explicitly exploit structural consistency. VINA-SLAM continuously tracks surface normals stored in the global voxel map using a normal-guided correspondence strategy, enabling stable scan-to-map alignment in degenerate scenes. Furthermore, a tangent-space metric is introduced to supplement missing rotational constraints around planar regions, providing reliable initial pose estimates for local optimization. A tightly coupled sliding-window bundle adjustment is then formulated by jointly incorporating IMU factors, voxel normal consistency factors, and planar regularization terms. In particular, the minimum eigenvalue of each voxel’s covariance is used as a statistically principled planar constraint, improving the Hessian conditioning and cross-view geometric consistency. The proposed system directly aligns raw LiDAR scans to the voxelized map without explicit feature extraction or loop closure. Experiments on 25 sequences from the HILTI and MARS-LVIG datasets show that VINA-SLAM reduces ATE by 25–40% on average while maintaining real-time performance at 10 Hz in the evaluated geometrically degenerate environments. Full article
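The minimum-eigenvalue planar constraint mentioned above rests on a standard observation: for the points in a voxel, the smallest eigenvalue of the covariance measures the cloud's thickness normal to the best-fit plane, and the corresponding eigenvector is the surface normal used for scan-to-map residuals. A generic sketch, not the VINA-SLAM implementation; names are hypothetical.

```python
import numpy as np

def voxel_planarity(points):
    """Planarity statistic for a voxel: eigen-decompose the point covariance.
    The minimum eigenvalue is the thickness normal to the best-fit plane;
    its eigenvector is the voxel's surface normal."""
    cov = np.cov((points - points.mean(axis=0)).T)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues ascending
    return eigvals[0], eigvecs[:, 0]             # (thickness, normal)

def point_to_plane_residual(p, centroid, normal):
    """Signed scan-to-map residual of a point along the voxel normal."""
    return float(np.dot(p - centroid, normal))
```

A voxel is typically accepted as planar when the thickness falls below a threshold; the thickness itself can then weight the constraint, which is one reading of the "statistically principled" formulation above.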

30 pages, 3812 KB  
Review
Video-Based 3D Reconstruction: A Review of Photogrammetry and Visual SLAM Approaches
by Ali Javadi Moghadam, Abbas Kiani, Reza Naeimaei, Shirin Malihi and Ioannis Brilakis
J. Imaging 2026, 12(3), 128; https://doi.org/10.3390/jimaging12030128 - 13 Mar 2026
Abstract
Three-dimensional (3D) reconstruction using images is one of the most significant topics in computer vision and photogrammetry, with wide-ranging applications in robotics, augmented reality, and mapping. This study investigates methods of 3D reconstruction using video (especially monocular video) data and focuses on techniques such as Structure from Motion (SfM), Multi-View Stereo (MVS), Visual Simultaneous Localization and Mapping (V-SLAM), and videogrammetry. Based on a statistical analysis of SCOPUS records, these methods collectively account for approximately 6863 journal publications up to the end of 2024. Among these, about 80 studies are analyzed in greater detail to identify trends and advancements in the field. The study also shows that the use of video data for real-time 3D reconstruction is commonly addressed through two main approaches: photogrammetry-based methods, which rely on precise geometric principles and offer high accuracy at the cost of greater computational demand; and V-SLAM methods, which emphasize real-time processing and provide higher speed. Furthermore, the application of IMU data and other indicators, such as color quality and keypoint detection, for selecting suitable frames for 3D reconstruction is investigated. Overall, this study compiles and categorizes video-based reconstruction methods, emphasizing the critical step of keyframe extraction. By summarizing and illustrating the general approaches, the study aims to clarify and facilitate the entry path for researchers interested in this area. Finally, the paper offers targeted recommendations for improving keyframe extraction methods to enhance the accuracy and efficiency of real-time video-based 3D reconstruction, while also outlining future research directions in addressing challenges like dynamic scenes, reducing computational costs, and integrating advanced learning-based techniques. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)
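The keyframe-extraction step this review emphasizes can be sketched with one of the quality indicators it mentions: image sharpness. The heuristic below is a generic illustration, not taken from the paper; the function names, the variance-of-Laplacian score, and the minimum-gap rule are all common but assumed choices:

```python
import numpy as np

def laplacian_variance(gray):
    """Sharpness score: variance of a 5-point discrete Laplacian.

    Blurred frames have weak local intensity changes, so their
    Laplacian response (and its variance) is low.
    """
    lap = (-4.0 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return lap.var()

def select_keyframes(frames, sharpness_min, min_gap):
    """Keep frames that are sharp enough and at least min_gap frames apart."""
    keep, last = [], -min_gap
    for i, frame in enumerate(frames):
        if i - last >= min_gap and laplacian_variance(frame) >= sharpness_min:
            keep.append(i)
            last = i
    return keep

# Toy frames: a high-contrast checkerboard stands in for a sharp frame,
# a constant image for a blurred one.
checker = (np.indices((16, 16)).sum(axis=0) % 2).astype(float)
blurry = np.full((16, 16), 0.5)
frames = [checker, blurry, checker, checker, blurry]
selected = select_keyframes(frames, sharpness_min=1.0, min_gap=2)  # [0, 2]
```

Real pipelines typically combine such a sharpness gate with geometric criteria (parallax, feature overlap) before handing keyframes to SfM or V-SLAM.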

36 pages, 15804 KB  
Article
An RGB-D SLAM Algorithm Based on a Multi-Layer Refraction Model for Underwater Scenarios
by Xianshuai Sun, Yabiao Wang, Yuming Zhao, Zhigang Li, Zhen He and Xiaohui Wang
J. Mar. Sci. Eng. 2026, 14(5), 485; https://doi.org/10.3390/jmse14050485 - 3 Mar 2026
Abstract
The use of depth cameras in low-texture environments is crucial for ensuring the feasibility of visual simultaneous localization and mapping (SLAM) algorithms. Nevertheless, in underwater scenarios, light propagation through multi-layered media gives rise to refractive distortion. Directly utilizing distorted images acquired by depth cameras for visual SLAM computations inevitably introduces substantial errors in localization and mapping. Additionally, the waterproof glass mounted in front of the depth camera renders traditional air-based camera calibration ineffective, thereby introducing calibration inaccuracies. To mitigate these challenges, we propose a comprehensive SLAM algorithm framework for underwater multi-layered media refraction correction based on RGB-D cameras. Firstly, a multi-layer refraction calibration module is developed to calibrate the depth camera in air. Subsequently, the calibrated parameters are leveraged to construct an underwater multi-layer refraction correction module, which retrieves undistorted color images and aligned depth images. Finally, the corrected color images and depth images are fed into the front-end of the visual SLAM algorithm to generate dense point cloud maps. Both simulation and real-world experiments are conducted to validate the accuracy of the multi-layer refraction calibration results and the precision of the dense point clouds obtained via multi-layer refraction correction. Furthermore, the superiority of the proposed method is demonstrated through both qualitative and quantitative evaluations. Full article
(This article belongs to the Section Ocean Engineering)
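The multi-layer refraction the abstract corrects for follows the vector form of Snell's law applied at each flat interface (air to waterproof glass to water). The sketch below is a generic ray-tracing illustration under the assumption of parallel planar interfaces, not the paper's calibration or correction module; function names and the 30° test ray are illustrative:

```python
import numpy as np

def refract(d, n, n1, n2):
    """Refract unit direction d across a flat interface (vector Snell's law).

    n is the unit interface normal pointing toward the incoming ray,
    n1 and n2 are the refractive indices on either side. Returns None
    on total internal reflection.
    """
    mu = n1 / n2
    cos_i = -np.dot(n, d)
    sin2_t = mu**2 * (1.0 - cos_i**2)
    if sin2_t > 1.0:
        return None
    cos_t = np.sqrt(1.0 - sin2_t)
    return mu * d + (mu * cos_i - cos_t) * n

def trace_through_layers(d, normal, indices):
    """Chain refractions through parallel flat layers, e.g. air->glass->water."""
    for n1, n2 in zip(indices[:-1], indices[1:]):
        d = refract(d, normal, n1, n2)
        if d is None:
            return None
        d = d / np.linalg.norm(d)
    return d

# Ray entering at 30 degrees from the normal, through air -> glass -> water.
d_in = np.array([0.5, 0.0, np.sqrt(3) / 2])   # unit direction, +z into the water
normal = np.array([0.0, 0.0, -1.0])           # normal pointing back toward the camera
d_out = trace_through_layers(d_in, normal, [1.0, 1.5, 1.33])
# For parallel interfaces the glass index cancels:
# sin(theta_out) = (1.0 / 1.33) * sin(theta_in)
```

This parallel-interface cancellation is also why the glass thickness, rather than its index alone, dominates the lateral ray offset that a per-pixel depth correction must account for.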
