Search Results (51)

Search Parameters:
Keywords = direct visual odometry

27 pages, 480 KB  
Article
Hardware-Oriented Lie-Group Optimization Library for FPGA-Accelerated SLAM Using Custom Numeric Precision
by Emanuel Trabes and Carlos Valderrama Sakuyama
Electronics 2026, 15(11), 2272; https://doi.org/10.3390/electronics15112272 - 25 May 2026
Viewed by 457
Abstract
Nonlinear optimization is a central component of visual odometry and simultaneous localization and mapping (SLAM), but its repeated small- and medium-scale linear algebra operations are difficult to deploy efficiently on embedded hardware. This paper presents a synthesizable C++ library for AMD/Xilinx Vitis high-level synthesis (HLS) that provides field-programmable gate array (FPGA)-oriented dense linear algebra kernels and Lie-group primitives on SO(3) and SE(3). The library supports configurable scalar types, including IEEE floating point, posit arithmetic, and reduced-precision floating-point formats, enabling design-space exploration between numerical accuracy and hardware cost. The proposed kernels are integrated into the back-end of a monocular direct mesh-based visual SLAM system and evaluated on an AMD/Xilinx Kria KV260 platform. Compared with the software reference running on the embedded processor, the integrated FPGA implementation reduces the end-to-end optimization iteration time from 32.0 ms to 8.9 ms, corresponding to a speed-up of 3.6×, and reduces the dominant kernel latency from 25.0 ms to 4.9 ms. The most resource-efficient reduced-precision configuration reduces lookup table (LUT) usage by 29.6%, flip-flop (FF) usage by 25.7%, block random-access memory (BRAM) usage by 25.9%, and digital signal processor (DSP) usage by 38.6% relative to the floating-point hardware baseline, while keeping the relative trajectory error within 1.42%. The results show that Lie-group-aware optimization back-ends can be mapped to embedded FPGAs efficiently when fixed-size algebraic kernels, synthesis-aware memory structures, and configurable arithmetic are considered together. Full article
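The abstract above refers to Lie-group primitives on SO(3) and SE(3). As a rough illustration of what one such primitive computes, the following Python sketch implements the SO(3) exponential map using Rodrigues' formula. It is not the authors' HLS library: the function names (`hat`, `matmul`, `so3_exp`) and the small-angle threshold are assumptions chosen for this example, and it uses plain double-precision floats rather than any custom numeric format.

```python
import math

def hat(w):
    """Return the 3x3 skew-symmetric matrix of a 3-vector w."""
    x, y, z = w
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]

def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def so3_exp(w):
    """Map an axis-angle vector w to a rotation matrix (Rodrigues' formula).

    R = I + (sin t / t) * K + ((1 - cos t) / t^2) * K^2, with K = hat(w), t = |w|.
    """
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        # Assumed threshold: fall back to the limits of the two coefficients.
        a, b = 1.0, 0.5
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)]
            for i in range(3)]
```

A fixed 3x3 size like this is the kind of small kernel the abstract describes as suited to hardware mapping; the scalar type here is fixed, whereas the paper's library is described as configurable.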

24 pages, 8894 KB  
Article
An Improved Robust ESKF Fusion Positioning Method with a Novel UWB-VIO Initialization
by Changqiang Wang, Biao Li, Yuzuo Duan, Xin Sui, Zhengxu Shi, Song Gao, Zhe Zhang and Ji Chen
Sensors 2026, 26(6), 1804; https://doi.org/10.3390/s26061804 - 12 Mar 2026
Viewed by 555
Abstract
Visual–inertial odometry (VIO) often struggles with illumination variations, sparse visual features, and inertial drift in complex indoor settings, leading to scale uncertainties and accumulated errors. To address these issues, this paper proposes a new UWB–VIO initialization method combined with an enhanced Robust error-state Kalman filter (Robust ESKF) fusion technique for mobile robot localization. During initialization, common problems include scale drift and heading inconsistency. To solve these, a direction-consistent constrained initialization model is developed. By jointly optimizing the scale factor and yaw angle, this model ensures consistent alignment between the visual–inertial and ultra-wideband (UWB) coordinate frames. This approach removes the need for external calibration and independent coordinate transformation, which are typically required by traditional methods. In the fusion process, an improved residual-weighted robust filtering mechanism is employed to minimize the impact of abnormal UWB ranging data and noise interference. This mechanism adaptively suppresses outliers caused by UWB multipath reflections and non-line-of-sight (NLOS) propagation, thereby reducing VIO drift and improving the overall robustness and stability of the localization system. Experiments conducted in narrow-corridor environments, where both UWB and visual sensors are affected by interference, demonstrate that the proposed method significantly reduces trajectory drift and attitude jumps, resulting in better positioning accuracy and trajectory continuity. Compared to conventional UWB–VIO fusion algorithms, the proposed method enhances average localization accuracy by over 50% and maintains stable estimation even in severe multipath interference conditions, demonstrating high precision and strong robustness. Full article
(This article belongs to the Section Navigation and Positioning)

18 pages, 1767 KB  
Article
Integrating Roadway Sign Data and Biomimetic Path Integration for High-Precision Localization in Unstructured Coal Mine Roadways
by Miao Yu, Zilong Zhang, Xi Zhang, Junjie Zhang, Bin Zhou and Bo Chen
Electronics 2026, 15(3), 528; https://doi.org/10.3390/electronics15030528 - 26 Jan 2026
Viewed by 428
Abstract
High-precision autonomous localization remains a critical challenge for intelligent mining vehicles in GNSS-denied and unstructured coal mine roadways, where traditional odometry-based methods suffer from severe cumulative drift and perceptual aliasing. Inspired by the synergy between mammalian visual cues and cognitive neural mechanisms, this paper proposes a robust biomimetic localization framework that integrates multi-source perception with a prior cognitive map. The core contributions are three-fold: First, a semantic-enhanced biomimetic localization method is developed, leveraging roadway sign data as absolute spatial anchors to suppress long-distance cumulative errors. Second, an optimized head direction (HD) cell model is formulated by incorporating a speed balance factor, kinematic constraints, and a drift correction influence factor, significantly improving the precision of angular perception. Third, boundary-adaptive and sign-based semantic constraint terms are integrated into a continuous attractor network (CAN)-based path integration model, effectively preventing trajectory deviation into non-navigable regions. Comprehensive evaluations conducted in large-scale underground scenarios demonstrate that the proposed framework consistently outperforms conventional IMU-odometry fusion, representative 3D SLAM solutions, and baseline biomimetic algorithms. By effectively integrating semantic landmarks as spatial anchors, the system exhibits superior resilience against cumulative drift, maintaining high localization precision where standard methods typically diverge. The results confirm that our approach significantly enhances both trajectory consistency and heading stability across extensive distances, validating its robustness and scalability in handling the inherent complexities of unstructured coal mine environments for enhanced intrinsic safety. Full article

28 pages, 2836 KB  
Article
MA-EVIO: A Motion-Aware Approach to Event-Based Visual–Inertial Odometry
by Mohsen Shahraki, Ahmed Elamin and Ahmed El-Rabbany
Sensors 2025, 25(23), 7381; https://doi.org/10.3390/s25237381 - 4 Dec 2025
Cited by 1 | Viewed by 1386
Abstract
Indoor localization remains a challenging task due to the unavailability of reliable global navigation satellite system (GNSS) signals in most indoor environments. One way to overcome this challenge is through visual–inertial odometry (VIO), which enables real-time pose estimation by fusing camera and inertial measurements. However, VIO suffers from performance degradation under high-speed motion and in poorly lit environments. In such scenarios, motion blur, sensor noise, and low temporal resolution reduce the accuracy and robustness of the estimated trajectory. To address these limitations, we propose a motion-aware event-based VIO (MA-EVIO) system that adaptively fuses asynchronous event data, frame-based imagery, and inertial measurements for robust and accurate pose estimation. MA-EVIO employs a hybrid tracking strategy combining sparse feature matching and direct photometric alignment. A key innovation is its motion-aware keyframe selection, which dynamically adjusts tracking parameters based on real-time motion classification and feature quality. This motion awareness also enables adaptive sensor fusion: during fast motion, the system prioritizes event data, while under slow or stable motion, it relies more on RGB frames and feature-based tracking. Experimental results on the DAVIS240c and VECtor benchmarks demonstrate that MA-EVIO outperforms state-of-the-art methods, achieving a lower mean position error (MPE) of 0.19 on DAVIS240c compared to 0.21 (EVI-SAM) and 0.24 (PL-EVIO), and superior performance on VECtor with MPE/mean rotation error (MRE) of 1.19%/1.28 deg/m versus 1.27%/1.42 deg/m (EVI-SAM) and 1.93%/1.56 deg/m (PL-EVIO). These results validate the effectiveness of MA-EVIO in challenging dynamic indoor environments. Full article
(This article belongs to the Special Issue Multi-Sensor Integration for Mobile and UAS Mapping)

28 pages, 10678 KB  
Article
Deep-DSO: Improving Mapping of Direct Sparse Odometry Using CNN-Based Single-Image Depth Estimation
by Erick P. Herrera-Granda, Juan C. Torres-Cantero, Israel D. Herrera-Granda, José F. Lucio-Naranjo, Andrés Rosales, Javier Revelo-Fuelagán and Diego H. Peluffo-Ordóñez
Mathematics 2025, 13(20), 3330; https://doi.org/10.3390/math13203330 - 19 Oct 2025
Cited by 2 | Viewed by 2643
Abstract
In recent years, SLAM, visual odometry, and structure-from-motion approaches have widely addressed the problems of 3D reconstruction and ego-motion estimation. Of the many input modalities that can be used to solve these ill-posed problems, the pure visual alternative using a single monocular RGB camera has attracted the attention of multiple researchers due to its low cost and widespread availability in handheld devices. One of the best proposals currently available is the Direct Sparse Odometry (DSO) system, which has demonstrated the ability to accurately recover trajectories and depth maps using monocular sequences as the only source of information. Given the impressive advances in single-image depth estimation using neural networks, this work proposes an extension of the DSO system, named DeepDSO. DeepDSO effectively integrates the state-of-the-art NeW CRF neural network as a depth estimation module, providing depth prior information for each candidate point. This reduces the point search interval over the epipolar line. This integration improves the DSO algorithm’s depth point initialization and allows each proposed point to converge faster to its true depth. Experimentation carried out in the TUM-Mono dataset demonstrated that adding the neural network depth estimation module to the DSO pipeline significantly reduced rotation, translation, scale, start-segment alignment, end-segment alignment, and RMSE errors. Full article
(This article belongs to the Section E1: Mathematics and Computer Science)

24 pages, 23437 KB  
Article
Fusing Direct and Indirect Visual Odometry for SLAM: An ICM-Based Framework
by Jeremias Gaia, Javier Gimenez, Eugenio Orosco, Francisco Rossomando, Carlos Soria and Fernando Ulloa-Vásquez
World Electr. Veh. J. 2025, 16(9), 510; https://doi.org/10.3390/wevj16090510 - 10 Sep 2025
Cited by 1 | Viewed by 1720
Abstract
The loss of localization in robots navigating GNSS-denied environments poses a critical challenge that can compromise mission success and safe operation. This article presents a method that fuses visual odometry outputs from both direct and feature-based (indirect) methods using Iterated Conditional Modes (ICMs), an efficient iterative optimization algorithm that maximizes the posterior probability in Markov random fields, combined with uncertainty-aware gain adjustment to perform pose estimation and mapping. The proposed method enhances the performance of visual localization and mapping algorithms in low-texture or visually degraded scenarios. The method was validated using the TUM RGB-D benchmark dataset and through real-world tests in both indoor and outdoor environments. Outdoor experiments were conducted on an electric vehicle, where the method maintained stable tracking. These initial results suggest that the technique could be transferable to electric vehicle platforms and applicable in a variety of real-world conditions. Full article

13 pages, 4728 KB  
Article
Stereo Direct Sparse Visual–Inertial Odometry with Efficient Second-Order Minimization
by Chenhui Fu and Jiangang Lu
Sensors 2025, 25(15), 4852; https://doi.org/10.3390/s25154852 - 7 Aug 2025
Cited by 1 | Viewed by 2835
Abstract
Visual–inertial odometry (VIO) is the primary supporting technology for autonomous systems, but it faces three major challenges: initialization sensitivity, dynamic illumination, and multi-sensor fusion. In order to overcome these challenges, this paper proposes stereo direct sparse visual–inertial odometry with efficient second-order minimization. It is entirely implemented using the direct method, which includes a depth initialization module based on visual–inertial alignment, a stereo image tracking module, and a marginalization module. Inertial measurement unit (IMU) data is first aligned with a stereo image to initialize the system effectively. Then, based on the efficient second-order minimization (ESM) algorithm, the photometric error and the inertial error are minimized to jointly optimize camera poses and sparse scene geometry. IMU information is accumulated between several frames using measurement preintegration and is inserted into the optimization as an additional constraint between keyframes. A marginalization module is added to reduce the computation complexity of the optimization and maintain the information about the previous states. The proposed system is evaluated on the KITTI visual odometry benchmark and the EuRoC dataset. The experimental results demonstrate that the proposed system achieves state-of-the-art performance in terms of accuracy and robustness. Full article
(This article belongs to the Section Vehicular Sensing)

59 pages, 3738 KB  
Article
A Survey of Visual SLAM Based on RGB-D Images Using Deep Learning and Comparative Study for VOE
by Van-Hung Le and Thi-Ha-Phuong Nguyen
Algorithms 2025, 18(7), 394; https://doi.org/10.3390/a18070394 - 27 Jun 2025
Cited by 1 | Viewed by 6221
Abstract
Visual simultaneous localization and mapping (Visual SLAM) based on RGB-D image data includes two main tasks: One is to build an environment map, and the other is to simultaneously track the position and movement of visual odometry estimation (VOE). Visual SLAM and VOE are used in many applications, such as robot systems, autonomous mobile robots, assistance systems for the blind, human–machine interaction, industry, etc. To solve the computer vision problems in Visual SLAM and VOE from RGB-D images, deep learning (DL) is an approach that gives very convincing results. This manuscript examines the results, advantages, difficulties, and challenges of the problem of Visual SLAM and VOE based on DL. In this paper, the taxonomy is proposed to conduct a complete survey based on three methods to construct Visual SLAM and VOE from RGB-D images (1) using DL for the modules of the Visual SLAM and VOE systems; (2) using DL to supplement the modules of Visual SLAM and VOE systems; and (3) using end-to-end DL to build Visual SLAM and VOE systems. The 220 scientific publications on Visual SLAM, VOE, and related issues were surveyed. The studies were surveyed based on the order of methods, datasets, evaluation measures, and detailed results. In particular, studies on using DL to build Visual SLAM and VOE systems have analyzed the challenges, advantages, and disadvantages. We also proposed and published the TQU-SLAM benchmark dataset, and a comparative study on fine-tuning the VOE model using a Multi-Layer Fusion network (MLF-VO) framework was performed. The comparison results of VOE on the TQU-SLAM benchmark dataset range from 16.97 m to 57.61 m. This is a huge error compared to the VOE methods on the KITTI, TUM RGB-D SLAM, and ICL-NUIM datasets. Therefore, the dataset we publish is very challenging, especially in the opposite direction (OP-D) when collecting and annotation data. The results of the comparative study are also presented in detail and available. Full article
(This article belongs to the Special Issue Advances in Deep Learning and Next-Generation Internet Technologies)

26 pages, 10564 KB  
Article
DynaFusion-SLAM: Multi-Sensor Fusion and Dynamic Optimization of Autonomous Navigation Algorithms for Pasture-Pushing Robot
by Zhiwei Liu, Jiandong Fang and Yudong Zhao
Sensors 2025, 25(11), 3395; https://doi.org/10.3390/s25113395 - 28 May 2025
Cited by 1 | Viewed by 2426
Abstract
Aiming to address the problems of fewer related studies on autonomous navigation algorithms based on multi-sensor fusion in complex scenarios in pastures, lower degrees of fusion, and insufficient cruising accuracy of the operation path in complex outdoor environments, a multimodal autonomous navigation system is proposed based on a loosely coupled architecture of Cartographer–RTAB-Map (real-time appearance-based mapping). Through laser-vision inertial guidance multi-sensor data fusion, the system achieves high-precision mapping and robust path planning in complex scenes. First, comparing the mainstream laser SLAM algorithms (Hector/Gmapping/Cartographer) through simulation experiments, Cartographer is found to have a significant memory efficiency advantage in large-scale scenarios and is thus chosen as the front-end odometer. Secondly, a two-way position optimization mechanism is innovatively designed: (1) When building the map, Cartographer processes the laser with IMU and odometer data to generate mileage estimations, which provide positioning compensation for RTAB-Map. (2) RTAB-Map fuses the depth camera point cloud and laser data, corrects the global position through visual closed-loop detection, and then uses 2D localization to construct a bimodal environment representation containing a 2D raster map and a 3D point cloud, achieving a complete description of the simulated ranch environment and material morphology and constructing a framework for the navigation algorithm of the pushing robot based on the two types of fused data. During navigation, the combination of RTAB-Map’s global localization and AMCL’s local localization is used to generate a smoother and robust positional attitude by fusing IMU and odometer data through the EKF algorithm. Global path planning is performed using Dijkstra’s algorithm and combined with the TEB (Timed Elastic Band) algorithm for local path planning. 
Finally, experimental validation is performed in a laboratory-simulated pasture environment. The results indicate that when the RTAB-Map algorithm fuses with the multi-source odometry, its performance is significantly improved in the laboratory-simulated ranch scenario, the maximum absolute value of the error of the map measurement size is narrowed from 24.908 cm to 4.456 cm, the maximum absolute value of the relative error is reduced from 6.227% to 2.025%, and the absolute value of the error at each location is significantly reduced. At the same time, the introduction of multi-source mileage fusion can effectively avoid the phenomenon of large-scale offset or drift in the process of map construction. On this basis, the robot constructs a fusion map containing a simulated pasture environment and material patterns. In the navigation accuracy test experiments, our proposed method reduces the root mean square error (RMSE) coefficient by 1.7% and Std by 2.7% compared with that of RTAB-MAP. The RMSE is reduced by 26.7% and Std by 22.8% compared to that of the AMCL algorithm. On this basis, the robot successfully traverses the six preset points, and the measured X and Y directions and the overall position errors of the six points meet the requirements of the pasture-pushing task. The robot successfully returns to the starting point after completing the task of multi-point navigation, achieving autonomous navigation of the robot. Full article
(This article belongs to the Section Navigation and Positioning)

20 pages, 22712 KB  
Article
Adaptive Route Memory Sequences for Insect-Inspired Visual Route Navigation
by Efstathios Kagioulis, James Knight, Paul Graham, Thomas Nowotny and Andrew Philippides
Biomimetics 2024, 9(12), 731; https://doi.org/10.3390/biomimetics9120731 - 1 Dec 2024
Cited by 3 | Viewed by 2128
Abstract
Visual navigation is a key capability for robots and animals. Inspired by the navigational prowess of social insects, a family of insect-inspired route navigation algorithms—familiarity-based algorithms—have been developed that use stored panoramic images collected during a training route to subsequently derive directional information during route recapitulation. However, unlike the ants that inspire them, these algorithms ignore the sequence in which the training images are acquired so that all temporal information/correlation is lost. In this paper, the benefits of incorporating sequence information in familiarity-based algorithms are tested. To do this, instead of comparing a test view to all the training route images, a window of memories is used to restrict the number of comparisons that need to be made. As ants are able to visually navigate when odometric information is removed, the window position is updated via visual matching information only and not odometry. The performance of an algorithm without sequence information is compared to the performance of window methods with different fixed lengths as well as a method that adapts the window size dynamically. All algorithms were benchmarked on a simulation of an environment used for ant navigation experiments and showed that sequence information can boost performance and reduce computation. A detailed analysis of successes and failures highlights the interaction between the length of the route memory sequence and environment type and shows the benefits of an adaptive method. Full article
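The windowing idea in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: views are flat lists of pixel intensities, familiarity is a mean absolute difference, the window has a fixed half-width, and the names `image_difference`, `best_match_in_window`, and `follow_route` are hypothetical. The rotation search used to derive heading and the adaptive window sizing are both omitted.

```python
def image_difference(a, b):
    """Mean absolute difference between two equally sized views (flat lists)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def best_match_in_window(view, memories, center, half_width):
    """Compare `view` only against memories inside the window around `center`.

    Returns (index of the best-matching memory, its difference score).
    """
    lo = max(0, center - half_width)
    hi = min(len(memories), center + half_width + 1)
    scores = [(image_difference(view, memories[i]), i) for i in range(lo, hi)]
    score, idx = min(scores)
    return idx, score

def follow_route(views, memories, half_width=1):
    """Move the window using visual matching only (no odometry input)."""
    center = 0
    matches = []
    for view in views:
        center, _ = best_match_in_window(view, memories, center, half_width)
        matches.append(center)
    return matches
```

Restricting the comparison to a window reduces the number of image comparisons per step and, in this toy setting, keeps a view from being matched to a visually identical memory far away along the route.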
(This article belongs to the Special Issue Bio-Inspired Robotics and Applications)

22 pages, 2553 KB  
Review
Advancements in Indoor Precision Positioning: A Comprehensive Survey of UWB and Wi-Fi RTT Positioning Technologies
by Jiageng Qiao, Fan Yang, Jingbin Liu, Gege Huang, Wei Zhang and Mengxiang Li
Network 2024, 4(4), 545-566; https://doi.org/10.3390/network4040027 - 29 Nov 2024
Cited by 19 | Viewed by 8967
Abstract
High-precision indoor positioning is essential for various applications, such as the Internet of Things, robotics, and smart manufacturing, requiring accuracy better than 1 m. Conventional indoor positioning methods, like Wi-Fi or Bluetooth fingerprinting, typically provide low accuracy within a range of several meters, while techniques such as laser or visual odometry often require fusion with absolute positioning methods. Ultra-wideband (UWB) and Wi-Fi Round-Trip Time (RTT) are emerging radio positioning technologies supported by industry leaders like Apple and Google, respectively, both capable of achieving high-precision indoor positioning. This paper offers a comprehensive survey of UWB and Wi-Fi positioning, beginning with an overview of UWB and Wi-Fi RTT ranging, followed by an explanation of the fundamental principles of UWB and Wi-Fi RTT-based geometric positioning. Additionally, it compares the strengths and limitations of UWB and Wi-Fi RTT technologies and reviews advanced studies that address practical challenges in UWB and Wi-Fi RTT positioning, such as accuracy, reliability, continuity, and base station coordinate calibration issues. These challenges are primarily addressed through a multi-sensor fusion approach that integrates relative and absolute positioning. Finally, this paper highlights future directions for the development of UWB- and Wi-Fi RTT-based indoor positioning technologies. Full article

13 pages, 2708 KB  
Article
Hybrid Visual Odometry Algorithm Using a Downward-Facing Monocular Camera
by Basil Mohammed Al-Hadithi, David Thomas and Carlos Pastor
Appl. Sci. 2024, 14(17), 7732; https://doi.org/10.3390/app14177732 - 2 Sep 2024
Cited by 1 | Viewed by 4442
Abstract
The increasing interest in developing robots capable of navigating autonomously has led to the necessity of developing robust methods that enable these robots to operate in challenging and dynamic environments. Visual odometry (VO) has emerged in this context as a key technique, offering the possibility of estimating the position of a robot using sequences of onboard cameras. In this paper, a VO algorithm is proposed that achieves sub-pixel precision by combining optical flow and direct methods. This approach uses only a downward-facing, monocular camera, eliminating the need for additional sensors. The experimental results demonstrate the robustness of the developed method across various surfaces, achieving minimal drift errors in calculation. Full article
(This article belongs to the Topic Advances in Mobile Robotics Navigation, 2nd Volume)

15 pages, 18102 KB  
Article
LOFF: LiDAR and Optical Flow Fusion Odometry
by Junrui Zhang, Zhongbo Huang, Xingbao Zhu, Fenghe Guo, Chenyang Sun, Quanxi Zhan and Runjie Shen
Drones 2024, 8(8), 411; https://doi.org/10.3390/drones8080411 - 22 Aug 2024
Cited by 8 | Viewed by 4723
Abstract
Simultaneous Localization and Mapping (SLAM) is a common algorithm for position estimation in GNSS-denied environments. However, the high structural consistency and low lighting conditions in tunnel environments pose challenges for traditional visual SLAM and LiDAR SLAM. To this end, this paper presents LiDAR and optical flow fusion odometry (LOFF), which uses a direction-separated data fusion method to fuse optical flow odometry into the degenerate direction of the LiDAR SLAM without sacrificing the accuracy. Moreover, LOFF incorporates detectors and a compensator, allowing for a smooth transition between general environments and degenerate environments. This capability facilitates the stable flight of unmanned aerial vehicles (UAVs) in GNSS-denied tunnel environments, including corners and long-distance consistency. Through real-world experiments conducted in a GNSS-denied pedestrian tunnel, we demonstrate the superior position accuracy and trajectory smoothness of LOFF compared to state-of-the-art visual SLAM and LiDAR SLAM. Full article
(This article belongs to the Section Drone Design and Development)

28 pages, 5162 KB  
Review
A Comparative Review on Enhancing Visual Simultaneous Localization and Mapping with Deep Semantic Segmentation
by Xiwen Liu, Yong He, Jue Li, Rui Yan, Xiaoyu Li and Hui Huang
Sensors 2024, 24(11), 3388; https://doi.org/10.3390/s24113388 - 24 May 2024
Cited by 11 | Viewed by 4089
Abstract
Visual simultaneous localization and mapping (VSLAM) enhances the navigation of autonomous agents in unfamiliar environments by progressively constructing maps and estimating poses. However, conventional VSLAM pipelines often exhibited degraded performance in dynamic environments featuring mobile objects. Recent research in deep learning led to notable progress in semantic segmentation, which involves assigning semantic labels to image pixels. The integration of semantic segmentation into VSLAM can effectively differentiate between static and dynamic elements in intricate scenes. This paper provided a comprehensive comparative review on leveraging semantic segmentation to improve major components of VSLAM, including visual odometry, loop closure detection, and environmental mapping. Key principles and methods for both traditional VSLAM and deep semantic segmentation were introduced. This paper presented an overview and comparative analysis of the technical implementations of semantic integration across various modules of the VSLAM pipeline. Furthermore, it examined the features and potential use cases associated with the fusion of VSLAM and semantics. It was found that the existing VSLAM model continued to face challenges related to computational complexity. Promising future research directions were identified, including efficient model design, multimodal fusion, online adaptation, dynamic scene reconstruction, and end-to-end joint optimization. This review shed light on the emerging paradigm of semantic VSLAM and how deep learning-enabled semantic reasoning could unlock new capabilities for autonomous intelligent systems to operate reliably in the real world. Full article
(This article belongs to the Section Navigation and Positioning)

19 pages, 3494 KB  
Article
Visual–Inertial Odometry of Structured and Unstructured Lines Based on Vanishing Points in Indoor Environments
by Xiaojing He, Baoquan Li, Shulei Qiu and Kexin Liu
Appl. Sci. 2024, 14(5), 1990; https://doi.org/10.3390/app14051990 - 28 Feb 2024
Cited by 3 | Viewed by 3321
Abstract
In conventional point-line visual–inertial odometry systems in indoor environments, consideration of spatial position recovery and line feature classification can improve localization accuracy. In this paper, a monocular visual–inertial odometry based on structured and unstructured line features of vanishing points is proposed. First, the degeneracy phenomenon caused by a special geometric relationship between epipoles and line features is analyzed in the process of triangulation, and a degeneracy detection strategy is designed to determine the location of the epipoles. Then, considering that the vanishing point and the epipole coincide at infinity, the vanishing point feature is introduced to solve the degeneracy and direction vector optimization problem of line features. Finally, threshold constraints are used to categorize straight lines into structural and non-structural features under the Manhattan world assumption, and the vanishing point measurement model is added to the sliding window for joint optimization. Comparative tests on the EuRoC and TUM-VI public datasets validated the effectiveness of the proposed method. Full article
(This article belongs to the Topic Multi-Sensor Integrated Navigation Systems)
