Article

VS-SLAM: Robust SLAM Based on LiDAR Loop Closure Detection with Virtual Descriptors and Selective Memory Storage in Challenging Environments

by Zhixing Song 1,2, Xuebo Zhang 1,2,*, Shiyong Zhang 1,2, Songyang Wu 1,2 and Youwei Wang 1,2
1 College of Artificial Intelligence, Nankai University, Tianjin 300350, China
2 Institute of Robotics and Automatic Information System (IRAIS), Nankai University, Tianjin 300350, China
* Author to whom correspondence should be addressed.
Actuators 2025, 14(3), 132; https://doi.org/10.3390/act14030132
Submission received: 11 February 2025 / Revised: 5 March 2025 / Accepted: 6 March 2025 / Published: 8 March 2025

Abstract:
LiDAR loop closure detection is a key technology to mitigate localization drift in LiDAR SLAM, but it remains challenging in structurally similar environments and memory-constrained platforms. This paper proposes VS-SLAM, a novel and robust SLAM system that leverages virtual descriptors and selective memory storage to enhance LiDAR loop closure detection in challenging environments. Firstly, to mitigate the sensitivity of existing descriptors to translational changes, we propose a novel virtual descriptor technique that enhances translational invariance and improves loop closure detection accuracy. Then, to further improve the accuracy of loop closure detection in structurally similar environments, we propose an efficient and reliable selective memory storage technique based on scene recognition and key descriptor evaluation, which also reduces the memory consumption of the loop closure database. Next, based on the two proposed techniques, we develop a LiDAR SLAM system with loop closure detection capability, which maintains high accuracy and robustness even in challenging environments with structural similarity. Finally, extensive experiments in self-built simulation, real-world environments, and public datasets demonstrate that VS-SLAM outperforms state-of-the-art methods in terms of memory efficiency, accuracy, and robustness. Specifically, the memory consumption of the loop closure database is reduced by an average of 92.86% compared with SC-LVI-SAM and VS-SLAM-w/o-st, and the localization accuracy in structurally similar challenging environments is improved by an average of 66.41% compared with LVI-SAM.

1. Introduction

In recent years, simultaneous localization and mapping (SLAM) technology has made significant progress in fields such as robot navigation, autonomous driving, and augmented reality, becoming one of the core technologies in intelligent systems [1,2,3,4]. In particular, in the fields of autonomous driving and robotics, the rapid development of SLAM technology has driven major breakthroughs in autonomous navigation and environmental perception [5]. Depending on the sensors used, SLAM can generally be divided into Visual SLAM (V-SLAM) and LiDAR SLAM (L-SLAM). Compared with V-SLAM, L-SLAM has significant advantages in complex environments due to its robustness to illumination changes, higher measurement accuracy, and wider field of view [6,7].
Despite the strong capabilities demonstrated by L-SLAM in many applications, it still faces several challenges, especially in the key technical field of loop closure detection. The goal of loop closure detection is to eliminate the accumulated errors in the localization process by identifying the matching relationship between current and historical observations, thus ensuring the global consistency of the system [8]. However, most existing loop closure detection methods rely on high-dimensional descriptors to encode and match point clouds. Although this method improves detection accuracy, minimal descriptor variations in geometrically similar environments often cause mismatches, degrading loop closure detection performance. Existing descriptor techniques, such as Scan Context [9] and M2DP [10], construct descriptors by projecting point clouds into different dimensions or angles for loop closure detection. However, these methods perform poorly in geometrically similar environments and are highly sensitive to translation changes, which are inevitable when revisiting scenes such as roads, further affecting the accuracy and stability of loop closure detection.
Recently, the stable triangle descriptor (STD) proposed by Yuan et al. [11] and the loop closure detection method based on LiDAR occupancy set keys proposed by Zhang et al. [12] have enhanced the stability of geometric structures, thereby improving performance in complex environments. Although these methods have made significant progress in improving the robustness and accuracy of loop closure detection, real-time performance and memory consumption remain major bottlenecks, especially in scenarios with limited computational resources or high loop closure database storage requirements [13,14,15]. Furthermore, most existing loop closure detection methods typically assume significant environmental changes, but their performance often falls short in geometrically similar environments. Therefore, developing more robust and efficient loop closure detection methods to enhance system stability and accuracy in complex environments has become a key challenge in current SLAM research.
To address the above challenges, this paper proposes VS-SLAM. The main contributions are as follows:
  • To mitigate the sensitivity of existing descriptors to translational changes, we propose a novel virtual descriptor technique that enhances translational invariance and improves loop closure detection accuracy.
  • To further improve the accuracy of loop closure detection in structurally similar environments, we propose an efficient and reliable selective memory storage technique based on scene recognition and key descriptor evaluation, which also reduces the memory consumption of the loop closure database.
  • Based on the two proposed techniques, we develop a LiDAR SLAM system with loop closure detection capability, which maintains high accuracy and robustness even in challenging environments with structural similarity.
  • Experimental results in self-built simulation, real-world environments, and public datasets demonstrate that VS-SLAM outperforms state-of-the-art methods in terms of memory efficiency, accuracy, and robustness.

2. Related Work

Loop closure detection plays a crucial role in SLAM systems. Although various methods have made some progress in practical applications, their performance still faces significant challenges under translation changes, in structurally similar environments, and in terms of memory consumption.

2.1. Sensitivity of LiDAR Descriptors to Translation Changes

LiDAR SLAM technology has been widely used in robotics and autonomous driving. When revisiting scenes such as roads, translation changes are inevitable. However, existing LiDAR descriptors still face significant challenges in dealing with translation changes. Kim et al. proposed a global descriptor for 3D LiDAR scans—Scan Context [9]. This descriptor projects 3D LiDAR point clouds onto a 2D plane and constructs a matrix using a polar coordinate system to represent the 3D structure of the visible space. With its superior performance and high portability, Scan Context has been widely adopted in subsequent SLAM research, such as LIO-SAM [16] and ISC-LOAM [17], becoming a mainstream global descriptor design paradigm. However, the descriptor is sensitive to translation changes. Subsequently, Kim et al. proposed Scan Context++ [18], which leverages two sub-descriptors to achieve topological place retrieval and one-degree-of-freedom (1-DoF) semi-metric localization, bridging the gap between topological place retrieval and metric localization. As a result, it enhances robustness to rotation (yaw) and translational variations. Despite its advantages, the descriptor remains sensitive to translation changes. Recently, the LiDAR SLAM system proposed by Xu et al. [19] improves loop closure detection and odometry accuracy by using adaptive feature extraction and feature constraint classification. However, its performance is still limited in complex scenarios, especially in large-scale environments with strong translation changes. Similarly, the LiDAR-visual-inertial SLAM system proposed by Zhao et al. [20] has achieved good results by introducing multi-modal features and improved loop closure detection. However, its loop closure detection accuracy is still affected by translation changes, particularly in complex urban environments, where performance decreases significantly. 
These studies show that, although existing methods alleviate the impact of translation changes through feature optimization and multi-modal fusion, they have not fundamentally solved the sensitivity of descriptors to translation changes. To address this problem, this paper proposes a novel virtual descriptor technique to enhance the translation invariance of descriptors, thereby improving the accuracy of loop closure detection.

2.2. Accuracy and Memory Consumption of Loop Closure Detection

The accuracy of loop closure detection in structurally similar environments and the memory consumption of the loop closure database in large-scale environments are two further major challenges faced by LiDAR-based loop closure detection methods. Yuan et al. proposed the stable triangle descriptor (STD) [11] and binary triangle combination descriptor (BTC) [21] methods, which improve location recognition accuracy through precise descriptor matching. Although these systems perform well in diverse scenarios, the robustness and accuracy of feature matching are still limited in structurally similar environments. Yoon et al. [22] improved point cloud registration by introducing viewpoint-aware visibility scores, enhancing matching accuracy in scenes with large viewpoint changes. However, the optimization of memory consumption has not been fully considered. Existing SLAM systems still face significant challenges in balancing accuracy and memory consumption. Many high-precision SLAM methods rely on large amounts of feature data and complex optimization algorithms. Although these methods have made significant progress in accuracy, they also lead to high memory consumption and computational complexity. The Omni Point method proposed by Im et al. [23] improves place recognition performance while avoiding the high computational overhead of deep learning methods by manually extracting features, performing excellently across multiple datasets. Although this method works well in small-scale environments, its memory management and scalability face challenges as the environment scales up. In addition, LCR-Net [24] improves the robustness of loop closure detection by optimizing feature extraction and pose perception mechanisms, but this method often requires significant computational resources, leading to higher memory consumption.
The above studies demonstrate that existing loop closure detection methods face certain limitations in handling structurally similar environments and managing database memory consumption. To address these problems, this paper designs an efficient and reliable selective memory storage technology based on scene recognition and key descriptor evaluation, aiming to improve accuracy while significantly reducing the memory consumption of the database, so as to better adapt to the needs of large-scale and complex scenes.

3. System Framework

To solve the above problems, this paper proposes VS-SLAM, a novel and robust SLAM system that leverages virtual descriptors and selective memory storage to enhance LiDAR loop closure detection in challenging environments. The system framework is shown in Figure 1. It is worth noting that L-SLAM can degenerate in geometrically similar environments; inspired by the works in [25,26,27], which highlight the challenges of degradation in L-SLAM systems and propose strategies to address them, we introduce visual-inertial odometry as prior odometry to constrain the degenerate directions of L-SLAM. Additionally, in non-geometrically similar environments, the main role of visual-inertial odometry is to provide an initial guess for LiDAR odometry optimization, further enhancing the efficiency of the system. Although a constant-velocity motion model could fill this role, visual-inertial odometry provides a more stable and accurate initial guess for the system. The system is developed based on LVI-SAM [28] and consists of three main parts: LiDAR odometry, backend optimization, and loop closure detection. Similar to LVI-SAM, VS-SLAM adopts a multi-sensor fusion framework, combining LiDAR, visual, and inertial measurement unit (IMU) data to improve system robustness and accuracy. Additionally, both systems utilize visual-inertial odometry to provide prior information, which constrains the error of LiDAR odometry in degenerate directions. They also employ a loop closure detection module to correct accumulated errors, enhancing global consistency. The blue shaded blocks in Figure 1 represent our contributions; by leveraging the two algorithms shown in those blocks, VS-SLAM significantly outperforms LVI-SAM in terms of accuracy and robustness.
The pipeline is as follows. First, the LiDAR point cloud is undistorted using the prior inter-frame pose $\hat{T}_{k}^{k-1}$ provided by the visual-inertial odometry (VIO), and a set of planar points is extracted from the undistorted point cloud $P_k$. Second, feature matching is performed using the prior pose initialization $\hat{T}_{k}^{W}$ and the local map, a point-to-plane constraint equation is constructed, and the front-end LiDAR odometry pose is estimated in well-constrained directions, while in degenerate directions the prior initialization is directly adopted. Third, the LiDAR odometry employs an incremental optimization strategy and utilizes the Gauss-Newton method to solve the nonlinear optimization problem, ensuring computational efficiency while maximizing localization accuracy and system robustness.
Following that, we introduce the loop closure detection process. First, the undistorted point cloud is used to generate the LiDAR descriptor. To reduce the reliance of LVI-SAM on distance-based methods for loop closure detection, we introduce the Scan Context descriptor [9]. This method effectively enhances loop closure detection capabilities, especially in cases of large trajectory drift. By leveraging global features through the Scan Context descriptor, the system reduces the likelihood of loop closure detection failures that occur with purely distance-based methods. However, similar to distance-based methods, this approach still faces challenges with false loop closures in geometrically similar environments, and the Scan Context descriptor exhibits poor translation invariance. To further improve performance, we propose the virtual descriptor technique as the second step. This technique generates multiple virtual descriptors to enhance the translation invariance of the Scan Context descriptor, thereby improving loop closure detection accuracy. Third, the proposed selective memory storage strategy, based on scene recognition and key descriptor evaluation, significantly improves loop closure detection accuracy in structurally similar environments, while effectively reducing the memory consumption of the loop closure database in large-scale environments. Finally, loop closure constraints are incorporated into backend optimization through loop retrieval and convergence judgment, correcting accumulated errors in odometry and mapping, thereby further enhancing the accuracy of the SLAM system.
Loop closure constraints are incorporated into backend optimization [16] to effectively correct drift caused by long-term accumulated errors. The system represents the estimated pose of each frame as a node in a factor graph and constructs the graph using point-to-plane constraints and loop closure constraints. The point-to-plane constraints capture the relative pose relationships between consecutive frames, while the loop closure constraints link historical frames to the current frame, helping to reduce global errors. During optimization, the system aims to refine the factor graph by minimizing the errors of all constraint edges, as shown in (1):
$$\min_{T_k^W} \sum_{(i,j) \in \varepsilon} \left\| e_{ij}\left(T_i^W, T_j^W\right) \right\|_{\Sigma_{ij}}^2, \quad (1)$$
where $e_{ij}$ represents the error associated with edge $(i,j)$, $\varepsilon$ represents the set of all edges in the factor graph, and $\Sigma_{ij}$ is the noise covariance matrix that quantifies the uncertainty of the error. By minimizing this objective function, the system optimizes the pose estimation and effectively reduces drift in the global map.
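The factor-graph objective in (1) can be illustrated with a deliberately simplified, translation-only pose graph: poses are 2D points, odometry and loop closure edges are measured relative displacements, and the weighted least-squares problem is linear and solvable in closed form. This is a hypothetical sketch (the function name and toy measurements are ours), not the SE(3) Gauss-Newton optimization used by VS-SLAM:

```python
import numpy as np

def optimize_pose_graph(n_poses, edges, anchor=0):
    """Translation-only pose-graph optimization (linear least squares).

    edges: list of (i, j, z_ij, weight), where z_ij is the measured
    displacement x_j - x_i. The anchor pose is softly pinned to the
    origin to fix the gauge freedom.
    """
    dim = 2
    rows, rhs = [], []
    for i, j, z, w in edges:
        row = np.zeros((dim, n_poses * dim))
        row[:, j*dim:(j+1)*dim] = np.eye(dim)    # +x_j
        row[:, i*dim:(i+1)*dim] = -np.eye(dim)   # -x_i
        rows.append(np.sqrt(w) * row)
        rhs.append(np.sqrt(w) * np.asarray(z, float))
    # gauge constraint: strongly pin the anchor pose at the origin
    pin = np.zeros((dim, n_poses * dim))
    pin[:, anchor*dim:(anchor+1)*dim] = np.eye(dim) * 1e3
    rows.append(pin)
    rhs.append(np.zeros(dim))
    A, b = np.vstack(rows), np.concatenate(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x.reshape(n_poses, dim)

# Odometry drifts; a strongly weighted loop closure says pose 3 ~ [2.9, 0.05].
edges = [(0, 1, [1.0, 0.0], 1.0),
         (1, 2, [1.0, 0.0], 1.0),
         (2, 3, [1.0, 0.1], 1.0),
         (3, 0, [-2.9, -0.05], 10.0)]
poses = optimize_pose_graph(4, edges)
```

The loop closure edge pulls the final pose back toward the revisited location, distributing the accumulated drift across the trajectory, which is exactly the role of the loop constraints in the backend optimization described above.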
Through the above framework, the VS-SLAM system not only enhances the translation invariance of the LiDAR descriptors and addresses the accuracy problem of loop closure detection in geometrically similar environments, but also optimizes the memory consumption of the loop closure database, ensuring efficient operation even on memory-constrained platforms. Experimental results show that, compared with state-of-the-art methods, VS-SLAM demonstrates significant advantages in accuracy and robustness, particularly in structurally similar environments, where both localization accuracy and loop closure detection capabilities are substantially improved.

4. Methods

LiDAR loop closure detection is a key technique for mitigating localization drift in LiDAR SLAM. However, it still faces challenges in handling translation variations, structurally similar environments, and memory consumption. In this section, we propose virtual descriptors and selective memory storage techniques to improve the accuracy of loop closure detection while effectively reducing the memory consumption of the loop closure database.

4.1. Virtual Descriptor Technique

LiDAR loop closure detection algorithms typically store descriptors generated from scan point clouds in a database. For each new scan, a corresponding descriptor is generated and compared with historical descriptors in the database to retrieve loop closure frames. Compared with the distance-based loop closure detection algorithm [28], this method maintains high robustness even in scenarios with significant accumulated errors.
Kim et al. proposed a global descriptor for 3D LiDAR scans—Scan Context [9]. This descriptor projects 3D LiDAR point clouds onto a 2D plane and constructs a matrix using a polar coordinate system to represent the 3D structure of the visible space. Specifically, Scan Context divides the point cloud into multiple sectors based on angle and further segments it into several rings according to distance, generating a sparse matrix where each cell records the maximum height value within the corresponding region. Compared with traditional histogram-based or pre-trained methods, this 3D structure-based representation enables more efficient extraction of global environmental features while demonstrating greater robustness and “rotation invariance” in scenarios with viewpoint variations. Furthermore, the method introduces a similarity score-based distance metric and designs a two-stage search algorithm, which significantly improves the efficiency of loop closure detection.
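As a rough illustration of the polar binning just described, the sketch below builds a Scan Context-style matrix from an (N, 3) point cloud. The bin counts and maximum range are illustrative defaults of our own choosing, not the parameters of the original implementation, and empty cells are simply left at zero:

```python
import numpy as np

def scan_context(points, num_rings=20, num_sectors=60, max_range=80.0):
    """Build a Scan Context-style descriptor from an (N, 3) point cloud.

    The x-y plane is divided into num_rings radial bins and num_sectors
    angular bins; each cell stores the maximum point height (z) falling
    into its ring/sector bin, as described above.
    """
    desc = np.zeros((num_rings, num_sectors))
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.hypot(x, y)                       # planar range of each point
    theta = np.arctan2(y, x) % (2 * np.pi)   # azimuth in [0, 2*pi)
    ring = np.minimum((r / max_range * num_rings).astype(int), num_rings - 1)
    sector = np.minimum((theta / (2 * np.pi) * num_sectors).astype(int),
                        num_sectors - 1)
    valid = r < max_range
    # unbuffered in-place maximum: each cell keeps its tallest point
    np.maximum.at(desc, (ring[valid], sector[valid]), z[valid])
    return desc
```

A descriptor built this way is a compact $Q_r \times Q_s$ summary of the scan that can be compared column-wise, which is what the retrieval step in Section 4.1 exploits.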
In this section, to mitigate the sensitivity of the Scan Context descriptor to translational changes, a novel virtual descriptor technique is proposed to enhance its translation invariance and improve the accuracy of the loop closure detection algorithm. The detailed methodology is presented as follows.
First, we denote the LiDAR pose in the world coordinate system at the k-th scan as $T_k^W \in SE(3)$. Then, the point cloud is divided into $Q_s$ sectors by angle and into $Q_r$ rings by distance, generating a sparse matrix-based Scan Context descriptor $I_r^k \in \mathbb{R}^{Q_r \times Q_s}$, where each cell records the maximum height value in the corresponding region.
Then, we translate the position of the LiDAR scan by $M \cdot N_i$ meters along the X-axis, the Y-axis, and the diagonal directions at a $45^\circ$ angle in the current LiDAR local coordinate system to generate multiple virtual positions, as shown in Figure 2, where $N_i \in \{1, 2, \ldots, N\}$, $M$ is the distance between rings, $N$ is the number of rings, and $\theta$ represents the translation direction angle, with $\theta \in \{0^\circ, 45^\circ, \ldots, 315^\circ\}$. The blue lines representing the translation directions intersect with the green rings at $8 \cdot N$ dark red points, which correspond to the virtual positions.
After generating multiple virtual positions, the corresponding transformation matrix T k virtual between the virtual position pose and the original LiDAR pose can be calculated by
$$T_k^{\mathrm{virtual}} = \begin{bmatrix} 1 & 0 & 0 & M \cdot N_i \cdot \sin(\theta_j) \\ 0 & 1 & 0 & M \cdot N_i \cdot \cos(\theta_j) \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad (2)$$
where $\theta_j$ is the angle of the current translation direction, with $\theta_j \in \{0^\circ, 45^\circ, \ldots, 315^\circ\}$. The transformation matrix $T_k^{\mathrm{virtual}}$ is adjusted for each translation direction $\theta_j$ and each ring index $N_i$.
The original undistorted point cloud $P_k$ in the current LiDAR coordinate system is translated to generate $8 \cdot N$ virtual point clouds $P^{\mathrm{virtual}}$ by
$$P^{\mathrm{virtual}} = T_k^{\mathrm{virtual}} \cdot P_k. \quad (3)$$
Similarly, by calculating the Scan Context descriptors for the virtual point clouds $P^{\mathrm{virtual}}$, $8 \cdot N$ virtual descriptors $I_{v_m}^k$ are obtained, where $v_m \in \{1, \ldots, 8 \cdot N\}$.
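The virtual-position construction of Eqs. (2) and (3) amounts to translating the scan by $M \cdot N_i$ along each of the eight directions; a minimal sketch (the function name and arguments are ours, and only the translation part of the homogeneous transform is applied):

```python
import numpy as np

def virtual_point_clouds(points, ring_spacing, n_rings):
    """Generate the 8*N translated copies of a scan (Section 4.1).

    points: (P, 3) undistorted point cloud in the LiDAR frame.
    For each ring index N_i in {1..N} and each direction theta in
    {0, 45, ..., 315} degrees, the cloud is shifted by
    ring_spacing * N_i * [sin(theta), cos(theta), 0], matching the
    translation column of Eq. (2).
    """
    clouds = []
    for n_i in range(1, n_rings + 1):
        d = ring_spacing * n_i
        for theta_deg in range(0, 360, 45):
            theta = np.deg2rad(theta_deg)
            offset = np.array([d * np.sin(theta), d * np.cos(theta), 0.0])
            clouds.append(points + offset)  # rigid translation only
    return clouds
```

Each of the $8 \cdot N$ translated clouds is then fed through the same descriptor construction as the original scan, yielding the virtual descriptors $I_{v_m}^k$.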
Finally, during loop closure retrieval, we not only calculate the descriptor similarity between the original descriptor and the historical descriptors in the loop closure database, but also calculate the descriptor similarity between the multiple generated virtual descriptors and the historical descriptors. The specific process is as follows.
First, an initial screening of historical frames is performed. The similarity between two descriptors is judged using the cosine distance formula, where a smaller cosine distance indicates higher similarity. Specifically, the similarity is determined by summing the comparison of each column of the two descriptors, and the cosine distance between the column vectors is calculated by
$$d(I_i^k, I^h) = \frac{1}{Q_s} \sum_{j=1}^{Q_s} \left( 1 - \frac{c_j^{k_i} \cdot c_j^h}{\left\| c_j^{k_i} \right\| \cdot \left\| c_j^h \right\|} \right), \quad (4)$$
where $I_i^k$ is the i-th descriptor at the k-th scan, which includes both the original descriptor $I_r^k$ and the virtual descriptors $I_{v_m}^k$, with $i \in \{1, \ldots, 8 \cdot N + 1\}$. $I^h$ is a historical descriptor in the loop closure database, $c_j^{k_i}$ is the j-th column vector of the i-th descriptor at the k-th scan, $c_j^h$ is the corresponding column vector of the historical descriptor, and $Q_s$ is the number of columns in the descriptor.
Then, for the i-th descriptor $I_i^k$ of the k-th scan, all possible column vector offsets are iterated. Since changes in the LiDAR viewpoint may cause shifts in the column vectors, all possible offsets need to be tested to find the minimum cosine distance $d(I_i^k, I^h)$ as the final similarity of the i-th descriptor for the k-th scan. This step achieves rotational invariance, ensuring that the same scene can be accurately recognized from different viewpoints. If any of the $8 \cdot N + 1$ final similarities are below the predefined threshold, the k-th scan is considered to form a loop closure with the historical frame, followed by an ICP convergence judgment for the corresponding local maps of the two frames [29]. If convergence is achieved, the loop closure constraint is added to the backend optimization. Owing to the proposed virtual descriptor technique, the translational invariance of descriptors is enhanced, enabling accurate recognition of the same scene even with certain translational changes when revisiting it.
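The retrieval step, column-wise cosine distance (Eq. (4)) plus a search over all column shifts for rotational invariance, can be sketched as follows (a simplified illustration assuming descriptors with no all-zero columns; a small epsilon guards the division):

```python
import numpy as np

def descriptor_distance(desc_a, desc_b, eps=1e-9):
    """Column-wise cosine distance between two descriptors, as in Eq. (4):
    mean over columns of (1 - cosine similarity of paired columns)."""
    num = np.sum(desc_a * desc_b, axis=0)
    den = (np.linalg.norm(desc_a, axis=0) *
           np.linalg.norm(desc_b, axis=0) + eps)
    return np.mean(1.0 - num / den)

def min_shift_distance(query, candidate):
    """Rotational invariance: try every column (sector) shift of the query
    descriptor and keep the smallest distance, as in the retrieval step."""
    n_cols = query.shape[1]
    return min(descriptor_distance(np.roll(query, s, axis=1), candidate)
               for s in range(n_cols))
```

A column shift of the descriptor corresponds to a yaw rotation of the scan, so minimizing over shifts recognizes the same place seen from a rotated viewpoint; the virtual descriptors extend the same search to translated viewpoints.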

4.2. Selective Memory Storage Technology

To correct the cumulative errors in LiDAR SLAM and build globally consistent trajectories and maps, loop closure detection is essential. However, loop closure detection faces significant challenges in environments with similar geometric structures, such as tunnels and corridors. To improve the accuracy of loop closure detection, we propose an efficient and reliable selective memory storage technique based on scene recognition and key descriptor evaluation. This technique significantly reduces the memory consumption of the loop closure database. The specific principle is as follows.

4.2.1. Scene Recognition

In environments with similar geometric structures, to prevent false loop closures in LiDAR SLAM, we store only the descriptors of key locations, such as intersections, and add them to the loop closure database. During loop closure detection, the SLAM system performs loop retrieval and convergence judgment solely on the data from these key moments. To recognize structurally similar environments, we determine them based on the ratio of information across different principal directions. For the measurement of principal direction information, we refer to [27,30] as well as our previous work [26]. The specific method is as follows.
When using the Gauss-Newton method for iterative optimization to solve the LiDAR odometry pose estimation problem, the approximate Hessian matrix $H \in \mathbb{R}^{6 \times 6}$ can be expressed in the following submatrix form:
$$H = \begin{bmatrix} H_{rr} & H_{rt} \\ H_{tr} & H_{tt} \end{bmatrix}. \quad (5)$$
By performing eigenvalue decomposition on the translational submatrix $H_{tt} \in \mathbb{R}^{3 \times 3}$ and the rotational submatrix $H_{rr} \in \mathbb{R}^{3 \times 3}$, three translational eigenvalues and three rotational eigenvalues can be obtained. These eigenvalues are used as the axis lengths of the confidence ellipsoids to construct the 3D confidence ellipsoids in the translational and rotational spaces, respectively. Then, the three axis lengths of the confidence ellipsoids in the translational and rotational spaces are sorted in descending order and denoted as $l_{max}^t$, $l_{med}^t$, and $l_{min}^t$, as well as $l_{max}^r$, $l_{med}^r$, and $l_{min}^r$, respectively. The oblateness of the translational and rotational confidence ellipsoids, $\alpha_t$ and $\alpha_r$, are obtained by
$$\alpha_t = \frac{l_{max}^t - l_{min}^t}{l_{max}^t}, \quad \alpha_t \in [0.0, 1.0], \qquad \alpha_r = \frac{l_{max}^r - l_{min}^r}{l_{max}^r}, \quad \alpha_r \in [0.0, 1.0]. \quad (6)$$
The larger the difference in eigenvalues within the same subspace, the greater the oblateness, and the higher the likelihood of false loop closures occurring during L-SLAM loop closure detection. Therefore, we update the scene recognition result s by
$$s = \begin{cases} 0 & \text{if } \alpha_t \geq \alpha_{th} \text{ or } \alpha_r \geq \alpha_{th}, \\ 1 & \text{otherwise}, \end{cases} \quad (7)$$
where $\alpha_{th}$ is the oblateness threshold for scene recognition, $s = 0$ represents a structurally similar scene, and $s = 1$ represents a non-structurally similar scene.
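Under these definitions, scene recognition reduces to an eigenvalue computation on the two 3x3 sub-blocks of the Hessian. The sketch below assumes the block layout of Eq. (5) (rotational block top-left, translational block bottom-right) and an illustrative threshold value of our own choosing:

```python
import numpy as np

def recognize_scene(hessian, oblateness_threshold=0.7):
    """Scene recognition from the 6x6 Gauss-Newton Hessian (Section 4.2.1).

    Eigenvalues of the rotational and translational 3x3 sub-blocks serve
    as confidence-ellipsoid axis lengths; a large oblateness (Eq. (6)) in
    either subspace flags a structurally similar (degenerate) scene.
    Returns 0 for a structurally similar scene and 1 otherwise (Eq. (7)).
    """
    h_rr = hessian[:3, :3]   # rotational sub-block H_rr
    h_tt = hessian[3:, 3:]   # translational sub-block H_tt

    def oblateness(block):
        eig = np.sort(np.linalg.eigvalsh(block))[::-1]  # descending
        return (eig[0] - eig[2]) / eig[0]

    if (oblateness(h_tt) >= oblateness_threshold or
            oblateness(h_rr) >= oblateness_threshold):
        return 0  # structurally similar scene: do not store descriptors
    return 1
```

In a long corridor, for example, the translational eigenvalue along the corridor axis is much smaller than the others, the oblateness approaches 1, and the scene is flagged as structurally similar.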

4.2.2. Key Descriptor Evaluation

Section 4.2.1 focused on selective memory for structurally similar environments to prevent false loop closures in LiDAR SLAM, while also reducing the memory consumption of the loop closure database. To further reduce the memory consumption of the loop closure database in non-structurally similar environments, inspired by the concept of keyframes in SLAM systems [16], this section proposes a key descriptor evaluation method based on similarity calculations.
For the point cloud $P_k$ of the current k-th scan and its corresponding original descriptor $I_r^k$, we calculate the descriptor similarity between $I_r^k$ and the previously stored descriptor $I^{pr}$ in the loop closure database. As in loop retrieval, the cosine distance of Formula (4) is used to measure the similarity between the two descriptors $I_r^k$ and $I^{pr}$:
$$d(I_r^k, I^{pr}) = \frac{1}{Q_s} \sum_{j=1}^{Q_s} \left( 1 - \frac{c_j^k \cdot c_j^{pr}}{\left\| c_j^k \right\| \cdot \left\| c_j^{pr} \right\|} \right), \quad (8)$$
where $c_j^k$ is the j-th column vector of the original descriptor $I_r^k$, and $c_j^{pr}$ is the corresponding column vector of the previously stored descriptor $I^{pr}$ in the loop closure database.
Then, we update the key descriptor evaluation result m by
$$m = \begin{cases} 0 & \text{if } d(I_r^k, I^{pr}) \leq d_{th}, \\ 1 & \text{otherwise}, \end{cases} \quad (9)$$
where $d_{th}$ is the cosine distance threshold for key descriptor evaluation, $m = 0$ represents a non-key descriptor, and $m = 1$ represents a key descriptor. If $m = 1$, the descriptor is stored in the database.
It should be noted that if the virtual descriptor technique proposed in Section 4.1 is used, the descriptor similarity between all virtual descriptors of the current frame and the previously stored descriptor $I^{pr}$ is calculated. If all cosine distances are greater than $d_{th}$, the descriptor $I_r^k$ will be marked as a key descriptor and stored in the database.

4.2.3. Selective Memory Storage

For complex scenes, including both structurally similar environments and non-structurally similar environments, in order to reduce the memory consumption of the database and the rate of false loop closures, we combine the above methods and update the final descriptor storage result τ based on the scene recognition result s and key descriptor evaluation result m:
$$\tau = s \times m. \quad (10)$$
If $\tau = 1$, the descriptor is stored in the database; otherwise, it is not stored.
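Putting Sections 4.2.1-4.2.3 together, a minimal sketch of the storage decision might look as follows (comparing against the most recently stored descriptor and using an illustrative threshold; both simplifications are ours):

```python
import numpy as np

def cosine_distance(desc_a, desc_b):
    """Column-wise cosine distance of Eqs. (4)/(8); this sketch assumes
    descriptors with no all-zero columns."""
    num = np.sum(desc_a * desc_b, axis=0)
    den = (np.linalg.norm(desc_a, axis=0) *
           np.linalg.norm(desc_b, axis=0) + 1e-9)
    return np.mean(1.0 - num / den)

def selective_store(database, desc, scene_result, distance_threshold=0.2):
    """Selective memory storage (Eqs. (8)-(10)).

    m = 1 (key descriptor) if the current descriptor differs enough from
    the previously stored one; the final decision is tau = s * m, so a
    descriptor is stored only when it is a key descriptor observed in a
    non-structurally-similar scene (s = 1). Returns tau.
    """
    if database:
        d = cosine_distance(desc, database[-1])
        m = 1 if d > distance_threshold else 0  # Eq. (9)
    else:
        m = 1  # first descriptor is always a key descriptor in this sketch
    tau = scene_result * m                      # Eq. (10)
    if tau == 1:
        database.append(desc)
    return tau
```

Near-duplicate descriptors and all descriptors from structurally similar scenes are discarded, which is how the loop closure database stays small while remaining informative for retrieval.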

5. Experiments

In this section, we evaluate the performance of VS-SLAM in simulation environments, public datasets, and the real world. First, we constructed a typical structurally similar degenerate scene, the long corridor, in the Gazebo simulator, as shown in Figure 3a. The mobile robot platform is an Agilex Scout 2.0 (AgileX Robotics, Shenzhen, China), equipped with a 3D LiDAR (Velodyne VLP-16, Velodyne Lidar, San Jose, CA, USA, 10 Hz ), an Intel Realsense D435i camera (Intel Corporation, Santa Clara, CA, USA, left grayscale image, 30 Hz ), and an Xsens MTi-300 IMU (Xsens, Enschede, The Netherlands, 100 Hz ). To compare the performance of different methods, we recorded two sequences using the Scout 2.0 robot in the long corridor, moving in both counterclockwise and clockwise directions, named Gazebo-Corridor-01 and Gazebo-Corridor-02. To facilitate subsequent testing of the proposed virtual descriptor technique, there is a translation between the starting point and the endpoint.
Second, we evaluate the performance on the M2DGR public dataset [31], a widely recognized LiDAR-visual-inertial benchmark in SLAM research. This dataset was collected by a ground robot equipped with a 3D LiDAR (Velodyne VLP-32C, 10 Hz), an Intel Realsense D435i camera (left grayscale image, 15 Hz), and a Handsfree A9 IMU (150 Hz), with all sensors carefully calibrated and synchronized. The dataset contains 36 sequences from various indoor and outdoor scenes. Among them, the sequences M2DGR-Gate-01, M2DGR-Gate-02, M2DGR-Gate-03, M2DGR-Street-03, M2DGR-Street-04, and M2DGR-Street-08 have the potential for loop closure detection. We selected these six sequences to test the performance of the SLAM system and loop closure detection algorithm. The GNSS-RTK suite provides accurate ground truth poses.
Finally, to further test the performance of VS-SLAM, we tested it in a real-world degraded long corridor. The mobile robot platform is similar to the one used in the simulation platform, as shown in Figure 4. We recorded two sequences in the long corridor, named Real-Corridor-01 and Real-Corridor-02.
To compare the accuracy, robustness, and real-time performance of the proposed method with the baseline method, we conducted qualitative and quantitative comparisons and analyses in six aspects of the sequences recorded in these environments. These environments cover various lighting conditions, fully validating the adaptability of the algorithm in diverse challenging scenarios, as shown in Figure 5. To ensure a fair comparison, all methods were implemented on the same hardware platform: an Intel(R) Core(TM) i7-10750H CPU (Intel Corporation, Santa Clara, CA, USA) with 32 GB memory and an NVIDIA GeForce GTX 1660 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA).

5.1. Selective Memory Storage Technology Testing

In this section, we perform both qualitative and quantitative testing of the proposed selective memory storage technology, which is based on scene recognition and key descriptor evaluation, in simulation environments, public datasets, and the real world. The purpose of this technology is to reduce memory consumption in the loop closure database in complex scenarios and improve the accuracy of loop closure detection in structurally similar environments.
First, we present a qualitative result, visualizing the scene recognition results, the poses of stored descriptors, and the constructed maps along the trajectories of multiple sequences, as shown in Figure 6 and Figure 7. In the figures, the black lines represent the trajectories estimated by VS-SLAM, and the colored point clouds represent the maps built by VS-SLAM. The pink spherical markers on the trajectory denote scans in non-structurally similar scenes, while the coordinate axes denote scans in structurally similar scenes. The blue arrows represent the poses of the key descriptors stored in the database. For clarity, legends explaining each marker have been added to the figures. For both the simulation sequences and the corridor sequences recorded in the real world, the robot spends most of its time in the structurally similar, degenerate long-corridor region, as shown in Figure 6a,b, where most scans along the trajectory are marked by coordinate axes. Benefiting from the accurate scene recognition algorithm, our method stores only a small number of key descriptors from non-structurally similar scenes, indicated by the blue arrows. This not only reduces the memory consumption of the loop closure database but also improves the accuracy of loop closure detection in complex environments. For the sequences from the M2DGR dataset, the robot spends most of its time in feature-rich, non-structurally similar areas, as shown in Figure 7a,b, where most scans along the trajectory are marked by pink spheres. Thanks to the key descriptor evaluation, the SLAM system achieves robust loop closure detection while storing only a small number of key descriptors, again indicated by blue arrows.
Second, we illustrate the process of determining key descriptors in the point cloud with another qualitative result, as shown in Figure 8. Initially, the descriptor from Scan 159 is identified as a key descriptor and stored in the loop closure database. Next, the descriptor generated from Scan 163 is found to be highly similar to the key descriptor of Scan 159, with a cosine distance below the threshold after the descriptor similarity calculation, so it is not stored in the database. Subsequently, Scan 179 and its corresponding descriptor are obtained. The descriptor of Scan 179 differs from the key descriptor of Scan 159, and the cosine distance exceeds the threshold after similarity calculation. Additionally, before similarity calculation, scene recognition is performed, confirming that Scan 179 is in a non-structurally similar environment, as shown by location “C” in Figure 6a. As a result, the descriptor corresponding to Scan 179 is marked as a key descriptor and stored in the loop closure database.
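The storage decision illustrated above can be sketched in a few lines. Note that this is a minimal illustration under stated assumptions, not the paper's implementation: the descriptor is treated as a flattened vector, the `is_structurally_similar` flag stands in for the output of the scene recognition step, and the cosine-distance threshold of 0.3 is an arbitrary illustrative value rather than the paper's setting.

```python
import numpy as np

def cosine_distance(a, b):
    """Cosine distance between two flattened descriptor vectors."""
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def maybe_store(descriptor, is_structurally_similar, database, threshold=0.3):
    """Store the descriptor only if the scan lies in a non-structurally
    similar scene AND it differs enough from every stored key descriptor.
    Returns True if the descriptor was stored as a new key descriptor.
    (threshold=0.3 is illustrative, not the paper's value.)"""
    if is_structurally_similar:                # scene recognition gate
        return False
    for key in database:                       # key descriptor evaluation
        if cosine_distance(descriptor, key) < threshold:
            return False                       # too similar: not a key descriptor
    database.append(descriptor)
    return True

# Example mirroring Scans 159/163/179: a near-duplicate is rejected,
# a sufficiently different descriptor becomes a new key.
db = []
d1 = np.array([1.0, 0.0, 0.5, 0.2])
d2 = d1 + 1e-3                                 # near-duplicate of d1
d3 = np.array([0.0, 1.0, 0.1, 0.9])            # clearly different
print(maybe_store(d1, False, db))              # True  (first key descriptor)
print(maybe_store(d2, False, db))              # False (below similarity threshold)
print(maybe_store(d3, False, db))              # True  (new key descriptor)
print(len(db))                                 # 2
```

In this sketch each candidate is compared against all stored keys; the paper's pipeline additionally runs scene recognition before the similarity calculation, which the boolean flag here only approximates.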
Third, we present a quantitative result: Table 1 compares the descriptor count and memory consumption of the loop closure database. We denote by VS-SLAM-w/o-st the version of VS-SLAM without the selective memory storage technique. SC-LIO-SAM [9] and LVI-SAM [28] are state-of-the-art SLAM methods with loop closure detection capabilities; SC-LIO-SAM employs the Scan Context descriptor for loop closure detection. To address the degradation of LiDAR SLAM in geometrically similar environments and ensure a fair comparison, we introduce visual-inertial odometry [32] into SC-LIO-SAM to constrain the degenerate direction, naming the result SC-LVI-SAM. Since the proposed virtual descriptor technique does not affect descriptor storage, the data for VS-SLAM-w/o-st and SC-LVI-SAM in Table 1 are almost identical. LVI-SAM, by contrast, employs a distance-based method for loop closure detection that requires no descriptor generation; however, it fails to detect loops when odometry errors are large and is prone to false loop closures in structurally similar environments. These limitations highlight the advantage of our method, which reduces memory consumption while maintaining robust and accurate loop closure detection even in challenging scenarios. For the two simulation sequences and the two real-world corridor sequences, the robot spends most of its time in structurally similar degraded areas. The proposed scene recognition algorithm accurately identifies these scenes, and no descriptors are stored for them. Benefiting from the scene recognition algorithm and the similarity-based key descriptor evaluation, VS-SLAM reduces the number of descriptors by 91.50% to 95.04% and memory consumption by 85.50% to 92.66% compared with SC-LVI-SAM.
For the M2DGR public dataset, owing to the key descriptor evaluation method based on similarity calculations, VS-SLAM reduces the number of descriptors by 94.03% to 98.14% and decreases memory consumption by 92.66% to 97.86%. Compared with SC-LVI-SAM and VS-SLAM-w/o-st, VS-SLAM achieves an average reduction in memory consumption of 92.86% across all sequences. These results further validate the effectiveness of the proposed selective memory storage technique in reducing memory usage. For example, in the Gazebo-Corridor-01 sequence, the memory consumption of VS-SLAM-w/o-st is 1048 bytes, while that of VS-SLAM is only 152 bytes, resulting in an 85.50% reduction. In the M2DGR-Street-04 sequence, the memory consumption of VS-SLAM-w/o-st is 8216 bytes, while that of VS-SLAM is only 280 bytes, resulting in a 96.59% reduction. These improvements significantly reduce the memory requirements of the loop closure database while maintaining the robustness and accuracy of the system.
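The reduction percentages quoted above follow directly from the Table 1 entries; as a quick sanity check:

```python
def reduction_pct(baseline_bytes, ours_bytes):
    """Percentage reduction in memory consumption relative to a baseline."""
    return 100.0 * (baseline_bytes - ours_bytes) / baseline_bytes

# Gazebo-Corridor-01: VS-SLAM-w/o-st stores 1048 B, VS-SLAM stores 152 B.
print(round(reduction_pct(1048, 152), 2))   # 85.5
# M2DGR-Street-04: 8216 B versus 280 B.
print(round(reduction_pct(8216, 280), 2))   # 96.59
```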
Finally, to provide a more intuitive comparison of memory consumption in the loop closure database, we present the results in Figure 9. In the figure, the blue bars represent the results of VS-SLAM, while the red bars represent the results of SC-LVI-SAM. Note that SC-LVI-SAM has the same data as VS-SLAM-w/o-st. The comparison clearly demonstrates that the proposed selective memory storage technique significantly reduces memory consumption.

5.2. Comparison of Localization Accuracy and Ablation Study

In this section, we compare the localization accuracy of different SLAM methods. LVI-SAM [28] and SC-LVI-SAM [9] are state-of-the-art SLAM methods with loop closure detection capabilities, and LVI-SAM-w/o-loop is the version of LVI-SAM without loop closure detection. In simulation environments and on the public dataset, we evaluate localization accuracy using the root mean square error (RMSE) of the Absolute Trajectory Error (ATE) [33]. In real-world experiments, where ground truth is unavailable, we evaluate localization accuracy by the difference between the measured distance from the start point to the end point and the corresponding distance between the trajectory endpoints estimated by SLAM [34]. Each test was repeated five times to mitigate the influence of random factors.
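For reference, once the estimated and ground-truth trajectories have been time-associated and aligned (the evo toolkit [33] handles both steps), the ATE RMSE reduces to the following computation. The sketch assumes already-aligned N x 3 position arrays and is meant only to make the metric concrete.

```python
import numpy as np

def ate_rmse(est_positions, gt_positions):
    """RMSE of the Absolute Trajectory Error over time-associated,
    aligned position sequences (N x 3 arrays, one row per pose)."""
    errors = np.linalg.norm(est_positions - gt_positions, axis=1)
    return float(np.sqrt(np.mean(errors ** 2)))

# Toy example: a constant 0.1 m offset along x yields an RMSE of ~0.1 m.
gt = np.zeros((100, 3))
est = gt + np.array([0.1, 0.0, 0.0])
print(ate_rmse(est, gt))   # approximately 0.1
```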
First, we present quantitative results; the localization accuracy comparison is shown in Table 2. Benefiting from the proposed loop closure detection algorithm based on virtual descriptors and selective memory storage, VS-SLAM outperforms all other methods in every sequence. In geometrically similar environments, incorrect loop closures can severely degrade localization accuracy and reliability. For example, in the Real-Corridor-01 sequence, the localization error of SC-LVI-SAM is 49.59 m. Because the selective memory storage technology recognizes structurally similar scenes and avoids storing descriptors or performing loop closure detection there, the accuracy of loop closure detection is significantly improved: VS-SLAM improves localization accuracy by 99.78% compared with SC-LVI-SAM. Similarly, on the Real-Corridor-02 sequence, VS-SLAM outperforms SC-LVI-SAM by 99.82%. Across the four structurally similar sequences from simulation and real-world experiments, VS-SLAM achieves an average localization accuracy improvement of 66.41% over LVI-SAM. The improvement over SC-LVI-SAM is even larger, since the average localization accuracy of SC-LVI-SAM on these sequences is worse than that of LVI-SAM. On the M2DGR public dataset, whose environments are not structurally similar, VS-SLAM achieves slightly higher localization accuracy than SC-LVI-SAM while significantly reducing database memory consumption. These results demonstrate that VS-SLAM exhibits excellent robustness in complex environments.
To further test the effectiveness of the virtual descriptor technology, we denote by VS-SLAM-w/o-vd the version of VS-SLAM without it. On the Gazebo-Corridor-01 and Gazebo-Corridor-02 sequences, the RMSE of the ATE of VS-SLAM-w/o-vd is 0.75 m and 1.10 m, respectively. Because the proposed virtual descriptor technology improves the translational invariance of descriptors and thus the accuracy of loop closure detection, VS-SLAM achieves improvements of 10.67% and 4.55% over VS-SLAM-w/o-vd on these two sequences.
Finally, to further support the data in Table 2, we visualize the trajectories of the different methods on the Gazebo-Corridor-01 and Gazebo-Corridor-02 sequences, as shown in Figure 10. To minimize the influence of motion direction and pose on system performance, these two sequences follow counterclockwise and clockwise trajectories, respectively. In the figure, the trajectory of the proposed VS-SLAM (red) closely matches the ground truth, while the trajectories of LVI-SAM (green) and SC-LVI-SAM (blue) exhibit significant drift due to false loop closures. The accuracy of the remaining two methods is slightly lower than that of VS-SLAM. This visualization clearly demonstrates the superior accuracy and robustness of the proposed method.

6. Conclusions

In this paper, we have proposed VS-SLAM, a novel and robust SLAM system that leverages virtual descriptors and selective memory storage to enhance LiDAR loop closure detection in challenging environments. Specifically, VS-SLAM differs from previous research in two key respects: a novel virtual descriptor technique and an efficient, reliable selective memory storage technique based on scene recognition and key descriptor evaluation. Results from simulation scenarios, public datasets, and real-world experiments show that VS-SLAM outperforms current state-of-the-art methods in memory efficiency, accuracy, and robustness. Future research will explore deep learning-based LiDAR loop closure detection to improve feature extraction and matching accuracy, as well as multi-sensor fusion methods that combine LiDAR with visual data to enhance robustness and adaptability in complex environments.

Author Contributions

Conceptualization, methodology, software, validation, writing—original draft preparation, Z.S.; investigation, supervision, project administration, funding acquisition, resources, X.Z.; writing—review and editing, S.Z., S.W. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported in part by the National Natural Science Foundation of China under Grant 62303247, in part by the Beijing–Tianjin–Hebei Fundamental Research Cooperation Project under Grant 24JCZXJC00390, in part by the China Postdoctoral Science Foundation-Tianjin Joint Support Program under Grant Number 2023T013TJ, and in part by the Fundamental Research Funds for the Central Universities.

Data Availability Statement

The datasets generated and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no competing interests.

References

1. Ebadi, K.; Bernreiter, L.; Biggie, H.; Catt, G.; Chang, Y.; Chatterjee, A.; Denniston, C.E.; Deschênes, S.P.; Harlow, K.; Khattak, S.; et al. Present and future of SLAM in extreme environments: The DARPA SubT challenge. IEEE Trans. Robot. 2024, 40, 936–959.
2. Li, N.; Yao, Y.; Xu, X.; Peng, Y.; Wang, Z.; Wei, H. An Efficient LiDAR SLAM with Angle-Based Feature Extraction and Voxel-based Fixed-Lag Smoothing. IEEE Trans. Instrum. Meas. 2024, 73, 1–13.
3. Zhou, H.; Yao, Z.; Lu, M. Lidar/UWB fusion based SLAM with anti-degeneration capability. IEEE Trans. Veh. Technol. 2021, 70, 820–830.
4. Bi, Q.; Zhang, X.; Wen, J.; Pan, Z.; Zhang, S.; Wang, R.; Yuan, J. CURE: A Hierarchical Framework for Multi-Robot Autonomous Exploration Inspired by Centroids of Unknown Regions. IEEE Trans. Autom. Sci. Eng. 2023, 21, 3773–3786.
5. Zou, Q.; Sun, Q.; Chen, L.; Nie, B.; Li, Q. A comparative analysis of LiDAR SLAM-based indoor navigation for autonomous vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6907–6921.
6. Roriz, R.; Cabral, J.; Gomes, T. Automotive LiDAR technology: A survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6282–6297.
7. Yin, H.; Liu, P.X.; Zheng, M. Stereo visual odometry with automatic brightness adjustment and feature tracking prediction. IEEE Trans. Instrum. Meas. 2023, 72, 1–11.
8. Zou, Z.; Yuan, C.; Xu, W.; Li, H.; Zhou, S.; Xue, K.; Zhang, F. LTA-OM: Long-term association LiDAR–IMU odometry and mapping. J. Field Robot. 2024, 41, 2455–2474.
9. Kim, G.; Kim, A. Scan Context: Egocentric spatial descriptor for place recognition within 3D point cloud map. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4802–4809.
10. He, L.; Wang, X.; Zhang, H. M2DP: A novel 3D point cloud descriptor and its application in loop closure detection. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; pp. 231–237.
11. Yuan, C.; Lin, J.; Zou, Z.; Hong, X.; Zhang, F. STD: Stable triangle descriptor for 3D place recognition. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 1897–1903.
12. Zhang, Z.; Huang, Y.; Si, S.; Zhao, C.; Li, N.; Zhang, Y. OSK: A Novel LiDAR Occupancy Set Key-based Place Recognition Method in Urban Environment. IEEE Trans. Instrum. Meas. 2024, 73, 8502115.
13. Jiao, J.; Ye, H.; Zhu, Y.; Liu, M. Robust odometry and mapping for multi-LiDAR systems with online extrinsic calibration. IEEE Trans. Robot. 2022, 38, 351–371.
14. Jiao, J.; Zhu, Y.; Ye, H.; Huang, H.; Yun, P.; Jiang, L.; Wang, L.; Liu, M. Greedy-based feature selection for efficient LiDAR SLAM. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 5222–5228.
15. Wang, H.; Wang, C.; Chen, C.L.; Xie, L. F-LOAM: Fast LiDAR odometry and mapping. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 4390–4396.
16. Shan, T.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. LIO-SAM: Tightly-coupled lidar inertial odometry via smoothing and mapping. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 5135–5142.
17. Wang, H.; Wang, C.; Xie, L. Intensity Scan Context: Coding intensity and geometry relations for loop closure detection. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 2095–2101.
18. Kim, G.; Choi, S.; Kim, A. Scan Context++: Structural place recognition robust to rotation and lateral variations in urban environments. IEEE Trans. Robot. 2021, 38, 1856–1874.
19. Xu, M.; Lin, S.; Wang, J.; Chen, Z. A LiDAR SLAM System with Geometry Feature Group Based Stable Feature Selection and Three-Stage Loop Closure Optimization. IEEE Trans. Instrum. Meas. 2023, 72, 8504810.
20. Zhao, X.; Wen, C.; Prakhya, S.M.; Yin, H.; Zhou, R.; Sun, Y.; Xu, J.; Bai, H.; Wang, Y. Multi-Modal Features and Accurate Place Recognition with Robust Optimization for Lidar-Visual-Inertial SLAM. IEEE Trans. Instrum. Meas. 2024, 73, 5033916.
21. Yuan, C.; Lin, J.; Liu, Z.; Wei, H.; Hong, X.; Zhang, F. BTC: A Binary and Triangle Combined Descriptor for 3D Place Recognition. IEEE Trans. Robot. 2024, 40, 1580–1599.
22. Yoon, I.; Islam, T.; Kim, K.; Kwon, C. Viewpoint-Aware Visibility Scoring for Point Cloud Registration in Loop Closure. IEEE Robot. Autom. Lett. 2024, 9, 4146–4153.
23. Im, J.U.; Ki, S.W.; Won, J.H. Omni Point: 3D LiDAR-based feature extraction method for place recognition and point registration. IEEE Trans. Intell. Veh. 2024, 9, 5255–5271.
24. Shi, C.; Chen, X.; Xiao, J.; Dai, B.; Lu, H. Fast and Accurate Deep Loop Closing and Relocalization for Reliable LiDAR SLAM. IEEE Trans. Robot. 2024, 40, 2620–2640.
25. Ebadi, K.; Palieri, M.; Wood, S.; Padgett, C.; Agha-mohammadi, A.A. DARE-SLAM: Degeneracy-aware and resilient loop closing in perceptually-degraded environments. J. Intell. Robot. Syst. 2021, 102, 1–25.
26. Song, Z.; Wang, R.; Wu, S.; Wang, Y.; Tong, Y.; Zhang, X. DR-SLAM: Vision-Inertial-Aided Degenerate-Robust LiDAR SLAM Based on Dual Confidence Ellipsoid Oblateness. In Proceedings of the 2024 IEEE 14th International Conference on CYBER Technology in Automation, Control, and Intelligent Systems (CYBER), Copenhagen, Denmark, 16–19 July 2024; pp. 201–206.
27. Zhang, J.; Kaess, M.; Singh, S. On degeneracy of optimization-based state estimation problems. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA), Stockholm, Sweden, 16–21 May 2016; pp. 809–816.
28. Shan, T.; Englot, B.; Ratti, C.; Rus, D. LVI-SAM: Tightly-coupled lidar-visual-inertial odometry via smoothing and mapping. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 5692–5698.
29. Chen, Z.; Xu, Y.; Yuan, S.; Xie, L. iG-LIO: An Incremental GICP-Based Tightly-Coupled LiDAR-Inertial Odometry. IEEE Robot. Autom. Lett. 2024, 9, 1883–1890.
30. Tuna, T.; Nubert, J.; Nava, Y.; Khattak, S.; Hutter, M. X-ICP: Localizability-aware LiDAR registration for robust localization in extreme environments. IEEE Trans. Robot. 2024, 40, 452–471.
31. Yin, J.; Li, A.; Li, T.; Yu, W.; Zou, D. M2DGR: A multi-sensor and multi-scenario SLAM dataset for ground robots. IEEE Robot. Autom. Lett. 2021, 7, 2266–2273.
32. Qin, T.; Li, P.; Shen, S. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 2018, 34, 1004–1020.
33. Grupp, M. evo: Python Package for the Evaluation of Odometry and SLAM. 2017. Available online: https://github.com/MichaelGrupp/evo (accessed on 6 December 2024).
34. Lu, Y.; Song, D. Robust RGB-D odometry using point and line features. In Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 3934–3942.
Figure 1. The system framework of VS-SLAM. The blue shaded blocks represent our contributions.
Figure 2. Illustration of the virtual descriptor technology.
Figure 3. Environments in simulation and the real world: (a) Simulation; (b) Real world.
Figure 4. Experimental platform and sensor suite: (a) Mobile robot platform; (b) 3-D model of sensor suite.
Figure 5. Sequences with varying lighting conditions: (a) Real-Corridor-01: Non-uniform lighting; (b) M2DGR-Street-03: Dark lighting; (c) M2DGR-Gate-02: Low lighting; (d) M2DGR-Gate-03: Bright lighting.
Figure 6. Evaluation of selective memory storage technology in simulation and real-world environments: (a) Real-Corridor-01; (b) Gazebo-Corridor-01.
Figure 7. Evaluation of selective memory storage technology on public datasets: (a) M2DGR-Gate-02; (b) M2DGR-Street-08.
Figure 8. Illustration of key descriptor evaluation: (a) Point cloud of Scan 159 (8th key descriptor); (b) Descriptor of Scan 159 (8th key); (c) Point cloud of Scan 163 (Non-key descriptor); (d) Descriptor of Scan 163 (Non-key); (e) Point cloud of Scan 179 (9th key descriptor); (f) Descriptor of Scan 179 (9th key).
Figure 9. Memory consumption comparison in the loop closure database. SC-LVI-SAM has the same data as VS-SLAM-w/o-st.
Figure 10. Comparison of trajectories of different methods: (a) Gazebo-Corridor-01; (b) Gazebo-Corridor-02.
Table 1. Comparison of descriptor count and memory consumption in the loop closure database.

Sequence              SC-LVI-SAM [9]    VS-SLAM-w/o-st    VS-SLAM
                      D.C.    M.S.      D.C.    M.S.      D.C.    M.S.
Gazebo-Corridor-01    200     1048      202     1048      17      152
Gazebo-Corridor-02    212     1048      215     1048      17      152
M2DGR-Gate-01         211     1048      218     1048      5       56
M2DGR-Gate-02         419     2072      418     2072      25      152
M2DGR-Gate-03         357     2072      360     2072      17      152
M2DGR-Street-03       591     4120      590     4120      11      88
M2DGR-Street-04       1362    8216      1355    8216      46      280
M2DGR-Street-08       587     4120      580     4120      16      88
Real-Corridor-01      434     2072      432     2072      23      152
Real-Corridor-02      423     2072      431     2072      21      152

D.C. refers to descriptor count, and M.S. refers to memory size (unit: bytes).
Table 2. Localization errors of different methods (unit: m).

Sequence              LVI-SAM [28]   LVI-SAM-w/o-loop [28]   SC-LVI-SAM [9]   VS-SLAM-w/o-vd   VS-SLAM
Gazebo-Corridor-01    5.58           0.77                    6.18             0.75             0.67
Gazebo-Corridor-02    12.20          1.08                    7.04             1.10             1.05
M2DGR-Gate-01         4.58           2.26                    0.14             0.14             0.13
M2DGR-Gate-02         0.30           0.30                    0.31             0.31             0.30
M2DGR-Gate-03         0.15           0.15                    0.15             0.15             0.14
M2DGR-Street-03       0.15           0.15                    0.15             0.15             0.13
M2DGR-Street-04       1.73           1.71                    1.64             1.67             1.63
M2DGR-Street-08       10.56          2.47                    0.71             0.71             0.65
Real-Corridor-01      0.18           3.06                    49.59            0.14             0.11
Real-Corridor-02      0.19           2.18                    56.61            0.16             0.10