Real-Time 3D Mapping in Complex Environments Using a Spinning Actuated LiDAR System

: LiDAR is a crucial sensor for 3D environment perception. However, limited by the ﬁeld of view of the LiDAR, it is sometimes difﬁcult to achieve complete coverage of the environment with a single LiDAR. In this paper, we designed a spinning actuated LiDAR mapping system that is compatible with both UAV and backpack platforms and propose a tightly coupled laser–inertial SLAM algorithm for it. In our algorithm, edge and plane features in the point cloud are ﬁrst extracted. Then, for the signiﬁcant changes in the distribution of point cloud features between two adjacent scans caused by the continuous rotation of the LiDAR, we employed an adaptive scan accumulation method to improve the stability and accuracy of point cloud registration. After feature matching, the LiDAR feature factors and IMU pre-integration factor are added to the factor graph and jointly optimized to output the trajectory. In addition, an improved loop closure detection algorithm based on the Cartographer algorithm is used to reduce the drift. We conducted exhaustive experiments to evaluate the performance of the proposed algorithm in complex indoor and outdoor scenarios. The results showed that our algorithm is more accurate than the state-of-the-art algorithms LIO-SAM and FAST-LIO2 for the spinning actuated LiDAR system, and it can achieve real-time performance.


Introduction
Light Detection and Ranging (LiDAR) can achieve highly accurate and efficient distance measurement.I is insensitive to illumination changes because it measures by actively emitting a pulsed laser.By combining LiDAR with Simultaneous Localization and Mapping (SLAM) technology, the laser SLAM system can obtain a three-dimensional map of the surrounding environment both indoors and outdoors.These advantages make the laser SLAM system play an important role in many fields, such as autonomous driving [1], building inspection [2], forestry investigation [3], etc.Currently, the LiDAR used for laser SLAM can be basically divided into two categories.Mechanical LiDAR is the most commonly used LiDAR type.Its horizontal Field Of View (FOV) can reach close to 360°, but its vertical FOV is very limited.Besides, with the development of microelectronics technology, solid-state LiDAR, e.g., the DJI Livox series, has become more and more commonly used [4,5].Generally, solid-state LiDAR can provide a larger vertical FOV than mechanical LiDAR, but its horizontal FOV is much smaller.Therefore, in many scenarios, both types of LiDAR cannot completely cover the whole environment.This shortcoming greatly limits the mapping efficiency of laser SLAM systems, especially when using a platform with limited endurance such as an Unmanned Aerial Vehicle (UAV) to carry the mapping system [6].In narrow environments, this limited FOV will degrade the accuracy and stability of localization as only a small number of objects can be scanned.
Much research has been conducted to expand the FOV of LiDAR, and the solutions can be roughly divided into two categories.The first category is to combine multiple LiDARs [7][8][9][10][11].Such systems typically use a horizontally mounted LiDAR to provide primary positioning information and a vertically mounted LiDAR to improve the scanning coverage.Its advantage is that the mechanical structure is simple.However, due to the high price of LiDARs, the cost of these solutions is greatly increased.Another category of solutions is to actuate the LiDAR dynamically, and LOAM [12] is one of the most-famous methods among them.LOAM first accumulates the point cloud with the assistance of the IMU and then matches the accumulated point cloud with the global map to correct the accumulated errors.Considering that the prediction error of the IMU grows exponentially with time and the feature extraction and matching algorithms need to wait for sufficient data to be accumulated, this time interval will become a bottleneck limiting the accuracy of the SLAM system.
To address this problem, we designed a mapping system based on a spinning actuated multi-beam LiDAR.By using a multi-beam LiDAR, more information can be obtained in each scan, which helps to increase the frequency of point cloud matching and generate a denser point cloud map.We first extracted feature points from each frame point cloud through an improved feature extraction method based on LOAM.Then, we judged whether the current point cloud contains enough information by analyzing the spatial distribution of feature points and performed scan-to-map matching once the requirement is met.Compared with accumulating point clouds with a fixed number of scans, this method allows a better balance between the matching frequency of point clouds and the accuracy and reliability of the matching results.Finally, in order to eliminate the accumulated error, we added a loop closure detection module to the algorithm.The main contributions of this paper can be summarized as follows:

•
We propose a tightly coupled laser-inertial SLAM algorithm named Spin-LOAM for a spinning actuated LiDAR system.

•
An adaptive scan accumulation method that can improve the accuracy and reliability of matching by analyzing the spatial distribution of feature points.

•
Extensive experiments were conducted in indoor and outdoor environments to verify the effectiveness of our algorithm.

Related Work
As a fundamental problem in robotics, numerous SLAM algorithms based on LiDAR have been proposed.LeGO-LOAM [13] extracts the ground point cloud from the real-time scan results to improve the accuracy in the elevation direction.T-LOAM [14] simultaneously extracts edge features, sphere features, planar features, and ground features to improve the matching accuracy.The works [15][16][17] refined the localization result by introducing plane constraints.Suma++ [18] not only uses geometric features in point clouds, but also introduces semantic information to assist point cloud matching.Besides these LiDAR-only odometry algorithms, many multi-sensor-fusion-based algorithms have been proposed to improve the accuracy and robustness.LIO-SAM [19] combines the IMU pre-integration factor with the LiDAR odometry factor through a factor graph.LIO-Mapping [20] integrates the LiDAR and IMU in a tightly coupled fashion.FAST-LIO [21] adopts a tightly coupled iterated extended Kalman filter on a manifold to fuse the data and is accelerated by introducing an incremental KD-Tree in FAST-LIO2 [22].CLINS [23] fuses LiDAR and IMU data by representing trajectories as a continuous-time function, and this framework is well compatible with arbitrary-frequency data from other asynchronous sensors.
For the purposes of acquiring complete 3D information of the environment, researchers have proposed many actuated LiDAR systems.Bosse et al. [24] designed a mapping system named Zebedee, which connects the sensors to the platform via springs.By treating the trajectory as a function of time, a surfel-based matching algorithm was adopted to estimate the 6-DOF pose.Kaul et al. [25] proposed a passively actuated rotating LiDAR system for UAV mapping, and they used a continuous-time SLAM algorithm to produce the trajectory.However, it cannot process the data in real-time.Park et al. [26] addressed this issue by introducing map deformation to replace the original global trajectory optimization in continuous-time SLAM.Fang et al. [27] proposed a two-stage matching algorithm to estimate the trajectory of a rotating LiDAR.In the algorithm, the distortion of the current point cloud was first removed using the estimated motion generated by matching it with the local map, and then, the undistorted point cloud was matched with the global map.Unlike the aforementioned works, Zhan et al. [28] used a rotating multi-beam LiDAR for 3D mapping and combined it with stereo cameras for dense 3D reconstruction.The precision of this system is very high, but it needs to remain stationary while collecting data.R-LOAM [29] improves the localization accuracy of rotating LiDAR by leveraging prior knowledge about a reference object.Karimi et al. [30] proposed an actuated LiDAR system using the Lissajous pattern [31].By using the scan slice instead of the full-sweep point cloud to match with the global map, they achieved low-latency localization for a UAV without an IMU in an indoor environment.

System Overview
We first introduce the hardware systems used in the study.As shown in Figure 1, our device mainly consists of a laser scanner, a step motor, and an IMU, in which the scanner is driven to rotate by the motor.The rotation angle is recorded by an encoder.The IMU is rigidly attached to the platform, so we regarded the IMU frame {I} as the body frame {B} for simplicity.The LiDAR frame is denoted as {L}, and the fixed LiDAR frame {FL} coincides with the initial LiDAR frame when the motor is reset.The Y-axis of the motor frame {M} is aligned with the spin axis.The two extrinsic parameters T FL I ∈ SE(3) and T FL M ∈ SE(3) are both calibrated manually, where T FL I represents the transformation from the frame {FL} to the frame {I} and T FL M represents the transformation from the frame {FL} to the frame {M}.The timestamp of each sensor is synchronized at the hardware level to ensure accuracy.Figure 2 provides an overview of our SLAM algorithm.In the front-end, first, the IMU measurements are used to construct the pre-integration factor and produce pose predictions.Next, the raw LiDAR point cloud is transformed to the fixed LiDAR frame and de-skewed using the pose predictions and motor encoder data.Then, edge and plane feature points are extracted from the de-skewed point cloud.In the scan-to-map registration module, the correspondences of these feature points and the global map are established.The spatial distribution of these matching point pairs is examined to decide whether they are to be added to the factor graph or accumulated to the next scan.In the back-end, the IMU pre-integration factor and LiDAR factors are jointly optimized to estimate the system states, as shown in Figure 3.In order to bound the amount of computation, only the latest state is optimized when no loop closure constraints are added.After optimization, the feature points are added to the global map using the optimized state.Loop closure detection is performed periodically in the background to reduce drift.

Methodology
The system state x to be estimated at time t is defined as where b ω t ∈ R 3 and b a t ∈ R 3 are the gyroscope and accelerometer biases, respectively.R t ∈ SO(3), p t ∈ R 3 , and v t ∈ R 3 are the orientation, position, and velocity of the sensor platform in the global coordinate frame {G}, respectively.

IMU Processing
The platform's angular velocity ω t and acceleration a t in the IMU frame can be obtained by the IMU, but the raw measurements of the IMU are corrupted by noise and bias.Commonly, the slow variations in the bias are modeled with Brownian motion; hence, the IMU measurement model is given by where g is the gravity vector, η ω , η a , η bω , and η ba are white Gaussian noise, and their standard deviations are σ η a , σ η g , σ η ba , and σ η bω , respectively.

Pose Prediction
Based on the posterior state xk−1 in the previous moment and the IMU measurements between (t k−1 , t k ), we can use the Euler integration to calculate the predicted poses.The recursive formula is where Exp(•) is the exponential map of manifold SO(3).These poses are sufficient to de-skew the point cloud and provide an initial value for registration.However, to help reject the outliers in Section 4.4, the covariance of these predicted poses also need to be estimated.The recursive formula of error state covariance propagation is where ω = ω i − b ω i , a = a i − b a i , and P i is the covariance matrix of system state at time t i .The diagonal elements of the noise covariance matrix Q are σ η ω 2 , σ η a 2 , σ η ba 2 , and σ η bω 2 .J r is the right Jacobian of SO(3).Note that the covariance of R is defined in the tangent space.

IMU Pre-Integration
The IMU pre-integration technique was first proposed in [32], and it has been widely applied in SLAM research.It uses IMU measurements between (t k−1 , t k ) to establish the constraint between two states x k−1 and x k .The IMU pre-integration factor is calculated as follows: where ∆R, ∆v, and ∆p are the relative motion between two timestamps t k−1 , t k .More details about the on-manifold IMU pre-integration can be found in [33].

Feature Extraction
The raw point cloud is measured in the LiDAR frame and distorted by the sensor's motion.Therefore, before feature extraction, it needs to be transformed to frame {FL} and de-skewed.Suppose c i L is a point in raw point cloud scan C = {c 0 , c 1 , . . . ,c n }: where R M i is the rotation matrix generated by encoder data at time t i , which represents the rotation of the LiDAR frame relative to the fixed LiDAR frame in the motor frame.Our feature extraction method extracts planar features F p and edge features F e from the input point cloud as shown in Figure 4 .The workflow of the method is as follows: (1) For a point c i ∈ C, find its previous neighbors N pre i = {c i−k , . . . ,c i−1 } and succeeding neighbors N succ i = {c i+1 , . . . ,c i+k } in the same scan line.
(2) Calculate the features α, β of the point using where α indicates the changing angle of the scan line at the point c i , which is used to characterize the smoothness.The points with α < α thr will be labeled as smooth points.β is the ratio of the distance from point c i to c i−1 and c i+1 , which is used to determine whether the point is an edge point.(3) For point c i with β > β thr , if all points in its closer neighbors (depending on the closest point belonging to the neighbor N pre i or N succ i ) are smooth points, then add c i to F e .(4) For point c i with β ≤ β thr , if all points in its previous and succeeding neighbors are smooth points, then add c i to the candidate set of edge points.(5) Use the standard LOAM-based method to extract planar features F p and edge points F e , except that the edge points must belong to the candidate set.
unstructured point edge point smoothness point wall This modification was based on the consideration that there are many unstructured objects (e.g., vegetation) in outdoor environments, which will degrade the performance of edge extraction.A comparison result is shown in Figure 5.In our experiment, α thr was set to 15°and k and β thr were set to 2 and 4, respectively, according to [34].

Scan-to-Map Registration
Similar to [19], for an input point cloud scan C, we used the pose predicted by the IMU to find the adjacent point cloud scans in the global map and merged them into the feature map M p and M e for data association.The map is downsampled by a voxel filter to accelerate the computation.Then, we find the k nearest neighboring points N f of each feature point f i ∈ F p or F e in the corresponding feature map and denote its nearest point as m i .For f i ∈ F p ; the plane normal vector n p of its N f is computed, and for f i ∈ F e , the line direction n e of its N f is computed.In addition, as the LiDAR is continuously rotating, to ensure the reliability of the matching in the initial stage, the initial global map is constructed by keeping the sensor platform stationary for about 3 s.
According to the results of feature matching, the LiDAR residual r LiDAR can be computed using the point-to-plane d p and point-to-line distance d e . ) where R and p are the predicted orientation and position, n p and n e are normalized, and m i is the corresponding point of feature point f i in the global map.

Adaptive Scan Accumulation
The spatial distribution of the point cloud has an important influence on the accuracy of the registration result.Taking point-to-plane registration as an example, the matched points should involve at least three non-parallel planes.Unlike the point cloud obtained by the fixed mounted LiDAR, the LiDAR in our device is continuously rotating, resulting in the continuous change of the spatial distribution of the obtained point cloud.As shown in Figure 6, using only a single frame of the point cloud sometimes fails to provide reliable registration results.A simple solution is to accumulate multi-scan point clouds before each registration, but this will reduce the computational efficiency, so we propose a method to adaptively decide whether to perform point cloud accumulation and how many scans need to be accumulated according to the spatial distribution of the point cloud.

Features' Distribution Inspection
To ensure the registration results are reliable, we used the covariance of the registration result as the indicator.In the least-squares registration problem, the normal equation formed by matched feature points is where J ∈ R n×6 is constructed by stacking Jacobian matrices J The covariance of the estimated pose is where E{•} is the expected value operator.The covariance of b is determined by the laser measurements, and each laser observation can be treated as an independent observation.Therefore, its covariance is: where σ l is the accuracy of the laser measurement, which can be set depending on the model of the LiDAR.Substituting Equation ( 14) into Equation ( 13), then Equation ( 15) reveals how the spatial distribution of matched points in matrix J affects the covariance of the registration result.If the spatial distribution is suitable for registration, the covariance of the estimated pose should not vary greatly in all directions.Suppose H tran ∈ R 3×3 is the submatrix of H corresponding to the translation part in the pose; the ratio γ can be computed by where operator diag(•) extracts the diagonal elements of the matrix as a vector.The smaller the value of γ, the more uniform the distribution of the point cloud is, and vice versa.Here, we set the threshold γ th = 3.If γ < γ th , the matched pairs will be accepted, and these plane and edge features will be added to the factor graph as constraints; otherwise, the extracted features will be accumulated to the next scan.

Outlier Removal in Matched Features
Since the point cloud in a single scan is sparse, there will inevitably be errors in the feature extraction results.These incorrectly matched point pairs will interfere with the features' distribution inspection and reduce the accuracy of pose estimation.To remove them while performing distribution inspection, an outlier removal algorithm based on the propagation of covariance is applied.We assumed that the error in the global map can be ignored, which means that the distance residual is mainly caused by the error of the predicted pose and the error of the laser measurement.Then, the standard deviation of the distance residuals can be computed by where D p is the covariance of the predicted pose and can be obtained by Equation ( 4) and D l is the accuracy of the LiDAR point.The point pair with d p ≥ 3σ d p or d e ≥ 3σ d e is considered an outlier and removed.In theory, if the distance residuals follow a Gaussian distribution, we can remove most of the mismatched point pairs while retaining 99.5% of the correct matching results.

Loop Closure Detection
Our loop closure detection method was developed based on Cartographer [35].As shown in Algorithm 1, there are two main improvements compared to the original algorithm: (1) point-to-plane ICP is used to replace the original probability occupancy-grid-based registration method in the fine registration stage; (2) the loop closure detection result is checked using the previous scan.Due to the rotation of the LiDAR, there is a noticeable difference even between two adjacent scans.This can provide auxiliary information to help reject false detection results.If the loop closure detection result of the current scan is correct, then the registration results T , s ← Branch-and-boundScanMatch(C k , S l ); ); // Check ICP result using previous scan ) to R loop ; 9 end 10 end

Experiments
Since there is no publicly available dataset containing spinning actuated LiDAR and IMU data, we evaluated the performance of the Spin-LOAM algorithm using the data collected in indoor and outdoor environments with the device shown in Figure 7.In the experiments, the sampling frequencies of the Velodyne VLP-16 LiDAR and Sensonor STIM300 IMU were 10 Hz and 200 Hz, respectively.The angular velocity of the step motor was set to 4.5 rad/s, and the angular resolution of the encoder was 10 bit.All experiments were conducted on an Intel NUC computer with an Intel Core i5-1135G7 CPU and 16 GB memory.

Evaluation in Indoor Environments
To verify the accuracy and effectiveness of Spin-LOAM, we first conducted tests in indoor environments.We compared our algorithm with two state-of-the-art LIO algorithms, FAST-LIO2 and LIO-SAM (modified for compatibility with a 6-axis IMU).In the indoor tests, the map voxel size in all algorithms was set to 0.2 m.The standard deviations of the noise related to the IMU and LiDAR were uniformly set according to the device model.
As shown in Figure 8, we collected three datasets in different environments using a backpack to evaluate the performance of our algorithm.Data 1 is two narrow corridors on different floors connected by stairs; Data 2 is an underground garage; Data 3 is a badminton hall.Since it was difficult to obtain the ground truth trajectory in the indoor environment, we returned to the starting position at the end of the trajectory and used the end-to-end translation error as the metric.The qualitative analysis results are given in Table 1, where "Spin-LOAM (odom)" represents our algorithm without loop closure and "Spin-LOAM (full)" represents the full SLAM algorithm.Data 1 is the most-challenging scene among the three indoor datasets.In this scene, LIO-SAM failed to finish the localization, and FAST-LIO2 also suffered from severe drift.Our algorithm achieved the lowest odometry drift and successfully corrected the drift through loop closure detection, as shown in Figure 9.In Data 2, our algorithm and LIO-SAM achieved similar localization accuracy, and FAST-LIO2 was slightly worse because it has no loop closure detection.However, the accuracy of our pure odometry method was comparable to LIO-SAM, which validates the effectiveness of our algorithm.In Data 3, the accuracy of all algorithms was at the same level, because the scene was free of occlusions and full of plane features.It is worth noting that the height of the roof in this scene was about 15 m, but our device completely acquired the point cloud of the entire scene with only one LiDAR.This reveals the advantage of the spinning actuated LiDAR system.

Evaluation in Outdoor Environments
In the outdoor test, the map voxel size in Spin-LOAM was changed to 0.4 m, and the voxel size in FAST-LIO2 and LIO-SAM was set according to the default value.Figure 10 gives an overview of the datasets collected in the outdoor environments.Data 5 was collected around a building; Data 6 was collected on a large ring road.Data 7 was collected in a residential area.Data 8 was collected using a drone on a construction site.To quantitatively compare the performance of the algorithms, we used the GNSS RTK trajectories as the ground truth and computed the absolute trajectory error (ATE) of the trajectories.Table 2 shows that our algorithm achieved the best accuracy in the outdoor environments.FAST-LIO2 also performed very well in the experiments with only significant drift in Data 6 and completed all tests as our algorithm.However, LIO-SAM failed to finish the localization in Data 6 and Data 8.The reason for its failure in Data 6 is that it directly uses the ICP algorithm for loop closure.This strategy is not suitable when large drift occurs, and the accumulated errors in the large ring road make the LIO-SAM algorithm fail.After turning off the loop closure, the accuracy of LIO-SAM in Data 6 was 0.4908 m. Figure 11 gives a detailed comparison of the point cloud at the road junction.It can be seen that, even without the loop closure, our algorithm still maintained high accuracy after walking through a long loop.In Data 8, the drone performed an aggressive motion to test the robustness of the algorithm, in which the maximum angular velocity was over 780°/s and the maximum linear velocity was over 6.5 m/s.This test proved that our tightly coupled algorithm can work when aggressive motion occurs.
An ablation study was conducted to further analyze the contribution of our proposed adaptive scan accumulation method."Spin-LOAM (wo-asa)" in Table 2 represents our algorithm without adaptive scan accumulation and loop closure.Comparing it with "Spin-LOAM (odom)", the results showed that our method can improve the accuracy, especially in complex environments such as Data 6 and 7.This is because the FOV of the LiDAR is limited; it is often occluded by trees or can only scan the ground in these environments, which will lead to an uneven distribution of features in a single scan.Our method can alleviate this problem during scan-to-map registration, which can help to improve the accuracy and robustness of registration.

Runtime Analysis
We selected indoor data and an outdoor data, respectively, for the algorithm performance evaluation.The processing time per scan of each algorithm is shown in Figure 12.FAST-LIO2 was much faster than LIO-SAM and our algorithm because it is a filter-based algorithm and does not require feature extraction.The average time cost per scan of our algorithm was 63.2 ms indoors and 43.8 ms outdoors.The computation time was more for the indoor data because the map voxel size had a significant impact on the performance.The average time cost for the loop closure was 808.4 ms indoors and 328.7 ms outdoors.Since the loop closure was performed by a separate thread in the background, it did not affect the real-time performance of our algorithm.

Conclusions
In this paper, we presented a tightly coupled laser-inertial SLAM algorithm specifically designed for a spinning actuated LiDAR system.In the front-end, to mitigate the influence of the unstable spatial distribution of the point cloud caused by the continuously rotating LiDAR, an adaptive scan accumulation method based on point cloud distribution inspection was adopted.In the back-end, a voxel-grid-based loop closure detection method was used to reduce the drift.We use the previous scan point cloud to assist in eliminating errors in the loop closure detection results.The experimental results demonstrated that our algorithm achieves high-precision localization results in various complex indoor and outdoor environments.We are committed to further refining and improving our algorithm, with a focus on improving its robustness in more extreme environmental conditions.One potential avenue for improvement is the integration of semantic information from the point cloud, which will aid in loop closure detection and removal of dynamic objects.

Figure 1 .
Figure 1.The mechanical structure of the hardware device.

Figure 2 .Figure 3 .
Figure 2. System overview of the proposed laser-inertial SLAM algorithm for the spinning actuated LiDAR system.
and T B k are the poses of the body frame obtained by the linear interpolation of the predicted pose, and t k is the timestamp of the latest point in the point cloud scan C.

Figure 4 .
Figure 4.The illustration of the edge point extraction.As described in Steps 3 and 4, there are two types of edge points.The first type of edge point is at the end of the wall, and the second type of edge point is at the corners.These edge points can be identified from unstructured points by analyzing the properties of the points in their neighborhoods.

Figure 5 .
Figure 5. Feature point extraction result.The green points are the raw point cloud.The edge points extracted by the standard LOAM-based extraction method are shown in blue, and the edge points extracted by our method are shown in red.It can be seen that many blue points are located in the vegetation, and these noises will affect the accuracy of the alignment.

Figure 6 .
Figure 6.Illustration of multi-scan point clouds' accumulation.The first scan point cloud is rendered in green and only contains a few points in the horizontal plane, which will lead to an inaccurate registration result, especially in the Z direction.This problem can be solved by merging it with the next scan point cloud.

p
and J d e p , n is the number of matched points, and b ∈ R n×1 is constructed by stacking distance residuals d p and d e .The solution to Equation (11) is given by x σ d e = J d e p D p J d e p T + J d e l D l J d e l T

.Algorithm 1 :
The threshold d th was set to 5 cm in the experiment.Loop closure detection.Input: scan C k , submap S l Output: loop closure residuals R loop // Obtain coarse registration result and score 1 T

Figure 7 .
Figure 7. Overview of the experimental sensor platform.(a) The sensor platform is carried by a backpack.(b) The sensor platform is carried by a UAV.

Figure 8 .
Figure 8. Mapping results of our algorithm in indoor environments.The trajectories are represented by the red lines in the figure.

9 .
Comparison of the mapping results of the stairs in Data 1.

Figure 10 .
Figure 10.Mapping results of our algorithm in outdoor environments.The ground truth trajectories generated by GNSS are represented by the black lines.

Figure 11 .
Figure 11.Point cloud maps generated by different algorithms in Data 6.

Figure 12 .
Figure 12.The evaluation results of the time cost per scan for each algorithm.The selected indoor data and outdoor data are Data 2 and Data 7, respectively, as they have the longest trajectories.
Bold font represents better results.