LiDAR-Inertial-GNSS Fusion Positioning System in Urban Environment: Local Accurate Registration and Global Drift-Free

: Aiming at the insufﬁcient accuracy and accumulated error of the point cloud registration of LiDAR-inertial odometry (LIO) in an urban environment, we propose a LiDAR-inertial-GNSS fusion positioning algorithm based on voxelized accurate registration. Firstly, a voxelized point cloud downsampling method based on curvature segmentation is proposed. Rough classiﬁcation is carried out by the curvature threshold, and the voxelized point cloud downsampling is performed using HashMap instead of a random sample consensus algorithm. Secondly, a point cloud registration model based on the nearest neighbors of the point and neighborhood point sets is constructed. Furthermore, an iterative termination threshold is set to reduce the probability of the local optimal solution. The registration time of a single frame point cloud is increased by an order of magnitude. Finally, we propose a LIO-GNSS fusion positioning model based on graph optimization that uses GNSS observations weighted by conﬁdence to globally correct local drift. The experimental results show that the average root mean square error of the absolute trajectory error of our algorithm is 1.58m on average in a large-scale outdoor environment, which is approximately 83.5% higher than that of similar algorithms. It is fully proved that our algorithm can realize a more continuous and accurate position and attitude and map reconstruction in urban environments.


Introduction
For any autonomous robot system, such as unmanned aerial vehicles and autonomous vehicles, the accurate and robust localization of a mobile carrier is one of the fundamental technologies [1]. Traditionally, the integrated navigation and positioning technology based on the global navigation satellite system (GNSS) and inertial navigation system (INS) is usually regarded as a reliable method to achieve high-accuracy positioning [2]. However, in complex urban environments, there are a large number of GNSS multipath or rejection areas due to the blockage of GNSS signals by urban objects such as tall buildings, tunnels and street trees. As a result, the integrated positioning method based on GNSS/INS is not effective in achieving a continuous and robust positioning of targets in large urban environments. In summary, there is an urgent need to upgrade and expand the traditional positioning techniques by introducing heterogeneous and complementary measurement information from other sensors.
In recent years, the multi-sensor fusion positioning technology based on simultaneous localization and mapping (SLAM) has received extensive attention from related enterprises and researchers [3]. It can not only make use of the excellent characteristics of cameras, LiDAR and other sensors, including the independence from environmental occlusion and signal refraction in complex areas, but can also effectively make up for the signal lock-out defect of GNSS signals in the parking lot or tunnel area. Moreover, incremental map reconstruction can be achieved by sensing the external environment. Depending on the primary sensor, SLAM-based multi-sensor fusion positioning solutions can be divided into vision-based SLAM and LiDAR-based SLAM [4]. Due to the superiority of the sensors, the solution of LiDAR-based SLAM allows for a higher frequency and more accurate acquisition of spatial fingerprint information, thus achieving a more accurate positioning than vision-based SLAM [5][6][7]. Secondly, analyzed at the algorithm level, LiDAR odometry is more lightweight in processing environmental features than visual odometry and more suitable for vehicle-mounted platforms with limited computational resources [8,9]. Therefore, the LiDAR-inertial odometry (LIO)-based SLAM scheme is widely used to obtain 3D geographic information of a complex environment, as well as carrier positioning and map reconstruction.
Throughout the development of the LiDAR-based SLAM, it can be seen that the registration of the point cloud of LiDAR is a key step in the pose estimation of a mobile carrier. It strictly affects the pose estimation and the map reconstruction results. The commonly used point cloud registration methods include normal distribution transform (NDT) [10], iterative closest point (ICP) [11], generalized iterative closest point (GICP) [12] and other improved algorithms [13][14][15][16]. The core of NDT algorithm is used to take the probability density function of the source point cloud and the target point cloud as the objective function; then, it uses a nonlinear optimization method to minimize the probability density between them to obtain the optimal solution. Andreasson et al. [17] avoids an explicit nearest neighbor search by establishing segmented continuous and differentiable probability distributions, and the registration speed is effectively improved. Although the real-time performance is better, the covariance matrix needs to be constructed at multiple points, which has a low robustness in the sparse area of the point cloud. Caballero et al. [18] proposed an improved NDT algorithm that was used to model the alignment problem as a distance field. The optimization equation is constructed by using the distance between the feature points of the current frame and the prior map, which improves the speed by an order of magnitude. However, the robustness of the localization algorithm is not guaranteed for unknown sections where the priori map is missing or unreliable [19].
As another method of point cloud registration, the ICP algorithm has a higher positioning accuracy than NDT, but it needs to search for the nearest neighbor again and obtain the transformation matrix in each iteration process, so the calculation efficiency needs to be improved. Koide et al. [20] proposed a generalized iterative nearest point algorithm that used a Gaussian probability model to fit the distribution of the point cloud to reduce the computational complexity. However, its accuracy is still limited by the maximum number of iterations. In addition, the algorithm is heavily influenced by the observation noise and the accuracy of the initial positional transformation matrix, and there is a risk of the algorithm falling into local minima. In order to break out of the logical limitation of being limited to local optimal solutions, Yang et al. [21] proposed Go-ICP, a branch-and-bound scheme to impose domain restrictions on the objective function of rigid alignment. This processing reduced the abnormal influence of the local minimum, and made the registration result of the point cloud approach to the global optimal solution. In 2021, Pan et al. [22] proposed MULLS-ICP, which uses an improved ICP algorithm based on double-threshold filtering and multi-scale linear least squares to realize the registration between the current frame and local sub-map, but the high computational cost of multiple filtering is difficult to adapt to the vehicle platform with limited computational resources. To sum up, on the basis of reducing the calculation cost, a high-precision real-time point cloud registration algorithm suitable for a vehicle platform still needs to be investigated.
In addition, as a local sensor integrator, the LIO has a cumulative offset between its local map and the global map when it performs a positional estimation of the current frame, which largely limits the positioning accuracy of the LIO position building scheme in large outdoor environments. Fortunately, the global observation information from GNSS can provide a credible global constraint correction for LIO [23]. Conversely, LIO systems can also compensate for the limitations of GNSS in terms of continuous precise positioning due to multipath effects and non-line-of-sight (NLOS) problems. Therefore, LIO-GNSS fusion positioning technology provides a feasible technical scheme for realizing globally weak drift and locally accurate positioning and mapping targets.
The mainstream LIO-GNSS fusion algorithms can be divided into two categories, filter-based methods and optimization-based methods, based on the method of sensor measurement data fusion. Li et al. [24] used the filter-based method as the integration strategy. They use the extended Kalman filter to realize LIO-GNSS tight coupling, but did not set up an anomaly detection mechanism, so it was prone to the dispersion of the positional estimates in GNSS multipath regions or point cloud degradation regions. To resolve this issue, Li et al. [25] uses an edge fault-tolerant mechanism to improve the robustness of the algorithm in case of single-sensor failure. However, it weakens the linearization error at the cost of increasing the amount of computation, which is contrary to the lightweight principle of large outdoor scenes. As another fusion method, the optimization-based method uses multiple iterations to approach the optimal solution, which can effectively handle such non-linear heterogeneous data fusion problems. Soloviev et al. [26] proposed an optimization-based LIO-GNSS scheme, but only the horizontal components of GNSS measurements were used to optimize the LIO pose estimation results, with low utilization of the measurement information. Shan et al. [27] puts forward an optimization framework that introduces 3D GNSS measurement factors to assist LIO, but the measurement information of a single key frame is redundant, and the reliability of GNSS factors added when driving to the GNSS multipath area is poor. Sun et al. [28] proposed a GNSS corner factor to constrain the local pose, but it does not consider the shortage of corners on straight road sections, so its application in a large-scale complex outdoor environment is limited.
From the above analysis, it can be seen that the research points of the LIO-GNSS fusion scheme are as follows: 1. Realizing real-time and high-precision point cloud alignment based on compressed computational costs. 2. On the basis of making full use of GNSS measurement information, global cumulative error correction of LIO is carried out by GNSS.
To address the above issues, in this contribution, we propose a LiDAR-inertial-GNSS fusion positioning system based on voxelized accurate registration. Firstly, a voxelized point cloud downsampling method based on curvature segmentation is proposed. Rough classification is carried out by a curvature threshold, and the voxelized point cloud downsampling is performed using HashMap instead of the random sample consensus algorithm. Therefore, the spatial distribution attributes of the source point cloud are retained to a greater extent. Secondly, a point cloud registration model based on the nearest neighbors of the point and neighborhood point sets is constructed. Thirdly, an optimization-based method is used to build a higher-order Markov model based on sliding windows, and a GNSS factor and loop factor are introduced into the factor graph to constrain LIO globally. Finally, on this basis, a GNSS residual construction method based on the GNSS reliability weight is proposed to make full use of GNSS measurement information. Therefore, the goal of positioning and mapping with a light weight, high precision and high applicability in a complex urban environment can be achieved.

System Overview
The proposed algorithm framework is shown in Figure 1. The main functions of each module are as follows.
The front-end of the system is mainly used to preprocess IMU observations and LiDAR original point cloud sequences, and to optimize the generation of local maps by inter-frame matching. The LiDAR raw point cloud sequence is clustered and segmented by a breadthfirst-search combined with the Euclidean angle threshold, and then edge and plane feature point clouds are extracted. These two types of feature clouds are downsampled for point cloud alignment, and the local inter-frame matching is optimized using the IMU preintegration as the initial pose estimate. Finally, the LIO local pose estimates are used to pre-process the GNSS global observations, including the temporal interpolation alignment of GNSS and LIO local observations and coordinate system alignment, so as to achieve the space-time synchronization among sensors.
achieve the space-time synchronization among sensors.
The back-end mainly uses the residuals of pose estimates of each sensor to optimize the map. The residual factors from local sensors include the IMU pre-integration and Li-DAR observation residual, whereas the global residual factors include the GNSS observation residual and loop residual. It should be noted that the global residual factors are added only when their existence is detected, and, when there is no global residual factor, the system only performs local position, such as when the carrier is travelling in a flat and straight tunnel environment. When global corrections are available, the obtained global positioning results are used to update the local pose estimates in the sliding window to obtain the best pose estimates with local accurate registration and global drift-free.

Point Cloud Voxelization Downsampling and Alignment
The accuracy of the registration of the environmental point cloud extracted by LiDAR strictly affects the result of the subsequent local pose estimation. Therefore, the processing steps of the front-end point cloud of the system need to be described in detail. This paper mainly involves the improved point cloud downsampling method and registration method.

Voxelized Downsampling Based on Curvature Segmentation
This paper presents a voxelized downsampling method based on curvature segmentation. Given a set of raw point cloud sequences collected by LiDAR, all points in the raw point cloud sequences are traversed and coarse clustering is performed using a breadthfirst algorithm. Furthermore, the geometric angle threshold based on Euclidean distance is used to finely segment the point cloud clusters with similar depth. Let the scanning center of LiDAR be O and the two adjacent edge points a p and b p in the point cloud cluster with depths a . Let the number of point clouds in The back-end mainly uses the residuals of pose estimates of each sensor to optimize the map. The residual factors from local sensors include the IMU pre-integration and LiDAR observation residual, whereas the global residual factors include the GNSS observation residual and loop residual. It should be noted that the global residual factors are added only when their existence is detected, and, when there is no global residual factor, the system only performs local position, such as when the carrier is travelling in a flat and straight tunnel environment. When global corrections are available, the obtained global positioning results are used to update the local pose estimates in the sliding window to obtain the best pose estimates with local accurate registration and global drift-free.

Point Cloud Voxelization Downsampling and Alignment
The accuracy of the registration of the environmental point cloud extracted by LiDAR strictly affects the result of the subsequent local pose estimation. Therefore, the processing steps of the front-end point cloud of the system need to be described in detail. This paper mainly involves the improved point cloud downsampling method and registration method.

Voxelized Downsampling Based on Curvature Segmentation
This paper presents a voxelized downsampling method based on curvature segmentation. Given a set of raw point cloud sequences collected by LiDAR, all points in the raw point cloud sequences are traversed and coarse clustering is performed using a breadth-first algorithm. Furthermore, the geometric angle threshold based on Euclidean distance is used to finely segment the point cloud clusters with similar depth. Let the scanning center of LiDAR be O and the two adjacent edge points p a and p b in the point cloud cluster with depths d a and d b , respectively (d a > d b ). Let the number of point clouds in the point cloud cluster where point p i is located be M. Then, the roughness of point cloud p i is: Set the roughness threshold as c, then traverse M. We classify the points of c < c as the set of edge feature points, classify the points of c > c as the set of plane feature points and perform downsampling operations on them, respectively.
This method is mainly used in the feature extraction step of LiDAR odometry [4]; we extend it to the downsampling step. This means that, for any application where downsampling of point clouds is required, such as artefact inspection, the method can better restore the spatial distribution properties of point clouds by downsampling in clusters.
Next, this paper proposes a point cloud downsampling strategy based on HashMap, instead of the random sample consensus (RANSAC), so that the downsampling result of the point cloud is closer to the approximate center of gravity of voxels. Let the coordinate of a feature point in a set of point cloud sequences in the voxel space be p(x, y, z). If the voxel grid size is r, the dimension of the voxel grid in the x direction is D x = (x max − x min )/r, and the index of p in the x direction within the voxel grid is h x = (x − x min )/r. The same applies to the y and z directions.
After obtaining the 3D index of feature points in the voxel space, if the random sorting strategy of [11] is adopted, the sorting complexity will be O((m + n) * log(m + n)), which has a negative impact on the down-sampling time. Therefore, this paper uses the hash function to sort the index of feature points quickly and map them to N containers (N = 80). The hash function is: To avoid hash conflicts, set the conflict detection conditions as follows: Once the hash conflict is detected, the index value in the current container is output and the container is emptied, and the new index value is put into the container.
To sum up, the main improvement of this section lies in extending the curvature segmentation step originally used for feature extraction to the downsampling step, and using hash mapping instead of the random sampling method for point cloud sampling. For the LiDAR odometer, using the clustering line and surface features again after the feature extraction step can improve the accuracy of downsampling single-frame or discontinuous point clouds at a weak time cost, thus providing more accurate point cloud distribution results for the pose estimation step between consecutive frames. In addition, using HashMap to downsample can further improve the sampling efficiency, and the time consumption of quadratic curvature segmentation is almost negligible. For other applications that need to downsample point clouds, the point cloud clustering method based on curvature segmentation can restore the spatial distribution of point clouds more accurately, and the benefits of this method are extensive and obvious.
The results and time consumption of the improved point cloud downsampling process are shown in Figure 2 and Table 1. Cloud number M = 112624, the line feature extraction threshold is 1, the surface feature extraction threshold is 0.1 and r = 0.3. It can be seen from Figure 2c that the present method has a clearer reduction in the spatial distribution of diagonal lines within a rectangular point cloud. Therefore, it can be proved that our method can retain the texture feature information of the source point cloud to a greater extent, and the accuracy and real-time performance of the downsampling results can be improved.

Voxelized Point Cloud Registration
The purpose of point cloud registration is to update the rigid frame transformation of a moving carrier by comparing two consecutive frames of point clouds or similar point clouds detected by a loopback to solve for the carrier's pose. Traditional LIO usually uses ICP to realize the precise registration of point clouds. The ICP can be briefly described as = , and the optimal solution is achieved through multiple iterations. However, an unreasonable initial position selection will make ICP fall into the misunderstanding of the local optimal solution, and the calculation resource consumption of the single-point nearest neighbor search is large. In view of the defects of the ICP algorithm, this paper utilizes a method based on the distribution of feature points in voxels, as shown in Figure 3. As shown in Figure 3. the problem of constructing the nearest neighbor model of a point pair by using a tree diagram is transformed into constructing the nearest neighbor

Voxelized Point Cloud Registration
The purpose of point cloud registration is to update the rigid frame transformation of a moving carrier by comparing two consecutive frames of point clouds or similar point clouds detected by a loopback to solve for the carrier's pose. Traditional LIO usually uses ICP to realize the precise registration of point clouds. The ICP can be briefly described as follows: given a set of source point cloud A = {a 1 , a 2 , . . . , a n } and target point cloud B = {b 1 , b 2 , . . . , b n }, the nearest neighbor search of KDTree is used to obtain the inter-frame pose transformation relationship b i = Ta i , and the optimal solution is achieved through multiple iterations. However, an unreasonable initial position selection will make ICP fall into the misunderstanding of the local optimal solution, and the calculation resource consumption of the single-point nearest neighbor search is large. In view of the defects of the ICP algorithm, this paper utilizes a method based on the distribution of feature points in voxels, as shown in Figure 3.

Voxelized Point Cloud Registration
The purpose of point cloud registration is to update the rigid frame transformation of a moving carrier by comparing two consecutive frames of point clouds or similar point clouds detected by a loopback to solve for the carrier's pose. Traditional LIO usually uses ICP to realize the precise registration of point clouds. The ICP can be briefly described as = , and the optimal solution is achieved through multiple iterations. However, an unreasonable initial position selection will make ICP fall into the misunderstanding of the local optimal solution, and the calculation resource consumption of the single-point nearest neighbor search is large. In view of the defects of the ICP algorithm, this paper utilizes a method based on the distribution of feature points in voxels, as shown in Figure 3. As shown in Figure 3. the problem of constructing the nearest neighbor model of a point pair by using a tree diagram is transformed into constructing the nearest neighbor As shown in Figure 3. the problem of constructing the nearest neighbor model of a point pair by using a tree diagram is transformed into constructing the nearest neighbor model of a point and a neighborhood point set. Firstly, the two sets of point cloud sequences are approximated as Gaussian distributions, i.e., i are covariance matrices of two sets of point cloud sequences, respectively. Let the distance between a pair of corresponding points between the target point cloud and the source point cloud be: Let the neighborhood point set of a i be B a i = b j a i − b j < λ , where λ is the neighborhood judgment threshold. Thus, the distance between the extended point and the neighborhood point set is:d , the rigid body transformation error e i is calculated as: In this way, the smoothing of all neighboring point clouds in the neighborhood of because e i is a highdimensional Gaussian distribution, its probability density function expansion form is: The negative logarithmic form of Equation (7) is: Solving the inter-frame pose transformation matrix T by maximum likelihood method: Furthermore, after introducing the number N i of point clouds in the neighborhood a i , Equation (9) can be written as: In addition to the smoothing of all neighboring point clouds in the neighborhood of a i , an iteration termination threshold ε was established to avoid falling into a blind region of local optima after multiple iterations as follows: where RMSE k+1 and RMSE k are the root mean square error of the previous k + 1 iterations and the previous k iterations, respectively. The iteration is completed when the absolute value of the change in the root mean square error |RMSE k+1 − RMSE k | ≤ ε, or the maximum number of iterations, is reached.

Local Pose Map Structure
Authors should discuss the results and how they can be interpreted from the perspective of previous studies and of the working hypotheses. The findings and their implications should be discussed in the broadest context possible. Future research directions may also be highlighted. The local state vectors in the local coordinate system in which the LiDAR and IMU are located are given as follows: where x bi denotes the state quantity after pre-integration of the ith IMU at t k , including position p L bi , rotation q L bi , speed v L bi and IMU bias b a , b g . d e k is the distance from the LiDAR feature point at t k−1 to the matching edge feature at t k , and d p k is the distance from the feature point at t k−1 to the matching planar feature at t k .
From this, the Gauss-Newton method can be used instead of the fastest gradient descent method used in [27] to minimize all cost functions so as to reduce the number of iterations for rapid convergence to a locally optimal estimate. The local optimization function is constructed as follows: where ∑ d e k + ∑ d p k is used to solve the carrier pose x LiDAR tk in the local coordinate system of LiDAR at time t k . r B ẑ i−1 i , X and Σ b are IMU measurement residuals and covariance matrices, respectively. The meanings of the terms are described below.
be the IMU pre-integration calculation value between the ith and i + 1th LiDAR key frames. Details of the derivation of the IMU pre-integration are presented in Appendix A. ∆t i is the time interval between the two LiDAR key frames, and the spatial transformation matrix from the IMU coordinate system to the LiDAR coordinate system in ith frame is represented by R b i L . The IMU residual can be obtained as follows: where the symbol [·] xyz represents extracting the real part of the quaternion used to calculate the rotation state error, and ⊗ represents the quaternion multiplication.
After the pose estimation of the previous key frame is completed, the IMU acceleration bias and gyroscope bigotry will be updated, the update amounts are set as δ b bi a and δ b bi g and the pre-integration calculation value at this time is updated as follows: The feature point cloud extracted by LiDAR can be divided into two types: line features and surface features. The LiDAR residuals of the two types need to be constructed separately and then summed to obtain the total LiDAR residuals. Details of the specific derivation of LiDAR residuals are presented in Appendix B. Figure 4 shows the schematic diagram of LiDAR residual construction.

LiDAR Factor
The feature point cloud extracted by LiDAR can be divided into two types: line features and surface features. The LiDAR residuals of the two types need to be constructed separately and then summed to obtain the total LiDAR residuals. Details of the specific derivation of LiDAR residuals are presented in Appendix B. Figure 4 shows the schematic diagram of LiDAR residual construction. , Similarly, if the surface features that match it in the k th scan are represented as

Spatial Unification of Multi-Sensor Poses
Constructing the time-space correlation of each sensor is a fundamental task in multi-sensor fusion optimization. For this system, it is necessary to spatially unify the positional estimation results of LiDAR and IMU in the local map with the GNSS measurements in the global map. Therefore, the spatial unification strategy of the multi-sensor pose involved in this paper is shown in Figure 5. As shown in Figure 4, let a feature point obtained in the k + 1th scan have the coordinates of X L (k+1,o) in the LiDAR coordinate system, and the coordinates of two end points of the line features matched with it in the kth scan are X L (k,a) and X L (k,b) . The residual error of the line features can be expressed by the point-to-line distance: Similarly, if the surface features that match it in the kth scan are represented as X L (k,c) , X L (k,d) and X L (k, f ) , then the surface feature residual can be represented by the point-to-surface distance:

Spatial Unification of Multi-Sensor Poses
Constructing the time-space correlation of each sensor is a fundamental task in multisensor fusion optimization. For this system, it is necessary to spatially unify the positional estimation results of LiDAR and IMU in the local map with the GNSS measurements in the global map. Therefore, the spatial unification strategy of the multi-sensor pose involved in this paper is shown in Figure 5.  As shown in Figure 5, W I MU GNSS is the external parameter conversion matrix from IMU to GNSS and W LiDAR I MU is the external parameter conversion matrix from LiDAR to IMU. Since the hardware is fixed to the mobile carrier, both are calibrated to a constant value. The left figure shows the positional conversion between LiDAR and IMU in the local coordinate system, whereas the right figure shows the multi-sensor positional spatial unification from the local to the global coordinate system. Here, it is necessary to introduce the spatial transformation parameters F L G (including translation p L G and rotation q L G ) to correlate the two positional spaces. The spatial unification of the multi-sensor pose at the moment of t can be expressed as: where the initial value of F L G is set as the unit matrix. Every time the GNSS factor is added to solve the global optimum, the value of F L G at the next moment will be updated, thus correcting the cumulative offset between the local and global coordinate systems.

Global Pose Map Structure
Global pose map construction can be regarded as a nonlinear optimization problem; that is, the nonlinear optimization of the state vector in the sliding window. Different from the factor graph method adopted in [27], this paper adopts the graph optimization method to directly construct the residual block in the original pose graph structure for nonlinear optimization, and only optimizes the key frames in the sliding window. However, the factor graph based on GTSAM [29] needs to construct the optimization problem into a new graph corresponding to the original pose graph, with the optimization variables as the vertices and error terms as the edges. The complicated constraint relationship among the vertices is more favorable toward the optimization accuracy. However, once a new key frame is detected, all of its associated constraint nodes will be updated, which is complicated and takes too long in the engineering field. Therefore, in order to meet the requirements of the lightweight and real-time performance of the vehicle platform, we choose not to build a new constraint-related Bayesian network, but to construct the residual error and nonlinear optimization in the original pose map structure. The global pose optimization framework proposed in this paper is shown in Figure 6. The global optimization function is constructed as follows: where ρ is the GNSS confidence level expressed by the covariance of the error in the The global optimization function is constructed as follows: where ρ is the GNSS confidence level expressed by the covariance of the error in the GNSS observations obtained by the pseudo-range single point positioning (SPP) algorithm solution. T W k is the pose transformation matrix between the current global point cloud P k and the local point cloud S k derived from the inter-frame local matching. The specific meaning of each sensor cost function in the formula are as follows.

LIO Factor
According to Section 4.1, the position p L t and rotation q L t of the carrier in the local coordinate system at the moment t can be obtained. Therefore, the LIO local residual factor can be constructed as follows: where the symbol represents the quaternion subtraction.

GNSS Factor
Set the time interval between two frames of GNSS observations as ∆t and realize the time alignment with LIO pose estimation by interpolation. Cubic spline interpolation is used for position interpolation and spherical linear interpolation is used for quaternion interpolation. Now, given the GNSS measurement p GNSS t in the ENU coordinate system and the LIO positional observations p G t in the global coordinate system, the GNSS residual factor is expressed as follows: When the carrier moves to the GNSS signal confidence region, in order to fully and reliably utilize the GNSS observations, the GNSS factor is added with the GNSS confidence as the weight. The GNSS confidence is determined by the number of visible and effective GNSS satellites. After GNSS participates in the global pose estimation, it will update the pose conversion parameter F L G between the local coordinate system and the global coordinate system. This ensures that, even if the mobile carrier enters a GNSS-rejected environment (e.g., indoor car parks and tunnels), our algorithm can provide a more accurate initial observation after GNSS correction.

Loop Factor
Considering the possible overlap of the moving vehicle driving areas, it is necessary to add a loop detection link to establish possible loop constraints between non-adjacent frames. According to Equation (5), the loop factors can be constructed as follows: Using the optimized point cloud registration method in Section 3.2, we optimize and correct the historical trajectory through the registration between the point cloud of the prior local map and the current global point cloud. This method ensures that the positional estimates converge to the global optimum result.

Point Cloud Registration Results
To verify the superiority of the point cloud registration algorithm used in this paper, we compared the registration results of the ICP algorithm used in traditional LiDAR odometry with our algorithm. The comparison results are shown in Figures 7-9. k N   Using the optimized point cloud registration method in Section 3.2, w correct the historical trajectory through the registration between the poin prior local map and the current global point cloud. This method ensures tha estimates converge to the global optimum result.

Point Cloud Registration Results
To verify the superiority of the point cloud registration algorithm used we compared the registration results of the ICP algorithm used in trad odometry with our algorithm. The comparison results are shown in Figure   ( Using the optimized point cloud registration method in Section 3.2, we optimize and correct the historical trajectory through the registration between the point cloud of the prior local map and the current global point cloud. This method ensures that the positional estimates converge to the global optimum result.

Point Cloud Registration Results
To verify the superiority of the point cloud registration algorithm used in this paper, we compared the registration results of the ICP algorithm used in traditional LiDAR odometry with our algorithm. The comparison results are shown in Figures 7-9. Compared with the typical indoor environment, for mobile carriers in complex urban environments, the angles and translations between the source and target point clouds during the continuous frame and the loopback detection may be larger. As shown in Figure  7, when the initial position is unreasonable, the registration results of the ICP algorithm cannot fully approach the global optimal solution, which is detrimental to both the local pose estimation and loopback correction of the mobile carrier. However, the registration accuracy of our algorithm is not affected by the large positional transformation of the vehicle platform. Compared with the traditional ICP algorithm, the registration result of our algorithm suited the needs of the vehicle platform better.
Furthermore, in order to avoid the contingency of the registered objects, we made a comparative experiment on the source point clouds with different rotation angles and translation distances, and quantitatively compared the registration accuracy and time consumption of each algorithm. The test results are shown in Table 2 and Table 3. Compared with the typical indoor environment, for mobile carriers in complex urban environments, the angles and translations between the source and target point clouds during the continuous frame and the loopback detection may be larger. As shown in Figure 7, when the initial position is unreasonable, the registration results of the ICP algorithm cannot fully approach the global optimal solution, which is detrimental to both the local pose estimation and loopback correction of the mobile carrier. However, the registration accuracy of our algorithm is not affected by the large positional transformation of the vehicle platform. Compared with the traditional ICP algorithm, the registration result of our algorithm suited the needs of the vehicle platform better. Furthermore, in order to avoid the contingency of the registered objects, we made a comparative experiment on the source point clouds with different rotation angles and translation distances, and quantitatively compared the registration accuracy and time consumption of each algorithm. The test results are shown in Tables 2 and 3.   Tables 2 and 3, it can be seen that the point cloud registration algorithm is more sensitive to rotation, which means that, if the vehicle rotates at a large angle in the city, the point cloud registration between consecutive frames is perhaps less reliable. It may even lead to a failure of the pose estimation, as has been demonstrated in [5]. However, as the rotation angle increases, it can be seen from Table 2 that the registration accuracy of our algorithm decreases the least, and, at the extreme 90 • rotation angle, the accuracy is still more than four times better than the other algorithms, with a root mean square error of approximately 2.67116 m. On the other hand, for large translations (5 m) between the source and target point clouds, our algorithm also shows an excellent registration accuracy, with a root mean square error of approximately 0.03047 m. It is worth noting that, in the aspect of single-point cloud registration, the registration time of our algorithm is increased by approximately one order of magnitude compared with others. To sum up, it is sufficient to verify the superiority of the proposed registration algorithm in terms of compressing time and improving the registration accuracy.

Positioning Accuracy
In this paper, the absolute trajectory error (ATE) was selected as the evaluation index of SLAM system positioning accuracy so as to directly reflect the difference between the global position estimation of the moving carrier and the ground truth. The absolute trajectory error is calculated as follows.
where A i is the absolute trajectory error of the SLAM system in the ith frame, g i and p i are the ground truth and the estimated pose, respectively, and S is the transformation matrix between the ground truth and the estimated pose. In this paper, the mean error (ATE_ME) and root mean square error (ATE_RMSE) of the absolute trajectory error were selected as evaluation criterion.

Public Dataset
To verify the positioning accuracy of the fusion algorithm in different outdoor environments, the KITTI_RAW dataset [30], which includes a variety of outdoor scenes, was used to evaluate the localization accuracy of the fusion algorithm and to compare it with other similar advanced algorithms. The experimental data acquisition platform is as follows: LiDAR point cloud data are acquired by Velodyne HDL~64 line LiDAR, with horizontal field angle of view of 360 • , vertical field angle range of (−24.8 • , +2 • ), horizontal resolution range of (0.08 • , 0.35 • ), vertical angle resolution of 0.4 • and scanning frequency of 10 Hz, which can meet the requirements of in-vehicle point cloud data acquisition. The GPS/IMU integrated system adopts OXTS RT3003, with a GPS output frequency of 1 Hz/s and an IMU output frequency of 100 Hz. The ground truth is provided by a high-precision integrated navigation system.
Four different outdoor scenarios were used to validate the performance of the fusion algorithm, including urban environments, open area, highway and forest road. The voxel grid size of the fusion algorithm was set to 0.3 × 0.2 × 0.3, the maximum iteration threshold was set to 30 and the iteration termination tolerance threshold was set to 1 × 10 −8 , so as to meet the real-time requirements and ensure the stable number of feature point clouds participating in the matching in sparse areas of outdoor environments. The comparison of the experimental results is shown in Figures 10 and 11 and Table 4.        Figure 9c, it can be seen that the positional outliers estimated by LIO-SAM are reduced by approximately 80% compared with the LIO system. Furthermore, the positional estimation errors of our algorithm are concentrated between (0.68 m, 1.23 m) with very few outliers, which fully demonstrates the superiority of the proposed algorithm in its positioning accuracy in urban environments.

Urban Dataset
To further investigate the extent to which improvements in both the front-end and backend components of the fusion algorithm improve its positioning accuracy, we conducted ablation experiments in a complex environment of GNSS signals. We constructed a system without GNSS global correction (-), a system without smoothed voxelized point cloud registration and loopback correction (#) and a complete system (Proposed), respectively. The experimental environment is the complex reflection area of GNSS in the urban environment. The experimental platform includes: the ground truth, which is provided by NovAtel SPAN-CPT positioning results; the LiDAR point cloud, which is acquired by HDL 32E Velodyne LiDAR, where the horizontal field of view angle is 360 • , the vertical field of view angle range is (−30 • , +10 • ) and the scanning frequency is 10 Hz, which is suitable for in-vehicle point cloud data acquisition; IMU, which is Xsens Mti 10, and the update frequency of the pose is 100 Hz; the GNSS receiver, which is u-blox M8T, and the update frequency is 1 Hz.
Different from the KITTI_RAW dataset, the GNSS confidence parameter in this experiment is not fixed. After solving the raw observation data collected by u-blox with the SPP algorithm, we obtained the GNSS confidence covariance as the GNSS factor weight parameter. This is more in line with the real urban environment, where GNSS reflected and refracted signals interfere with the direct signal superimposed, thus causing the pseudorange and carrier phase observations to deviate from the true value of the direct signal. The experimental results are shown in Figures 12 and 13 and Table 5.      As can be seen from Figure 13a, firstly, due to the accurate registration of the front-end point cloud, the absolute trajectory error of Proposed(-) decreases slightly compared to the pre-improved system. In the initial parking section of the dataset, the traditional ICP algorithm suffers from the problem of over-iterations, and the result is not the global optimal solution. However, the data smoothing processing and the setting of the iteration termination threshold of our algorithm can solve this problem well, providing a better initial value for the positional matching. The absolute trajectory error within the first 25 s drops by approximately 8m compared to LIO-SAM. Secondly, compared to LIO-SAM, which uses GNSS observations directly as global constraints without filtering, we introduced GNSS confidence into the optimization equation. It allows our algorithm to remain unaffected by poor-quality GNSS observations and to maintain a better positioning accuracy in the latter 50 s in areas with dense tall buildings and poor-quality point cloud distribution. In contrast, due to the poor quality of GNSS observations involved in optimization (LIO-SAM) and the low accuracy of LiDAR loop detection as a global constraint (A-LOAM and LeGO-LOAM), all other similar algorithms have a cumulative increase in the absolute trajectory error with steeper slopes. This is extremely detrimental for vehicle-mounted platforms driving in realistic large outdoor environments. From the experimental results, it can be seen that the global optimization link in our complete algorithm can well suppress the local cumulative drift and make the pose estimation result move more towards the global optimal solution.
In summary, driven by a combination of respective front-end and back-end improvements, our complete algorithm achieves a higher positioning accuracy than other comparable algorithms within real urban environments. Furthermore, as a result of the inherent advantage of local sensors not being subject to signal refraction environmental interference, it compensates for positioning outliers arising from multipath effects in traditional GNSS positioning in urban environments. It fully ensures the integrity and reliability of the fusion system's positioning.

Time-Consuming Performance
In this paper, the KITTI 09_30_0033 sequence is randomly selected to verify the real-time performance of our algorithm and the similar algorithms. In this paper, the above-mentioned three similar advanced algorithms are selected as control algorithms to compare with our algorithm, so as to verify the superior real-time performance of this algorithm in three stages: downsampling, point cloud registration and optimization. The experimental results are shown in Figure 14. The time consumption of the point cloud downsampling is shown in Figure 14a. The downsampling process of the three other algorithms uses RANSAC as the core algorithm, but its iterative approximation speed is slow, at approximately (0.5 ms, 1 ms), and the filtering and fitting quality of the depth information is not good. In contrast, our proposed algorithm uses HashMap instead of random sampling, which improves the speed of filtering out similar points in voxels to a certain extent and reduces the time consumption by two to five times compared with the traditional downsampling method. Although the time-consuming ratio of this process is relatively small in typical indoor environments or short-term positioning processes, for vehicles driving in large outdoor environments with complex point cloud environments for long periods of time, the accumulation of tiny instances of time consumption will lead to a cumulative increase in positioning time consumption. Therefore, the time-consuming compression in point cloud downsampling in this paper is beneficial for ensuring real-time vehicle positioning.
The time consumption of the point cloud registration is shown in Figure 14b, which shows that the time taken for the LIO-SAM and A-LOAM point cloud registration step is The time consumption of the point cloud downsampling is shown in Figure 14a. The downsampling process of the three other algorithms uses RANSAC as the core algorithm, but its iterative approximation speed is slow, at approximately (0.5 ms, 1 ms), and the filtering and fitting quality of the depth information is not good. In contrast, our proposed algorithm uses HashMap instead of random sampling, which improves the speed of filtering out similar points in voxels to a certain extent and reduces the time consumption by two to five times compared with the traditional downsampling method. Although the time-consuming ratio of this process is relatively small in typical indoor environments or short-term positioning processes, for vehicles driving in large outdoor environments with complex point cloud environments for long periods of time, the accumulation of tiny instances of time consumption will lead to a cumulative increase in positioning time consumption. Therefore, the time-consuming compression in point cloud downsampling in this paper is beneficial for ensuring real-time vehicle positioning.
The time consumption of the point cloud registration is shown in Figure 14b, which shows that the time taken for the LIO-SAM and A-LOAM point cloud registration step is in the range of (50 ms, 200 ms). By extracting and separating the ground point clouds, LeGO-LOAM can inhibit the time-consuming increase caused by outlier registration due to the interference between non-identical cluster point clouds to a certain extent. The point cloud registration step takes approximately (50 ms, 100 ms). There are two possible factors for the obvious fluctuation of the above algorithm. The first is the fluctuation in the registration time due to the changing distribution of the ambient point cloud. The second is that, according to the experiments in Section 5.1, the rotation angle and displacement between the source and the target point cloud also have a certain influence on time consumption, which is obvious in the urban driving environment, where the speed and driving direction change irregularly. However, the time consumption of our algorithm is stable between (20 ms, 30 ms). The smoothing of the single-to-many distribution of the point cloud sequence greatly reduces the effect of the sparsity of the point cloud distribution on the alignment time, ensuring that the point cloud alignment step is both time-efficient and stable.
The time consumption for the positional optimization step is shown in Figure 14c. The time taken to match the local map to the global map for LIO-SAM and LeGO-LOAM is around (30 ms, 130 ms). Due to the addition of GNSS sensors and the interference of some GNSS observations with low confidence, the total time consumption of LIO-SAM is even higher than that of LeGO-LOAM. However, A-LOAM takes (20 ms, 25 ms). The main reason is that the observation of only one sensor in LiDAR needs to be optimized, and the residual block is directly constructed in the original pose map structure, which reduces the computational burden of the multidimensional factor map. In this algorithm, the optimization method of A-LOAM is used for reference. It can be seen that, although the GNSS sensor is added, the time consumption is still stable at approximately 30 ms. The reason is that the global constraint of GNSS provides a more accurate transformation matrix from the local map to the global map for the fusion system, which makes it easier for the objective function of map matching to converge to the optimal solution. Secondly, we use the Gauss-Newton method instead of the steepest gradient descent method used in [27] to minimize all of the cost functions, so as to reduce the number of iterations to converge quickly to the locally optimal estimate, which is one of the main reasons for the decrease in time consumption. The average total time of each algorithm in a single frame is shown in Table 6. In summary, thanks to the double improvement of this algorithm in the front-end and back-end of our system, the time consumption of all three steps involved is compressed. Although the optimization vector of the GNSS sensor is newly added, it has a better real-time performance than other similar algorithms.

Mapping Results in the Real Urban Environment
As shown in Figure 15, this section shows the comparison of mapping results between our algorithm and similar advanced algorithms. Figure 15a shows the ground truth in the real outdoor environment, which is obtained by NovAtel SPAN-CPT. The vehicle travels for one week from the starting point in the lower right corner and then returns, and the trajectory is almost closed. Figure 15b shows the mapping result of LIO-SAM, and it is clear that the section near the end of the journey deviates significantly from the actual path travelled. The reason can be attributed to the fast displacement of the carrier, which leads to an increased difference in the point cloud clusters captured between the front and back frames, including rotation and displacement, resulting in the point cloud registration in the loopback detection of LIO-SAM being prone to failure and the loopback constraint results not being ideal. In addition, the GNSS constraint strategy adopted by LIO-SAM has a poor global correction effect on local sensors. However, compared with LIO-SAM, our algorithm has an obvious loop detection accuracy and GNSS global constraint effectiveness.

Discussion
SLAM-based multi-sensor fusion positioning technology expands the application field of traditional GNSS-based mapping techniques and makes continuous and reliable positioning in complex urban environments with good/intermittent/rejection GNSS a reality. The superior sensing capability of LIO for natural sources has been demonstrated in the literature [5,6] and others. However, the accuracy limitation of point cloud registration and the local pose drift of LIO limit the application of LIO in large outdoor environments to some extent. This paper focuses on the above two technical bottlenecks faced by the existing LiDAR SLAM and proposes simplified but effective improvement solutions.
First of all, the primary technical bottleneck is how to improve the accuracy of point cloud registration. Koideet al. [20] demonstrated that smoothing the point cloud cluster can improve the fault tolerance of point cloud registration and improve the accuracy of the point cloud from 1.20m to 0.89m. However, from our practical tests, it has been shown that, in long and large turning or translating sections in real environments, if the number of iterations is not limited, it may still fall into the local optimum problem. Therefore, on the basis of constructing a registration equation based on smooth voxelization filtering, we further use the judgment condition of iteration termination for the secondary constraint, so as to reduce the over-fitting problem of local point clouds in a typical environment. According to the registration experiment in Section 5.1 and the positional estimation accuracy experiment in Section 5.2, it can be seen that our registration method has a higher accuracy than the conventional methods. According to Figure 11, it can be seen that accurate point cloud alignment leads to an accurate initial value estimation of the positional attitude, which has a considerable positive effect on the positional estimation of the vehicle platform with a complex moving state.
Secondly, there is the challenge of how to effectively use GNSS measurements and loopback detection mechanisms to converge the local positioning results of LIO to a globally optimal solution. Firstly, the main step of loop detection dependency is point cloud registration. The previous paragraph has specifically analyzed the superiority of our method. Thanks to the positive point cloud registration results, we can reason that our global constraint using loopback detection is superior, as demonstrated in Section 5.2.2. Next, some related research has been carried out on the research of GNSS and LIO fusion positioning. Optimization-based methods have been applied in [31,32] and so on. However, taking [27] for example, in the traditional graph optimization model, the covariance effect of measurements is only used to determine whether to add factors or not, which is As shown in Figure 15c, the mapping trajectory of the algorithm proposed in this paper is basically fitted with the true value of the driving trajectory, and, in the loop road section in the lower right corner, the trajectories passing through the same landmark twice are basically coincident. The reasons are as follows: firstly, smoothing the point cloud clusters can improve the fault tolerance of the point cloud registration between the front and back frames; secondly, the weighted GNSS global constraint can eliminate the GNSS measurements with large observation gross errors, thus achieving superior mapping results.

Discussion
SLAM-based multi-sensor fusion positioning technology expands the application field of traditional GNSS-based mapping techniques and makes continuous and reliable positioning in complex urban environments with good/intermittent/rejection GNSS a reality. The superior sensing capability of LIO for natural sources has been demonstrated in the literature [5,6] and others. However, the accuracy limitation of point cloud registration and the local pose drift of LIO limit the application of LIO in large outdoor environments to some extent. This paper focuses on the above two technical bottlenecks faced by the existing LiDAR SLAM and proposes simplified but effective improvement solutions.
First of all, the primary technical bottleneck is how to improve the accuracy of point cloud registration. Koideet al. [20] demonstrated that smoothing the point cloud cluster can improve the fault tolerance of point cloud registration and improve the accuracy of the point cloud from 1.20m to 0.89m. However, from our practical tests, it has been shown that, in long and large turning or translating sections in real environments, if the number of iterations is not limited, it may still fall into the local optimum problem. Therefore, on the basis of constructing a registration equation based on smooth voxelization filtering, we further use the judgment condition of iteration termination for the secondary constraint, so as to reduce the over-fitting problem of local point clouds in a typical environment. According to the registration experiment in Section 5.1 and the positional estimation accuracy experiment in Section 5.2, it can be seen that our registration method has a higher accuracy than the conventional methods. According to Figure 11, it can be seen that accurate point cloud alignment leads to an accurate initial value estimation of the positional attitude, which has a considerable positive effect on the positional estimation of the vehicle platform with a complex moving state.
Secondly, there is the challenge of how to effectively use GNSS measurements and loopback detection mechanisms to converge the local positioning results of LIO to a globally optimal solution. Firstly, the main step of loop detection dependency is point cloud registration. The previous paragraph has specifically analyzed the superiority of our method. Thanks to the positive point cloud registration results, we can reason that our global constraint using loopback detection is superior, as demonstrated in Section 5.2.2. Next, some related research has been carried out on the research of GNSS and LIO fusion positioning. Optimization-based methods have been applied in [31,32] and so on. However, taking [27] for example, in the traditional graph optimization model, the covariance effect of measurements is only used to determine whether to add factors or not, which is crude for the screening of the measurements, especially for GNSS, a measurement signal that is greatly influenced by the environmental catadioptric surface, where its observation information has not been fully applied. Therefore, in this paper, on the basis of the rough screening of GNSS observations, the covariance of the quantitative measurements is added as a weight to the graph optimization model. It achieves a more adequate and accurate global constraint on the LIO local poses using GNSS observations. From the ablation experiment results in Section 4.3.2, it is evident that the method of the weighted GNSS residual added in this paper achieves a satisfactory positioning accuracy. According to the above results, we can draw a reasonable inference: even when entering an indoor parking lot or tunnel, the accumulated error of LIO in this algorithm will start to accumulate from a lower initial value of drift. In contrast, for the LIO algorithms without GNSS constraints, they already have a large deviation between the local map and the global map before entering the denial environment. Therefore, the cumulative error range of this algorithm is acceptable. Once the GNSS signal is restored, the local error will be corrected within the time alignment interval of 0.1s as set in Section 4.3.2. This provides an accurate and continuous positional estimation for the in-vehicle platform during travel in complex urban environments.

Conclusions
In this paper, a LiDAR-IMU-GNSS fusion positioning algorithm with accurate local alignment and weak global drift is proposed for the high-precision continuous positioning of mobile carriers in complex urban environments.
Firstly, a voxelized point cloud downsampling method based on curvature segmentation is proposed. Rough classification is carried out by a curvature threshold, and the voxelized point cloud downsampling is performed using HashMap instead of RANSAC, so that the spatial feature distribution attributes of the source point cloud, including texture feature information, such as surfaces and curves, are retained to a greater extent.
Secondly, a point cloud registration model based on the nearest neighbors of the point and neighborhood point sets is constructed. Furthermore, an iterative termination threshold is set to reduce the probability of the local optimal solution. This greatly improves the real-time performance of the point cloud registration and can also play a large role in aligning the point cloud between the front and back frames of a fast-moving carrier with large displacement.
Finally, we propose a LIO-GNSS fusion positioning model based on graph optimization that uses GNSS observations weighted by confidence to globally correct local drift. In addition, the loop detection mechanism using the above-mentioned point cloud registration algorithm is also added into the fusion system, resulting in further global constraints of the driving areas with prior maps. Experimental results show that our algorithm can realize a more continuous and accurate pose estimation and map reconstruction in complex urban environments than similar state-of-the-art algorithms.
In the future work, there are still several issues in our work that deserve further exploration. Firstly, we plan to build a deeper constraint relationship between LIO and GNSS and make use of the rich planar features perceived by LiDAR in the urban environments to compensate for GNSS occlusion or the presence of multipath areas in the direction of the constraint. It will reduce the probability of the unreliability of SPP positioning results in the urban environments. Secondly, considering the environments with multi-sensor failures, such as tunnels with a high environmental texture similarity, where both GNSS rejection and point cloud degradation failures exist. We consider a more accurate degradation direction detection using the degeneracy factor (DF) algorithm proposed in [5], and make a non-linear optimization correction for the positional attitude in that direction. Finally, the perception ability of 3D environmental features by using only LiDAR, IMU and GNSS sensors is still relatively limited. We plan to use the observation residuals from other sensors to add multi-dimensional feature constraints to the fusion positioning algorithm, such as cameras, wheel odometers and so on, so as to make full use of environmental features and realize accurate and real-time navigation and positioning targets with high environmental universality. Let z 15×1 i = [δα i , δθ i , δβ i , δb ai , δb ωi ] T and z 15×1 i+1 = [δα i+1 , δθ i+1 , δβ i+1 , δb ai+1 , δb ωi+1 ] T be the error state vector of the i th frame and the i+1 th frame, respectively, and n = n α i , n ω i , n α i+1 , n ω i+1 , n b a , n b ω T is the noise vector; then, Equation (A5) can be written as: where the initial Jacobian value is J i = I and the Jacobian iteration formula in the process of nonlinear optimization is: The iterative formula of the covariance of pre-integration in the nonlinear optimization process is: After the pre-integration derivation, Equation (14) is the variable quantity of position, rotation, velocity and IMU bias between two frames.