Tightly Coupled 3D Lidar Inertial SLAM for Ground Robot

: This paper proposes a robotic state estimation and map construction method. The traditional lidar SLAM methods are affected by sensor measurement noise, which causes the estimated trajectory to drift, especially along the altitude direction caused by lidar noise. In this paper, ground parameters in the environment are extracted to construct the ground factors to compress the trajectory estimation drifting along the altitude direction using the characteristics of constant robot pose relative to the ground. Our method uses tightly coupled lidar and inertial to obtain low-drift lidar odometry factors by factor graph optimization. The optimized lidar odometry factors are then added to a global factor graph, together with ground, loop closure, and GPS factors to obtain accurate robot state estimation and mapping after factor graph optimization. The experimental results show that our method has comparable results with advanced lidar SLAM methods, and even performs better in some complex and large-scale environments.


Introduction
Accurate self-motion estimation and mapping are prerequisites for mobile robots to achieve autonomous navigation [1,2].Simultaneous localization and mapping (SLAM) technology usually uses cameras and lidar to sense the surrounding environment, including estimating the robot's trajectory and constructing a map describing the surrounding environment.Although visual SLAM [3,4] is easier to scene understanding and position recognition, cameras are limited by light variation and Field Of View (FOV).Lidar-based SLAM approaches avoid the disadvantages of cameras, while providing more reliable measurements.Specifically, as an active sensor, 3D lidar is virtually unaffected by lighting and can sense horizontal 360 degree FOV with a frequency of about 10 Hz.Therefore, this paper uses 3D lidar to realize real-time, six Degree-Of-Freedom (DOF) robotic state estimation and mapping.
Typical lidar SLAM uses the iterative closest point algorithm (ICP) [5,6] to align two frame point clouds with estimating lidar motion.Errors are accumulated during the recursive estimation of trajectory.Therefore, a pose graph [7] with nodes representing the robot states and edges representing the relative relationship between nodes is established to reduce cumulative errors.These edges are measurements from different sensors, such as lidar, Inertial Measurement Unit (IMU), and global positioning system (GPS), or construct loop closure edges [8] obtained by identifying the revisit position.Lastly, the nonlinear least square optimization algorithm is used to optimize the graph to reduce estimated trajectory errors.
For convenience, the noise of lidar measurement has long been modeled as a zeromean Gaussian distribution.However, recent research [9] has found that the maximum error of lidar measurement at a high incidence angle is close to 20 cm.This measurement error in many ground robot working environments will cause the pose estimation to drift along the altitude direction, affecting localization accuracy and map quality.Some wellknown methods use ground information to limit the robot's drift in the altitude direction.However, most of these are applied in an indoor planar or multi-floor environment, which is challenging to adapt to an outdoor environment.In this paper, we propose a robot state estimation method based on the factor graph [10,11].Compared with a graph, the factor graph is very convenient for adding multi-modality measurements for smoothing and incremental trajectory estimation.The proposed method uses tightly coupled lidar and inertial to obtain low-drift lidar odometry, which is obtained by optimizing IMU preintegration factor and lidar odometry factor by the factor graph.Then, we compress the cumulative trajectory estimation error by ground, loop closure, and GPS constraint factors after factor graph optimization.In our method, the loop closure and GPS factors can effectively reduce the cumulative error as the scale of the environment increases.In particular, we use low-drift lidar odometry to construct a global ground and construct the ground factors based on the characteristic that the robot's pose is invariant to the ground.Since the ground factors are relative to the global ground rather than a specific plane, it can compress the robot's trajectory estimation drifting along the altitude direction caused by lidar noise, even in outdoor non-planar environments.After factor graph optimization, the points of each keyframe are saved to construct a point cloud map.Our contributions are summarized as follows: • We present a lidar inertial tightly coupled SLAM method for robot 6DOF state estimation, which uses the factor graph to fuse multi-modality measurements from lidar, IMU, and GPS.

•
We use ground factors to constrain the robot's trajectory estimation drifting along the altitude direction in outdoor non-planar environments.

•
The proposed method has been tested extensively on public datasets and in realworld environments.Compared with popular lidar SLAM methods, our accuracy is improved.

Related Work
Over the past two decades, much research has been conducted on improving the accuracy of 3D lidar SLAM.IMLS-SLAM [12] only uses one 3D lidar sensor to complete a robot's trajectory estimation by registering the lidar point clouds.Since it is difficult for a single lidar method to correct the skew points caused by self-motion on lidar measurement and accurately estimate self-motion in some lidar degradation environments, lidar is often combined with IMU for state estimation and mapping.Depending on the degree of coupling, it can be divided into two types: loose coupling and tight coupling.In LOAM [13], IMU measurement is used to remove lidar point cloud distortion and as a priori of lidar odometry.LOAM is loosely coupled because IMU measurement is not used to further optimize odometry.More popular loose coupling methods [14][15][16] use an extended Kalman filter to fuse IMU, lidar, and extra GPS for robot state estimation.The loosely coupled method is simpler but less accurate than the tightly coupled method.In LIO-MAPPING [17], the lidar and IMU are tightly coupled to achieve accurate state estimation by minimizing the common costs.The tightly coupled framework allows it to function stably in the case of lidar degradation or rapid rotation.The LIO-SAM [18] uses the factor graph to add the measurements from lidar, IMU, and GPS to complete the smooth and incremental robot trajectory estimation through factor graph optimization, which is an efficient and tightly coupled method.
Lidar inertial tightly coupled methods provide accurate state estimation in a short time but still produce unacceptable errors over time, especially in the altitude direction.GPS can provide absolute measurements to limit position estimation error.However, the error of GPS measurements in the altitude direction is much larger than that of the other two directions, and it is difficult to eliminate the cumulative error of the trajectory when the GPS signal is weak.Some lidar SLAM methods use ground information to constrain the error of the robot state estimation in the altitude direction.For example, Zheng et al. [19,20] propose fusing ground information with the pose estimation process so that the ground altitude is constrained by a two-dimensional plane.It does not consider constraining the robot pose in the space of six-DOF.Koide et al. [21] assume that the robot always moves along the same plane, limiting altitude changes in the altitude direction through ground detection.However, this is suitable indoors but not in outdoor environments.Geneva et al. [22] extract the plane in the three-axis direction of the environment instead of just the ground, use the closest point (CP) parameter to represent planes, and add the plane constraints into the graph optimization framework to limit the robot pose in 3D space.Wei et al. [23] and Zhang et al. [24] judge that the current ground of the robot has changed significantly from the last moment; therefore, they construct a non-planar global ground that uses the ground to limit the pose drift of the robot in a multi-floor environment.
Inspired by the above works, this paper proposes a lidar inertial tightly coupled robot state estimation method.In addition to using loop closure and GPS factors to constrain cumulative trajectory estimation errors, the method also uses ground factors to compress trajectory estimation drifting along the altitude direction.

System Overview
To first define the symbolic descriptions that appear throughout this paper, we use W for the world coordinate frame, I for the IMU coordinate frame, and the robot state X in the world coordinate system as the following: where R ∈ SO(3) is the rotation matrix, p ∈ R 3 is the position vector, v is the velocity, and b is the IMU bias.
As a maximum a posteriori (MAP) problem, the state estimation problem can be conveniently solved using the factor graph.Our method flow is shown in Figure 1, which indicates that the inputs are three different types of sensor and the output is a point cloud map.The lidar odometry factors and the IMU preintegration factors from lidar and IMU are optimized by the factor graph to obtain smoother lidar odometry factors.Then, the optimized lidar odometry factors are added to a global factor graph with the ground, GPS, and loop closure factors.Finally, the pose of keyframes is adjusted through factor graph optimization to obtain an accurate trajectory estimation, and the point cloud of each keyframe is saved as the point cloud map.The generation of each constraint factor is described in detail below.
Electronics 2023, 12, x FOR PEER REVIEW 3 of 13 [19,20] propose fusing ground information with the pose estimation process so that the ground altitude is constrained by a two-dimensional plane.It does not consider constraining the robot pose in the space of six-DOF.Koide et al. [21] assume that the robot always moves along the same plane, limiting altitude changes in the altitude direction through ground detection.However, this is suitable indoors but not in outdoor environments.Geneva et al. [22] extract the plane in the three-axis direction of the environment instead of just the ground, use the closest point (CP) parameter to represent planes, and add the plane constraints into the graph optimization framework to limit the robot pose in 3D space.Wei et al. [23] and Zhang et al. [24] judge that the current ground of the robot has changed significantly from the last moment; therefore, they construct a non-planar global ground that uses the ground to limit the pose drift of the robot in a multi-floor environment.
Inspired by the above works, this paper proposes a lidar inertial tightly coupled robot state estimation method.In addition to using loop closure and GPS factors to constrain cumulative trajectory estimation errors, the method also uses ground factors to compress trajectory estimation drifting along the altitude direction.

System Overview
To first define the symbolic descriptions that appear throughout this paper, we use W for the world coordinate frame, I for the IMU coordinate frame, and the robot state X in the world coordinate system as the following: where R∈SO( 3) is the rotation matrix, p∈R 3 is the position vector, v is the velocity, and b is the IMU bias.
As a maximum a posteriori (MAP) problem, the state estimation problem can be conveniently solved using the factor graph.Our method flow is shown in Figure 1, which indicates that the inputs are three different types of sensor and the output is a point cloud map.The lidar odometry factors and the IMU preintegration factors from lidar and IMU are optimized by the factor graph to obtain smoother lidar odometry factors.Then, the optimized lidar odometry factors are added to a global factor graph with the ground, GPS, and loop closure factors.Finally, the pose of keyframes is adjusted through factor graph optimization to obtain an accurate trajectory estimation, and the point cloud of each keyframe is saved as the point cloud map.The generation of each constraint factor is described in detail below.

IMU Preintegration Factor
The initial angular velocity and acceleration measured by IMU in I at time t are defined as Equations ( 2) and (3):

IMU Preintegration Factor
The initial angular velocity and acceleration measured by IMU in I at time t are defined as Equations ( 2) and ( 3 where b t is varying bias and n t is white noise.R I W is the rotation matrix from frame W to frame I. g is the gravity vector.The velocity, position, and rotation of the robot measured using the IMU at time t + ∆t can be given by Equations ( 4)-( 6): The IMU preintegration factor represents the relative motion estimation between two timesteps.The IMU preintegration measurements, ∆v ij , ∆p ij , and ∆R ij between time I and time j, can be calculated using the following equations:

Lidar Odometry Factor
Unnecessary pressure is placed on the real-time performance of the system by optimizing the robot's pose with the intrinsic frequency of the lidar.This paper selects a lidar frame with pose changes of more than 0.8 m or 0.15 rad as a keyframe and outputs the optimized keyframe poses as the robot's trajectory.The robot's state X of a keyframe is added to the factor graph as variable nodes, while the normal frames only participate in the lidar odometry calculation and do not join optimization.Because the points in a single lidar frame are not dense enough, we maintain a sliding window with a fixed number of 20 keyframes.Then, we splice keyframes to construct a local map M t according to the pose relationship between keyframes in the sliding window to improve the accuracy of the lidar odometry by scan-to-map.Our method used the motion estimated by IMU as the initial value of lidar odometry.Then, we use optimized IMU preintegration factors and lidar odometry factors on a factor graph to obtain accurate odometry and estimate the bias of IMU to achieve the lidar inertial tight coupling.When a point cloud F t+1 with corrected skew points arrives, it is divided into two types of feature points: edge points F e t+1 and plane points F p t+1 , F t+1 = {F e t+1 , F p t+1 }, according to the curvature of points.Our method first converts F t+1 to the vicinity of M t according to the motion estimated by IMU, as shown in Equation (10).Then, it calculates the distance between the corresponding feature points in F t+1 and M t to obtain the lidar odometry.
where p i is a feature point in F t+1 .R t t+1 and t t t+1 are the estimated motion of the IMU after the transformation from I to W coordinate frame.When p i is an edge point, the distance between the corresponding relations of feature points in F t+1 and M t is calculated, as shown in Equation ( 11): where d ε is the distance from the edge point in F t+1 to the edge point in M t .p i is the edge point in F t+1 , p a and p b are the edge feature points corresponding to p i in M t .The correspondence between feature points is detailed in reference [13].When p i is a plane point, the distance between the corresponding relations of plane points in F t+1 and M t are calculated, as shown in Equation ( 12): where d H is the distance from the plane point in F t+1 to the plane in M t .p l , p j , and p m are the plane points corresponding to p i in M t .Our method uses the Gaussian-Newton method to minimize the distance between the feature points in F t+1 and the corresponding feature points in M t , as shown in Equation ( 13): min where T t+1 is the robot's state X of F t+1 .Finally, the lidar odometry factor between two keyframes is calculated, as shown in Equation ( 14):

Ground Factor
It is reasonable to assume that the ground points near the robot can be modeled as an infinite plane in the form of HF( → n , d) (Hesse form, HF), where → n is the normal vector and d is the distance between the plane and the origin of the specified coordinate frame.HF is a hyperparametric planar description with the influence of a singular information matrix during optimization.In this paper, we use the HF parameter to facilitate the conversion between different grounds.Π = d → n is a minimal representation [22], while the infinite plane of HF is represented by Π during the process of factor graph optimization.Our method uses the RANSAC [25] algorithm to extract the ground of the HF parameter in the local map to which each keyframe belongs.If we assume ground with parameter HF( → n a , d a ) corresponding to keyframe F a , then for each point p a i belonging to the ground of F a satisfies Equation ( 15): if the transformation T a b between keyframes F a and F b is known, that is p a i = R a b p b i + t a b , substituted into Equation (15), then the ground parameter of keyframe F b is naturally represented under keyframe F a as: Equation ( 16) converts the ground of the current keyframe into the previous keyframe coordinate frame.In this paper, the conversion relationship between two keyframes is determined by the optimized lidar odometry factor, which is accurate in a short time.Then we compare the HF parameters between two keyframes.If the changes of normal vector angle ∆ θ and distance ∆ d of HF between two keyframes are less than the threshold, the two keyframes are considered to belong to the same ground.Otherwise, they belong to different grounds.We use Equation ( 16) to splice different grounds into a global ground.Finally, we add the difference between the ground parameter observed at each keyframe with the estimated global ground as a ground factor to the factor graph.Equations ( 17) and ( 18) use a ground factor corresponding to the measured likelihood of a model with Gaussian noise: where Σ is the noise covariance, Π W is the global ground, and h(Π i ) is the ground parameter measured from the keyframe F i in the W coordinate frame.

Loop Closure Factor
When the robot revisits the vicinity of the historical keyframe, a closed loop is formed between the current and the historical keyframe.This provides the relative pose relationship of two keyframes, which can compress the cumulative error of continuous keyframe pose estimation between them.This paper detects loop closure using a Euclidean space-based method [18].In this paper, the search for a loop closure occurs at a distance set to 10 m.When a new keyframe is added as a variable node to the factor graph, the variable node that meets the distance limit is searched from the factor graph.Then, the point cloud registration is performed in the same way as the lidar odometry to obtain the relative transformation ∆T between the two nodes, and the loop closure factor ∆T was added to the factor graph.In our experiments, loop closure had a positive effect.In an environment with loop closure, loop closure pulls the estimated trajectory back to the correct position.

GPS Factor
While lidar and inertial can accurately estimate the robotic state in a short time, they still produce large pose drift in large-scale and complex environments.One solution is the absolute measurement provided by GPS.When a GPS signal is available, our method converts the GPS measurements into the local Cartesian coordinate frame [18,26] to build GPS factors.Whenever a new keyframe is generated, we linearly interpolate the GPS measurement based on the time stamp of the lidar and associate the new node with the GPS factor.This process is selective because position estimation errors increase slowly, and the frequent insertion of noisy GPS measurements has a negative effect.This paper adds a GPS factor if the estimated location covariance is larger than the received GPS location covariance.Our method can work entirely without GPS, and when GPS is used, it effectively limits the cumulative error caused by position estimation.

Experimental Equipment
To validate our method, we conducted extensive experiments on the public datasets KITTI [27] and M2DGR [28], and in real-world environments.Our experimental platform is shown in Figure 2. It is a ground robot weighing about 160 kg on an Ackerman chassis, with a 65 cm wheelbase and 38 cm wheel diameter, equipped with an RS-LiDAR-16 lidar, an Xsens Mti-g710 9-axis IMU, a CHCNAV CGI-610 integrated navigation, and an AMD R7 4800 H central processing unit (CPU) laptop with 16G memory.All experiments were completed on this experimental platform.In our experimental environments, the frequencies of lidar, Xsens IMU, and CHCNAV GPS were set to 10 Hz, 400 Hz, and 4 Hz, respectively.
R7 4800 H central processing unit (CPU) laptop with 16G memory.All experiments were completed on this experimental platform.In our experimental environments, the frequencies of lidar, Xsens IMU, and CHCNAV GPS were set to 10 Hz, 400 Hz, and 4 Hz, respectively.

Experiments on Public Datasets
In this section, our method compares absolute trajectory errors (ATE) with several well-known lidar SLAM methods on the KITTI 07 sequence and the M2DGR street_04 sequence, including only a lidar using method A-LOAM [13] and two multi-sensor fusion methods, LIO-MAPPING [17] and LIO-SAM [18].The two public dataset sequences are typical vehicle driving environments, including flat ground, buildings, and trees.The robot's driving speed in sequences 07 and 04 was 6 m/s and 1 m/s, respectively.Tables 1 and  2 show the ATE of our method and the other three controlled methods.The experimental results show that the translation and rotation errors of A-LOAM were greater than those of the other three methods because the empty environment and some sharp curves of these two sequences were not friendly to the single-sensor method.Our method had a similar translation error in the horizontal direction to the two multi-sensor fusion methods, but a remarkable difference was reflected in the altitude direction.Due to the ground factor, the drift of our method along the altitude direction was not apparent, the altitude error was compressed below 0.4m, and our rotation error was lower than that of the other three methods in both sequences.Thanks to the lidar inertial tightly coupled framework, the errors of our method grew slowly over time, and accurate robot trajectory estimation was obtained by correcting the position with GPS and loop closure constraints.In addition, we obtained similar results to those shown in Tables 1 and 2 in several other outdoor sequences of these two public datasets.All four methods were run in real time and the translation error of our method was controlled within 1 m, which is lower than the other three methods.

Experiments on Public Datasets
In this section, our method compares absolute trajectory errors (ATE) with several wellknown lidar SLAM methods on the KITTI 07 sequence and the M2DGR street_04 sequence, including only a lidar using method A-LOAM [13] and two multi-sensor fusion methods, LIO-MAPPING [17] and LIO-SAM [18].The two public dataset sequences are typical vehicle driving environments, including flat ground, buildings, and trees.The robot's driving speed in sequences 07 and 04 was 6 m/s and 1 m/s, respectively.Tables 1 and 2 show the ATE of our method and the other three controlled methods.The experimental results show that the translation and rotation errors of A-LOAM were greater than those of the other three methods because the empty environment and some sharp curves of these two sequences were not friendly to the single-sensor method.Our method had a similar translation error in the horizontal direction to the two multi-sensor fusion methods, but a remarkable difference was reflected in the altitude direction.Due to the ground factor, the drift of our method along the altitude direction was not apparent, the altitude error was compressed below 0.4m, and our rotation error was lower than that of the other three methods in both sequences.Thanks to the lidar inertial tightly coupled framework, the errors of our method grew slowly over time, and accurate robot trajectory estimation was obtained by correcting the position with GPS and loop closure constraints.In addition, we obtained similar results to those shown in Tables 1 and 2 in several other outdoor sequences of these two public datasets.All four methods were run in real time and the translation error of our method was controlled within 1 m, which is lower than the other three methods.

Experiments on PARK and BLOCK
We used the experimental equipment in Figure 2 to acquire more datasets from challenging environments to test our method sufficiently.Figure 3a shows an experimental environment named PARK, where the robot goes around counterclockwise from the red start point back to the origin, with a length of 1328 m.This paper uses integrated navigation CHCNAV CGI-610 to provide the ground-truth trajectory.Table 3 shows the ATE between different methods and the ground-truth trajectory in PARK, and their trajectories are shown in Figure 3b,c.The lidar measurement noise makes the trajectory drift along the altitude direction.As shown in Table 3 and Figure 3c, the three comparison methods have more than twice the translation error in the vertical direction than the horizontal direction.Excessive error accumulation caused the loop closure failure of LIO-MAPPING and LIO-SAM, the same as A-LOAM without back-end optimization.As shown in Figure 3b, they did not return to the starting point.However, due to the ground factor constraining the drifting in the altitude direction, our translation error in the altitude direction was only 0.48 m, which was always within the range of loop closure that occurred safely.However, with the help of different constraint factors, only our method returned to starting point.Our method always approached the ground-truth trajectory.The translation error of our method was 0.952 m, while the rotation error was 0.950 degrees in PARK, which is more accurate than the other three methods. A-LOAM

Experiments on PARK and BLOCK
We used the experimental equipment in Figure 2 to acquire more datasets from challenging environments to test our method sufficiently.Figure 3a shows an experimental environment named PARK, where the robot goes around counterclockwise from the red start point back to the origin, with a length of 1328 m.This paper uses integrated navigation CHCNAV CGI-610 to provide the ground-truth trajectory.Table 3 shows the ATE between different methods and the ground-truth trajectory in PARK, and their trajectories are shown in Figure 3b,c.The lidar measurement noise makes the trajectory drift along the altitude direction.As shown in Table 3 and Figure 3c, the three comparison methods have more than twice the translation error in the vertical direction than the horizontal direction.Excessive error accumulation caused the loop closure failure of LIO-MAPPING and LIO-SAM, the same as A-LOAM without back-end optimization.As shown in Figure 3b, they did not return to the starting point.However, due to the ground factor constraining the drifting in the altitude direction, our translation error in the altitude direction was only 0.48 m, which was always within the range of loop closure that occurred safely.However, with the help of different constraint factors, only our method returned to the starting point.Our method always approached the ground-truth trajectory.The translation error of our method was 0.952 m, while the rotation error was 0.950 degrees in PARK, which is more accurate than the other three methods.

APPROACH x[m] y[m] z[m] trans[m] yaw[deg] pitch[deg] roll[deg] rot[deg]
A Figure 4 shows the point cloud maps of A-LOAM, LIO-MAPPING, and LIO-SAM in PARK.The color of their point cloud gradually changed from blue at the starting point to red in a counterclockwise direction, indicating that their point cloud height gradually increased and the estimated trajectory deviated from the ground-truth trajectory.There was a downhill slope at 3/4 of the PARK, and the altitude of the point clouds of the three methods began to drop, but at last, they all formed an altitude distance from the starting point, as shown in Figure 4e-g.The point cloud map of our method in PARK is shown in Figure 4d,h.The point cloud altitude was always close to the ground-truth trajectory.Under the ground constraint factor, the altitude of our point cloud changed with the relief of the terrain rather than drifting toward the altitude with time.Figure 4 shows the point cloud maps of A-LOAM, LIO-MAPPING, and LIO-SAM in PARK.The color of their point cloud gradually changed from blue at the starting point to red in a counterclockwise direction, indicating that their point cloud height gradually increased and the estimated trajectory deviated from the ground-truth trajectory.There was a downhill slope at 3/4 of the PARK, and the altitude of the point clouds of the three methods began to drop, but at last, they all formed an altitude distance from the starting point, as shown in Figure 4e-g.The point cloud map of our method in PARK is shown in Figure 4d,h.The point cloud altitude was always close to the ground-truth trajectory.Under the ground constraint factor, the altitude of our point cloud changed with the relief of the terrain rather than drifting toward the altitude with time.Furthermore, to illustrate the influence of different constraint factors in our method on the SLAM trajectory, the experiments based on tightly coupled lidar inertial odometry (LIO) were carried out with ground, loop closure, and GPS factors, respectively, in a large-scale environment BLOCK with a length of 2334 m.In the different methods, lio-only (LIO) was the minimization method of our system, meaning that only lidar odometry and IMU factors were used for the robot's odometry estimation, and there was no constraint factor to correct the cumulative error of trajectory estimation.The ground factor was added to LIO; we named this method lio-ground.Similarly, we added loop closure and GPS factor separately to LIO and named the two methods lio-loop and lio-gps, respectively.The ATE between the different methods and the ground-truth trajectory is given in Table 4, while Figure 5 shows the corresponding trajectory and the satellite image of BLOCK.Furthermore, to illustrate the influence of different constraint factors in our method on the SLAM trajectory, the experiments based on tightly coupled lidar inertial odometry (LIO) were carried out with ground, loop closure, and GPS factors, respectively, in a largescale environment BLOCK with a length of 2334 m.In the different methods, lio-only (LIO) was the minimization method of our system, meaning that only lidar odometry and IMU factors were used for the robot's odometry estimation, and there was no constraint factor to correct the cumulative error of trajectory estimation.The ground factor was added to LIO; we named this method lio-ground.Similarly, we added loop closure and GPS factor separately to LIO and named the two methods lio-loop and lio-gps, respectively.The ATE between the different methods and the ground-truth trajectory is given in Table 4, while Figure 5 shows the corresponding trajectory and the satellite image of BLOCK.

APPROACH x[m] y[m] z[m] trans[m] yaw[deg] pitch[deg] roll[deg] rot[deg]
lio-only  The results in BLOCK show that the translation and rotation error of lio-only was 6.24 m and 1.70 degrees, which indicates that LIO will produce unacceptable errors in a large-scale environment.It was necessary to reduce cumulative errors by constraint factors.Compared with LIO, the lio-ground method used the ground constraint factor to significantly reduce the translation error in the altitude direction from 5.65 m to 1.21 m, while the rotation error was reduced from 1.70 to 1.06 degrees.Because there were multiple repeated paths in BLOCK, lio-loop reduced the translation error of the trajectory in three directions from 6.24 m to 4.09 m and the rotation error from 1.70 to 1.19 degrees through loop closure factors.Under the function of the GPS constraint factor, the The results in BLOCK show that the translation and rotation error of lio-only was 6.24 m and 1.70 degrees, which indicates that LIO will produce unacceptable errors in a large-scale environment.It was necessary to reduce cumulative errors by constraint factors.Compared with LIO, the lio-ground method used the ground constraint factor to significantly reduce the translation error in the altitude direction from 5.65 m to 1.21 m, while the rotation error was reduced from 1.70 to 1.06 degrees.Because there were multiple repeated paths in BLOCK, lio-loop reduced the translation error of the trajectory in three directions from 6.24 m to 4.09 m and the rotation error from 1.70 to 1.19 degrees through loop closure factors.Under the function of the GPS constraint factor, the horizontal translation error of lio-gps was limited to about 1.11m, close to the ground-truth trajectory.However, the translation error in the altitude direction was only reduced from 5.65 m to 4.74 m, which is related to the measurement accuracy of GPS in the altitude direction.Table 4 and Figure 5b show that different constraint factors positively affect the LIO.The ours-finale indicates the final result of our method with all factors working together, and it is clear that our method can maintain high precision even in a large-scale environment.In BLOCK, the translation error of our method was 0.941 m, while the rotation error was 0.733 degrees.As a comparison, the point cloud maps in BLOCK generated by A-LOAM, LIO-MAPPING, LIO-SAM, and our method are shown in Figure 6.This figure shows that the three methods compared with ours have multiple ghosting and irregular roads, which is a phenomenon of inaccurate positioning.In contrast, our point cloud map shows more complete buildings and trees, preserving more details of the environment.from 5.65 m to 4.74 m, which is related to the measurement accuracy of GPS in the altitude direction.Table 4 and Figure 5b show that different constraint factors positively affect the LIO.The ours-finale indicates the final result of our method with all factors working together, and it is clear that our method can maintain high precision even in a large-scale environment.In BLOCK, the translation error of our method was 0.941 m, while the rotation error was 0.733 degrees.As a comparison, the point cloud maps in BLOCK generated by A-LOAM, LIO-MAPPING, LIO-SAM, and our method are shown in Figure 6.This figure shows that the three methods compared with ours have multiple ghosting and irregular roads, which is a phenomenon of inaccurate positioning.In contrast, our point cloud map shows more complete buildings and trees, preserving more details of the environment.

Conclusions and Future Work
This paper proposed a robust simultaneous localization and mapping method.It used the factor graph to fuse multi-modality sensor measurements, including five types of factors: IMU preintegration factor, lidar odometry factor, ground factor, loop closure factor, and GPS factor.The proposed method used tightly coupled lidar and inertial to obtain low-drift lidar odometry factors by factor graph optimization.The optimized lidar

Conclusions and Future Work
This paper proposed a robust simultaneous localization and mapping method.It used the factor graph to fuse multi-modality sensor measurements, including five types of factors: IMU preintegration factor, lidar odometry factor, ground factor, loop closure factor, and GPS factor.The proposed method used tightly coupled lidar and inertial to obtain low-drift lidar odometry factors by factor graph optimization.The optimized lidar odometry factors were then added to a global factor graph, together with ground, loop closure, and GPS factors, to obtain accurate keyframe pose estimation and save the point cloud of each keyframe to build a point cloud map after global factor graph optimization.
In order to compress the trajectory drift along the altitude direction, which is mainly caused by lidar measurement noise in outdoor non-plane environments, this paper used low-drift lidar odometry to recover a global ground.The ground factor was then constructed using the robot pose invariant relative to the ground.The results of different experimental environments show that the trajectory estimation error of our method grew slowly.In a general ground robot working environment such as BLOCK, the ground factor can positively constrain the trajectory drift in the altitude direction.Compared with ad-vanced lidar SLAM, our method achieved comparable results, and even performed better in some large-scale and complex environments.
One limitation of our method is that the translation drift in the altitude direction can be compressed through the ground in a flat environment, which remains challenging in some environments where the ground is uneven.In the future, we plan to add vision sensors to our methodology framework to improve the stability and performance of the system.

Figure 1 .
Figure 1.System overview of our framework.The system consists of three inputs and one output.The multi-modality measurements from lidar, IMU, and GPS are fused by the factor graph and adjust the trajectory estimation to output point cloud maps after factor graph optimization.

Figure 1 .
Figure 1.System overview of our framework.The system consists of three inputs and one output.The multi-modality measurements from lidar, IMU, and GPS are fused by the factor graph and adjust the trajectory estimation to output point cloud maps after factor graph optimization.

Figure 2 .
Figure 2. Experimental platform.The primary sensors include a 16-line 3D lidar, a 9-axis IMU, and an integrated navigation system.

Figure 2 .
Figure 2. Experimental platform.The primary sensors include a 16-line 3D lidar, a 9-axis IMU, and an integrated navigation system.

Figure 3 .
Figure 3. Satellite image of PARK and trajectories of different methods in PARK.(a) Satellite image of PARK; (b) Horizontal trajectory of different methods in PARK; (c) Vertical trajectory over time of different methods in PARK.

Figure 3 .
Figure 3. Satellite image of PARK and trajectories of different methods in PARK.(a) Satellite image of PARK; (b) Horizontal trajectory of different methods in PARK; (c) Vertical trajectory over time of different methods in PARK.
(a) Top view of A-LOAM (b) Top view of LIO-MAPPING (c) Top view of LIO-SAM (d) Top view of ours (e) Front view of A-LOAM (f) Front view of LIO-MAPPING (g) Front view of LIO-SAM (h) Front view of ours

Figure 4 .
Figure 4. Point cloud maps were generated by different methods in PARK.The point clouds were rendered according to the altitude value in the vertical direction.Blue indicates low altitude and red indicates high altitude.

Figure 5 .
Figure 5. Satellite image of BLOCK and trajectory of different methods in BLOCK.(a) Satellite image of BLOCK; (b) Horizontal trajectory of different methods in BLOCK.

Figure 5 .
Figure 5. Satellite image of BLOCK and trajectory of different methods in BLOCK.(a) Satellite image of BLOCK; (b) Horizontal trajectory of different methods in BLOCK.

Figure 6 .
Figure 6.Point cloud maps were generated by different methods in BLOCK.The point cloud colors represent altitude value.

Figure 6 .
Figure 6.Point cloud maps were generated by different methods in BLOCK.The point cloud colors represent altitude value.

Table 1 .
Root mean square error (RMSE) for translation and rotation in sequence 07 of the KITTI dataset.

Table 2 .
RMSE for translation and rotation in sequence street_04 of the M2DGR dataset.

Table 1 .
Root mean square error (RMSE) for translation and rotation in sequence 07 of the KITTI dataset.

Table 2 .
RMSE for translation and rotation in sequence street_04 of the M2DGR dataset.

Table 3 .
Translation and rotation RMSE of different methods in PARK.

Table 3 .
Translation and rotation RMSE of different methods in PARK.

Table 4 .
Translation and rotation RMSE of different methods in BLOCK.

Table 4 .
Translation and rotation RMSE of different methods in BLOCK.