Real-Time Scan-to-Map Matching Localization System Based on Lightweight Pre-Built Occupancy High-Deﬁnition Map

: High-precision and robust localization in GNSS-denied areas is crucial for autonomous vehicles and robots. Most state-of-the-art localization methods are based on simultaneous localization and mapping (SLAM) with a camera or light detection and ranging (LiDAR). However, SLAM will suffer from drift during long-term running without loop closure or prior constraints. Lightweight, high-precision environmental maps have gradually become an indispensable part of future autonomous driving. In order to solve the problem of real-time global localization for autonomous vehicles and robots, we propose a precise and robust LiDAR localization system based on a pre-built, occupied high-deﬁnition (HD) map called the Extended QuadTree (EQT) map. It makes use of a planar quadtree for block division and a Z-sequence index structure within the block cells. Then, a four-level occupancy probability cell value model is adopted. It will save about eight times the storage space compared with Google Cartographer, and the EQT map can be extended to store other information. For efﬁcient scan-to-map matching with our specialized EQT map, the Bursa linearized model is used in the Gauss–Newton iteration of our algorithm, which makes the calculation of partial derivatives fast. All the above improvements lead to optimal storage and efﬁcient querying for real-time scan-to-map matching localization. Field tests in an industrial park and road environment prove that positioning accuracy of about 6–13 cm and attitude accuracy of about 0.15 ◦ were achieved using a VLP-16 LiDAR. They also show that the method proposed in this paper is signiﬁcantly better than the NDT method. For the long and narrow environment of an underground mine tunnel, high-resolution maps are also helpful for accurate and robust localization.


Introduction
High-precision and robust localization is an extremely important task for autonomous vehicles and robots, as it is a prerequisite for autonomous vehicles and intelligent robots to perform the next task, such as path planning, obstacle avoidance, and so on [1]. Many researchers and companies have made great efforts to achieve high precision and reliable positioning in various complex scenarios.
Typically, the Global Navigation Satellite System (GNSS) and Inertial Navigation System (INS) are the most commonly used sensors to localize autonomous vehicles [2][3][4], as they are two kinds of sensors with good complementary properties. With the aid of differential signals, GNSS can provide absolute position without drift, and it is usually integrated with high-frequency relative pose information provided by INS for high-frequency and high-precision positioning. However, the GNSS/INS combination is not enough to meet the positioning needs of autonomous driving applications. GNSS signals are not always available, especially in urban areas with tall buildings and suffering from multi-path effects and signal block; furthermore, INS may suffer from boundless drift [5]. Except for the disadvantage mentioned above, GNSS/INS could only determine the position and orientation of the vehicle and could not perceive the environment around it. for autonomous vehicles by using a 2D laser rangefinder. Lane markers were detected from reflectivity values as local features, and the localization algorithm is based on the point-to-plane ICP algorithm. Farouk Ghallabi et al. [35] proposed a method to solve the positioning problem by matching the road perception information of the 3D LiDAR sensor with the HD map elements; specifically, the particle filter algorithm is used to match the observed High Reflective Landmarks (HRL) with HD map elements to estimate the position of the vehicle. Jeong et al. [36] use an onboard stereo camera and HD map to achieve 6-DOF pose estimation in an urban environment; their HD map stores 3D geometric data with semantic information expressed in a compressed format.

Methodology
The proposed LiDAR-based 6DOF localization method in this paper relies on a prebuilt 3D point cloud map; our goal is to solve the problem of continuous and reliable positioning in a GNSS-denied environment. The pre-built map is a volumetric (voxel) representation of the environment; each voxel stores a value representing the probability of being occupied [37]. We achieve 6DOF pose estimation by matching the online point cloud scan with the pre-built map, and the initial pose of the scan is continuously optimized through the Gauss-Newton iteration method.

Map Generation
One key challenge to achieving rapid map-based localization is building a prior representation of the world (modeling the 3D environment) [18], modeling the areas of free, occupied, and unmapped space, and importantly, ensuring that access times and memory consumption are efficient. In the field of robotics and SLAM, there have been several environmental models used in practice, such as point clouds, elevation maps, multi-level surface maps, and semantic maps. A point cloud is not memory efficient, as it would consume a large amount of memory to store points. An elevation map and multi-level surface map are memory efficient, but they could not model the area that was unmapped and lack the ability to model any 3D environment. In this paper, we use an occupancy grid map for environment representation that is similar to OctoMap [37]. The occupied grid map divides the environment into a series of grids, and each grid is assigned a probability value between 0 and 1. The probability value indicates the probability that this grid is occupied. The occupied grid map is widely used because of its convenience in creation and maintenance. It is especially suitable for processing laser or ultrasonic measurement data.
The state of each grid in the occupancy grid map is not a "black" or "white" state but is represented by the probability that the grid is occupied by obstacles. Let p(s = 0) indicate the probability of a grid being occupied and p(s = 1) indicate a grid being free; they satisfy p(s = 0) + p(s = 1)= 1. In actual application, the state of a grid was computed as Odd(s) = p(s=1) p(s=0) for convenience. The state of the grid in the occupancy grid map needs to be constantly updated when new point clouds are inserted into the map. The probability of a grid being occupied will be updated according to (1) when given the sensor measurement z ∼ {0, 1}.
According to Bayesian formula, we can get: p(s = 0|z ) = p(z|s=0 )p(s=0) Further, we can get a new state update (3) by taking the logarithm of the map grid state.

of 23
For log Odd(s|z ) update, it will be faster than Odd(s|z ) as multiplications are replaced by additions. log p(z|s=1 ) p(z|s=0 ) is called a measurement module, which has two forms: It is convenient to convert log Odd(s|z ) into probability and vice versa; therefore, log Odd(s|z ) is stored in the occupancy grid map instead of the occupancy probability. In the state-of-the-art work of Google Cartographer [38], the minimum and maximum occupancy probability values are mapped to the integer interval [1,32767]. To reduce the amount of floating-point calculations, it means that the occupation probability value is divided into 32,767 levels, and storing an occupancy probability value will consume 16 bits of memory. To further reduce memory consumption, a simplified occupancy probability value mapping model is used in this paper. We divided the occupancy probability value evenly into four levels, which means that storing an occupancy probability value will only consume 2 bits of memory. For a single float value (usually 4 bytes), one could only store 1 occupancy probability value (2 bytes for index and 2 bytes for occupancy probability values) when using Google's method, and a single float value could store 8 occupancy probability values (2 bytes for index and 2 bytes for 8 occupancy probability values) when using our method.
The octree data structure is used to store maps for flexible storage and application of map data and is a tree-like data structure used to describe a three-dimensional space. While the typical octree index structure will consume a lot of computing resources for spatial data queries, it is not suitable for real-time map matching. Therefore, considering the sparse characteristic of the point cloud, we simplify the index structure of voxels for quick queries. Specifically, we first divide and calculate the index items according to the ground plane in the horizontal direction, which are stored in a planar quadtree; thus, we can quickly query and locate the position of the planar voxel through the planar quadtree; after location to the planar quadnode, we divide and calculate the linear index along the elevation direction (Z axis direction), which are stored in a linked list structure. Each voxel only stores the occupancy probability information; in fact, each voxel can be further expanded to store more information, such as light intensity, semantics, and other information. In addition, the quadtree structure allows us to obtain multiple resolution maps by limiting the depth of the querying; thus, we can achieve coarse-to-fine matching in the scan-to-map matching stage. The high-precision voxel map model based on a planar quadtree and elevation index is shown in Figure 1.
In general, our point cloud map uses a planar quadtree for block division and a Zsequence index structure within the block. A four-level mapping occupancy probability value model is also adopted. All the efforts lead to optimal storage and query efficiency for real-time scan-to-map matching positioning. elevation direction (Z axis direction), which are stored in a linked list structure. Each voxel only stores the occupancy probability information; in fact, each voxel can be further expanded to store more information, such as light intensity, semantics, and other information. In addition, the quadtree structure allows us to obtain multiple resolution maps by limiting the depth of the querying; thus, we can achieve coarse-to-fine matching in the scan-to-map matching stage. The high-precision voxel map model based on a planar quadtree and elevation index is shown in Figure 1. In general, our point cloud map uses a planar quadtree for block division and a Zsequence index structure within the block. A four-level mapping occupancy probability value model is also adopted. All the efforts lead to optimal storage and query efficiency for real-time scan-to-map matching positioning.

Scan-to-Map Matching
Taking the mechanical drive mechanism of the VLP-16 lidar we used into account, it may take several milliseconds to acquire a complete scan. In this process, the continuous movement of the sensor will cause point cloud distortion. Thanks to our high-precision hardware time synchronization, the precise sampling time of each point can be accurately obtained. We eliminate the distortion of the original laser scanning by transforming each point to the same time based on the assumption of a uniform motion model Error! Reference source not found.. The scan-to-map matching module needs to be initialized with roughly accurate initial pose estimation. Generally, we could get the initial pose from a GNSS receiver or inversely calculate it from known landmarks. Then, the online point cloud scan obtained from LiDAR can be matched with the pre-built HD map continuously. The main goal of scan-to-map matching is to find rigid transformation parameters that make the best match between the online scan and the pre-built map. Here we try to estimate the rigid transformation parameters that minimize the residual of the occupation

Scan-to-Map Matching
Taking the mechanical drive mechanism of the VLP-16 lidar we used into account, it may take several milliseconds to acquire a complete scan. In this process, the continuous movement of the sensor will cause point cloud distortion. Thanks to our high-precision hardware time synchronization, the precise sampling time of each point can be accurately obtained. We eliminate the distortion of the original laser scanning by transforming each point to the same time based on the assumption of a uniform motion model [39]. The scan-to-map matching module needs to be initialized with roughly accurate initial pose estimation. Generally, we could get the initial pose from a GNSS receiver or inversely calculate it from known landmarks. Then, the online point cloud scan obtained from LiDAR can be matched with the pre-built HD map continuously. The main goal of scan-to-map matching is to find rigid transformation parameters that make the best match between the online scan and the pre-built map. Here we try to estimate the rigid transformation parameters that minimize the residual of the occupation probability value. The problem of scan-to-map matching can be modeled as a nonlinear least squares problem, which can be expressed as: where ξ = [x, y, z, roll, pitch, yaw] T is the 6DOF pose to be optimized from the current scan, n is the point number of the scan, S i (ξ) means transform a point in a LiDAR frame to a map coordinate frame, as in (2), and M(S i (ξ)) will return the occupancy probability value of point S i (ξ) on the map.
In this paper, we choose the Gauss-Newton iteration method to solve (5) for pose optimization. It is an iterative method for finding regression parameters for least squares in a nonlinear regression model. This method uses Taylor series expansion to approximately replace the nonlinear regression model, and then through multiple iterations to modify the regression coefficients, the regression coefficients are constantly approaching the best regression coefficients of the nonlinear regression model, finally minimizing the sum of residual squares. With an initial pose estimation ξ 0 = [x 0 , y 0 , z 0 , roll 0 , pitch 0 , yaw 0 ] T of ξ = [x, y, z, roll, pitch, yaw] T , we try to find an optimal pose change ∆ξ that makes the current point cloud projected on the pre-build map to get the smallest occupancy probability residual value: According to the Taylor series formulation, expand M(S i (ξ + ∆ξ)) at ξ = [x, y, z, roll, pitch, yaw] T to get: If ∆ξ makes (8) achieve the minimum, we can get (9): Thus, we can solve (9) to get ∆ξ: where ∇M(S i (ξ)) means the gradient of the map occupancy probability value, and H is called the Hessian matrix, which is expressed as: The whole procedure of scan-to-map matching for localization is presented as follows in Figure 2.
In the above Gauss-Newton iteration algorithm, the most important thing is to calculate the partial derivative of the pose ∂S i (ξ) ∂ξ and the probability gradient ∇M(S i (ξ)). In the process of obtaining the partial derivative of the pose, we adopted the Bursa simplification strategy in the linearization process. When the Euler angles in the pose parameter ξ = [x, y, z, roll, pitch, yaw] T are small, we can get a simplified coordinate transformation formula without considering the scale parameter in Bursa's six parameters shown in (12), and we can obtain the derivative directly without interpolating like Google Cartographer. [ ]

Convergence judgment
Final Pose Output ξ

Platforms
The performance of localization based on a pre-built HD map presented in this paper was tested and verified in a variety of different scenarios, including an industrial park, a city road, and an underground mine tunnel. MMS was used for collecting HD maps of the industrial park and road environment because there are good GNSS signals, and maps of underground mines were collected by Terrestrial Laser Scanning (TLS) and vice versa. The VLP-16 LiDAR and Honeywell HG4930 IMU were integrated with a high-precision GNSS board; the MMS built in this paper is shown in Figure 3. The specific performance

Platforms
The performance of localization based on a pre-built HD map presented in this paper was tested and verified in a variety of different scenarios, including an industrial park, a city road, and an underground mine tunnel. MMS was used for collecting HD maps of the industrial park and road environment because there are good GNSS signals, and maps of underground mines were collected by Terrestrial Laser Scanning (TLS) and vice versa. The VLP-16 LiDAR and Honeywell HG4930 IMU were integrated with a high-precision GNSS board; the MMS built in this paper is shown in Figure 3. The specific performance parameters of the HG4930 IMU are shown in Table 1. The VLP16 LiDAR and IMU were strictly time synchronized through the pulse per second (PPS) signal output by GNSS. The VLP16 LiDAR worked at a frequency of 10 Hz, the sampling rate of GNSS was 1Hz, and the IMU worked at a high frequency of 600 Hz. parameters of the HG4930 IMU are shown in Table 1. The VLP16 LiDAR and IMU were strictly time synchronized through the pulse per second (PPS) signal output by GNSS. The VLP16 LiDAR worked at a frequency of 10 Hz, the sampling rate of GNSS was 1Hz, and the IMU worked at a high frequency of 600 Hz.  is a widely used positioning technology that eliminates or weakens most of the errors (such as orbital error, clock error, ionospheric delay, tropospheric delay, etc.) through differential signals. GNSS RTK/INS can not only further improve positioning accuracy and robustness but also obtain attitude information. In the point cloud map generation phase, the GNSS RTK/INS smoothing integration result was used to stitch continuous multi-frame point clouds to generate point cloud maps. While GNSS signals are not available in underground mine spaces, we used the FARO Laser Scanner Focus 3D X330 for environmental point cloud collection. The FARO Laser Scanner Focus 3D X330 is a sophisticated measuring device that can produce a high-precision and high-resolution point cloud. In the data acquisition stage, a common-view target ball was placed between the measuring stations, and in the data processing stage, multiple stations were spliced to generate a complete point cloud.
Our test scenes were divided into two categories: one was an open-sky area with good GNSS signals, and the other was a GNSS-denied scene. We carefully quantitatively evaluated and analyzed the performance of our proposed algorithm, including RMS, maximum positioning error, maximum attitude error, and cost time. The algorithm ran on a modern laptop, which is an Intel(R) Core (TM) i7-7700HQ at 2.8GHz and 16.0 GB of RAM. In order to verify the performance of the algorithm proposed in this paper, we compared it with the traditional NDT method.

Industrial Park and Road Environments
We first conducted experiments in industrial parks and on urban roads. The two-way smoothing trajectory of the GNSS RTK/INS integrated navigation was utilized to split multiple frames of continuous point clouds into a point cloud map. Firstly, we collected a GNSS Real Time Kinematic (RTK) is a widely used positioning technology that eliminates or weakens most of the errors (such as orbital error, clock error, ionospheric delay, tropospheric delay, etc.) through differential signals. GNSS RTK/INS can not only further improve positioning accuracy and robustness but also obtain attitude information. In the point cloud map generation phase, the GNSS RTK/INS smoothing integration result was used to stitch continuous multi-frame point clouds to generate point cloud maps. While GNSS signals are not available in underground mine spaces, we used the FARO Laser Scanner Focus 3D X330 for environmental point cloud collection. The FARO Laser Scanner Focus 3D X330 is a sophisticated measuring device that can produce a high-precision and high-resolution point cloud. In the data acquisition stage, a common-view target ball was placed between the measuring stations, and in the data processing stage, multiple stations were spliced to generate a complete point cloud.
Our test scenes were divided into two categories: one was an open-sky area with good GNSS signals, and the other was a GNSS-denied scene. We carefully quantitatively evaluated and analyzed the performance of our proposed algorithm, including RMS, maximum positioning error, maximum attitude error, and cost time. The algorithm ran on a modern laptop, which is an Intel(R) Core (TM) i7-7700HQ at 2.8GHz and 16.0 GB of RAM. In order to verify the performance of the algorithm proposed in this paper, we compared it with the traditional NDT method.

Industrial Park and Road Environments
We first conducted experiments in industrial parks and on urban roads. The two-way smoothing trajectory of the GNSS RTK/INS integrated navigation was utilized to split multiple frames of continuous point clouds into a point cloud map. Firstly, we collected a set of synchronized GNSS, IMU, and LiDAR data to generate the pre-built EQT HD map, in which we manually removed dynamic objects so that only static objects remained on the map. The pre-built EQT maps and experimental site environments are shown in  We used commercially integrated navigation software, GINS, to calculate the reference groundtruth. GINS can provide information such as the number of available satellites and dilution of precision (DOP) value, which indicate the quality of the GNSS observation environment. The number of available satellites and DOP values in the industrial set of synchronized GNSS, IMU, and LiDAR data to generate the pre-built EQT HD map, in which we manually removed dynamic objects so that only static objects remained on the map. The pre-built EQT maps and experimental site environments are shown in  We used commercially integrated navigation software, GINS, to calculate the reference groundtruth. GINS can provide information such as the number of available satellites and dilution of precision (DOP) value, which indicate the quality of the GNSS observation environment. The number of available satellites and DOP values in the industrial We used commercially integrated navigation software, GINS, to calculate the reference groundtruth. GINS can provide information such as the number of available satellites and dilution of precision (DOP) value, which indicate the quality of the GNSS observation environment. The number of available satellites and DOP values in the industrial park environment and the road environment are shown in Figures 6 and 7. We can find that the number of available GNSS satellites in the industrial park environment is obviously less than that in the road environment, and the DOP value in the industrial park environment is obviously larger than that in the road environment. It shows that the observation conditions for GNSS in the road environment were obviously better than those in the industrial park environment. park environment and the road environment are shown in Figures 6 and 7. We can find that the number of available GNSS satellites in the industrial park environment is obviously less than that in the road environment, and the DOP value in the industrial park environment is obviously larger than that in the road environment. It shows that the observation conditions for GNSS in the road environment were obviously better than those in the industrial park environment.  The file size of the pre-built map is an important factor in practical applications, as it may affect the efficiency of the scan matching algorithm. More importantly, in the highprecision positioning map service for future autonomous driving, map data files must be placed on the cloud server first, so that the smaller the amount of map data, the better the map data transmission. The resolution of the map mainly determines the size of the memory occupied by the map. The shape of the industrial park is rectangular, and its area is about 250 m × 250 m. The road test environment includes 1.6 km of roads with essentially no tall buildings on either side. We constructed point cloud maps for testing industrial parks and road scenes with resolutions of 8 cm, 16 cm, and 24 cm. The NDT algorithm uses a PCD file as input, and the file sizes of the EQT map with different resolutions and the PCD file are shown in Table 2. It is shown that according to our map construction method, when the resolution changes from 8 cm to 16 cm or 24 cm, the map memory consumption of the industrial park is reduced by 77.9% and 90.3%, respectively, and the map memory consumption of the roads is reduced by 74.5% and 87.9%, respectively. The PCD file used in the NDT algorithm is much larger than the EQT map. park environment and the road environment are shown in Figures 6 and 7. We can find that the number of available GNSS satellites in the industrial park environment is obviously less than that in the road environment, and the DOP value in the industrial park environment is obviously larger than that in the road environment. It shows that the observation conditions for GNSS in the road environment were obviously better than those in the industrial park environment.  The file size of the pre-built map is an important factor in practical applications, as it may affect the efficiency of the scan matching algorithm. More importantly, in the highprecision positioning map service for future autonomous driving, map data files must be placed on the cloud server first, so that the smaller the amount of map data, the better the map data transmission. The resolution of the map mainly determines the size of the memory occupied by the map. The shape of the industrial park is rectangular, and its area is about 250 m × 250 m. The road test environment includes 1.6 km of roads with essentially no tall buildings on either side. We constructed point cloud maps for testing industrial parks and road scenes with resolutions of 8 cm, 16 cm, and 24 cm. The NDT algorithm uses a PCD file as input, and the file sizes of the EQT map with different resolutions and the PCD file are shown in Table 2. It is shown that according to our map construction method, when the resolution changes from 8 cm to 16 cm or 24 cm, the map memory consumption of the industrial park is reduced by 77.9% and 90.3%, respectively, and the map memory consumption of the roads is reduced by 74.5% and 87.9%, respectively. The PCD file used in the NDT algorithm is much larger than the EQT map. The file size of the pre-built map is an important factor in practical applications, as it may affect the efficiency of the scan matching algorithm. More importantly, in the highprecision positioning map service for future autonomous driving, map data files must be placed on the cloud server first, so that the smaller the amount of map data, the better the map data transmission. The resolution of the map mainly determines the size of the memory occupied by the map. The shape of the industrial park is rectangular, and its area is about 250 m × 250 m. The road test environment includes 1.6 km of roads with essentially no tall buildings on either side. We constructed point cloud maps for testing industrial parks and road scenes with resolutions of 8 cm, 16 cm, and 24 cm. The NDT algorithm uses a PCD file as input, and the file sizes of the EQT map with different resolutions and the PCD file are shown in Table 2. It is shown that according to our map construction method, when the resolution changes from 8 cm to 16 cm or 24 cm, the map memory consumption of the industrial park is reduced by 77.9% and 90.3%, respectively, and the map memory consumption of the roads is reduced by 74.5% and 87.9%, respectively. The PCD file used in the NDT algorithm is much larger than the EQT map.
The localization performance based on the EQT map of different resolutions and the NDT match are shown in Figures 8 and 9. As can be seen from Figures 8 and 9, the localization performance of the method proposed in this paper is better than the NDT algorithm. We have detailed statistics on the distribution of position error and attitude error, as shown in Tables 3 and 4  The localization performance based on the EQT map of different resolutions and the NDT match are shown in Figures 8 and 9. As can be seen from Figures 8 and 9, the localization performance of the method proposed in this paper is better than the NDT algorithm. We have detailed statistics on the distribution of position error and attitude error, as shown in Tables 3 and 4    The percentages of scans where the matching algorithm achieves positioning better than 0.1 m, 0.2 m, or 0.3 m and calculating attitude accuracy better than 0.1°, 0.2°, or 0.3° are shown in Tables 5 and 6, and maximum positioning errors are also presented. It can be seen that as the resolution increases from 8 cm to 24 cm, the probability of large errors increases significantly, especially in the field test of the industrial park, where the percentages of positioning achieving better than 0.1 m are 85.39%, 67.76%, and 42.00%, respectively. It can also be seen from Table 3 that the change in resolution has a smaller impact  The percentages of scans where the matching algorithm achieves positioning better than 0.1 m, 0.2 m, or 0.3 m and calculating attitude accuracy better than 0.1 • , 0.2 • , or 0.3 • are shown in Tables 5 and 6, and maximum positioning errors are also presented. It can be seen that as the resolution increases from 8 cm to 24 cm, the probability of large errors increases significantly, especially in the field test of the industrial park, where the percentages of positioning achieving better than 0.1 m are 85.39%, 67.76%, and 42.00%, respectively. It can also be seen from Table 3 that the change in resolution has a smaller impact on positioning performance in the field test of an industrial park than in road environments.  The time cost was calculated to verify the efficiency of the proposed positioning method based on a pre-built map. The statistical average time cost is shown in Table 7. We can find that the average time cost has a slight decreasing trend when the resolution of the pre-built map changes from 8 cm to 24 cm. The average time cost of each scan match is far less than the frame rate (100 ms) of LiDAR, which verified the efficiency of our algorithm and can be used for real-time positioning. From Table 7, we can also see that the method proposed in this paper is more computationally efficient than the NDT algorithm.

Underground Mine Tunnel Environment Experiment
We also evaluated the positioning performance of our algorithm in an underground mine tunnel environment. The road in the mine tunnel is rugged, with uphill and downhill sections. Since commonly used MMS cannot be used to collect point cloud maps in GNSSdenied environments such as underground mines, the FARO Laser Scanner Focus 3D X330 was used to collect point clouds to construct QET maps. The pre-built point cloud map and real underground mine tunnel environment are shown in Figure 10. The mine tunnel is about 5 m wide and 4.5 m high. We also used the vehicle-mounted VLP-16 LiDAR for positioning data collection, and the length of the test path is about 600 m. Due to the lack of groundtruth, we only performed a qualitative analysis on the positioning results. Due to the complex environment under the mine, the average speed of the vehicle was 10 km/h during the field test.

Underground Mine Tunnel Environment Experiment
We also evaluated the positioning performance of our algorithm in an underground mine tunnel environment. The road in the mine tunnel is rugged, with uphill and downhill sections. Since commonly used MMS cannot be used to collect point cloud maps in GNSS-denied environments such as underground mines, the FARO Laser Scanner Focus 3D X330 was used to collect point clouds to construct QET maps. The pre-built point cloud map and real underground mine tunnel environment are shown in Figure 10. The mine tunnel is about 5 m wide and 4.5 m high. We also used the vehicle-mounted VLP- 16 Li-DAR for positioning data collection, and the length of the test path is about 600 m. Due to the lack of groundtruth, we only performed a qualitative analysis on the positioning results. Due to the complex environment under the mine, the average speed of the vehicle was 10 km/h during the field test.  We constructed point cloud maps for underground mine tunnels with resolutions of 8 cm, 16 cm, and 24 cm; the file sizes of the EQT maps of different resolutions are shown in Table 8. From Table 8, we can see that according to the point cloud map construction method in this paper, when the resolution changed from 8 cm to 16 cm and 24 cm, the map memory consumption was reduced by 68.7% and 76.2%, respectively. The PCD file used in the NDT algorithm is also much larger than the EQT map. The positioning trajectories in the underground mine tunnel are shown in (a) of Figure 11, and the XYZ view is shown in b of Figure 11. The positioning trajectories based on the EQT maps of 8 cm, 16 cm, 24 cm resolution, and NDT trajectory are respectively marked as Traj_A, Traj_B, Traj_C, and Traj_NDT. It is obvious that Traj_NDT ends up in chaos, which indicates that the NDT algorithm failed at last. The trajectories inside the red ellipse in (a) of Figure 11 are enlarged and shown in (c) of Figure 11. As it can be seen from (b) of Figure 11, the differences between Traj_A, Traj_B, and Traj_C are very small except for the red box in (b) of Figure 11; the trajectories in the red box are enlarged and shown in (d) of Figure 11. It is obvious that the Traj_B and Traj_C were wrongly positioned, as the vehicle-mounted VLP-16 LiDAR was essentially moving at a constant speed during the field test. From (d) of Figure 11, we can also see that Traj_A was smoother than Traj_NDT. At the same time, we carefully checked the point cloud data used to build the map. The point cloud map in the NDT location failure part is complete, so the failed NDT algorithm trajectories are not related to defects in the construction of the map. We further compared and analyzed the differences between Traj_A, Traj_B, Traj_C, and Traj_NDT in detail. Due to the complex environment under the mine, there is no groundtruth that can be referenced, so we take Traj_A as a reference for qualitative analysis because Traj_A was the smoothest of the four trajectories. The difference between Traj_A and Traj_B, Traj_A and Traj_C, and Traj_A and Traj_NDT in the same epoch is shown in Figure 12 for each. Among the differences between Traj_A and Traj_B in the X, Y, and Z directions, the percentages of difference less than 0.1 m are 93.6%, 96.4%, and 97.3%. Among the differences between Traj_A and Traj_C in the X, Y, and Z directions, the percentages of difference less than 0.1 m are 89.1%, 91.2%, and 90.2%. The biggest difference between Traj_A and Traj_B can reach 4.5 m in the Y direction, and the biggest difference between Traj_A and Traj_C can reach 17.5m in the Y direction. We can see from (a),(b) of Figure 12 that they were essentially in the same place. Although it is obvious that there is a large positioning error in Traj_B and Traj_C, we can see a convergence process in (a), (b) of Figure 12; it converged to the correct position through subsequent epochs. From one aspect, it reflects that our algorithm can tolerate large initial position errors in an underground mine tunnel environment. While among the differences between Traj_A and Traj_NDT in X, Y, and Z directions, the percentages of difference less than 0.1 m are We further compared and analyzed the differences between Traj_A, Traj_B, Traj_C, and Traj_NDT in detail. Due to the complex environment under the mine, there is no groundtruth that can be referenced, so we take Traj_A as a reference for qualitative analysis because Traj_A was the smoothest of the four trajectories. The difference between Traj_A and Traj_B, Traj_A and Traj_C, and Traj_A and Traj_NDT in the same epoch is shown in Figure 12 for each. Among the differences between Traj_A and Traj_B in the X, Y, and Z directions, the percentages of difference less than 0.1 m are 93.6%, 96.4%, and 97.3%. Among the differences between Traj_A and Traj_C in the X, Y, and Z directions, the percentages of difference less than 0.1 m are 89.1%, 91.2%, and 90.2%. The biggest difference between Traj_A and Traj_B can reach 4.5 m in the Y direction, and the biggest difference between Traj_A and Traj_C can reach 17.5m in the Y direction. We can see from (a),(b) of Figure 12 that they were essentially in the same place. Although it is obvious that there is a large positioning error in Traj_B and Traj_C, we can see a convergence process in (a), (b) of Figure 12; it converged to the correct position through subsequent epochs. From one aspect, it reflects that our algorithm can tolerate large initial position errors in an underground mine tunnel environment. While among the differences between Traj_A and Traj_NDT in X, Y, and Z directions, the percentages of difference less than 0.1 m are 64.7%, 64.3%, and 71.2%. Traj_A and Traj_NDT have large errors in the last part because Traj_NDT was wrongly positioned at the end. The statistical average time cost of scan-to-map matching is shown in Table 9. It seems that the average time cost of using maps with different resolutions makes no difference. The average time cost in the field test of an underground mine tunnel was less than in the field test of an industrial park or road environment, which may have been caused by the long and narrow environmental characteristics of the underground mine tunnel. It also indicates that the method proposed in this paper is more computationally efficient than the NDT algorithm. Table 9. Average time cost of scan-to-map matching in underground mine tunnel.

Scenes
Map Res Average Time Cost (ms) The statistical average time cost of scan-to-map matching is shown in Table 9. It seems that the average time cost of using maps with different resolutions makes no difference. The average time cost in the field test of an underground mine tunnel was less than in the field test of an industrial park or road environment, which may have been caused by the long and narrow environmental characteristics of the underground mine tunnel. It also indicates that the method proposed in this paper is more computationally efficient than the NDT algorithm. We conducted four data collections according to the mining route, including two complete routes from the ore loading point to the ore unloading point and two complete routes from the ore unloading point to the ore loading point. We used the EQT map with a resolution of 8 cm for vehicle positioning; the trajectories are shown in Figure 13. Although there is no groundtruth, we can see that our four trajectories were continuous and complete, which can prove the stability of our algorithm to a certain extent. We conducted four data collections according to the mining route, including two complete routes from the ore loading point to the ore unloading point and two complete routes from the ore unloading point to the ore loading point. We used the EQT map with a resolution of 8 cm for vehicle positioning; the trajectories are shown in Figure 13. Although there is no groundtruth, we can see that our four trajectories were continuous and complete, which can prove the stability of our algorithm to a certain extent.

KITTI Dataset
The KITTI dataset was jointly established by the Karlsruhe Institute of Technology in Germany and the Toyota American Institute of Technology. It is the world's largest and most popular dataset for automatic driving and SLAM technology evaluation. First, we used the groundtruth provided by the KITTI dataset to splice all point cloud frames into a complete point cloud map and then used our method to generate an EQT map; all maps used 15 cm resolution. Then, we gave an initial pose to achieve continuous positioning of point cloud frames. The initial pose was based on the truth value, which added 0-20.0 cm of random error to the position and 0-1.0 degree of random error to the pose. Before the calculation, the point cloud of each frame was uniformly downsampled, and about 10,000 points were reserved for each frame. Then, we used the SLAM evo Error! Reference source not found. tool to evaluate the absolute position error (APE) and count the error information. Evo is a trajectory evaluation tool for odometer and SLAM problems. Its core function is to be able to draw the groundtruth, or evaluate the error between the trajectory and the groundtruth in formats supporting multiple data sets. We counted the maximum, average, root mean square error (RMSE), and standard deviation (STD) of the absolute position error of each group of data sets, as well as the average time cost (ATC) of each frame; the results are shown in Table 10. It can be seen from Table 10 that most of the absolute positioning accuracy of our method is very high, achieving centimeter-level positioning.

KITTI Dataset
The KITTI dataset was jointly established by the Karlsruhe Institute of Technology in Germany and the Toyota American Institute of Technology. It is the world's largest and most popular dataset for automatic driving and SLAM technology evaluation. First, we used the groundtruth provided by the KITTI dataset to splice all point cloud frames into a complete point cloud map and then used our method to generate an EQT map; all maps used 15 cm resolution. Then, we gave an initial pose to achieve continuous positioning of point cloud frames. The initial pose was based on the truth value, which added 0-20.0 cm of random error to the position and 0-1.0 degree of random error to the pose. Before the calculation, the point cloud of each frame was uniformly downsampled, and about 10,000 points were reserved for each frame. Then, we used the SLAM evo [40] tool to evaluate the absolute position error (APE) and count the error information. Evo is a trajectory evaluation tool for odometer and SLAM problems. Its core function is to be able to draw the groundtruth, or evaluate the error between the trajectory and the groundtruth in formats supporting multiple data sets. We counted the maximum, average, root mean square error (RMSE), and standard deviation (STD) of the absolute position error of each group of data sets, as well as the average time cost (ATC) of each frame; the results are shown in Table 10. It can be seen from Table 10 that most of the absolute positioning accuracy of our method is very high, achieving centimeter-level positioning. The positioning results of datasets 00 and 08 are poor, dataset 10 is the best. Figures 14  and 15 show the APE results and trajectories of datasets 00 and 10. From Figure 14, we can find that the reason for the poor statistical results of data set 00 is that there are a large number of large positioning errors in the repeated track sections. We think that the possible reason for this phenomenon is the inaccuracy of the two trajectories, which leads to the inaccurate stitching of the prior point cloud map. We do find that in the groundtruth, there is sometimes a large error in the Z-direction, which will cause the map to be inaccurate. Although we use the same set of data to build maps and locate, this can also show the high accuracy of our location algorithm. We have to admit that there is a risk of positioning failure when the map environment changes greatly. The positioning results of datasets 00 and 08 are poor, dataset 10 is the best. Figures  14 and 15 show the APE results and trajectories of datasets 00 and 10. From Figure 14, we can find that the reason for the poor statistical results of data set 00 is that there are a large number of large positioning errors in the repeated track sections. We think that the possible reason for this phenomenon is the inaccuracy of the two trajectories, which leads to the inaccurate stitching of the prior point cloud map. We do find that in the groundtruth, there is sometimes a large error in the Z-direction, which will cause the map to be inaccurate. Although we use the same set of data to build maps and locate, this can also show the high accuracy of our location algorithm. We have to admit that there is a risk of positioning failure when the map environment changes greatly.

Ablation Study
In order to verify the improvement in computing efficiency brought by the Bursa model, we carried out experimental verification on the KITTI dataset 09. The Bursa sevenparameters model is a model commonly used in geodesy to solve the problem of coordinate conversion parameters between two rectangular coordinate systems; of course, we  The positioning results of datasets 00 and 08 are poor, dataset 10 is the best. Figures  14 and 15 show the APE results and trajectories of datasets 00 and 10. From Figure 14, we can find that the reason for the poor statistical results of data set 00 is that there are a large number of large positioning errors in the repeated track sections. We think that the possible reason for this phenomenon is the inaccuracy of the two trajectories, which leads to the inaccurate stitching of the prior point cloud map. We do find that in the groundtruth, there is sometimes a large error in the Z-direction, which will cause the map to be inaccurate. Although we use the same set of data to build maps and locate, this can also show the high accuracy of our location algorithm. We have to admit that there is a risk of positioning failure when the map environment changes greatly.

Ablation Study
In order to verify the improvement in computing efficiency brought by the Bursa model, we carried out experimental verification on the KITTI dataset 09. The Bursa sevenparameters model is a model commonly used in geodesy to solve the problem of coordinate conversion parameters between two rectangular coordinate systems; of course, we

Ablation Study
In order to verify the improvement in computing efficiency brought by the Bursa model, we carried out experimental verification on the KITTI dataset 09. The Bursa sevenparameters model is a model commonly used in geodesy to solve the problem of coordinate conversion parameters between two rectangular coordinate systems; of course, we have not considered the scale factor here. The Bursa parameters model removes the linearized high-order terms based on the small angle vacation to simplify calculation. If we rotate the point according to three Euler angles between two coordinate systems, the rotation matrix is: cos ε y cos ε z cos ε y sin ε z − sin ε y − cos ε x sin ε z + sin ε x sin ε y cos ε z cos ε x cos ε z + sin ε x sin ε y sin ε z sin ε x cos ε y sin ε x sin ε z + cos ε x sin ε y cos ε z − sin ε x cos ε z + cos ε x sin ε y sin ε z cos ε x cos ε y   (13) If the form of a rotation matrix is used to solve the partial derivative H-matrix, the formula will be complex; although this calculation is not very large for the computer, each point needs to be calculated once for each iteration. We used the Bursa model and the rotation matrix model to simulate and calculate the partial derivative H matrix 10,000 times. The Bursa model hardly consumes time, and the rotation matrix would take about 1ms. Then we used the Bursa model and the rotation matrix model to calculate the 09 data of KITTI data; the calculation time and positioning result difference of each frame are shown in Figure 16. The average calculation time for using the Bursa model is 38.1 ms, and the average calculation time for using the rotation matrix is 45.2 ms. The average time consumption of using the Bursa model is reduced by 7.1 ms, which is 15.7%. From Figure 17, we can see that the positioning results of the two models are almost the same, so using the Bursa model will increase calculation efficiency without sacrificing accuracy.
If the form of a rotation matrix is used to solve the partial derivative H-matrix, the formula will be complex; although this calculation is not very large for the computer, each point needs to be calculated once for each iteration. We used the Bursa model and the rotation matrix model to simulate and calculate the partial derivative H matrix 10,000 times. The Bursa model hardly consumes time, and the rotation matrix would take about 1ms. Then we used the Bursa model and the rotation matrix model to calculate the 09 data of KITTI data; the calculation time and positioning result difference of each frame are shown in Figure 16. The average calculation time for using the Bursa model is 38.1 ms, and the average calculation time for using the rotation matrix is 45.2 ms. The average time consumption of using the Bursa model is reduced by 7.1 ms, which is 15.7%. From Figure  17, we can see that the positioning results of the two models are almost the same, so using the Bursa model will increase calculation efficiency without sacrificing accuracy.

Discussion
To evaluate the positioning performance of the method proposed in this paper, extensive field tests for different scenarios have been conducted. From the field test in industrial parks and road environments, the positioning accuracy was mostly consistent with the reference trajectory calculated by the GNSS RTK/INS integrated navigation system. At the same time, field tests show that the method proposed in this paper is significantly better than the NDT method. The position and attitude results are reliable in most areas, as almost above 99% of positioning errors are less than 0.3 m and 93-98% attitude errors are less than 0.3 • . Even if large positioning errors occasionally occur, the algorithm is able to converge in subsequent epochs. In addition, we tested the impact of using maps of different resolutions on the positioning accuracy. In the field test of industrial parks, the Generally speaking, the industrial park has rich environmental features, so the positioning results based on the prior map in the industrial park should be better than those in the road environment. In our experiment, the opposite situation occurs because the GNSS signal observation environment in the industrial park is poor, resulting in inaccurate groundtruth.
In the field test of an underground mine tunnel, there were obvious errors when the resolution map was 16 cm or 24 cm, but the algorithm can correct itself well in subsequent epochs. There is a trend that as the resolution of the map increases from 8 cm to 24 cm, the positioning results tend to worsen and the probability of large errors increases. The failure of the NDT algorithm is not caused by map construction defects. Alternatively, utilizing maps with a 24 cm resolution has the advantage of requiring less storage. Due to the narrow and long characteristics of underground mine tunnels, it is helpful to use highresolution maps to express more detailed environmental information for more accurate scan-to-map matching and positioning. Generally, in different computing platforms and application environments, an appropriate resolution should be selected to balance storage consumption, positioning accuracy, and computing efficiency.

Conclusions
In this paper, we have presented a precise and robust Lidar 6DOF global localization method based on a pre-built 3D occupied HD map called the EQT map. Extensive field tests were conducted and have verified the reliability and accuracy of the method. They also show that the method proposed in this paper is significantly better than the NDT method. Our novel pre-built EQT map utilizes a planar quadtree for block division and a Z-sequence index structure within the block cells. Then a four-level occupancy probability cell value model is adopted, which would save about 8 times the amount of storage space and accelerate querying the occupancy value from the map compared with the strategy of Google Cartographer. In our scan-to-map matching algorithm, the Bursa linearized model is applied in a Gauss-Newton iteration, which increases the stability and accelerates the positioning convergence, relying on our proposed EQT HD map. All the above improvements lead to optimal storage and query efficiency for real-time scan matching positioning. In the field tests of industrial parks and urban roads, a positioning accuracy of about 6cm-13cm and an attitude accuracy of 0.13-0.15 • were achieved only using a LiDAR. The positioning accuracy will decrease as the resolution increases from 8 cm to 24 cm. While in a long and narrow environment such as an underground mine tunnel, high-resolution maps are helpful for accurate and robust localization. We also verified the effectiveness of our algorithm through the KITTI dataset, which achieves centimeter-level positioning.
The LiDAR-based positioning method presented in this paper highly depends on the pre-built point cloud maps; it is inevitable that the positioning accuracy will decrease or even fail when the environment changes significantly, such as during road reconstruction, building construction, or demolition. A fast and low-cost prior point cloud map update method is an inevitable choice to solve this problem, which is also our future work. Another problem is that in this paper we give the initial position through GNSS or manually in a GNSS-denied environment. Automatic and fast initialization in a GNSS-denied environment is a problem worth studying.  Data Availability Statement: KITTI data was downloaded from the KITTI Vision Benchmark Suite (https://www.cvlibs.net/datasets/kitti/raw_data.php) (accessed on 20 December 2022). Other data is not applicable.

Conflicts of Interest:
The authors declare no conflict of interest.