Article

Online Mapping from Weight Matching Odometry and Highly Dynamic Point Cloud Filtering via Pseudo-Occupancy Grid

1 China North Vehicle Research Institute, Norinco Group, Beijing 100072, China
2 National Key Laboratory of Vehicular Transmission, Beijing Institute of Technology, Beijing 100081, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Sensors 2025, 25(22), 6872; https://doi.org/10.3390/s25226872
Submission received: 29 September 2025 / Revised: 28 October 2025 / Accepted: 31 October 2025 / Published: 10 November 2025
(This article belongs to the Special Issue Application of LiDAR Remote Sensing and Mapping)

Abstract

Efficient locomotion in autonomous driving and robotics requires clearer visualization and more precise maps. This paper presents a high-accuracy online mapping method that combines weight matching LiDAR-IMU-GNSS odometry with an object-level highly dynamic point cloud filtering method based on a pseudo-occupancy grid. The odometry integrates IMU pre-integration, ground point segmentation through progressive morphological filtering (PMF), motion compensation, and weight feature point matching. Weight feature point matching enhances alignment accuracy by combining geometric and reflectance intensity similarities. By computing the pseudo-occupancy ratio between the current frame and prior local submaps, the grid probability values are updated to identify the distribution of dynamic grids. Object-level point cloud clusters are obtained using the curved voxel clustering method, so that object-level highly dynamic point clouds are filtered out during the online mapping process. Compared with the LIO-SAM and FAST-LIO2 frameworks, the proposed odometry demonstrates superior accuracy on the KITTI, UrbanLoco, and Newer College (NCD) datasets. Meanwhile, the proposed highly dynamic point cloud filtering algorithm achieves higher detection precision than Removert and ERASOR. Furthermore, a high-accuracy online map is built from a real-time dataset with comprehensive filtering of driving vehicles, cyclists, and pedestrians. This research contributes to the field of high-accuracy online mapping, especially the advanced filtering of highly dynamic objects.

1. Introduction

LiDAR mapping technology has made remarkable progress and is widely applied in various fields, including autonomous driving [1,2,3], SLAM systems [4,5,6] and robotic navigation [7,8,9,10]. Particularly in robotics [9,10,11], the synergistic integration of multi-sensors and computer vision has revolutionized the environmental perception capabilities of intelligent agents.
Recent research demonstrates that information fusion from a Global Navigation Satellite System (GNSS), an Inertial Measurement Unit (IMU), and a LiDAR sensor achieves high-precision mapping and localization in complex environments [12,13]. However, in reality, the presence of dynamic objects poses significant challenges to the accuracy of LiDAR mapping. Conventional point cloud matching algorithms typically assume a static environment. This assumption holds in most cases because the single-frame scanning time is short, so the impact of dynamic objects on odometry accuracy is negligible [14]. Nevertheless, when objects move at high speed, the resulting point cloud distortion severely disrupts matching precision, leading to accumulated localization errors. In some cases, moving objects appear in the generated map as “ghost artifacts” [15,16]. The resulting map redundancy can severely mislead localization and path planning, decreasing accuracy and safety in navigation decision-making. Therefore, effectively suppressing and eliminating the side effects of highly dynamic point clouds on maps is of great significance for enhancing the reliability and practicality of LiDAR mapping systems.
According to whether sensor data are processed in real time in dynamic environments, dynamic point cloud filtering algorithms can generally be categorized into two classes: online filtering and post-processing-based filtering. Online filtering methods typically select several consecutive point cloud frames as a reference and then compare them with the target frame in order to achieve superior real-time performance. For instance, RF-LIO [17], an extension of LIO-SAM [18], leverages the initial pose from the frontend odometry and local submaps to filter dynamic points in the current frame adaptively and iteratively, ultimately generating a static global map. ETH Zurich’s ASL Lab proposed an end-to-end dynamic object detection framework [19] based on occupancy grids. This framework automatically labels dynamic objects and then trains 3D-MiniNet on them, enabling online detection and filtering of dynamic point clouds. Post-processing methods use all frames of the entire SLAM cycle as a reference and thus achieve higher accuracy and precision. The Peopleremover algorithm [20] constructs a regular voxel occupancy grid and then determines free voxels by traversing the lines of sight from the sensor to the measured points through the voxel grid. ERASOR [21], proposed by Hyungtae Lim et al., detects dynamic points by comparing region-wise occupancy ratios between the current scan and local submaps; regions whose ratio exceeds a threshold are marked as dynamic. Removert [22], presented by Giseop Kim et al., builds 2D depth maps by projecting a 3D point cloud and its neighboring submaps; by comparing the pixel depths at the same positions in the two maps, the shallower pixel is considered a dynamic point. This method outperforms manual annotations on the SemanticKITTI dataset [23], demonstrating excellent filtering efficacy.
In real-world scenarios, accurate decision-making at high speeds requires prioritizing the real-time performance of maps. At the same time, to optimize the effectiveness of decision-making and the robustness of motion behaviors, the map must be built with sufficiently high accuracy. To address this, this paper improves the precision of LiDAR-IMU-GNSS multi-sensor mapping by analyzing the geometric and reflectance similarities of matched point cloud points to determine weight coefficients. It then employs a pseudo-occupancy grid-based method to filter object-level highly dynamic point clouds in real time, effectively enhancing both the accuracy and efficiency of mapping. Simulation experiments demonstrate that the accuracy of this odometry surpasses that of LIO-SAM and FAST-LIO2 on the KITTI, UrbanLoco [24], and Newer College [25] (NCD) datasets. Furthermore, compared with the Removert and ERASOR algorithms, the proposed filtering method removes highly dynamic point clouds more effectively. The proposed algorithm is applied to a real-time dataset to build a map online with successful filtering of driving vehicles, cyclists, and pedestrians.

2. Algorithm Framework

This paper proposes an online mapping algorithm that combines high-accuracy LiDAR-IMU-GNSS multi-sensor fusion with object-level highly dynamic point filtering. The framework comprises data preprocessing, weight feature point matching, and dynamic point filtering, as illustrated in Figure 1.
(i)
Data preprocessing: Raw point cloud data from LiDAR, IMU, and GNSS are processed to generate synchronized and fused multi-sensor measurements. This includes IMU pre-integration, ground point segmentation using the Progressive Morphological Filter (PMF), and motion compensation of LiDAR point clouds.
(ii)
Weight feature point matching: The weight coefficient of the Mahalanobis distance is determined by geometric and reflectance intensity similarities to obtain the optimized pose.
(iii)
Dynamic point filtering: Dynamic objects are removed from the point cloud, and the map is built in real time using a pseudo-occupancy grid filtering algorithm.

3. Data Fusion for Weight Matching LiDAR-IMU-GNSS Odometry

3.1. IMU Pre-Integration

IMU pre-integration constructs constraints between LiDAR keyframes to suppress odometry drift and establish a pose graph that includes attitude, velocity, and position information. After acquiring IMU data and accounting for bias and noise, the predicted pre-integration values for attitude $R_i$, velocity $v_i$, and position $p_i$ between frames $i$ and $j$ are derived using a discrete-time integration method. Furthermore, their relative differences $\Delta R_{i,j}$, $\Delta v_{i,j}$, and $\Delta p_{i,j}$ are expressed in Equation (1),
$$
\begin{aligned}
\Delta R_{i,j} &= R_i^{T} R_j = \prod_{k=i}^{j-1} \mathrm{Exp}\!\big((\tilde{\omega}_k - b_k^{\omega} - n_k^{\omega})\,\Delta t_k\big) \\
\Delta v_{i,j} &= v_j - v_i = \sum_{k=i}^{j-1} \big( R_k(\tilde{a}_k - b_k^{a} - n_k^{a}) - g \big)\,\Delta t_k \\
\Delta p_{i,j} &= p_j - p_i = \sum_{k=i}^{j-1} \Big( v_k\,\Delta t_k + \tfrac{1}{2}\big( R_k(\tilde{a}_k - b_k^{a} - n_k^{a}) - g \big)\,\Delta t_k^{2} \Big),
\end{aligned}
\tag{1}
$$
where $\tilde{\omega}_k$ and $\tilde{a}_k$ represent the angular velocity measured by the gyroscope and the linear acceleration measured by the accelerometer at the $k$th frame, respectively. $n_k^{\omega}$ and $n_k^{a}$ denote the noise of the angular velocity and acceleration at the $k$th frame. $b_k^{\omega}$ and $b_k^{a}$ are the measurement biases of the angular velocity and acceleration at the $k$th frame. $\Delta t_k$ is the time interval between the $k$th frame and the $(k+1)$th frame of the IMU data. $g$ represents the gravity vector. $\mathrm{Exp}(\cdot)$ denotes the mapping from a rotation vector to a rotation matrix. Therefore, the state at the $(k+1)$th frame can be predicted by integrating the state at the $k$th frame.
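As a rough illustration of the discrete-time integration in Equation (1), the following minimal Python/NumPy sketch propagates attitude, velocity, and position over one pre-integration interval. The function names (skew, so3_exp, propagate_state), the argument layout, and the gravity-sign convention are assumptions chosen to mirror Equation (1); they are not code from the paper.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def so3_exp(phi):
    """Exp(.): rotation vector -> rotation matrix (Rodrigues' formula)."""
    theta = np.linalg.norm(phi)
    if theta < 1e-8:
        return np.eye(3) + skew(phi)
    a = phi / theta
    A = skew(a)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def propagate_state(R_i, v_i, p_i, imu_samples, b_w, b_a, g):
    """Discrete-time propagation of attitude/velocity/position following Eq. (1),
    with the measurement noise terms n_k omitted (they enter the covariance, Eq. (6)).
    imu_samples: iterable of (gyro, accel, dt) tuples covering [t_i, t_j)."""
    R, v, p = np.array(R_i, dtype=float), np.array(v_i, dtype=float), np.array(p_i, dtype=float)
    for w_meas, a_meas, dt in imu_samples:
        acc = R @ (np.asarray(a_meas) - b_a) - g          # world-frame acceleration minus gravity
        p = p + v * dt + 0.5 * acc * dt * dt               # position update (third line of Eq. (1))
        v = v + acc * dt                                   # velocity update (second line of Eq. (1))
        R = R @ so3_exp((np.asarray(w_meas) - b_w) * dt)   # attitude update (first line of Eq. (1))
    return R, v, p
```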
Assuming that the IMU biases $b_k^{\omega}$ and $b_k^{a}$ remain constant over a short period, the noise terms can be further simplified. Using the linear approximation from Lie group theory, we establish the relationship between the predicted $\Delta R_{i,j}$, $\Delta v_{i,j}$, and $\Delta p_{i,j}$ and the measured $\Delta \tilde{R}_{i,j}$, $\Delta \tilde{v}_{i,j}$, and $\Delta \tilde{p}_{i,j}$. The measured values are written as follows:
$$
\begin{aligned}
\Delta R_{i,j} &= \Delta \tilde{R}_{i,j}\,\mathrm{Exp}(-\delta\phi_{i,j}) \\
\Delta v_{i,j} &= \Delta \tilde{v}_{i,j} - \delta v_{i,j} \\
\Delta p_{i,j} &= \Delta \tilde{p}_{i,j} - \delta p_{i,j}.
\end{aligned}
\tag{2}
$$
On the left-hand side of Equation (2) are the predicted values derived from the state variables, and on the right-hand side are the measured values obtained through IMU integration and a random noise term. To explicitly represent the measured values in Equation (2), it is necessary to derive the state transition equations for the random noise terms.
Equation (2) shows that $\Delta R_{i,j}$ differs from $\Delta \tilde{R}_{i,j}$ by a noise term $\mathrm{Exp}(\delta\phi_{i,j})$. By applying the operator $\mathrm{Log}(\cdot)$, which denotes the mapping from a rotation matrix to a rotation vector, the noise term can be linearly approximated. Retaining only the first-order term, $\delta\phi_{i,j}$ is expressed as
$$
\delta\phi_{i,j} = \mathrm{Log}\!\left( \prod_{k=i}^{j-1} \mathrm{Exp}\!\big( \Delta \tilde{R}_{k+1,j}^{T} J_{r,k}\, n_k^{\omega}\, \Delta t_k \big) \right)
\approx \sum_{k=i}^{j-1} \Delta \tilde{R}_{k+1,j}^{T} J_{r,k}\, n_k^{\omega}\, \Delta t_k
= \Delta \tilde{R}_{j-1,j}^{T}\, \delta\phi_{i,j-1} + J_{r,j-1}\, n_{j-1}^{\omega}\, \Delta t_{j-1},
\tag{3}
$$
where $J_{r,k}$ is the right Jacobian of SO(3) evaluated at the $k$th frame, and $J_{r,j-1}$ is defined analogously at the $(j-1)$th frame. $n_{j-1}^{\omega}$ represents the angular velocity noise at the $(j-1)$th frame. $\Delta \tilde{R}_{k+1,j}^{T}$ is the transpose of the measured attitude difference between the $(k+1)$th frame and the $j$th frame. $\Delta t_{j-1}$ is the time interval between the $(j-1)$th frame and the $j$th frame of the IMU data.
Assuming that the noise terms $\delta\phi_{i,j}$, $\delta v_{i,j}$, and $\delta p_{i,j}$, as well as the bias noise of the IMU, all follow zero-mean Gaussian distributions, we derive the following expression:
$$
\eta_{i,j} = \begin{bmatrix} \delta\phi_{i,j} \\ \delta v_{i,j} \\ \delta p_{i,j} \end{bmatrix} \sim \mathcal{N}\big(0_{9\times 1}, \Sigma_{\eta_{i,j}}\big)
\quad \text{and} \quad
\eta_j^{n} = \begin{bmatrix} n_j^{\omega} \\ n_j^{a} \end{bmatrix} \sim \mathcal{N}\big(0_{6\times 1}, \Sigma_{\eta_j^{n}}\big).
\tag{4}
$$
Similar procedures are applied to the $\delta v_{i,j}$ and $\delta p_{i,j}$ terms. The state transition equation of $\eta_{i,j}$ is given by
$$
\eta_{i,j} = A_j\, \eta_{i,j-1} + B_j\, \eta_{j-1}^{n},
\quad
A_j = \begin{bmatrix}
\Delta \tilde{R}_{j-1,j}^{T} & 0 & 0 \\
-\Delta \tilde{R}_{i,j-1}\,(\tilde{a}_{j-1} - b_i^{a})^{\wedge}\, \Delta t_{j-1} & I & 0 \\
-\tfrac{1}{2}\Delta \tilde{R}_{i,j-1}\,(\tilde{a}_{j-1} - b_i^{a})^{\wedge}\, \Delta t_{j-1}^{2} & \Delta t_{j-1} I & I
\end{bmatrix}
\ \text{and} \
B_j = \begin{bmatrix}
J_{r,j-1}\, \Delta t_{j-1} & 0 \\
0 & \Delta \tilde{R}_{i,j-1}\, \Delta t_{j-1} \\
0 & \tfrac{1}{2}\Delta \tilde{R}_{i,j-1}\, \Delta t_{j-1}^{2}
\end{bmatrix},
\tag{5}
$$
where $I$ denotes the identity matrix and $(\cdot)^{\wedge}$ denotes the skew-symmetric (antisymmetric) matrix of a vector. Consequently, the state transition equation for the covariance matrix of the noise term is derived as
$$
\Sigma_{\eta_{i,j}} = A_j\, \Sigma_{\eta_{i,j-1}}\, A_j^{T} + B_j\, \Sigma_{\eta_j^{n}}\, B_j^{T}.
\tag{6}
$$
The variation of IMU biases over time significantly affects the accuracy of measurements. By assuming that the pre-integrated measurements vary linearly with the IMU biases, we retain the first-order partial derivatives of the pre-integrated measurements with respect to the biases, so as to compensate for bias-induced errors in pre-integration.
$$
\begin{aligned}
\Delta \tilde{R}_{i,j}(b_i^{\omega} + \delta b_i^{\omega}) &= \Delta \tilde{R}_{i,j}(b_i^{\omega})\, \mathrm{Exp}\!\left( \frac{\partial \Delta \tilde{R}_{i,j}}{\partial b_i^{\omega}}\, \delta b_i^{\omega} \right) \\
\Delta \tilde{v}_{i,j}(b_i^{\omega} + \delta b_i^{\omega},\, b_i^{a} + \delta b_i^{a}) &= \Delta \tilde{v}_{i,j}(b_i^{\omega}, b_i^{a}) + \frac{\partial \Delta \tilde{v}_{i,j}}{\partial b_i^{\omega}}\, \delta b_i^{\omega} + \frac{\partial \Delta \tilde{v}_{i,j}}{\partial b_i^{a}}\, \delta b_i^{a} \\
\Delta \tilde{p}_{i,j}(b_i^{\omega} + \delta b_i^{\omega},\, b_i^{a} + \delta b_i^{a}) &= \Delta \tilde{p}_{i,j}(b_i^{\omega}, b_i^{a}) + \frac{\partial \Delta \tilde{p}_{i,j}}{\partial b_i^{\omega}}\, \delta b_i^{\omega} + \frac{\partial \Delta \tilde{p}_{i,j}}{\partial b_i^{a}}\, \delta b_i^{a}.
\end{aligned}
\tag{7}
$$

3.2. Ground Point Segmentation

The point cloud of the keyframe is extracted to reduce storage. In order to achieve adaptive adjustment of the keyframe threshold, the parameter $d^k$ is determined from the surrounding environment. Therefore, the adaptive keyframe threshold $d_i^k$ is expressed in Equation (8):
$$
d_i^k =
\begin{cases}
5.0, & d^k > 20 \\
3.0, & 8 < d^k \le 20 \\
1.0, & d^k \le 8,
\end{cases}
\tag{8}
$$
where $d^k = \alpha\, d^{k-1} + \beta\, \bar{r}$, and $\bar{r}$ is the median of the Euclidean distances from the points of the current frame to the origin of the LiDAR. $d^{k-1}$ represents the openness of the scanning environment in the previous frame. $\alpha$ and $\beta$ are constants, chosen empirically as $\alpha = 0.9$ and $\beta = 0.1$ in the experimental section.
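A minimal sketch of the adaptive threshold in Equation (8) is given below; the function name and return values are illustrative, with alpha and beta set to the empirical values quoted above.

```python
import numpy as np

def keyframe_threshold(points, d_prev, alpha=0.9, beta=0.1):
    """Adaptive keyframe distance threshold of Eq. (8).
    points: (N, 3) array, current-frame point cloud in the LiDAR frame.
    d_prev: openness measure d^{k-1} from the previous frame."""
    r_med = np.median(np.linalg.norm(points, axis=1))  # median range to the LiDAR origin
    d_k = alpha * d_prev + beta * r_med                # smoothed openness d^k
    if d_k > 20:
        thr = 5.0
    elif d_k > 8:
        thr = 3.0
    else:
        thr = 1.0
    return thr, d_k
```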
During online mapping, the Progressive Morphological Filtering (PMF) algorithm is employed to segment ground points and non-ground points in each keyframe point cloud in real time.
PMF optimizes the window dimensions by gradually increasing the size of the filtering window. The update principles for the filtering window size w k and the height difference threshold d h k are presented in Equation (9).
$$
w_k = 2bk + 1, \qquad
dh_k =
\begin{cases}
dh_0, & w_k \le 3 \\
s\,(w_k - w_{k-1})\,c + dh_0, & w_k > 3 \\
dh_{max}, & dh_k > dh_{max},
\end{cases}
\tag{9}
$$
where $w_k$ denotes the window size at the $k$th iteration and $b$ represents the initial window size. $dh_0$ and $dh_{max}$ refer to the initial and maximum height difference thresholds, respectively. $s$ indicates the slope, with $s = \max\!\big( 2\,dh / (w_k - w_{k-1}) \big)$. $c$ stands for the grid size.
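The window and threshold schedule of Equation (9) can be precomputed as in the following sketch; the parameter values (initial window size, grid size, slope) are placeholders, and the loop follows Equation (9) as written rather than the paper's implementation.

```python
def pmf_schedule(num_iters, b, c, dh0, dh_max, slope):
    """Window sizes and height-difference thresholds for progressive morphological
    filtering, following Eq. (9) with w_k = 2bk + 1 as written there."""
    windows, thresholds = [], []
    w_prev = None
    for k in range(1, num_iters + 1):
        w_k = 2 * b * k + 1
        if w_prev is None or w_k <= 3:
            dh_k = dh0
        else:
            dh_k = slope * (w_k - w_prev) * c + dh0
        dh_k = min(dh_k, dh_max)        # cap at the maximum height-difference threshold
        windows.append(w_k)
        thresholds.append(dh_k)
        w_prev = w_k
    return windows, thresholds
```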

3.3. Motion Compensation of LiDAR Point Clouds

At high vehicle speeds, motion distortion compensation for point clouds becomes essential. When IMU data is available, IMU-based motion compensation in continuous time is used. Otherwise, in the case of IMU data dropout, the motion compensation method based on a constant velocity model is used. Assuming that the current frame maintains the same motion state as the previous LiDAR frame, the LiDAR pose at each timestamp within the current frame can be calculated through interpolation, as expressed below,
$$
\begin{aligned}
R^{*} &= R_{i-1}\, \mathrm{Exp}(\mu\, \Delta\phi), \qquad \Delta\phi = \mathrm{Log}\!\big( R_{i-2}^{T} R_{i-1} \big) \\
p^{*} &= p_{i-1} + \mu\,( p_{i-1} - p_{i-2} ), \qquad \mu = \frac{t^{*} - t_{i-1}}{t_{i-1} - t_{i-2}},
\end{aligned}
\tag{10}
$$
where $R^{*}$ and $p^{*}$ denote the query point's attitude and position, $t^{*}$ represents the current timestamp, and the subscripts $i-1$ and $i-2$ refer to the two previous LiDAR frames.
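A hedged sketch of the constant-velocity extrapolation in Equation (10) follows, reusing so3_exp from the pre-integration sketch above; the function and argument names are illustrative.

```python
import numpy as np

def so3_log(R):
    """Log(.): rotation matrix -> rotation vector."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def constant_velocity_pose(R_im2, p_im2, t_im2, R_im1, p_im1, t_im1, t_query):
    """Pose at an arbitrary point timestamp under the constant-velocity model, Eq. (10)."""
    mu = (t_query - t_im1) / (t_im1 - t_im2)
    dphi = so3_log(R_im2.T @ R_im1)              # relative rotation between the last two frames
    R_q = R_im1 @ so3_exp(mu * dphi)             # extrapolated attitude R*
    p_q = p_im1 + mu * (p_im1 - p_im2)           # extrapolated position p*
    return R_q, p_q
```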

3.4. Weight Feature Point Matching Method Based on Geometric-Reflectance Intensity Similarity

The Mahalanobis distance between points is commonly used as the criterion for feature point matching and for obtaining the optimized pose. However, assigning the same weight to each matched point fails to accurately reflect the varying influence of uncertainty across different matched point pairs. During the point matching process, the weight of false matches should be significantly reduced. In addition, the positional uncertainty of the matched points introduces noise, and thus the weight of such points should decrease as their uncertainty increases.
To enhance both point matching accuracy and computational efficiency, an eigenvalue-based approach is employed to extract planar feature points. Accordingly, the centroid and the covariance matrix of the point distribution around point $i$, obtained from its KD-Tree neighborhood, are given in Equation (11):
$$
p_{center} = \frac{1}{N}\sum_{j=1}^{N} p_j, \qquad
C_i = \frac{1}{N-1}\sum_{j=1}^{N} ( p_j - p_{center})( p_j - p_{center})^{T},
\qquad
C_i = U_i \Sigma_i V_i^{T}, \quad
\Sigma_i = \begin{bmatrix} \lambda_0 & 0 & 0 \\ 0 & \lambda_1 & 0 \\ 0 & 0 & \lambda_2 \end{bmatrix}, \quad \lambda_0 \ge \lambda_1 \ge \lambda_2 \ge 0,
\tag{11}
$$
where $\lambda_0$, $\lambda_1$, and $\lambda_2$ are the three eigenvalues of the covariance matrix.
If the eigenvalues satisfy Equation (12), point $i$ is a planar feature point:
$$
\frac{\lambda_2}{\lambda_0 + \lambda_1} < \rho_0, \qquad
\frac{\lambda_0 - \lambda_1}{\lambda_0 + \lambda_1} < \rho_1,
\tag{12}
$$
where $\rho_0$ and $\rho_1$ are thresholds.
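A possible NumPy/SciPy sketch of the eigenvalue test in Equations (11) and (12) is shown below; the neighborhood size k and the thresholds rho0/rho1 are illustrative values, not taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def planar_features(points, k=10, rho0=0.05, rho1=0.6):
    """Eigenvalue-based planar feature selection (Eqs. (11)-(12)).
    Returns indices of planar points and their estimated plane normals
    (eigenvector of the smallest eigenvalue)."""
    tree = cKDTree(points)
    _, knn = tree.query(points, k=k)             # k nearest neighbours of every point
    idx_planar, normals = [], []
    for i, nbr in enumerate(knn):
        nbhd = points[nbr]
        center = nbhd.mean(axis=0)
        cov = (nbhd - center).T @ (nbhd - center) / (k - 1)
        evals, evecs = np.linalg.eigh(cov)       # ascending eigenvalues
        lam0, lam1, lam2 = evals[2], evals[1], evals[0]   # reorder so lam0 >= lam1 >= lam2
        denom = lam0 + lam1 + 1e-12
        if lam2 / denom < rho0 and (lam0 - lam1) / denom < rho1:
            idx_planar.append(i)
            normals.append(evecs[:, 0])          # eigenvector of the smallest eigenvalue
    return np.array(idx_planar), np.array(normals)
```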
In order to reduce incorrect matching, the similarity between two feature points should be taken into account. By assigning different weights to each point, the residual function for point cloud matching can be formulated. In this section, a weight feature point matching method based on the geometric-reflectance intensity similarity of point clouds is proposed.
The geometric similarity is defined by the relationship between the normal vectors $n$ of the planar feature point $i$ in one point cloud and of point $j$ in the target point cloud. Note that $\lambda_2$ is the smallest eigenvalue; its eigenvector $v_2$ is thus approximately equivalent to the normal vector of the corresponding plane. The subscripts $i$ and $j$ correspond to points $i$ and $j$. Therefore, the geometric similarity $S_{G,i}$ can be expressed as
$$
S_{G,i} = \frac{\left| n_i^{T} n_j \right|}{\| n_i \| \, \| n_j \|}
\approx \frac{\left| v_{2,i}^{T}\, v_{2,j} \right|}{\| v_{2,i} \| \, \| v_{2,j} \|}.
\tag{13}
$$
The reflection intensity of correctly matched points exhibits similar distributions and is therefore chosen as the second similarity. We assume that the reflection intensities around points $i$ and $j$ follow the normal distributions $\mathcal{N}_i(\mu_i, \sigma_i^2)$ and $\mathcal{N}_j(\mu_j, \sigma_j^2)$, in which $(\mu_i, \sigma_i)$ and $(\mu_j, \sigma_j)$ are the mean values and standard deviations for points $i$ and $j$, respectively. The distances from the two points to the sensor center are $r_i$ and $r_j$, with $r_j = \kappa r_i$ (where $\kappa$ is the scale factor), and the reflection intensity follows the same scaling law. Therefore, by applying Kullback–Leibler (KL) divergence theory to the reflection intensity of matched points [26], a modified KL divergence model based on Lambertian theory is obtained:
$$
\mu_j = \kappa^2 \mu_i, \qquad \sigma_j^2 = \kappa^4 \sigma_i^2, \qquad
KL_{sym} = \frac{\mu_i^2}{4\sigma_i^2}\left( \frac{1}{\kappa^4} + 1 \right)\left( 1 - \kappa^2 \right)^2 + \frac{1}{4\kappa^4} + \frac{\kappa^4}{4} - \frac{1}{2}.
\tag{14}
$$
Since the weight coefficient must not exceed 1, a Gaussian function is used to normalize the Kullback–Leibler (KL) divergence, thereby obtaining the reflection intensity similarity $S_I$ in Equation (15):
$$
S_I = \exp\!\left( -\frac{KL_{sym}^2}{2\eta^2} \right),
\tag{15}
$$
where $\eta$ is the Gaussian scale factor.
To avoid the sparsity error, the average planarity $S_p$ of points $i$ and $j$ is introduced as the third weight coefficient:
$$
S_p = \frac{1}{2}\left( \frac{\lambda_1^{i} - \lambda_2^{i}}{\lambda_0^{i}} + \frac{\lambda_1^{j} - \lambda_2^{j}}{\lambda_0^{j}} \right).
\tag{16}
$$
Eventually, the weight $w$ of the matched points $i$ and $j$ is defined as
$$
w = S_G \, S_I \, S_p.
\tag{17}
$$
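Putting Equations (13)-(17) together, a sketch of the per-pair weight might look as follows. The argument list (normals, intensity statistics, ranges, ordered eigenvalues) is an assumption about how these quantities would be passed around, not the paper's interface.

```python
import numpy as np

def match_weight(n_i, n_j, mu_i, sigma_i, r_i, r_j, eigs_i, eigs_j, eta=1.0):
    """Weight of one matched planar point pair, w = S_G * S_I * S_p (Eqs. (13)-(17)).
    eigs_i / eigs_j: ordered eigenvalues (lam0 >= lam1 >= lam2) of the local covariances."""
    # Geometric similarity: alignment of the two estimated plane normals (Eq. (13)).
    s_g = abs(n_i @ n_j) / (np.linalg.norm(n_i) * np.linalg.norm(n_j))

    # Intensity similarity via the symmetric KL divergence of the scaled
    # intensity distributions (Eqs. (14)-(15)).
    kappa = r_j / r_i
    kl_sym = (mu_i ** 2 / (4.0 * sigma_i ** 2)) * (1.0 + 1.0 / kappa ** 4) * (1.0 - kappa ** 2) ** 2 \
             + 1.0 / (4.0 * kappa ** 4) + kappa ** 4 / 4.0 - 0.5
    s_i = np.exp(-kl_sym ** 2 / (2.0 * eta ** 2))

    # Average planarity of the two neighbourhoods (Eq. (16)).
    (a0, a1, a2), (b0, b1, b2) = eigs_i, eigs_j
    s_p = 0.5 * ((a1 - a2) / a0 + (b1 - b2) / b0)

    return s_g * s_i * s_p
```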
To ensure the real-time performance of the odometry, the planar feature point cloud of the current frame is matched with that of the local submap of keyframes. The transformation matrix is obtained by minimizing the weighted Mahalanobis distance derived from the local geometric characteristics, as shown in Equation (18):
$$
(R, t) = \underset{R,\,t}{\arg\min} \sum_{i=0}^{N} w_i\, r_{L_i}, \qquad
r_{L_i} = d_i^{T} \left( \tilde{C}_i^{M} + T\, \tilde{C}_i^{P}\, T^{T} \right)^{-1} d_i, \qquad
T = \begin{bmatrix} R & t \\ 0_{1\times 3} & 1 \end{bmatrix}, \quad
\tilde{C}_i = \begin{bmatrix} C_i & 0_{3\times 1} \\ 0_{1\times 3} & 1 \end{bmatrix},
\tag{18}
$$
where $r_{L_i}$ represents the Mahalanobis distance calculated by the GICP algorithm. $d_i$ represents the transformation error term between the current frame and the local submap of the keyframes. $\tilde{C}_i$ represents the covariance matrix of the points, and the superscripts $P$ and $M$ denote the current frame and the submap, respectively. For the covariance matrix $C_i = V \Lambda V^{T}$, since all the feature points lie on a plane, the diagonal elements of $\Lambda$ are replaced with $(1, 1, \epsilon)$, where $\epsilon$ is a very small constant, to regularize the covariance matrix.

4. Online Filtering Method for Highly Dynamic Point Clouds

If the set of keyframe point clouds is denoted as $F$, after extracting the ground points, the points are divided into ground points $G$ and non-ground points $P$. Furthermore, the point cloud set obtained by transforming each keyframe point cloud into the global coordinate system and then stitching them together is the map $M$. If the highly dynamic point cloud map is represented as $M^d$, then the static point cloud map can be expressed as $M^s = M - M^d$. Removing point clouds from the map directly may result in point cloud loss and over-filtering, whereas removing only the point clouds of one or a few frames allows the point clouds from other frames to compensate for the loss in the map. Therefore, we obtain the static map by filtering the object-level highly dynamic point clouds at each keyframe, as shown in Equation (19).
$$
M^{s} = \bigcup_{k=1}^{N} \Big( G_k \cup \big( P_k \setminus I_k^{d} \big) \Big),
\tag{19}
$$
where $G_k$ is the ground point cloud of the $k$th keyframe, which is regarded as static. $I_k^{d}$ represents the object-level highly dynamic point cloud set of the $k$th keyframe. $P_k$ denotes all object-level point clouds in the $k$th keyframe.
Objects that generate highly dynamic points, such as pedestrians, vehicles, and so forth, temporarily occupy a certain area in the map for a short time, while static points remain persistently fixed in place. Therefore, by comparing the grid occupancy between the current frame’s point cloud and the map, we can identify regions occupied by dynamic objects. Consequently, non-ground points within these dynamic regions can be classified as highly dynamic points.
First, to eliminate the influence of outliers in the point cloud, a pass-through filter is applied to extract the region of interest. The filtering range is set from $h_{min}$ to $h_{max}$. The point cloud of the region of interest in the current keyframe is then represented as
$$
P_k^{int} = \big\{\, p \in P_k \mid h_{min} \le h_p \le h_{max} \,\big\}.
\tag{20}
$$
The grid obtained from the point cloud of the region of interest $P_k^{int}$ in the global coordinate system is $G_k$. Thus the local submap can be represented by Equation (21):
$$
M_k^{sub} = \big\{\, p \in M \mid \Gamma(p) \in G_k \,\big\},
\tag{21}
$$
where the grid projection function $\Gamma(p)$ is defined as $\Gamma(p) = (x, y) = \big( \lfloor p_x / res_x \rfloor,\, \lfloor p_y / res_y \rfloor \big)$, $res_x$ and $res_y$ are the grid resolutions in the X and Y dimensions, respectively, and $(x, y)$ is the index of the grid cell corresponding to the point $p$.
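A small sketch of the pass-through filter and the grid projection of Equations (20) and (21) is given below; the function names and the set-based submap lookup are illustrative choices, assuming the height is stored in the z coordinate.

```python
import numpy as np

def roi_filter(points, h_min, h_max):
    """Pass-through filter of Eq. (20): keep points with height in [h_min, h_max]."""
    mask = (points[:, 2] >= h_min) & (points[:, 2] <= h_max)
    return points[mask]

def grid_index(points, res_x, res_y):
    """Grid projection Gamma(p): map each point to its 2D cell index (Eq. (21))."""
    ix = np.floor(points[:, 0] / res_x).astype(int)
    iy = np.floor(points[:, 1] / res_y).astype(int)
    return np.stack([ix, iy], axis=1)

def local_submap(map_points, frame_points, res_x, res_y):
    """Keep the map points falling in cells occupied by the current frame (Eq. (21))."""
    frame_cells = {tuple(c) for c in grid_index(frame_points, res_x, res_y)}
    map_cells = grid_index(map_points, res_x, res_y)
    mask = np.array([tuple(c) in frame_cells for c in map_cells])
    return map_points[mask]
```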
The pseudo-occupancy rate is defined as the difference between the maximum and minimum heights of the point cloud within a grid cell. Therefore, the rates $\Delta h_{i,j}^{k}$ and $\Delta h_{i,j}^{M_k^{sub}}$ of the current keyframe $k$ and the local submap are expressed as follows:
$$
\Delta h_{i,j}^{k} = \max_{z \in G_{i,j}} z \;-\; z_{k,(i,j)}^{low}, \qquad
\Delta h_{i,j}^{M_k^{sub}} = \max_{z \in G_{i,j}} z \;-\; z_{(i,j)}^{M_k^{sub},\,low},
\tag{22}
$$
where $G_{i,j}$ represents the grid cell with index $(i, j)$; $z_{(i,j)}^{M_k^{sub},\,low}$ and $z_{k,(i,j)}^{low}$ represent the minimum point cloud heights of the cell with index $(i, j)$ in the local submap and in the current keyframe $k$, respectively.
The pseudo-occupancy ratio determines the occupancy status of dynamic point clouds in the current frame's grid, and it is defined as the ratio between the pseudo-occupancy rate of the current frame and that of the submap grid, as shown in Equation (23):
$$
\frac{\Delta h_{i,j}^{k}}{\Delta h_{i,j}^{M_k^{sub}}} > \tau_h,
\tag{23}
$$
where $\tau_h$ is the threshold of the pseudo-occupancy ratio. If the pseudo-occupancy ratio exceeds $\tau_h$, the grid cell $(i, j)$ in the current frame is considered to contain dynamic point clouds.
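The pseudo-occupancy rate and ratio of Equations (22) and (23) could be computed per cell as sketched below, reusing grid_index from the previous sketch; the numerical guard on the submap rate is an implementation assumption.

```python
from collections import defaultdict

def pseudo_occupancy(points, res_x, res_y):
    """Per-cell pseudo-occupancy rate of Eq. (22): max height minus min height."""
    heights = defaultdict(list)
    for cell, z in zip(map(tuple, grid_index(points, res_x, res_y)), points[:, 2]):
        heights[cell].append(z)
    return {cell: max(zs) - min(zs) for cell, zs in heights.items()}

def dynamic_cells(frame_points, submap_points, res_x, res_y, tau_h):
    """Cells whose pseudo-occupancy ratio (Eq. (23)) exceeds tau_h are flagged as dynamic."""
    dh_frame = pseudo_occupancy(frame_points, res_x, res_y)
    dh_submap = pseudo_occupancy(submap_points, res_x, res_y)
    flagged = set()
    for cell, dh_k in dh_frame.items():
        dh_sub = dh_submap.get(cell, 0.0)
        if dh_sub > 1e-6 and dh_k / dh_sub > tau_h:
            flagged.add(cell)
    return flagged
```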
The joint distribution of the binary states $m$ of the pseudo-occupancy grid is expressed as follows:
$$
p(m \mid z_{1:t}) = \prod_i p(m_i \mid z_{1:t}),
\tag{24}
$$
where $m_i$ represents the occupancy state of the $i$th grid cell, and $z_{1:t}$ denotes all observations up to frame $t$, each observation corresponding to the difference between the scan at frame $i$ and the map.
We update the probability information in the map grid according to Bayes' rule; that is,
$$
\mathrm{logit}\, p(m_i \mid z_{1:t}) = \mathrm{logit}\, p(m_i \mid z_{1:t-1}) + \mathrm{logit}\, p(m_i \mid z_t) - \mathrm{logit}\, p(m_i),
\tag{25}
$$
where $\mathrm{logit}\, p(x)$ encodes the occupancy and non-occupancy of the grid cell, and $\mathrm{logit}\, p(x) = \log\!\big( p(x) / (1 - p(x)) \big)$.
The update rule for $\mathrm{logit}\, p(x)$ is as follows:
$$
\mathrm{logit}\, p(m_{i,j} \mid z_k) =
\begin{cases}
\kappa\, \mathrm{logit}\, p(m_{i,j} \mid z_k), & \dfrac{\Delta h_{i,j}^{k}}{\Delta h_{i,j}^{M_k^{sub}}} > \tau_h \\[2ex]
\mathrm{logit}\, p(m_{i,j} \mid z_k), & \dfrac{\Delta h_{i,j}^{k}}{\Delta h_{i,j}^{M_k^{sub}}} < \tau_h \ \text{and}\ \big| G_{i,j}^{k} \cap P_{k,(i,j)}^{int} \big| > \tau_{gd} \\[2ex]
\mathrm{logit}\, p_0, & \text{otherwise.}
\end{cases}
\tag{26}
$$
In Equation (26), when $\Delta h_{i,j}^{k} / \Delta h_{i,j}^{M_k^{sub}} > \tau_h$, $\mathrm{logit}\, p(m_{i,j} \mid z_k)$ is amplified by a gain factor $\kappa$ ($\kappa > 1$), so the updated probability value increases and $\mathrm{logit}\, p(m_{i,j} \mid z_k) > \mathrm{logit}\, p_0$. When $\Delta h_{i,j}^{k} / \Delta h_{i,j}^{M_k^{sub}} < \tau_h$ but the number of region-of-interest points falling in the cell, $\big| G_{i,j}^{k} \cap P_{k,(i,j)}^{int} \big|$, exceeds $\tau_{gd}$, the grid cell is still likely occupied by dynamic points and is updated using $\mathrm{logit}\, p(m_{i,j} \mid z_k)$. Otherwise, $\mathrm{logit}\, p(x)$ retains its initial value $\mathrm{logit}\, p_0$, where $p_0$ is the initial prior probability.
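A hedged sketch of one log-odds update step following Equations (25) and (26) is shown below; the gain kappa and the probabilities p_meas and p0 are illustrative values, not taken from the paper, and the second branch uses the region-of-interest point-count interpretation described above.

```python
import numpy as np

def logit(p):
    """Log-odds of a probability."""
    return np.log(p / (1.0 - p))

def update_cell(logodds, cell, ratio, num_roi_points, tau_h, tau_gd,
                kappa=1.5, p_meas=0.7, p0=0.5):
    """One Bayesian log-odds update of a pseudo-occupancy cell (Eqs. (25)-(26)).
    logodds: dict mapping cell index -> current log-odds value."""
    prior = logodds.get(cell, logit(p0))
    if ratio > tau_h:
        meas = kappa * logit(p_meas)     # strongly dynamic evidence, amplified by kappa > 1
    elif num_roi_points > tau_gd:
        meas = logit(p_meas)             # cell still likely occupied by dynamic points
    else:
        meas = logit(p0)                 # no evidence: measurement equals the prior
    logodds[cell] = prior + meas - logit(p0)   # log-odds fusion of Eq. (25)
    return logodds[cell]
```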
In order to avoid deleting objects with ambiguous boundaries by mistake, a curved voxel point cloud clustering algorithm is employed [27]. This algorithm detects the point cloud corresponding to the object in the map, enabling subsequent object-level segmentation of the highly dynamic point cloud in the map. If all objects containing dynamic grids are filtered out without considering their dynamic attributes, static objects may be inadvertently removed. Therefore, an object dynamic score is defined for each object point cloud, as shown in Equation (27), serving as a dynamic evaluation criterion to accurately distinguish and separate the highly dynamic object point cloud completely.
$$
Score_i = \frac{1}{Num(Cl_i)} \sum_{k=1}^{Num(Cl_i)} \mathrm{logit}\Big( f\big( G_{p_k}^{Cl_i} \big) \Big),
\tag{27}
$$
where $Num(Cl_i)$ represents the number of points of the $i$th object. $G_{p_k}^{Cl_i}$ represents the grid cell corresponding to each point of the object $Cl_i$. The function $f(\cdot)$ returns a value based on the probability of the grid cell, defined as follows:
$$
f(G_p) =
\begin{cases}
p(m_i \mid z_{1:t}), & p(m_i \mid z_{1:t}) > p_0 \\
p, & \text{otherwise,}
\end{cases}
\tag{28}
$$
where the true probability of the grid cell is returned when it exceeds the initial value $p_0$; otherwise, a constant $p$ ($p < 1$) is returned. It should be noted that when a grid cell is always occupied by a static object, its probability remains $p_0$ and its log-odds value remains $\mathrm{logit}\, p_0$. When both static and dynamic objects occupy the cell, the presence of $p$ reduces the object dynamic score, thereby decreasing the probability of mistakenly removing the static object. The final set of dynamic object point clouds is then given by
$$
D = \big\{\, Cl_i \mid \mathrm{logit}^{-1}( Score_i ) > \tau_{score} \,\big\},
\tag{29}
$$
where $\mathrm{logit}^{-1}$ is the inverse of the log-odds function, converting the log odds back to a probability value, $\tau_{score}$ is the adaptive probability threshold for dynamic objects, and $Cl_i$ is the point cloud cluster.
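A sketch of the object dynamic score and the cluster selection of Equations (27)-(29) follows, reusing grid_index and logit from the sketches above; p_const stands in for the constant p, and p0 and tau_score are illustrative values.

```python
import numpy as np

def inv_logit(x):
    """Inverse log-odds: log odds -> probability."""
    return 1.0 / (1.0 + np.exp(-x))

def object_score(cluster_points, logodds, res_x, res_y, p0=0.5, p_const=0.3):
    """Object dynamic score of Eqs. (27)-(28): mean log-odds of f(.) over the cells
    touched by the cluster; cells at or below the prior contribute logit(p_const)."""
    vals = []
    for cell in map(tuple, grid_index(cluster_points, res_x, res_y)):
        prob = inv_logit(logodds.get(cell, logit(p0)))
        vals.append(logit(prob if prob > p0 else p_const))
    return float(np.mean(vals))

def dynamic_clusters(clusters, logodds, res_x, res_y, tau_score=0.6):
    """Keep clusters whose inverse-logit score exceeds tau_score (Eq. (29))."""
    return [pts for pts in clusters
            if inv_logit(object_score(pts, logodds, res_x, res_y)) > tau_score]
```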
When performing point cloud clustering, two objects located close to each other may be mistakenly identified as a single cluster. As a result, static objects within may be removed incorrectly. Since such point clouds generally occupy a large number of grids but contain a small proportion of highly dynamic grids, they can be distinguished based on the total number of occupied grids and the proportion of dynamic grids, as shown in Equation (30).
$$
Num_G(Cl_i) > \tau_{grid}, \qquad
0 < \frac{Num_{G}^{dynamic}(Cl_i)}{Num_G(Cl_i)} < \mu_{dynamic},
\tag{30}
$$
where $Num_G(Cl_i)$ represents the number of grid cells occupied by the point cloud cluster $Cl_i$, $\tau_{grid}$ is the minimum threshold on the number of occupied cells, $Num_{G}^{dynamic}(Cl_i)$ is the number of dynamic cells in the cluster $Cl_i$, and $\mu_{dynamic}$ is the maximum proportion threshold of dynamic cells.
After removing highly dynamic grids from the under-segmented point cloud, the static map is ultimately established.

5. Algorithm Validation

5.1. Accuracy of Proposed Odometry Systems

To validate the proposed weight matching LiDAR-IMU-GNSS odometry, our method is tested on the KITTI, UrbanLoco [24], and Newer College [25] (NCD) datasets and compared with the LIO-SAM and FAST-LIO2 methods.
We present the trajectory comparison between the proposed algorithm and the LIO-SAM algorithm on 00, 05, 09, and 10 sequences of the KITTI dataset. Figure 2 illustrates the typical results from 00 and 05 sequences. The ground truth trajectory is shown in green, the LIO-SAM trajectory in blue, and the trajectory of the proposed algorithm in red. Clearly, the trajectory produced by the proposed algorithm is significantly closer to the ground truth, indicating that it achieves higher accuracy than the LIO-SAM algorithm.
The root-mean-squared error (RMSE) of the absolute trajectory error (ATE) was used to evaluate the error across the entire trajectory. The RMSE is defined as follows:
$$
RMSE_{ATE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \big\| \mathrm{trans}\big( T_{gt,i}^{-1}\, T_{est,i} \big) \big\|^2 },
$$
where $N$ represents the number of frames in the dataset, $\mathrm{trans}(\cdot)$ denotes the extraction of the translational component of the pose, and the Euclidean norm is used to compute the vector magnitude in the RMSE calculation.
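For reference, the RMSE of the ATE can be computed as in the following short sketch, assuming the ground-truth and estimated trajectories have already been associated frame by frame.

```python
import numpy as np

def rmse_ate(poses_gt, poses_est):
    """RMSE of the absolute trajectory error for two aligned lists of 4x4 poses."""
    errs = []
    for T_gt, T_est in zip(poses_gt, poses_est):
        dT = np.linalg.inv(T_gt) @ T_est           # relative pose error per frame
        errs.append(np.linalg.norm(dT[:3, 3]))     # translational component trans(.)
    return float(np.sqrt(np.mean(np.square(errs))))
```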
Table 1 summarizes the RMSE values of the three algorithms across subsets of the KITTI, UrbanLoco, and NCD datasets; smaller errors indicate better performance. The results show that the proposed algorithm achieves higher accuracy than the other two open-source algorithms on most datasets, demonstrating superior performance.

5.2. Comparison of Highly Dynamic Point Cloud Filtering Algorithm

Figure 3 and Figure 4 show the visualization results of the proposed highly dynamic point cloud filtering algorithm and the Removert and ERASOR algorithms on sequences 00 and 05 of the SemanticKITTI dataset, together with the ground truth. In the figures, the red point clouds represent the highly dynamic object point clouds detected by each algorithm, while the blue ones represent the static point clouds.
In Figure 3a,b, compared with the ground truth, both the Removert and ERASOR algorithms effectively filter out dynamic points, but they incorrectly remove a large number of static points. This results in an excessive number of false-positive detections, ultimately producing a relatively sparse map. In contrast, the method proposed in this paper ensures high detection precision while generating fewer false positive points. In Figure 4, using the sequence 05 dataset, the Removert algorithm achieves a relatively low detection precision, and the ERASOR algorithm has a relatively high proportion of false positives, while the proposed algorithm outperforms both on this dataset.
Furthermore, using sequences 00, 01, 05, and 07 from the SemanticKITTI dataset, which contain numerous dynamic objects, the accuracy of the proposed highly dynamic point cloud filtering algorithm is compared with the Removert and ERASOR algorithms. Table 2 presents the Preservation Rate (PR), Recall Rate (RR), and F1 score of the proposed highly dynamic point cloud filtering module in comparison with the Removert and ERASOR algorithms. The three metrics are defined as follows:
$$
PR = \frac{TP}{TP + FP}, \qquad
RR = \frac{TP}{TP + FN}, \qquad
F1 = \frac{2 \times PR \times RR}{PR + RR},
$$
where TP (True Positive) represents the number of samples that are correctly detected as dynamic object categories. FP (False Positive) represents the number of samples that are actually static but incorrectly detected as dynamic categories. FN (False Negative) represents the number of samples that are actually dynamic but incorrectly detected as static categories. The values of all three metrics range from 0 to 1, with values closer to 1 indicating better performance.
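The three metrics can be computed directly from the TP/FP/FN counts, as in this short sketch.

```python
def detection_metrics(tp, fp, fn):
    """Preservation Rate (PR), Recall Rate (RR) and F1 score from TP/FP/FN counts."""
    pr = tp / (tp + fp) if (tp + fp) else 0.0
    rr = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * pr * rr / (pr + rr) if (pr + rr) else 0.0
    return pr, rr, f1
```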
Table 2 presents the performance of Removert, ERASOR, and the proposed method across three metrics on the SemanticKITTI dataset. The results demonstrate that the proposed highly dynamic point cloud filtering algorithm achieves high accuracy, with fewer missed and false detections. It can effectively filter highly dynamic objects in single-frame scanned point clouds and eliminate the “long trailing shadows” in the map caused by moving objects.

5.3. Online Filtering Experiment

A dataset recorded at an intersection on the campus of Beijing Institute of Technology (BIT) was used to evaluate the proposed online highly dynamic point cloud filtering algorithm. The intersection contains numerous highly dynamic objects owing to heavy traffic and a large pedestrian flow.
In the experiment, the Wuling Hongguang miniEV serves as the test vehicle, equipped with an environmental perception system comprising three sensors. The Robosense Helio-32 mechanical LiDAR offers a 360° horizontal field of view and a 70° vertical field of view, with a ranging accuracy within 2 cm and a scanning frequency of 10 Hz. The GW-NAV100B is a dual-antenna, high-precision MEMS integrated navigation system. It incorporates a built-in three-axis gyroscope, a three-axis accelerometer, and a four-mode (BD/GPS/GLONASS/GALILEO) receiver, enabling it to measure the velocity, position, and attitude of the test vehicle. The Xsens Mti-630 is a 9-axis IMU with a 100 Hz output frequency. During the data collection process, the total duration of the experiment was approximately 1000 s, covering a distance of about 3000 m. To construct local submaps, the data was divided equally into five segments, each approximately 600 m long. The parameters of the Robosense Helio-32 mechanical LiDAR and the GW-NAV100B are listed in Table 3.
Two typical highly dynamic scenarios are shown in Figure 5 and Figure 6. The black point clouds represent static environmental point clouds, while the red point clouds represent the highly dynamic objects detected by the method proposed in this paper.
In Figure 5 and Figure 6, it can be observed that at the intersection there are a large number of point cloud “long trailing shadows” caused by pedestrians, cyclists, and vehicles, which degrade the quality of the point cloud map. Typical highly dynamic objects, such as driving vehicles, cyclists, and pedestrians, are clearly identifiable. In Figure 5, the blue and green boxes enclose the detailed point cloud maps of vehicles and cyclists, respectively; in Figure 6, the blue and green boxes enclose the detailed point cloud maps of vehicles and pedestrians, respectively. As demonstrated in Figure 5 and Figure 6, the proposed online highly dynamic point cloud filtering algorithm can effectively detect major highly dynamic objects such as pedestrians, cyclists, and vehicles in highly dynamic scenarios.

6. Conclusions

This paper improves the mapping method with weight matching LiDAR-IMU-GNSS odometry and a proposed online highly dynamic point cloud filtering algorithm. The weight matching LiDAR-IMU-GNSS odometry includes IMU pre-integration, ground point extraction, motion compensation, and weight feature point matching. Specifically, within the odometry, a method for point cloud keyframe selection and submap construction based on an adaptive threshold is designed. Ground point clouds and planar point clouds are extracted separately using the PMF and KD-Tree methods. A weight feature point matching method based on geometric-reflectance intensity similarity is proposed to improve point cloud matching. Subsequently, by using the pseudo-occupancy ratio between the current frame point cloud and the previous local submap as an observation, grid probability values are updated to obtain the distribution of dynamic features in the map. In addition, the curved voxel point cloud clustering algorithm is used to obtain object-level point cloud clusters. This approach enables comprehensive filtering of highly dynamic object-level point clouds.
The higher accuracy of the weight matching LiDAR-IMU-GNSS odometry is validated on the KITTI, UrbanLoco, and NCD datasets in comparison with the LIO-SAM and FAST-LIO2 methods. The proposed highly dynamic point cloud filtering algorithm outperforms the Removert and ERASOR algorithms on the sequences from the SemanticKITTI dataset. Finally, online mapping with highly dynamic point cloud filtering is applied to a real-time dataset, effectively detecting typical highly dynamic objects such as driving vehicles, cyclists, and pedestrians. We hope this research will contribute to advancements in high-accuracy online mapping with dynamic object filtering and promote the development of navigation and locomotion in autonomous driving and robotics.

Author Contributions

X.Z.: conceptualization, methodology, writing—original draft and funding acquisition. X.C.: methodology, resources and writing—review and editing; M.D.: investigation and formal analysis. D.J.: writing—review and editing; C.W.: conceptualization, supervision and project administration. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the State Administration of Science, Technology and Industry for National Defense Project. The funding number is No. 8KD006(2024)-2.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Faisal, A.; Kamruzzaman, M.; Yigitcanlar, T.; Currie, G. Understanding autonomous vehicles. J. Transp. Land Use 2019, 12, 45–72. [Google Scholar] [CrossRef]
  2. Cao, X.; Wei, C.; Hu, J.; Ding, M.; Zhang, M.; Kang, Z. RDP-LOAM: Remove-Dynamic-Points LiDAR Odometry and Mapping. In Proceedings of the 2023 IEEE International Conference on Unmanned Systems (ICUS), Hefei, China, 13–15 October 2023; pp. 211–216. [Google Scholar]
  3. Xu, H.; Chen, J.; Meng, S.; Wang, Y.; Chau, L.-P. A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective. arXiv 2024, arXiv:2405.05173. [Google Scholar] [CrossRef]
  4. Hu, J.; Mao, M.; Bao, H.; Zhang, G.; Cui, Z. CP-SLAM: Collaborative Neural Point-based SLAM System. Adv. Neural Inf. Process. Syst. 2023, 36, 39429–39442. [Google Scholar]
  5. Pan, Y.; Zhong, X.; Wiesmann, L.; Posewsky, T.; Behley, J.; Stachniss, C. PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency. IEEE Trans. Robot. 2024, 40, 4045–4064. [Google Scholar] [CrossRef]
  6. Kerbl, B.; Kopanas, G.; Leimkuehler, T.; Drettakis, G. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM-Trans. Graph. 2023, 42, 139:1–139:14. [Google Scholar] [CrossRef]
  7. Zhu, S.; Mou, L.; Li, D.; Ye, B.; Huang, R.; Zhao, H. VR-Robo: A Real-to-Sim-to-Real Framework for Visual Robot Navigation and Locomotion. IEEE Robot. Autom. Lett. 2025, 10, 7875–7882. [Google Scholar] [CrossRef]
  8. Wang, Z.; Chen, H.; Fu, M. Whole-body motion planning and tracking of a mobile robot with a gimbal RGB-D camera for outdoor 3D exploration. J. Field Robot. 2025, 41, 604. [Google Scholar] [CrossRef]
  9. Longo, A.; Chung, C.; Palieri, M.; Kim, S.-K.; Agha, A.; Guaragnella, C.; Khattak, S. Pixels-to-Graph: Real-time Integration of Building Information Models and Scene Graphs for Semantic-Geometric Human-Robot Understanding. arXiv 2025, arXiv:2506.22593. [Google Scholar]
  10. Tourani, A.; Bavle, H.; Sanchez-Lopez, J.L.; Voos, H. Visual SLAM: What are the Current Trends and What to Expect? Sensors 2022, 22, 9297. [Google Scholar] [CrossRef] [PubMed]
  11. Ye, K.; Dong, S.; Fan, Q.; Wang, H.; Yi, L.; Xia, F.; Wang, J.; Chen, B. Multi-Robot Active Mapping via Neural Bipartite Graph Matching. arXiv 2022, arXiv:2203.16319. [Google Scholar]
  12. Hester, G.; Smith, C.; Day, P.; Waldock, A. The next generation of unmanned ground vehicles. Meas. Control 2012, 45, 117–121. [Google Scholar] [CrossRef]
  13. Chen, L.; Wang, S.; McDonald-Maier, K.; Hu, H. Towards autonomous localization and mapping of AUVs: A survey. Int. J. Intell. Unmanned Syst. 2013, 1, 97–120. [Google Scholar] [CrossRef]
  14. Hu, X.; Yan, L.; Xie, H.; Dai, J.; Zhao, Y.; Su, S. A novel lidar inertial odometry with moving object detection for dynamic scenes. In Proceedings of the 2022 IEEE International Conference on Unmanned Systems (ICUS), Guangzhou, China, 28–30 October 2022; pp. 356–361. [Google Scholar]
  15. Lu, Z.; Hu, Z.; Uchimura, K. SLAM estimation in dynamic outdoor environments: A review. In Proceedings of the Intelligent Robotics and Applications: Second International Conference, ICIRA 2009, Proceedings 2. Singapore, 16–18 December 2009; Springer: Berlin/Heidelberg, Germany, 2009; pp. 255–267. [Google Scholar]
  16. Liu, W.; Sun, W.; Liu, Y. Dloam: Real-time and robust lidar slam system based on cnn in dynamic urban environments. IEEE Open J. Intell. Transp. Syst. 2021, 1. [Google Scholar] [CrossRef]
  17. Qian, C.; Xiang, Z.; Wu, Z.; Sun, H. Rf-lio: Removal-first tightly-coupled lidar inertial odometry in high dynamic environments. arXiv 2022, arXiv:2206.09463. [Google Scholar]
  18. Shan, T.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. Lio-sam: Tightly-coupled lidar inertial odometry via smoothing and mapping. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 5135–5142. [Google Scholar]
  19. Pfreundschuh, P.; Hendrikx, H.F.C.; Reijgwart, V.; Dube, R.; Siegwart, R.; Cramariuc, A. Dynamic object aware lidar slam based on automatic generation of training data. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 11641–11647. [Google Scholar]
  20. Schauer, J.; Nüchter, A. The peopleremover—Removing dynamic objects from 3-d point cloud data by traversing a voxel occupancy grid. IEEE Robot. Autom. Lett. 2018, 3, 1679–1686. [Google Scholar] [CrossRef]
  21. Lim, H.; Hwang, S.; Myung, H. ERASOR: Egocentric ratio of pseudo occupancy-based dynamic object removal for static 3D point cloud map building. IEEE Robot. Autom. Lett. 2021, 6, 2272–2279. [Google Scholar] [CrossRef]
  22. Kim, G.; Kim, A. Remove, then revert: Static point cloud map construction using multiresolution range images. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 10758–10765. [Google Scholar]
  23. Behley, J.; Garbade, M.; Milioto, A.; Quenzel, J.; Behnke, S.; Stachniss, C.; Gall, J. Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9297–9307. [Google Scholar]
  24. Wen, W.; Zhou, Y.; Zhang, G.; Fahandezh-Saadi, S.; Bai, X.; Zhan, W.; Tomizuka, M.; Hsu, L.-T. UrbanLoco: A full sensor suite dataset for mapping and localization in urban scenes. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 2310–2316. [Google Scholar]
  25. Ramezani, M.; Wang, Y.; Camurri, M.; Wisth, D.; Mattamala, M.; Fallon, M. The newer college dataset: Handheld lidar, inertial and vision with ground truth. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 4353–4360. [Google Scholar]
  26. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  27. Park, S.; Wang, S.; Lim, H.; Kang, U. Curved-voxel clustering for accurate segmentation of 3D LiDAR point clouds with real-time performance. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 6459–6464. [Google Scholar]
Figure 1. Algorithm framework.
Figure 2. The 00 and 05 sequences of the KITTI dataset. The ground truth trajectory is shown in green, the LIO-SAM trajectory in blue, and the trajectory of the proposed algorithm in red.
Figure 3. Three algorithms on sequence 00 of the SemanticKITTI dataset. The red points represent the highly dynamic object point clouds, and the blue ones represent the static point clouds.
Figure 4. Three algorithms on sequence 05 of the SemanticKITTI dataset. The red points represent the highly dynamic object point clouds, and the blue ones represent the static point clouds.
Figure 5. Scenario 1 and highly dynamic objects. The blue and green boxes enclose the detailed point cloud maps of vehicles and cyclists.
Figure 6. Scenario 2 and highly dynamic objects. The blue and green boxes enclose the detailed point cloud maps of vehicles and pedestrians.
Table 1. The RMSE of three algorithm frameworks on the datasets (RMSE/m).

Dataset           LIO-SAM   FAST-LIO2   Our Method
00                5.8       3.7         1.1
01                11.3      10.8        10.9
02                11.8      13.2        12.9
03                -         -           -
04                1.2       1.0         0.9
05                3.0       2.8         2.5
06                1.0       1.3         1.1
07                1.2       1.1         1.1
08                4.4       3.9         3.9
09                4.3       4.8         2.1
10                2.4       1.7         1.5
UrbanLoco-CA-1    5.295     10.943      4.615
UrbanLoco-CA-2    11.635    7.901       7.189
UrbanLoco-HK-1    1.342     1.196       1.159
UrbanLoco-HK-2    1.782     1.802       1.768
NCD-long-13       0.187     0.194       0.163
NCD-long-14       0.195     0.212       0.185
NCD-long-15       0.162     0.173       0.169
Table 2. Results on the SemanticKITTI dataset (PR and RR in %, F1 in the range 0 to 1).

Dataset   Method           PR     RR     F1
00        Removert         86.8   90.6   0.88
          ERASOR           93.9   97.0   0.95
          Our Algorithm    98.7   98.5   0.98
01        Removert         95.8   57.0   0.71
          ERASOR           91.8   94.3   0.93
          Our Algorithm    96.8   94.6   0.95
05        Removert         86.9   87.8   0.87
          ERASOR           88.7   98.2   0.93
          Our Algorithm    97.5   96.3   0.96
07        Removert         80.6   98.8   0.88
          ERASOR           90.6   99.2   0.948
          Our Algorithm    96.6   98.9   0.977
Table 3. Parameters of the Robosense Helio-32 mechanical LiDAR and the GW-NAV100B.

Robosense Helio-32
  vertical field range       +15° ~ −55°
  vertical resolution        2° (+15° ~ +7° and −8° ~ −55°), 1.5° (+7° ~ +4°), 1.33° (+4° ~ −8°)
  horizontal field range     360°
  horizontal resolution      0.2°

GW-NAV100B
  frequency                  100 Hz (10 Hz for GNSS)
  position resolution        0.01 m in the horizontal field, 0.15 m in the vertical field
  velocity resolution        0.03 m/s
  attitude resolution        0.2° in roll, 0.2° in pitch, 0.1° in yaw
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
