Radar and OpenStreetMap-Aided Consistent Trajectory Estimation in Canopy-Occluded Environments

Youchen Tang; Bijun Li; Haoran Zhong; Maosheng Yan; Shuiyun Jiang; Jian Zhou

doi:10.3390/rs18010070

,

and

¹

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430072, China

²

China Ship Development and Design Center, Wuhan 430079, China

^*

Author to whom correspondence should be addressed.

Remote Sens.2026, 18(1), 70;https://doi.org/10.3390/rs18010070
(registering DOI)

This article belongs to the Special Issue State of the Art in Positioning Under Forest Canopies

Version Notes

Order Reprints

Highlights

What are the main findings?

A spatial–temporal enhancement strategy for radar SLAM reduced Absolute Trajectory Error by up to 55.23% compared to baseline frameworks in Canopy-Occluded routes.
Incorporating OpenStreetMap priors with adaptive constraint weighting achieved a minimum trajectory error of 1.83 m.

What is the implication of the main finding?

Demonstrates that integrating millimeter-wave radar sensing with spatial–temporal optimization and open-source maps substantially improves positioning robustness in real-world signal-degraded environments.

Abstract

Accurate localization in canopy-occluded, GNSS-challenged environments is critical for autonomous robots and intelligent vehicles. This paper presents a coarse-to-fine trajectory estimation framework using millimeter-wave radar as the primary sensor, leveraging its foliage penetration and robustness to low visibility. The framework integrates short- and long-term temporal feature enhancement to improve descriptor distinctiveness and suppress false loop closures, together with adaptive OpenStreetMap-derived priors that provide complementary global corrections in scenarios with sparse revisits. All constraints are jointly optimized within an outlier-robust backend to ensure global trajectory consistency under severe GNSS signal degradation. Evaluations conducted on the MulRan dataset, the OORD forest canopy dataset, and real-world campus experiments with partial and dense canopy coverage demonstrate up to 55.23% reduction in Absolute Trajectory Error (ATE) and a minimum error of 1.83 m compared with baseline radar- and LiDAR-based SLAM systems. The results indicate that the integration of temporally enhanced radar features with adaptive map constraints substantially improves large-scale localization robustness.

Keywords:

millimeter-wave radar; temporal radar feature enhancement; OpenStreetMap-assisted localization; urban vegetation obstruction; trajectory estimation

1. Introduction

Reliable trajectory estimation is a cornerstone for autonomous driving and mobile robotic systems, particularly in environments where GNSS signals are unavailable or severely degraded. In vegetation-covered regions, dense urban areas and adverse weather conditions, GNSS-based localization can suffer from significant errors, directly impacting navigation safety and operational stability [1,2]. In contrast to consumer-grade navigation services such as Google Maps or Apple Maps—which combine inertial sensors with continuous GPS updates for map matching—our target scenarios feature persistent GNSS outages where such methods cannot maintain reliable positioning. Conventional complementary sensors, such as cameras and LiDAR, also experience performance degradation in such scenarios: cameras are sensitive to illumination changes and motion blur [3,4], while LiDAR performance may deteriorate in rain, snow, or fog and suffer from occlusions by tree trunks or other traffic participants [5,6].

Millimeter-wave radar, leveraging FMCW technology, achieves long-range detection and strong penetration under low visibility, remaining robust to both dynamic occlusions and adverse weather. However, radar sensing also faces inherent limitations: relatively low angular resolution, high measurement noise, and reduced geometric detail in representations. Radar Cross Section (RCS) intensity images appear as elliptical spots with varying brightness, which limits descriptor distinctiveness and can lead to perception aliasing—a phenomenon that exacerbates odometry drift over time.

Existing radar SLAM approaches have explored temporal feature stacking, density-based descriptors, scan-sequence matching, and map-constrained optimization, but their integration has often followed incremental extensions without addressing the compounded challenges of GNSS-degraded scenarios with sparse and unreliable loop closures. Our work departs from these patterns by focusing explicitly on improving global consistency when GNSS is absent, and when the environment’s topology limits revisits.

Loop closure detection is essential for imposing global constraints in SLAM systems, thereby mitigating cumulative drift and enhancing trajectory consistency [7], yet its efficacy depends on the accuracy and stability of environmental feature representation, and the structural properties of the traversed routes. In radar-based SLAM, the sparsity of loop closures or incorrect matches caused by perception aliasing can severely hinder long-term consistency.

To address these challenges, we propose a coarse-to-fine millimeter-wave radar state estimation framework that integrates both short- and long-term temporal enhancements with adaptive map-based global constraints. Spatial enhancement through coarse initialization using OpenStreetMap (OSM), where radar odometry trajectories are aligned to the prior map and constraint strength is adaptively modulated based on matching confidence. Furthermore, short-term feature aggregation constructs a Temporal Density Context (TDC) descriptor that improves stability and distinctiveness while long-term temporal validation rejects false loop matches, alleviating issues from perception aliasing. Crucially, when loop closures are sparse, our adaptive map constraints act as a primary global correction mechanism, ensuring consistency without continuous GNSS or frequent revisits. These components are jointly optimized within a unified back-end framework, providing globally consistent trajectories and high-fidelity radar maps, even in dynamically changing and sensing-degraded scenarios.

The contributions of this work are summarized as follows:

Propose short- and long-term temporal enhancement strategies for radar SLAM, including TDC-based descriptor stabilization in the short term, and long-term loop closure pruning to address perception aliasing.
Introduce spatial enhancement via adaptive map constraints, integrating OSM priors with confidence-based weight adjustment for improved global consistency, especially under sparse loop-closure conditions in GNSS-denied environments.
Develop a coarse-to-fine radar trajectory estimation framework that fuses public maps with temporally enhanced radar odometry, validated through extensive experiments in urban and campus environments under GNSS outages and varying levels of canopy occlusion.

The remainder of this paper is organized as follows: Section 2 reviews related work on radar SLAM, place recognition, loop closure detection, and map-assisted optimization. Section 3 details the proposed framework, with Section 3.2.1 and Section 3.2.2 introducing short- and long-term temporal enhancement modules. Section 3.3 explains adaptive map integration, and Section 4 presents experiment setups and results, followed by discussion in Section 5. Section 6 concludes the paper and outlines future directions.

3. Methods

In this chapter, we present our millimeter-wave radar consistent state estimation framework designed for operation in dense foliage and canopy-occluded environments, as illustrated in Figure 1. The system comprises four main modules: radar odometry, trajectory-to-map matching, temporal-enhanced loop closure detection, and pose graph optimization with adaptive map factors. It takes spinning millimeter-wave radar polar images—often affected by high noise levels—and OpenStreetMap data as input, and outputs smooth, globally consistent trajectories and radar point cloud maps. The radar odometry module extracts stable features and estimates inter-frame poses from noisy radar polar data under severe occlusion. The trajectory-to-map matching module associates the current radar trajectory with the road network to provide coarse but stable global pose constraints, even where GNSS reception is degraded. The temporal-enhanced loop closure detection module supplies accurate constraints over extended time sequences, improving the robustness of long-term localization. Finally, the constraints generated by these three modules are integrated using pose graph optimization, ensuring both the smoothness and global consistency of the final trajectory and map outputs.

Figure 1. An overview of our system, which takes millimeter-wave radar images and OpenStreetMap as inputs, achieves continuous and globally consistent state estimation through radar odometry, trajectory-to-map association, loop closure detection, and back-end optimization. The colored arrows represent different constraints fed into the pose graph: cyan for odometry poses, green for loop constraints, and yellow for map matching constraints.

3.1. Radar Odometry

Given the high noise levels in radar inputs—particularly under dense foliage and canopy occlusion, where speckle noise appears as dense elliptical spots with adjacent interference—we employ a two-step denoising prior to feature extraction. The Prewitt operator is applied in both radial and azimuth directions to suppress speckle and adjacent noise. From these denoised polar images, salient features with intensity peaks and spatial continuity are extracted using signal-processing techniques from our previous work [62]. Considering the relatively low output frequency and the Doppler effect of spinning radar sensors, motion and Doppler correction is performed for each pixel using the velocity estimated from the previous pose, as shown in (1).

\{\begin{matrix} \hat{ρ} = ρ + \frac{v_{r} * f_{t}}{\frac{d_{f}}{d_{t}}} \\ \hat{θ} = θ + ω * d_{t} \end{matrix}

(1)

where

v_{r}

and

ω

represent the radial linear velocity (m/s) and angular velocity (rad/s), respectively;

ρ

and

θ

are the original range (m) and azimuth angle (rad) measured by the FMCW radar;

f_{t}

and

\frac{d_{f}}{d_{t}}

correspond to the carrier frequency (Hz) and slope (Hz/s) of the modulated wave,

d_{t}

is the frame time interval (s). The corrected features are then transformed to Cartesian coordinates:

\{\begin{matrix} x = \hat{ρ} * c o s (\hat{θ}) \\ y = \hat{ρ} * s i n (\hat{θ}) \end{matrix}

(2)

We then use RANSAC and ICP to match noisy feature points between adjacent frames, thereby obtaining inter-frame pose estimates. The feature points extracted from the radar sensors, along with the estimated poses, are subsequently utilized for loop closure detection, map matching, and back-end processing.

3.2. Temporal Enhanced Loop Closure Detection

Radar loop closure detection leverages compact and efficient feature descriptors from radar sensors to represent the surrounding environment. However, for FMCW spinning radar, existing algorithms often overlook the variability in radar returns and their inherent temporal correlations, which can lead to decreased stability and accuracy—especially in dense foliage and canopy-occluded environments where features are frequently obscured or distorted. To address this limitation, we propose a temporal-enhanced loop closure detection algorithm. By jointly incorporating short-term dynamics and long-term temporal dependencies, our approach strengthens the robustness of radar feature representations, significantly reduces both false positives and missed detections, and supplies reliable global constraints for subsequent pose graph optimization.

3.2.1. Short-Term Enhancement for Temporal Density Context Generation

Millimeter-wave radar sensors, due to its strong penetration capabilities and relatively low scan frame rate, often exhibits significant feature instability between adjacent frames. For example, a section of grass that is occluded by pedestrians in one frame, may become nearly fully visible in the next frame, as shown in Figure 2. This pronounced variability can lead to missed or false detection of loop closures in frames with such fluctuating features. To mitigate this problem, we propose leveraging the pose estimation results provided by radar odometry to overlay feature point clouds and construct a Temporal Density Context. This approach enhances the descriptors’ ability to characterize and stabilize the features.

Figure 2. The difference between two adjacent raw radar data. The red boxes highlight areas with significant feature differences between frames.

We first use the pose estimation results from consecutive frames of the radar to precisely align the point cloud data of adjacent frames. This alignment helps to mitigate the effects of feature instability and constructs a local feature point cloud representing the current environmental segment, as illustrated in (3)

P_{s u b} = ⋃_{i = t - n}^{t} (T_{i \to t} \cdot P_{i})

(3)

where

P_{s u b}

and

P_{i}

represent the aggregated point cloud for current frame t and the point cloud for single frame i. The transformation from frame i to frame t and the sliding window size are expressed as

T_{i \to t}

and n, respectively.

Due to the penetration capabilities of radar, multiple points may exist on the same scan line. Using radar scan context generated by occupancy grids is insufficient for capturing the complex structures and details in the environment. To address this, we have designed a short-term enhanced Temporal Density Context (TDC). This approach replaces the traditional occupancy grids with density information, providing a more continuous and detailed description of radar data. Figure 3 illustrates the process of generating Temporal Density Context (TDC). For the input point clouds from adjacent frames, we first calculate which density grid each point falls into, as shown in (4).

\{\begin{matrix} i = ⌊\frac{\sqrt{(x_{i} - x_{c})^{2} + (y_{i} - y_{c})^{2}}}{d_{r i n g}}⌋ \\ j = ⌊\frac{a t a n 2 (y_{i} - y_{c}, x_{i} - x_{c}) + 2 π}{α}⌋ \end{matrix}

(4)

where

p_{c} (x_{c}, y_{c}, z_{c})

represents the coordinates of the central point of the local feature point cloud, which is set to

(0, 0, 0)

. i and j indicate the index of ring and sector of the bin. Therefore, the maximum radar detection range can be expressed as

d_{m a x} = N_{r i n g} * d_{r i n g}

(5)

where

d_{r i n g}

and

N_{r i n g}

represent the distance across a single ring and the number of rings, respectively. The azimuth angle covered by each bin,

α_{b i n}

, can be expressed as

α_{b i n} = \frac{2 π}{N_{s}}

(6)

where

N_{s}

denotes the number of sectors. For each bin, we encode the density of points within the sector and normalize this encoding using the radar’s distance resolution as a weight. This effectively abstracts the environment, as shown in (7)–(9).

N_{m a x} = \frac{d_{r i n g}}{d_{r e s}}

(7)

D_{i j} = ω \frac{N_{i j}}{N_{m a x}}

(8)

I = (D_{i j}) \in R^{N_{r} \times N_{s}}

(9)

where

d_{r i n g}

and

d_{r e s}

represent the width of the radar ring and the distance resolution, respectively, while

N_{m a x}

indicates the maximum number of feature points that a single ring can accommodate.

ω

represents the environmental weight, which can be increased or decreased depending on whether the environment is sparse or dense.

N_{i j}

denotes the count of points in each bin,

D_{i j}

is the final density weight of each bin, and I is the resulting Temporal Density Context (TDC). Finally, we enhance the efficiency of the corresponding search by computing the row average of the density matrix to obtain the ring key.

Figure 3. The process of generating a temporal density context and the corresponding ringkey based on short-term enhancement. Point clouds of different colors correspond to distinct radar frames, while the yellow and green regions represent the partitioning based on azimuth and range, respectively.

3.2.2. Long-Term Enhancement for Loop Outlier Removal

Given that the measurements from spinning radar sensors demonstrate temporal continuity, it follows that loop closures should similarly embody this attribute. In other words, isolated loop closures are often likely to be incorrect. For example, if the current frame j is related to a historical frame i, then adjacent frames to frame j should be related to adjacent frames to frame i. This can be observed as obvious similarity along the diagonal or anti-diagonal directions in the similarity matrix, as illustrated in Figure 4. Based on this observation, we propose using long-term feature enhancement to further filter the candidate loop closures.

Figure 4. The principle of loop outlier removal based on long-term enhancement. The blue and orange gradient regions correspond to the historical (near frame i) and current (near frame j) sequences, respectively. The time axis progresses from left to right, with darker shades signifying closer temporal proximity and higher similarity.

After generating the Temporal Density Context (TDC) with short-term loop enhancement, we first use the global pose provided by the map-matching algorithm to select historical frames that are spatially close to the current pose as candidates for loop detection. This approach helps avoid the problem of increasing computational overhead associated with a growing number of historical frames. Subsequently, we filter the candidate frames by comparing the cosine similarity between the Temporal Density Context (TDC) of the current frame and that of the candidate frames, as in

d (I^{q}, I^{c}) = \frac{1}{N_{s}} \sum_{j = 1}^{N_{s}} (1 - \frac{c_{j}^{q} \cdot c_{j}^{c}}{|| c_{j}^{q} | | \cdot | | c_{j}^{c} ||})

(10)

where

I^{c}

and

I^{q}

represent the Temporal Density Context (TDC) descriptors for the candidate frame and the current frame, respectively, while

c^{c}

and

c^{q}

denote the column vectors at the same index. If the computed similarity is above a predetermined threshold, the candidate frame is retained. Specifically, the similarity threshold is estimated from validation sequences extracted from a dataset that contain at least one true loop closure. For each minimal sequence segment, we evaluate the Precision–Recall trade-off by sweeping threshold values, selecting the threshold that maximizes recall while keeping false positives low. This procedure ensures the selected value is grounded in representative data and tuned for diverse urban and canopy-covered scenarios, rather than being arbitrarily chosen. After processing all candidate frames, the k most similar candidates are selected from those retained for further loop outlier removal based on long-term enhancement.

For each of the k selected historical frames, a sliding window of length n is employed. We first compute the cosine similarity between the n frames preceding the selected frame and the n frames preceding the current frame, resulting in a forward similarity sequence. Similarly, we calculate the cosine similarity between the n frames following the selected frame and the n frames preceding the current frame, resulting in a backward similarity sequence. The difference between these two similarity sequences is then computed to form a difference sequence. Finally, a Gaussian kernel is applied to this difference sequence for weighted summation, yielding the final Symmetric Similarity Divergence (SSD). The Gaussian kernel

\hat{G (k)}

is normalized using the Gaussian function

G (k)

.

G (k) = \frac{1}{\sqrt{2 π}} e^{- \frac{1}{2} {(k - n)}^{2}}, k \in 1, 2, \dots, n

(11)

\hat{G} (k) = \frac{G (k)}{\sum_{0}^{n} G (k)}, k \in 1, 2, \dots, n

(12)

where k denotes the offset from the current frame, and n represents the size of the sliding window.

\begin{matrix} S S D = \hat{G} (0) \cdot {sim}^{i, j} + \sum_{k = 1}^{n} \hat{G} (k) \cdot |{sim}^{i - k, j - k} - {sim}^{i + k, j - k}| \end{matrix}

(13)

where

s i m^{(i - k, j - k)}

and

s i m^{(i + k, j - k)}

represent the cosine similarity of the forward and reverse similarity sequences, respectively. The Symmetric Similarity Divergence (SSD) quantifies the deviation in similarity, where a higher SSD indicates a greater likelihood of a true loop closure, either forward or backward, while a lower SSD suggests a potential false detection. Finally, based on the calculated SSD, the frame with the maximum divergence is selected as the final candidate, and those with lower similarity scores are filtered out using a threshold to further reduce the loop closure false detection rate.

3.3. Pose Graph Optimization with Adaptive Map Factors

Loop constraints are limited by the number of revisited areas along the trajectory, making it challenging to continuously correct accumulated errors caused by fluctuated radar features. Road network from OpenStreetMap, being easily accessible and structurally simple, can readily provide coarse but global pose references for the system. Consequently, we propose an adaptive map constraint approach (see Figure 5). By regarding the state estimation problem as a maximum a posteriori (MAP) problem, we define three types of factors to optimize the state X: odometry factors, loop factors, and adaptive map factors, as illustrated in (14).

\begin{matrix} f (x) = & arg min_{X} {\sum ({∥r_{F} (F_{i - 1}^{i}, X)∥}^{2} \\ + & {∥r_{L} (L_{i}^{j}, X)∥}^{2} + {∥r_{M} (M_{i}, X)∥}^{2})} \end{matrix}

(14)

where

X = \{X_{0}, X_{1}, X_{2}, \dots, X_{i}, \dots\}

represents the robot’s states. The odometry factor that provides the local pose relationship between frame i and frame

i - 1

is denoted as

r_{F} (F_{i - 1}^{i}, X)

. The loop factor that describes the long-term pose relationship between frame i and frame j is denoted as

r_{L} (L_{i}^{j}, X)

. The adaptive map factor that provides the global constraint for frame i is denoted as

r_{M} (M_{i}, X)

. All residuals are weighted by their respective information matrices.

Figure 5. The structure of the pose graph optimization with adaptive map factors. The blue and green circles represent robot states and radar measurements, respectively. Colored dots indicate specific constraints added to the graph: yellow for map matching, green for odometry, and cyan for loop closures.

3.3.1. Adaptive Map Factors

OpenStreetMap data is often readily available in urban road scenarios. These maps provide comprehensive road structure information and topological relationships, offering stable and reliable geometric constraints that serve as effective coarse pose priors.

Initially, we manually determine the starting point and heading on the OpenStreetMap, and construct a coarse mapping between local and global coordinates based on the WGS84 ellipsoid. Next, we convert the optimized trajectory within the time window into global coordinates using pose transformations. A Hidden Markov Model-based map matching algorithm [61] is then employed to establish the association between the local trajectory and the map. Finally, the trajectory is transformed back into local coordinates to build the pose constraints.

R_{o f f s e t}^{b} P_{i}^{b} + t_{o f f s e t}^{b} = T_{w}^{b} P_{i}^{w}

(15)

P_{i}^{w} = {[l a t i t u d e, l o n g i t u d e, a l t i t u d e]}^{T}

(16)

where

P_{i}^{b} \in R^{3 \times 1}

and

P_{i}^{w} \in R^{3 \times 1}

represent the pose of the frame i provided by the odometry in the local coordinate system and its corresponding position in the global coordinate system, i.e., LLA expressed as (16), respectively.

t_{o f f s e t}^{b}

and

R_{o f f s e t}^{b}

are the translation offset and the rotation offset between the local coordinate and ENU coordinate, i.e., east–north–up.

T_{w}^{b}

denotes the transformation from the world coordinate system to the local coordinate system. Since

T_{w}^{b}

is only related to the ellipsoid surface and the projection reference line, we can determine the offset between the reference point and the odometry origin in global coordinate by manually establishing

R_{o f f s e t}^{b}

and

P_{0}^{w}

. The offset is then given by (17).

t_{o f f s e t}^{b} = T_{w}^{b} P_{0}^{w}

(17)

Thus, the world coordinate mapping of each subsequent odometry pose can be obtained as follows:

P_{i}^{w} = {(T_{w}^{b})}^{- 1} (R_{o f f s e t}^{b} P_{i}^{b} + t_{o f f s e t}^{b})

(18)

The current and historical odometry poses in world coordinates are fed into the map matching module to obtain the matched poses.

{\hat{P}}_{i}^{w} = ϕ (⋃_{k = i - β}^{i} P_{k}^{w}, ψ)

(19)

where

ϕ (\cdot)

represents the map matching function,

β

denotes the number of historical frames involved in the matching, and

ψ

represents the input from the OpenStreetMap. The results are then transformed into the local coordinate system for residual computation and factor construction.

{\hat{P}}_{i}^{b} = {R_{o f f s e t}^{b}}^{- 1} (T_{w}^{b} {\hat{P}}_{i}^{w} - t_{o f f s e t}^{b})

(20)

The error function

e_{M}

is represented in (21).

e_{M} = {\hat{P}}_{i}^{b} ⊖ P_{i}^{b}

(21)

where

P_{i}^{b}

and

{\hat{P}}_{i}^{b}

represent the pose of frame i in the local coordinate system and its corresponding map-matched pose, respectively, and ⊖ denotes the pose difference operator in SE(2). The residual function is then expressed in (22).

r_{M} (M_{i}, X) = {(e_{M})}^{T} Ω_{M} (e_{M})

(22)

As a convention, we use the information matrix, which is the inverse of the covariance matrix, to represent the level of observation errors [21,63,64]. Given the limited accuracy of OpenStreetMap and errors from map-matching, communication delays, and initial point selection, we establish a constant error model based on prior knowledge. Translation errors are estimated using the distance between the current pose and its matched point, as well as the distance to the line l formed by two nearest historical matched points. Rotational errors are based on the orientation difference between the current and matched poses. This enables adaptive error estimation. In practice, the map-factor weight (information matrix

Ω_{M}

) is computed from the estimated planar error

δ_{d i s t}

and heading error

δ_{a n g l e}

of matched map points. These indicators are derived using Equations (23) and (24), and their inverse-squared values scale the corresponding entries in

Ω_{M}

. This formulation adaptively attenuates the influence of uncertain matches while retaining strong constraints for high-confidence associations. The thresholds used in error evaluation were tuned on validation routes to maintain consistent performance across different mapping qualities, providing sensitivity resilience against variations in OSM accuracy.

δ_{d i s t} = |{\hat{P}}_{i}^{b} ⊖ P_{i}^{b} - d (P_{i}^{b}, l ({\hat{P}}_{i}^{b}, {\hat{P}}_{i - 1}^{b}))|

(23)

where

l ({\hat{P}}_{i}^{b}, {\hat{P}}_{i - 1}^{b})

denote the line segment formed by the poses of the frame i and frame

i - 1

in the local coordinate system. The function

d (\cdot)

calculates the distance from a point to a line, and

δ_{d i s t}

represents the indicator for translation error.

The rotation error indicator is given by (24).

δ_{a n g l e} = Θ (l ({\hat{P}}_{i}^{b}, {\hat{P}}_{i - 1}^{b}), l (P_{i}^{b}, P_{i - 1}^{b}))

(24)

where

Θ (\cdot)

denotes the function for computing the angle between two lines, and

δ_{a n g l e}

represents the indicator for rotation error. Considering that the OpenStreetMap data is in 2D, only elements related to the local coordinates

(x, y)

and yaw angle are computed in

Ω_{M}

, with all other elements filled with zeros. Therefore, the adaptive map factor information matrix

Ω_{M}

is given by (25).

Ω_{M} = [\begin{matrix} \begin{matrix} \frac{1}{{(ε + δ_{d i s t})}^{2}} & \frac{1}{{(ε + δ_{d i s t})}^{2}} & 0 & 0 & 0 & \frac{1}{{(δ_{a n g l e})}^{2}} \end{matrix} \end{matrix}]

(25)

where

ϵ

represents the prior accuracy information for the OpenStreetMap.

Based on the distances and orientation between trajectory points and matched points, we provide a method to evaluate the planar and heading errors of the matching points. This allows for an adaptive assessment of the confidence in map constraints, thereby mitigating the impact of false matches on the pose graph.

3.3.2. Loop Closure Factors

Loop closure factors

r_{L} (L_{i}^{j}, X)

are used to provide long-term pose constraints between non-consecutive frames i and j when the system revisits a previously mapped area. In radar SLAM, loop detection is inherently more challenging than in LiDAR or visual SLAM due to speckle noise, lower spatial resolution, and the dependency of scattering patterns on environmental conditions. To ensure robust factor construction, we adopt a two-stage process: loop-candidate generation followed by geometric verification.

Each incoming radar frame is first processed by our loop closure module (Section 3.2.1), producing a descriptor vector from the feature points. We then compare this descriptor to a database of previous frames using cosine similarity:

s_{i, j} = \frac{⟨ d_{i}, d_{j} ⟩}{∥ d_{i} ∥ \cdot ∥ d_{j} ∥}

(26)

where

d_{i}

and

d_{j}

denote the descriptor vectors of frames i and j, respectively. Frames are considered loop candidates when

s_{i, j}

exceeds the similarity threshold

τ_{s}

(empirically determined from validation sequences).

To avoid false positives caused by similar scattering patterns in different locations, loop candidates are further verified through scan alignment. Instead of using a phase-correlation method, we obtain an initial relative orientation from descriptor matching, then perform feature-based point correspondence extraction in Cartesian-transformed radar data.

A robust RANSAC procedure is employed to estimate a coarse rigid-body transform

{\tilde{T}}_{j}^{i}

while eliminating outlier correspondences. This coarse transform is subsequently refined using Iterative Closest Point (ICP) registration, yielding the accurate relative pose

{\hat{T}}_{j}^{i} \in SE (2)

. The alignment score

m_{i, j}

is evaluated from:

m_{i, j} = \frac{1}{E_{ICP}}

(27)

where

E_{ICP}

is the RMS alignment error. Loop candidates are accepted when

m_{i, j} \geq τ_{m}

, with

τ_{m}

tuned for balanced recall and precision across varied radar-noise conditions. For each accepted loop

(i, j)

, the residual

e_{L}

is computed as:

e_{L} = X_{j} ⊖ ({\hat{T}}_{j}^{i} \cdot X_{i})

(28)

where ⊖ represents the pose difference operator in

SE (2)

. The corresponding residual function is:

r_{L} (L_{i}^{j}, X) = {(e_{L})}^{T} Ω_{L} (e_{L})

(29)

Ω_{L}

is the information matrix derived from the covariance of the scan-matching process, with diagonal entries inversely proportional to the estimated translation and rotation uncertainties. Similar to the adaptive map factor design in Section 3.3.1, the loop closure factor information matrix

Ω_{L}

encodes our confidence in the relative pose constraint.

Translation uncertainty is quantified using the RMS error

E_{ICP}

obtained during ICP refinement, while rotation uncertainty is indicated by the normalized descriptor similarity score

s_{i, j} / τ_{s}

. Specifically, we define:

Ω_{L} = [\begin{matrix} \begin{matrix} \frac{1}{{(ε_{t} + E_{ICP})}^{2}} & \frac{1}{{(ε_{t} + E_{ICP})}^{2}} & 0 & 0 & 0 & \frac{1}{{(ε_{r} + 1 - \frac{s_{i, j}}{τ_{s}})}^{2}} \end{matrix} \end{matrix}]

(30)

where

ε_{t}

and

ε_{r}

are small positive constants representing the prior translation and rotation accuracy bounds, respectively. The first two entries correspond to the

(x, y)

components in the local coordinate system, and the last non-zero entry corresponds to the yaw angle component. This design adaptively increases the loop closure weight for precise alignments (low

E_{ICP}

) and strong descriptor matches (high

s_{i, j}

), while attenuating the influence of uncertain closures.

High-confidence matches yield large

Ω_{L}

values to enforce strong constraints, while uncertain matches are down-weighted to avoid distorting the graph. Unlike the odometry factor

r_{F} (F_{i - 1}^{i}, X)

, which connects temporally adjacent frames and accumulates drift over long trajectories,

r_{L} (L_{i}^{j}, X)

leverages re-observed locations to correct accumulated errors.

Given our focus on canopy-occluded and GNSS-degraded environments, the loop closure factor is critical to achieving consistent localization when the adaptive map factors are temporarily weakened or unavailable.

4. Results

In this section, we validate the performance of our proposed method through both qualitative and quantitative experiments. The evaluation includes comparisons of trajectory accuracy and a comprehensive ablation study, covering GNSS-challenged scenarios such as canopy-occluded urban neighborhoods and campus environments. We first describe our experiment setup, including platform, datasets, and comparative frameworks in Section 4.1, followed by qualitative and quantitative comparisons of trajectories with existing SLAM systems based on millimeter-wave radar and LiDAR in Section 4.2. This assessment aims to evaluate the effectiveness of our system in delivering accurate and consistent state estimation under partial and dense vegetation coverage. In Section 4.4, we conduct ablation studies on the three proposed optimizations to assess their impact on the system and analyze the underlying reasons for their effects.

4.1. Experimental Setup

To evaluate the accuracy and consistency of the proposed system under various environmental conditions, we selected three scenarios from the MulRan dataset. Additionally, we conducted real-world experiments on a campus to further evaluate the effectiveness of our framework. All experiments were conducted on an Intel i7-12700H laptop running Ubuntu 20.04 with 32 GB of RAM. Our system, implemented in C++, utilizes GTSAM (v4.0.3) [65] and iSAM2 (https://borglab.github.io/gtsam/isam2/, accessed on 14 November 2025) [66] for pose graph optimization.

To quantify the accuracy of our system, we employed the established approach of aligning the trajectories of all methods with the ground truth and assessing them using Absolute Trajectory Error (ATE) [67]. The methods included in our comparison are SC-LeGO-LOAM [68], Yeti [14], radarSLAM [6], SC-ORORA [69], and LSRL [59]. Additionally, the assessment of OSM road networks against ground truth data has been conducted using the transverse Mercator projection. The experiments rely on three major datasets or collection platforms: the MulRan dataset [70], our real-world radar-based campus collection platform, and the OORD dataset [71] for canopy suitability testing.

Given the focus of our study on GNSS-challenged urban environments with partial and dense canopy obstruction, we selected three representative canopy-influenced scenarios from the MulRan dataset for our experiments: DCC, KAIST, and Riverside. These routes traverse neighborhoods, campuses, and riverside areas with varying levels of vegetation coverage, where tree canopies and roadside foliage periodically block or attenuate satellite signals. Such conditions not only degrade GNSS performance but also pose challenges for radar, LiDAR, and vision-based localization, making them ideal for evaluating the effectiveness of loop closure mechanisms and integration of map constraints under canopy-occluded conditions. MulRan dataset: Data were acquired using a ground vehicle equipped with a horizontally mounted 3D LiDAR (OS1-64, Ouster, Inc., San Francisco, CA, USA) and a horizontal millimeter-wave radar (CIR204-H, Navtech Radar Ltd., Ardington, UK), offering

0 . 9^{°}

and

0.06 m

resolution, a 360° field of view, and a detection range up to

200 m

at

4 Hz

. OORD dataset: To evaluate the system’s applicability in dense forest canopy environments, we employed the twolochs sequence from the oord dataset. This route includes mixed terrain such as tree-covered tarmac, gravel mountain tracks, and lochside roads, providing diverse surface conditions and canopy densities. For our experiments, we selected an approximately

550 m

section with continuous and dense canopy coverage to simulate persistent GNSS signal attenuation. This subset was used to assess the localization performance and robustness of the proposed method under sustained forest canopy obstruction.

Real-world collection platform: Hardware includes an integrated navigation device (CGI430, CHC Navigation, Shanghai, China), a computational unit (Orin, NVIDIA Corporation, Santa Clara, CA, USA) with

32 GB

RAM, and a horizontally mounted mm-wave radar (RAS-3, Navtech Radar Ltd., Ardington, UK) with

0.043 m

range resolution,

0 . 9^{°}

angular resolution,

75 m

maximum detection range, and

4 Hz

scan frequency, which is illustrated in Figure 6. Multimodal spatiotemporal synchronization was achieved using engineering drawing-based spatial calibration and GPS+PPS hardware clock synchronization to ensure data validity. Ground-truth trajectories were obtained via GNSS/INS integrated navigation post-processing in unobstructed or weakly occluded zones, and through matching with high-definition maps in densely occluded or GNSS-degraded zones.

Figure 6. Our platform for real-world experiments.

During the data acquisition process, the platform moved at an average speed of approximately 1.5 m per second. We selected two distinct scenarios for data collection on campus that offer different types and densities of canopy obstruction. The first sequence is a 220-m-long route navigating around the Starlake building, which involves sparse occlusions caused by roadside trees. The second sequence is a 430-m-long route traversing Youyi Square, containing alternating regions of dense canopy cover and open corridors with sparse vegetation, which cause varying levels of GNSS signal obstruction. Such canopy occlusion characteristics make this scenario particularly suitable for evaluating positioning performance under forest-like conditions. The ground truth trajectories were obtained through tightly coupled GNSS/INS integration, further refined via alignment with a high-definition map to compensate for incomplete RTK availability under dense canopy conditions (see Figure 7a,b).

Figure 7. Real-world GNSS-challenged scenarios used for evaluating spatial-temporal enhancements on radar SLAM system. The small triangles and associated images indicate the viewing direction and the corresponding real-world scenes, respectively, illustrating the level of canopy occlusion. (a) The Starlake building (sparse canopy cover). (b) The Youyi Square (dense and sparse canopy cover).

4.2. Trajectory Evaluation

We first conducted a qualitative analysis of our system’s trajectory consistency by comparing the trajectories generated by our system with those produced by current leading LiDAR and millimeter-wave radar algorithms that incorporate loop closure, as well as the road network from the OpenStreetMap.

Figure 8 illustrates the geometric consistency between our trajectory and the satellite map within the neighborhood containing dense canopy-occlusion. To highlight the challenges in this environment, a GNSS covariance bar chart for the DCC01 sequence is provided in Figure 9, showing the variation of GNSS position standard deviation (X, Y, Z) over time and indicating periods of degraded precision. Finally, Figure 10 compares our trajectory with those of other methods, using pure GNSS positioning results as a reference.

Figure 8. Visualization of the DCC01 sequence overlaid on a satellite map background. The yellow line represents the vehicle trajectory obtained by our method, while the colored points denote the mmWave feature point cloud. Point colors correspond to echo intensity, with warmer colors (towards red) indicating stronger returns and cooler colors indicating weaker returns.

Figure 9. GNSS position standard deviation in X, Y, and Z directions over time for the DCC01 sequence. Periods of increased variance correspond to degraded GNSS precision.

Figure 10. Trajectories on Mulran Dataset generated from methods including ground truth, Yeti, sc-lego-loam, sc-orora and our method for qualitative evaluation. The comparisons are performed on sequences: (a) DCC01, (b) DCC02, (c) DCC03, (d) KAIST01, (e) KAIST02, (f) KAIST03, (g) Riverside01, (h) Riverside02, (i) Riverside03.

To quantify the overall trajectory consistency performance of our proposed system, we calculated the Absolute Trajectory Error (ATE) by comparing our system-generated trajectories with the ground truth provided by the MulRan dataset and compared the results with other algorithms. Table 1 presents the ATE results for six different SLAM frameworks (SC-LEGO-LOAM, Yeti, SC-ORORA, radarSLAM, LSRL and ours) on the MulRan dataset. The ATE value represents the Root Mean Square Error (RMSE) of the positional difference between the algorithm’s trajectory and the ground truth, given temporal alignment. Lower values indicate better algorithm performance.

Table 1. Absolute trajectory error (ATE) of different systems on the MulRan Dataset. All values are root mean square error (RMSE) in meters (m). The symbols “-” and “NA” denote cases where data was not provided in the original paper and where the system failed to complete the sequence, respectively. In the latter case, this indicates that the error exceeded 100 m. Bold values indicate the best accuracy. Pure GNSS errors are shown in italics to illustrate signal fluctuations in these scenarios, noting that neither the proposed method nor the comparison methods utilize GNSS as an input.

In the Twolochs forest region, characterized by dense canopy occlusion, irregular terrain, and the absence of distinct road geometry, GNSS signals suffer severe degradation, leading to pronounced trajectory drifting in raw GNSS data. In contrast, as illustrated in Figure 11, our method generates a smooth and stable trajectory estimate, achieving an Absolute Trajectory Error of 6.75 m relative to the reference path. These results demonstrate the applicability of the proposed framework to unstructured and topologically ambiguous environments, where conventional map-matching and GNSS-based solutions typically fail.

Figure 11. Trajectory comparison under dense forest canopy in the Twolochs region (OORD dataset), including our method (blue solid line), GNSS data with outlier removal (orange solid line), and raw GNSS data (dashed line).

4.3. Real World Experiments

In our real-world experiments, we conducted both qualitative and quantitative evaluations. For the qualitative evaluation, we overlaid the estimated trajectories and radar-derived feature points onto satellite imagery, enabling direct visual inspection of their spatial alignment with the road network and the surrounding environment, which is shown in Figure 12. This approach verifies the consistency between the reconstructed trajectories, the radar point features, and the underlying map. For the quantitative evaluation, we computed the Absolute Trajectory Error (ATE) against ground truth to measure positioning accuracy. As shown in Table 2, our framework achieved an ATE of 0.55 m in the Starlake building sequence with sparse canopy occlusion, and 2.08 m in the Youyi Square loop containing a mix of dense and sparse canopy segments.

Figure 12. The trajectory of our system in two scenes in our real world experiments. The yellow line represents the estimated trajectory. The mmWave feature points are colored by echo intensity, with warmer colors (red) indicating stronger returns. (a) The Starlake building. (b) The Youyi Square.

Table 2. Absolute trajectory error (ATE) of our system on real world dataset. All values are given in meters (m).

4.4. Ablation and Real-Time Performance Analysis

In addition to accuracy evaluation, we also analyze the runtime performance of the proposed system to verify its suitability for real-time operation. The major processing modules include millimeter-wave radar odometry, loop closure detection, and back-end pose graph optimization. The map-matching module is executed asynchronously at a low frequency and does not require real-time processing, hence it is excluded from this analysis. Given the rotational millimeter-wave radar operates at 4 Hz, a per-frame computation time below 250 ms satisfies the real-time criterion.

We introduced short-term and long-term loop closure enhancements along with adaptive map constraints to develop our coarse-to-fine algorithm for consistent state estimation. To validate the effectiveness of these improvements, we conducted comprehensive ablation experiments. Figure 13 displays the Absolute Trajectory Error (ATE) results after each component was removed, with qualitative and quantitative differences summarized in the following subsections. We also utilize Precision-Recall (PR)-curve to represent the contribution of each module, which in shown in Figure 14.

Figure 13. The ATE results for evaluating the contribution of our proposed enhancement on the global accuracy.

Figure 14. Precision-Recall evaluation for three scenes in the ablation study. Loop detection accuracy decreases when short-term and long-term enhancements are removed. (a) Precision–Recall Analysis for DCC01. (b) Precision–Recall Analysis for KAIST01. (c) Precision–Recall Analysis for Riverside01.

Figure 15 shows the stacked per-frame computation time during processing. After an initial fluctuation stage, the odometry module stabilizes at approximately 110 ms per frame. The back-end optimization time begins to increase slightly after around 800 frames, but the overall average per-frame processing time converges to about 130 ms. This corresponds to an effective processing frequency exceeding 4 Hz, satisfying our real-time criterion while maintaining stable performance over extended trajectories.

Figure 15. Computation time breakdown per frame for major modules: millimeter-wave radar odometry (blue), loop closure detection (orange), and back-end pose graph optimization (yellow).

4.4.1. Short-Term Enhancement

To mitigate the negative impact of millimeter-wave radar feature fluctuations on loop closure detection, we propose a short-term loop closure enhancement method. This method constructs short-term features using inter-frame pose transformations provided by the radar odometry, which are then used as inputs for loop closure detection for the generation of Temporal Density Context (TDC). We first validated the impact of this enhancement on overall trajectory accuracy by performing experiments with and without the enhancement. We then conducted a statistical analysis of the loop closure detection results before and after removing the enhancement (Figure 16), investigating both the performance shifts and the corresponding input characteristics.

Figure 16. Visualization of feature points and Temporal Density Context (TDC) for two adjacent frames before and after short-term loop-closure enhancement.The red and blue squares highlight corresponding regions of drastic inter-frame feature fluctuation in the point clouds and descriptors, respectively, demonstrating the improved stability.

As shown in Figure 13, removing the short-term loop closure enhancement resulted in a noticeable decrease in overall trajectory accuracy, with a smaller reduction compared to map constraints, but with a greater decline in accuracy for neighborhood and bridge scenarios compared to campus scenarios. As shown in Figure 14, after the removal, loop detection accuracy decreased, particularly in Riverside, likely due to obvious occlusions and repetitive features. However, incorporating features from short-term sequences increased correctly detected loop closures, enhancing the feature representation of spinning radar sensors.

4.4.2. Long-Term Enhancement

Given the continuous nature of vehicle trajectories, we propose a long-term loop closure enhancement algorithm. This algorithm amplifies the similarity differences between adjacent frames in forward and reverse loop closures to accurately eliminate loop closure outliers, thereby further ensuring the stability of loop closure constraints. Similar to the short-term loop closure enhancement algorithm, we evaluated the effectiveness of this algorithm by comparing changes in overall trajectory accuracy and differences in loop closure detection results before and after removing the enhancement, as illustrated in Figure 17.

Figure 17. Comparison of the effects of long-term loop closure enhancement on forward and reverse loops before and after enhancement, showing significantly increased similarity for true forward and reverse loops compared with surrounding non-loop regions, thereby improving loop-closure robustness.

As shown in Figure 13, the removal of the long-term loop closure enhancement also led to a noticeable decrease in overall trajectory accuracy. The magnitude and trend of this decrease across most sequences are similar to those observed in the ablation experiments for the short-term enhancement. Precision–Recall analysis also showed a reduction in loop closure accuracy (Figure 14).

4.4.3. Adaptive Map Factors

OpenStreetMap typically exhibits relatively low single-point accuracy but high accuracy in topological relationships and geometric consistency. We leverage these characteristics to provide reliable prior pose information for our framework. When this module is removed, we observe varying degrees of accuracy degradation across different sequences, with the neighborhood and bridge scenarios experiencing the most significant impacts, while the campus scenario was less affected, which is shown in Figure 13.

To further investigate the robustness of our approach to potential inaccuracies in OSM geometry, we conducted a sensitivity analysis on the DCC01 sequence by introducing varying levels of artificial horizontal noise into the map-matching results. The noise levels were set to no noise, +5 m, +10 m, +20 m, and +50 m random displacement of matched positions, simulating different magnitudes of map geometry error. As shown in Table 3, the Absolute Trajectory Error (ATE) increased when noise was added, but our adaptive map-factor weighting maintained reasonable accuracy even at significant noise levels.

Table 3. Sensitivity of trajectory accuracy to map-matching geometry errors (DCC01 sequence). All values are in meters (m).

5. Discussion

This section presents a thorough examination and reasoned discussion of the experimental outcomes, aiming to interpret the observed performance patterns, compare them across different scenarios, and assess their significance in validating the effectiveness of the proposed approach.

5.1. Analysis of Trajectory Accuracy and Consistency

The qualitative results indicate that our algorithm demonstrates good trajectory consistency and overall stability. During the trajectory generation process, we observed that the prior map information and the temporal enhanced loop constraints continuously adjusted the current trajectory. This was particularly evident in bridge scenarios, where the repeated roadside trees causing both GNSS and LiDAR-based odometry to struggle with maintaining accurate orientation. In these cases, the coarse but consistently stable map constraints and intermittent but high-precision loop closure constraints collectively ensured sufficient consistency between the trajectory and the ground truth. In contrast, other algorithms exhibited noticeable deviations in trajectories within the bridge scenarios, particularly those based on LiDAR, such as SC-LeGO-LOAM. This could be attributed to the relatively limited detection range of LiDAR, with LiDAR concentrating features on repetitive roadside objects like lamp posts and tree trunks, and struggling to reliably capture distant building features obscured by trees, leading to significant perceptual aliasing.

We observe in the quantitative results that, across most neighborhood, campus, and bridge scenarios, our algorithm achieves the highest ATE accuracy, with an average ATE of 5.93 m, and even reaches an error level of 1.83 m in the KAIST03 sequence. This suggests, to some extent, the overall effectiveness of our algorithm. Regarding the higher accuracy in the KAIST03 sequence, we attribute this to the presence of sufficient correct loop closures in the KAIST campus scenario, which helps maintain trajectory consistency. We also note that other algorithms, such as SC-ORORA and radarSLAM, performed better in the three KAIST campus sequences compared to the other two scenarios, which aligns with the trend observed in our algorithm. This trend may be due to the KAIST sequence having the most loop closures, which helps maintain the trajectory’s geometric integrity. Nevertheless, sustained high-accuracy localization remains challenging in our evaluation datasets due to two intrinsic factors: (i) the routes are relatively long (often larger than 1 km in the MulRan dataset) and involve numerous turns and U-turns, which compound drift accumulation; and (ii) the angular resolution of the current rotating millimeter-wave radar, which is approximately 0.9 degree, limits the precision of fine-scale geometric constraints. We performed optimization trials adjusting loop-closure thresholds and motion-model parameters, but the lower bound of 1.83 m persists, indicating that the limitation is primarily due to sensor and dataset characteristics rather than algorithmic design.

The bridge scenarios present a challenge due to their overall lack of distinctive features and repetitive features. LiDAR algorithms performed inconsistently in these scenarios, whereas millimeter-wave radar algorithms, despite potential end-point error increase, generally ensured better trajectory consistency. This could be due to the longer detection range and higher penetration of millimeter-wave radar, which allows it to capture a greater number of geometric structural features that are occluded by roadside trees. We observed a notable increase in error levels for our algorithm in the Riverside03 sequence. The trajectory plot reveals a deviation in the final orientation, which may be attributed to differences in vehicle movement trajectories and the geometric layout of the Riverside03 map compared to the other sequences. These differences likely caused deviations in the constraints. Similarly, LSRL experiences problems in Riverside03, which are attributed to outdated map data [59]. Despite these challenges, our observations indicate that, although the algorithm’s performance is influenced by map accuracy, it still maintains overall geometric consistency, demonstrating its fault tolerance.

Furthermore, the GNSS position variance analysis (Figure 9) in the DCC01 sequence shows that our method is capable of maintaining trajectory stability during periods of markedly increased GNSS standard deviation in all spatial dimensions. This illustrates the framework’s ability to compensate for degraded satellite positioning through radar odometry, loop-closure constraints, and adaptive map factors, thereby limiting drift accumulation despite highly variable GNSS precision. In the Twolochs forest canopy experiment (Figure 11), the dense, continuous canopy combined with irregular terrain and lack of distinct road geometry caused severe GNSS degradation and significant trajectory drift in raw GNSS data. Our system produced a smooth and stable estimation with an ATE of 6.75 m relative to the reference path, without reliance on structured road features. This confirms the applicability of the proposed framework to unstructured, topologically ambiguous environments where conventional GNSS and map-matching approaches fail.

5.2. Generalization in Complex Real-World Scenarios

From Figure 12, it can be observed that, regardless of canopy density, our system consistently produces trajectories geometrically aligned with the satellite road network while maintaining coherent radar feature point clouds. The fact that the positioning accuracy in these canopy-occluded scenarios remains close to that obtained on public datasets demonstrates the effectiveness and generalization capability of the proposed radar–OpenStreetMap fusion framework under forest-like GNSS-challenged conditions.

Comparative analysis with the Twolochs region in the OORD dataset (representing dense forest canopy conditions in the field) reveals notable differences in trajectory accuracy. While the millimeter-wave radar range resolution remained identical across the experiments, the campus scenarios were recorded with a shorter sensor detection range. Nevertheless, the average trajectory error in the campus canopy-occlusion dataset was consistently lower than that observed under dense forest canopy conditions. Specifically, in the Starlake Building area (sparse canopy cover), the trajectory error ratio was approximately 0.25%, and in the Youyi Square area (dense canopy cover), approximately 0.48%. In contrast, in the Twolochs region (dense forest canopy in unstructured environments), the trajectory error ratio rose to approximately 1.23%. These results indicate that trajectory error increases moderately with canopy density in campus environments, but rises more substantially in unstructured forest settings. We attribute this to the relative scarcity of regular geometric landmarks in the field, which reduces the availability of fine-scale geometric constraints and thereby contributes to higher error levels. In campus environments, abundant man-made structures such as planar building facades, orthogonal road intersections, cylindrical lamp posts, and even spherical decorative elements provide distinctive radar signatures that help stabilize trajectory estimation. In contrast, unstructured forest terrain is dominated by irregularly shaped vegetation and uneven ground surfaces, which lack consistent geometric patterns and thus offer fewer reliable constraints for precise localization.

5.3. Impact of Modules and Computational Efficiency

The short-term loop closure enhancement improves feature stability between adjacent radar frames, which is particularly beneficial in environments containing occlusions or repetitive structures, such as Riverside. By augmenting loop closure inputs with short-term sequences, the algorithm increases the number of correctly detected loop closures and enhances the representation capacity of spinning radar sensors.

The long-term enhancement contributes significantly to loop closure reliability by distinguishing forward and reverse loops, thereby reducing the number of false positives. In campus scenarios such as KAIST, the accuracy drop upon removal of this enhancement is less severe than in other sequences, suggesting that these environments inherently produce fewer erroneous loop closures. In the campus scenario, we observed that the reduction in accuracy after removing long-term loop closure enhancement was less significant compared to the reduction observed after removing short-term loop closure enhancement. This may be attributed to the relatively low number of erroneous loops in the KAIST scenario. Furthermore, the decrease in loop closure accuracy is more significant when long-term enhancements are removed, as shown in Figure 14. This also demonstrates the effectiveness of temporal enhancements in loop closure detection.

Map constraints from OpenStreetMap are especially critical in scenarios with sparse or unreliable loop closures, e.g., neighborhood and bridge environments. Without these constraints, accumulated pose estimation errors cannot be sufficiently corrected, leading to notable degradation in global trajectory accuracy. In contrast, campus scenarios benefit less from map constraints. This is likely because the campus scenario has a lower number of false loop closures. Even in the absence of reliable prior map information, the global constraints provided by these loop closures are sufficient to maintain overall trajectory consistency. In contrast, the neighborhood and bridge scenarios have fewer available loop closures. As a result, the absence of map constraints leads to accumulating errors that are not adequately corrected by the existing loop closures, resulting in noticeable accuracy degradation. To further examine the robustness of our mapping constraints module, we conducted a sensitivity analysis by introducing controlled horizontal noise into the OSM-matching outputs in the DCC01 sequence (Section 4.4.3). While the Absolute Trajectory Error gradually increased with larger injected noise levels (Table 3), the adaptive map factor weighting mechanism sustained reasonable accuracy even under severe geometric perturbations (up to 50 m displacement). This result suggests that the proposed framework can tolerate moderate map geometry errors without catastrophic performance loss, which is particularly relevant when using publicly available, crowd-sourced map data in areas where surveyed high-precision maps are unavailable.

Overall, the ablation study demonstrates that each enhancement contributes to the global accuracy, with the map constraint module providing the most pronounced improvements in challenging GNSS-denied and canopy-occluded environments.

6. Conclusions

This work proposes a spatial–temporal enhancement framework for millimeter-wave radar SLAM, combining open-source map priors from OpenStreetMap with a coarse-to-fine radar trajectory estimation pipeline. The spatial component employs map-based initialization with confidence-adaptive constraints, while the temporal component enhances radar odometry through short-term feature aggregation and long-term loop closure pruning. Together, these strategies yield geometrically consistent trajectories in GNSS-degraded urban environments.

Extensive experiments across neighborhoods, campuses, and bridge routes demonstrate that the proposed system improves trajectory accuracy and robustness under challenging conditions such as vegetation-induced signal loss. The framework achieves up to 55.23% reduction in Absolute Trajectory Error compared to baseline methods and reaches a minimum error of 1.83 m in canopies-occluded campus scenarios, confirming its capability to maintain precise localization during extended GNSS outages.

While the results are promising, certain limitations remain. The construction of the Temporal Density Context depends on radar feature quality and odometry estimates, which may vary with sensor models and algorithms. Furthermore, the current evaluation has been limited to urban and campus routes, and its behavior in large-scale environments featuring prolonged GNSS outages has yet to be confirmed. Future work will focus on extending and testing the framework in large-scale dense forest canopy conditions, where multipath effects, foliage attenuation, and extremely sparse radar landmarks pose additional challenges. Such scenarios would enable validation of the system’s long-term localization stability and broaden its applicability for autonomous navigation and mapping in natural GNSS-challenged terrains. In addition, future work will explore the use of higher-resolution radar sensors and the integration of inertial measurement units (IMU) or other complementary modalities, to provide tighter motion constraints and further enhance trajectory accuracy in GNSS-degraded conditions.

Author Contributions

Visualization, Y.T.; Investigation, Y.T.; Software, Y.T. and J.Z.; Validation, Y.T. and M.Y.; Writing—original draft preparation, Y.T.; Writing—review and editing, Y.T. and H.Z.; Supervision, J.Z.; Methodology, J.Z.; Project administration, J.Z.; Resources, B.L.; Data curation, M.Y. and S.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the State Key Program of the National Natural Science Foundation of China (grant no. 52332010), the Key Research and Development Projects in Hubei Province (grant no. 2024BAB078) and the Major Program (JD) of Hubei Province (grant no. 2023BAA02604). The numerical calculations in this article were performed on the supercomputing system at the Supercomputing Center of Wuhan University.

Data Availability Statement

Data available in a publicly accessible repository “https://github.com/carbo-T/ROTE” (accessed on 14 November 2025).

Acknowledgments

We want to acknowledge Kai Liu for his help in data collection.

Conflicts of Interest

Authors Haoran Zhong and Shuiyun Jiang were employed by the China Ship Development and Design Center. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Zheng, S.; Wang, J.; Rizos, C.; Ding, W.; El-Mowafy, A. Simultaneous localization and mapping (slam) for autonomous driving: Concept and analysis. Remote Sens. 2023, 15, 1156. [Google Scholar] [CrossRef]
Liu, Z.; Kaartinen, H.; Hakala, T.; Hyyppä, J.; Kukko, A.; Chen, R. Performance analysis of standalone UWB positioning inside forest canopy. IEEE Trans. Instrum. Meas. 2024, 73, 9511214. [Google Scholar] [CrossRef]
Chen, L.; Li, Y.; Li, L.; Qi, S.; Zhou, J.; Tang, Y.; Yang, J.; Xin, J. High-precision positioning, perception and safe navigation for automated heavy-duty mining trucks. IEEE Trans. Intell. Veh. 2024, 9, 4644–4656. [Google Scholar] [CrossRef]
Chen, W.; Shang, G.; Ji, A.; Zhou, C.; Wang, X.; Xu, C.; Li, Z.; Hu, K. An overview on visual slam: From tradition to semantic. Remote Sens. 2022, 14, 3010. [Google Scholar] [CrossRef]
Liu, Z.; Kaartinen, H.; Hakala, T.; Hyyti, H.; Kukko, A.; Hyyppa, J.; Chen, R. Comparative Analysis of Ultra-Wideband and Mobile Laser Scanning Systems for Mapping Forest Trees under A Forest Canopy. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, X-G-2025, 551–557. [Google Scholar] [CrossRef]
Hong, Z.; Petillot, Y.; Wallace, A.; Wang, S. RadarSLAM: A robust simultaneous localization and mapping system for all weather conditions. Int. J. Robot. Res. 2022, 41, 519–542. [Google Scholar] [CrossRef]
Wang, X.; Liang, X.; Campos, M.; Zhang, J.; Wang, Y. Benchmarking of Laser-Based Simultaneous Localization and Mapping Methods in Forest Environments. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5706221. [Google Scholar] [CrossRef]
Barnes, D.; Weston, R.; Posner, I. Masking by Moving: Learning Distraction-Free Radar Odometry from Pose Information. arXiv 2019, arXiv:1909.03752. [Google Scholar]
Barnes, D.; Posner, I. Under the radar: Learning to predict robust keypoints for odometry estimation and metric localisation in radar. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May 2020–31 August 2020; pp. 9484–9490. [Google Scholar]
Burnett, K.; Yoon, D.J.; Schoellig, A.P.; Barfoot, T.D. Radar odometry combining probabilistic estimation and unsupervised feature learning. arXiv 2021, arXiv:2105.14152. [Google Scholar] [CrossRef]
Lai, H.; Zheng, Z.; Zhao, M. CartoRadar: RF-Based 3D SLAM Rivaling Vision Approaches. In Proceedings of the 31st ACM International Conference on Mobile Computing and Networking (ACM MobiCom ’25), Hong Kong, China, 4–8 November 2025. [Google Scholar]
Sie, E.; Wu, X.; Guo, H.; Vasisht, D. Radarize: Enhancing Radar SLAM with Generalizable Doppler-Based Odometry. In Proceedings of the 22nd ACM International Conference on Mobile Systems, Applications, and Services (ACM MobiSys ’24), Tokyo, Japan, 3–7 June 2024; pp. 331–344. [Google Scholar] [CrossRef]
Adolfsson, D.; Magnusson, M.; Alhashimi, A.; Lilienthal, A.J.; Andreasson, H. Lidar-Level Localization With Radar? The CFEAR Approach to Accurate, Fast, and Robust Large-Scale Radar Odometry in Diverse Environments. IEEE Trans. Robot. 2022, 39, 1476–1495. [Google Scholar] [CrossRef]
Burnett, K.; Schoellig, A.P.; Barfoot, T.D. Do we need to compensate for motion distortion and doppler effects in spinning radar navigation? IEEE Robot. Autom. Lett. 2021, 6, 771–778. [Google Scholar] [CrossRef]
Rohling, H. Radar CFAR thresholding in clutter and multiple target situations. IEEE Trans. Aerosp. Electron. Syst. 1983, AES-19, 608–621. [Google Scholar] [CrossRef]
Cen, S.H.; Newman, P. Precise ego-motion estimation with millimeter-wave radar under diverse and challenging conditions. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, QLD, Australia, 21–25 May 2018; pp. 6045–6052. [Google Scholar]
Cen, S.H.; Newman, P. Radar-only ego-motion estimation in difficult settings via graph matching. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 298–304. [Google Scholar]
Wang, D.; Duan, Y.; Fan, X.; Meng, C.; Ji, J.; Zhang, Y. MAROAM: Map-based Radar SLAM through Two-step Feature Selection. arXiv 2022, arXiv:2210.13797. [Google Scholar]
Checchin, P.; Gérossier, F.; Blanc, C.; Chapuis, R.; Trassoudaine, L. Radar scan matching slam using the fourier-mellin transform. In Proceedings of the Field and Service Robotics: Results of the 7th International Conference, Cambridge, MA, USA, 14–16 July 2009; Springer: Berlin/Heidelberg, Germany, 2010; pp. 151–161. [Google Scholar]
Park, Y.S.; Shin, Y.S.; Kim, A. Pharao: Direct radar odometry using phase correlation. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 2617–2623. [Google Scholar]
Lim, H.; Jeon, J.; Myung, H. UV-SLAM: Unconstrained line-based SLAM using vanishing points for structural mapping. IEEE Robot. Autom. Lett. 2022, 7, 1518–1525. [Google Scholar] [CrossRef]
Zhuang, Y.; Wang, B.; Huai, J.; Li, M. 4D iRIOM: 4D imaging radar inertial odometry and mapping. IEEE Robot. Autom. Lett. 2023, 8, 3246–3253. [Google Scholar] [CrossRef]
Zhou, B.; Mo, H.; Tang, S.; Zhang, X.; Li, Q. Backpack LiDAR-based SLAM with multiple ground constraints for multistory indoor mapping. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5705516. [Google Scholar] [CrossRef]
Jiang, W.; Xue, H.; Si, S.; Xiao, L.; Zhao, D.; Zhu, Q.; Nie, Y.; Dai, B. R2SCAT-LPR: Rotation-Robust Network with Self-and Cross-Attention Transformers for LiDAR-Based Place Recognition. Remote Sens. 2025, 17, 1057. [Google Scholar] [CrossRef]
Duan, C.; Zhou, J.; Li, B.; Dong, Z.; Tang, Y.; Xiao, J. RoCalib: Large-Scale Autonomous Geo-Calibration for Roadside Lidar with High-Definition Map. IEEE Trans. Intell. Transp. Syst. 2025, 26, 13734–13749. [Google Scholar] [CrossRef]
Schubert, S.; Neubert, P.; Garg, S.; Milford, M.; Fischer, T. Visual place recognition: A tutorial. IEEE Robot. Autom. Mag. 2023, 31, 139–153. [Google Scholar] [CrossRef]
Lowe, D.G. Object recognition from local scale-invariant features. In Proceedings of the Seventh IEEE International Conference on Computer Vision, Kerkyra, Greece, 20–27 September 1999; Volume 2, pp. 1150–1157. [Google Scholar]
Konolige, K.; Agrawal, M. FrameSLAM: From bundle adjustment to real-time visual mapping. IEEE Trans. Robot. 2008, 24, 1066–1077. [Google Scholar] [CrossRef]
Oliva, A.; Torralba, A. Building the gist of a scene: The role of global image features in recognition. Prog. Brain Res. 2006, 155, 23–36. [Google Scholar]
Ali-Bey, A.; Chaib-Draa, B.; Giguere, P. Mixvpr: Feature mixing for visual place recognition. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 2998–3007. [Google Scholar]
Arandjelovic, R.; Gronat, P.; Torii, A.; Pajdla, T.; Sivic, J. NetVLAD: CNN architecture for weakly supervised place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5297–5307. [Google Scholar]
Kim, G.; Kim, A. Scan context: Egocentric spatial descriptor for place recognition within 3d point cloud map. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4802–4809. [Google Scholar]
Uy, M.A.; Lee, G.H. Pointnetvlad: Deep point cloud based retrieval for large-scale place recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4470–4479. [Google Scholar]
Kim, G.; Choi, S.; Kim, A. Scan context++: Structural place recognition robust to rotation and lateral variations in urban environments. IEEE Trans. Robot. 2021, 38, 1856–1874. [Google Scholar] [CrossRef]
Xu, D.; Liu, J.; Liang, Y.; Lv, X.; Hyyppä, J. A LiDAR-based single-shot global localization solution using a cross-section shape context descriptor. ISPRS J. Photogramm. Remote Sens. 2022, 189, 272–288. [Google Scholar] [CrossRef]
Xiang, H.; Shi, W.; Fan, W.; Chen, P.; Bao, S.; Nie, M. FastLCD: A fast and compact loop closure detection approach using 3D point cloud for indoor mobile mapping. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102430. [Google Scholar] [CrossRef]
Cai, K.; Wang, B.; Lu, C.X. Autoplace: Robust place recognition with single-chip automotive radar. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 2222–2228. [Google Scholar]
Cheng, Y.; Pang, C.; Jiang, M.; Liu, Y. Relocalization based on millimeter wave radar point cloud for visually degraded environments. J. Field Robot. 2023, 40, 901–918. [Google Scholar] [CrossRef]
Jang, H.; Jung, M.; Kim, A. Raplace: Place recognition for imaging radar using radon transform and mutable threshold. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 11194–11201. [Google Scholar]
Liu, Z.; Kaartinen, H.; Hakala, T.; Hyyppä, J.; Kukko, A.; Chen, R. Tracking foresters and mapping tree stem locations with decimeter-level accuracy under forest canopies using UWB. Expert Syst. Appl. 2025, 262, 125519. [Google Scholar] [CrossRef]
Qiu, H.; Zhou, J.; Li, B.; Zou, Q.; Tang, Y.; Luo, M. Map4comm: A map-aware collaborative perception framework with efficient-bandwidth information fusion. Inf. Fusion 2025, 262, 103567. [Google Scholar] [CrossRef]
Qiu, H.; Tang, Y.; Zhou, J.; Xiong, C.; Liu, K.; Xie, F.; Li, B. PosiFusion: A Vehicle-to-Everything Cooperative Perception Framework with Positional Prior Fusion. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, 10, 115–122. [Google Scholar] [CrossRef]
Mur-Artal, R.; Montiel, J.M.M.; Tardos, J.D. ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Trans. Robot. 2015, 31, 1147–1163. [Google Scholar] [CrossRef]
Hong, Z.; Petillot, Y.; Wang, S. Radarslam: Radar based large-scale slam in all weathers. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 5164–5170. [Google Scholar]
Shan, T.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. Lio-sam: Tightly-coupled lidar inertial odometry via smoothing and mapping. In Proceedings of the 2020 IEEE/RSJ international conference on intelligent robots and systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 5135–5142. [Google Scholar]
Cao, S.; Lu, X.; Shen, S. GVINS: Tightly coupled GNSS–visual–inertial fusion for smooth and consistent state estimation. IEEE Trans. Robot. 2022, 38, 2004–2021. [Google Scholar] [CrossRef]
Zhang, J.; Zhuge, H.; Wu, Z.; Peng, G.; Wen, M.; Liu, Y.; Wang, D. 4DRadarSLAM: A 4D imaging radar SLAM system for large-scale environments based on pose graph optimization. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 8333–8340. [Google Scholar]
Nedjah, N.; de Macedo Mourelle, L.; de Oliveira, P.J.A. Simultaneous localization and mapping using swarm intelligence based methods. Expert Syst. Appl. 2020, 159, 113547. [Google Scholar] [CrossRef]
Smith, R.C.; Cheeseman, P. On the representation and estimation of spatial uncertainty. Int. J. Robot. Res. 1986, 5, 56–68. [Google Scholar] [CrossRef]
Grisetti, G.; Stachniss, C.; Burgard, W. Improved techniques for grid mapping with rao-blackwellized particle filters. IEEE Trans. Robot. 2007, 23, 34–46. [Google Scholar] [CrossRef]
Cadena, C.; Carlone, L.; Carrillo, H.; Latif, Y.; Scaramuzza, D.; Neira, J.; Reid, I.; Leonard, J.J. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. Robot. 2016, 32, 1309–1332. [Google Scholar] [CrossRef]
Yang, S.; Zhu, X.; Nian, X.; Feng, L.; Qu, X.; Ma, T. A robust pose graph approach for city scale LiDAR mapping. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 1175–1182. [Google Scholar]
Yin, H.; Wang, Y.; Tang, L.; Xiong, R. Radar-on-lidar: Metric radar localization on prior lidar maps. In Proceedings of the 2020 IEEE International Conference on Real-Time Computing and Robotics (RCAR), Virtual, 28–29 September 2020; pp. 1–7. [Google Scholar]
Ma, Y.; Zhao, X.; Li, H.; Gu, Y.; Lang, X.; Liu, Y. RoLM: Radar on LiDAR map localization. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 3976–3982. [Google Scholar]
Guo, Y.; Zhou, J.; Dong, Q.; Li, B.; Xiao, J.; Li, Z. Refined high-definition map model for roadside rest area. Transp. Res. Part A Policy Pract. 2025, 195, 104463. [Google Scholar] [CrossRef]
Zhou, J.; Yu, M.; Guo, Y.; Li, B.; Ying, S.; Li, Z. A high-definition map architecture for transportation digital twin system construction. Int. J. Appl. Earth Obs. Geoinf. 2025, 144, 104822. [Google Scholar] [CrossRef]
Ruchti, P.; Steder, B.; Ruhnke, M.; Burgard, W. Localization on openstreetmap data using a 3d laser scanner. In Proceedings of the 2015 IEEE international conference on robotics and automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 5260–5265. [Google Scholar]
Brubaker, M.A.; Geiger, A.; Urtasun, R. Map-based probabilistic visual self-localization. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 652–665. [Google Scholar] [CrossRef]
Hong, Z.; Petillot, Y.; Zhang, K.; Xu, S.; Wang, S. Large-Scale Radar Localization using Online Public Maps. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 3990–3996. [Google Scholar]
Panphattarasap, P.; Calway, A. Automated map reading: Image based localisation in 2-D maps using binary semantic descriptors. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 6341–6348. [Google Scholar]
Yang, C.; Gidofalvi, G. Fast map matching, an algorithm integrating hidden Markov model with precomputation. Int. J. Geogr. Inf. Sci. 2018, 32, 547–570. [Google Scholar] [CrossRef]
Tang, Y.; Zhou, J.; Yan, M.; Li, B.; Huang, Y.; Xiao, J. HGCM: A hybrid radar feature and GNSS based continuous mapping method. Expert Syst. Appl. 2025, 286, 128020. [Google Scholar] [CrossRef]
Maybeck, P.S. Stochastic Models, Estimation, and Control; Academic Press: New York, NY, USA, 1982. [Google Scholar]
Rosinol, A.; Leonard, J.J.; Carlone, L. Probabilistic volumetric fusion for dense monocular slam. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 3097–3105. [Google Scholar]
Dellaert, F. Factor Graphs and GTSAM: A Hands-on Introduction; Technical Report; Georgia Institute of Technology: Atlanta, GA, USA, 2012. [Google Scholar]
Kaess, M.; Johannsson, H.; Roberts, R.; Ila, V.; Leonard, J.J.; Dellaert, F. iSAM2: Incremental smoothing and mapping using the Bayes tree. Int. J. Robot. Res. 2012, 31, 216–235. [Google Scholar] [CrossRef]
Zhang, Z.; Scaramuzza, D. A tutorial on quantitative trajectory evaluation for visual (-inertial) odometry. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 7244–7251. [Google Scholar]
Shan, T.; Englot, B. Lego-loam: Lightweight and ground-optimized lidar odometry and mapping on variable terrain. In Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Madrid, Spain, 1–5 October 2018; pp. 4758–4765. [Google Scholar]
Lim, H.; Han, K.; Shin, G.; Kim, G.; Hong, S.; Myung, H. Orora: Outlier-robust radar odometry. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 2046–2053. [Google Scholar]
Kim, G.; Park, Y.S.; Cho, Y.; Jeong, J.; Kim, A. Mulran: Multimodal range dataset for urban place recognition. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 6246–6253. [Google Scholar]
Gadd, M.; De Martini, D.; Bartlett, O.; Murcutt, P.; Towlson, M.; Widojo, M.; Muşat, V.; Robinson, L.; Panagiotaki, E.; Pramatarov, G.; et al. Oord: The oxford offroad radar dataset. IEEE Trans. Intell. Transp. Syst. 2024, 25, 18779–18790. [Google Scholar] [CrossRef]

Figure 1. An overview of our system, which takes millimeter-wave radar images and OpenStreetMap as inputs, achieves continuous and globally consistent state estimation through radar odometry, trajectory-to-map association, loop closure detection, and back-end optimization. The colored arrows represent different constraints fed into the pose graph: cyan for odometry poses, green for loop constraints, and yellow for map matching constraints.

Figure 2. The difference between two adjacent raw radar data. The red boxes highlight areas with significant feature differences between frames.

Figure 3. The process of generating a temporal density context and the corresponding ringkey based on short-term enhancement. Point clouds of different colors correspond to distinct radar frames, while the yellow and green regions represent the partitioning based on azimuth and range, respectively.

Figure 4. The principle of loop outlier removal based on long-term enhancement. The blue and orange gradient regions correspond to the historical (near frame i) and current (near frame j) sequences, respectively. The time axis progresses from left to right, with darker shades signifying closer temporal proximity and higher similarity.

Figure 5. The structure of the pose graph optimization with adaptive map factors. The blue and green circles represent robot states and radar measurements, respectively. Colored dots indicate specific constraints added to the graph: yellow for map matching, green for odometry, and cyan for loop closures.

Figure 6. Our platform for real-world experiments.

Figure 7. Real-world GNSS-challenged scenarios used for evaluating spatial-temporal enhancements on radar SLAM system. The small triangles and associated images indicate the viewing direction and the corresponding real-world scenes, respectively, illustrating the level of canopy occlusion. (a) The Starlake building (sparse canopy cover). (b) The Youyi Square (dense and sparse canopy cover).

Figure 8. Visualization of the DCC01 sequence overlaid on a satellite map background. The yellow line represents the vehicle trajectory obtained by our method, while the colored points denote the mmWave feature point cloud. Point colors correspond to echo intensity, with warmer colors (towards red) indicating stronger returns and cooler colors indicating weaker returns.

Figure 9. GNSS position standard deviation in X, Y, and Z directions over time for the DCC01 sequence. Periods of increased variance correspond to degraded GNSS precision.

Figure 10. Trajectories on Mulran Dataset generated from methods including ground truth, Yeti, sc-lego-loam, sc-orora and our method for qualitative evaluation. The comparisons are performed on sequences: (a) DCC01, (b) DCC02, (c) DCC03, (d) KAIST01, (e) KAIST02, (f) KAIST03, (g) Riverside01, (h) Riverside02, (i) Riverside03.

Figure 11. Trajectory comparison under dense forest canopy in the Twolochs region (OORD dataset), including our method (blue solid line), GNSS data with outlier removal (orange solid line), and raw GNSS data (dashed line).

Figure 12. The trajectory of our system in two scenes in our real world experiments. The yellow line represents the estimated trajectory. The mmWave feature points are colored by echo intensity, with warmer colors (red) indicating stronger returns. (a) The Starlake building. (b) The Youyi Square.

Figure 13. The ATE results for evaluating the contribution of our proposed enhancement on the global accuracy.

Figure 14. Precision-Recall evaluation for three scenes in the ablation study. Loop detection accuracy decreases when short-term and long-term enhancements are removed. (a) Precision–Recall Analysis for DCC01. (b) Precision–Recall Analysis for KAIST01. (c) Precision–Recall Analysis for Riverside01.

Figure 15. Computation time breakdown per frame for major modules: millimeter-wave radar odometry (blue), loop closure detection (orange), and back-end pose graph optimization (yellow).

Figure 16. Visualization of feature points and Temporal Density Context (TDC) for two adjacent frames before and after short-term loop-closure enhancement.The red and blue squares highlight corresponding regions of drastic inter-frame feature fluctuation in the point clouds and descriptors, respectively, demonstrating the improved stability.

Figure 17. Comparison of the effects of long-term loop closure enhancement on forward and reverse loops before and after enhancement, showing significantly increased similarity for true forward and reverse loops compared with surrounding non-loop regions, thereby improving loop-closure robustness.

Table 1. Absolute trajectory error (ATE) of different systems on the MulRan Dataset. All values are root mean square error (RMSE) in meters (m). The symbols “-” and “NA” denote cases where data was not provided in the original paper and where the system failed to complete the sequence, respectively. In the latter case, this indicates that the error exceeded 100 m. Bold values indicate the best accuracy. Pure GNSS errors are shown in italics to illustrate signal fluctuations in these scenarios, noting that neither the proposed method nor the comparison methods utilize GNSS as an input.

Method	Sequence
Method	DCC01	DCC02	DCC03	KAIST01	KAIST02	KAIST03	RIV01	RIV02	RIV03
SC-LeGO-LOAM	29.34 m	9.70 m	5.29 m	49.27 m	32.10 m	31.16 m	NA	96.46 m	NA
Yeti	46.31 m	18.89 m	37.89 m	84.01 m	42.92 m	6.08 m	94.36 m	18.63 m	21.75 m
RadarSLAM	12.89 m	9.88 m	3.92 m	6.87 m	6.03 m	4.11 m	9.03 m	7.05 m	10.74 m
SC-ORORA	22.58 m	24.34 m	17.61 m	4.94 m	4.06 m	2.14 m	13.11 m	18.37 m	12.30 m
LSRL	-	-	-	7.00 m	6.00 m	6.20 m	6.70 m	5.70 m	NA
pure GNSS	7.44 m	10.61 m	3.60 m	2.83 m	2.76 m	3.37 m	3.85 m	3.01 m	2.57 m
OSM	7.00 m	9.54 m	4.79 m	5.98 m	7.00 m	7.40 m	12.97 m	11.41 m	10.90 m
ours	6.12 m	8.68 m	7.19 m	4.39 m	3.39 m	1.83 m	5.00 m	3.91 m	12.93 m

Table 2. Absolute trajectory error (ATE) of our system on real world dataset. All values are given in meters (m).

The Starlake Building	The Youyi Square
0.55 m	2.08 m

Table 3. Sensitivity of trajectory accuracy to map-matching geometry errors (DCC01 sequence). All values are in meters (m).

Noise Level	ATE (m)
No noise	6.41
+5 m random noise	10.60
+10 m random noise	10.30
+20 m random noise	11.10
+50 m random noise	10.11

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Article Metrics

Citations

Article Access Statistics

Journal Statistics

Article metric data becomes available approximately 24 hours after publication online.

Radar and OpenStreetMap-Aided Consistent Trajectory Estimation in Canopy-Occluded Environments

Highlights

Abstract

1. Introduction

2. Related Works

2.1. Radar-Based Localization and Mapping

2.2. Radar Loop Closure Detection

2.3. Pose Graph Optimization with Prior Map Integration

3. Methods

3.1. Radar Odometry

3.2. Temporal Enhanced Loop Closure Detection

3.2.1. Short-Term Enhancement for Temporal Density Context Generation

3.2.2. Long-Term Enhancement for Loop Outlier Removal

3.3. Pose Graph Optimization with Adaptive Map Factors

3.3.1. Adaptive Map Factors

3.3.2. Loop Closure Factors

4. Results

4.1. Experimental Setup

4.2. Trajectory Evaluation

4.3. Real World Experiments

4.4. Ablation and Real-Time Performance Analysis

4.4.1. Short-Term Enhancement

4.4.2. Long-Term Enhancement

4.4.3. Adaptive Map Factors

5. Discussion

5.1. Analysis of Trajectory Accuracy and Consistency

5.2. Generalization in Complex Real-World Scenarios

5.3. Impact of Modules and Computational Efficiency

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics