Article

Stable and Smooth Trajectory Optimization for Autonomous Ground Vehicles via Halton-Sampling-Based MPPI

1 College of Intelligence Science and Technology, National University of Defense Technology, Sanyi Avenue, Changsha 410007, China
2 Test Center, National University of Defense Technology, Xi’an 710100, China
* Authors to whom correspondence should be addressed.
Drones 2026, 10(2), 96; https://doi.org/10.3390/drones10020096
Submission received: 29 December 2025 / Revised: 24 January 2026 / Accepted: 27 January 2026 / Published: 29 January 2026
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)

Highlights

What are the main findings?
  • A Halton-sampling-based MPPI trajectory optimization framework is proposed for unmanned systems. By integrating low-discrepancy Halton sequences with temporally correlated Ornstein–Uhlenbeck (OU) perturbations, structured sampling constraints are introduced across both the control space and the temporal domain, leading to significantly improved stability and smoothness of trajectories and control sequences.
  • Extensive simulations on the BARN benchmark dataset demonstrate that the proposed method outperforms several representative MPPI variants in terms of trajectory smoothness and control smoothness while maintaining high navigation success rates and real-time computational performance; further validation through real-world experiments in outdoor environments confirms its robustness and engineering applicability.
What are the implications of the main findings?
  • The proposed Halton–OU sampling mechanism provides a generally applicable perturbation modeling and sampling optimization paradigm for sampling-based model predictive control in unmanned systems, effectively reducing sampling variance and control jitter without introducing additional computational burden, thereby enhancing optimization stability and convergence consistency.
  • Halton-MPPI can serve as a general local trajectory optimization module that can be directly integrated into multi-sensor fusion-based perception and navigation frameworks, offering a valuable technical pathway for enabling safe, smooth, and efficient autonomous motion of autonomous ground vehicles in complex environments.

Abstract

Achieving safe and stable navigation for autonomous ground vehicles (AGVs) in complex environments remains a key challenge in intelligent robotics. Conventional Model Predictive Path Integral (MPPI) control relies on pseudo-random Gaussian sampling, which often results in non-uniform sample distributions and jitter-prone control sequences, thereby limiting both convergence efficiency and control stability. This paper proposes a trajectory optimization method: Halton-MPPI, which improves MPPI by employing low-discrepancy sampling and modeling temporally correlated perturbations. Specifically, it utilizes the Halton sequence as the sampling basis for control disturbances to enhance spatial coverage, while the Ornstein–Uhlenbeck (OU) process is introduced to impose temporal correlation on control perturbations. This time-consistent noise propagation allows perturbation effects to accumulate over time, thereby expanding trajectory coverage. Large-scale simulations on the BARN dataset demonstrate that the method significantly enhances both trajectory smoothness (MSCX) and control smoothness (MSCU) while maintaining high success rates. Moreover, field tests in outdoor environments validate the effectiveness and robustness of Halton-MPPI, underscoring its practical value for autonomous navigation in complex environments.

1. Introduction

Achieving safe, reliable, and stable control of unmanned autonomous systems in complex and unknown environments remains a fundamental challenge in intelligent robotics research [1]. With the rapid advancement of large AI models, recent studies have investigated their potential in trajectory generation and autonomous decision-making for unmanned systems, particularly in the context of the low-altitude economy and large-scale intelligent deployment [2,3]. These developments underscore the increasing demand for robust, adaptive, and real-time motion planning and control frameworks across diverse autonomous platforms.
In high-density scenarios such as forests, narrow corridors, and industrial warehouses, autonomous ground vehicles are required to exhibit fully autonomous navigation capabilities. Specifically, they must not only avoid collisions in the presence of obstacles but also escape local minima while maintaining effective convergence toward target states under strict system constraints [4,5]. This process demands accurate perception of the surrounding environment and timely responses to different conditions. Consequently, trajectory optimization and control in such settings constitute inherently high-dimensional, nonlinear, and constraint-intensive real-time optimization problems, which are difficult to solve directly using traditional methods [6].
In recent years, sampling-based trajectory optimization methods, such as Model Predictive Path Integral (MPPI) control, have emerged as effective approaches for addressing the aforementioned challenges. This effectiveness stems primarily from their gradient-free nature, their ability to naturally accommodate nonlinear system dynamics and complex cost functions, and their inherent suitability for parallel computation [7,8]. However, the performance of MPPI control is highly sensitive to the quality of the sampling distribution. Conventional independent and identically distributed Gaussian sampling often suffers from clustering and void effects, where samples are excessively concentrated in certain regions while other regions remain underexplored [9]. This phenomenon not only degrades the accuracy of gradient-related cost estimates but also leads to slow convergence or even unstable optimization behavior, particularly under limited computational budgets or stringent real-time constraints [10].
Moreover, purely stochastic sampling often induces control chattering, which may render the resulting control sequences dynamically infeasible when deployed on physical robotic systems. To mitigate these limitations, prior studies have explored a range of enhancement strategies, including control signal smoothing [11], the integration of auxiliary control modules [12], and the design of improved sampling frameworks [13]. Additionally, several works have sought to reduce the proportion of infeasible samples by constructing more informative prior distributions [14,15,16]. In [17], a hybrid trajectory optimization method was proposed specifically for generating collision-free and smooth trajectories for autonomous mobile robots, which synergistically combines the MPPI algorithm with the gradient-based interior-point differential dynamic programming (IPDDP) approach. Nevertheless, despite these advances, none of the aforementioned strategies fundamentally address temporal smoothness in the generated control sequences—a critical bottleneck that continues to hinder their reliable application in practical scenarios.
Another line of research focuses on sampling in the geometric parameter space. Higgins et al. [18] and Lambert et al. [19] generated smooth trajectories by sampling spline curve parameters. Similarly, Miura et al. [20] enhanced MPPI control by integrating spline interpolation with Stein variational gradient descent, thereby enabling the generation of smooth control inputs even under high sampling noise in reactive navigation scenarios. Alternatively, kernel-based methods have been employed to directly search for smooth geometric paths [21]. Although spline-based parameterization methods inherently guarantee trajectory smoothness, they reformulate the optimization problem from the control space into a geometric parameter space. This transformation significantly increases the complexity of the mapping between geometry and control inputs, resulting in higher computational overhead and limited applicability in real-time settings.
It is worth noting that the manner in which perturbations are sampled from the policy’s Gaussian distribution plays a critical role in the convergence behavior of MPPI control and can be exploited to explicitly embed desirable properties, such as smoothness, into the control policy. In contrast to Monte Carlo methods based on pseudo-random sampling, which are prone to sample clustering, low-discrepancy sequences provide more uniform coverage of the sampling space, thereby significantly reducing variance and improving sampling efficiency. Among these approaches, the Halton sequence [22], a widely used low-discrepancy sequence generator, has demonstrated excellent uniformity in high-dimensional spaces and a strong ability to accelerate Monte Carlo convergence [23,24,25].
To address this issue, this paper proposes a low-discrepancy sampling-based MPPI framework, termed Halton-MPPI, which incorporates an Ornstein–Uhlenbeck (OU) process to model temporal correlations in control perturbations [26]. Unlike existing quasi-Monte Carlo MPPI variants, which primarily improve spatial sampling uniformity, or correlated-noise MPPI approaches, which address temporal smoothness via pseudo-random perturbations, the proposed Halton-MPPI framework unifies these two aspects within a single perturbation generation mechanism, enabling spatially uniform samples to evolve coherently over time along the prediction horizon. Specifically, low-discrepancy Halton sequences are first generated in the control dimension and transformed into standard Gaussian perturbation bases via an inverse normal transformation, after which temporal coherence is enforced through OU-based propagation. By explicitly structuring the sampling process, the proposed framework jointly enforces spatial coverage and temporal continuity at the perturbation level, thereby improving optimization stability and trajectory smoothness while preserving the original control-space formulation and computational efficiency of MPPI control, without resorting to spline reparameterization, kernelized path representations, or post hoc smoothing operations.
In summary, this paper proposes Halton-MPPI, a trajectory optimization framework that enhances the classical MPPI method through structured control perturbation sampling. By leveraging the spatial uniformity of low-discrepancy Halton sequences and the temporal continuity induced by the Ornstein–Uhlenbeck process, the proposed approach achieves more stable convergence behavior and generates smoother, physically feasible trajectories under real-time constraints. The remainder of this paper is organized as follows: Section 2 reviews the classical MPPI framework and introduces the theoretical foundations of low-discrepancy Halton sequences and Ornstein–Uhlenbeck temporal modeling. Section 3 details the proposed Halton-MPPI algorithm, including the control perturbation sampling mechanism and the overall optimization strategy. Section 4 presents comprehensive simulation results and real-world experimental evaluations on autonomous ground vehicles, followed by quantitative comparisons with existing MPPI-based methods. Finally, Section 5 discusses the experimental findings and concludes the paper with directions for future research.

2. Materials

2.1. MPPI Framework

Model Predictive Path Integral (MPPI) [7] control is a class of sampling-based, gradient-free trajectory optimization methods enabling real-time optimal control for complex nonlinear and non-convex systems. Its core principle lies in perturbing the control sequence and performing parallel trajectory rollouts, followed by a cost-weighted update of the control using an expectation-based formulation, thereby circumventing the limitations of traditional MPC approaches that rely on explicit gradient computations [27]. MPPI control does not require differentiability or convexity assumptions on either the objective function or the system dynamics, making it particularly well suited for scenarios involving complex constraints and dynamic obstacles.
Consider a discrete-time system with state $x_k \in \mathbb{R}^n$ and control input $u_k \in \mathbb{R}^m$. The system dynamics are described as follows [11]:
$$x_{k+1} = f(x_k, v_k), \qquad v_k = u_k + \delta u_k,$$
where $\delta u_k \sim \mathcal{N}(0, \Sigma_u)$ denotes a zero-mean Gaussian noise injection with covariance $\Sigma_u$. Over a prediction horizon of length $N$, the control sequence and the corresponding state sequence are defined as follows:
$$U = (u_0, u_1, \ldots, u_{N-1}) \in \mathbb{R}^{m \times N}, \qquad X = (x_0, x_1, \ldots, x_{N-1}) \in \mathbb{R}^{n \times N}.$$
Within this framework, the optimization objective is to determine a control sequence $U$ that enables the robot to safely transition from the initial state $x_s$ to the target state $x_f$, while minimizing the following expected cost function:
$$J = \mathbb{E}\left[\phi(x_N) + \sum_{k=0}^{N-1}\left(q(x_k) + \frac{1}{2}\, u_k^{\top} R\, u_k\right)\right],$$
where $\phi(x_N)$ denotes the terminal cost, $q(x_k)$ represents the state-dependent stage cost, and $R \in \mathbb{R}^{m \times m}$ is a positive definite control weighting matrix. The constraints include the system dynamics $x_{k+1} = f(x_k, v_k)$, the collision avoidance condition $\mathcal{O}_{\mathrm{rob}}(x_k) \cap \mathcal{O}_{\mathrm{obs}} = \emptyset$, and state and input constraints expressed as $h(x_k, u_k) \le 0$.
To solve the above optimization problem, MPPI performs parallel sampling of $M$ perturbed trajectories at each time step and updates the control sequence using cost-based weighting. The cumulative cost-to-go of a trajectory $\tau$ is defined as follows:
$$\tilde{S}(\tau) = \phi(x_N) + \sum_{k=0}^{N-1} \tilde{q}(x_k, u_k, \delta u_k),$$
where the instantaneous cost is given by
$$\tilde{q}(x_k, u_k, \delta u_k) = q(x_k) + \frac{1 - \nu^{-1}}{2}\, \delta u_k^{\top} R\, \delta u_k + u_k^{\top} R\, \delta u_k + \frac{1}{2}\, u_k^{\top} R\, u_k,$$
and $\nu \in \mathbb{R}^{+}$ denotes the exploration parameter that regulates the spread of the perturbation distribution.
After the trajectory costs are evaluated, the control input is updated via a weighted averaging scheme:
$$u_k \leftarrow u_k + \frac{\sum_{m=1}^{M} \exp\!\left(-\frac{1}{\lambda}\, \tilde{S}(\tau_{k,m})\right) \delta u_{k,m}}{\sum_{m=1}^{M} \exp\!\left(-\frac{1}{\lambda}\, \tilde{S}(\tau_{k,m})\right)},$$
where $\lambda \in \mathbb{R}^{+}$ is the inverse temperature parameter that determines the relative contribution of low-cost trajectories in the update, thereby controlling the selectivity and stability of the policy update. The update rule in (6) minimizes the Kullback–Leibler divergence [28], and the resulting optimal control distribution is characterized by a Boltzmann distribution.
Trajectory rollouts in MPPI control are mutually independent and do not rely on gradient computations or iterative solution procedures. This flexibility allows MPPI control to accommodate arbitrary predictive models and cost formulations, making it particularly well suited for navigation in non-convex environments. The first control input u 0 is directly applied to the system, while the remaining controls are used as the initial solution for the subsequent optimization step via a warm-start strategy. This receding-horizon execution scheme ensures both real-time performance and efficient convergence of the algorithm.
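To make the cost-weighted update concrete, the following Python sketch implements the weighted averaging of Equation (6) for a batch of sampled perturbations. The function name, array shapes, and the subtraction of the minimum cost (a standard numerical-stability trick) are illustrative choices, not part of the paper's formulation.

```python
import numpy as np

def mppi_update(U, dU, costs, lam=1.0):
    """Cost-weighted MPPI control update in the spirit of Eq. (6).

    U     : (T, m) nominal control sequence
    dU    : (M, T, m) sampled control perturbations
    costs : (M,) rollout costs S~(tau_m)
    lam   : inverse-temperature parameter lambda
    """
    # Subtracting the minimum cost leaves the weights unchanged but avoids
    # underflow in the exponentials.
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    # Weighted average of the perturbations, added to the nominal controls.
    return U + np.einsum("m,mtc->tc", w, dU)
```

With equal costs the weights are uniform, so the update reduces to adding the mean perturbation to the nominal sequence.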

2.2. Halton Sequence

In sampling-based optimization methods, the uniformity of the sampling distribution directly influences the coverage of the search space and the convergence performance. Conventional Monte Carlo methods rely on independent and identically distributed pseudo-random samples, whose error convergence rate is typically $O(N^{-1/2})$, where $N$ denotes the number of samples [29]. In contrast, low-discrepancy sequences enforce a more uniform distribution of sampling points, enabling the convergence rate to approach $O(N^{-1})$ [22]. Owing to this favorable property, low-discrepancy sequences have been widely adopted in numerical integration, global optimization, and trajectory planning.
Halton sequences can effectively avoid the clustering and void phenomena commonly observed in pseudo-random sampling, thereby substantially improving the efficiency of integration and sampling in high-dimensional spaces [23]. This property is particularly important for trajectory optimization and control problems that require a large number of samples. Leveraging these characteristics, integrating Halton sequences into the trajectory sampling process of MPPI control can effectively reduce sample variance and enhance the stability of the resulting control policy, enabling more efficient and reliable convergence under limited sampling budgets.

2.3. Ornstein–Uhlenbeck Process for Correlated Noise

Although Halton sequences exhibit favorable low-discrepancy properties in the spatial domain, they do not inherently enforce continuity in the temporal dimension. In sampling-based control frameworks such as MPPI control, control perturbations are applied sequentially over time. If these perturbations vary independently across consecutive time steps, the resulting control sequence may exhibit abrupt changes, which can lead to oscillatory behavior and degraded execution stability in continuous-time control systems [30]. This limitation motivates the introduction of an explicit mechanism for temporal smoothing.
In this work, temporal smoothing is achieved by introducing an Ornstein–Uhlenbeck (OU) process [31] to model temporal correlations in the control perturbations. Specifically, an independent Halton-based noise term $\varepsilon_t$ is first generated at each time step, and the OU process is then used to propagate this noise across time, producing a temporally correlated perturbation sequence. The discrete-time formulation of the OU process is given by
$$\epsilon_{t+1} = \rho\, \epsilon_t + \sqrt{1 - \rho^2}\, \varepsilon_t,$$
where $\rho = e^{-\Delta t/\tau}$ is the temporal correlation coefficient, $\tau$ denotes the noise decay time constant, and $\varepsilon_t$ represents an independent Halton-based noise term. By adjusting $\rho$, the degree of temporal smoothness of the perturbations can be controlled: larger values of $\rho$ yield smoother control inputs over time, whereas smaller values result in behavior closer to that of white noise.
In summary, Halton sequences provide spatially uniform, low-discrepancy sampling that enhances exploration efficiency in the control space, while the OU process introduces structured temporal correlations that promote smooth and dynamically consistent control evolution. Their integration yields a Halton–OU sampling mechanism that effectively balances exploration and exploitation, enabling both broad trajectory coverage and smooth control execution. As a result, the proposed approach improves the stability, robustness, and convergence efficiency of the MPPI algorithm in complex environments.
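A minimal Python sketch of the discrete OU recursion above, assuming unit-variance driving noise; the function name and array shapes are our own illustrative choices.

```python
import numpy as np

def ou_correlate(eps, rho):
    """Propagate i.i.d. noise eps of shape (T, m) through the discrete OU
    recursion out[t] = rho * out[t-1] + sqrt(1 - rho^2) * eps[t].
    The sqrt(1 - rho^2) factor keeps the marginal variance at 1."""
    out = np.empty_like(eps)
    out[0] = eps[0]
    c = np.sqrt(1.0 - rho ** 2)
    for t in range(1, len(eps)):
        out[t] = rho * out[t - 1] + c * eps[t]
    return out
```

As $\rho \to 1$ the step-to-step variation of the output shrinks, while $\rho = 0$ reproduces the white-noise input unchanged.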

3. Methods

To improve the sampling efficiency and stability of classical MPPI control under limited sampling budgets, this section introduces a structured sampling strategy that integrates low-discrepancy Halton sequences with a temporally correlated noise model, resulting in the proposed Halton–MPPI sampling strategy. Unlike conventional Gaussian perturbation-based sampling, Halton-MPPI employs low-discrepancy samples with more uniform coverage of the control space, while the OU process enforces smooth temporal correlations. This design effectively reduces estimation variance and enhances optimization convergence under a limited sampling budget. The overall algorithmic workflow is illustrated in Figure 1.

3.1. Perturbation Sampling Mechanism Based on Halton Sequences and OU Process

To realize this strategy, the proposed method integrates low-discrepancy Halton sequences for spatial exploration with an OU process that models temporal correlations across the planning horizon; the four steps below describe the perturbation construction in detail. The overall algorithmic workflow is illustrated in Figure 1.
1. Halton Low-Discrepancy Sampling
For a given dimension $s$ and the $i$-th sample, a Halton sequence is constructed using the radical-inverse function with respect to prime bases. Specifically, the integer $i$ is first represented in base $p$ (a prime number) as follows:
$$i = d_0 + d_1 p + d_2 p^2 + \cdots + d_m p^m,$$
where $d_j \in \{0, 1, \ldots, p-1\}$. The corresponding radical-inverse function is then defined as follows:
$$\phi_p(i) = \frac{d_0}{p} + \frac{d_1}{p^2} + \frac{d_2}{p^3} + \cdots.$$
In the multidimensional case, the first $s$ prime numbers $\{p_1, \ldots, p_s\}$ are selected, and the $i$-th Halton sample is given by
$$x_i = \left(\phi_{p_1}(i), \phi_{p_2}(i), \ldots, \phi_{p_s}(i)\right), \qquad i = 1, 2, \ldots, N.$$
The resulting point set exhibits low discrepancy over the unit hypercube $[0, 1]^s$, providing more uniform coverage than pseudo-random sampling with the same number of samples [32]. The pseudocode of the Halton sequence generation algorithm is presented in Algorithm 1. In Halton–MPPI, this property is exploited to improve the spatial distribution of control perturbations.
Algorithm 1 Halton Sequence Generation
Require: Number of samples $N$; dimension $s$; first $s$ prime bases $\{p_1, p_2, \ldots, p_s\}$
Ensure: Halton sequence $\{x_i\}_{i=1}^{N}$, where $x_i \in [0,1]^s$
 1: for $i = 1$ to $N$ do
 2:   for $j = 1$ to $s$ do
 3:     $n \leftarrow i$
 4:     $f \leftarrow 1$
 5:     $r \leftarrow 0$
 6:     while $n > 0$ do
 7:       $f \leftarrow f / p_j$
 8:       $r \leftarrow r + f \cdot (n \bmod p_j)$
 9:       $n \leftarrow \lfloor n / p_j \rfloor$
10:     end while
11:     $x_i[j] \leftarrow r$
12:   end for
13: end for
14: return $\{x_i\}_{i=1}^{N}$
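Algorithm 1 translates almost directly into Python. The sketch below is an illustrative, self-contained implementation of the radical-inverse construction; the function names and the default prime bases are our own choices.

```python
def radical_inverse(i, base):
    """Radical inverse of integer i >= 1 in a prime base (Algorithm 1 inner loop)."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

def halton_sequence(n, primes=(2, 3)):
    """First n Halton points in len(primes) dimensions, each in [0, 1)."""
    return [[radical_inverse(i, p) for p in primes] for i in range(1, n + 1)]
```

For base 2 the first points are 0.5, 0.25, 0.75, 0.125, ..., illustrating how each new point falls into the largest remaining gap.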
2. Mapping to Gaussian Perturbation Space
To maintain consistency with the standard MPPI formulation and its Gaussian-based importance weighting, each Halton point $z_i \in [0,1]^s$ is transformed using the inverse cumulative distribution function of the standard normal distribution:
$$\tilde{\epsilon}_i = \Phi^{-1}(z_i), \qquad i = 1, \ldots, N,$$
where $\Phi^{-1}(\cdot)$ is applied component-wise. This transformation preserves the low-discrepancy structure across samples while ensuring marginal distributions consistent with $\mathcal{N}(0, I)$, making the samples compatible with the MPPI formulation.
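As a concrete illustration of this mapping, a minimal sketch using Python's standard-library NormalDist (the function name and list-based representation are our own). Halton coordinates never take the values 0 or 1 exactly, so the inverse CDF is always well defined.

```python
from statistics import NormalDist

def to_gaussian(points):
    """Map Halton points in the open unit cube to standard-normal space via
    the component-wise inverse CDF Phi^{-1}."""
    inv = NormalDist().inv_cdf
    return [[inv(z) for z in point] for point in points]
```

The median point 0.5 maps to 0, and the transform is antisymmetric about it, preserving the evenly spaced structure of the Halton points in the Gaussian tails.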
3. OU-Based Temporal Correlation for Control Smoothness
While Halton sampling improves spatial coverage, it does not impose temporal structure across successive control inputs. To promote smooth control evolution, temporally correlated perturbations are generated using an OU process:
$$\epsilon_t = \rho\, \epsilon_{t-1} + \sqrt{1 - \rho^2}\, \tilde{\epsilon}_t,$$
where $\rho \in [0, 1)$ denotes the temporal correlation coefficient and $\tilde{\epsilon}_t$ is the Halton-based Gaussian sample at time step $t$. This recursive formulation introduces smooth temporal correlations while retaining the spatial uniformity inherited from the Halton-based Gaussian samples.
4. Affine Transformation for Policy Matching
At each iteration, the temporally correlated perturbations are aligned with the current control policy through an affine transformation:
$$\delta u_{i,t} = \mu_t + \sigma_t\, \epsilon_{i,t},$$
where $\mu_t$ and $\sigma_t$ denote the mean and standard deviation of the control distribution at time step $t$, respectively. This step follows the standard MPPI formulation and ensures compatibility with control constraints and system dynamics.
By integrating Halton low-discrepancy sampling with the OU-based temporal recursion, the proposed Halton–MPPI framework constructs a structured sampling distribution across both the control space and the temporal dimension. Specifically, the Halton sequence ensures uniform coverage of control perturbations in the control space, while the Ornstein–Uhlenbeck process introduces structured temporal correlations that suppress abrupt variations between successive control inputs and enable directionally consistent perturbations to accumulate over time. The overall Halton–MPPI algorithm is summarized in Algorithm 2. Following the standard formulation of MPPI control and its variants, this work adopts the same underlying theoretical assumptions commonly used in sampling-based path integral control, including finite-horizon receding execution, stochastic approximation via Monte Carlo sampling, and optimization over locally perturbed control sequences. The proposed modifications do not alter the fundamental MPPI optimization principle but instead improve the statistical efficiency and temporal consistency of the sampling process under limited computational budgets.

3.2. Halton-MPPI Strategy

In this work, we propose Halton-MPPI, a low-discrepancy sampling enhancement of the classical Model Predictive Path Integral (MPPI) algorithm [7], with the objective of improving sampling uniformity and convergence stability in trajectory optimization.
In this section, to empirically validate the effectiveness of Halton-based low-discrepancy sampling within the MPPI framework, we compare the statistical characteristics and spatial distributions of standard Gaussian sampling and Halton–Gaussian sampling. Figure 2 illustrates the marginal histograms and kernel density estimates of the probability density functions (PDFs) for 1000 samples generated by the two sampling strategies. Specifically, Figure 2a corresponds to conventional Gaussian sampling $X \sim \mathcal{N}(0, 1)$, while Figure 2b shows the distribution of Halton–Gaussian samples $Y \sim \mathcal{N}(0, 1)$.
As shown in the results, the Halton–Gaussian samples exhibit a noticeably smoother histogram and a kernel density curve that more closely matches the theoretical Gaussian distribution. The prominent spike observed in the histogram of Figure 2a is primarily attributable to the finite-sample nature of pseudo-random Gaussian sampling. Under a limited sampling budget, independent and identically distributed Gaussian samples tend to exhibit local clustering effects, where multiple samples fall into a narrow interval near the mean, leading to overrepresented bins and pronounced spikes in the histogram. This phenomenon reflects the inherent variance and irregular spatial coverage of pseudo-random sampling, rather than a deviation from the underlying Gaussian distribution. This indicates that, under a finite sample budget, low-discrepancy sequences provide more uniform coverage of the sampling space, thereby yielding improved statistical consistency compared to pseudo-random Gaussian sampling.
Algorithm 2 Halton–MPPI Algorithm
Require: Initial state $x_0$; control dimension $d_u$; horizon $T$; number of samples $N$; control covariance $\sigma_u$; temperature parameter $\lambda$; temporal smoothing coefficient $\rho$
Ensure: Optimal control sequence $U^* = \{u_0^*, \ldots, u_{T-1}^*\}$
 1: Initialize the nominal control sequence $U = \{u_0, \ldots, u_{T-1}\}$
 2: Generate Halton sequences as low-discrepancy perturbation bases $z_i \in [0,1]^{d_u \times T}$, $i = 1, \ldots, N$
 3: Map $z_i$ to standard Gaussian noise $\tilde{\epsilon}_i \sim \mathcal{N}(0, I)$ via the inverse CDF transformation
 4: for $i = 1$ to $N$ do
 5:   Initialize the OU noise $\epsilon_{i,0} = \tilde{\epsilon}_{i,0}$
 6:   for $t = 1$ to $T - 1$ do
 7:     $\epsilon_{i,t} = \rho\, \epsilon_{i,t-1} + \sqrt{1 - \rho^2}\, \tilde{\epsilon}_{i,t}$
 8:   end for
 9:   Construct the temporally correlated perturbation sequence $\epsilon_i = \{\epsilon_{i,0}, \ldots, \epsilon_{i,T-1}\}$
10:   Generate the control sequence $U_i = U + \sigma_u\, \epsilon_i$
11:   Roll out the system dynamics to obtain the trajectory $X_i = f(x_0, U_i)$
12:   Evaluate the trajectory cost $S_i = \phi(x_T^i) + \sum_{t=0}^{T-1} q(x_t^i, u_t^i)$
13: end for
14: Compute the normalized importance weights $w_i = \exp\!\left(-\frac{1}{\lambda} S_i\right) \big/ \sum_{j=1}^{N} \exp\!\left(-\frac{1}{\lambda} S_j\right)$
15: Update the control sequence $U \leftarrow U + \sum_{i=1}^{N} w_i\, \epsilon_i$
16: return the optimal control sequence $U^* = U$
To further assess the Gaussianity of the sampled perturbations, Q–Q plots are presented in Figure 3. In these plots, closer alignment of the sample points with the red reference line indicates stronger agreement with the standard normal distribution. Figure 3a shows the Q–Q plot obtained using conventional pseudo-random Gaussian sampling, while Figure 3b corresponds to the Halton–Gaussian samples. A direct comparison reveals that the Halton–Gaussian samples exhibit a more uniform alignment along the reference line, indicating reduced deviation from the ideal Gaussian distribution. These results further demonstrate that low-discrepancy sampling effectively mitigates the randomness-induced variability typically introduced by pseudo-random sampling.
To quantitatively evaluate the low-discrepancy property and assess the uniformity of the sample distributions, we compute the star discrepancy, a widely used metric for measuring the deviation of a point set from an ideal uniform distribution. Given a $d$-dimensional point set $P_N = \{x_1, x_2, \ldots, x_N\}$ with $x_i \in [0,1]^d$, the star discrepancy is defined as follows:
$$D_N^*(P_N) = \sup_{u \in [0,1]^d} \left| \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\{x_i \in [0, u)\} - \prod_{j=1}^{d} u_j \right|,$$
where $\mathbf{1}\{\cdot\}$ denotes the indicator function.
This metric quantifies the maximum deviation between the empirical distribution of sample points within any origin-anchored hyper-rectangle $[0, u)$ and the corresponding theoretical volume. A smaller value of $D_N^*$ indicates that the sample distribution is closer to uniform. In this study, Gaussian samples are first mapped to the uniform space via their cumulative distribution function (CDF), after which the star discrepancy $D_N^*$ is computed to compare the spatial uniformity of conventional Gaussian sampling and Halton–Gaussian sampling. The standard Gaussian samples yield a star discrepancy of $D_N^* = 0.0380$, whereas the Halton–Gaussian samples achieve a substantially lower value of $D_N^* = 0.0032$.
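Computing the exact star discrepancy is expensive in higher dimensions, so implementations typically restrict the supremum to a finite set of candidate boxes. The sketch below (our own illustration, not the paper's computation) evaluates only boxes whose upper corners are drawn from the sample coordinates, checking both the open- and closed-box counts; this is exact in one dimension and a lower bound in general.

```python
import itertools
import numpy as np

def star_discrepancy_approx(points):
    """Approximate D_N^* over boxes [0, u) with corners at sample coordinates
    (plus 1.0). Exact in 1-D for distinct points; a lower bound otherwise."""
    pts = np.asarray(points, dtype=float)
    n, d = pts.shape
    coords = [np.unique(np.append(pts[:, j], 1.0)) for j in range(d)]
    worst = 0.0
    for u in itertools.product(*coords):
        u = np.asarray(u)
        vol = np.prod(u)
        open_mass = np.all(pts < u, axis=1).mean()    # points strictly inside [0, u)
        closed_mass = np.all(pts <= u, axis=1).mean() # limit from above the corner
        worst = max(worst, abs(open_mass - vol), abs(closed_mass - vol))
    return worst
```

For example, a single 1-D point at 0.5 has discrepancy 0.5, while the two-point set {0.25, 0.75} achieves the optimal value 0.25.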
These results quantitatively demonstrate that Halton-based sampling provides a significantly more uniform coverage of the control space, effectively reducing the sampling gaps and clustering effects commonly observed in pseudo-random sampling.
In a two-dimensional control space, the coverage of sampled points has a direct impact on trajectory optimization performance in MPPI control. Figure 4 illustrates the two-dimensional scatterplots of control samples generated by the two sampling strategies, where Figure 4a corresponds to conventional Gaussian sampling and Figure 4b corresponds to Halton–Gaussian sampling. It can be clearly observed that Halton–Gaussian samples are distributed more uniformly over the control space, effectively avoiding clustering and void regions. This property plays a critical role in improving the stability and convergence speed of the control policy.
The sample-level differences in spatial uniformity are further manifested in the resulting trajectory distributions. In Figure 5, the trajectory ensembles generated by the two methods are compared, with $M = 1000$ sampled trajectories in each case. Specifically, Figure 5a illustrates the trajectory set produced by the classical MPPI method, while Figure 5b presents the trajectories generated by the Halton–MPPI approach augmented with Ornstein–Uhlenbeck (OU) temporally correlated noise ($\rho = 0.95$).
The trajectories generated by standard MPPI control exhibit pronounced local clustering and coverage gaps, resulting in certain regions of the state space remaining persistently unexplored. In contrast, after introducing the OU process, Halton–MPPI not only produces more uniformly distributed trajectories in both the control and task spaces but also enables a substantial expansion of the overall trajectory coverage through the smooth temporal evolution of control perturbations. As a result, unexplored blind regions are effectively reduced, enabling a more comprehensive and continuous exploration of the system’s reachable state set.
Based on the experimental results obtained from the histogram analysis, Q–Q plots, two-dimensional scatter visualizations, and trajectory distribution comparisons, these results demonstrate that Halton low-discrepancy sampling significantly improves the uniformity and stability of MPPI sampling under a limited sample budget. This enhanced uniform coverage effectively reduces the variance of control updates and leads to improved convergence behavior.

4. Results

4.1. Simulation-Based Evaluation

In this section, we conduct a simulation-based evaluation of the proposed method using the BARN dataset [33]. This dataset consists of 300 scenarios with varying levels of difficulty, ranging from relatively open environments to highly cluttered settings, and is specifically designed to benchmark navigation performance in densely obstructed environments.
The remainder of this section is organized as follows: We first describe the experimental setup and implementation details. We then introduce the evaluation metrics. Finally, we present and analyze the quantitative results.

4.1.1. Experimental Setup and Evaluation Metrics

To demonstrate the advantages of the proposed Halton–MPPI control strategy over existing methods, extensive simulation experiments were conducted on an AGV autonomous navigation task in cluttered obstacle environments. The robotic platform was a differential-drive wheeled mobile robot, whose kinematic model is described by the following equations:
$\dot{x} = v_t \cos(\theta_t), \quad \dot{y} = v_t \sin(\theta_t), \quad \dot{\theta} = \omega_t,$
where $x$ and $y$ denote the robot's position coordinates, and $\theta$ represents the heading angle. The control inputs consist of the linear velocity $v_t$ and the angular velocity $\omega_t$, which together form the control vector $\mathbf{u}_t = [v_t, \omega_t]^T$.
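For trajectory sampling, this continuous-time model is discretized in time. A minimal Euler-integration sketch follows; the step size `dt` is illustrative and not taken from the paper:

```python
import math

def unicycle_step(state, control, dt=0.05):
    """One Euler step of the differential-drive kinematics:
    x' = v cos(theta), y' = v sin(theta), theta' = omega."""
    x, y, theta = state
    v, omega = control
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def rollout(x0, controls, dt=0.05):
    """Integrate a control sequence into a state trajectory."""
    traj = [x0]
    for u in controls:
        traj.append(unicycle_step(traj[-1], u, dt))
    return traj
```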
The optimal control problem is subject to the following constraints:
$0 \le v_t \le 1.0, \quad |\omega_t| \le \pi/4, \quad (x_t, y_t) \notin \mathcal{O},$
where $\mathcal{O}$ denotes the obstacle region defined according to the environment configuration. Trajectory sampling is performed using the discrete-time kinematic model in (15). For the two-dimensional navigation task, the state-dependent cost function is simplified as follows:
$q(\mathbf{x}) = q_{\mathrm{state}}(\mathbf{x}) + q_{\mathrm{obs}}(\mathbf{x}),$
where $q_{\mathrm{state}}(\mathbf{x}) = (\mathbf{x} - \mathbf{x}_f)^T Q (\mathbf{x} - \mathbf{x}_f)$ is a quadratic cost that encourages the robot state $\mathbf{x}$ to approach the desired goal state $\mathbf{x}_f$. In this work, the weighting matrix is set to $Q = 100$, which emphasizes accurate goal-reaching behavior while maintaining stable trajectory optimization. The obstacle-related cost is defined as $q_{\mathrm{obs}}(\mathbf{x}) = 10^7 \, C_{\mathrm{crash}}$, which imposes a large penalty on trajectories that result in collisions with obstacles. Here, $C_{\mathrm{crash}}$ is a Boolean indicator specifying whether a collision with an obstacle has occurred.
To evaluate the smoothness of both trajectories and control inputs, the Mean Squared Curvature (MSCX) metric and the Mean Smoothness of Control (MSCU) metric are introduced and defined as follows:
$\mathrm{MSCX} = \frac{1}{N-2} \sum_{i=2}^{N-1} k_i^2, \quad k_i = \lVert f''(x_i) \rVert,$
$\mathrm{MSCU} = \frac{1}{T-2} \sum_{t=2}^{T-1} \lVert \mathbf{u}_{t+1} - 2\mathbf{u}_t + \mathbf{u}_{t-1} \rVert^2.$
Here, $N$ denotes the number of sampled points along the trajectory, and $k_i$ represents the curvature-related term at the $i$-th sampling point. In practice, the planned trajectory is represented as a discrete sequence of planar waypoints $\{\mathbf{p}_i = (x_i, y_i)\}_{i=1}^{N}$. To compute $f''(x_i)$ in a numerically stable and reproducible manner, the trajectory is first reparameterized with respect to the path length and uniformly resampled. The second derivative of the trajectory coordinates with respect to the path parameter is then approximated using a central finite difference scheme: $f''(x_i) \approx \mathbf{p}_{i+1} - 2\mathbf{p}_i + \mathbf{p}_{i-1}$. The curvature term $k_i$ is computed as the Euclidean norm of this second-order difference. A smaller MSCX value indicates smoother curvature variations and, consequently, a smoother trajectory.
The control input at time step $t$ is given by $\mathbf{u}_t = [v_t, \omega_t]^T$, where $v_t$ and $\omega_t$ denote the linear and angular velocities, respectively. Equation (19) computes the temporal variation of the control inputs using a second-order finite difference, effectively capturing the temporal smoothness of the control signal. A lower MSCU value implies smoother control commands, leading to more stable system execution with reduced oscillations.
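Both metrics reduce to averaged second-order finite differences. A minimal sketch, assuming the trajectory has already been uniformly resampled by path length, might look like:

```python
def mscx(points):
    """Mean squared curvature proxy: mean of ||p_{i+1} - 2 p_i + p_{i-1}||^2
    over the N-2 interior waypoints (assumes uniform spacing)."""
    n = len(points)
    total = 0.0
    for i in range(1, n - 1):
        dx = points[i + 1][0] - 2 * points[i][0] + points[i - 1][0]
        dy = points[i + 1][1] - 2 * points[i][1] + points[i - 1][1]
        total += dx * dx + dy * dy
    return total / (n - 2)

def mscu(controls):
    """Mean squared second temporal difference of the control sequence."""
    t_len = len(controls)
    total = 0.0
    for t in range(1, t_len - 1):
        total += sum((controls[t + 1][k] - 2 * controls[t][k]
                      + controls[t - 1][k]) ** 2
                     for k in range(len(controls[t])))
    return total / (t_len - 2)
```

A perfectly straight trajectory and a constant control sequence both score zero under these metrics, matching the interpretation that smaller values indicate smoother motion.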

4.1.2. Quantitative Performance Comparison

The proposed Halton–MPPI framework was compared against several representative MPPI-based methods, including standard MPPI control [7], Log-MPPI [14], and Smooth-MPPI [34]. All selected baselines are widely adopted variants of MPPI control that target different aspects of sampling efficiency and trajectory smoothness. By reporting both the mean and standard deviation of MSCX and MSCU, the comparison enables a fair and informative assessment of not only the average performance but also the robustness of each method within the same control framework.
  • MPPI control [7] serves as the canonical baseline and remains a foundational method in sampling-based trajectory optimization.
  • Log-MPPI [14] is a recent extension that improves numerical stability and sampling robustness through logarithmic cost reformulation.
  • Smooth-MPPI [34] introduces an input lifting strategy by sampling control increments in the derivative action space and reconstructing the control sequence through temporal integration.
Experiments were conducted on 300 distinct two-dimensional environments from the BARN dataset. The map preprocessing procedure followed the approach in [35], where each environment was expanded from 3 m × 3 m to 3 m × 5 m to introduce additional free space around the initial and goal states and to reduce the likelihood of immediate collisions. The goal state was defined as $\mathbf{x}_{\mathrm{goal}} = [1.5, 5.0, \pi/2]^T$, which is located in the upper central region of the map and oriented vertically upward. The initial state was set to $\mathbf{x}_{\mathrm{init}} = [1.0, 0.0, \pi/2]^T$. A total of 300 simulation trials were performed, with one trial conducted in each environment.
Table 1 summarizes the shared parameter settings for all evaluated MPPI-based methods, including the number of samples, planning horizon, inverse temperature $\lambda$, and control noise variance $\sigma_u$. Under these unified conditions, a large-scale evaluation was conducted across challenging environments to ensure a fair and comprehensive assessment of the robustness and efficiency of Halton–MPPI. The corresponding quantitative performance metrics are reported in Table 2.
As illustrated in Figure 6 (specifically, Figure 6a–c) and corroborated by the results in Table 2, clear performance differences were observed across key metrics, including computational efficiency, trajectory smoothness, and control smoothness.
In terms of success rate, the proposed Halton–MPPI achieved performance comparable to that of standard MPPI control and slightly below that of Log-MPPI, indicating that the proposed sampling strategy maintains reliable task completion. With respect to computational efficiency, Figure 6a and Table 2 show that Halton–MPPI may incur marginally higher computational costs in certain individual cases. However, when averaged over all experiments, Halton–MPPI attained a mean computation time per planning iteration ($0.0710 \pm 0.1318$ s) that is comparable to that of standard MPPI control and notably lower than that of both Log-MPPI and Smooth-MPPI under identical sampling budgets. This result indicates that the proposed sampling strategy introduces only limited additional overhead while maintaining practical real-time performance.
In terms of motion quality, Halton–MPPI consistently achieved the lowest trajectory smoothness metric MSCX ( 0.00287 ± 0.00076 ) among all evaluated methods, demonstrating its ability to generate smoother spatial trajectories. Likewise, in terms of control smoothness, measured by MSCU, Halton–MPPI attained the lowest value ( 1.1438 ± 0.3632 ), outperforming all baseline approaches. The reduced standard deviations further indicate improved consistency across repeated trials. Overall, these results demonstrate that Halton–MPPI improves both trajectory and control smoothness while maintaining comparable success rates and practical runtime performance under identical sampling budgets.
To enable a statistically meaningful comparison of algorithmic performance, extensive simulations were conducted across a wide range of MPPI parameter configurations. The number of MPPI samples $N_u$ was increased from 100 to 12,800 by doubling the sample count at each step. The control covariance matrix $\sigma_u$ was varied from $0.1I$ to $0.9I$, where $I$ denotes the identity matrix with compatible dimensions.
Figure 7 presents an overall performance comparison of the four MPPI-based methods in terms of success rate, computational time, and trajectory and control smoothness under different combinations of sample size $N_u$ and control covariance $\sigma_u$.
Taken together, these evaluation metrics demonstrate that Halton–MPPI achieves a favorable balance between computational efficiency and trajectory smoothness through the use of a low-discrepancy Halton-sequence-based sampling strategy. Across all four performance indicators, Halton–MPPI exhibited either the best or near-optimal results, thereby validating its effectiveness on the BARN benchmark dataset. Nevertheless, it should be noted that Halton–MPPI, like other sampling-based local optimization methods, may still suffer from local minimum trapping in highly non-convex environments, particularly when global guidance information is sparse or misleading. While the proposed low-discrepancy and temporally correlated sampling strategy improves optimization stability and sampling efficiency, it does not fundamentally eliminate the reliance on local cost information inherent to MPPI-based frameworks.

4.1.3. Ablation Study

To better understand the contribution of each component in the proposed approach, we conducted a set of ablation experiments to systematically evaluate the effects of Ornstein–Uhlenbeck (OU) temporal correlation under different configurations within the MPPI framework. Specifically, four experimental configurations are considered:
  • MPPI: Standard MPPI control using independent Gaussian sampling to generate control perturbations.
  • Penalty-based MPPI: Standard MPPI control augmented with an explicit control perturbation penalty term to suppress high-frequency control variations.
  • Halton–MPPI w/o OU: Halton–MPPI without introducing OU-based temporal correlation.
  • Halton–MPPI: The full method combining Halton low-discrepancy sampling with OU-based temporally correlated control perturbations.
Under these four configurations, each method was independently evaluated ten times in the same representative scenario from the BARN dataset, as illustrated in Figure 8. The parameter settings for all MPPI-based methods are listed in Table 1. For the penalty-based MPPI configuration, an explicit control perturbation penalty term was added to the cost function and defined as $S_{\mathrm{total}} = S_{\mathrm{original}} + \lambda \sum_{j=1}^{N-1} \lVert \mathbf{u}_j - \mathbf{u}_{j-1} \rVert^2$, where the penalty coefficient $\lambda$ was set to 1 in all experiments. The performance of each configuration was evaluated using four metrics: navigation success rate, average computation time, trajectory smoothness (MSCX), and control smoothness (MSCU). The quantitative results of the ablation study are reported in Table 3.
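For reference, the penalty term used in this ablation can be sketched as follows; the function name is illustrative and the term is simply added to each rollout's cost:

```python
def jitter_penalty(controls, lam=1.0):
    """Control-perturbation penalty lam * sum_j ||u_j - u_{j-1}||^2,
    added to the rollout cost in the penalty-based MPPI configuration
    (lam = 1 in the experiments)."""
    total = 0.0
    for j in range(1, len(controls)):
        total += sum((uj - up) ** 2
                     for uj, up in zip(controls[j], controls[j - 1]))
    return lam * total
```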
As summarized in Table 3, introducing Halton low-discrepancy sampling alone (Halton–MPPI w/o OU) results in a substantial reduction in both MSCX and MSCU compared with standard MPPI control, while maintaining identical success rates and comparable computational efficiency. This result indicates that improving the spatial uniformity of control perturbation sampling already contributes to smoother trajectories and reduced control jitter, even in the absence of explicit temporal correlation.
When an explicit control perturbation penalty is introduced (penalty-based MPPI), both MSCX and MSCU are reduced relative to the baseline MPPI control. However, the improvement remains limited compared to Halton-based sampling, suggesting that cost function penalties primarily influence trajectory evaluation rather than fundamentally restructuring the sampled control sequences.
In contrast, the proposed Halton–MPPI framework consistently achieves the lowest MSCX and MSCU values among all configurations. Compared with the penalty-based approach, Halton–MPPI further reduces MSCX from 0.0030 to 0.0019 and MSCU from 1.2731 to 0.6344. These results demonstrate that OU-based temporal correlation provides a significantly stronger smoothing effect than a simple jitter penalty. This improvement can be attributed to the fact that the OU process introduces temporal coherence directly at the sampling stage, enforcing smooth evolution of control perturbations across time steps. In contrast, penalty-based methods act only during cost evaluation and do not alter the underlying sampling distribution, making their effectiveness sensitive to penalty weight tuning.
Overall, the ablation results quantitatively confirm that incorporating OU-based temporal correlation is not merely an alternative formulation of a jitter penalty but a complementary and more effective mechanism for suppressing control jitter. This justifies the added complexity of the OU process in the proposed Halton–MPPI framework. Figure 8 illustrates the complete planning process of a single navigation episode within a representative environment.

4.2. Real-World Demonstration

To further validate the practicality and effectiveness of the proposed method in real-world settings, extensive field experiments were conducted using an autonomous test vehicle operating in an outdoor environment. The test site primarily consists of a cement-paved road typical of outdoor environments. The surrounding environment features roadside drainage ditches with abrupt elevation changes and trees distributed along both sides of the road, which introduce potential collision risks and place demands on reliable perception and local planning capabilities. The road slope is mild, and static obstacles are mainly composed of trees and roadside structures. All experiments were conducted during daytime under natural lighting conditions, with clear visibility and no adverse weather effects. The experimental platform was a wheeled off-road vehicle equipped with multiple onboard sensors, including a LiDAR, cameras, and a millimeter-wave radar, enabling robust perception of complex terrain structures and surrounding obstacles. The proposed algorithm was implemented in C++ and deployed on a Linux-based industrial computer equipped with an Intel Core i9-13900E CPU, an NVIDIA L4 GPU, and 64 GB of RAM.

4.2.1. Perception Module

During real-world operation, the perception module processes LiDAR and millimeter-wave radar measurements as inputs and generates a local environmental representation online at a frequency of approximately 10 Hz. The resulting local map is centered at the vehicle’s current position and covers a circular area with a radius of approximately 50 m. The environment is represented in a bird’s-eye view (BEV) format and continuously updated as the vehicle moves, thereby providing real-time environmental information for the local path planning module.

4.2.2. Bicycle Kinematic Vehicle Model

For motion modeling, a bicycle kinematic vehicle model was adopted to describe the planar motion of the vehicle. This model achieves a favorable balance between computational efficiency and motion representation fidelity, making it well suited for sampling-based model predictive trajectory planning approaches.
Under this model, the system state is defined as follows:
$\mathbf{x}_t = [x_t, y_t, \psi_t]^T,$
where $(x_t, y_t)$ denotes the position of the vehicle's center of mass in the two-dimensional plane, and $\psi_t$ represents the vehicle's heading angle.
The corresponding control input is defined as follows:
$\mathbf{u}_t = [v_t, \kappa_t]^T,$
where $v_t$ denotes the longitudinal velocity and $\kappa_t$ denotes the steering curvature. To ensure that the planned trajectories are dynamically feasible and consistent with the physical limitations of the real vehicle, explicit bounds were imposed on the control inputs. Specifically, the vehicle's velocity was constrained within the range $0 \le v_t \le 10\ \mathrm{km/h}$, while the steering curvature was limited to $|\kappa_t| \le 0.2$.
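A discrete-time sketch of this curvature-parameterized model, with the stated control bounds clamped before integration, is given below. The Euler step, the step size `dt`, and the heading update $\dot{\psi} = v\kappa$ (the standard curvature relation, not spelled out in the text) are assumptions of this sketch.

```python
import math

V_MAX = 10.0 / 3.6   # 10 km/h expressed in m/s
KAPPA_MAX = 0.2      # steering curvature bound

def bicycle_step(state, control, dt=0.1):
    """Euler step of the curvature-parameterized kinematics:
    x' = v cos(psi), y' = v sin(psi), psi' = v * kappa."""
    x, y, psi = state
    v, kappa = control
    v = min(max(v, 0.0), V_MAX)                      # 0 <= v <= 10 km/h
    kappa = min(max(kappa, -KAPPA_MAX), KAPPA_MAX)   # |kappa| <= 0.2
    return (x + v * math.cos(psi) * dt,
            y + v * math.sin(psi) * dt,
            psi + v * kappa * dt)
```

Parameterizing steering by curvature rather than steering angle keeps the model independent of the wheelbase, which is convenient for sampling-based rollouts.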
It is worth noting that, in real-world outdoor environments, vehicle motion is often influenced by complex dynamic effects such as uneven terrain and wheel slip, which are difficult to model accurately. In this work, such unmodeled dynamics are compensated for by the low-level motion controller through closed-loop feedback, while the high-level planning module performs trajectory optimization based on the bicycle kinematic model.

4.2.3. Cost Function Modeling

For the navigation task, the state-dependent cost function was formulated as the sum of a state cost term and an obstacle cost term:
$q(\mathbf{x}) = q_{\mathrm{state}}(\mathbf{x}) + q_{\mathrm{obs}}(\mathbf{x}).$
All cost function parameters were kept consistent with those used in the simulation experiments, as defined in Equation (17).

4.2.4. Experimental Results

Field experiments were conducted in representative outdoor environments. As illustrated in the upper part of Figure 9, the test site features dense forested areas on both sides of the road, accompanied by continuous drainage ditch structures. Such environmental characteristics impose high demands on the safety and robustness of autonomous navigation systems. No prior high-precision maps exist for this environment; only low-resolution satellite imagery and sparsely annotated key areas served as sparse prior guidance information.
In the experiments, the parameters of the Halton–MPPI planner were configured as follows: $T = 50$, $N = 1000$, $\rho = 0.9$, and $\gamma = 0.02$. The control noise covariance was specified as $\sigma_u = \mathrm{Diag}(\sigma_v^2, \sigma_\kappa^2) = \mathrm{Diag}(0.2, 0.1)$. These parameters were kept identical across experimental scenarios to ensure fair and consistent evaluation.
The lower part of Figure 9 presents several representative planning results corresponding to regions marked ①–⑤, illustrating the vehicle’s local planning behavior under varying environmental conditions. In each scenario, 1000 candidate trajectories were sampled and generated. The sampled trajectories are color-coded from blue to red according to their associated costs, where higher-cost trajectories are shown in blue and lower-cost trajectories gradually transition toward red. The optimal trajectory ultimately selected by the planner is highlighted with a series of yellow boxes.
The experimental results demonstrate that the proposed method effectively distinguishes feasible trajectories from high-risk ones in complex environments. The generated optimal trajectories maintain a reasonable safety margin while reliably avoiding obstacles, showcasing excellent safety and robustness. Notably, in the sharp-turn scenario depicted in ③, the planning difficulty significantly increases due to stringent local environmental constraints. Nevertheless, the planner successfully identifies an optimal trajectory satisfying the vehicle’s kinematic constraints, achieving smooth and effective obstacle avoidance. This result further validates the proposed planning strategy’s robustness and adaptability under complex operating conditions.
During field testing, the vehicle successfully completed approximately 1 km of fully autonomous navigation, further validating the effectiveness and engineering practicality of the proposed method in real outdoor environments. The temporal profiles of the vehicle’s velocity, heading angle, and steering angle rate recorded during the experiment are illustrated in Figure 10. These control variable profiles provide quantitative evidence of the stability, smoothness, and feasibility of the proposed framework.
As shown in Figure 10a, the vehicle maintained smooth and stable motion throughout the experiment, with an average traveling speed of 2.708 m/s. The velocity profile exhibits gradual and continuous variations, indicating that the vehicle consistently tracked the planned trajectory without abrupt acceleration or deceleration. Moreover, the vehicle adaptively adjusted its speed in response to environmental complexity and local planning constraints, demonstrating the proposed method’s ability to effectively balance motion efficiency and operational safety in real outdoor environments.
The heading angle profile and the steering angle rate profile shown in Figure 10b,c jointly illustrate the kinematic consistency and control smoothness of the planned trajectory. Despite the presence of narrow passages, curved segments, and obstacle-induced constraints in the experimental environment, the heading angle evolves smoothly within a feasible range, without exhibiting high-frequency oscillations or abrupt directional changes. At the same time, the steering angle rate remains well bounded throughout the maneuver, with no noticeable spikes or discontinuities. This indicates that changes in steering commands are gradual and physically realizable, effectively avoiding overly aggressive control actions. Taken together, these results demonstrate that the proposed method produces trajectories that simultaneously respect the vehicle’s kinematic model and actuator limitations, enabling stable and reliable execution by the low-level motion controller.
Overall, the recorded control variables (velocity, heading angle, and steering angle rate) collectively demonstrate that the proposed method can generate dynamically feasible, safe, and stable driving trajectories in real-world environments. These experimental results provide strong empirical support for the effectiveness and engineering practicality of the proposed approach.

5. Conclusions

This paper proposes an enhanced Model Predictive Path Integral control framework, termed Halton–MPPI, which aims to address the control sequence instability commonly observed in classical MPPI algorithms. By introducing low-discrepancy Halton sequences as the sampling basis for control perturbations and incorporating an Ornstein–Uhlenbeck (OU) process to impose temporal correlation, the proposed method achieves an effective balance between uniform exploration of the control space and temporal continuity of the control sequence. Extensive large-scale simulation experiments on the BARN dataset, together with real-world field tests in outdoor environments, demonstrated that Halton–MPPI consistently outperforms multiple MPPI-based baseline methods in terms of computational efficiency and motion smoothness. Quantitative results further indicate that, while achieving navigation success rates comparable to those of the baseline approaches, Halton–MPPI achieves significant improvements in trajectory smoothness (MSCX) and control smoothness (MSCU). These improvements effectively enhance trajectory executability and overall control stability.
Future work will focus on two main directions: First, we will investigate the integration of topological awareness and structural priors to mitigate the local minimum trapping problem in complex non-convex environments. Second, we plan to exploit parallel computing architectures by deploying the framework on GPU platforms, thereby exploiting large-scale parallel sampling and trajectory rollout to further reduce computational latency and improve real-time performance.

Author Contributions

Conceptualization, K.X. and X.L.; methodology, K.X.; software, K.X.; validation, K.X., L.Y. and Z.S.; formal analysis, Y.B.; investigation, K.X. and Y.B.; resources, L.Y.; data curation, Z.S.; writing—original draft preparation, K.X.; writing—review and editing, X.L. and L.Y.; visualization, K.X.; supervision, L.Y. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. Part of the data are not publicly available due to our laboratory’s confidentiality agreement and policies.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wijayathunga, L.; Rassau, A.; Chai, D. Challenges and solutions for autonomous ground robot scene understanding and navigation in unstructured outdoor environments: A review. Appl. Sci. 2023, 13, 9877. [Google Scholar] [CrossRef]
  2. Lyu, Z.; Gao, Y.; Chen, J.; Du, H.; Xu, J.; Huang, K.; Kim, D.I. Empowering Intelligent Low-Altitude Economy with Large AI Model Deployment. IEEE Wirel. Commun. 2026, 33, 64–72. [Google Scholar] [CrossRef]
  3. Xiao, W.; Shi, C.; Chen, M.; Vasilakos, A.V.; Chen, M.; Farouk, A. LLM-based UAV Path Planning for Autonomous and Adaptive Industry Systems. ACM Trans. Auton. Adapt. Syst. 2025. [Google Scholar] [CrossRef]
  4. Wang, N.; Li, X.; Zhang, K.; Wang, J.; Xie, D. A Survey on Path Planning for Autonomous Ground Vehicles in Unstructured Environments. Machines 2024, 12, 31. [Google Scholar] [CrossRef]
  5. Gerdpratoom, N.; Yamamoto, K. Decentralized Nonlinear Model Predictive Control-Based Flock Navigation with Real-Time Obstacle Avoidance in Unknown Obstructed Environments. Front. Robot. AI 2025, 12, 1540808. [Google Scholar] [CrossRef]
  6. Han, T.; Liu, A.; Li, A.; Spitzer, A.; Shi, G.; Boots, B. Model Predictive Control for Aggressive Driving over Uneven Terrain. arXiv 2023, arXiv:2311.12284. [Google Scholar] [CrossRef]
  7. Williams, G.; Aldrich, A.; Theodorou, E.A. Model Predictive Path Integral Control: From Theory to Parallel Computation. J. Guid. Control Dyn. 2017, 40, 344–357. [Google Scholar] [CrossRef]
  8. Williams, G.; Wagener, N.; Goldfain, B.; Drews, P.; Rehg, J.M.; Boots, B.; Theodorou, E.A. Information Theoretic MPC for Model-Based Reinforcement Learning. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 1714–1721. [Google Scholar]
  9. Kazim, M.; Hong, J.; Kim, M.-G.; Kim, K.-K.K. Recent Advances in Path Integral Control for Trajectory Optimization: An Overview in Theoretical and Algorithmic Perspectives. Annu. Rev. Control 2024, 57, 100931. [Google Scholar] [CrossRef]
  10. Bhardwaj, M.; Sundaralingam, B.; Mousavian, A.; Ratliff, N.D.; Fox, D.; Ramos, F.; Boots, B. STORM: An Integrated Framework for Fast Joint-Space Model-Predictive Control for Reactive Manipulation. In Proceedings of the Conference on Robot Learning (CoRL), Auckland, New Zealand, 14–18 December 2022; pp. 750–759. [Google Scholar]
  11. Williams, G.; Drews, P.; Goldfain, B.; Rehg, J.M.; Theodorou, E.A. Information-Theoretic Model Predictive Control: Theory and Applications to Autonomous Driving. IEEE Trans. Robot. 2018, 34, 1603–1622. [Google Scholar] [CrossRef]
  12. Williams, G.; Goldfain, B.; Drews, P.; Saigol, K.; Rehg, J.M.; Theodorou, E.A. Robust Sampling-Based Model Predictive Control with Sparse Objective Information. In Proceedings of the Robotics: Science and Systems (RSS), Pittsburgh, PA, USA, 26–30 June 2018. [Google Scholar]
  13. Yin, J.; Zhang, Z.; Theodorou, E.; Tsiotras, P. Trajectory Distribution Control for Model Predictive Path Integral Control Using Covariance Steering. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 1478–1484. [Google Scholar]
  14. Mohamed, I.S.; Yin, K.; Liu, L. Autonomous Navigation of AGVs in Unknown Cluttered Environments: Log-MPPI Control Strategy. IEEE Robot. Autom. Lett. 2022, 7, 10240–10247. [Google Scholar] [CrossRef]
  15. Mohamed, I.S.; Xu, J.; Sukhatme, G.S.; Liu, L. Towards Efficient MPPI Trajectory Generation with Unscented Guidance: U-MPPI Control Strategy. IEEE Trans. Robot. 2025, 41, 1172–1192. [Google Scholar] [CrossRef]
  16. Asmar, D.M.; Senanayake, R.; Manuel, S.; Kochenderfer, M.J. Model Predictive Optimized Path Integral Strategies. arXiv 2022, arXiv:2203.16633. [Google Scholar]
  17. Kim, M.G.; Jung, M.; Hong, J.G.; Kim, K.K.K. MPPI-IPDDP: A Hybrid Method of Collision-Free Smooth Trajectory Generation for Autonomous Robots. IEEE Trans. Ind. Inform. 2025, 21, 5037–5046. [Google Scholar] [CrossRef]
  18. Higgins, J.; Mohammad, N.; Bezzo, N. A Model Predictive Path Integral Method for Fast, Proactive, and Uncertainty-Aware UAV Planning in Cluttered Environments. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 830–837. [Google Scholar]
  19. Lambert, A.; Fishman, A.; Fox, D.; Boots, B.; Ramos, F. Stein Variational Model Predictive Control. arXiv 2020, arXiv:2011.07641. [Google Scholar]
  20. Miura, T.; Akai, N.; Honda, K.; Hara, S. Spline-Interpolated Model Predictive Path Integral Control with Stein Variational Inference for Reactive Navigation. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 13171–13177. [Google Scholar]
  21. Liu, B.; Jiang, G.; Zhao, F.; Mei, X. Collision-free motion generation based on stochastic optimization and composite signed distance field networks of articulated robot. IEEE Robot. Autom. Lett. 2023, 8, 7082–7089. [Google Scholar] [CrossRef]
  22. Halton, J.H. Algorithm 247: Radical-Inverse Quasi-Random Point Sequence. Commun. ACM 1964, 7, 701–702. [Google Scholar] [CrossRef]
  23. L’Ecuyer, P. Random Number Generation and Quasi-Monte Carlo. Wiley StatsRef Stat. Ref. Online 2014, 1–12. [Google Scholar] [CrossRef]
  24. Berblinger, M.; Schlier, C. Monte Carlo Integration with Quasi-Random Numbers: Some Experience. Comput. Phys. Commun. 1991, 66, 157–166. [Google Scholar] [CrossRef]
  25. Dick, J.; Kuo, F.Y.; Sloan, I.H. High-Dimensional Integration: The Quasi-Monte Carlo Way. Acta Numer. 2013, 22, 133–288. [Google Scholar] [CrossRef]
  26. Maller, R.A.; Müller, G.; Szimayer, A. Ornstein–Uhlenbeck Processes and Extensions. In Handbook of Financial Time Series; Andersen, T.G., Davis, R.A., Kreiß, J.-P., Mikosch, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 421–437. [Google Scholar]
  27. Mayne, D.Q. Model Predictive Control: Recent Developments and Future Promise. Automatica 2014, 50, 2967–2986. [Google Scholar] [CrossRef]
  28. Theodorou, E.A.; Todorov, E. Relative Entropy and Free Energy Dualities: Connections to Path Integral and KL Control. In Proceedings of the IEEE 51st Conference on Decision and Control, Maui, HI, USA, 10–13 December 2012; IEEE: Maui, HI, USA, 2012; pp. 1466–1473. [Google Scholar]
  29. Caflisch, R.E. Monte Carlo and Quasi-Monte Carlo Methods. Acta Numer. 1998, 7, 1–49. [Google Scholar] [CrossRef]
  30. Plappert, M.; Houthooft, R.; Dhariwal, P.; Sidor, S.; Chen, R.Y.; Chen, X.; Asfour, T.; Abbeel, P.; Andrychowicz, M. Parameter Space Noise for Exploration. arXiv 2017, arXiv:1706.01905. [Google Scholar]
  31. Uhlenbeck, G.E.; Ornstein, L.S. On the Theory of the Brownian Motion. Phys. Rev. 1930, 36, 823–841. [Google Scholar] [CrossRef]
  32. Ortega, R.; Astolfi, A.; Bastin, G.; Rodriguez, H. Stabilization of Food-Chain Systems Using a Port-Controlled Hamiltonian Description. In Proceedings of the American Control Conference (ACC), Chicago, IL, USA, 28–30 June 2000; pp. 2245–2249. [Google Scholar]
  33. Perille, D.; Truong, A.; Xiao, X.; Stone, P. Benchmarking Metric Ground Navigation. In Proceedings of the IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Abu Dhabi, United Arab Emirates, 4–6 November 2020; pp. 116–121. [Google Scholar]
  34. Kim, T.; Park, G.; Kwak, K.; Bae, J.; Lee, W. Smooth Model Predictive Path Integral Control without Smoothing. IEEE Robot. Autom. Lett. 2022, 7, 10406–10413. [Google Scholar] [CrossRef]
  35. Jung, M.; Kim, K. BiC-MPPI: Goal-Pursuing, Sampling-Based Bidirectional Rollout Clustering Path Integral for Trajectory Optimization. arXiv 2024, arXiv:2410.06493. [Google Scholar]
Figure 1. Overall workflow of the proposed algorithm.
Figure 2. Histogram and PDFs of 1000 random samples generated from (a) normal and (b) Halton–Gaussian sampling, with $\mu_n = 0$, $\sigma_n^2 = 1$.
Figure 3. Q-Q plots against the standard normal distribution: (a) Gaussian samples; (b) Halton–Gaussian samples.
Figure 4. 2D scatterplots in control space: (a) Gaussian samples; (b) Halton–Gaussian samples, showing more uniform coverage due to the low-discrepancy property.
Figure 5. Distribution of 1000 sampled trajectories. The robot’s initial state is x = [x, y, θ]^T = [0, 0, 0]^T in units of (m, m, deg), with a commanded control input u = [v, ω]^T = [1.5, 0]^T in (m/s, rad/s).
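For context, the rollouts in Figure 5 propagate a unicycle-style kinematic model from the stated initial state. A minimal sketch of one such rollout, assuming a step size dt = 0.05 s and angles in radians (the paper's integration step is not stated in this excerpt), is:

```python
import numpy as np


def rollout(x0, controls, dt=0.05):
    """Propagate a unicycle model x = [x, y, theta] under a sequence of
    controls u = [v, omega]; dt is an assumed step size."""
    states = [np.asarray(x0, dtype=float)]
    for v, omega in controls:
        x, y, theta = states[-1]
        states.append(np.array([x + v * np.cos(theta) * dt,
                                y + v * np.sin(theta) * dt,
                                theta + omega * dt]))
    return np.stack(states)


# Nominal command from Figure 5: v = 1.5 m/s, omega = 0 rad/s, 100 steps
traj = rollout([0.0, 0.0, 0.0], [(1.5, 0.0)] * 100)
print(traj[-1])  # straight-line motion along the x-axis
```

Perturbing each (v, ω) pair before the rollout yields the fan of gray trajectories shown in the figure.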
Figure 6. Performance comparison of MPPI-based methods: (a) Average computation time per planning iteration, reflecting the real-time efficiency of each method. (b) MSCX, measuring trajectory smoothness. (c) MSCU, quantifying the smoothness of the optimized control inputs.
Figure 7. Performance comparison of different MPPI-based methods: (a) Standard MPPI control; (b) Log-MPPI; (c) Smooth-MPPI; (d) Halton–MPPI. For each method, the reported metrics include the success rate, average computation time per planning iteration, trajectory smoothness (MSCX), and control smoothness (MSCU).
Figure 8. Iterations of the Halton–MPPI algorithm for generating a collision-free path from (1, 0) to (1.5, 5). Gray curves denote the sampled trajectories, and the red curve denotes the optimized trajectory at stages ①–⑥. As the algorithm iterates, it converges to the optimal collision-free trajectory, shown in stage ⑥.
Figure 9. Example results from real-world vehicle experiments: The upper image shows satellite imagery of the test site, where the green marker denotes the start position, the yellow dots indicate sparse guiding waypoints, the red marker represents the goal position, and the blue curve corresponds to the trajectory executed by the vehicle. The circled numbers (①–⑤) indicate representative locations selected for detailed analysis. The lower image shows the corresponding local planning details at these locations, illustrating the local planning behaviors of the proposed method under different constraint conditions.
Figure 10. Profiles during real-world experiments: (a) velocity profile; (b) heading angle profile; (c) steering angle rate profile.
Table 1. Parameters for different MPPI-based algorithms.

Parameter   MPPI           Log-MPPI       Smooth-MPPI    Halton-MPPI
N_u         2000           2000           2000           2000
T           100            100            100            100
λ           0.1            0.1            0.1            0.1
σ_u         (0.25, 0.25)   (0.25, 0.25)   (0.25, 0.25)   (0.25, 0.25)
ρ           -              -              -              0.95
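The correlation parameter ρ = 0.95 in Table 1 governs Halton–MPPI's temporally correlated perturbations. A minimal sketch of a discretized Ornstein–Uhlenbeck recursion with stationary standard deviation σ_u = 0.25 might look as follows; the exact discretization used in the paper may differ:

```python
import numpy as np


def ou_noise(T, n_samples, sigma, rho, rng):
    """Temporally correlated perturbations via a discretized
    Ornstein-Uhlenbeck recursion:
        eps[t] = rho * eps[t-1] + sqrt(1 - rho^2) * xi[t],
    where xi[t] ~ N(0, sigma^2). The sqrt(1 - rho^2) factor keeps the
    stationary per-step variance at sigma^2."""
    eps = np.zeros((T, n_samples))
    xi = rng.normal(0.0, sigma, size=(T, n_samples))
    eps[0] = xi[0]
    for t in range(1, T):
        eps[t] = rho * eps[t - 1] + np.sqrt(1.0 - rho**2) * xi[t]
    return eps


rng = np.random.default_rng(0)
noise = ou_noise(T=100, n_samples=2000, sigma=0.25, rho=0.95, rng=rng)
print(noise.shape)  # (100, 2000)
```

Unlike i.i.d. Gaussian noise, consecutive perturbations are strongly correlated (ρ = 0.95), so each sampled control sequence varies slowly over the horizon, which is what suppresses the control jitter reported in Tables 2 and 3.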
Table 2. Performance comparison of MPPI methods in the BARN dataset.

Algorithm     Success Rate   Avg Time (s)      MSCX (μ ± σ)      MSCU (μ ± σ)
Halton–MPPI   97% (9/300)    0.0710 ± 0.1318   0.0029 ± 0.0008   1.1438 ± 0.3632
MPPI          97% (9/300)    0.0738 ± 0.0662   0.0033 ± 0.0008   1.4984 ± 0.3870
Log-MPPI      98% (6/300)    0.0791 ± 0.0711   0.0058 ± 0.0010   2.5804 ± 0.4880
Smooth-MPPI   95% (16/300)   0.1126 ± 0.1008   0.0063 ± 0.0013   2.7370 ± 0.6549
Table 3. Ablation study results.

Algorithm            Succ. Rate   Avg Time (s)      MSCX              MSCU
MPPI                 100%         0.0742 ± 0.0351   0.0041 ± 0.0006   1.8070 ± 0.2816
Penalty-based MPPI   100%         0.1230 ± 0.0406   0.0030 ± 0.0004   1.2731 ± 0.2041
Halton–MPPI w/o OU   100%         0.0726 ± 0.0415   0.0021 ± 0.0004   0.7818 ± 0.4025
Halton–MPPI          100%         0.0712 ± 0.0586   0.0019 ± 0.0007   0.6344 ± 0.2035

Share and Cite

MDPI and ACS Style

Xu, K.; Ye, L.; Li, X.; Sun, Z.; Bu, Y. Stable and Smooth Trajectory Optimization for Autonomous Ground Vehicles via Halton-Sampling-Based MPPI. Drones 2026, 10, 96. https://doi.org/10.3390/drones10020096